Multiple imputation of incomplete zero‐inflated count data | Dr. Kristian Kleinke

Abstract

Empirical count data are often zero-inflated and overdispersed. Currently, there is no software package that allows adequate imputation of these data. We present multiple imputation routines for these kinds of count data based on a Bayesian regression approach or alternatively based on a bootstrap approach that work as add-ons for the popular multiple imputation by chained equations (mice) software in R (van Buuren and Groothuis-Oudshoorn, Journal of Statistical Software, vol. 45, 2011, p. 1). We demonstrate in a Monte Carlo simulation that our procedures are superior to currently available count data procedures. It is emphasized that thorough modeling is essential to obtain plausible imputations and that model mis-specifications can bias parameter estimates and standard errors quite noticeably. Finally, the strengths and limitations of our procedures are discussed, and fruitful avenues for future theory and software development are outlined.

Publication
Statistica Neerlandica, 67(3), 311–336