Multiple imputation of incomplete count data - the countimp package in R | Dr. Kristian Kleinke

Abstract

Count data are typically not normally distributed. A count variable is often skewed and sometimes also zero-inflated, meaning that it contains a large number of zeros. Count data therefore require special data analysis techniques such as Poisson regression, negative binomial regression, or zero-inflation models. The non-normal distribution of a count variable must also be taken into account when missing data in that variable are to be imputed. Research by Yu, Burton, and Rivero-Arias (2007), for example, suggests that multiple imputation (MI) techniques should be avoided when the distribution of the empirical data deviates too strongly from the distributional assumptions of the selected imputation procedure. Kleinke and Reinecke (2013) have proposed multiple imputation procedures for various types of count data. We first give a brief overview of the imputation functions in the R package countimp. We then compare the performance of our count data imputation procedures against several standard MI procedures and other ad hoc missing data methods. Finally, we demonstrate how to impute missing count data with the countimp functions, using empirical data from the CRIMOC project (www.crimoc.org).
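
To illustrate why the distribution of a count variable matters for imputation, the following base R sketch simulates a zero-inflated count variable and contrasts it with draws from a normal model. It does not use the countimp functions and is not taken from the talk; the sample size, the Poisson mean, and the share of excess zeros are arbitrary assumptions chosen for illustration only.

```r
# Illustrative simulation only; all parameter values are assumptions.
set.seed(123)
n      <- 1000
lambda <- 2     # assumed mean of the Poisson count process
p_zero <- 0.4   # assumed share of structural (excess) zeros

# Zero-inflated Poisson: a structural zero with probability p_zero,
# otherwise an ordinary Poisson draw.
structural_zero <- rbinom(n, size = 1, prob = p_zero)
y <- ifelse(structural_zero == 1, 0L, rpois(n, lambda))

# Share of zeros in the data versus the share a plain Poisson model
# with the same mean would predict.
observed_zero_share <- mean(y == 0)
poisson_zero_share  <- dpois(0, lambda = mean(y))

# Draws from a normal model with the same mean and standard deviation
# can be negative and non-integer, which is impossible for a count.
normal_draws   <- rnorm(n, mean = mean(y), sd = sd(y))
share_negative <- mean(normal_draws < 0)

print(c(observed_zero_share   = observed_zero_share,
        poisson_zero_share    = poisson_zero_share,
        share_negative_normal = share_negative))
```

In this simulated setting, the data contain more zeros than a plain Poisson model with the same mean predicts, and a normal model generates implausible negative values. This is the kind of mismatch between the data and the imputation model that the abstract cautions against.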

Date
Sep 20, 2016 10:00 AM
Event
2016 Eurocrim conference
Location
Münster, Germany