Hot deck imputation

Speakers: Adrian P. Mander and David G. Clayton, MRC Biostatistics Group, Cambridge

Missing data can be a serious problem in many statistical analyses, leading to
a loss of efficiency and possibly to bias. If the data are Missing Completely
At Random (MCAR), analysis by case deletion (dropping every record that has any
missing value) gives unbiased estimates but wider confidence intervals. If the
data are only Missing At Random (MAR), case deletion can give biased estimates.
Likelihood-based approaches offer one solution; however, these may rely on
assumptions about the missing data process. Imputation is an alternative that
can handle MCAR and MAR problems with relatively few assumptions. The function
that will be discussed imputes values for only one variable in the dataset.
The imputation method involves bootstrapping the observed data values and
hence is called hot deck imputation, a term first used by survey
practitioners. The method will be illustrated using an example taken from a
two-stage sample case-control study design. A comparison is made between the
biased case deletion analysis and the imputed analysis.
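
The idea of filling in one variable by resampling its observed values can be sketched in a few lines of Python. This is only an illustration, not the authors' function: the name `hotdeck_impute`, the use of `None` to mark a missing value, and the `exposure` data are all hypothetical. It draws from the whole pool of observed values, which is reasonable only under MCAR; drawing within strata of the fully observed variables is a common extension for MAR data and is not shown here.

```python
import random


def hotdeck_impute(values, rng):
    """Return a copy of `values` in which every None is replaced by a
    random draw, with replacement, from the observed (non-None) values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to draw from")
    return [v if v is not None else rng.choice(observed) for v in values]


# Hypothetical variable with two missing entries.
exposure = [1.2, None, 0.7, 3.1, None, 2.4]

# Repeat the imputation to get several completed copies of the variable;
# the spread across copies reflects the uncertainty in the imputed values.
rng = random.Random(2024)
completed = [hotdeck_impute(exposure, rng) for _ in range(5)]

for copy in completed:
    print(copy)
```

Each completed copy would then be analysed in the usual way and the results combined, so that the reported uncertainty accounts for the values having been imputed rather than observed.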