

st: remaining missings after multiple imputation

From   "Anne Jurczok" <>
To   <>
Subject   st: remaining missings after multiple imputation
Date   Mon, 19 Apr 2010 14:39:29 +0200

I am currently trying to run a multiple imputation on a dataset about
affluence and wealth. My sample consists of 472 households. I started off
with ice but switched to the mi commands of Stata 11. I have run into the
following problem, on which I would kindly ask for your advice and help:
not all cases in my dataset are considered for the imputation.

The main variable of interest is "lnvermgen" (the variable for the assets
of one household). However, some of the predictor variables that I chose
(following van Buuren/Boshuizen/Knook 1999) have missing data as well. I
took logarithms of the continuous variables to relax the assumption of
multivariate normality. Additionally, I transformed the categorical
variables into dummies in order to be able to use mi impute mvn (as
suggested by Allison 2002). I decided against using ice because I have
different types of missings in my dataset (hard and soft missings) and I
couldn't find any literature on how ice handles different types of
missings.

Therefore, my model is based on the mi impute mvn command, because my data
are MAR and non-monotone, with missing values in several variables:
mi impute mvn lnvermgen v1_14_p v1_16_p_5 v1_16_p_7 stib_gen_b v3_3_b ///
    v3_5 v3_9 v3_16_b v3_28 v3_29 erbsumme = v1_14_b v1_16_b v3_1_2 ///
    v3_2 v3_8_2 v3_8_4 v3_8_7 v3_40, add(20) force

I receive the following output:

Multivariate imputation            Imputations =   10
Multivariate normal regression           added =   10
Imputed: m=1 through m=10              updated =    0

Prior: uniform                      Iterations = 1000
                                       burn-in =  100
                                       between =  100

                   Observations per m
Variable      complete  incomplete  imputed  total
lnvermgen          232          95       73    327
v1_14_p            320           7        5    327
v1_16_p_3          323           4        3    327
v1_16_p_5          323           4        3    327
v1_16_p_6          323           4        3    327
v1_16_p_7          323           4        3    327
stib_gen_b         325           2        1    327
v3_3_b             289          38       23    327
v3_5               304          23       18    327
v3_9               288          39       28    327
v3_16_b            321           6        1    327
v3_28              323           4        3    327
v3_29              311          16       11    327
erbsumme           322           5        4    327
(complete + incomplete = total; imputed is the minimum across m
  of the number of filled in observations.)
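
A minimal diagnostic sketch, using only the variable names from the command
above, that may help locate where cases drop out. It assumes, which this
post does not confirm, that hard missings are coded as Stata extended
missing values (.a to .z) and soft missings as the system missing value
(.), and that it is run on the original data before any imputations are
added.

```stata
* Households with a missing value in at least one right-hand-side
* predictor of the imputation model (missing() is true if any argument
* is missing).
count if missing(v1_14_b, v1_16_b, v3_1_2, v3_2, v3_8_2, v3_8_4, ///
    v3_8_7, v3_40)

* Split the missings in lnvermgen into soft (.) and hard (.a-.z) values.
* Extended missing values sort above the system missing value, so
* "> ." selects exactly .a through .z.
count if lnvermgen == .
count if lnvermgen > .

* The same soft/hard breakdown for all variables on the left-hand side.
misstable summarize lnvermgen v1_14_p v1_16_p_5 v1_16_p_7 stib_gen_b ///
    v3_3_b v3_5 v3_9 v3_16_b v3_28 v3_29 erbsumme
```

Comparing these counts with the "incomplete" and "imputed" columns would
show whether the gaps line up with hard missings or with missing values in
the predictors; the sketch itself does not establish the cause.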

Not all cases of my dataset are considered for the imputation.
For example, lnvermgen had 137 missings in 472 cases, but only 73 of them
were imputed. The same happened with the other imputed variables. I
consulted the manual and the Statalist archive and also searched Google,
but couldn't find a hint on this specific problem. I also changed the
dataset and the model in various ways, but received the same output.
Therefore my question: does anybody know why not all missings are
considered in the imputation process? And how can I solve this problem?
Thank you in advance,
Anne Jurczok

Anne Jurczok
Universitaet Potsdam
Humanwissenschaftliche Fakultaet
Department for Education
