



Re: st: Is a simple dummy imputation a valid procedure?


From   Maarten Buis <maartenlbuis@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Is a simple dummy imputation a valid procedure?
Date   Tue, 12 Jul 2011 13:45:05 +0200

On Tue, Jul 12, 2011 at 12:11 PM, Andrea Bennett asked:
> I know there are more advanced methods to deal with missing data (ICE or mi). But if one is not interested in the variable containing missing values per se and only wants to include this variable as an additional control because the distribution of missing values of this particular variable is not quite equal between a control and a treatment group, is it then OK to use the median income of students in the same class and impute this value where a student's income is missing, add a dummy for where there was a missing value and interact the two?

On Tue, Jul 12, 2011 at 12:55 PM, daniel klein responded:
> According to Allison (2002) you cannot do this. Not only will the point
> estimate for the variable with missing values be biased, but also the
> point estimates for other variables, even if the data were MCAR
> (Allison 2002: 9-10), in which case listwise deletion may be an
> acceptable alternative to imputation.
>
> Allison, P.D. (2002) Missing Data. Thousand Oaks, CA: Sage.

There is a bit more to this than that. If you look at footnote 4 of
Allison (2002), you can see that there is a special case where this
dummy method does make sense: when there is a special reason why
income is missing, e.g. income is only observed when one has a job.
If you use the dummy variable method in that case, the coefficient of
income is the effect of income when one has a job, and the dummy
compares people without a job to people with average income and a
job. In these cases the missing values you see in Stata aren't real
missing values, in the sense that a value needs to exist before it
can be missing, and for observations without a job income does not
exist, so it cannot be missing. If you have genuine missing values on
income, e.g. persons have an income but refuse to tell you what it
is, then Daniel's remark is correct and you cannot use the dummy
method.
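
To make the interpretation in the special case concrete, here is a
minimal sketch in plain Python (standard library only). The numbers,
the variable names, and the small least-squares helper are all made up
for illustration and are not from Allison (2002) or from this thread.
It assumes income does not exist for people without a job, fills those
cells with the mean income of job holders, and adds a no-job dummy. It
covers only the model with income and the dummy as regressors; with
additional covariates the comparison is no longer this simple.

```python
# Illustrative sketch only: the data and all names below are hypothetical.

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data: income is None for people without a job (it does not exist).
income = [10.0, 20.0, 30.0, 40.0, None, None, None]
outcome = [3.0, 5.0, 6.0, 9.0, 2.0, 1.0, 3.0]

# Dummy variable method: fill with the mean of observed income, flag the fill.
observed = [x for x in income if x is not None]
mean_income = sum(observed) / len(observed)
nojob = [1.0 if x is None else 0.0 for x in income]
income_filled = [mean_income if x is None else x for x in income]

# Regress the outcome on a constant, filled income, and the no-job dummy.
X = [[1.0, xf, d] for xf, d in zip(income_filled, nojob)]
const, b_income, b_nojob = ols(X, outcome)

# For comparison: the same regression using job holders only.
X_job = [[1.0, x] for x in income if x is not None]
y_job = [y for x, y in zip(income, outcome) if x is not None]
const_job, b_income_job = ols(X_job, y_job)

# Group means of the outcome, to compare with the dummy coefficient.
y_nojob = [y for x, y in zip(income, outcome) if x is None]
mean_y_job = sum(y_job) / len(y_job)
mean_y_nojob = sum(y_nojob) / len(y_nojob)

print("income coefficient, dummy method:    ", b_income)
print("income coefficient, job holders only:", b_income_job)
print("no-job dummy coefficient:            ", b_nojob)
print("mean(no job) - mean(job):            ", mean_y_nojob - mean_y_job)
```

In this two-regressor setup the income coefficient from the dummy
method matches the one estimated on job holders alone, and the dummy
coefficient equals the difference in mean outcome between people
without a job and job holders, which is the same as comparing them to
a job holder with average income. None of this carries over to genuine
missing values, where Daniel's remark applies.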

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany


http://www.maartenbuis.nl
--------------------------


