Statalist The Stata Listserver



Re: st: ice multiple imputation -- negative values and values outside prediction range


From   Maarten buis <maartenbuis@yahoo.co.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: ice multiple imputation -- negative values and values outside prediction range
Date   Thu, 8 Mar 2007 02:52:40 +0000 (GMT)

--- SARAH K BRUCH <bruch@wisc.edu> wrote:
> I am using ice for multiple imputation and have some questions
> regarding how to handle predicted negative values (for variables that
> should only be positive, e.g., income) and predicted values that are
> considerably outside of the range of acceptable values for a given
> variable. 
> 
> Several of my variables are cognitive achievement scores for
> children. When I impute using standardized scores, the means of the
> imputed variables are extremely different (on the order of 5 times
> the size) from the mean values for the unimputed versions. When I use
> the raw scores, the means of the imputed and unimputed versions are
> very similar, although the range of values continues to be outside
> the range of allowable values for the variable (e.g., some children
> are assigned negative cognitive scores). So, I have two questions:
> 
> (1) why would the standardized and raw score versions of the
> variables yield such different imputed values?
> (2) what is the appropriate way to deal with imputed values that are
> outside of the "acceptable" range?
> 
> Additionally, is there an option in ice that forces predictions into
> the range of existing values? And, if so, is it advisable to use it?

It sounds like there is something wrong with your imputation model. Have
you checked whether it has converged? -ice- will give you imputed
datasets even if it has not converged, and it will not give you a
warning; it is up to you to check. You can do that with the trace
option, which stores, for each cycle and each variable, the mean of the
imputed values. The model has converged if, for each variable, the plot
of the mean against the cycle number looks like white noise (no trend,
just random jumps up and down).
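
As a rough illustration of the check described above, here is a small
sketch in Python (not Stata, and not using -ice- itself). It assumes you
have already read the per-cycle means for one variable into a list; the
function name trend_slope and the numbers are made up for the example.
A slope near zero is only a crude numerical stand-in for "no trend", so
looking at the plot is still the real check.

```python
from statistics import mean


def trend_slope(means):
    """Least-squares slope of per-cycle means against cycle number 1, 2, ..."""
    cycles = range(1, len(means) + 1)
    c_bar = mean(cycles)
    m_bar = mean(means)
    sxy = sum((c - c_bar) * (m - m_bar) for c, m in zip(cycles, means))
    sxx = sum((c - c_bar) ** 2 for c in cycles)
    return sxy / sxx


# Hypothetical traces (invented numbers, not output from -ice-).
# "drifting" keeps climbing with the cycle number: not converged.
drifting = [10.0 + 0.5 * c for c in range(1, 21)]
# "stable" just jumps up and down around a fixed level.
stable = [10.0 + (0.3 if c % 2 else -0.3) for c in range(1, 21)]

slope_drifting = trend_slope(drifting)
slope_stable = trend_slope(stable)
print("drifting:", round(slope_drifting, 3))
print("stable:  ", round(slope_stable, 3))
```

A clearly non-zero slope, as in the drifting series, suggests running
more cycles or rethinking the model before using the imputed datasets.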

One thing that can go wrong is that you have subdivided your sample
(either by running -ice- on different subsamples, or by adding
interaction terms) in such a way that, for some subdivisions, your
imputation model contains more parameters than complete observations.
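
A quick way to look for this problem is to count the complete
observations in each subdivision and compare that count with the number
of parameters in the imputation equation. The sketch below does that in
Python on invented data; the variable names (region, score, age, ses),
the helper names, and the parameter count of 3 (a constant plus two
coefficients) are all assumptions for the example, not anything -ice-
reports.

```python
def complete_cases_by_group(rows, group_var, model_vars):
    """Count rows with no missing value (None) on model_vars, per group."""
    counts = {}
    for row in rows:
        group = row[group_var]
        counts.setdefault(group, 0)
        if all(row[v] is not None for v in model_vars):
            counts[group] += 1
    return counts


def groups_with_too_few_cases(rows, group_var, model_vars, n_params):
    """Groups whose complete-case count is below the number of parameters."""
    counts = complete_cases_by_group(rows, group_var, model_vars)
    return sorted(g for g, n in counts.items() if n < n_params)


# Hypothetical data; None marks a missing value.
rows = [
    {"region": "north", "score": 51.0, "age": 9, "ses": 0.2},
    {"region": "north", "score": 47.5, "age": 8, "ses": -0.1},
    {"region": "north", "score": 55.0, "age": 10, "ses": 0.6},
    {"region": "north", "score": 49.0, "age": 9, "ses": 0.0},
    {"region": "south", "score": 50.0, "age": 9, "ses": 0.3},
    {"region": "south", "score": None, "age": 8, "ses": -0.4},
    {"region": "south", "score": 46.0, "age": None, "ses": 0.1},
    {"region": "south", "score": 52.0, "age": 10, "ses": 0.5},
]

model_vars = ["score", "age", "ses"]
counts = complete_cases_by_group(rows, "region", model_vars)
flagged = groups_with_too_few_cases(rows, "region", model_vars, n_params=3)
print(counts)
print(flagged)
```

Any subdivision that shows up in the flagged list cannot support the
imputation equation as specified.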

You can force a value to be positive by imputing the log of that
variable, since exponentiating an imputed log value always gives a
positive number. You might want to do that with passive imputation if
you want to enter income instead of log income as a predictor in the
other equations. However, given the weird results you are getting, I
would focus for now on getting the model right.
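
To show why the log trick keeps imputed values positive, here is a
simplified sketch in Python on invented data. It is not what -ice- does
internally: it fits one regression on the complete cases and adds a
normal residual draw, leaving out the draw of the regression parameters
that a proper multiple imputation also needs. The names educ, income,
fit_line, and impute are hypothetical.

```python
import math
import random
from statistics import mean


def fit_line(x, y):
    """Simple linear regression: returns intercept, slope, residual SD."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    sigma = math.sqrt(sum(r * r for r in resid) / (len(x) - 2))
    return intercept, slope, sigma


def impute(x_obs, y_obs, x_mis, rng, log_scale):
    """Draw imputations for y at x_mis, on the raw or the log scale."""
    target = [math.log(v) for v in y_obs] if log_scale else list(y_obs)
    intercept, slope, sigma = fit_line(x_obs, target)
    draws = [intercept + slope * x + rng.gauss(0.0, sigma) for x in x_mis]
    # Back-transform: exp() of any finite draw is strictly positive.
    return [math.exp(d) for d in draws] if log_scale else draws


# Hypothetical complete cases and predictor values for the missing cases.
educ = [8, 10, 12, 12, 14, 16, 16, 18]
income = [4.0, 9.0, 15.0, 11.0, 30.0, 42.0, 75.0, 120.0]
educ_missing = [6, 7, 8, 9]

rng = random.Random(2007)  # fixed seed so the sketch is reproducible
imputed_log = impute(educ, income, educ_missing, rng, log_scale=True)
imputed_raw = impute(educ, income, educ_missing, rng, log_scale=False)
print("via log income:", [round(v, 2) for v in imputed_log])
print("raw income:    ", [round(v, 2) for v in imputed_raw])
```

The values imputed through the log are positive by construction, while
the raw-scale draws can fall below zero for low values of the predictor.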

Hope this helps,

Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------


		


