



st: multiple imputation in steps??


From   "Ploutz-Snyder, Robert (JSC-SK)[USRA]" <robert.ploutz-snyder-1@nasa.gov>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: multiple imputation in steps??
Date   Wed, 7 Apr 2010 14:41:01 -0500

Colleagues;

I have a dataset with some missing data and am considering mi to get more robust modeling, but I can't quite figure out the right methodology for my situation.  I posted a question to Statalist earlier (no reply), but this is a different question.

Consider a dataset where x1-x5 are separate but correlated variables that measure "factor1," and further that x6-x10 all tap into "factor2."

Using mi impute in a very general sense, I could simply impute all missing items in one step by using

mi impute x1-x10

But that ignores the shared variance structure of the factors (nothing on the right side of the impute equation).
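For reference, a runnable form of that one-step imputation might look like the sketch below. It assumes the multivariate normal method (mvn, the method used later in this post); the number of imputations and the seed are arbitrary illustrative choices, not values from the original message. Note that under mvn the items are modeled jointly with an unrestricted covariance matrix, so their correlations do inform the imputations even with nothing to the right of the equals sign.

```stata
* Minimal sketch, assuming x1-x10 are the only variables with missing values
* and that a multivariate normal model is acceptable for them.
mi set wide                      // declare the data as mi data (wide style)
mi register imputed x1-x10       // mark the items as variables to be imputed

* add(20) and rseed(12345) are arbitrary choices for illustration
mi impute mvn x1-x10, add(20) rseed(12345)
```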

Instead, I would like to generate factor scores prior to imputation, which will result in factor1 and factor2 scores for subjects who are missing no data.  Then the second step would be to use mi impute to impute the FACTOR scores from the items loading on the factors, so that the factor loading structure is preserved in the imputed data.  This assumes that the factor structure for subjects with any missing data matches the factor structure for complete subjects, but I'm willing to assume that.
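The factor-score step described above could be sketched as follows. This assumes one factor per item block and default (regression) scoring; those settings are illustrative assumptions, not details given in the post.

```stata
* Sketch: one-factor solution per item block, run before any imputation.
* factor uses only observations with complete data on the listed items.
factor x1-x5, factors(1)
predict factor1                  // score is missing if any of x1-x5 is missing

factor x6-x10, factors(1)
predict factor2                  // score is missing if any of x6-x10 is missing

* How many subjects lack each score
count if missing(factor1)
count if missing(factor2)
```

Because a score is missing exactly when at least one of its items is missing, the items are themselves incomplete for every subject whose score needs imputing.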

So following the factor analysis and predict statements, I could use

mi impute mvn factor1=x1-x5

...and then separately 

mi impute mvn factor2=x6-x10

But this results in different samples in the m>0 imputations, and I can't use mi estimate commands.  


What is the solution here?  How can we use mi impute in a way that preserves the underlying covariance structure among the data (i.e., allows different modeling of missing data based on what we know about the data a priori) but still lets us use all of the m>0 imputations in estimation commands following imputation?
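One possible arrangement, offered only as a sketch, is to impute all ten items in a single call (so every imputation shares one sample) and then derive the composite scores within each imputation. The sketch swaps in simple row means for the factor scores, which is a plainly different scoring rule from the one described above, and the outcome variable y and the regression are hypothetical, included only to show mi estimate running on the result.

```stata
* Sketch under stated assumptions: row means stand in for factor scores,
* and y is a hypothetical, fully observed outcome.
mi set wide
mi register imputed x1-x10

* A single imputation call gives one consistent set of m>0 imputations
mi impute mvn x1-x10, add(20) rseed(12345)

* Passive variables are recomputed within each imputation
mi passive: egen factor1 = rowmean(x1-x5)
mi passive: egen factor2 = rowmean(x6-x10)

* All imputations can now be combined in one estimation command
mi estimate: regress y factor1 factor2
```

Deriving the scores after imputation keeps them consistent with the imputed items in every dataset, at the cost of not using the estimated loadings.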


Thanks in advance.
Rob
