

From: "JVerkuilen (Gmail)" <jvverkuilen@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Multiple Imputation in Longitudinal Multilevel Model
Date: Wed, 6 Mar 2013 10:00:26 -0500

On Wed, Mar 6, 2013 at 9:32 AM, Stas Kolenikov <skolenik@gmail.com> wrote:

> 1. Are read1-read3 and math1-math3 three measurements taken at the
> same time for a given individual, or measurements taken over three
> periods? If the former, then your model is "flat", as it does not
> recognize and utilize the longitudinal/multilevel nature of the data.

Yes, you need to put that in, which can be quite challenging. Usually you need to add in some independent variables to capture the time and panel trend aspects. If you can afford to add in dummies for each group (i.e., fixed effects) it's worth it, and for the time structure a linear, quadratic and cubic term, or some kind of regression spline structure, is also worth considering.

> 2. Once you've done -ice-, don't touch anything (let alone anything as
> drastic as -drop if _mj==0-), and use -mi: estimate- for everything. I
> don't really know how well either -mi- or -ice- go with -reshape-, but
> I suspect that if not done properly, it will screw up the delicate
> mechanics of -mi-.

And given that you can use chained equations in MI, I'd really suggest doing things with MI directly, not -ice-. Nothing bad about -ice-, but being able to run entirely in MI is likely to be much easier.

> 3. I agree with Jay that 4 imputations are woefully insufficient. I
> have heard the arguments that you don't see much Monte Carlo
> variability beyond 5 imputations, but I can put two arguments in favor
> of a much greater number, like M=50: first, you don't explore the
> multivariate space of missing data enough (M=5 may be OK for a
> univariate mean, but I can't see how it can work for a 30-dimensional
> space), and second, I want my minimum degrees of freedom to be greater
> than the nominal sample size, so that the limitation on the accuracy
> really comes from the data rather than the computer.

The original argument came from Don Rubin doing some calculations on univariate means and OLS regression coefficients. It really doesn't extend past that. Kenward & Carpenter did some work on this suggesting that you should have many more imputations. This is discussed in the MI manual, p. 5, with citations. But it depends on what you want to know: for a univariate mean it's no big deal and you can get away with a small number of imputations, whereas if you're doing logistic regression on relatively rare events you need to have many more.

> 4. If you are bringing additional variables to the -xtmixed- model,
> you would probably have been better off using these variables in
> imputation. You had a reason to believe that they affected the
> response, and for that same reason they should be in the imputation
> model.

I'll go one step further: The imputation model needs to be more comprehensive than the analysis model.
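
For readers who want to see how the advice in points 2 through 4 might fit together, here is a rough sketch of a workflow that stays entirely inside -mi-. It is only an illustration under stated assumptions, not the original poster's code: the variable names id, ses and aux are hypothetical, the data are assumed to start in wide form with one row per person, and linear regression is assumed to be a reasonable imputation method for the read and math scores.

```stata
* Sketch only. Hypothetical variables: id, ses, aux, read1-read3, math1-math3.
* Assumes the data start in wide form, one row per person.
mi set flong
mi register imputed read1 read2 read3 math1 math2 math3
mi register regular ses aux

* Impute in wide form so that each wave can inform the others.
* ses also appears in the analysis model; aux is an extra (assumed)
* predictor used only here, so the imputation model is the richer one.
mi impute chained (regress) read1 read2 read3 math1 math2 math3 = ses aux, ///
    add(50) rseed(20130306)

* Reshape with the mi-aware command rather than plain -reshape-.
mi reshape long read math, i(id) j(wave)

* Fit the growth model on all imputations and combine the results.
mi estimate: xtmixed read wave ses || id: wave
```

Nothing is dropped or edited by hand between imputation and estimation, and the number of imputations is set to 50 as suggested in point 3. Whether a random slope on wave is appropriate, and which auxiliary variables belong in the imputation model, depends on the actual data.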

