--- On Sun, 15/2/09, Paul Allison <allison@soc.upenn.edu> wrote:
> Graham is right. In multiple imputation, interactions should
> be imputed as though they are additional variables, not
> constructed by multiplying imputed values. The same is true if
> you have x and x^2 in a model. The x^2 term should be imputed
> just like any other variable, not constructed by squaring
> the imputed values of x. While this principle may seem
> counterintuitive, it is easily demonstrated by simulation that
> the more "natural" way to do it produces biased estimates.

If Paul had said that one should not impute y, x, and z and then,
after the imputed datasets have been created, create x2 by squaring
x, I would have agreed immediately. However, this is not how the
-passive()- option works. As I understand it, the -passive()- option
implies that while the other variables are imputed, the full model,
including interactions, polynomial terms, etc., is used. Only when,
for example, a squared term has to be imputed during the Gibbs
sampler is the knowledge about the deterministic relationship
between the variables used. So the imputation model does include the
non-linearity / interaction terms, but it also respects the
deterministic relationship between a variable and its interaction
terms, polynomial terms, etc. I therefore expected it to be superior
to a model that adds noise where none exists (e.g. in the
relationship between x and x squared).

So I took up Paul's challenge and created the simulation below. The
procedure proposed by Paul does seem to result in biased estimates,
but the -passive()- option seems to perform worse. This is
unexpected for me, as the imputation model is exactly correct for
this data (I created the data that way), and in the past I got
simulations showing unbiased estimates from -ice- models, so I
expected that at least one of the two would be unbiased. This
suggests to me that there is an error in my simulation, but I can't
find it.

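
For reference, the data-generating process and missingness mechanism
coded in the simulation below can be written out as follows. This is
only a sketch read off the Stata code; the symbols \varepsilon, R_x
and R_w are notation introduced here (R_x = 1 means x, and with it
x2, is set to missing; R_w = 1 means w is set to missing), and
rnormal(0,.5) is taken to mean a standard deviation of .5.

```latex
\begin{align*}
(w, x, z)^\top &\sim N(0, C), \qquad
C = \begin{pmatrix} 1 & .5 & .5 \\ .5 & 1 & .5 \\ .5 & .5 & 1 \end{pmatrix}, \\
y &= x + x^2 + z + w + \varepsilon, \qquad
\varepsilon \sim N(0,\, 0.5^2), \\
\Pr(R_x = 1 \mid y, z) &= \operatorname{logit}^{-1}(y - z - 3), \\
\Pr(R_w = 1 \mid y, z) &= \operatorname{logit}^{-1}(y + z - 4).
\end{align*}
```

Under this reading the true coefficient on x is 1, which is the value
marked by xline(1) in the final graph, and both missingness
probabilities depend only on y and z, which are never set to missing.
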
I also sent this message to Patrick Royston, who is to the best of my
knowledge on the statalist. Maybe he can spot the error.

-- Maarten

*---------------------- begin simulation ----------------------
capture program drop sim
program define sim, rclass
    drop _all

    * w, x and z: standard normal, all pairwise correlations .5
    matrix C = (1, .5, .5 \ .5, 1, .5 \ .5, .5, 1)
    drawnorm w x z, n(400) corr(C)
    gen x2 = x^2
    gen y = x + x2 + z + w + rnormal(0,.5)

    * missingness in x (and so in x2) and in w depends on y and z
    replace x = . if runiform() < invlogit(y - z - 3)
    replace x2 = . if x == .
    replace w = . if runiform() < invlogit(y + z - 4)

    * complete case analysis
    reg y x x2 z w
    return scalar cc = _b[x]

    * x2 imputed just like any other variable
    preserve
    ice y x x2 z w, m(5) clear
    micombine reg y x x2 z w
    return scalar full = _b[x]
    restore

    * x2 passively imputed as the square of x
    ice y x x2 z w, m(5) clear passive(x2:x^2)
    micombine reg y x x2 z w
    return scalar pas = _b[x]
end

* single test run; -exit- ends the do-file here, so comment out
* these two lines to run the full simulation
sim
exit

simulate cc=r(cc) full=r(full) pas=r(pas), reps(1000) : sim

* the true coefficient of x is 1
twoway kdensity cc || kdensity full || kdensity pas , xline(1)
*--------------------- end simulation ---------------------------

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

> -----------------------------------------------------------------
> Paul D. Allison
> Department of Sociology
> University of Pennsylvania

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

