
From: Maarten buis <[email protected]>
To: stata list <[email protected]>
Subject: st: ice command question about interactions
Date: Fri, 15 Jan 2010 10:08:46 -0800 (PST)

A while ago Alan Acock asked a question on interactions in an imputation model: <http://www.stata.com/statalist/archive/2009-02/msg00602.html>. The main issue was that there is an emerging literature claiming that one should not use the -passive()- option in -ice- (see -ssc d ice-), but should instead create interactions, squares, etc. in the unimputed data and impute these as if they were normal variables:

John Graham (2009) "Missing Data Analysis: Making it Work in the Real World", Annual Review of Psychology, 60:549-576.

Paul von Hippel (2009) "How to impute interactions, squares, and other transformed variables", Sociological Methodology, 39:265-291.

Alan wrote:
> ice allows us to passively estimate an interaction term by estimating
> the main effects and then multiplying these together so the interaction
> of X&Y will be the imputed X times the imputed Y. This seems necessary
> to preserve the interpretation of the interaction.
>
> Graham says we need to include the interaction term. "The problem with
> excluding such variables from the imputation model is that all
> imputation is done under the assumption that the correlation is r = 0
> between the omitted variable and all other variables in the imputation."
> This is the same argument that Graham makes for imputing the dependent
> variable in the imputation (a sensible thing to do).
>
> I understand the importance of including the dependent variable when
> doing multiple imputations, and see how Graham could apply this to the
> interaction term, but it makes no sense to me to have an interaction of
> X and Y not equal X*Y.

Paul Allison responded:
> Graham is right. In multiple imputation, interactions should be imputed
> as though they are additional variables, not constructed by multiplying
> imputed values. The same is true if you have x and x^2 in a model. The
> x^2 term should be imputed just like any other variable, not constructed
> by squaring the imputed values of x. While this principle may seem
> counterintuitive, it is easily demonstrated by simulation that the more
> "natural" way to do it produces biased estimates.

I was skeptical and tried to do that simulation. I did not have much time, and I did not get the simulation right. I still posted it, in case my first attempt at a solution might be helpful to someone.

Right now I am about to start a new imputation project, so I thought it was time to take this subject on again. I rewrote the simulation and ran it. This time the results seem more reasonable. They support the claim by von Hippel and Graham: -passive()- does seem to introduce some bias, and first transforming and then imputing reduces it considerably. The true interaction effect was 1. With -passive()- the estimate had a bias of -.14 (MC standard error = .0007), while without -passive()- the bias shrank to -.007 (MC standard error = .0002).

To run this simulation one needs:
1) a couple of hours,
2) -ice- (-ssc install ice-),
3) -mim- (-ssc install mim-),
4) -simsum- (see this talk at the last UK Stata Users' meeting: <http://ideas.repec.org/p/boc/usug09/08.html>)

*-------------------- begin simulation -----------------------
set more off
program drop _all
program define sim, rclass
    drop _all

    // draw three correlated standard-normal covariates
    matrix C = (1, .25, .25 \ .25, 1, .25 \ .25, .25, 1)
    drawnorm x1 x2 x3, n(250) corr(C)

    // the interaction is computed on the complete data,
    // before any values are set to missing
    gen x12 = x1*x2
    gen y = x1 + x2 + x3 + x12 + .25*rnormal()

    // missingness in x1 and x2 depends on y and x3
    replace x1 = . if runiform() < invlogit(-2 - y + x3)
    replace x2 = . if runiform() < invlogit(-2 - y + x3)

    // method 1: passive imputation of the interaction
    ice y x1 x2 x3 x12, m(5) clear passive(x12:x1*x2)
    mim, storebv : reg y x1 x2 x3 x12
    return scalar b    = _b[x1]
    return scalar se   = _se[x1]
    return scalar b12  = _b[x12]
    return scalar se12 = _se[x12]

    // go back to the original (unimputed) data
    keep if _mj == 0
    drop _m*

    // method 2: impute the interaction as just another variable
    ice y x1 x2 x3 x12, m(5) clear
    mim, storebv : reg y x1 x2 x3 x12
    return scalar hb    = _b[x1]
    return scalar hse   = _se[x1]
    return scalar hb12  = _b[x12]
    return scalar hse12 = _se[x12]
end

timer clear 1
timer on 1
simulate b=r(b) se=r(se) b12=r(b12) se12=r(se12)       ///
         hb=r(hb) hse=r(hse) hb12=r(hb12) hse12=r(hse12), ///
         reps(10000) : sim
timer off 1
timer list

// main effect of x1, then the interaction effect
simsum b hb, true(1) se(se hse) mcse
simsum b12 hb12, true(1) se(se12 hse12) mcse
*------------------------- end simulation ------------------------
(For more on how to use examples I sent to statalist see:
http://www.maartenbuis.nl/stata/exampleFAQ.html )

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

