
Re: st: Re: ice command question about interactions

From   Maarten Buis
Subject   Re: st: Re: ice command question about interactions
Date   Sun, 15 Feb 2009 22:17:46 +0000 (GMT)

--- On Sun, 15/2/09, Paul Allison wrote:
> Graham is right. In multiple imputation, interactions should
> be imputed as though they are additional variables, not
> constructed by multiplying imputed values. The same is true if
> you have x and x^2 in a model. The x^2 term should be imputed
> just like any other variable, not constructed by squaring
> the imputed values of x. While this principle may seem
> counterintuitive, it is easily demonstrated by simulation that
> the more "natural" way to do it produces biased estimates.  

If Paul had said that one should not impute y, x, and z and then,
after the imputed datasets have been created, create x2 by squaring
x, I would have agreed immediately. However, this is not how the
-passive()- option works. As I understand it, the -passive()-
option implies that the full model, including interactions,
polynomial terms, etc., is used while the other variables are
imputed. Only when, for example, a squared term is itself imputed
during the Gibbs sampling is the knowledge about the deterministic
relationship between the variables used. So the imputation model
does include the non-linearity / interaction terms, but it also
respects the deterministic relationship between those terms and
the variables they are built from. I therefore expected it to be
superior to a model that adds noise where none exists (e.g. in the
relationship between x and x squared). So I took up Paul's
challenge and created the simulation below.

The procedure proposed by Paul does seem to result in biased
estimates, but the -passive()- option seems to perform worse.
This is unexpected to me, as the imputation model is exactly
correct for these data (I created the data that way), and in
the past I got simulations showing unbiased estimates from
-ice- models, so I expected that at least one of the two would
be unbiased. This suggests to me that there is an error in my
simulation, but I can't find it. I also sent this message to
Patrick Royston, who is, to the best of my knowledge, on
Statalist. Maybe he can spot the error.

-- Maarten

*---------------------- begin simulation ----------------------
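capture program drop sim
program define sim, rclass
	* simulate 400 observations of correlated normals w, x, z
	drop _all
	matrix C = (1, .5, .5 \ .5, 1, .5 \ .5, .5, 1)
	drawnorm w x z, n(400) corr(C)
	gen x2 = x^2
	gen y = x + x2 + z + w + rnormal(0,.5)
	* make x (and hence x2) and w missing, depending on y and z
	replace x = . if runiform() < invlogit(y - z - 3)
	replace x2 = . if x == .
	replace w = . if runiform() < invlogit(y + z - 4)
	* complete-case analysis
	reg y x x2 z w
	return scalar cc = _b[x]
	* impute x2 as just another variable
	ice y x x2 z w, m(5) clear
	micombine reg y x x2 z w
	return scalar full = _b[x]
	* impute x2 passively as x^2; this call runs on the data left
	* in memory by the first -ice ..., clear- call
	ice y x x2 z w, m(5) clear passive(x2:x^2)
	micombine reg y x x2 z w
	return scalar pas = _b[x]
end
simulate cc=r(cc) full=r(full) pas=r(pas), reps(1000) : sim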
twoway kdensity cc || kdensity full || kdensity pas , xline(1)
*--------------------- end simulation ---------------------------
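
For readers who want the set-up in one place, the data-generating
process that the -sim- program encodes can be written out as
follows. This is only a transcription of the code above, assuming
-drawnorm-'s defaults of zero means and unit standard deviations
and reading the second argument of -rnormal()- as a standard
deviation; the symbol \varepsilon is notation introduced here and
does not appear in the program.

```latex
\begin{align*}
(w, x, z) &\sim N(0, C), \qquad
  C = \begin{pmatrix} 1 & .5 & .5 \\ .5 & 1 & .5 \\ .5 & .5 & 1 \end{pmatrix} \\
y &= x + x^2 + z + w + \varepsilon, \qquad \varepsilon \sim N(0, 0.5^2) \\
\Pr(x \text{ missing}) &= \operatorname{logit}^{-1}(y - z - 3) \\
\Pr(w \text{ missing}) &= \operatorname{logit}^{-1}(y + z - 4)
\end{align*}
```

The true coefficient on x is therefore 1, which is the value that
-xline(1)- marks in the graph. x2 is set to missing whenever x is,
and both missingness probabilities depend only on y and z, which
are never set to missing.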

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715

> -----------------------------------------------------------------
> Paul D. Allison
> Department of Sociology
> University of Pennsylvania

