Re: st: mi passive - When is it executed?

From   daniel klein
Subject   Re: st: mi passive - When is it executed?
Date   Tue, 12 Jun 2012 01:42:45 +0200


I might be wrong here, but I think your syntax does not do what you want it to do.

First, I understand you want to do passive imputation (Royston, 2005).
That is, you do not want to create age squared (and higher-order
terms) in your dataset and impute these terms as "just another
variable" (Von Hippel, 2009). Yet I think your code is closer to the
latter approach. It generates three passive variables, and does so
once only. The variables are created before the imputation algorithm
runs.

The way I understand the documentation of -mi passive-, you want to
use it after you have imputed your variables. To perform passive
imputation with -mi impute chained-, I guess you want to change

mi impute chained (reg) ageimp (reg, include(age2imp age3imp age4imp))
lnearnimp [...]

to something like

mi impute chained (reg) ageimp (reg,
include((ageimp^2)(ageimp^3)(ageimp^4))) lnearnimp [...]

Also see [MI] Example 6, p. 156.
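
To make the suggestion concrete, here is a minimal sketch of the whole
sequence, meant to be run as a do-file. It is only an illustration: the toy
data, the seed, and the final analysis model are assumptions made up for the
example, and the other imputation variables elided above ([...]) are left
out. As I read the documentation, expressions given in include() are
evaluated from the current values of ageimp during the chained iterations,
while -mi passive- is used once, after the imputations exist.

```stata
* Toy data (hypothetical): age and log earnings, each about 20% missing
clear
set seed 12345
set obs 200
generate ageimp = 20 + 40*runiform()
generate lnearnimp = 1 + 0.1*ageimp - 0.001*ageimp^2 + rnormal(0, 0.3)
replace ageimp = . if runiform() < 0.2
replace lnearnimp = . if runiform() < 0.2

* Declare the mi style and register the variables to be imputed
mi set wide
mi register imputed ageimp lnearnimp

* The age polynomial enters the earnings equation as expressions,
* not as variables created before imputation
mi impute chained (regress) ageimp ///
    (regress, include((ageimp^2) (ageimp^3) (ageimp^4))) lnearnimp, ///
    add(10) rseed(12345)

* After imputation, create the passive variable needed for the analysis
mi passive: generate age2imp = ageimp^2

* Hypothetical analysis model using the passive variable
mi estimate: regress lnearnimp ageimp age2imp
```

With this ordering there are no age2imp, age3imp, or age4imp variables in
the dataset until the imputations have been added, so nothing can be left
over from the pre-imputation values of ageimp.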

Others might have a deeper understanding.


Royston, P. (2005). Multiple imputation of missing values: update.
Stata Journal 5 (2), 188-201.

Von Hippel, P. T. (2009). How to impute interactions, squares, and
other transformed variables. Sociological Methodology 39, 265–291.

I am wondering when, during the imputation step, Stata actually
generates a passive variable. I am currently using Stata 12's "mi
impute chained" to execute imputation of about 8 different variables,
including age and earnings. I want to use a quartic in age when
imputing earnings. [...] In both the burn-in and imputation steps, I
want Stata to first impute the value of age where it is missing (which
it does automatically). Then I want it to generate the age quartic
based on the imputed values. Finally, I want it to use the age quartic
to impute log earnings (which it also does, but I'm not sure if the
quartic is using the latest imputed values). It's not clear to me from
reading the manual that Stata is smart enough to know when I want
these things calculated, though. So does anyone know when a passive
variable is generated during the imputation process? Is it doing it
before it imputes earnings? Or at the very end?

A much-abbreviated version of the code I have a question about goes as follows:

mi set wide
mi register imputed ageimp lnearnimp
mi passive: gen age2imp = ageimp*ageimp
mi passive: gen age3imp = age2imp*ageimp
mi passive: gen age4imp = age3imp*ageimp
mi impute chained (reg) ageimp (reg, include(age2imp age3imp age4imp))
lnearnimp, [some options here] add(10)
