Re: st: mi impute, conditional()

From   A Loumiotis
Subject   Re: st: mi impute, conditional()
Date   Fri, 25 Oct 2013 13:15:27 +0300

I think there are two cases to consider here.  Your question concerns
case 2, although the specific code you provided is more relevant to
case 1.

Case 1: Restriction on a complete variable

In this case, an equation-specific -if- restriction should be used, not
an equation-specific -conditional()- option.  In addition, the variable
being imputed does not need to be constant outside of the restricted
sample.  So for the example that you provided you could use the -if-
restriction instead, and there will be no problem with rep78 not being
constant outside the restricted sample.  You might, though, need to use
the -force- option if rep78 has missing values outside the restricted
sample and rep78 is used as a covariate in other prediction equations.
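
A minimal sketch of the -if- approach for the auto example follows.  It
is an illustration under assumptions, not a tested recipe: knn(5) is an
arbitrary choice, and foreign is dropped from the covariate list because
it is constant within the restricted sample.  As far as I understand,
missing rep78 values for foreign cars fall outside the restricted sample
and are left unimputed.

```stata
* Sketch only: equation-specific -if- in place of conditional()
sysuse auto, clear
set seed 10203
mi set wide
mi register imputed rep78
mi register regular price weight length foreign

* Fit and apply the pmm model on domestic cars only.
* foreign is omitted as a covariate: it is constant (0) in this subsample.
* knn(5) is an arbitrary number of nearest neighbors chosen for illustration.
mi impute chained (pmm if foreign==0, knn(5)) rep78 = price weight length, add(5)
```

Note that this restricts both the estimation and the imputation to
domestic cars, so it does not by itself fill in missing values for
foreign cars from a domestic-only model.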

Case 2: Restriction on an incomplete variable

Here we need to use an equation-specific -conditional()- option, since
the restrictions are defined in terms of variables that are themselves
incomplete.  Also, the missing values of the conditioning variables
should be nested within the missing values of the conditional variable
being imputed.  Imputation of missing values of the conditional variable
occurs only if the restriction is satisfied based on non-missing
values or imputed values of the conditioning variables.  The question
I think you are posing is what to do with the missing values of the
conditional variable whenever the conditioning variables are imputed
in such a way that the restriction is not satisfied.  Because -mi-
requires the conditional variable to take a constant value outside of
the restriction, -mi- sets these values to that constant.

What are the alternatives?  Suppose we make -mi, conditional()- work
even if the conditional variable is not constant outside the
restriction.  How would we then want -mi- to handle the missing values
of the conditional variable whenever the conditioning variables are
imputed in such a way that the restriction is not satisfied?  I see
two alternatives: (a) keep them missing, or (b) impute them.  If we do
decide to impute them, why would we need the condition in the first
place?

Sometimes what we really want is not a conditional prediction equation
but a prediction equation in which the range of imputed values differs
across conditions.  This is possible when using -intreg- as the
imputation method.  It would be convenient if the same became possible
when imputing a nominal variable, especially when imputing panel survey
data.  This would help resolve some of the issues discussed in


On Thu, Oct 24, 2013 at 10:40 PM, Stas Kolenikov wrote:
> I have trouble understanding how to specify the -conditional()-
> option, and why it has the limitations that it does. I want to impute
> missing values for all observations in the data set, but only use a
> part of the data set to calibrate the imputation model on. I thought I
> would achieve this with -conditional()- options, but ran into
> limitations that I don't understand.
> Suppose I want to impute the repair record using the model only on the
> domestic cars, as prices for the foreign cars work in a way that is
> different from what I am interested in (however, I am happy to keep
> the existing values, if there are any, for foreign cars, and impute
> the missing values, but all using only the model based on the domestic
> cars). So I am trying
> sysuse auto, clear
> set seed 10203
> mi set wide
> mi register imputed rep78
> mi register regular price weight length foreign
> mi impute chained (pmm, cond( foreign==0 ) ) rep78 = price weight length foreign, add(5)
> I get an error:
> conditional(): imputation variable not constant outside conditional sample;
>     rep78 is not constant outside the subset identified by (foreign==0) within the
>     imputation sample
>  -- above applies to specification (pmm , cond( foreign==0 )) rep78 = price weight length
>     foreign
> Why does -mi impute- bother checking what's going on with the
> complement of the conditional sample? I understand the medical
> examples that the manual gives (I agree that it does not make sense to
> ask males about the number of pregnancies, or non-smokers about the
> number of cigarettes), but that's not a check that would be relevant
> for every situation. Insisting on a single non-missing value outside
> of the conditional sample (a version of the monotone missing data
> pattern, I guess) is extremely restrictive on the part of -mi
> impute-... which is supposed to be very general, right? In my example,
> I don't see any reason why my data should have this restrictive set
> up. With -webuse nlswork-, I may have been using the salary equation
> from the 1970s to calibrate the imputation model, and apply the model
> to the data from 1980s, which may have some real and some missing data
> that would not fit the expectations of the -conditional()- approach
> here.
> Is there an override to turn off this check for
> out-of-conditional-sample constant value?
> -- Stas Kolenikov, PhD, PStat (ASA, SSC)
> -- Senior Survey Statistician, Abt SRBI
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer