Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: A Loumiotis <[email protected]>
To: [email protected]
Subject: Re: st: mi impute, conditional()
Date: Fri, 25 Oct 2013 13:15:27 +0300

I think there are two cases to consider here. Your question concerns case 2, although the specific code you provided is more relevant to case 1.

Case 1: Restriction on a complete variable

In this case, an equation-specific -if- restriction should be used rather than an equation-specific -conditional()- option. In addition, the variable being imputed does not need to be constant outside the restriction. So for the example that you provided you could use the -if- restriction instead, and there will be no problem with the imputed variable (rep78 in your code) not being constant outside the conditioned sample. You might, though, need the -force- option if that variable has missing values outside the conditioned sample and is used as a covariate in other prediction equations.

Case 2: Restriction on an incomplete variable

Here we need an equation-specific -conditional()- option, since the restrictions are defined in terms of variables that are themselves incomplete. Also, the missing values of the conditioning variables should be nested within the missing values of the conditional variable being imputed. Missing values of the conditional variable are imputed only if the restriction is satisfied, based on the non-missing or imputed values of the conditioning variables.

The question I think you are posing is what to do with the missing values of the conditional variable whenever the conditioning variables are imputed in such a way that the restriction is not satisfied. Since -mi- works only if the conditional variable takes a constant value outside the restriction, -mi- sets these values to that constant.

What are the alternatives? Suppose we make -mi, conditional()- work even if the conditional variable is not constant outside the restriction. How would we then want -mi- to handle the missing values of the conditional variable whenever the conditioning variables are imputed in such a way that the restriction is not satisfied?
I see two alternatives: (a) keep them missing, or (b) impute them. If we do decide to impute them, why would we need the condition in the first place?

Sometimes what we really want is not a conditional prediction equation but a prediction equation in which the range of imputed values differs across conditions. This is possible when using -intreg- as the imputation method. It would be convenient if this were also possible when predicting a nominal variable, especially when imputing panel survey data. This would help with some of the issues discussed in
http://www.stata.com/statalist/archive/2013-10/msg00062.html

Antonis

On Thu, Oct 24, 2013 at 10:40 PM, Stas Kolenikov <[email protected]> wrote:
> I have trouble understanding how to specify the -conditional()-
> option, and why it has the limitations that it does. I want to impute
> missing values for all observations in the data set, but only use a
> part of the data set to calibrate the imputation model on. I thought I
> would achieve this with -conditional()- options, but ran into
> limitations that I don't understand.
>
> Suppose I want to impute the repair record using the model only on the
> domestic cars, as prices for the foreign cars work in a way that
> different from what I am interested in (however, I am happy to keep
> the existing values, if there are any, for foreign cars, and impute
> the missing values, but all using only the model based on the domestic
> cars).
> So I am trying
>
> sysuse auto, clear
> set seed 10203
> mi set wide
> mi register imputed rep78
> mi register regular price weight length foreign
> mi impute chained (pmm, cond( foreign==0 ) ) rep78 = price weight
> length foreign, add(5)
>
> I get an error:
>
> conditional(): imputation variable not constant outside conditional sample;
> rep78 is not constant outside the subset identified by
> (foreign==0) within the
> imputation sample
> -- above applies to specification (pmm , cond( foreign==0 )) rep78 =
> price weight length
> foreign
>
> Why does -mi impute- bother checking what's going on with the
> complement of the conditional sample? I understand the medical
> examples that the manual gives (I agree that it does not make sense to
> ask males about the number of pregnancies, or non-smokers about the
> number of cigarettes), but that's not a check that would be relevant
> for every situation. Insisting on a single non-missing values outside
> of the conditional sample (a version of the monotone missing data
> pattern, I guess) is extremely constricting on behalf of -mi
> impute-... which is supposed to be very general, right? In my example,
> I don't see any reason why my data should have this restrictive set
> up. With -webuse nlswork-, I may have been using the salary equation
> from the 1970s to calibrate the imputation model, and apply the model
> to the data from 1980s, which may have some real and some missing data
> that would not fit the expectations of the -conditional()- approach
> here.
>
> Is there an override to turn off this check for
> out-of-conditional-sample constant value?
>
> -- Stas Kolenikov, PhD, PStat (ASA, SSC)
> -- Senior Survey Statistician, Abt SRBI
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer
> -- http://stas.kolenikov.name
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
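
As a minimal, untested sketch of the Case 1 suggestion in the reply above, the quoted example could be rewritten with an equation-specific -if- in place of -conditional()-. The knn(5) setting is an arbitrary illustrative choice, and dropping foreign from the covariates (it is constant among domestic cars) is an assumption of this sketch rather than something stated in the thread.

```stata
* Sketch only: equation-specific -if- instead of -conditional()-
sysuse auto, clear
set seed 10203
mi set wide
mi register imputed rep78
mi register regular price weight length foreign

* The rep78 equation is restricted to domestic cars (foreign==0).
* knn(5) is an illustrative choice, not taken from the thread.
* foreign is left out of the covariates: it is constant in this subsample.
mi impute chained (pmm if foreign==0, knn(5)) rep78 = price weight length, add(5)
```

With this form, missing rep78 values for foreign cars are expected to stay missing rather than be imputed from the domestic-car model, so it does not by itself deliver the out-of-sample imputation asked for in the original question. As the reply notes, -force- may be needed if the restricted variable is also used as a covariate in other prediction equations.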

**References**: **st: mi impute, conditional()** (*From:* Stas Kolenikov <[email protected]>)
