
From: ymarchenko@stata.com (Yulia Marchenko, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: mi in Stata 11
Date: Fri, 07 Aug 2009 13:54:43 -0500

JIBONAYAN RAYCHAUDHURI <jibonayanrc@yahoo.com> asks if Stata 11's -mi- command
provides imputation methods for panel data:

> Can mi in Stata 11.0 perform imputation over panel data (which has been
> tsset)? Does the data need to be arranged in wide form (from long form)
> before mi can be applied to the data set?

-mi- does not provide imputation methods specifically designed to impute
complex data, such as panel or longitudinal data, complex survey data, and
time-series data. The methods employed by -mi- rely on the iid assumption,
which is violated in these data, and to the best of my knowledge, imputation
methods that relax this assumption have yet to be fully developed.

In some cases, there are ways of using existing iid imputation methods to
impute complex data. For example, longitudinal data can be reshaped to wide
form (one variable for each time period), and then the MVN model can be used
for imputation. In Stata, you can use -mi impute mvn- to do that. Say we have
subjects' weights measured at three time periods: y1, y2, y3. Assuming the
data are already -mi set-, we can type

    . mi register imputed y1 y2 y3     // register variables to be imputed
    . mi impute mvn y1 y2 y3, add(10)  // create 10 imputations

If you now want to reshape your data to long form after imputation, you can
use -mi reshape-:

    . mi reshape long y, i(id) j(time)

where variable 'id' contains observation identifiers and new variable 'time'
will contain the time periods after the data are reshaped. Note that you
should use -mi reshape- rather than -reshape- to reshape _mi_ data.

Similarly, if your data are in long form, you can use -mi reshape- to reshape
them to wide form prior to using -mi impute-.

In the presence of clustering or stratification, missing data can be imputed
conditionally on the design variables, provided there are not too many
clusters or strata.
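
The wide-form workflow described above can be sketched end to end as a short
do-file. This is only an illustration under stated assumptions: the simulated
data, the variable 'u', the seed, and the choice of the mlong style are mine
and are not part of the original question. Because the data are not yet
-mi set- when they are first reshaped, plain -reshape- is used at that step;
-mi reshape- is used once imputations exist.

```stata
* Hypothetical long-form data: 50 subjects, weight y at time = 1, 2, 3
clear
set seed 2009
set obs 50
generate id = _n
generate u = rnormal(0, 8)                  // subject effect (illustrative)
expand 3
bysort id: generate time = _n
generate y = 70 + u + 2*time + rnormal(0, 5)
replace y = . if runiform() < 0.15          // make some weights missing

* Data are not mi set yet, so ordinary -reshape- applies here
reshape wide y, i(id) j(time)

* Declare the mi style, register, and impute under the MVN model
mi set mlong
mi register imputed y1 y2 y3
mi impute mvn y1 y2 y3, add(10) rseed(2009)

* Back to long form; the data are now mi data, so use -mi reshape-
mi reshape long y, i(id) j(time)
mi describe
```

Whether the MVN model is adequate for a given panel remains a statistical
judgment; the sketch only shows the mechanics.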
As an example of conditioning on design variables, if continuous variables x1
and x2 contain missing values and the data are stratified on race, you can
account for stratification by including variable 'race' as a factor variable
in the imputation model:

    . mi register imputed x1 x2
    . mi impute mvn x1 x2 = i.race, add(5)

In the above, I could also have used -mi impute monotone- if I knew that the
pattern of missing data was monotone.

See, for example, Rubin (1987) and Schafer (1997, 29-35, 372-377) for more
information about imputing complex data.

Jibonayan mentions the use of -tsset-, which implies that the data are also
time-series data. I'm not aware of imputation methods applicable to filling
in time-series data.

In reply to Jibonayan's question, Martin Weiss <martin.weiss1@gmx.de> points
out:

> There is an -mi tsset- command as seen in which makes me think there is
> support for imputation for panel data...

Although -mi- does not provide direct methods for imputing panel data,
time-series data, etc., it provides ways of 'mi setting' such data in case
users already have imputations for them from other sources and need to
perform data manipulation.

Jibonayan also asks if a user-written command -levpet- can be used with -mi-:

> Is it possible to combine mi with the levpet method of generating TFP?,i.e.,
> can levpet be applied over imputed data sets and the overall TFP measures,
> thus generated, combined?

Technically, you can use -mi estimate- with -levpet- (or with any other
estimation command outside the list of supported commands in
-help mi estimation-) to obtain combined estimates of the coefficients if you
specify -mi estimate-'s option -cmdok-:

    . mi estimate, cmdok: levpet ...

Statistically, it is your responsibility to verify that multiple imputation
(MI) is applicable for the estimation method used.
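
For reference, the combined estimates that -mi estimate- reports follow the
combination rules of Rubin (1987). A sketch for a scalar parameter is given
below; the notation is my own shorthand rather than anything from the thread:
M is the number of imputations, \hat{Q}_m is the completed-data estimate from
imputation m, and U_m is its estimated variance.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{M}\sum_{m=1}^{M}\hat{Q}_m, &
\bar{U} &= \frac{1}{M}\sum_{m=1}^{M} U_m, \\
B &= \frac{1}{M-1}\sum_{m=1}^{M}\bigl(\hat{Q}_m-\bar{Q}\bigr)^2, &
T &= \bar{U} + \Bigl(1+\frac{1}{M}\Bigr)B .
\end{align*}
```

Here \bar{Q} is the combined point estimate, \bar{U} the within-imputation
variance, B the between-imputation variance, and T the total variance attached
to \bar{Q}.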
In general, as long as approximate (asymptotic) normality holds for an
estimator and the estimated variance of the estimator is a consistent
estimate of its true variability in the complete data, it should be OK to
apply MI combination rules to this estimator.

Now, what Jibonayan really wants are the combined estimates of the
predictions after using -mi estimate- with -levpet-, which -mi- does not
provide. There is no definitive recommendation on how predictions must be
handled within the MI framework. Jibonayan may want to check out the
user-written command -mim- (and -mim: predict-, in particular) for a way of
obtaining predicted values with multiply imputed data.

In any case, Jibonayan should first decide on an appropriate way of imputing
the time-series data before performing the analysis.

References:

Rubin, D. B. 1987. Multiple Imputation for Nonresponse in Surveys.
    New York: Wiley.

Schafer, J. L. 1997. Analysis of Incomplete Multivariate Data.
    Boca Raton, FL: Chapman & Hall/CRC.

-- Yulia
ymarchenko@stata.com

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
