
From: ymarchenko@stata.com (Yulia Marchenko, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: Stata 11 imputation
Date: Tue, 28 Jul 2009 15:46:18 -0500

Peter Lachenbruch <Peter.Lachenbruch@oregonstate.edu> asks:

> In my problem, I have some continuous (maybe 'normal') variables, some
> dichotomous variables, and some categorical variables. It looks like mi
> impute will allow me to impute the normal variables and all others, but when
> I want to impute the categorical variables it looks as if I will re-impute
> the normal ones as categories. I will likely need to continue to use ICE.

In the general case, when the pattern of missingness is arbitrary and the variables are of different types and must be modeled simultaneously, -ice- is the most flexible method. With this in mind, we developed -mi import ice- and -mi export ice-, which let users switch easily between the -mi- and -ice- data formats. This way you can still use -ice- to obtain imputations and then import the imputed data into -mi- to use Stata's new data management and estimation capabilities.

As a side note, there are alternatives to imputation via chained equations (ICE) for multivariate categorical and mixed data that are based on underlying joint models: log-linear and general-location models (Schafer 1997). These methods, however, are very restrictive with respect to the dimensionality of the model, and their fitting algorithms often have difficulty converging. Therefore, ICE remains the most practical choice, albeit a less theoretically justified one.

Hypothetically, one can use -mi impute mvn- in the most general case above and then round the imputed binary and categorical variables (if needed) afterwards. Depending on the number of binary and categorical variables, however, the underlying assumption of joint normality may be suspect. In any case, extensive simulations are needed to investigate the robustness of this method with mixed types of variables.

Let me briefly describe the cases in which one could still use -mi impute- in the presence of different types of variables.

1. When the pattern of missingness is monotone (which I admit is rare in practice), one can use -mi impute monotone- to impute variables of different types simultaneously.

2. If only a few observations destroy an otherwise monotone-missing pattern, one can consider discarding those observations and then proceeding with -mi impute monotone-.

3. If it is reasonable to assume independence among blocks of variables, these blocks can be imputed separately using combinations of -mi impute monotone-, -mi impute mvn-, and any of the available univariate imputation methods (e.g., -mi impute regress-, -mi impute logit-).

In the case of (3), you might type

    . mi impute mvn x1 x2 = x3, add(20)
    . mi impute monotone (mlogit) x4 x5 = x3, replace

Note that the second command does *NOT* replace the imputed values in x1 and x2; its -replace- option applies only to the variables being imputed by that command (x4 and x5) within the 20 existing imputations.

In the above, we assume that (x1, x2) and (x4, x5) are independent conditional on the complete covariate x3. We also assume that x1 and x2 are continuous with arbitrarily missing values, and that x4 and x5 are categorical and follow a monotone-missing pattern.

References:

Schafer, J. L. 1997. Analysis of Incomplete Multivariate Data. Boca Raton, FL: Chapman & Hall/CRC.

-- Yulia
ymarchenko@stata.com

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP