
From: "Gresch,Cornelia" <gresch@mpib-berlin.mpg.de>

To: "Rodrigo Alfaro A." <ralfaro@bcentral.cl>, <statalist@hsphsun2.harvard.edu>

Subject: st: RE: R: MICE-Imputation: dealing with "plausible missings" and multilevel data

Date: Wed, 24 Sep 2008 10:02:15 +0200

Dear all,

first, thanks to Carlo and Rodrigo for your suggestions and comments! I appreciate your feedback very much. I guess I need to explain my current problem in a little more detail: for the imputation of our dataset we want to run MICE (Multiple Imputation by Chained Equations, coined by Stef van Buuren and implemented in Stata by Patrick Royston; see package st0067_1 from http://www.stata-journal.com/software/sj5-2; name of the ado: -ice-), which is currently the most convincing procedure.

Our problem is a bit different from the one Rodrigo described. We have two levels: classes and students. For the current imputation I am only interested in the individual student level (so, at least at the moment, I don't need to impute data on the class level). However, some values on the student level depend on the class the students attend (e.g. grades, which depend on the teacher and lead to different grade point averages between classes, or the students' evaluation of the teacher). Therefore, whenever I analyse these data I have to control for this second, class level (using gllamm or estimating robust standard errors). This also means that I have to control for this second level in the imputation procedure. And this doesn't seem to be possible (at least, as far as I know, not within the -ice- module). Is there any hint for including this second level in the imputation model?

Regarding the "plausible missings" we have the same problem: theoretically we should only impute the missings for observations which are supposed to have a valid value. However, I don't know how to implement this technically with the ice ado. I cannot exclude all groups that should not be imputed on one specific variable from the whole imputation process, nor run separate imputation models (at least not in a way that all other variables I am interested in are still imputed in this whole imputation procedure). Do you have any further suggestions?
Is there any expert on MICE and the ice module who could comment on this?

Thanks very much!
Cornelia

-----Original Message-----
From: Rodrigo Alfaro A. [mailto:ralfaro@bcentral.cl]
Sent: Monday, September 22, 2008 7:11 PM
To: statalist@hsphsun2.harvard.edu
Cc: Gresch,Cornelia
Subject: RE: R: MICE-Imputation: dealing with "plausible missings" and multilevel data

Cornelia,

We have a similar problem but in a different setting: levels and groups.

(1) Level: It seems that you need to define at what level you want to do the imputation. In our case, we are working with household surveys, so we have 2 levels of information: individuals and households. Our goal is an analysis of debt/income/assets, so the household level is the one we chose. Given our experience at the last Stata meeting in the UK, we concluded that we need to do some conditional imputation at the individual level, such as Hot-Deck or maybe some non-proper imputation method. With that in hand we could add that information at the next level (households). For example, missing observations in labor income could be "solved" by some previous conditional imputation in order to provide an estimate of total household income, which is the sum of the individual incomes.

(2) Groups: We concluded that conditions on variables define groups of observations. Again, in our household survey we have cases with and without bank loans. In this case we consider 2 groups when doing the imputation. It should be noted that we cannot impute an amount of bank loans for households without them, and we also cannot assume that having no bank loans implies zero bank loans. In our case, those households do not have that kind of loan for a specific reason: they do not have access to them. As you can see, creating groups is a huge task, in the sense that we need to specify particular kinds of households through combinations of many variables. However, this makes more sense than imputing any value and then putting the true missing back.
I hope this helps you.

Best regards,
Rodrigo.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Carlo Lazzaro
Sent: Friday, September 19, 2008 06:09 a.m.
To: statalist@hsphsun2.harvard.edu
CC: 'Gresch,Cornelia'
Subject: st: R: MICE-Imputation: dealing with "plausible missings" and multilevel data

Dear Cornelia,

hoping to point out a useful reference on dealing with missing data, please find it below:

Briggs A, Clark T, Wolstenholme J and Clarke P. Missing....presumed at random: cost-analysis of incomplete data. Health Economics 2003; 12: 377-392

Kind Regards,
Carlo

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Gresch,Cornelia
Sent: Friday, September 19, 2008 10:10
To: statalist@hsphsun2.harvard.edu
Subject: st: MICE-Imputation: dealing with "plausible missings" and multilevel data

Hi all,

I am currently implementing an imputation model for a large dataset using multiple imputation by the MICE system of chained equations (ado -ice-). Here I have two questions:

First, I would like to know how to deal with missing values which should neither be imputed themselves nor serve (with misleadingly imputed values) as predictors for other variables. For example, we have some questions which should only be answered by respondents with a migration background, to whom we apply a filter for further tracing (e.g. whether they are motivated to go back to their country of origin). In such a case it only makes sense to impute values for respondents with a migration background. This is less problematic, since the meaningless values can just be deleted after the imputation process. However, while MICE is running, the variable with the misleadingly imputed values is also used to impute other variables. And this definitely doesn't make sense.
Has anybody also dealt with this problem and found a handy solution for it? E.g. is there any way to use a conditional "passive option" to replace all values with 99 if the preceding filter variable has a specific value? The only (not convincing) solution I can see would be to exclude the corresponding variables with "plausible missings" as predictors for all other variables (which could be done by using the eqlist option).

Furthermore, we have multilevel data (individual level and school level). Is there any smart way to integrate this upper level into the imputation procedure?

Thanks in advance for any response/suggestion/support
Cornelia

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

DISCLAIMER: The information contained in this email or any attached file is subject to legal privilege pursuant to the laws and regulations applicable to the Central Bank of Chile, and may not be used or disseminated by any person other than its intended recipients. If you have received this transmission in error, please notify the sender immediately by reply to this email address and delete it from your system. The Central Bank of Chile shall not be liable for the accuracy or authenticity of the contents of this message, whether amended, copied, forwarded or disclosed in any form, in whole or in part. Please note that unauthorized use may be penalized in conformity with Chilean law. The Central Bank of Chile communicates its decisions by official releases, and makes them available to the public on its web pages: www.bcentral.cl
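The group idea discussed in the thread (impute a filtered item only for respondents who were supposed to answer it, and never give a value to the others) can be illustrated with a small sketch. This is not -ice- and not MICE: it is a single hot-deck draw in plain Python, shown only to make the logic concrete. The variable names (`migrant`, `return_motivation`), the `NOT_APPLICABLE` marker and the function `hot_deck_within_group` are hypothetical and do not come from the original posts.

```python
import random

# Hypothetical marker for a "plausible missing": the item was filtered out
# for this respondent, so it must never receive an imputed value.
NOT_APPLICABLE = "n/a"


def hot_deck_within_group(rows, var, eligible, seed=0):
    """Return copies of `rows` with missing `var` filled for eligible rows only.

    A missing value is represented by None. Donor values are taken from
    eligible rows with an observed value. Rows that are not eligible are set
    to NOT_APPLICABLE, whatever they contained before.
    """
    rng = random.Random(seed)
    donors = [r[var] for r in rows if eligible(r) and r[var] is not None]
    result = []
    for r in rows:
        new_row = dict(r)  # copy, so the input data stay untouched
        if not eligible(r):
            new_row[var] = NOT_APPLICABLE
        elif r[var] is None:
            if not donors:
                raise ValueError("no observed donor value in the eligible group")
            new_row[var] = rng.choice(donors)
        result.append(new_row)
    return result


# Toy data: only students with a migration background get the follow-up item.
students = [
    {"id": 1, "migrant": True, "return_motivation": 3},
    {"id": 2, "migrant": True, "return_motivation": None},
    {"id": 3, "migrant": False, "return_motivation": None},
    {"id": 4, "migrant": True, "return_motivation": 1},
]

imputed = hot_deck_within_group(
    students, "return_motivation", eligible=lambda r: r["migrant"]
)
for row in imputed:
    print(row)
```

Student 2 receives a value drawn from students 1 and 4, while student 3 keeps a not-applicable code. The sketch does not address the second half of the question, namely keeping the filtered variable out of the prediction equations for other variables.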

**References**:

- **st: R: MICE-Imputation: dealing with "plausible missings" and multilevel data**, *From:* "Carlo Lazzaro" <carlo.lazzaro@tin.it>
- **st: RE: R: MICE-Imputation: dealing with "plausible missings" and multilevel data**, *From:* "Rodrigo Alfaro A." <ralfaro@bcentral.cl>

