Statalist



st: RE: R: MICE-Imputation: dealing with "plausible missings" and multilevel data


From   "Gresch,Cornelia" <gresch@mpib-berlin.mpg.de>
To   "Rodrigo Alfaro A." <ralfaro@bcentral.cl>, <statalist@hsphsun2.harvard.edu>
Subject   st: RE: R: MICE-Imputation: dealing with "plausible missings" and multilevel data
Date   Wed, 24 Sep 2008 10:02:15 +0200

Dear all,

First, thanks to Carlo and Rodrigo for your suggestions and comments! I appreciate your feedback very much.

I think I need to explain my current problem in a little more detail.
For the imputation of our dataset we want to run MICE (Multiple Imputation by Chained Equations, coined by Stef van Buuren and implemented in Stata by Patrick Royston; see package st0067_1 from http://www.stata-journal.com/software/sj5-2; name of the ado: -ice-), which currently seems to us the most convincing procedure.

Our problem is a bit different from the one Rodrigo described. We have two levels: classes and students. For the current imputation I am only interested in the individual student level (so, at least at the moment, I don't need to impute data at the class level). However, some values at the student level depend on the class the students attend, e.g. grades, which depend on the teacher and so lead to different grade point averages between classes, or the students' evaluation of the teacher. Therefore, whenever I analyse these data I have to control for the class level (using gllamm or estimating robust standard errors). This also means that I have to control for this second level in the imputation procedure. And this doesn't seem to be possible (at least, as far as I know, not within the -ice- module). Is there any way to include this second level in the imputation model?
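One approximation sometimes used when the imputation routine has no multilevel option is to add class-level information to the student-level predictors, for example the class mean of the observed values (or a set of class indicators). The sketch below, in plain Python for illustration only, shows that idea; it is not a feature of -ice-, it is not a full multilevel imputation model, and the names add_class_mean, class_id and grade are hypothetical.

```python
from collections import defaultdict

def add_class_mean(students, var, cluster="class_id"):
    """Attach the class mean of the observed values of `var` to each student."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for s in students:
        if s[var] is not None:          # only observed values enter the mean
            sums[s[cluster]] += s[var]
            counts[s[cluster]] += 1
    out = []
    for s in students:
        c = s[cluster]
        mean = sums[c] / counts[c] if counts[c] else None
        out.append({**s, var + "_classmean": mean})
    return out

# Hypothetical toy data: two classes, some grades missing (None).
students = [
    {"class_id": 1, "grade": 2.0},
    {"class_id": 1, "grade": 3.0},
    {"class_id": 1, "grade": None},
    {"class_id": 2, "grade": 4.0},
    {"class_id": 2, "grade": None},
]
augmented = add_class_mean(students, "grade")
```

The new column could then be passed to the imputation model as an ordinary predictor, so that imputed grades reflect differences between classes.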

Regarding the "plausible missings" we have the same problem: in theory we should only impute missing values for observations that are supposed to have a valid value. However, I don't know how to implement this technically with the ice ado. I cannot exclude all groups that should not be imputed on one specific variable from the whole imputation process, nor run separate imputation models (at least not in a way that all the other variables I am interested in are still imputed in the same imputation procedure).

Do you have any further suggestions?
Is there any expert on MICE and the ice-module, who could comment on this?

Thanks very much!

Cornelia 


-----Original Message-----
From: Rodrigo Alfaro A. [mailto:ralfaro@bcentral.cl] 
Sent: Monday, September 22, 2008 7:11 PM
To: statalist@hsphsun2.harvard.edu
Cc: Gresch,Cornelia
Subject: RE: R: MICE-Imputation: dealing with "plausible missings" and multilevel data




Cornelia,

We have a similar problem but in a different setting: levels and groups.

(1) Level: It seems that you need to define at what level you want to do the imputation. In our case we are working with household surveys, so we have two levels of information: individuals and households. Our goal is an analysis of debt/income/assets, so the household level is the one we chose. Given our experience at the last Stata meeting in the UK, we concluded that we need to do some conditional imputation at the individual level, such as hot-deck or perhaps some non-proper imputation method. With that in hand we can carry the information up to the next level (households). For example, missing observations on labor income can be "solved" by a prior conditional imputation in order to provide an estimate of total household income, which is the sum of the individual incomes.

(2) Groups: We concluded that conditions on variables define groups of observations. Again, in our household survey we have cases with and without bank loans, so we consider two groups for the imputation. Note that we cannot impute an amount of bank loans for households without them, and we also cannot assume that having no bank loans means zero bank loans. In our case, those households do not have that kind of loan for a specific reason: they do not have access to them. As you can see, creating groups is a huge task, in the sense that we need to specify particular kinds of household through combinations of many variables. However, this makes more sense than imputing any value and then putting the true missing back.
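Points (1) and (2) can be illustrated together with a minimal sketch in plain Python. It assumes a simple random hot-deck restricted to the group for which the variable applies, followed by summing to the household level. All names (hot_deck_within, household_totals, works, labor_income, hh_id) are hypothetical, and a single draw like this is one non-proper imputation, not a multiple-imputation procedure.

```python
import random
from collections import defaultdict

NOT_APPLICABLE = "n/a"  # structural missing: never imputed, never forced to 0

def hot_deck_within(rows, var, applies, rng):
    """Fill missing `var` from random donors, but only where `applies` is true."""
    donors = [r[var] for r in rows if r[applies] and r[var] is not None]
    filled = []
    for r in rows:
        if not r[applies]:
            value = NOT_APPLICABLE        # outside the group: keep it distinct
        elif r[var] is None:
            value = rng.choice(donors)    # needs at least one donor in the group
        else:
            value = r[var]
        filled.append({**r, var: value})
    return filled

def household_totals(rows, var, hh="hh_id"):
    """Sum `var` over household members, skipping not-applicable entries."""
    totals = defaultdict(float)
    for r in rows:
        if r[var] != NOT_APPLICABLE:
            totals[r[hh]] += r[var]
    return dict(totals)

# Hypothetical toy data: labor income applies only to persons who work.
persons = [
    {"hh_id": 1, "works": True, "labor_income": 1000.0},
    {"hh_id": 1, "works": True, "labor_income": None},
    {"hh_id": 2, "works": True, "labor_income": 1500.0},
    {"hh_id": 2, "works": False, "labor_income": None},
]
filled = hot_deck_within(persons, "labor_income", "works", random.Random(1))
totals = household_totals(filled, "labor_income")
```

Keeping a separate not-applicable marker, instead of a zero, preserves the distinction between "no value by design" and "value of zero" that point (2) insists on.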

I hope this helps you.
Best regards, Rodrigo. 

  	

 

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Carlo Lazzaro
Sent: Friday, September 19, 2008 06:09 a.m.
To: statalist@hsphsun2.harvard.edu
CC: 'Gresch,Cornelia'
Subject: st: R: MICE-Imputation: dealing with "plausible missings" and multilevel data


Dear Cornelia,
I hope the following reference on dealing with missing data is useful:

Briggs A, Clark T, Wolstenholme J and Clarke P. Missing....presumed at random: cost-analysis of incomplete data. Health Economics 2003; 12: 377-392

Kind Regards,

Carlo
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Gresch,Cornelia
Sent: Friday, September 19, 2008 10:10
To: statalist@hsphsun2.harvard.edu
Subject: st: MICE-Imputation: dealing with "plausible missings" and multilevel data

Hi all,


I am currently implementing an imputation model for a large dataset using multiple imputation by chained equations (MICE; ado -ice-).

Here I have two questions: 

First, I would like to know how to deal with missing values which should neither be imputed themselves nor serve (with misleadingly imputed values) as predictors for other variables.

E.g.: We have some questions which should only be answered by respondents with a migration background, selected by a filter question, for further follow-up (e.g. whether they are motivated to go back to their country of origin). In such a case it only makes sense to impute values for respondents with a migration background. This in itself is less problematic, since the meaningless values can simply be deleted after the imputation process.

However, while MICE is running, the variable with the misleadingly imputed values is also used to impute other variables, and this definitely doesn't make sense. Has anybody dealt with this problem and found a handy solution for it?

E.g. is there any way to use a conditional "passive option" to replace all values = 99 in case the preceding filter variable has a specific value? The only (not convincing) solution I can see would be to exclude the variables with "plausible missings" as predictors for all other variables (which could be done using the eqlist option).
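One possible coding, offered here as an assumption and not as something -ice- does by itself, is to let a filtered variable enter other equations as a pair: an applicability flag plus the value, with the value fixed at a constant whenever the filter says the question does not apply. Non-applicable respondents then contribute only the flag, never a misleadingly imputed value. The sketch below is plain Python for illustration; encode_filtered, migrant and return_motivation are hypothetical names, and values for applicable respondents are assumed to be already observed or filled in by the current imputation cycle.

```python
def encode_filtered(value, applicable):
    """Return (flag, value) so that non-applicable rows carry only the flag."""
    if not applicable:
        return (0, 0.0)            # constant placeholder, not an imputed value
    return (1, float(value))       # assumes the value is observed or filled in

# Hypothetical toy data: the question applies only to respondents
# with a migration background.
respondents = [
    {"migrant": True, "return_motivation": 3},
    {"migrant": False, "return_motivation": None},
    {"migrant": True, "return_motivation": 5},
]
predictors = [
    encode_filtered(r["return_motivation"], r["migrant"]) for r in respondents
]
```

Whether such a pair of predictors is appropriate depends on the analysis model, so this should be read as a starting point for discussion and not as a recommendation.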


Furthermore, we have multilevel data (individual level and school level). Is there any smart way to integrate this upper level into the imputation procedure?


Thanks in advance for any response/suggestion/support,

Cornelia


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/





