Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: James Bernard <jamesstatalist@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Imputation of missing data in an unbalanced panel using ICE
Date: Sat, 26 Oct 2013 00:09:27 +0800

Thanks Richard! It is a relief then.

On Sat, Oct 26, 2013 at 1:04 AM, Richard Williams
<richardwilliams.ndu@gmail.com> wrote:
> At 09:09 AM 10/25/2013, James Bernard wrote:
>>
>> Thanks Antonis,
>>
>> How about taking the average of the imputations for an observation?
>> Let's say we have 7 imputations (m=7). Then for a particular
>> observation, we could take the average of the 7 imputed values?
>>
>> Does this work?
>
>
> When there is no clear cut statistical solution I personally am open to
> improvisation. There are plenty of things where you don't need accuracy to
> 12 decimal places. You just need to be in the ballpark. So, you might try
> one imputation, a few imputations or all the imputations. You might report,
> say, that the R^2 statistics or the BIC statistics or whatever ranged
> between this and that. Another possibility would be a diagnostic test and
> you run it on different imputations and it always leads to the same
> conclusions. If you get conflicting results or borderline results you have
> to worry more, but if it is a clear cut decision no matter what you do then
> don't worry about it too much.
>
>
>> Thanks
>>
>> James
>>
>> On Fri, Oct 25, 2013 at 9:41 PM, A Loumiotis
>> <antonis.loumiotis@gmail.com> wrote:
>> > I would first create a dummy that will be used to tell -ice- which
>> > values to impute:
>> >
>> > *****
>> > clear
>> > input str1 Firm Year X
>> > "A" 2000 .
>> > "A" 2001 10
>> > "A" 2002 6
>> > "A" 2003 .
>> >
>> > "B" 1998 3
>> > "B" 1999 .
>> > "B" 2000 .
>> > "B" 2001 4
>> > "B" 2002 6
>> > "B" 2003 2
>> > end
>> >
>> > replace X=.a if X==.
>> > reshape wide X, i(Firm) j(Year)
>> > foreach v of varlist X* {
>> > gen c`v'=`v'!=.
>> > replace `v'=0 if c`v'==0
>> > }
>> > ******
>> >
>> > I would then run -ice- using the -conditional()- option (you should
>> > fill in the remaining parts for the -ice- command):
>> > ice ..., conditional(X1998:cX1998==1, ...)
>> >
>> > I don't think it is a good idea to use only the results from the first
>> > imputation because your estimates will underestimate the true
>> > variance.
>> >
>> > Antonis
>> >
>> > On Fri, Oct 25, 2013 at 2:46 PM, James Bernard
>> > <jamesstatalist@gmail.com> wrote:
>> >> Hi all,
>> >>
>> >> I have been using imputation techniques. Stata offers a wide range of
>> >> commands to conduct imputation.
>> >>
>> >> I have an unbalanced panel data set. Several variables have missing
>> >> values. To benefit from the fact that the available observations of a
>> >> variable at certain times can help estimate the missing values at other
>> >> times, I changed the format of my data from long to wide and used ICE
>> >> using the instructions from this site:
>> >> http://www.ats.ucla.edu/stat/stata/faq/mi_longitudinal.htm
>> >>
>> >> These instructions work for a balanced panel data set where all firms
>> >> are supposed to have values in all years.
>> >>
>> >> But, imagine that one firm has to have values from 2000-2003, and
>> >> another from 1998-2003. And, suppose we have a variable (X) for which
>> >> some observations across these two firms are missing:
>> >>
>> >> Firm       Year       X
>> >> ---------  ---------  -------
>> >> A          2000       .
>> >> A          2001       10
>> >> A          2002       6
>> >> A          2003       .
>> >>
>> >> B          1998       3
>> >> B          1999       .
>> >> B          2000       .
>> >> B          2001       4
>> >> B          2002       6
>> >> B          2003       2
>> >>
>> >> Reshaping the data from long to wide would lead to the creation of 6
>> >> new variables named "X1998", "X1999", ..., "X2003", and the values of
>> >> X1998 and X1999 will be missing for firm A.
>> >>
>> >> Running ICE would then predict values for X1998 and X1999 for
>> >> both firm A and B.
>> >>
>> >> The next step is to get the data into long form and run the -mi-
>> >> commands to make the estimation, which use Rubin's rules for combining
>> >> the results from the m imputations made.
>> >>
>> >> One may argue that I can let ICE predict the values of X1998 and
>> >> X1999 for firm A, then reshape the data into long format and remove the
>> >> values of X from firm A in 1998 and in 1999, because firm A is not
>> >> supposed to have values in 1998 and 1999.
>> >>
>> >> My question is: Does asking ICE to predict values of X1998 and X1999
>> >> for firm A affect the way it predicts the value of X2000 (which is the
>> >> main observation we have to impute)?
>> >>
>> >> Does the technique I used make sense?
>> >>
>> >> Also, how wrong is it to use only the first imputation (M=1) to run the
>> >> model, instead of using all the imputations?
>> >>
>> >> Thanks,
>> >> James
>
>
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> HOME: (574)289-5227
> EMAIL: Richard.A.Williams.5@ND.Edu
> WWW: http://www.nd.edu/~rwilliam

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
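
The reshape-and-drop step discussed in the thread can be sketched in Stata roughly as follows. This is only a minimal illustration under stated assumptions, not the posters' exact workflow: it reuses the example data and the dummy construction from the quoted reply, leaves out the -ice- call entirely (its arguments were not given in the thread), and only shows that the c* indicators are enough to remove firm A's 1998 and 1999 rows once the data are back in long form. The comments, the spacing, and the final -list- are additions for illustration.

```stata
* Example data from the thread (long form, unbalanced panel)
clear
input str1 Firm Year X
"A" 2000 .
"A" 2001 10
"A" 2002 6
"A" 2003 .
"B" 1998 3
"B" 1999 .
"B" 2000 .
"B" 2001 4
"B" 2002 6
"B" 2003 2
end

* Mark genuine in-window gaps with the extended missing value .a
replace X = .a if X == .

* Go wide: firm-years outside a firm's window now show up as plain "."
reshape wide X, i(Firm) j(Year)

foreach v of varlist X* {
    * c`v' is 1 inside the firm's window (observed or .a), 0 outside it
    gen c`v' = `v' != .
    * Placeholder value so out-of-window cells are no longer missing
    replace `v' = 0 if c`v' == 0
}

* ... the imputation step would go here (omitted; not specified in the thread) ...

* Back to long form, carrying the indicator along with X
reshape long X cX, i(Firm) j(Year)

* Drop only the out-of-window rows (firm A in 1998 and 1999)
drop if cX == 0
drop cX

list, sepby(Firm)
```

Run as written, without any imputation in between, this should return the original ten firm-years, with the four in-window gaps still marked as .a. With real imputed data, any imputation identifier added by the imputation command would presumably need to be included in i() so that each imputed copy is reshaped separately; that detail is an assumption and should be checked against the command's help file.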