I would first create a dummy that will be used to tell -ice- which values to impute:

*****
clear
input str1 Firm Year X
"A" 2000 .
"A" 2001 10
"A" 2002 6
"A" 2003 .
"B" 1998 3
"B" 1999 .
"B" 2000 .
"B" 2001 4
"B" 2002 6
"B" 2003 2
end

* flag the genuinely missing values as .a so that they can be told apart
* from the structural gaps (.) that -reshape wide- creates
replace X=.a if X==.
reshape wide X, i(Firm) j(Year)

* the dummy is 1 where the firm should have a value (observed or .a)
* and 0 where the firm-year does not exist; those cells are set to 0
foreach v of varlist X* {
    gen c`v'=`v'!=.
    replace `v'=0 if c`v'==0
}
******

I would then run -ice- using the -conditional()- option (you should fill in the remaining parts of the -ice- command):

ice ..., conditional(X1998:cX1998==1, ...)

I don't think it is a good idea to use only the results from the first imputation, because your estimates will understate the true variance: a single imputed data set treats the imputed values as if they had been observed.

Antonis

On Fri, Oct 25, 2013 at 2:46 PM, James Bernard <jamesstatalist@gmail.com> wrote:
> Hi all,
>
> I have been using imputation techniques. Stata offers a wide range of
> commands to conduct imputation.
>
> I have an unbalanced panel data set. Several variables have missing values.
> To benefit from the fact that the available observations of a variable
> at certain times can help estimate the missing values at other times,
> I changed the format of my data from long to wide and used ICE using
> the instructions from this site:
> http://www.ats.ucla.edu/stat/stata/faq/mi_longitudinal.htm
>
> These instructions work for a balanced panel data set where all firms
> are supposed to have values in all years.
>
> But, imagine that one firm has to have values from 2000-2003, and
> another from 1998-2003. And, suppose we have a variable (X) for which
> some observations across these two firms are missing:
>
> Firm   Year   X
> -----  -----  -----
> A      2000   .
> A      2001   10
> A      2002   6
> A      2003   .
>
> B      1998   3
> B      1999   .
> B      2000   .
> B      2001   4
> B      2002   6
> B      2003   2
>
> Reshaping the data from long to wide would lead to the creation of 6 new
> variables named "X1998", "X1999", ..., "X2003", and the values of X1998
> and X1999 will be missing for firm A.
>
> And running the ICE, it would predict values for X1998 and X1999 for
> both firm A and B.
>
> The next step is to get the data into long form and run the -mi-
> commands to make the estimation, which use Rubin's rules for combining
> the results from the m imputations made.
>
> One may argue that I can let the ICE predict the values of X1998 and
> X1999 for firm A, reshape the data into long format, and remove the
> values of X from firm A in 1998 and in 1999, because firm A is not
> supposed to have values in 1998 and 1999.
>
> My question is: Does asking ICE to predict values of X1998 and X1999
> for firm A affect the way it predicts the value of X2000 (which is the
> main observation we have to impute)?
>
> Does the technique I used make sense?
>
> Also, how wrong is it to use only the first imputation (M=1) to run the
> model, instead of using all the imputations?
>
> Thanks,
> James
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
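
On the M=1 question in the thread above, Rubin's combining rules make the variance point concrete. The following is only a sketch in standard textbook notation; the symbols (m imputations, estimate \hat{Q}_j and its estimated variance U_j from the j-th completed data set) are assumptions introduced here and do not appear in the original posts.

```latex
% Rubin's rules for combining m completed-data analyses
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{j=1}^{m} \hat{Q}_j
  && \text{pooled point estimate} \\
W &= \frac{1}{m}\sum_{j=1}^{m} U_j
  && \text{within-imputation variance} \\
B &= \frac{1}{m-1}\sum_{j=1}^{m} \bigl(\hat{Q}_j - \bar{Q}\bigr)^2
  && \text{between-imputation variance} \\
T &= W + \Bigl(1 + \frac{1}{m}\Bigr) B
  && \text{total variance of } \bar{Q}
\end{align*}
% With m = 1 the term B cannot be computed (its divisor m - 1 is zero),
% so an analysis of one completed data set reports only U_1 and leaves out
% the between-imputation component of T.
```

Since B is never negative, dropping it can only make the reported variance smaller than T, which is the understatement described in the reply.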

