
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Imputation of missing data in an unbalanced panel using ICE


From   Richard Williams <richardwilliams.ndu@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Imputation of missing data in an unbalanced panel using ICE
Date   Fri, 25 Oct 2013 12:04:15 -0500

At 09:09 AM 10/25/2013, James Bernard wrote:
Thanks Antonis,

How about taking the average of the imputations for an observation?
Let's say we have 7 imputations (m=7). Then for a particular
observation, we could take the average of the 7 imputed values?

Does this work?
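
For reference, a sketch of Rubin's rules, the standard way of combining m imputed data sets. The notation is assumed here and is not from the thread: \hat{Q}_i is the estimate from imputed data set i and U_i its estimated variance. The rules pool the m analysis results; they do not average the imputed values within an observation.

```latex
\begin{align}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{(pooled estimate)} \\
W &= \frac{1}{m}\sum_{i=1}^{m}U_i
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2
  && \text{(between-imputation variance)} \\
T &= W + \Bigl(1+\frac{1}{m}\Bigr)B
  && \text{(total variance)}
\end{align}
```

Analyzing one data set built from averaged imputed values gives a single estimate and a single W-type variance, so the B term, which carries the uncertainty due to the missing data, never enters the standard errors.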

When there is no clear-cut statistical solution, I personally am open to improvisation. There are plenty of situations where you don't need accuracy to 12 decimal places; you just need to be in the ballpark. So you might try one imputation, a few imputations, or all of them, and report, say, that the R^2 or BIC statistics ranged between this and that. Another possibility is to run a diagnostic test on different imputations and check whether it always leads to the same conclusion. If you get conflicting or borderline results you have to worry more, but if the decision is clear-cut no matter what you do, don't worry about it too much.
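
A minimal sketch of the "report the range" idea, under stated assumptions: y, x1 and x2 are hypothetical variable names, m = 7 as in the question, and the data in memory are -ice- output stacked in long form, with _mj holding the imputation number (0 = original data). It is an illustration, not a tested recipe for this particular panel.

```stata
* Sketch only: y, x1, x2 are hypothetical variable names.
* Assumes -ice- output in memory, with _mj = imputation number
* (0 = original data, 1..7 = imputed copies).
forvalues j = 1/7 {
    quietly regress y x1 x2 if _mj == `j'
    display "imputation `j': R-squared = " %6.4f e(r2)
}

* For pooled coefficients and standard errors (Rubin's rules),
* convert the -ice- data to -mi- format and use -mi estimate-.
mi import ice, automatic
mi estimate: regress y x1 x2
```

If the per-imputation statistics barely move, reporting their range is a reasonable summary; the -mi estimate- step is what supplies standard errors that account for the imputation.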

Thanks

James

On Fri, Oct 25, 2013 at 9:41 PM, A Loumiotis
<antonis.loumiotis@gmail.com> wrote:
> I would first create a dummy that will be used to tell -ice- which
> values to impute:
>
> *****
> clear
> input str1 Firm Year X
> "A"           2000       .
> "A"           2001      10
> "A"           2002       6
> "A"           2003       .
>
> "B"           1998       3
> "B"           1999       .
> "B"           2000        .
> "B"           2001        4
> "B"           2002        6
> "B"           2003        2
> end
>
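> * Recode the values to be imputed as .a, so that after reshaping they
> * differ from years outside a firm's panel, which become plain "."
> replace X=.a if X==.
> reshape wide X, i(Firm) j(Year)
> foreach v of varlist X* {
>     * dummy = 1 if the firm is in the panel that year (observed or .a)
>     gen c`v'=`v'!=.
>     * give out-of-panel years a placeholder so they are not missing
>     replace `v'=0 if c`v'==0
> }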
> ******
>
> I would then run -ice- using the -conditional()- option (you should
> fill in the remaining parts of the -ice- command):
> ice ..., conditional(X1998:cX1998==1, ...)
>
> I don't think it is a good idea to use only the results from the first
> imputation because your estimates will underestimate the true
> variance.
>
> Antonis
>
> On Fri, Oct 25, 2013 at 2:46 PM, James Bernard <jamesstatalist@gmail.com> wrote:
>> Hi all,
>>
>> I have been using imputation techniques. Stata offers a wide range of
>> commands to conduct imputation.
>>
>> I have an unbalanced panel data set. Several variables have missing values.
>> To benefit from the fact that the available observation of a variable
>> at certain times can help estimate the missing values at other times,
>> I changed the format of my data from long to wide and used ICE using
>> the instruction from this site:
>> http://www.ats.ucla.edu/stat/stata/faq/mi_longitudinal.htm
>>
>> These instructions work for a balanced panel data set where all firms
>> are supposed to have values in all years.
>>
>> But imagine that one firm is supposed to have values for 2000-2003, and
>> another for 1998-2003. And suppose we have a variable (X) for which
>> some observations across these two firms are missing:
>>
>> Firm       Year        X
>> ---------    ---------    -------
>> A           2000       .
>> A           2001      10
>> A           2002       6
>> A           2003       .
>>
>> B           1998       3
>> B           1999       .
>> B           2000        .
>> B           2001        4
>> B           2002        6
>> B           2003        2
>>
>> Reshaping the data from long to wide would create 6 new variables
>> named "X1998", "X1999", ..., "X2003", and the values of X1998
>> and X1999 would be missing for firm A.
>>
>> Running ICE would then predict values for X1998 and X1999 for
>> both firms A and B.
>>
>> The next step is to get the data back into long form and run the -mi-
>> commands for estimation, which use Rubin's rules to combine
>> the results from the m imputations.
>>
>> One may argue that I can let the ICE predict the values of X1998 and
>> X1999 for firm A. Reshape the data into long format and remove the
>> values of X from firm A in 1998 and in 1999, because firm A is not
>> supposed to have  values in 1998 and 1999.
>>
>> My question is: Does asking ICE to predict values of X1998 and X1999
>> for firm A affect the way it predicts the value of X2000 (which is the
>> main observation we have to impute)?
>>
>> Does the technique I used make sense?
>>
>> Also, how wrong is it to use only the first imputation (M=1) to run the
>> model, instead of using all the imputations?
>>
>> Thanks,
>> James
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW:    http://www.nd.edu/~rwilliam


