
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: R: Imputation of missing data in an unbalanced panel using ICE


From   James Bernard <jamesstatalist@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: R: Imputation of missing data in an unbalanced panel using ICE
Date   Sat, 26 Oct 2013 00:08:17 +0800

Thanks Nick.

Yes, I agree. Using imputation, though tempting, raises other issues.
So, again, like many things in statistics, it is a matter of
cost-benefit analysis.

On Fri, Oct 25, 2013 at 11:32 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> There is a bundle of issues here.
>
> Carlo touches on one, which is what multiple imputation does and does
> not purport to provide.
>
> Another is that the method being used here -reshape-s panel data to
> wide, imputes and then -reshape-s back.
>
> This really does raise the question of precisely what assumptions are
> needed about variations in time to make that legitimate. There's no
> white magic independently of whether tacit assumptions match the data
> generating process. I've not thought this through either -- I don't do
> this stuff -- but I want to send a Hang on there... signal of caution.
>
> No-one seems interested any more in interpolation as a rough family of
> methods of filling in gaps in time series. Rather, it is a smooth
> method of filling in gaps and raises questions of its own too.
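
As a point of comparison for the interpolation remark above, here is a minimal Python sketch of linear interpolation within one firm's series. It uses firm B's values from James's original post quoted further down; the function name `interpolate_linear` is made up for this illustration, and the sketch is not a substitute for multiple imputation. Gaps at the start or end of a series are left missing, since there is nothing to interpolate between.

```python
def interpolate_linear(years, values):
    """Fill interior gaps (None) by linear interpolation in time.

    Leading and trailing gaps are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = years[right] - years[left]
        for i in range(left + 1, right):
            weight = (years[i] - years[left]) / span
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


# Firm B from the original post; None stands in for Stata's missing value "."
firm_b_years = [1998, 1999, 2000, 2001, 2002, 2003]
firm_b_x = [3, None, None, 4, 6, 2]
firm_b_filled = interpolate_linear(firm_b_years, firm_b_x)

# Firm A has gaps only at the ends, so interpolation fills nothing.
firm_a_filled = interpolate_linear([2000, 2001, 2002, 2003], [None, 10, 6, None])
```

Note that the filled values are single deterministic guesses, so any model fitted to them treats them as if they had been observed.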
>
>
> Nick
> njcoxstata@gmail.com
>
>
> On 25 October 2013 16:17, Carlo Lazzaro <carlo.lazzaro@tiscalinet.it> wrote:
>> James asked:
>> "Also, how wrong is it to use only the first imputation (M=1) to run the model,
>> instead of using all the imputations?".
>>
>> The approach James proposes would seem to rule out the between-imputation
>> variance component (that is, the variance across the M datasets generated
>> via MI), which is a defining feature of MI.
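
To make the point about the between-imputation component concrete, the following Python sketch pools estimates with Rubin's rules: total variance T = W + (1 + 1/M) B, where W is the mean within-imputation variance and B is the sample variance of the M point estimates. The function name `pool_rubin` and the numbers are hypothetical, chosen only for illustration; they are not results from James's data.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M point estimates and their squared standard errors with Rubin's rules.

    Returns (pooled_estimate, within, between, total). With M = 1 the
    between-imputation variance cannot be estimated, so between and total
    are returned as None.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)
    if m < 2:
        return pooled, within, None, None
    between = variance(estimates)  # sample variance, divisor M - 1
    total = within + (1 + 1 / m) * between
    return pooled, within, between, total


# Hypothetical coefficient estimates and squared standard errors from M = 3 imputations
estimates = [0.50, 0.60, 0.40]
variances = [0.04, 0.05, 0.03]

pooled, within, between, total = pool_rubin(estimates, variances)

# Using only the first imputation leaves the within-imputation variance alone
single = pool_rubin(estimates[:1], variances[:1])
```

With a single imputation only the within-imputation variance is available, so the reported standard error ignores the uncertainty about the imputed values themselves.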
>>
>> Kind regards,
>> Carlo
>>
>> -----Original message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of James Bernard
>> Sent: Friday, 25 October 2013 13:47
>> To: statalist@hsphsun2.harvard.edu
>> Subject: st: Imputation of missing data in an unbalanced panel using ICE
>>
>> Hi all,
>>
>> I have been using imputation techniques. Stata offers a wide range of
>> commands to conduct imputation.
>>
>> I have an unbalanced panel data set. Several variables have missing values.
>> To benefit from the fact that the available observations of a variable at
>> certain times can help estimate its missing values at other times, I changed
>> the format of my data from long to wide and ran ICE following the
>> instructions on this site:
>> http://www.ats.ucla.edu/stat/stata/faq/mi_longitudinal.htm
>>
>> These instructions work for a balanced panel data set where all firms are
>> supposed to have values in all years.
>>
>> But imagine that one firm is supposed to have values for 2000-2003 and
>> another for 1998-2003, and suppose we have a variable (X) for which some
>> observations are missing across these two firms:
>>
>> Firm    Year    X
>> ----    ----    --
>> A       2000    .
>> A       2001    10
>> A       2002    6
>> A       2003    .
>>
>> B       1998    3
>> B       1999    .
>> B       2000    .
>> B       2001    4
>> B       2002    6
>> B       2003    2
>>
>> Reshaping the data from long to wide would create 6 new variables named
>> "X1998", "X1999", ..., "X2003", and the values of X1998 and X1999 would be
>> missing for firm A.
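
The reshape described above can be sketched in plain Python to show where the extra missing cells come from. This is only an illustration of the data layout, not of Stata's -reshape- or of ICE; the helper name `reshape_wide` is made up, and None stands in for Stata's missing value ".".

```python
# Long-format panel from the post: (firm, year, X)
long_rows = [
    ("A", 2000, None), ("A", 2001, 10), ("A", 2002, 6), ("A", 2003, None),
    ("B", 1998, 3), ("B", 1999, None), ("B", 2000, None),
    ("B", 2001, 4), ("B", 2002, 6), ("B", 2003, 2),
]


def reshape_wide(rows):
    """Return {firm: {"X<year>": value}} with one column per year seen anywhere."""
    years = sorted({year for _, year, _ in rows})
    wide = {}
    for firm, year, x in rows:
        row = wide.setdefault(firm, {f"X{y}": None for y in years})
        row[f"X{year}"] = x
    return wide


wide = reshape_wide(long_rows)

# Firm-years that exist as rows in the long data, whether or not X is observed
rows_in_long = {(firm, year) for firm, year, _ in long_rows}

for firm, row in wide.items():
    for col, value in row.items():
        if value is None:
            year = int(col[1:])
            if (firm, year) in rows_in_long:
                print(firm, col, "missing inside the firm's window")
            else:
                print(firm, col, "created by the reshape, outside the firm's window")
```

In wide form the two kinds of missing cell look identical, so an imputation routine run on this layout has no way to tell them apart unless the firm's window is recorded separately.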
>>
>> Running ICE would then predict values for X1998 and X1999 for both
>> firm A and B.
>>
>> The next step is to get the data back into long form and run the -mi-
>> commands to fit the model, which use Rubin's rules to combine the estimates
>> from the m imputations.
>>
>> One may argue that I can let ICE predict the values of X1998 and X1999 for
>> firm A, reshape the data into long format, and then remove the values of
>> X for firm A in 1998 and 1999, because firm A is not supposed to have
>> values in 1998 and 1999.
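
The "impute, reshape back, then drop" step described above might look like the following Python sketch. The imputed numbers are made-up placeholders rather than ICE output, and the names `windows` and `to_long_in_window` are hypothetical. The sketch shows only the bookkeeping; it says nothing about whether imputing the out-of-window cells changes the imputations for the cells that are kept.

```python
# Each firm's observation window (first year, last year), taken from the post
windows = {"A": (2000, 2003), "B": (1998, 2003)}

# One completed wide dataset; values that were missing in the post are
# filled with made-up placeholders, NOT real imputations
imputed_wide = {
    "A": {"X1998": 7.1, "X1999": 8.4, "X2000": 9.0,
          "X2001": 10, "X2002": 6, "X2003": 5.2},
    "B": {"X1998": 3, "X1999": 3.5, "X2000": 3.8,
          "X2001": 4, "X2002": 6, "X2003": 2},
}


def to_long_in_window(wide, windows):
    """Reshape wide to long, keeping only years inside each firm's window."""
    long_rows = []
    for firm, row in wide.items():
        first, last = windows[firm]
        for col, value in row.items():
            year = int(col[1:])
            if first <= year <= last:
                long_rows.append((firm, year, value))
    return sorted(long_rows)


long_rows = to_long_in_window(imputed_wide, windows)
```

The same filter would have to be applied to every one of the m completed datasets before the model is fitted and the results are combined.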
>>
>> My question is: Does asking ICE to predict values of X1998 and X1999 for
>> firm A affect the way it predicts the value of X2000 (which is the main
>> observation we have to impute)?
>>
>> Does the technique I used make sense?
>>
>> Also, how wrong is it to use only the first imputation (M=1) to run the model,
>> instead of using all the imputations?
>>
>> Thanks,
>> James
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

