
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Multiple imputation with panel data


From   Lance Erickson <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: Multiple imputation with panel data
Date   Thu, 5 Jul 2012 22:42:38 +0000

Veronica,

Perhaps I'm misunderstanding your problem, but if you have wide format data and there are no values for any of the observations in 2007 for one of the variables in the imputation model, with data like...

Id   x2003   x2007   y2003   y2007
1    5       .       8       9
2    4       .       3       3
3    3       .       8       5

then I don't think that multiple imputation is an option for you. My understanding of MI is more intuitive than technical, but I believe that to impute values for a given variable, there has to be some information about how the variable is distributed. If, as in the example above, x2007 is entirely missing, then there is no existing information that can inform the estimation of the missing values. In other words, MI can't create data that you don't have. (Even though I think people sometimes prefer listwise deletion to MI because it feels like that's exactly what MI is doing.) It can only give you estimates of what the data might be, based on the existing values and the relationship of those values to other variables in the imputation model. There are many others on Statalist who are substantially better credentialed than I am to answer your question, but that's my take.
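To make the point concrete, here is a minimal Python sketch (not Stata, and not how any MI routine is actually implemented) that counts the observed values per variable in the toy table above. The helper name `observed_counts` is hypothetical; the only idea it illustrates is that a variable with zero observed values offers nothing for an imputation model to learn from.

```python
# Toy data copied from the example table above; None marks a missing value.
rows = [
    {"id": 1, "x2003": 5, "x2007": None, "y2003": 8, "y2007": 9},
    {"id": 2, "x2003": 4, "x2007": None, "y2003": 3, "y2007": 3},
    {"id": 3, "x2003": 3, "x2007": None, "y2003": 8, "y2007": 5},
]

def observed_counts(rows, variables):
    """Return the number of non-missing values for each variable (hypothetical helper)."""
    return {v: sum(r[v] is not None for r in rows) for v in variables}

variables = ["x2003", "x2007", "y2003", "y2007"]
counts = observed_counts(rows, variables)

# A variable with no observed values carries no information about its own
# distribution or its relationship to the other variables.
not_imputable = [v for v in variables if counts[v] == 0]

print(counts)
print(not_imputable)
```

Running a check like this before setting up the imputation model shows which variables are entirely missing in a given wave.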

Best,
Lance

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Veronica Galassi
Sent: Thursday, July 05, 2012 1:00 PM
To: [email protected]
Subject: Re: st: Multiple imputation with panel data

Hi Oliver, 

Thank you for your kind reply!

I am not quite sure whether I got your hint or not...maybe my explanation was just not clear enough, sorry about that!!!
I think my case is slightly different from what you were describing because I am not interested in the missing data between 2003 and 2007.
In that case, as you said, I would just fit a line.
What I am trying to impute are the missing data within the years 2003 and 2007, respectively.
And things are made even more complicated by the fact that, for the main explanatory variable of my model, I have observations only for the year 2003 and not for 2007. That's why I was thinking about multiple imputation!
But maybe you are right, I just have too many missing data.
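For reference, the "fit a line" idea for the years between 2003 and 2007 can be sketched in a few lines of Python. This is only an illustration under the assumption that nothing but the two endpoint values is known; the function name `interpolate_linear` is hypothetical, and the figures are the example numbers from Oliver's message quoted below.

```python
def interpolate_linear(t0, v0, t1, v1, t):
    """Value at time t on the straight line through (t0, v0) and (t1, v1)."""
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

# Endpoints taken from the quoted example: 3.145 percent in 2003, 5.0 percent in 2007.
for year in range(2003, 2008):
    value = interpolate_linear(2003, 3.145, 2007, 5.0, year)
    print(year, round(value, 5))
```

Note that this fills in values between two observed years; it does not address values that are missing within 2003 or 2007 themselves.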

Best,

Veronica



On Thu, 05 Jul 2012 19:27:47 +0200, Oliver Jones <[email protected]> wrote:
> Hi Veronica,
> I have just one hint: Maybe two observations are just not enough to do
> the imputation.
> Just think about it: I give a number, e.g. 3.145 percent, for 2003 and
> a number, e.g. 5.0 percent, for 2007 and ask you what the values are
> for the years in between.
> Can you imagine some fancy method to figure it out?
> I would suspect, under the assumption you don't have any other 
> information, that there is no best solution. Maybe you could just draw 
> a line between the years.
> 
> Best
> Oliver
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

--
VERONICA GALASSI
MSc Development Economics
University of Sussex
Mobile: +44 78 5563 0276

14 Auckland Drive,
BN2 4JS, Brighton, UK

E-mail: [email protected]

