Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

RE: st: Multiple imputation with panel data

From   Lance Erickson <>
To   "" <>
Subject   RE: st: Multiple imputation with panel data
Date   Thu, 5 Jul 2012 22:42:38 +0000


Perhaps I'm misunderstanding your problem, but if you have wide format data and there are no values for any of the observations in 2007 for one of the variables in the imputation model, with data like...

Id	x2003	x2007	y2003	y2007
1	5	.	8	9
2	4	.	3	3
3	3	.	8	5

then I don't think that multiple imputation is an option for you. My understanding of MI is more intuitive than technical, but I believe that to impute values for a given variable, there has to be some information about how the variable is distributed, meaning at least some observed values of it. But if, in the example above, x2007 is all missing, then there is no existing information that can inform the estimation of missing values. In other words, MI can't create data that you don't have. (Even though I think people sometimes seem to prefer listwise deletion to MI because it feels like that's exactly what MI is doing.) It can only give you estimates of what the data might be, based on existing values and the relationship of those existing values to other variables in the imputation model. There are many others on Statalist who are substantially better credentialed than I to answer your question, but that's my take.
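
To make the point concrete, here is a minimal sketch in Python (not Stata's mi machinery, and not full MI, since it has no random draws). It only shows the regression step that an imputation model relies on. The function name fit_on_complete_cases is hypothetical, and the rows are the three toy observations from the table above. With no observed x2007 there are no complete cases to fit on, while y2007 can be modelled from y2003.

```python
# Toy wide-format data from the example above; None marks a missing value.
rows = [
    {"id": 1, "x2003": 5, "x2007": None, "y2003": 8, "y2007": 9},
    {"id": 2, "x2003": 4, "x2007": None, "y2003": 3, "y2007": 3},
    {"id": 3, "x2003": 3, "x2007": None, "y2003": 8, "y2007": 5},
]


def fit_on_complete_cases(rows, target, predictor):
    """Least-squares fit of target on predictor, using complete cases only.

    Hypothetical helper for illustration: returns (intercept, slope).
    """
    complete = [
        r for r in rows
        if r[target] is not None and r[predictor] is not None
    ]
    if len(complete) < 2:
        raise ValueError(
            f"only {len(complete)} complete case(s) for {target}: "
            "nothing to estimate the imputation model from"
        )
    xs = [r[predictor] for r in complete]
    ys = [r[target] for r in complete]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError(f"{predictor} does not vary among complete cases")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# y2007 has observed values, so a model for it can be estimated.
print(fit_on_complete_cases(rows, "y2007", "y2003"))

# x2007 is entirely missing, so there is nothing to estimate from.
try:
    fit_on_complete_cases(rows, "x2007", "x2003")
except ValueError as err:
    print("cannot impute x2007:", err)
```
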


-----Original Message-----
From: [] On Behalf Of Veronica Galassi
Sent: Thursday, July 05, 2012 1:00 PM
Subject: Re: st: Multiple imputation with panel data

Hi Oliver, 

Thank you for your kind reply!

I am not quite sure whether I got your hint or not...maybe my explanation was just not clear enough, sorry about that!!!
I think my case is slightly different from what you were describing because I am not interested in the missing data between 2003 and 2007.
In that case, as you said, I would just fit a line.
What I am trying to impute are the missing data within the years 2003 and 2007 respectively.
And things are made even more complicated by the fact that for the main explanatory variable of my model I have observations only for the year 2003 but not for 2007. That's why I was thinking about multiple imputation!
But maybe you are right, I just have too much missing data.



On Thu, 05 Jul 2012 19:27:47 +0200, Oliver Jones <> wrote:
> Hi Veronica,
> I have just one hint: Maybe two observations are just not enough to do
> imputation.
> Just think about it, I give a number, e.g. 3.145 percent, for 2003 and
> a number, e.g. 5.0 percent, for 2007 and ask you what are the values
> for the years in between.
> Can you imagine some fancy method to figure it out?
> I would suspect, under the assumption you don't have any other 
> information, that there is no best solution. Maybe you could just draw 
> a line between the years.
> Best
> Oliver
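
Oliver's suggestion to draw a line between the years amounts to linear interpolation. A minimal sketch in Python, using the example figures from his message (3.145 percent for 2003 and 5.0 percent for 2007); the function name interpolate_years is hypothetical, and this is not multiple imputation, since it produces a single deterministic value per year:

```python
def interpolate_years(year_a, value_a, year_b, value_b):
    """Straight-line values for the years strictly between year_a and year_b.

    Hypothetical helper for illustration: returns a dict {year: value}.
    """
    if year_b <= year_a:
        raise ValueError("year_b must be later than year_a")
    slope = (value_b - value_a) / (year_b - year_a)
    return {
        year: value_a + slope * (year - year_a)
        for year in range(year_a + 1, year_b)
    }


# Example figures from the message: 3.145 percent in 2003, 5.0 percent in 2007.
filled = interpolate_years(2003, 3.145, 2007, 5.0)
for year, value in filled.items():
    print(year, round(value, 3))
```

The interpolated values carry no information beyond the two endpoints, which is why this only makes sense when nothing else is known about the intervening years.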
> *
> *   For searches and help try:
> *
> *
> *

MSc Development Economics
University of Sussex
Mobile: +44 78 5563 0276

14 Auckland Drive,
BN2 4JS, Brighton, UK

