Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Veronica Galassi <V.Galassi@sussex.ac.uk>

To: <statalist@hsphsun2.harvard.edu>

Subject: Re: st: Multiple imputation with panel data

Date: Fri, 06 Jul 2012 10:33:48 +0100

Dear all,

You cannot even imagine how much I appreciate your advice! This is my very first quantitative research using Stata, and my dataset is not exactly how a researcher would expect it to be.

The example reported by Lance describes my situation perfectly. But apart from this variable, which is missing for one entire year, I also have many other data missing at random for other variables.

So maybe I should first try to estimate the coefficients of x2003 = b_0 + b_1*y2003 and then use them to predict x2007, as Oliver was suggesting. But isn't this way of proceeding the same as extrapolating values for x2007 by exploiting the linear relationship between x and y? Because maybe I could simply use extrapolation.

I have also read that, in order to do something closer to what Stata does when performing multiple imputation, I could compute the variance of the residuals obtained from the first regression and then predict x2007. Randomly drawing m numbers, I could multiply each of them by the standard deviation of the residuals and add this value to the predicted values of x2007; this would give me m imputations from my original dataset. Using Rubin's rule I would then obtain one single value from my m imputations. Do you think this makes sense?

Once I have done that, I should try again to perform multiple imputation in Stata to impute the rest of the dataset, following what Wes was suggesting.

Cheers,
Veronica

On Fri, 06 Jul 2012 01:42:50 +0200, Oliver Jones <ojones@wiwi.uni-bielefeld.de> wrote:

> Hi Veronica,
>
> if the little data example Lance gave is describing your situation, then I agree with his conclusion that you cannot impute the missing values.
>
> To be precise, there is a way to get reasonable values for x2007, but the result will not help in explaining y2007! The way I'm talking about is to estimate x2003 = b_0 + b_1*y2003, then assume that the parameters didn't change over time and calculate b_0 + b_1*y2007, which is your estimate for x2007...
>
> But as Lance said, others might come up with something more helpful...
>
> Best
> Oliver
>
> On 06.07.2012 00:42, Lance Erickson wrote:
>> Veronica,
>>
>> Perhaps I'm misunderstanding your problem, but if you have wide format data and there are no values for any of the observations in 2007 for one of the variables in the imputation model, with data like...
>>
>> Id  x2003  x2007  y2003  y2007
>> 1   5      .      8      9
>> 2   4      .      3      3
>> 3   3      .      8      5
>>
>> then I don't think that multiple imputation is an option for you. My understanding of MI is more intuitive than technical, but I believe that to impute values for a given variable, there has to be some information about how the variable is distributed. But if, in the example above, x2007 is all missing, then there is no existing information that can inform the estimation of missing values. In other words, MI can't create data that you don't have. (Even though I think people sometimes seem to prefer listwise deletion to MI because it feels like that's exactly what MI is doing.) It can only give you estimates of what the data might be based on existing values and the relationship of those existing values to other variables in the imputation model. There are many others on Statalist who are substantially better credentialed than I to answer your question, but that's my take.
>>
>> Best,
>> Lance
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Veronica Galassi
>> Sent: Thursday, July 05, 2012 1:00 PM
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Multiple imputation with panel data
>>
>> Hi Oliver,
>>
>> Thank you for your kind reply!
>>
>> I am not quite sure whether I got your hint or not... maybe my explanation was just not clear enough, sorry about that!
>>
>> I think my case is slightly different from what you were describing, because I am not interested in the missing data between 2003 and 2007. In that case, as you said, I would just fit a line. What I am trying to impute are the missing data within the years 2003 and 2007 respectively. And things are made even more complicated by the fact that, for the main explanatory variable of my model, I have observations only for the year 2003 but not for 2007. That's why I was thinking about multiple imputation! But maybe you are right, I just have too many missing data.
>>
>> Best,
>>
>> Veronica
>>
>> On Thu, 05 Jul 2012 19:27:47 +0200, Oliver Jones <ojones@wiwi.uni-bielefeld.de> wrote:
>>> Hi Veronica,
>>>
>>> I have just one hint: maybe two observations are just not enough to do the imputation. Just think about it: I give a number, e.g. 3.145 percent, for 2003 and a number, e.g. 5.0 percent, for 2007 and ask you what the values are for the years in between. Can you imagine some fancy method to figure it out? I would suspect, under the assumption that you don't have any other information, that there is no best solution. Maybe you could just draw a line between the years.
>>>
>>> Best
>>> Oliver

--
VERONICA GALASSI
MSc Development Economics
University of Sussex
Mobile: +44 78 5563 0276

14 Auckland Drive,
BN2 4JS, Brighton, UK

E-mail: v.galassi@sussex.ac.uk

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
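
The procedure Veronica describes in her message (fit x2003 on y2003, predict x2007 from y2007, add random noise scaled by the residual standard deviation, repeat m times, then pool) can be sketched roughly as follows. This is only an illustration in Python rather than Stata, using Lance's three-row toy data. The function names `fit_line`, `impute` and `pool`, the seed, the choice m = 20 and the use of the mean of x2007 as the analysis step are all assumptions made for the example. It also relies on Oliver's assumption that the coefficients did not change between 2003 and 2007, and it keeps the fitted coefficients fixed across imputations, which a full multiple-imputation routine would normally not do.

```python
import math
import random
import statistics

# Toy data from Lance's example; x2007 is missing for every unit.
x2003 = [5.0, 4.0, 3.0]
y2003 = [8.0, 3.0, 8.0]
y2007 = [9.0, 3.0, 5.0]


def fit_line(predictor, outcome):
    """Least-squares fit of outcome = b0 + b1 * predictor.

    Returns the intercept, the slope and the residual standard deviation.
    """
    n = len(outcome)
    p_bar = sum(predictor) / n
    o_bar = sum(outcome) / n
    s_pp = sum((p - p_bar) ** 2 for p in predictor)
    s_po = sum((p - p_bar) * (o - o_bar) for p, o in zip(predictor, outcome))
    b1 = s_po / s_pp
    b0 = o_bar - b1 * p_bar
    resid = [o - (b0 + b1 * p) for p, o in zip(predictor, outcome)]
    sigma = math.sqrt(sum(r ** 2 for r in resid) / (n - 2))
    return b0, b1, sigma


def impute(b0, b1, sigma, new_predictor, m, rng):
    """Return m imputed copies: prediction plus sigma times a N(0, 1) draw."""
    return [[b0 + b1 * p + sigma * rng.gauss(0.0, 1.0) for p in new_predictor]
            for _ in range(m)]


def pool(estimates, variances):
    """Rubin's rules: pooled estimate and total variance W + (1 + 1/m) * B."""
    m = len(estimates)
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = statistics.variance(estimates)
    return q_bar, within + (1 + 1 / m) * between


# Step 1: x2003 = b0 + b1 * y2003 (coefficients assumed stable over time).
b0, b1, sigma = fit_line(y2003, x2003)

# Step 2: m noisy predictions of x2007 from y2007 (seed is arbitrary).
rng = random.Random(12345)
imputed = impute(b0, b1, sigma, y2007, m=20, rng=rng)

# Step 3: analyse each completed dataset; here just the mean of x2007.
means = [statistics.mean(d) for d in imputed]
mean_vars = [statistics.variance(d) / len(d) for d in imputed]

# Step 4: combine the m analysis results, not the imputed values themselves.
pooled_mean, total_var = pool(means, mean_vars)
print(f"b0={b0:.3f} b1={b1:.3f} sigma={sigma:.3f}")
print(f"pooled mean of x2007={pooled_mean:.3f} total variance={total_var:.3f}")
```

With Lance's three rows the fitted slope happens to be zero, so every prediction equals the intercept and the imputations differ only through the added noise. Note also that Rubin's rules are applied here to the estimates obtained from each completed dataset, rather than averaging the imputed values into a single number.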

**Follow-Ups**:
- **Re: st: Multiple imputation with panel data** *From:* Oliver Jones <ojones@wiwi.uni-bielefeld.de>

**References**:
- **st: Multiple imputation with panel data** *From:* Oliver Jones <ojones@wiwi.uni-bielefeld.de>
- **Re: st: Multiple imputation with panel data** *From:* Veronica Galassi <V.Galassi@sussex.ac.uk>
- **RE: st: Multiple imputation with panel data** *From:* Lance Erickson <lance_erickson@byu.edu>
- **Re: st: Multiple imputation with panel data** *From:* Oliver Jones <ojones@wiwi.uni-bielefeld.de>
