Statalist The Stata Listserver



st: RE: RE: Re: RE: RE: filling in missing panel data as a trend line


From   "Jason Yackee" <jyackee@law.usc.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: Re: RE: RE: filling in missing panel data as a trend line
Date   Sun, 10 Sep 2006 09:13:42 -0700

Thank you Nick & Rodrigo.  I am chagrined at the simplicity of the
solution!  Indeed it works perfectly.



-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Sunday, September 10, 2006 1:05 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Re: RE: RE: filling in missing panel data as a trend
line

Not "as always"! 

But as Rodrigo underlines, what Jason wants is 
exactly linear interpolation, and there is no 
code to write beyond working out the command 
syntax for your case. 

In fact, Rodrigo's first line of code seems enough:

bysort Country: ipolate Education Year, g(Education_ipo)

Nick 
n.j.cox@durham.ac.uk 
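
The one-line -ipolate- call above can be checked against Jason's own arithmetic. What follows is only a minimal sketch: it assumes the variable names Country, Year and Education from the thread and types in just the Mexico rows from the hypothetical data shown further down (the Nicaragua rows are elided there, so they are left out here).

```stata
* Hypothetical check: only the Mexico rows of Jason's example data.
clear
input str4 Country int Year float Education
"Mex." 1970 3.4
"Mex." 1971 .
"Mex." 1972 .
"Mex." 1973 .
"Mex." 1974 .
"Mex." 1975 4.2
"Mex." 1976 .
"Mex." 1977 .
"Mex." 1978 .
"Mex." 1979 .
"Mex." 1980 4.7
end

* Linear interpolation of Education on Year, done separately per country.
bysort Country: ipolate Education Year, generate(Education_ipo)

* Compare the original and the filled-in series side by side.
format Education_ipo %4.2f
list Country Year Education Education_ipo, sepby(Country)
```

If this behaves as expected, 1971 to 1974 should come out at about 3.56, 3.72, 3.88 and 4.04, which is Jason's 3.4 + (0.8*k)/5 for k = 1 to 4, and 1976 to 1979 at about 4.3, 4.4, 4.5 and 4.6. Note that -ipolate- only fills gaps between observed values; values outside the observed range stay missing unless the -epolate- option is added.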

Rodrigo A. Alfaro
 
> I think that Nick is right (as always). For your particular case
> try the following:
> 
> bysort Country: ipolate Education Year, g(Education_ipo)
> gen Education_5=Education
> replace Education = Education_ipo
> drop Education_ipo

Jason Yackee

> Thank you for the suggestion.  I don't think -ipolate- quite works for
> what I have in mind, but maybe I am wrong.  Here is a hypothetical
> picture of the data.  "Education" is simply the average total years of
> education of a country's population.  
> 
> Country Year Education
> Mex. 1970 3.4 
> Mex. 1971 .
> Mex. 1972 .
> Mex. 1973 .
> Mex. 1974 . 
> Mex. 1975 4.2
> Mex. 1976 .
> Mex. 1977 .
> Mex. 1978 .
> Mex. 1979 .
> Mex. 1980 4.7
> Nic. 1970 1.5
> Nic. 1971 .
> Nic. 1972 .
> ~~~ ~~~ ~~~
> Nic. 1980 3.2
> 
> Perhaps a better way of describing what I want to do is to fill in the
> years between survey dates with a sort of moving average, so that the
> differences between the measured years are evenly split between the
> (in-between) missing years.  So for Mexico, the difference between
> measured year 1975 and measured year 1970 is 4.2 - 3.4 = 0.8.  To
> linearly fill in the missing values, I would make 1971 = [3.4 +
> (0.8*1)/5], 1972 = [3.4 + (0.8*2)/5], and so on.
> 
> I could obviously do this by hand, but for 140 countries and 30 years
> this would take some time.  So I take it that I would have to write
> some code to automate the process?  Since I am new to code-writing,
> any ideas would be very much appreciated.
 
Nick Cox
 
> This sounds like linear interpolation: see -ipolate-. 
> Panel data should be interpolated separately -by <panelid>:-. 

Jason Yackee, PhD Candidate; J.D.
 
> > For my panel data set I have a variable ("education") that has only
> > been collected every five years.  My data set is otherwise annual; I
> > would like to fill in the missing data for "education" on the basis
> > of a regression/trend line between each five-year observation, 

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



