# st: RE: Re: RE: RE: filling in missing panel data as a trend line

 From: "Nick Cox"; Subject: st: RE: Re: RE: RE: filling in missing panel data as a trend line; Date: Sun, 10 Sep 2006 09:05:12 +0100

```
Not "as always"!

But as Rodrigo underlines, what Jason wants is
exactly linear interpolation, and there is no
code to write beyond working out the command

In fact, Rodrigo's first line of code seems enough:

bysort Country: ipolate Education Year, g(Education_ipo)

Nick
n.j.cox@durham.ac.uk

Rodrigo A. Alfaro

> I think that Nick is right (as always). For your particular case
> try the following:
>
> bysort Country: ipolate Education Year, g(Education_ipo)
> gen Education_5=Education
> replace Education = Education_ipo
> drop Education_ipo

Jason Yackee

> Thank you for the suggestion.  I don't think -ipolate- quite works for
> what I have in mind, but maybe I am wrong.  Here is a hypothetical
> picture of the data.  "Education" is simply the average total years of
> education of a country's population.
>
> Country Year Education
> Mex. 1970 3.4
> Mex. 1971 .
> Mex. 1972 .
> Mex. 1973 .
> Mex. 1974 .
> Mex. 1975 4.2
> Mex. 1976 .
> Mex. 1977 .
> Mex. 1978 .
> Mex. 1979 .
> Mex. 1980 4.7
> Nic. 1970 1.5
> Nic. 1971 .
> Nic. 1972 .
> ~~~ ~~~ ~~~
> Nic. 1980 3.2
>
> Perhaps a better way of describing what I want to do is to fill in the
> years between survey dates with a sort of moving average, so that the
> differences between the measured years are evenly split between the
> (in-between) missing years.  So for Mexico, the difference between
> measured year 1975 and measured year 1970 is 4.2 - 3.4 = 0.8.  To
> linearly fill in the missing values, I would make 1971 = [3.4 +
> (0.8*1)/5], 1972 = [3.4 + (0.8*2)/5], and so on.
>
> I could obviously do this by hand, but for 140 countries and 30 years
> this would take some time.  So I take it that I would have to write some
> code to automate the process?  Since I am new to code-writing, any ideas
> would be very much appreciated.

Nick Cox

> This sounds like linear interpolation: see -ipolate-.
> Panel data should be interpolated separately -by <panelid>:-.

Jason Yackee, PhD Candidate; J.D.

> > For my panel data set I have a variable ("education") that has only been
> > collected every five years.  My data set is otherwise annual; I would
> > like to fill in the missing data for "education" on the basis of a
> > regression/trend line between each five-year observation,

```
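
For anyone who wants to try the one-line suggestion from this thread, here is a minimal sketch that types in Jason's hypothetical Mexico rows and runs -ipolate- on them. The data values are the illustrative ones from his message, not real figures, and the variable name Education_ipo is simply the one Rodrigo chose. The (Year) in the by-prefix is an addition here to make the within-country sort order explicit; it is not required by the original command.

```stata
* Hypothetical data copied from Jason's Mexico example in this thread
clear
input str4 Country Year Education
"Mex." 1970 3.4
"Mex." 1971 .
"Mex." 1972 .
"Mex." 1973 .
"Mex." 1974 .
"Mex." 1975 4.2
"Mex." 1976 .
"Mex." 1977 .
"Mex." 1978 .
"Mex." 1979 .
"Mex." 1980 4.7
end

* Linearly interpolate Education on Year, separately within each country
bysort Country (Year): ipolate Education Year, generate(Education_ipo)

* Compare the original and the interpolated series side by side
list Country Year Education Education_ipo, sepby(Country)
```

If this behaves as described in the thread, the 1971 value of Education_ipo should match Jason's hand calculation, 3.4 + (0.8*1)/5 = 3.56, up to floating-point precision, and the years already measured should be left unchanged. Keeping the result in a separate variable, rather than overwriting Education, makes it easy to see which values were observed and which were filled in.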