 Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: new variables creation with unbalanced panel

From: Laura

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: new variables creation with unbalanced panel

Date: Wed, 13 Oct 2010 08:56:43 -0700 (PDT)

```Mr. Kohler and Mr. Booth,

Thank you so much for the replies. I have tried the solution suggested by
Ulrich Kohler, but I have not been able to make it work yet, which may, of
course, be due to my lack of experience. I appreciate the suggestions for
further reading a lot.

I did not try the second suggestion from Eric Booth yet. It will take me a
little time to understand the code. Thank you so much for the time put into
this.

In my initial email I tried to simplify the problem a little. Here is a more
detailed explanation. My panel contains poverty indicators (var x), which are
not available every year; the gaps have various lengths, both across
countries and over time. I also have other macro variables, which are more or
less available yearly (var y).

I need to create new variables based on var x,
for example, yearly growth, between two consecutive available x's,
within a country:

growthx(t)=(log x(t)-log x(t-i))/i , where t-i and t are two consecutive years
for which var x is available.

Also, based on x, for year t, I need the variable initialx(t)=x(t-i), within a
country.

Based on y, for year t I need the initial value of y which is the value of y
in the previous year when x was observed

initialy(t)=y(t-i)

and the average value of y for all the years between t-i and t.

avgy(t)=mean(y(t-i), y(t-i+1), ..., y(t))

My simple-minded approach is to create a variable that stores the positions
where x is observed, for each country.

tsset id year
sort id year
by id (year), sort: gen pos=_n if mean<.

Then, for each nonmissing value of pos, I would determine the position of the
previous nonmissing pos. However, I am not familiar enough with Stata syntax
and I can't make this work. It would be something like this (only it would
need to work).

by id (year), sort: gen previous=max(range pos 1 pos-1) if pos<.

(this statement would have to find the maximum value of pos in the range
between the first observation of pos and the (current-1) observation, for each
country.)

If this can be accomplished, then I could refer to the values of x and y at
position previous, or in the range from previous to pos.

I don't know if this is actually feasible in Stata or not.

Again, many, many thanks for the help.

Gratefully,
Laura

```
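
The position-based idea described in the message can be sketched in Stata roughly as follows. This is only one possible reading of the request, not a tested solution to the poster's data. It assumes a long-format panel with variables named `id`, `year`, `x`, and `y` as in the post, and that `x` is strictly positive so that its log is defined. The helper names `prev`, `sumy`, and `cnty` are hypothetical and introduced here for illustration.

```stata
* Sketch only: assumes variables id, year, x, y exist (names from the post).
* prev, sumy and cnty are hypothetical helper variables.
sort id year

* Within-country position of the most recent EARLIER row where x is observed.
* Step 1 marks rows whose immediate predecessor has x; step 2 carries the
* last known position forward across rows where the predecessor lacks x.
by id (year): gen long prev = _n - 1 if !missing(x[_n-1])
by id (year): replace prev = prev[_n-1] if missing(prev)

* Values at the previous x-observed row. Under by:, subscripts are relative
* to the by-group, and x[prev] is missing whenever prev is missing.
by id (year): gen double initialx = x[prev] if !missing(x)
by id (year): gen double initialy = y[prev] if !missing(x)

* Log growth per year, where i = year - year[prev]
by id (year): gen double growthx = (ln(x) - ln(x[prev])) / (year - year[prev]) if !missing(x)

* Mean of nonmissing y over rows prev.._n, from running sums
* (sum() treats missing values as zero)
by id (year): gen double sumy = sum(y)
by id (year): gen long cnty = sum(!missing(y))
by id (year): gen double avgy =                                   ///
    (sumy - sumy[prev] + cond(missing(y[prev]), 0, y[prev])) /    ///
    (cnty - cnty[prev] + !missing(y[prev]))                       ///
    if !missing(x) & !missing(prev)
drop sumy cnty
```

In this sketch the new variables are filled in only on rows where `x` is observed, and they stay missing for the first observed `x` in each country, since there is no earlier value to compare with. `avgy` averages over the rows actually present in the data, so a year with no row at all, or with missing `y`, is simply left out of the mean.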