st: summarizing data over 3 or 5 year periods for macro panel data.

From   Manhal Mohammad Ali <>
Subject   st: summarizing data over 3 or 5 year periods for macro panel data.
Date   Wed, 20 Jul 2011 03:07:32 +0100

Dear Stata users,

For macro panel data sets (where N is not very large and T is moderate),
researchers often average the data over a fixed number of periods,
such as 3, 5, or 10 years, for example in growth

I have panel data from 1980 to 2009 for 39 countries, and I want to
average the variables y and x over 3- or 5-year periods and then
regress the period average of y on the period average of x. My
variables y and x are GDP and inflation. This is what I have done so
far, using 3-year periods as an example:

egen idthird = seq(), block(3)
bysort id idthird: egen my = mean(y)

and similarly for variable x, giving mx.

This gives me averages over non-overlapping 3-year groups. I then
want to regress my, the 3-year average of y, on mx, the 3-year
average of x.

But this is the type of data I get (I did not put values for y and x
for simplicity)
Country  year  idthird  my   mx   y  x
a        1980  1        2.5  1.9
b        1981  1        2.5  1.9
c        1982  1        2.5  1.9
d        1983  2        2.8  1.3
e        1984  2        2.8  1.3
f        1985  2        2.8  1.3
g        1986  3        1.6  1.3

where 2.5 is the average of y over the first three years (stored in my)
and 1.9 is the average of x over the first three years (stored in mx).

Now, how can I regress my, the 3-year average of y, on mx, the 3-year
average of x, given that the data look like the table above?

You can see the problem: the value 2.5, the average for the first
three years, appears three times, then the average 2.8 appears three
times for the next three years, and so on for my, and similarly for
mx. Shouldn't there be a single entry of 2.5 for group 1 (1980-1982),
then a single entry of 2.8 for group 2 (1983-1985), and likewise for
mx?
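
To make the structure I have in mind concrete, here is a minimal sketch of one way it might be obtained. It assumes the panel is identified by id and year, that every country has all years 1980 to 2009, and that losing the annual observations is acceptable; the name period is hypothetical and only used for illustration.

```stata
* Sketch only: assumes variables id, year, y and x as described above.
* collapse replaces the data in memory, so keep a copy to restore later.
preserve

* Hypothetical block index: 1980-1982 -> 1, 1983-1985 -> 2, and so on.
* Replace 3 with 5 for 5-year periods.
generate period = floor((year - 1980) / 3) + 1

* Reduce the data to one observation per country and period,
* holding the block means of y and x.
collapse (mean) my = y mx = x, by(id period)

* Regress the averaged y on the averaged x.
regress my mx

* Bring back the original annual data.
restore
```

If the annual observations should stay in memory, an alternative would be to flag one observation per group with egen tag = tag(id idthird) and run regress my mx if tag, which uses each group average only once.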

My basic question is how to generate variables averaged over 3 or 5
periods and then run a regression using those new averaged variables.
I would really appreciate your kind help, as I am doing an MSc
dissertation and I am running a little behind.
Thank you very much.

Manhal Ali
University of Bristol.
