
st: RE: No need for loops? [looped round again]


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: No need for loops? [looped round again]
Date   Mon, 8 Dec 2003 17:06:58 -0000

Last week Adrian de la Garza posted a question

> I have the next database in panel format and I need to take
> averages within countries when my dummy variable 'program'
> indicates it. Basically, I need to generate a variable that
> computes the 12-month average for the observations
> immediately prior to a 1 in 'program'.
>
> Take Brazil in 1983m3, when program = 1. I want my
> generated variable, let's call it ratio_before, to get the
> value of the 12-month average right before 1983m3, and put
> this value in the same line where program = 1. I know this
> average should be 0.00639.
>
> Then, I also want to take averages for the subsequent
> 12-months (including the month when program = 1), for the
> 12-month period after that, etc. These are always annual
> (12-month) averages for the first year when the program is
> implemented as indicated by the 1 in the dummy, then for
> the 2nd year, for the 3rd year, etc., and we can call these
> variables ratio_1y, ratio2y, ratio3y, ratio4y, and ratio5y.
> All of these computed averages should go in the same line
> where program = 1.
>
> The only thing is: I should stop taking the averages for
> the subsequent years if a new program is implemented (i.e.,
> if I find another 1 in my 'program' variable within the
> period for the average computed).
>
> date      country   ratio      program
> 1982m1    bra       0.001943   0
> 1982m2    bra       0.003863   0

I indicated two ways of attacking this:

1. Using -by:- together with two devices: calculating the sum over
# periods as the difference between two cumulative sums, and
reversing time as a way of looking forwards.

2. Using -tsspell- from SSC.
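
The two devices in approach 1 can be sketched as follows. This is only
an illustration on made-up data, not the code from the earlier posting:
the toy dataset and the names cum, rcum, negdate, prev12 and next12 are
assumptions, and it presumes monthly dates with no gaps within country.

```stata
* Toy data (hypothetical): 2 countries x 24 months, ratio = 1, 2, ..., 24
clear
set obs 48
gen str3 country = cond(_n <= 24, "bra", "arg")
bysort country: gen date = ym(1982, 1) + _n - 1
format date %tm
by country: gen double ratio = _n

* Device 1: a 12-month sum is the difference between two cumulative sums
bysort country (date): gen double cum = sum(ratio)
by country: gen double prev12 = (cum[_n-1] - cum[_n-13]) / 12 if _n > 13
* the 13th month has exactly 12 predecessors, so no subtraction is needed
by country: replace prev12 = cum[_n-1] / 12 if _n == 13

* Device 2: reverse time, so that earlier observations are later months
gen negdate = -date
bysort country (negdate): gen double rcum = sum(ratio)
by country: gen double next12 = (rcum - rcum[_n-12]) / 12 if _n > 12
by country: replace next12 = rcum / 12 if _n == 12
sort country date
```

Note that -sum()- treats missing values as zero, so with gaps or missing
values in ratio the divisor 12 would no longer be right.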

As a postscript, here is another way of doing it using the -egen-
function -filter()- available from -egenmore- on SSC. Note that
this has _just_ been updated on SSC, thanks to Kit Baum,
precisely to make problems like this one easier. To install
or to replace, you need to

. ssc inst egenmore

or

. ssc inst egenmore, replace

as the case may be.

-filter()- was updated to Stata 8 while I had it open; the
previous version, which was for Stata 7, remains in -egenmore-
as -filter7()-.

In this problem, the key is to get variables such as

	the average of the previous 12 months
	the average of this month and the next 11 months
	whether a dummy is ever non-zero in the next 11 months
	whether a dummy is ever non-zero in the year after next

Each of these can be got in one line with -egen, filter()-.
However, -filter()- depends on a prior -tsset-, which in this problem is

. tsset country date

. egen prev12 = filter(ratio), lag(1/12) normalise

is the average of the previous 12, whereas

. egen next12 = filter(ratio), lag(-11/0) normalise

is the average of this month and the next 11.

. egen pronext11 = filter(program), lag(-11/-1)

sums -program- over the next 11 months, so that a positive result
means that another program starts within that period, while

. egen pronextyear = filter(program), lag(-23/-12)

does the same for the year beyond that (12 to 23 months ahead).
The results can then be restricted to -if program == 1-, etc.,
if desired.
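
Putting these pieces together for the first two of the requested
variables might look like the sketch below. It assumes that -egenmore-
is installed, that date is a numeric monthly date, and that country is
held as a string, in which case -tsset- needs a numeric identifier
first. The name cid is made up here, and setting ratio_1y to missing
when another program starts within the year is just one reading of the
original request.

```stata
* Assumes the variables country, date, ratio and program from the question
* and that egenmore has been installed from SSC.
encode country, gen(cid)     // cid is a hypothetical name for a numeric id
tsset cid date

egen prev12    = filter(ratio),   lags(1/12)   normalise
egen next12    = filter(ratio),   lags(-11/0)  normalise
egen pronext11 = filter(program), lags(-11/-1)

* Keep results only in the observations where a program starts
gen ratio_before = prev12 if program == 1

* One possible rule: drop the first-year average if a new program intervenes
gen ratio_1y = next12 if program == 1 & pronext11 == 0
```

The later years would follow the same pattern with -lags(-23/-12)-,
-lags(-35/-24)- and so on.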

A note on the technicalities follows my signature.

Nick
n.j.cox@durham.ac.uk

Syntax of -egen, filter()-
==========================

filter(timeseriesvar) , lags(numlist) [ coef(numlist) { normalise | normalize } ]

calculates the linear filter which is the sum of terms

coef_i * Li.timeseriesvar   or   coef_i * Fi.timeseriesvar

coef() defaults to a vector the same length as lags() with each
element 1.

filter(y), l(0/3) c(0.4(-0.1)0.1) calculates

0.4 * y + 0.3 * L1.y + 0.2 * L2.y + 0.1 * L3.y

filter(y), l(0/3) calculates the sum

1 * y + 1 * L1.y + 1 * L2.y + 1 * L3.y

or, more simply put,

y + L1.y + L2.y + L3.y

Leads are specified as negative lags.  -normalise- (or -normalize-,
according to taste) specifies that coefficients are to be divided
by their sum so that they add to 1 and thus specify a weighted
mean.

filter(y), l(-2/2) c(1 4 6 4 1) n

calculates

(1/16) * F2.y + (4/16) * F1.y + (6/16) * y + (4/16) * L1.y + (1/16) * L2.y
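
As a check on the arithmetic, the same weighted mean can be written out
with time-series operators alone. This is a sketch on a made-up series
(t, y and byhand are hypothetical names) and needs nothing from SSC. The
-egen- call is left as a comment for anyone with -egenmore- installed.

```stata
* Toy series (hypothetical): y = t^2 for t = 1, ..., 10
clear
set obs 10
gen t = _n
gen double y = t^2
tsset t

* The expression that filter(y), l(-2/2) c(1 4 6 4 1) n is stated to compute
gen double byhand = (F2.y + 4*F1.y + 6*y + 4*L1.y + L2.y) / 16

* With egenmore installed, the corresponding call would be
* egen viaegen = filter(y), lags(-2/2) coef(1 4 6 4 1) normalise

list t y byhand
```

Here byhand is missing for the first two and last two observations,
because a lead or lag is unavailable there; elsewhere it works out as
y + 1 for this particular series.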

The data must have been declared time series data by -tsset-.
Note that this may include panel data, which are automatically
filtered separately within each panel.

The order of terms in -coef()- is taken to be the same as that in
-lags()-.

Stata 8 is required.

(What has changed over the previous version is principally
that the -coef()- vector is no longer required. Thus -filter()-
produces sums by default and -filter(), normalise- produces
unweighted, or equally weighted, moving averages by
default.)

Nick
n.j.cox@durham.ac.uk

