
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: No need for loops? [looped round again]
Date: Mon, 8 Dec 2003 17:06:58 -0000

Last week Adrian de la Garza posted a question

> I have the next database in panel format and I need to take
> averages within countries when my dummy variable 'program'
> indicates it. Basically, I need to generate a variable that
> computes the 12-month average for the observations
> immediatly prior to a 1 in 'program'.
>
> Take Brazil in 1983m3, when program = 1. I want my
> generated variable, let's call it ratio_before, to get the
> value of the 12-month average right before 1983m3, and put
> this value in the same line where program = 1. I know this
> average should be 0.00639.
>
> Then, I also want to take averages for the subsequent
> 12-months (including the month when program = 1), for the
> 12-month period after that, etc. These are always annual
> (12-month) averages for the first year when the program is
> implemented as indicated by the 1 in the dummy, then for
> the 2nd year, for the 3rd year, etc., and we can call these
> variables ratio_1y, ratio2y, ratio3y, ratio4y, and ratio5y.
> All of these computed averages should go in the same line
> where program = 1.
>
> The only thing is: I should stop taking the averages for
> the subsequent years if a new program is implemented (i.e.,
> if I find another 1 in my 'program' variable within the
> period for the average computed).
>
> date country ratio program
> 1982m1 bra 0.001943 0
> 1982m2 bra 0.003863 0

I indicated two ways of attacking this:

1. Using -by:-, and two devices, calculating the sum
over # periods as the difference between two cumulative
sums, and reversing time as a way of looking forwards.

2. Using -tsspell- from SSC.

As a postscript, here is another way of doing it
using the -egen- function -filter()- available
from -egenmore- on SSC. Note that this has _just_
been updated on SSC, thanks to Kit Baum, precisely
to make problems like this one easier. To install
or to replace, you need to

. ssc inst egenmore

or

. ssc inst egenmore, replace

as the case may be.
-filter()- has been updated to Stata 8 while I had it
opened up; the previous version, which was for Stata 7,
remains in -egenmore- as -filter7()-.

In this problem, the key is to get variables such as

the average of the previous 12 months

the average of this month and the next 11 months

whether a dummy is ever non-zero in the next 11 months

whether a dummy is ever non-zero in the year after next

Each of these can be got in one line with -egen, filter()-.
However, it depends on a prior -tsset-, in this problem

. tsset country date

. egen prev12 = filter(ratio), lag(1/12) normalise

is the average of the previous 12, whereas

. egen next12 = filter(ratio), lag(-11/0) normalise

is the average of this month and the next 11.

. egen pronext11 = filter(program), lag(-11/-1)

sums -program- over the next 11 months, while

. egen pronextyear = filter(program), lag(-23/-12)

looks ahead a year beyond that. A result greater than zero
means that -program- is 1 at least once in that window.

Restricting the variable to -if program == 1-, etc.,
can then be done if desired.

A note on the technicalities follows my signature.

Nick
n.j.cox@durham.ac.uk

Syntax of -egen, filter()-
==========================

filter(timeseriesvar) , lags(numlist) [ coef(numlist) { normalise | normalize } ]

calculates the linear filter which is the sum of terms

coef_i * Li.timeseriesvar or coef_i * Fi.timeseriesvar

coef() defaults to a vector the same length as lags() with
each element 1.

filter(y), l(0/3) c(0.4(-0.1)0.1) calculates

0.4 * y + 0.3 * L1.y + 0.2 * L2.y + 0.1 * L3.y

filter(y), l(0/3) calculates the sum

1 * y + 1 * L1.y + 1 * L2.y + 1 * L3.y

or, more simply put,

y + L1.y + L2.y + L3.y

Leads are specified as negative lags.

-normalise- (or -normalize-, according to taste) specifies
that coefficients are to be divided by their sum so that
they add to 1 and thus specify a weighted mean.

filter(y), l(-2/2) c(1 4 6 4 1) n calculates

(1/16) * F2.y + (4/16) * F1.y + (6/16) * y + (4/16) * L1.y + (1/16) * L2.y

The data must have been declared time series data by -tsset-.
Note that this may include panel data, which are automatically
filtered separately within each panel.

The order of terms in -coef()- is taken to be the same as
that in -lags()-.

Stata 8 is required. (What has changed over the previous
version is principally that the -coef()- vector is no longer
required. Thus -filter()- produces sums by default and
-filter(), normalise- produces unweighted, or equally
weighted, moving averages by default.)

Nick
n.j.cox@durham.ac.uk
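For anyone who wants to check the arithmetic in the syntax note without -egenmore- to hand, the sums can be written out directly with time-series operators. What follows is only a sketch on a made-up series: the variable names t, y, sum4, mean4 and binom5 are invented for illustration, and the code does not call -filter()- itself. It simply spells out what the note says that filter(y), l(0/3), the same with -normalise-, and filter(y), l(-2/2) c(1 4 6 4 1) n calculate.

```stata
* Toy series, invented for illustration: y = 10, 20, ..., 80
clear
set obs 8
generate t = _n
generate y = 10 * _n
tsset t

* What filter(y), l(0/3) is described as calculating
generate sum4 = y + L1.y + L2.y + L3.y

* The same with -normalise-: coefficients divided by their sum, here 4
generate mean4 = (y + L1.y + L2.y + L3.y) / 4

* The weighted mean with lags -2/2 and coefficients 1 4 6 4 1 (sum 16)
generate binom5 = (F2.y + 4 * F1.y + 6 * y + 4 * L1.y + L2.y) / 16

list t y sum4 mean4 binom5
```

In these hand-written versions the result is missing wherever any lag or lead falls outside the data, as in the first three observations of sum4 here.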
