In response to a question about calculating a 60 day lagged std
dev, Nick Winter <nwinter@policystudies.com> suggested:
>
> gen sum1=0
> gen sum2=0
> gen entity2=entity^2
> sort year day
> forval i=0/59 {
> qui by year: replace sum1=sum1+entity[_n-`i']
> qui by year: replace sum2=sum2+entity2[_n-`i']
> }
> gen sd_lag60 = sqrt( (1/59) * (sum2 - (sum1^2)/60) )
>
> Basically, this creates x and x^2, then creates sum1 and sum2 in the
> -forvalues- loop, which are the sum of (x) and the sum of (x^2) for the
> sixty lagged observations. Then the standard deviation is easy to
> calculate.
>
This is a great approach for the problem, but I think there is a
quicker variant on his solution that avoids the forval loop (60x2
replace statements). Also, I think the problem was to calculate
the lagged sd BY Entity for the variable Value...
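bysort entity year (day): gen sum1 = sum(value)
by entity year (day): gen sum2 = sum(value^2)
by entity year (day): gen sd_lag60 = sqrt( (1/59) * ///
    ( (sum2 - sum2[_n-60]) - ((sum1 - sum1[_n-60])^2)/60 ) )
(the /// continues the command on the next line in a do-file)
This approach takes advantage of Stata's cumulative sum function,
sum(), so one can get the sum for any period by subtracting the
running total just before the period starts (here, at _n-60) from
the running total at its end (at _n). As written, the result is
missing for the first 60 observations of each entity-year group,
because sum1[_n-60] does not exist there; the 60th observation
could be recovered by using 0 in place of the missing lag.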
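The cumulative-sum idea can be checked on a small made-up dataset.
The sketch below is only an illustration: it uses a 3-observation
window instead of 60 so the direct calculation stays short, the data
are invented, and the names w1, w2, mean3, sd_lag3 and sd_check are
hypothetical. The cond() calls are my own addition so that the first
full window (observation 3 here) is not lost.

```stata
* Toy data: one entity, one year, ten days (invented for this check)
clear
set obs 10
gen entity = 1
gen year = 2000
gen day = _n
gen value = _n^2

* Running totals of x and x^2 within each entity-year group
bysort entity year (day): gen sum1 = sum(value)
by entity year (day): gen sum2 = sum(value^2)

* Window totals over observations _n-2, _n-1 and _n;
* cond() supplies 0 when there is no running total before the window
by entity year (day): gen w1 = sum1 - cond(_n > 3, sum1[_n-3], 0) if _n >= 3
by entity year (day): gen w2 = sum2 - cond(_n > 3, sum2[_n-3], 0) if _n >= 3
gen sd_lag3 = sqrt( (1/2) * (w2 - (w1^2)/3) )

* Direct calculation of the same standard deviation for comparison
by entity year (day): gen mean3 = (value + value[_n-1] + value[_n-2])/3
by entity year (day): gen sd_check = sqrt( ((value-mean3)^2 + ///
    (value[_n-1]-mean3)^2 + (value[_n-2]-mean3)^2) / 2 )

list day value sd_lag3 sd_check
```

The two columns should agree up to rounding from observation 3 on and
be missing before that.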
This approach (like Nick's) assumes no missing data, but it could
be extended to handle missing data with a little more work.
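One possible way to do that extra work, offered as a sketch rather
than a tested solution: sum() treats missing values as zero, so a
running count of nonmissing values can replace the fixed 60 and 59.
The names tot1, tot2, cnt, w1, w2 and wn are hypothetical, and the
window is still the last 60 observations (rows), not 60 calendar
days.

```stata
* Running totals; sum() treats missing values as zero
bysort entity year (day): gen tot1 = sum(value)
by entity year (day): gen tot2 = sum(value^2)
by entity year (day): gen cnt  = sum(!missing(value))

* Totals and nonmissing count over the last 60 observations,
* available from the 60th observation of each group onward
by entity year (day): gen w1 = tot1 - cond(_n > 60, tot1[_n-60], 0) if _n >= 60
by entity year (day): gen w2 = tot2 - cond(_n > 60, tot2[_n-60], 0) if _n >= 60
by entity year (day): gen wn = cnt  - cond(_n > 60, cnt[_n-60], 0)  if _n >= 60

* Sample sd over the nonmissing values in the window (needs at least 2)
gen sd_lag60 = sqrt( (w2 - (w1^2)/wn) / (wn - 1) ) if wn >= 2 & !missing(wn)
```

Whether a window with only a few nonmissing values should get a
standard deviation at all is a judgment call; the wn >= 2 condition
is just the minimum the formula needs.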
Michael Blasnik