
# Re: st: Slow -rolling- regressions on panel data

| From | To | Subject | Date |
|---|---|---|---|
| Austin Nichols | statalist@hsphsun2.harvard.edu | Re: st: Slow -rolling- regressions on panel data | Mon, 26 Sep 2011 11:13:37 -0400 |

```
Nick Cox <njcoxstata@gmail.com>:
Except, not necessarily.
The link I provided to
http://www.stata.com/statalist/archive/2008-10/msg00136.html
indicates how you can generate 7 variables, including the regression
coefficients, without using -if- or -in- restrictions.
Suppose the window is 5 periods instead of 16, and try:

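* Grunfeld panel; the lag operators below rely on the panel/time
* settings stored with the dataset
webuse grunfeld, clear
rename mvalue y
set type double
* regressor is the one-period lag of y
generate x=l.y
generate xy=x*y
generate xx=x^2
* 5-period moving sums (current value plus lags 1 through 4)
generate sumxx=xx+l.xx+l2.xx+l3.xx+l4.xx
generate sumxy=xy+l.xy+l2.xy+l3.xy+l4.xy
generate sumx=x+l.x+l2.x+l3.x+l4.x
generate sumy=y+l.y+l2.y+l3.y+l4.y
* OLS slope (regression with a constant) computed from the moving sums
generate b=(5*sumxy-sumx*sumy)/(5*sumxx-sumx^2)
* check: b in observations 6 and 7 should match these two slopes
regress y x in 2/6, noheader
regress y x in 3/7, noheader
list company y x b in 1/7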

On Mon, Sep 26, 2011 at 10:50 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Whatever you do here is 250,000 regressions. That's the nub.
>
> Something that is going to be slow, almost always, is
>
> if level_firm == `l'
>
> as Stata will just go through all observations testing whether they qualify.
>
> Nick
>
> On Mon, Sep 26, 2011 at 3:37 PM, Richard Herron
> <richard.c.herron@gmail.com> wrote:
>> I am using -rolling- for rolling regressions on panel data, but it is
>> exceedingly slow. I found a Statalist thread
>> (http://www.stata.com/statalist/archive/2009-09/msg01239.html) with a
>> more manual solution, but it is equally slow (both are too slow to run
>> to completion in a reasonable amount of time).
>>
>> Is -regress- the bottleneck? I only want the AR(1) coefficient; is
>> there a different approach I should take? Are rolling
>> regressions/calculations best done in different software?
>>
>> Thanks!
>>
>> * ----- begin code -----
>> * generate data
>> clear
>> set obs 250000
>> egen firm = seq(), from(1) to(2500) block(100)
>> egen date = seq(), from(1) to(100)
>> generate eps = 1 + rnormal()
>> sort firm date
>> tsset firm date
>>
>> * generate variables for rolling regressions
>> bysort firm (date): generate l_eps = eps[_n - 1]
>> label variable l_eps "One-Quarter Lagged EPS"
>> bysort firm (date): generate end = _n
>> label variable end "Firm-Quarter (for rolling regressions)"
>>
>> * the simple approach is very slow
>> rolling _b, window(16) clear: regress eps l_eps, noconstant
>>
>> * and the approach from an old Statalist thread
>> * (http://www.stata.com/statalist/archive/2009-09/msg01239.html) is
>> * equally slow
>> tempfile tempfile_rr
>> egen level_firm = group(firm)
>> summarize level_firm, meanonly
>> forvalues l = 1/`r(max)' {
>>   rolling if level_firm == `l' ///
>>       , window(16) keep(firm) ///
>>       saving(`tempfile_rr', replace) nodots ///
>>       : regress eps l_eps, noconstant
>>   merge 1:1 firm end using "`tempfile_rr'" ///
>>       , update replace nogenerate keepusing(firm end _b_l_eps)
>> }
>> label variable _b_l_eps "Earnings Persistence"
>> * ----- end code -----

```
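The example in the reply uses a 5-period window and a regression with a constant, while the original question asks for a 16-period window with -noconstant-. The following is a minimal sketch, not code from the thread, of how the same moving-sum idea might be adapted to that case. Without a constant the slope reduces to sum(x*y)/sum(x^2), so only two moving sums are needed. The variable names (x, xy, xx, sumxy, sumxx, b), the seed, and the -forvalues- loop over lags are assumptions made for illustration.

```stata
* Sketch under stated assumptions; not code from the thread.
* Simulated panel as in the original question: 2,500 firms x 100 dates
clear
set seed 12345
set obs 250000
egen firm = seq(), from(1) to(2500) block(100)
egen date = seq(), from(1) to(100)
generate double eps = 1 + rnormal()
sort firm date
tsset firm date

* Ingredients of the no-constant slope: b = sum(x*y) / sum(x^2), x = L.eps
generate double x  = L.eps
generate double xy = x * eps
generate double xx = x^2

* 16-period moving sums built from lags 0 through 15; any missing lag
* makes the sum missing, so incomplete windows give a missing b
generate double sumxy = xy
generate double sumxx = xx
forvalues k = 1/15 {
    quietly replace sumxy = sumxy + L`k'.xy
    quietly replace sumxx = sumxx + L`k'.xx
}
generate double b = sumxy / sumxx

* Spot check for firm 1: the window ending at date 17 covers dates 2-17
regress eps x if firm == 1 & inrange(date, 2, 17), noconstant noheader
list firm date b if firm == 1 & date == 17
```

The listed value of b should agree with the coefficient on x from the -regress- spot check. One difference from -rolling- to keep in mind: this sketch leaves b missing unless all 16 lagged pairs are available, so the first estimate for each firm appears at date 17, whereas -rolling- with window(16) would presumably also report the window covering dates 1-16 using the 15 usable observations.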