

From: Partho Sarkar <partho.ss+lists@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: how can i make my loop run faster?
Date: Tue, 20 Sep 2011 10:03:04 +0530

I guess Stefano might have solved his problem by now, but just to complete this, here is a post by Brian R. Landy from an older thread which gives the complete code for -rolling-, including merging the results files.

http://www.stata.com/statalist/archive/2009-09/msg01239.html

The thread also points out the speed problems with rolling for panel data.

P.Sarkar

On Mon, Sep 19, 2011 at 10:18 PM, Partho Sarkar <partho.ss+lists@gmail.com> wrote:
> Sorry, I made a mistake in that post. -rolling- will only work on one
> panel at a time, so you could do:
>
> levelsof firm, local(firms)
> foreach j of local firms {
>     rolling _b if firm==`j', w(20) saving(tryroll`j'): regress y x
> }
>
> Partho
>
> On Mon, Sep 19, 2011 at 10:02 PM, Partho Sarkar
> <partho.ss+lists@gmail.com> wrote:
>> I think the -rolling- time series command can help do this. E.g.
>> once you a) tsset the panel as before, and b) sort the dataset by
>> -sort panelvar datevar-
>>
>> rolling _b, w(20) saving(tryroll): regress y x
>>
>> would divide up your entire time span into overlapping windows of
>> width 20, run a regression for each panel in each window, and save the
>> panel ids, the start & end of each window, and the regression
>> coefficients, in a Stata data file called "tryroll".
>>
>> See -help rolling- and the manual entry for details & examples. Given
>> your special requirements, you will probably have to do this in 2 or
>> more steps, and manipulate the results further to get exactly what you
>> want.
>>
>> Partho
>>
>> On Mon, Sep 19, 2011 at 8:20 PM, Stefano Rossi <sr525@cornell.edu> wrote:
>>> Partho,
>>>
>>> Many thanks for this, it is very helpful.
>>>
>>> This raises one question, though: a crucial part of my procedure is that I need to run regressions only on 12 observations for each firm-period pair; that is, if a firm i has data back to period t=-50, say, I still have to run the regression only on the 12 observations from -1 to -12, ignoring all others.
>>> This worked well with my loop, but I do not see readily how to do this with statsby. Can you please advise?
>>>
>>> Best,
>>>
>>> Stefano
>>>
>>> -----Original Message-----
>>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Partho Sarkar
>>> Sent: Monday, September 19, 2011 1:06 AM
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: Re: st: how can i make my loop run faster?
>>>
>>> Stefano
>>>
>>> You don't seem to be actually making any use of the panel structure of
>>> the data. Stata has very neat built-in procedures for dealing with
>>> such data.
>>>
>>> Very briefly, 2 pointers (I am ignoring the special wrinkle in your
>>> problem that you want to run 20 separate regressions for each "firm
>>> i-period t" pair; you would have to adapt the procedure accordingly):
>>>
>>> A. I would use -tsfill, full- to fill in the time values and balance the panel.
>>>
>>> B. If you use tsset panelvar datevar (or xtset), where panelvar is
>>> your panel identifier, and datevar the date variable, you can use:
>>>
>>> statsby _b _se, by(panelvar): regress y x
>>>
>>> to do all the regressions in one go (assuming a single regression for
>>> each "firm i-period t" pair), rather than separately within a long
>>> loop. You can collect the results that -regress- stores in e(), as with
>>> _b & _se above. See -help statsby-
>>>
>>> Having said all that, I have never tried to run a set of regressions
>>> with 30,000 firms & 200 time periods in a single run of a program!!!
>>> I suspect this will be painfully slow no matter how efficient your
>>> code. An obvious alternative would be to split the firms into, say, 10
>>> subsets, do the regression for each subset, and put all the results
>>> together.
>>>
>>> Hope this helps
>>>
>>> Partho Sarkar
>>> Consultant Econometrician
>>> Indicus Analytics
>>> New Delhi, India
>>>
>>> On Mon, Sep 19, 2011 at 5:22 AM, Stefano Rossi <sr525@cornell.edu> wrote:
>>>> Dear Statalist Users,
>>>>
>>>> I wonder if you can help me make a faster loop?
>>>> I have an unbalanced panel of about 30,000 firms and 200 periods, and for each "firm i-period t" pair I need to run 10 regressions on the 12 observations from t-1 to t-12 of the same firm i, and another 10 regressions on the observations from t+1 to t+12 of the same firm i. I have come up with the following program, which works well as it does what it should do, but it is very slow (due to the many ifs I suspect) - here's a simplified version of it with just two regressions:
>>>>
>>>> forval z = 1/30000 {
>>>>     levelsof period if firm==`z', local(sample)
>>>>     foreach j of local sample {
>>>>         local k = `j' - 13
>>>>         capture reg y x if firm ==`z' & period<`j' & period>`k' & indicator==1
>>>>         if _rc==0 {
>>>>             predict y_hat, xb
>>>>             replace before = y_hat[_n-1] if firm == `z' & period == `j'
>>>>             drop y_hat
>>>>         }
>>>>         local w = `j' + 13
>>>>         capture reg y x if firm ==`z' & period>`j' & period<`w' & indicator==1
>>>>         if _rc==0 {
>>>>             predict y_hat, xb
>>>>             replace after = y_hat[_n+1] if firm == `z' & period == `j'
>>>>             drop y_hat
>>>>         }
>>>>     }
>>>> }
>>>>
>>>> Right now, it takes several minutes for each firm, so if I run it for the whole sample it would take weeks.
>>>> Is there any way to make it (a lot) faster?
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
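For readers who want to try the per-firm -rolling- loop discussed in this thread, here is a minimal sketch. It is not the code from the linked 2009 post, and it has not been run against the poster's data. As assumptions for illustration only, it uses the small Grunfeld panel that ships with Stata (company, year, invest, mvalue) in place of firm, period, y and x; it uses a 12-period window to match the 12 observations asked for; and the file names tryroll`j' and tryroll_all are made up.

```stata
* Toy panel standing in for the thread's data (assumption):
* company ~ firm, year ~ period, invest ~ y, mvalue ~ x.
webuse grunfeld, clear
xtset company year

* One results file per firm, using 12-period windows.
levelsof company, local(firms)
foreach j of local firms {
    rolling _b if company == `j', window(12) ///
        saving(tryroll`j', replace): regress invest mvalue
}

* Stack the per-firm results files into a single dataset.
clear
foreach j of local firms {
    append using tryroll`j'
}
save tryroll_all, replace
```

Each row of the combined file should hold one window for one firm, with the window bounds in start and end and the coefficients in _b_mvalue and _b_cons. With 30,000 firms this writes 30,000 small files, so the speed concerns raised in the thread still apply.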

