

From: Partho Sarkar <partho.ss+lists@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: how can i make my loop run faster?
Date: Mon, 19 Sep 2011 22:02:29 +0530

I think the -rolling- time series command can help do this. E.g., once you

a) tsset the panel as before, and
b) sort the dataset by -sort panelvar datevar-,

    rolling _b, w(20) saving(tryroll): regress y x

would divide up your entire time span into overlapping windows of width 20, run a regression for each panel in each window, and save the panel ids, the start & end of each window, and the regression coefficients in a Stata data file called "tryroll".

See -help rolling- and the manual entry for details & examples. Given your special requirements, you will probably have to do this in 2 or more steps, and manipulate the results further to get exactly what you want.

Partho

On Mon, Sep 19, 2011 at 8:20 PM, Stefano Rossi <sr525@cornell.edu> wrote:
> Partho,
>
> Many thanks for this, it is very helpful.
>
> This raises one question, though: a crucial part of my procedure is that I need to run regressions only on 12 observations for each firm-period pair; that is, if a firm i has data back to period t=-50, say, I still have to run the regression only on the 12 observations from -1 to -12, ignoring all others. This worked well with my loop, but I do not see readily how to do this with statsby. Can you please advise?
>
> Best,
>
> Stefano
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Partho Sarkar
> Sent: Monday, September 19, 2011 1:06 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: how can i make my loop run faster?
>
> Stefano
>
> You don't seem to be actually making any use of the panel structure of
> the data. Stata has very neat built-in procedures for dealing with
> such data.
>
> Very briefly, 2 pointers (I am ignoring the special wrinkle in your
> problem that you want to run 20 separate regressions for each "firm
> i-period t" pair; you would have to adapt the procedure accordingly):
>
> A. I would use -tsfill, full- to fill in the time values and balance the panel.
>
> B. If you use tsset panelvar datevar (or xtset), where panelvar is
> your panel identifier, and datevar the date variable, you can use:
>
>     statsby _b _se, by(panelvar): regress y x
>
> to do all the regressions in one go (assuming a single regression for
> each "firm i-period t" pair), rather than separately within a long
> loop. You can collect saved results (e() or r()), as with
> _b & _se above. See -help statsby-
>
> Having said all that, I have never tried to run a set of regressions
> with 30,000 firms & 200 time periods in a single run of a program!!!
> I suspect this will be painfully slow no matter how efficient your
> code. An obvious alternative would be to split the firms into, say, 10
> subsets, do the regression for each subset, and put all the results
> together.
>
> Hope this helps
>
> Partho Sarkar
> Consultant Econometrician
> Indicus Analytics
> New Delhi, India
>
> On Mon, Sep 19, 2011 at 5:22 AM, Stefano Rossi <sr525@cornell.edu> wrote:
>> Dear Statalist Users,
>>
>> I wonder if you can help me make a faster loop?
>> I have an unbalanced panel of about 30,000 firms and 200 periods, and for each "firm i-period t" pair I need to run 10 regressions on the 12 observations from t-1 to t-12 of the same firm i, and another 10 regressions on the observations from t+1 to t+12 of the same firm i.
>> I have come up with the following program, which works well as it does what it should do, but it is very slow (due to the many ifs I suspect) - here's a simplified version of it with just two regressions:
>>
>> forval z = 1/30000 {
>>     levelsof period if firm==`z', local(sample)
>>     foreach j of local sample {
>>         local k = `j' - 13
>>         capture reg y x if firm ==`z' & period<`j' & period>`k' & indicator==1
>>         if _rc==0 {
>>             predict y_hat, xb
>>             replace before = y_hat[_n-1] if firm == `z' & period == `j'
>>             drop y_hat
>>         }
>>         local w = `j' + 13
>>         capture reg y x if firm ==`z' & period>`j' & period<`w' & indicator==1
>>         if _rc==0 {
>>             predict y_hat, xb
>>             replace after = y_hat[_n+1] if firm == `z' & period == `j'
>>             drop y_hat
>>         }
>>     }
>> }
>>
>> Right now, it takes several minutes for each firm, so if I run it for the whole sample it would take weeks.
>> Is there any way to make it (a lot) faster?
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
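The -rolling- suggestion can be adapted to the 12-observation requirement roughly as follows. This is only a sketch on simulated data, not code from the thread: the toy panel (5 firms, 40 periods, no gaps), the seed, the tempfile names and the way "before" is rebuilt from the saved coefficients are all assumptions made for illustration, and the "indicator==1" condition from the original loop is left out.

```stata
* Hypothetical toy panel: 5 firms x 40 periods (simulated, balanced)
clear
set seed 12345
set obs 5
generate firm = _n
expand 40
bysort firm: generate period = _n
generate x = rnormal()
generate y = 1 + 2*x + rnormal()
xtset firm period
tempfile panel
save `panel'

* One regression per firm for every window of 12 consecutive periods
rolling _b, window(12) saving(tryroll, replace): regress y x

* A window ending at t-1 covers t-12..t-1, so attach it to period t
use tryroll, clear
generate period = end + 1
tempfile coefs
save `coefs'

* Bring the coefficients back to the firm-period data
use `panel', clear
merge 1:1 firm period using `coefs', keep(master match) nogenerate
sort firm period

* Fitted value at t-1 from the regression on t-12..t-1
* (assumed to mirror y_hat[_n-1] in the original loop)
generate before = _b_cons + _b_x * L.x
```

For the forward-looking regressions (t+1 to t+12), the same results file could be reused by setting period = start - 1 instead of end + 1. With an unbalanced panel the windows are defined on the time variable, so some windows will contain fewer than 12 observations; whether to keep or drop those is a separate decision.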

