From: Stefano Rossi <sr525@cornell.edu>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: how can i make my loop run faster?
Date: Mon, 19 Sep 2011 10:50:40 -0400

Partho,

Many thanks for this, it is very helpful. This raises one question, though: a crucial part of my procedure is that I need to run each regression on only 12 observations for each firm-period pair. That is, if firm i has data back to period t=-50, say, I still have to run the regression only on the 12 observations from -1 to -12, ignoring all others. This worked well with my loop, but I do not readily see how to do this with statsby. Can you please advise?

Best,
Stefano

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Partho Sarkar
Sent: Monday, September 19, 2011 1:06 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: how can i make my loop run faster?

Stefano

You don't seem to be actually making any use of the panel structure of the data. Stata has very neat built-in procedures for dealing with such data. Very briefly, 2 pointers (I am ignoring the special wrinkle in your problem that you want to run 20 separate regressions for each "firm i-period t" pair; you would have to adapt the procedure accordingly):

A. I would use -tsfill, full- to fill in the time values and balance the panel.

B. If you use tsset panelvar datevar (or xtset), where panelvar is your panel identifier and datevar the date variable, you can use:

    statsby _b _se, by(panelvar): regress y x

to do all the regressions in one go (assuming a single regression for each "firm i-period t" pair), rather than separately within a long loop. You can collect other stored results as well (e() results after -regress-), as with _b & _se above. See -help statsby-

Having said all that, I have never tried to run a set of regressions with 30,000 firms & 200 time periods in a single run of a program! I suspect this will be painfully slow no matter how efficient your code. An obvious alternative would be to split the firms into, say, 10 subsets, do the regressions for each subset, and put all the results together.
Hope this helps

Partho Sarkar
Consultant Econometrician
Indicus Analytics
New Delhi, India

On Mon, Sep 19, 2011 at 5:22 AM, Stefano Rossi <sr525@cornell.edu> wrote:
> Dear Statalist Users,
>
> I wonder if you can help me make a faster loop?
> I have an unbalanced panel of about 30,000 firms and 200 periods, and for each "firm i-period t" pair I need to run 10 regressions on the 12 observations from t-1 to t-12 of the same firm i, and another 10 regressions on the observations from t+1 to t+12 of the same firm i. I have come up with the following program, which works well as it does what it should do, but it is very slow (due to the many ifs I suspect) - here's a simplified version of it with just two regressions:
>
> forval z = 1/30000 {
>     levelsof period if firm==`z', local(sample)
>     foreach j of local sample {
>         local k = `j' - 13
>         capture reg y x if firm ==`z' & period<`j' & period>`k' & indicator==1
>         if _rc==0 {
>             predict y_hat, xb
>             replace before = y_hat[_n-1] if firm == `z' & period == `j'
>             drop y_hat
>         }
>         local w = `j' + 13
>         capture reg y x if firm ==`z' & period>`j' & period<`w' & indicator==1
>         if _rc==0 {
>             predict y_hat, xb
>             replace after = y_hat[_n+1] if firm == `z' & period == `j'
>             drop y_hat
>         }
>     }
> }
>
> Right now, it takes several minutes for each firm, so if I run it for the whole sample it would take weeks.
> Is there any way to make it (a lot) faster?
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
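
Editorial sketch (not part of the original messages): one possible way to restrict each regression to a 12-period window without a hand-written loop is Stata's -rolling- prefix, which on xtset data runs the command separately within each panel. The example below is a minimal illustration on made-up toy data. The 5 firms, 40 periods, simulated y and x, and the tempfile name betas are all assumptions for the demonstration. It also assumes that "before" is meant to be the fitted value at period t-1 from the regression on periods t-12 to t-1, and it leaves out the indicator==1 condition for brevity.

```stata
* Toy panel standing in for the real data (illustrative only)
clear
set seed 12345
set obs 5
gen firm = _n
expand 40
bysort firm: gen period = _n
gen x = rnormal()
gen y = 1 + 2*x + rnormal()
xtset firm period

* One regression per firm per 12-period window; coefficients are
* written to a temporary file, the data in memory stay unchanged
tempfile betas
rolling _b, window(12) saving(`betas', replace): regress y x

* A window covering [t-12, t-1] has end == t-1, so key it to period t
preserve
use `betas', clear
gen period = end + 1
keep firm period _b_x _b_cons
save `betas', replace
restore

* Attach the window coefficients to each firm-period observation
merge 1:1 firm period using `betas', keep(master match) nogenerate
sort firm period

* Fitted value at t-1 from the "before" regression (missing when the
* firm has no complete 12-period window ending at t-1)
gen before = _b_cons + _b_x * L.x
```

For the "after" regressions the same saved coefficients could be keyed with period = start - 1 instead, since a window covering t+1 to t+12 has start == t+1. -rolling- is still a loop internally, so this is not guaranteed to be fast on 30,000 firms; splitting the firms into subsets, as suggested above, may still be needed.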
