



Re: st: how can i make my loop run faster?


From   Partho Sarkar <partho.ss+lists@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: how can i make my loop run faster?
Date   Tue, 20 Sep 2011 10:03:04 +0530

I guess Stefano might have solved his problem by now, but just to
complete this thread, here is a post by Brian R. Landy from an older
thread which gives the complete code for -rolling-, including merging
the results files.

http://www.stata.com/statalist/archive/2009-09/msg01239.html

The thread also points out the speed problems with rolling for panel data.
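
For what it is worth, the per-firm approach quoted below can be sketched roughly as follows. This is only a sketch, not Brian Landy's code. It assumes the data are already -tsset firm period-, that the variables are named firm, period, y and x as in Stefano's post, and that firm is numeric. The file names tryroll* and tryroll_all are made up for illustration. It uses window(12) rather than 20 to match the 12-observation requirement. Note that a -rolling- window spans 12 time periods, which is not the same as 12 observations when a firm has gaps.

```stata
* Sketch under the assumptions stated above; adapt names and window width.
levelsof firm, local(firms)

* One -rolling- run per firm, each saving its own results file.
* -capture noisily- lets the loop continue if a firm cannot be estimated.
foreach j of local firms {
    capture noisily rolling _b if firm == `j', window(12) ///
        saving(tryroll`j', replace): regress y x
}

* Stack the per-firm results files into one dataset,
* then put the original data back in memory.
preserve
clear
foreach j of local firms {
    * skip firms for which no results file was written
    capture append using tryroll`j'
}
save tryroll_all, replace
restore
```

The stacked file can then be merged back onto the original data by firm and window end date. It is worth checking with -describe- that the results files carry the firm identifier before merging, and clearing out any tryroll* files left over from earlier runs.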

P.Sarkar

On Mon, Sep 19, 2011 at 10:18 PM, Partho Sarkar
<partho.ss+lists@gmail.com> wrote:
> Sorry, I made a mistake in that post. -rolling- will only work on one
> panel at a time, so you could do:
>
> levelsof firm, local(firms)
> foreach j of local firms {
>     rolling _b if firm==`j', w(20) saving(tryroll`j'): regress y x
> }
>
> Partho
>
> On Mon, Sep 19, 2011 at 10:02 PM, Partho Sarkar
> <partho.ss+lists@gmail.com> wrote:
>> I think the  -rolling- time series command can help do this.  E.g.
>> once you a) tsset the panel as before, and b) sort the dataset by
>> -sort panelvar datevar-
>>
>> rolling _b,w(20)  saving(tryroll): regress y x
>>
>> would divide up your entire time span into overlapping windows of
>> width 20, run a regression for each panel in each window, and save the
>> panel ids, the start & end of each window, and the regression
>> coefficients, in a Stata data file called "tryroll".
>>
>> See -help rolling- and the manual entry for details & examples.  Given
>> your special requirements, you will probably have to do this in 2 or
>> more steps, and manipulate the results further to get exactly what you
>> want.
>>
>> Partho
>>
>> On Mon, Sep 19, 2011 at 8:20 PM, Stefano Rossi <sr525@cornell.edu> wrote:
>>> Partho,
>>>
>>> Many thanks for this, it is very helpful.
>>>
>>> This raises one question, though: a crucial part of my procedure is that I need to run regressions only on 12 observations for each firm-period pair; that is, if a firm i has data back to period t=-50, say, I still have to run the regression only on the 12 observations from -1 to -12, ignoring all others.  This worked well with my loop, but I do not see readily how to do this with statsby.  Can you please advise?
>>>
>>> Best,
>>>
>>> Stefano
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Partho Sarkar
>>> Sent: Monday, September 19, 2011 1:06 AM
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: Re: st: how can i make my loop run faster?
>>>
>>> Stefano
>>>
>>> You don't seem to be actually making any use of the panel structure of
>>> the data.  Stata has very neat built-in procedures for dealing with
>>> such data.
>>>
>>> Very briefly, 2 pointers (I am ignoring the special wrinkle in your
>>> problem that you want to run 20 separate regressions for each "firm
>>> i-period t" pair- you would have to adapt the procedure accordingly):
>>>
>>> A.  I would use -tsfill, full- to fill in the time values and balance the panel.
>>>
>>> B. If you use tsset panelvar datevar (or xtset), where panelvar is
>>> your panel identifier, and datevar the date variable, you can use:
>>>
>>> statsby _b _se, by(panelvar): regress y x
>>>
>>> to do all the regressions in one go (assuming a single regression for
>>> each "firm i-period t" pair), rather than separately within a long
>>> loop.   You can collect any results the command leaves in e() or r(),
>>> as with _b & _se above.  See -help statsby-
>>>
>>> Having said all that, I have never tried to run a set of regressions
>>> with 30,000 firms & 200 time periods in a single run of a program!!!
>>> I suspect this will be painfully slow no matter how efficient your
>>> code. An obvious alternative would be to split the firms into, say, 10
>>> subsets, do the regression for each subset, and put all the results
>>> together.
>>>
>>> Hope this helps
>>>
>>> Partho Sarkar
>>> Consultant Econometrician
>>> Indicus Analytics
>>> New Delhi, India
>>>
>>>
>>> On Mon, Sep 19, 2011 at 5:22 AM, Stefano Rossi <sr525@cornell.edu> wrote:
>>>> Dear Statalist Users,
>>>>
>>>> I wonder if you can help me make a faster loop?
>>>> I have an unbalanced panel of about 30,000 firms and 200 periods, and for each "firm i-period t" pair I need to run 10 regressions on the 12 observations from t-1 to t-12 of the same firm i, and another 10 regressions on the observations from t+1 to t+12 of the same firm i.  I have come up with the following program, which works well as it does what it should do, but it is very slow (due to the many ifs I suspect) - here's a simplified version of it with just two regressions:
>>>>
>>>> forval z = 1/30000 {
>>>>     levelsof period if firm==`z', local(sample)
>>>>     foreach j of local sample {
>>>>         local k = `j' - 13
>>>>         capture reg y x if firm==`z' & period<`j' & period>`k' & indicator==1
>>>>         if _rc==0 {
>>>>             predict y_hat, xb
>>>>             replace before = y_hat[_n-1] if firm==`z' & period==`j'
>>>>             drop y_hat
>>>>         }
>>>>         local w = `j' + 13
>>>>         capture reg y x if firm==`z' & period>`j' & period<`w' & indicator==1
>>>>         if _rc==0 {
>>>>             predict y_hat, xb
>>>>             replace after = y_hat[_n+1] if firm==`z' & period==`j'
>>>>             drop y_hat
>>>>         }
>>>>     }
>>>> }
>>>>
>>>> Right now, it takes several minutes for each firm, so if I run it for the whole sample it would take weeks.
>>>> Is there any way to make it (a lot) faster?
>>>>
>>>>
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/statalist/faq
>>>> *   http://www.ats.ucla.edu/stat/stata/
>>>>
