



Re: st: Speed of bsample and nested loops


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Speed of bsample and nested loops
Date   Wed, 5 Oct 2011 21:34:55 +0100

I wouldn't

     capture drop w
     quietly gen w = .

I would go outside the loop

     gen w = .

and

   replace w = .

inside it. That's partly a style preference. I'm not sure how much faster it would be.

As you only want the mean

     quietly summarize ar`x' if boot_grp == `b' [fweight=w]

is overkill.

     summarize ar`x' if boot_grp == `b' [fweight=w], meanonly

would be faster.
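
Putting the two suggestions together, a minimal sketch might look like the following. It uses the auto dataset shipped with Stata as a stand-in for the original data, so price and foreign are placeholders for ar`x' and the id/cl/permno strata, not the poster's variables. How much time this saves is not something measured here.

```stata
* Sketch only: auto data, price and foreign stand in for the real variables
sysuse auto, clear

* Create the weight variable once, outside the loop
gen w = .

forvalues j = 1/3 {
    * Reset the existing variable instead of dropping and regenerating it
    quietly replace w = .

    * Draw j observations per stratum with replacement;
    * the sampling frequencies are stored in w
    bsample `j', strata(foreign) weight(w)

    * meanonly skips the statistics that are not needed
    summarize price [fweight=w], meanonly
    display "draws per stratum = `j'   mean = " r(mean)
}
```

With meanonly, summarize produces no output, so quietly is not needed, and r(mean) is still left behind for the next line to use.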

On Wed, Oct 5, 2011 at 8:51 PM, Poliquin, Christopher <cpoliquin@hbs.edu> wrote:

> I am trying to speed up my code for bootstrapping and suspect there are significant gains to be made because right now it is super slow.
>
> I am trying to draw samples of size 1-3 with replacement from a file with about 300,000 rows.  It is a panel dataset of companies and their daily stock returns for two years.
>
> I have written a little program to loop over groups of companies and draw samples of size 1-3 from 5 different variables with returns data.  The mean of the sample is then written to a file.
>
> Could someone please look at this code and suggest areas that could be modified to make this run at a reasonable speed?  I have omitted the beginning because the real issue is probably the nested loops.
>
> program bootstrapping
>        // Bootstrapping mean abnormal returns
>        // Pass sample name as first argument for saving output
>        // Pass replication number as second argument
>
>        egen boot_grp = group(id cl)
>        *[Some omitted stuff that is fast already]
>
>        // Open a file to hold the bootstrapped results.
>        file open boot using `1'_boots.txt, write text replace
>        file write boot "id" _tab "cl" _tab "sampsize" _tab "ar" _tab "mean" _n
>        forvalues k=1/`2' {
>                * This is the number of draws to make for each sample size
>                set seed `k'
>                forvalues j=1/3 {
>                        *Draws of size 1-3
>                        capture drop w
>                        quietly gen w = .
>                        // Sample with replacement, fweight in w
>                        bsample `j', strata(id cl permno) weight(w)
>                        foreach b of local boots {
>                                // Mean abnormal return for the sample
>                                // within id and cl grouping.
>                                forvalues x = 1/5 {
>                                        // Within each abnormal return measure...
>                                        quietly summarize ar`x' if boot_grp == `b' [fweight=w]
>                                        loc mu = r(mean) * 100
>                                        // Write bootstrapped means to the output file
>                                        file write boot "`idb`b''" _tab "`gb`b''" _tab
>                                        file write boot "`j'" _tab "`x'" _tab
>                                        file write boot "`mu'" _n
>                                }
>                        }
>                }
>        }
>        file close boot
> end
>


