Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Computing the proportion of significant variables after running numerous regressions
Date: Mon, 14 May 2012 10:15:52 +0100

No, you (and I) need to be more circumspect. After -bootstrap:
regress- the results in memory are a mix of results for -bootstrap-
and for the last replication of -regress-. So, you need to separate
that out in your code.

On Mon, May 14, 2012 at 9:52 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> You seem to be guessing that after -bootstrap: regress- there is a
> quantity left in memory called -_ci_bc_cons-. Not so. Also, each
> confidence interval is a pair of numbers, so you need to create two
> variables to hold it, not one. The trick to these calculations is to
> see what is left in memory after a command. By the way, 10
> replications would not be enough for most serious work.
>
> * load dataset
> sysuse auto, clear
>
> * set up temporary file for results
> tempfile results
> tempname postfile
> postfile `postfile' foreign _b_cons _se_cons _b_mpg _se_mpg _cons_ll _cons_ul _b_ll _b_ul using "`results'"
>
> * run bootstrapped regression for each level of foreign
> set seed 1 // so that you can repeat your analysis
> levelsof foreign, local(levels)
> foreach level of local levels {
>     bootstrap, rep(10): regress price mpg if foreign==`level'
>     mat ci = e(ci_bc)
>     post `postfile' (`level') (_b[_cons]) (_se[_cons]) (_b[mpg]) (_se[mpg]) (ci[1,2]) (ci[2,2]) (ci[1,1]) (ci[2,1])
> }
> postclose `postfile'
>
> * display results
> use "`results'", clear
> list
>
>
> On Mon, May 14, 2012 at 9:30 AM, George Murray
> <george.murray16@gmail.com> wrote:
>> Phil,
>>
>> Thank you so much for your help, this worked perfectly.
>>
>> I have one more query, however.
>>
>> I also need a vector of the bias-corrected confidence intervals (which
>> can be obtained with the -estat bootstrap- command). I replaced two of
>> the commands you suggested with these two commands as follows:
>>
>> -postfile `postfile' foreign _b_cons _se_cons _ci_bc_cons _b_mpg
>> _se_mpg using "`results'"- .............(all I did was add
>> "_ci_bc_cons")
>>
>> -post `postfile' (`level') (_b[_cons]) (_se[_cons]) (_ci_bc[_cons])
>> (_b[mpg]) (_se[mpg])- .............(all I did was add
>> "(_ci_bc[_cons])")
>>
>> and I also wrote -estat bootstrap- after the bootstrap, rep(10)... command
>>
>> However, I get the following error:
>>
>> _ci_bc not found
>> post: above message corresponds to expression 3, variable _ci_bc_cons
>> r(111);
>>
>> Does anyone know how to solve this problem?
>>
>>
>> On Mon, May 14, 2012 at 12:05 AM, Phil Clayton
>> <philclayton@internode.on.net> wrote:
>>> George,
>>>
>>> There are various ways to do this. One is to use -post- after each bootstrapped regression to store the results of that regression in a "results" dataset, similar to a Monte Carlo simulation. You can then access the results dataset and manipulate it however you like.
>>>
>>> Here's a basic example that uses the auto dataset and loops over the levels of "foreign" (i.e. 0 and 1), runs a bootstrapped regression of price on mpg for each level, and displays the resulting coefficients and standard errors.
>>>
>>> --------- begin example ---------
>>> * load dataset
>>> sysuse auto, clear
>>>
>>> * set up temporary file for results
>>> tempfile results
>>> tempname postfile
>>> postfile `postfile' foreign _b_cons _se_cons _b_mpg _se_mpg using "`results'"
>>>
>>> * run bootstrapped regression for each level of foreign
>>> set seed 1 // so that you can repeat your analysis
>>> levelsof foreign, local(levels)
>>> foreach level of local levels {
>>>     bootstrap, rep(10): regress price mpg if foreign==`level'
>>>     post `postfile' (`level') (_b[_cons]) (_se[_cons]) (_b[mpg]) (_se[mpg])
>>> }
>>> postclose `postfile'
>>>
>>> * display results
>>> use "`results'", clear
>>> list
>>> --------- end example ---------
>>>
>>> Since you're running ~1000 models you may wish to change "foreach" to "qui foreach", and monitor the iterations using the _dots command (see Harrison DA. Stata tip 41: Monitoring loop iterations. Stata Journal 2007;7(1):140, available at http://www.stata-journal.com/article.html?article=pr0030)
>>>
>>> Phil
>>>
>>>
>>> On 13/05/2012, at 10:06 PM, George Murray wrote:
>>>
>>>> Dear Statalist,
>>>>
>>>> I am using the -foreach- command to run approximately 1000
>>>> (bootstrapped) regression models, however I require an efficient way
>>>> of calculating the proportion of the regression models which have a
>>>> statistically significant constant at the 5% level; and of the
>>>> constants which are statistically significant, the proportion which
>>>> are positive. Below each of the 1000 regressions I run, a table is
>>>> displayed with the following format:
>>>>
>>>> ------------------------------------------------------------------------------
>>>>              |   Observed               Bootstrap
>>>>           V0 |      Coef.       Bias    Std. Err.  [95% Conf. Interval]
>>>> -------------+----------------------------------------------------------------
>>>>           V1 |  .00968169  -.0000537   .00057051    .008721   .0111218  (BC)
>>>>           V2 | -.00110469   .0000782     .000691  -.0023101    .000459  (BC)
>>>>           V3 |  .00468313  -.0001562   .00084971   .0031954   .0064538  (BC)
>>>>        _cons | -.00076976   .0001811   .00176677  -.0044496   .0025584  (BC)
>>>> ------------------------------------------------------------------------------
>>>>
>>>> I would be *very* grateful if someone knew the commands which would
>>>> allow me to calculate this. In the past, I have used (a highly tedious
>>>> and embarrassing approach on) Excel where I filtered every Nth row,
>>>> and wrote a command to display 1 if the coefficient lies within the
>>>> confidence interval, and 0 if not. This time, however, I am running
>>>> numerous models and require a quicker approach.
>>>>
>>>> One more question -- is there a way to create a new variable where the
>>>> coefficients of V1 (for example) are saved, so I can calculate the
>>>> mean, standard deviation etc. of V1?
>>>>
>>>> If someone could answer at least one of these two questions, it would
>>>> be very much appreciated.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
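
Once the interval limits are stored one row per model, the two proportions asked about in the original question can be read off as means of 0/1 indicators. The following is a minimal sketch, not a definitive answer. It assumes that "significant at the 5% level" is judged by whether the 95% bias-corrected interval excludes zero, and that the variable names match the -postfile- call quoted above. The data values are made up purely for illustration, and -sig- and -pos- are hypothetical names.

```stata
* Illustrative sketch only. The numbers below are made up; in practice
* the rows would come from the results file written by -post-.
clear
input _b_cons _cons_ll _cons_ul
 1.2   0.3   2.1
-0.8  -1.9   0.4
-2.0  -3.1  -0.5
 0.5   0.1   0.9
end

* sig (hypothetical name): 1 if the 95% interval excludes zero
generate byte sig = (_cons_ll > 0 | _cons_ul < 0) if !missing(_cons_ll, _cons_ul)

* pos (hypothetical name): 1 if the constant is positive, defined only
* for the significant models
generate byte pos = (_b_cons > 0) if sig == 1

* the mean of a 0/1 indicator is the proportion of 1s
summarize sig, meanonly
display "proportion significant: " r(mean)
summarize pos, meanonly
display "proportion positive, given significant: " r(mean)

* mean, standard deviation etc. of the stored coefficients
summarize _b_cons
```

With a real results file, the -clear- and -input- lines would be replaced by -use- on that file. The same -summarize- call applied to any stored coefficient variable gives its mean and standard deviation across models.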

