# Re: st: AW: Simulating stepwise regression

From: Tirthankar Chakravarty
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: AW: Simulating stepwise regression
Date: Fri, 7 Aug 2009 12:10:16 +0100

```You should probably use -simulate-. Here is what it might look like:

***********************************
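capture program drop sim
version 10
program define sim, rclass
drop _all
* nreg() = number of candidate regressors, nobs() = sample size
syntax , nreg(integer) nobs(integer)
set obs `nobs'
* y and every x are independent standard normal draws
forv i=1/`nreg' {
g x`i' = invnormal(uniform())
}
gen y = invnormal(uniform())
* backward elimination; the final model's R-squared is left in e(r2)
stepwise, pr(.2): regress y x*
qui indeplist
return scalar r2d2 = e(r2)
end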

/*
simulate for each of the regressor and
sample size combinations required.
10,000 replications.
*/
foreach nobs of numlist 1000 1500 2000 {
forv nreg = 1(1)10 {
simulate r2d2=r(r2d2), reps(10000) ///
saving(sw_r2_`nobs'_`nreg'.dta, every(1) ///
replace) seed(123): sim, nreg(`nreg') ///
nobs(`nobs')
}
}
use sw_r2_1000_5, clear
kdensity r2d2
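
* A possible follow-up, offered only as a sketch: report the mean
* R-squared for each combination, which is what the question below asks for.
* Assumptions: the loop above has finished, and the sw_r2_*.dta files
* it saved are in the current working directory.
foreach nobs of numlist 1000 1500 2000 {
forv nreg = 1(1)10 {
use sw_r2_`nobs'_`nreg', clear
* -summarize- leaves the mean of r2d2 in r(mean)
qui summarize r2d2
di "n = `nobs', k = `nreg': mean R2 = " %6.4f r(mean)
}
}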
***********************************************

On Fri, Aug 7, 2009 at 11:18 AM, John Antonakis <john.antonakis@unil.ch> wrote:
> That's very helpful; thanks Martin.
>
> To extend the below, how would I simulate the r-square? That is, I want to
> run the simulation say 100 times, and then obtain the mean r-square from
> each simulation. Thus, I can show, at a specific sample size (n=100) and
> number of independent variables (k=5), what the r-square would be just by
> chance alone.
>
> As an extension, is there a way to vary the sample size (n from 50 to 1000,
> in increments of 50) and the number of independent variables (k=1 to k=100
> in increments of 1) in the simulation?
>
> Best,
> J.
>
> ____________________________________________________
>
> Prof. John Antonakis
> Associate Dean Faculty of Business and Economics
> University of Lausanne
> Internef #618
> CH-1015 Lausanne-Dorigny
> Switzerland
>
> Tel ++41 (0)21 692-3438
> Fax ++41 (0)21 692-3305
>
> Faculty page:
> http://www.hec.unil.ch/people/jantonakis&cl=en
>
> Personal page:
> http://www.hec.unil.ch/jantonakis
> ____________________________________________________
>
>
>
> On 07.08.2009 12:06, Martin Weiss wrote:
>>
>> <>
>> You could also -tokenize- the return from -indeplist- and have your
>> -program- return the regressors one by one...
>>
>>
>> *************
>> capt prog drop sim
>>
>> version 10.1
>>
>> program define sim, rclass
>>  drop _all
>>        set obs 100
>>        gen y = invnorm(uniform())
>>        gen x1 = invnorm(uniform())
>>        gen x2 = invnorm(uniform())
>>        gen x3 = invnorm(uniform())
>>        gen x4 = invnorm(uniform())
>>        gen x5 = invnorm(uniform())
>>        stepwise, pr(.2): regress y x1-x5
>>        qui indeplist
>>        tokenize "`r(X)'"
>>        ret loc one="`1'"
>>        ret loc two="`2'"
>>        ret loc three="`3'"
>>        ret loc four="`4'"
>>        ret loc five="`5'"
>> end
>>
>> sim
>>
>> ret li
>> *************
>>
>>
>>
>> HTH
>> Martin
>>
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of John Antonakis
>> Sent: Friday, 7 August 2009 11:47
>> To: statalist@hsphsun2.harvard.edu
>> Subject: st: Simulating stepwise regression
>>
>> Hi:
>>
>> I would like to simulate the below. Note, I am no fan of stepwise--I just
>> want to demonstrate its evils.
>>
>> However, I do not know
>>
>> 1. what to put in the place of "??"--that is, I want the program to
>> capture only the variables that were selected in the model as being
>> significant
>>
>> 2. how to simulate the r-square.
>>
>> 3. how to extend the simulation (a new program) such that I simulate from
>> n = 50 to n=1000 (in increments of 50), crossed with independent variables
>> ranging from x1 to x100.
>>
>> Regards,
>> John.
>>
>> Here is the program:
>>
>> set seed 123456
>>
>> capture program drop sim
>>  version 10.1
>> program define sim, eclass
>>        drop _all
>>
>> set obs 100
>>
>> gen y = invnorm(uniform())
>> gen x1 = invnorm(uniform())
>> gen x2 = invnorm(uniform())
>> gen x3 = invnorm(uniform())
>> gen x4 = invnorm(uniform())
>> gen x5 = invnorm(uniform())
>>
>> stepwise, pr(.2): regress y x1-x5
>>  end
>>
>> simulate ??? , reps(20) seed (123) : sim,
>>
>> foreach v in ?? {
>>  gen t_`v' = /*
>> */_b_`v'/_se_`v'
>>  gen p_`v' =/*
>> */ 2*(1-normal(abs(t_`v')))
>> }
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>>

--
To every ω-consistent recursive class κ of formulae there correspond
recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
belongs to Flg(κ) (where v is the free variable of r).

```