# Re: st: AW: Simulating stepwise regression

| From | To | Subject | Date |
|---|---|---|---|
| John Antonakis | statalist@hsphsun2.harvard.edu | Re: st: AW: Simulating stepwise regression | Fri, 07 Aug 2009 18:44:47 +0200 |

```
Thanks Tirthankar!

I see that separate files are stored for each simulation. How could one
combine those results in one file?

Also, how would one generate a table (sample size on the horizontal and
number of predictors on the vertical) with the simulated r-squares?

Best,
J.

____________________________________________________

Prof. John Antonakis
Associate Dean Faculty of Business and Economics
University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Switzerland

Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305

Faculty page:
http://www.hec.unil.ch/people/jantonakis&cl=en

Personal page:
http://www.hec.unil.ch/jantonakis
____________________________________________________

On 07.08.2009 13:10, Tirthankar Chakravarty wrote:
```
```
You should probably use -simulate-. Here is what it might look like:

***********************************
capture program drop sim
version 10
program define sim, rclass
    * start each replication from an empty dataset
    drop _all
    syntax , nreg(integer ) nobs(integer )
    set obs `nobs'
    * nreg standard-normal regressors, all unrelated to y
    forv i=1/`nreg' {
        g x`i' = invnormal(uniform())
    }
    gen y = invnormal(uniform())
    * backward elimination: drop terms with p >= .2
    stepwise, pr(.2): regress y x*
    qui indeplist
    * R-squared of the model that stepwise retained
    return scalar r2d2 = e(r2)
end

/*
simulate for each of the regressor and
sample size combinations required.
10,000 replications.
*/
foreach nobs of numlist 1000 1500 2000 {
    forv nreg = 1(1)10 {
        simulate r2d2=r(r2d2), reps(10000) ///
            saving(sw_r2_`nobs'_`nreg'.dta, every(1) ///
            replace) seed(123): sim, nreg(`nreg') ///
            nobs(`nobs')
    }
}
* inspect one combination: 1000 observations, 5 candidate regressors
use sw_r2_1000_5, clear
kdensity r2d2
***********************************************

On Fri, Aug 7, 2009 at 11:18 AM, John Antonakis<john.antonakis@unil.ch> wrote:
```
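The two follow-up questions at the top of the thread (combining the per-combination files, and tabulating the simulated r-squares) are not answered in the messages archived here. What follows is a minimal sketch, not taken from the thread, of one way this might be done. It assumes the files `sw_r2_<nobs>_<nreg>.dta` written by the loop above are in the working directory and each contain the variable `r2d2`. The output name `sw_r2_all` and the added variables `nobs` and `nreg` are hypothetical. The `table` call uses the syntax of Stata 16 and earlier; newer releases use a different syntax for this command.

```stata
* Stack the per-combination result files into one dataset.
* Assumption: the loop above has already written sw_r2_<nobs>_<nreg>.dta.
tempfile all
local first = 1
foreach nobs of numlist 1000 1500 2000 {
    forvalues nreg = 1/10 {
        use sw_r2_`nobs'_`nreg', clear
        * tag every replication with its design cell
        gen nobs = `nobs'
        gen nreg = `nreg'
        * add everything stacked so far, then save the running total
        if `first' == 0 {
            append using `all'
        }
        save `all', replace
        local first = 0
    }
}
* hypothetical name for the combined file
save sw_r2_all, replace

* Mean simulated R-squared: number of predictors down the rows,
* sample size across the columns (pre-Stata-17 -table- syntax).
table nreg nobs, contents(mean r2d2) format(%6.4f)
```

Keeping one row per replication, rather than storing only the cell means, leaves the full distribution available for later use, for example a `kdensity` plot restricted to a single cell.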
```
That's very helpful; thanks Martin.

To extend the below, how would I simulate the r-square? That is, I want to
run the simulation say 100 times, and then obtain the mean r-square from
each simulation. Thus, I can show, at a specific sample size (n=100) and
number of independent variables (k=5), what the r-square would be just by
chance alone.

As an extension, is there a way to vary the sample size (n from 50 to 1000,
in increments of 50) and the number of independent variables (k=1 to k=100
in increments of 1) in the simulation?

Best,
J.

On 07.08.2009 12:06, Martin Weiss wrote:
```
```
<>
You could also -tokenize- the return from -indeplist- and have your
-program- return the regressors one by one...

*************
capt prog drop sim

version 10.1

program define sim, rclass
drop _all
set obs 100
gen y = invnorm(uniform())
gen x1 = invnorm(uniform())
gen x2 = invnorm(uniform())
gen x3 = invnorm(uniform())
gen x4 = invnorm(uniform())
gen x5 = invnorm(uniform())
stepwise, pr(.2): regress y x1-x5
qui indeplist
tokenize "`r(X)'"
ret loc one="`1'"
ret loc two="`2'"
ret loc three="`3'"
ret loc four="`4'"
ret loc five="`5'"
end

sim

ret li
*************

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of John Antonakis
Sent: Friday, 7 August 2009 11:47
To: statalist@hsphsun2.harvard.edu
Subject: st: Simulating stepwise regression

Hi:

I would like to simulate the below. Note, I am no fan of stepwise--I just
want to demonstrate its evils.

However, I do not know:

1. what to put in the place of "??"--that is, I want the program to
capture only the variables that were selected in the model as being
significant

2. how to simulate the r-square.

3. how to extend the simulation (a new program) such that I simulate from
n = 50 to n=1000 (in increments of 50), crossed with independent variables
ranging from x1 to x100.

Regards,
John.

Here is the program:

set seed 123456

capture program drop sim
version 10.1
program define sim, eclass
drop _all

set obs 100

gen y = invnorm(uniform())
gen x1 = invnorm(uniform())
gen x2 = invnorm(uniform())
gen x3 = invnorm(uniform())
gen x4 = invnorm(uniform())
gen x5 = invnorm(uniform())

stepwise, pr(.2): regress y x1-x5
end

simulate ??? , reps(20) seed (123) : sim,

foreach v in ?? {
gen t_`v' = /*
*/_b_`v'/_se_`v'
gen p_`v' =/*
*/ 2*(1-normal(abs(t_`v')))
}

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
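Martin's -tokenize- suggestion returns the selected variable names as macros, whereas -simulate- collects numeric results. As a hedged sketch (not from the thread) of how the "??" in the original question might be filled in, the program below returns one 0/1 indicator per candidate regressor, so that the mean of each indicator across replications is the share of runs in which stepwise kept that pure-noise variable. The names `simsel`, `sel1` to `sel5`, and `r2` are hypothetical, and the sketch assumes `indeplist` leaves the retained regressor names in `r(X)`, as Martin's code does.

```stata
capture program drop simsel
program define simsel, rclass
    version 10.1
    drop _all
    set obs 100
    * outcome and five candidate regressors, all independent noise
    gen y = invnormal(uniform())
    forvalues i = 1/5 {
        gen x`i' = invnormal(uniform())
    }
    stepwise, pr(.2): regress y x1-x5
    * names of the regressors that survived the selection
    quietly indeplist
    local kept `r(X)'
    forvalues i = 1/5 {
        * position of x`i' in the retained list; 0 if it was dropped
        local pos : list posof "x`i'" in kept
        return scalar sel`i' = (`pos' > 0)
    }
    return scalar r2 = e(r2)
end

simulate sel1=r(sel1) sel2=r(sel2) sel3=r(sel3) ///
    sel4=r(sel4) sel5=r(sel5) r2=r(r2), ///
    reps(100) seed(123): simsel

* means of sel1-sel5 are selection frequencies; mean of r2 is the
* average R-squared obtained by chance alone
summarize
```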