Statalist



Re: st: RE: bootstrapping standard errors with several estimated regressors


From   Steven Samuels <ssamuels@albany.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: bootstrapping standard errors with several estimated regressors
Date   Mon, 9 Jul 2007 16:39:07 -0400

You write that you "had thought that bootstrapping would cause statistical significance for all regressors to go down." I've not seen this in the bootstrap literature. Indeed, your example, and that of Maarten, suggest that there is no order relation between model-based estimated standard errors and those estimated by the bootstrap.

You might be thinking that bootstrapping should cause p-values to rise because regressors, as well as responses, are being resampled. This is not so. Assume the classical multiple regression model. If the X variables are random and independent of the error terms, then in the usual formula for V(b), (X'X)^(-1) is simply replaced by its expectation (W.H. Greene, Econometric Analysis, Macmillan, 1990).
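The point can be illustrated with a short simulation. This is only a sketch in Python rather than Stata, using made-up data; all names are mine. A pairs bootstrap re-draws each regressor together with its response, yet its standard error tracks the model-based one rather than systematically exceeding it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a classical regression with a random regressor.
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

def ols_slope_and_se(x, y):
    """OLS slope and its model-based (constant-SD) standard error."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], np.sqrt(cov[1, 1])

slope, model_se = ols_slope_and_se(x, y)

# Pairs bootstrap: resample (x, y) rows together, so the
# regressor is re-drawn along with the response each replication.
boot = []
for _ in range(1000):
    idx = rng.integers(0, n, size=n)
    b, _ = ols_slope_and_se(x[idx], y[idx])
    boot.append(b)
boot_se = np.std(boot, ddof=1)

print(model_se, boot_se)  # compare the two estimates
```

With the model assumptions satisfied, neither estimate is systematically larger than the other.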

You might also be thinking that the use of estimated regressors should lead to higher p-values than having the "true" regressors would. That sounds right, although I am no expert in this area, but it is irrelevant here: both the original and the bootstrapped standard errors are based on the same estimated regressors.

Perhaps you are confusing estimates of coefficients with estimates of the standard errors of coefficients. If the model assumptions are right, then the model-based estimate of the standard error and the bootstrap estimate of the standard error are both "good" estimates of the same quantity, the "true" standard error. However, the model-based estimate benefits from knowing that the model is true. In OLS, for example, the key assumption is a constant residual SD, so the model-based standard error is a function of only one quantity besides the X'X matrix, namely the estimated residual SD. The bootstrap estimate is valid even if the residual SD is not constant, as long as the observations are uncorrelated. The price for this greater validity is that, if the model is right, the bootstrap estimate of the standard error will be more variable than the model-based estimate. See Efron & Tibshirani, An Introduction to the Bootstrap, Chapman & Hall, 1993.
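The robustness point can also be seen in a small sketch (again Python rather than Stata, simulated data, names mine): when the error SD grows with the regressor, the model-based formula, which assumes a constant SD, understates the slope's variability, while the pairs bootstrap does not rely on that assumption:

```python
import numpy as np

rng = np.random.default_rng(1)

# Heteroskedastic data: error SD grows with |x|, violating the
# constant-SD assumption behind the model-based OLS formula.
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + np.abs(x) * rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - 2)
model_se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])

# Pairs bootstrap: valid under non-constant error SD, as long as
# observations are independent.
boot = []
for _ in range(1000):
    i = rng.integers(0, n, size=n)
    bb, *_ = np.linalg.lstsq(X[i], y[i], rcond=None)
    boot.append(bb[1])
boot_se = np.std(boot, ddof=1)

print(model_se, boot_se)  # the two can differ noticeably here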

-Steve


On Jul 9, 2007, at 5:25 AM, Erasmo Giambona wrote:


Thanks Maarten. Unfortunately, I have not been able to find any good reference on the issue. Any suggestions would be appreciated,
Erasmo

On 7/6/07, Maarten Buis <M.Buis@fsw.vu.nl> wrote:
I have no direct answer for you, except that in the example below exactly the opposite happens. Not that I claim that this should be generally true.

*------------- begin example ---------------
sysuse auto, clear

* One-factor model of car size; the predicted factor score is
* an estimated regressor in the regression that follows.
factor weight length headroom trunk, factors(1)
predict si
regress mpg si foreign

drop si
capture program drop boo
program define boo, rclass
    * Re-estimate the factor score within each bootstrap sample
    factor weight length headroom trunk, factors(1)
    predict si
    regress mpg si foreign
    return scalar si = _b[si]
    return scalar for = _b[foreign]
end

bootstrap si=r(si) for=r(for), reps(1000): boo
*------------- end example --------------

Hope this helps,
Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Erasmo Giambona
Sent: Friday, 6 July 2007 19:46
To: statalist
Subject: st: bootstrapping standard errors with several estimated regressors

Dear Statalist users,
I am estimating a model which includes several estimated regressors
and two observed regressors. Therefore, I am bootstrapping the
standard errors. I had thought that bootstrapping would cause
statistical significance for all regressors to go down. However, I
noticed that statistical significance goes down for the estimated
regressors while it actually goes up for the 2 observed regressors.
Is this what the theory predicts? Is there a good reference that
addresses this issue?
Any help would be appreciated.
Thanks very much,
Erasmo
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/



