
From: "Stas Kolenikov" <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Bootstrap: Which standard errors to use?
Date: Mon, 8 Dec 2008 14:34:11 -0600

On 12/8/08, Antoine Terracol <terracol@univ-paris1.fr> wrote:
>
> Those are exactly the reported standard errors in your second panel.
> Which, if I followed the thread correctly, should not come as a surprise
> since Anupit's original -bootstrap- command called -logit, robust-

Right, I did not really pay much attention up there :)). Well, the -robust- standard errors are in fact closer to the -oim- standard errors than to the bootstrap standard errors. It is difficult to come up with a meaningful suggestion in this situation as to which standard errors are better.

The (former) econometrician inside me would like to point out a scale indeterminacy. Judging by the variable names, this application models a 0/1 decision to buy something. If that decision is treated as an imperfect observation of an underlying continuous propensity to buy, then the identified combinations of parameters are "slope"/"standard deviation of the error term" rather than "slope", as is the case with linear regression. Biostatisticians would rightfully raise a brow here: "What is he talking about? This is a GLM with a canonical link... and the scale parameter here is 1". Well, this is a matter of interpretation! If you want an economics interpretation, then you would need to make sure you control that sigma in the denominator to really talk about betas being on the same scale across bootstrap samples (and only then will the bootstrap make sense), which unfortunately cannot be guaranteed.

Another aspect is the numeric stability of the logistic regression estimates. For some bootstrap samples, the logit estimates are not defined: say, if you sampled all zeroes, or as many ones as you have regressors in the model, so that the outcome of 1 can be perfectly predicted with coefficient values at infinity. In all likelihood, the samples that are "close", in some sense, to those extreme outcomes may also produce "large" estimates of the coefficients.
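One way to see how often such replicates occur in a given application is to save the bootstrap replicates and look at their distribution. The lines below are only a sketch under stated assumptions: the original data are not shown in this thread, so Stata's bundled auto dataset and the model -logit foreign mpg weight- are stand-ins, and the number of replications, the seed, and the file name bsreps are arbitrary choices. The /// line continuation assumes the commands are run from a do-file.

```stata
* Stand-in data and model (assumptions, not the original poster's setup)
sysuse auto, clear

* Ordinary (unstratified) bootstrap of the logit coefficients,
* saving the coefficients from every replicate to bsreps.dta
bootstrap _b, reps(500) seed(12345) saving(bsreps, replace): ///
    logit foreign mpg weight

* Inspect the replicates: a reduced count of nonmissing observations
* points to resamples where the fit failed, and extreme min/max values
* point to resamples that came close to perfect prediction
use bsreps, clear
summarize
```

A handful of very large replicate values can be enough to inflate the bootstrap standard error well beyond the -oim- and -robust- ones.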
Are those sensible outcomes for the bootstrap? Probably not; hence the bootstrap procedure might need to be modified to control the relative proportions of 0s and 1s. The simplest way is some sort of stratified bootstrap: resample separately as many zero outcomes as there were in the original sample, and as many ones as there were originally. Is that a better bootstrap scheme? At least it takes care of the infinite estimates issue. In Stata, you can do this by simply adding -strata(response_variable)- to your bootstrap options. Stratification usually brings down variances, and I would expect that in this case the standard errors will be much closer to the -oim- and -robust- ones.

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
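For illustration, the stratified bootstrap suggested in the message could look roughly as follows. This is a sketch rather than the original poster's code: the auto dataset, the model -logit foreign mpg weight-, the number of replications, and the seed are all assumptions made for the example, with the outcome variable foreign playing the role of response_variable. The /// line continuation assumes a do-file.

```stata
* Stand-in data and model (assumptions, not the original poster's setup)
sysuse auto, clear

* Stratified bootstrap: resampling is done separately within each value
* of the outcome, so every replicate keeps the original numbers of 0s and 1s
bootstrap _b, reps(500) seed(12345) strata(foreign): ///
    logit foreign mpg weight

* For comparison: robust (sandwich) and default OIM standard errors
logit foreign mpg weight, vce(robust)
logit foreign mpg weight
```

Comparing the three sets of standard errors side by side shows whether stratification brings the bootstrap results closer to the analytic ones, as the message anticipates.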

