Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Bryan Sayer <bsayer@chrr.osu.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Bootstrapping & clustered standard errors (-xtreg-)
Date: Thu, 08 Sep 2011 17:20:35 -0400

... The sampling weights control mostly for unequal probabilities of selection, and for well-designed and well-conducted surveys, non-response adjustments are not that large, while probabilities of selection might differ quite notably.

Just my 2 cents worth.

Bryan Sayer
Monday to Friday, 8:30 to 5:00
Phone: (614) 442-7369
FAX: (614) 442-7329
BSayer@chrr.osu.edu

On 9/8/2011 4:28 PM, Stas Kolenikov wrote:

Tobias,

I would say that you are worried about exactly the wrong things. The sampling weights control mostly for unequal probabilities of selection, and for well-designed and well-conducted surveys, non-response adjustments are not that large, while probabilities of selection might differ quite notably. While it is true that if you can fully condition on the design variables and non-response propensity, you can ignore the weights, I have yet to see an example where that would happen. Believing that your model is perfect is... uhm... naive, let's put it mildly; if anything, econometrics is moving away from such strong assumptions as "my model is absolutely right" towards robust methods of inference that allow for some minor deviations from the "absolutely right" scenario.

There are no assumptions of normality made anywhere in the process of calculating the standard errors. All arguments are asymptotic, and you see z- rather than t-statistics in the output. In fact, the arguments justifying the bootstrap are asymptotic as well.

You can still entertain the bootstrap idea, but basically the only way to check that you've done it right is to compare the bootstrap standard errors with the clustered standard errors. If they are about the same, either of them is usable; if they are wildly different (say by more than 50%), I would not use either of them, but I would first check that the bootstrap was done right.

I know that PNAS is a huge impact factor journal in natural sciences, but a statistics journal? or an econometrics journal? I mean, it's cool to have a paper there on your resume, but I doubt many Statalist subscribers look at this journal for methodological insights (some data miners or bioinformaticians or other statisticians on the margin of computer science do publish in PNAS, though). I would not turn to an essentially applied psychology paper for advice on clustered standard errors.
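The check described above (compare bootstrap standard errors with clustered standard errors and be suspicious if they differ wildly) can be illustrated outside Stata. What follows is a minimal Python sketch, not a reproduction of what -xtreg- or -bootstrap- compute: the data are simulated, the model is a one-regressor OLS rather than a fixed-effects regression, the sandwich formula omits any small-sample correction, and every function name (`ols`, `cluster_robust_se`, `cluster_bootstrap_se`, `simulate`) is hypothetical.

```python
import random
import statistics


def ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    xbar, ybar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def flatten(clusters):
    """Pool a list of clusters (lists of (x, y) pairs) into x and y lists."""
    xs = [x for c in clusters for x, _ in c]
    ys = [y for c in clusters for _, y in c]
    return xs, ys


def cluster_robust_se(clusters):
    """Sandwich SE of the slope from cluster scores, no small-sample factor."""
    xs, ys = flatten(clusters)
    a, b = ols(xs, ys)
    xbar = statistics.fmean(xs)
    sxx = sum((x - xbar) ** 2 for x in xs)
    # "Meat": squared within-cluster sums of (centered x) * residual.
    meat = sum(sum((x - xbar) * (y - a - b * x) for x, y in c) ** 2
               for c in clusters)
    return meat ** 0.5 / sxx


def cluster_bootstrap_se(clusters, reps, rng):
    """SD of the slope over resamples that draw whole clusters with replacement."""
    slopes = []
    for _ in range(reps):
        resample = [rng.choice(clusters) for _ in clusters]
        slopes.append(ols(*flatten(resample))[1])
    return statistics.stdev(slopes)


def simulate(n_clusters, n_per_cluster, rng):
    """Toy data (an assumption): x and the error each share a cluster component."""
    clusters = []
    for _ in range(n_clusters):
        x_common, e_common = rng.gauss(0, 1), rng.gauss(0, 1)
        rows = []
        for _ in range(n_per_cluster):
            x = x_common + rng.gauss(0, 1)
            y = 1.0 + 0.5 * x + e_common + rng.gauss(0, 1)
            rows.append((x, y))
        clusters.append(rows)
    return clusters


rng = random.Random(2011)
data = simulate(n_clusters=40, n_per_cluster=10, rng=rng)
se_cluster = cluster_robust_se(data)
se_boot = cluster_bootstrap_se(data, reps=400, rng=rng)
print(f"cluster-robust SE: {se_cluster:.4f}")
print(f"bootstrap SE:      {se_boot:.4f}")
print(f"ratio boot/robust: {se_boot / se_cluster:.2f}")
```

The key design choice is that the bootstrap resamples whole clusters rather than individual observations, which is what makes it comparable to the clustered standard error. A ratio far from 1 would, following the advice above, be a reason to recheck the bootstrap setup first.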
The error that you report probably comes from the bootstrap producing a sample with fewer cluster identifiers than regressors in your model. Normally, this would be rectified by specifying the -idcluster()- option; however, in some odd cases, the bootstrap samples may still be underidentified. I don't know whether the fixed effects regression should be prone to such empirical underidentification. It might be, given that not all of the parameters of an arbitrary model are identified (the slopes of the time-invariant variables aren't).

On Thu, Sep 8, 2011 at 3:30 AM, Tobias Pfaff <tobias.pfaff@uni-muenster.de> wrote:

Dear Stas, Cam,

Thanks for your input!

I want to bootstrap as a robustness check since the residuals of my FE regression are not normally distributed, and bootstrapping does not assume normality of the residuals (e.g., Headey et al. 2010, appendix p. 3, http://www.pnas.org/content/107/42/17922.full.pdf?with-ds=yes).

If I do bootstrapping with clustered standard errors as Jeff has explained, I get the following error message:

    insufficient observations
    an error occurred when bootstrap executed xtreg, posting missing values

Cam, you say that I would need custom bootstrap weights. My dataset provides individual weights with adjustments for non-response etc. I do not use weights for the regression because the possible selection bias is mitigated by the fact that the variables which could cause the bias are included as control variables (e.g., income, employment status). Thus, I would argue that my model is complete and the unweighted analysis leads to unbiased estimators.

1. Would you still include weights for the bootstrapping?
2. Does bootstrapping need more degrees of freedom than the normal estimation of -xtreg-, so that I get the above error message?
3. If bootstrapping is not a good idea in this case, what can I do to counter the breach of the normality assumption of the residuals?
(I already checked transformation of the variables, but that doesn't help.)

Regards,
Tobias

-----Original Message-----
Date: Wed, 7 Sep 2011 10:24:33 -0400
Subject: RE: st: Bootstrapping & clustered standard errors (-xtreg-)
From: Cameron McIntosh <cnm100@hotmail.com>
To: statalist@hsphsun2.harvard.edu

Stas, Tobias,

I agree with Stas that there is not much point in using the bootstrap in this case, unless you have custom bootstrap weights computed by a statistical agency for a complex sampling frame, which would incorporate adjustments for non-response and calibration to known totals, etc. I don't think that is the case here, so I would go with the -cluster- SEs too.

My two cents,
Cam

Date: Wed, 7 Sep 2011 09:03:27 -0500
Subject: Re: st: Bootstrapping & clustered standard errors (-xtreg-)
From: skolenik@gmail.com
To: statalist@hsphsun2.harvard.edu

Tobias,

can you please explain why you need the bootstrap at all? The bootstrap standard errors are asymptotically equivalent to the regular -cluster- standard errors (in this case, with the number of clusters going off to infinity), and, if anything, it is easier to get the bootstrap wrong than right with difficult problems. If the -cluster- option works at all with -xtreg-, I see little reason to use the bootstrap. (Very technically speaking, in my simulations, I've seen the bootstrap standard errors to be more stable than -robust- standard errors with a large number of bootstrap repetitions, which have to be in an appropriate relation to the sample size; whether that carries over to the cluster standard errors, I don't know.)

On Tue, Sep 6, 2011 at 12:25 PM, Tobias Pfaff <tobias.pfaff@uni-muenster.de> wrote:

Dear Statalisters,

I do the following fixed effects regression:

    xtreg depvar indepvars, fe vce(cluster region) nonest dfadj

Individuals in the panel are identified by the variable "pid". The time variable is "svyyear". Data were previously declared as panel data with -xtset pid svyyear-.
Since one of my independent variables is clustered at the regional level (not at the individual level), I use the option -vce(cluster region)-.

Now, I would like to do the same thing with bootstrapped standard errors. I tried several commands; however, none of them works so far. For example:

    xtreg depvar indepvars, fe vce(bootstrap, reps(3) seed(1) cluster(region)) nonest dfadj

where I get the error message "option cluster() not allowed".

None of the hints in the manual (e.g., -idcluster()-, -xtset, clear-, -i()- in the main command) were helpful so far.

How can I tell the bootstrapping command that the standard errors should be clustered at the regional level while using "pid" for panel individuals?

Any comments are appreciated!
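Stas's explanation of the "insufficient observations" error rests on a property of resampling clusters with replacement: a bootstrap sample of G clusters contains, on average, only a share 1 - (1 - 1/G)^G of the distinct original clusters (roughly 65% for G = 12), because some clusters are drawn more than once. The Python sketch below illustrates this with 12 hypothetical regions, and shows how giving every draw a fresh identifier keeps repeated clusters separate, which is the purpose Stata documents for -idcluster()-. The function names are assumptions made for illustration, and the code does not reproduce Stata's implementation.

```python
import random


def resample_cluster_ids(cluster_ids, rng):
    """Draw as many clusters as in the data, with replacement.

    Returns (new_id, original_id) pairs; new_id is unique for every draw,
    so a cluster drawn twice is treated as two separate clusters.
    """
    draws = [rng.choice(cluster_ids) for _ in cluster_ids]
    return list(enumerate(draws, start=1))


def mean_distinct_share(n_clusters, reps, rng):
    """Average share of distinct original clusters across bootstrap samples."""
    ids = list(range(n_clusters))
    total = 0
    for _ in range(reps):
        sample = resample_cluster_ids(ids, rng)
        total += len({orig for _, orig in sample})
    return total / (reps * n_clusters)


rng = random.Random(1)
regions = list(range(1, 13))  # 12 hypothetical region identifiers
sample = resample_cluster_ids(regions, rng)
print("distinct original identifiers: ", len({orig for _, orig in sample}))
print("distinct relabeled identifiers:", len({new for new, _ in sample}))
print("simulated share of distinct originals:",
      round(mean_distinct_share(12, 2000, rng), 3))
print("theoretical share:", round(1 - (1 - 1 / 12) ** 12, 3))
```

With few regions, a model that needs one parameter per cluster-level quantity can run short of distinct clusters in some replications if the original identifiers are kept; relabeling restores the cluster count, although, as noted in the thread, some resamples may still be underidentified.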

