
From: "Sayer, Bryan" <BSayer@s-3.com>
To: "'doug levy'" <doughstatalist@yahoo.com>, "'statalist@hsphsun2.harvard.edu '" <statalist@hsphsun2.harvard.edu>
Subject: st: RE: bootstrapping survey data
Date: Fri, 11 Jul 2003 15:30:12 -0400

I'd have to think about this for a while, but it still doesn't seem to me that you want to be doing both. Each bootstrap draw can't have the full population, and you need all observations to get correct variance estimates from the survey estimates. Plus, where do the variances come into play? Are they calculated on each replication and then used somewhere? And the sampling weights would need to be adjusted on each draw to reflect the full population (that's what replicate weights do).

My guess is you are better off attempting to reflect the sample design in the bootstrap process, and ignoring it in the initial estimation.

Bryan Sayer
Statistician, SSS Inc.
bsayer@s-3.com

-----Original Message-----
From: doug levy [mailto:doughstatalist@yahoo.com]
Sent: Friday, July 11, 2003 10:30 AM
To: Sayer, Bryan; 'statalist@hsphsun2.harvard.edu '
Subject: RE: bootstrapping survey data

The rationale for bootstrapping my coefficient estimates is to include some sense of the uncertainty of the parameters in a Monte Carlo simulation of the effect of a policy change. For the first-order Monte Carlo simulation I am taking draws from a conditional distribution defined by the logit model I estimate using the NHIS. A second-order Monte Carlo analysis reruns this simulation some large number of times, each time using a conditional distribution defined by the parameter estimates from one of the bootstrap draws. Thus, the error from the parameter estimates is taken into account in my simulation results.

All of that said, bootstrapping assumes that the empirical distribution of the sample is a reasonable approximation of the actual distribution in the population. In order to have the resamples mimic the initial sample, I would like to take account of the sampling design, to the extent that it is known to me.
While I don't have exact knowledge of the sampling design, it is my sense that for the purposes of making my simulation plausible, I should include what elements of the sampling design I can. Thus, my interest in bootstrapping my estimates from the NHIS.

I welcome any opinions on either my analysis strategy in general or the Stata problem in particular.

Best,
Doug Levy

--- "Sayer, Bryan" <BSayer@s-3.com> wrote:

> I don't understand the point of doing both a bootstrap and explicitly accounting for the sample design (I'm presuming you set PSU and strata since you specified svylogit).
>
> Anyway, bootstrapping of complex survey design data is not especially well developed, outside of replication weights. Generally, setting up the process requires knowledge of the sampling variables that might not be public.
>
> NHIS is properly handled in Stata simply by setting PSU, strata and pweight (absent a very small subpop).
>
> Bryan Sayer
> Statistician, SSS Inc.
>
> -----Original Message-----
> From: doug levy
> To: statalist@hsphsun2.harvard.edu
> Sent: 7/10/03 4:40 PM
> Subject: st: bootstrapping survey data
>
> Dear Statalisters,
> I am trying to get bootstrapped estimates of coefficients from a complex survey (the National Health Interview Survey). To get standard estimates, I would type "svylogit Y X" after having set up the weights, strata, and psu's. To get bootstrap estimates, my first inclination was to run 'bootstrap "logit Y X [pw=weight]" _b, reps(1000) strata(stratum) psu(psu)'. However, Stata does not allow weights in bootstrap. I was able to trick Stata into including weights by using only the weights in the svyset command and running 'bootstrap "svylogit Y X" _b, reps(1000) strata(stratum) psu(psu)'. This gives me point estimates similar to the non-bootstrapped estimates, which is reassuring, but will I get reasonable standard errors? Is there a more orthodox way of coding this?
> Many thanks for any and all advice,
> Doug Levy

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
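
For readers following the thread, the idea of "reflecting the sample design in the bootstrap process" can be illustrated with a minimal sketch: within each stratum, whole PSUs are drawn with replacement, and every observation in a drawn PSU is kept. This is not Stata code and not either poster's method. It is a language-neutral illustration in Python, and the function name `resample_psus`, the field names, and the toy data are all hypothetical. The sketch leaves the sampling weights unchanged; it does not perform the per-draw weight adjustment that replicate weights provide, which is the concern raised in the reply above.

```python
import random
from collections import defaultdict


def resample_psus(rows, rng):
    """Draw one bootstrap replicate (hypothetical helper).

    Within each stratum, sample PSUs with replacement (as many as the
    stratum originally contains) and keep all observations from every
    PSU drawn. Weights are copied through unchanged.
    """
    # Group observations by stratum, then by PSU within stratum.
    by_stratum = defaultdict(lambda: defaultdict(list))
    for row in rows:
        by_stratum[row["stratum"]][row["psu"]].append(row)

    replicate = []
    for stratum in sorted(by_stratum):
        psus = by_stratum[stratum]
        psu_ids = sorted(psus)
        # Resample whole PSUs, not individual observations.
        drawn = rng.choices(psu_ids, k=len(psu_ids))
        for draw_index, psu_id in enumerate(drawn):
            for row in psus[psu_id]:
                # Relabel so a PSU drawn twice counts as two clusters.
                replicate.append({**row, "boot_psu": (stratum, draw_index)})
    return replicate


# Hypothetical toy data: 2 strata, 2 PSUs each, 2 observations per PSU.
rows = [
    {"stratum": s, "psu": p, "y": y, "weight": w}
    for s, p, y, w in [
        (1, "a", 0, 10.0), (1, "a", 1, 10.0),
        (1, "b", 1, 20.0), (1, "b", 1, 20.0),
        (2, "c", 0, 15.0), (2, "c", 0, 15.0),
        (2, "d", 1, 5.0), (2, "d", 0, 5.0),
    ]
]

rng = random.Random(2003)  # fixed seed so the draw is reproducible
replicate = resample_psus(rows, rng)
```

Each replicate would then be passed to whatever estimator is being bootstrapped, and the spread of the resulting coefficient estimates across replicates is what feeds a second-order simulation of the kind described in the thread. Whether unadjusted weights give reasonable standard errors is exactly the open question in the discussion, so this sketch should not be read as settling it.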

