# Re: st: A question about bootstrap

From: Stas Kolenikov
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: A question about bootstrap
Date: Tue, 10 Aug 2004 22:45:51 -0400 (EDT)

> 1. Is my understanding of the use of bootstrap correct? Any special
> thing I should take care in my case?

Yes. Clustering. I cannot imagine you have an i.i.d. sample of American
households. Or Zimbabwean households. Or Chinese households. Nobody does. If
they did not teach you in your microeconometrics or labor class that
data coming from surveys are subject to clustering, ...  well, too
bad, to say the least. To give you an idea -- most likely, your standard
errors are underestimated by a factor of 2 to 3, and that is a far
greater discrepancy than the one from failing to account for the
first-stage estimation. And correcting for clustering properly is not an
easy task. See

Rao, J. N. K., and Wu, C. F. J. (1988). Resampling Inference With Complex
Survey Data. Journal of the American Statistical Association (JASA), 83,
pp. 231-241.

You would need to figure out the -strata()- and -cluster()- options of
-bootstrap- (the cluster variable being your PSU) to make those
corrections. It is not very intuitive if you have no experience with it.
Read the help for the survey commands and the description of your data to
find the strata, cluster / PSU and weight variables.
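To make the idea concrete, here is a rough sketch in Python (not Stata) of what resampling with strata and clusters means: whole PSUs are drawn with replacement within each stratum, so households in the same PSU always travel together. The field names "stratum" and "psu" are assumptions made up for this illustration. This is the naive version only; the rescaling adjustments discussed by Rao and Wu (1988) and the survey weights are not implemented here.

```python
import random
from collections import defaultdict

def cluster_bootstrap_sample(rows, rng):
    """Draw one bootstrap replicate by resampling whole PSUs within strata.

    rows: list of dicts with the (hypothetical) keys "stratum" and "psu",
    plus whatever data each observation carries.
    """
    # Group observations by stratum, then by PSU within stratum.
    by_stratum = defaultdict(lambda: defaultdict(list))
    for row in rows:
        by_stratum[row["stratum"]][row["psu"]].append(row)

    replicate = []
    for stratum, psus in by_stratum.items():
        psu_ids = sorted(psus)
        # Draw as many PSUs as the stratum contains, with replacement.
        drawn = rng.choices(psu_ids, k=len(psu_ids))
        for new_id, psu in enumerate(drawn):
            for row in psus[psu]:
                # Relabel the PSU so that a PSU drawn twice counts as
                # two distinct clusters in the replicate.
                replicate.append({**row, "psu": (stratum, new_id)})
    return replicate
```

Calling it as `cluster_bootstrap_sample(rows, random.Random(12345))` gives one replicate; the estimator is then recomputed on each such replicate.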

In fact, if you can figure out the joint likelihood of your two stages,
you may be quite well off coding that as an -ml- routine, and what's more,
Stata can do cluster corrections and account for weights (at least in the
-lf- method of -ml-) if you just specify the -svy- option in your program.
Since explicit matrix formulae are available for IV estimators, this
might be doable, although maybe with some analytic and/or literature
detective work.

> 2. I know there is a bootstrap command in Stata. Can it be used to do
> bootstrap in a two-stage process?

It is a universal one; you can use it for five stages if you need to,
although you may end up writing a small program.  Not a do-file, but a
real -program- that -return-s something in the programming sense (not just
output, in the user-reading-from-the-screen sense).  See -help program-
and the programming manual. If you are about to attack -ml-, that's what
you would be doing anyway.
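As a language-neutral illustration of the same idea, here is a small Python sketch (not Stata): the whole two-stage procedure is wrapped in one function that returns a number, and the bootstrap calls that function on every resample, so both stages are re-estimated each time. The names `ols`, `two_stage` and `bootstrap_se`, and the (z, x, y) data layout, are made up for this example. For simplicity it resamples observations independently, which is only appropriate for i.i.d. data; with survey data the resampling step would have to draw clusters instead.

```python
import random
import statistics

def ols(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage(data):
    """Hypothetical two-stage estimator returning the second-stage slope.

    data: list of (z, x, y) tuples. Stage 1 regresses x on z; stage 2
    regresses y on the stage-1 fitted values.
    """
    z = [d[0] for d in data]
    x = [d[1] for d in data]
    y = [d[2] for d in data]
    a1, b1 = ols(z, x)                    # stage 1
    xhat = [a1 + b1 * zi for zi in z]     # fitted values
    _, b2 = ols(xhat, y)                  # stage 2
    return b2

def bootstrap_se(data, estimator, reps, rng):
    """Bootstrap standard error: rerun the full estimator on each resample."""
    draws = []
    for _ in range(reps):
        sample = rng.choices(data, k=len(data))  # i.i.d. resampling only
        draws.append(estimator(sample))
    return statistics.stdev(draws)
```

The point of the design is that `bootstrap_se` never looks inside the estimator; it only needs the returned value, which is what a -program- that -return-s its result provides.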

> 3. How can I obtain the standard error/confidence interval estimates
> from bootstrap of a non-linear function of parameter in Stata?

No problem, you just compute it for each of your bootstrap samples, and
then Stata will take care of it, with niceties like bias-corrected CI,
etc. Stata can also take care of the survey design issue, at least to some
extent, so that the samples coming from the lower-level -bsample-
command do reflect the clustering structure, and you don't need to do any
special tricks to your resulting dataset of the point estimates from your
bootstrap samples.
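A minimal Python sketch (not Stata) of what "compute it for each bootstrap sample" amounts to: the nonlinear function is evaluated on every resample, and the standard error and interval are read off the resulting collection of values. The ratio of two means is just a stand-in for whatever function of the parameters you need, and all names here are made up for the example. It uses a plain percentile interval, not the bias-corrected one mentioned above, and it resamples observations independently, so it ignores the clustering issue.

```python
import random
import statistics

def ratio_of_means(sample):
    """Hypothetical nonlinear function of two parameters: mean(y) / mean(x)."""
    xs = [x for x, _ in sample]
    ys = [y for _, y in sample]
    return statistics.fmean(ys) / statistics.fmean(xs)

def bootstrap_summary(data, stat, reps, rng, level=0.95):
    """Return (se, lower, upper) for stat using a percentile interval."""
    # Evaluate the statistic on each resample and sort the results.
    draws = sorted(stat(rng.choices(data, k=len(data))) for _ in range(reps))
    se = statistics.stdev(draws)
    alpha = (1 - level) / 2
    lower = draws[int(alpha * reps)]
    upper = draws[min(reps - 1, int((1 - alpha) * reps))]
    return se, lower, upper
```

Usage would look like `se, lo, hi = bootstrap_summary(data, ratio_of_means, 1000, random.Random(1))`, with `data` a list of (x, y) pairs.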

---                                    Stas Kolenikov
--       Ph.D. student in Statistics at UNC-Chapel Hill
- http://www.komkon.org/~tacik/  -- Stas.Kolenikov@unc.edu

* This e-mail and all attachments to it are not intended to provide any
* reasonable point of view and was transmitted to you in error. It
* should be immediately deleted by all recipients unless they really
* enjoy communicating with the author :). Other restrictions apply.
