Statalist The Stata Listserver


Re: st: vce(boot) - clustering?

From   Scott Cunningham <>
Subject   Re: st: vce(boot) - clustering?
Date   Tue, 11 Apr 2006 11:52:32 -0400

I don't think there is anything special to -xtpoisson- with the
bootstrap variance estimation; just see what [R] bootstrap says. To be
on the safe side, I would specify -vce(bootstrap, cluster(id))- so that
the bootstrap variance estimator knows that your data are clustered
(and that is how -vce(boot)- should default with the -xt- data, but I
don't know the details in the guts of Stata). This will produce "kind
of" cluster standard errors (and -cluster- is a generalization of
-robust-, so you are getting both for the price of one).
I'm wondering whether the bootstrap syntax changed between Stata 8 and Stata 9. -vce(boot)- does not seem to accept a cluster option, at least when I use it with the -xtpoisson- command. I tried:

. xi:xtpoisson depvar indepvar, fe i(id) vce(boot cluster(id))

and got:

(running xtpoisson on estimation sample)
Unknown function cluster()
error in expression: cluster(id)
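
A note on the error above: it looks like a parsing issue rather than proof that clustering is unsupported. The suggested form has a comma after -bootstrap-, and without it cluster(id) appears to be read as an expression to bootstrap rather than as a suboption. A minimal sketch of the comma form follows. It is untested here, and whether -xtpoisson- accepts -cluster()- inside -vce()- in a given Stata release is an assumption to check against [R] bootstrap:

. * same command as above, with a comma separating -bootstrap- from its suboptions
. xi: xtpoisson depvar indepvar, fe i(id) vce(bootstrap, cluster(id))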

The -bootstrap- command does allow this kind of clustering, and I have code from Cameron at UC-Davis for getting clustered bootstrapped standard errors. But a while back, before I understood the problem of serial correlation well enough to talk about it, I was told that -vce(boot)- in Stata 9 subsumed what Cameron's code was doing. Going back over that old correspondence, it's not clear to me that -vce(boot)- is clustering. And is the [R] entry on bootstrapping going to help me with the -vce(boot)- option to -xtpoisson-? I only have the Stata 8 manuals, not the Stata 9 ones, so I wasn't sure whether I was even looking at the right thing.

There's still a caveat with dependent data: you need to resample fewer
clusters than there were in the original data, at least as suggested
by Rao & Wu (1988), see below. I would imagine that this is even more
important with discrete data, such as in Poisson regression models.
Rao & Wu (1988) suggest using #clusters-3, to match the third moments
of the bootstrap and the empirical third moments. I don't know how you
would go about it in Stata; it gives you an option -size()- for the
bootstrap samples, but it looks like it is applicable to the data set
as a whole -- doesn't look like those two options are compatible with
one another :((. (Stata Corp., can you possibly fix -bsample- so that
it respects -size()- along with -cluster-, with the understanding that
-size()- means the number of clusters to be resampled?)
I'm going to have to read Rao & Wu (1988) closely to follow what you're saying here. Thanks for the citation.

And finally to make your results reproducible, you would want to
specify -set seed- right before your -xtpoisson- command.
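
As a minimal sketch of that advice (the seed value is an arbitrary placeholder, and depvar, indepvar, and id are the placeholder names used earlier in the thread):

. * fix the random-number seed so the bootstrap resamples can be reproduced
. set seed 20060411
. xtpoisson depvar indepvar, fe i(id) vce(bootstrap)

Rerunning both commands with the same seed on the same data should give the same bootstrap standard errors.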

I'd be hugely surprised if the bootstrap standard errors were way off
the analytical standard errors; and if they were, I would still trust
the analytical vce more than the bootstrap vce, as it is more
difficult to get the proper bootstrap vce in a clustered situation.
Remember that the bootstrap sampling should exactly reproduce the
sampling process and the dependencies in your data; if you fail to do
so, the bootstrap will be severely biased. So the panel bootstrap
should resample the panels as a whole, and probably reproduce the
attrition within the panels too, if you were being really fair :)).