Re: st: vce(boot) - clustering?
I'm wondering whether the bootstrap syntax changed between Stata 8
and Stata 9? -vce(boot)- does not seem to have a cluster option, at
least when I use it with the -xtpoisson- command. I tried:
I don't think there is anything special about -xtpoisson- with the
bootstrap variance estimation; just see what [R] bootstrap says. To be
on the safe side, I would specify -vce(bootstrap, cluster(id))- so that
the bootstrap variance estimator knows that your data are clustered
(that is how -vce(boot)- should default with -xt- data, but I
don't know the details in the guts of Stata). This will produce "kind
of" cluster standard errors (and -cluster- is a generalization of
-robust-, so you are getting both for the price of one).
. xi:xtpoisson depvar indepvar, fe i(id) vce(boot cluster(id))
(running xtpoisson on estimation sample)
Unknown function cluster()
error in expression: cluster(id)
The command -bootstrap- does allow for this kind of clustering, and I
have code from Cameron at UC-Davis on a way to get clustered
bootstrapped standard errors. But a while back, before I understood
the problem of serial correlation well enough to talk about it, I was
told that -vce(boot)- in Stata 9 subsumed what Cameron's code was
doing. In going back over that old correspondence, though, it's not
clear to me that -vce(boot)- is clustering. And the [R] entry on
bootstrapping: is that going to help me with the -vce(boot)- option
of -xtpoisson-? I don't have the Stata 9 manuals, only the Stata 8
manuals, and so wasn't sure if I was even looking at the correct
thing or not.
I'm going to have to read Rao & Wu (1988) closely to follow what
you're saying here. Thanks for the citation.
There's still a caveat with dependent data: you need to resample fewer
clusters than there were in the original data, at least as suggested
by Rao & Wu (1988), see below. I would imagine that this is even more
important with discrete data, such as in Poisson regression models.
Rao & Wu (1988) suggest using #clusters-3, to match the third moments
of the bootstrap and the empirical third moments. I don't know how you
would go about it in Stata; it gives you an option -size()- for the
bootstrap samples, but it looks like it applies to the data set
as a whole -- it doesn't look like those two options are compatible with
one another :((. (Stata Corp., can you possibly fix -bsample- so that
it respects -size()- along with -cluster-, with the understanding that
-size()- means the number of clusters to be resampled?)
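
To make the "resample fewer clusters" idea concrete, here is a minimal
sketch of the draw itself, written in Python rather than Stata because
I am not sure how to get it out of Stata's own options. It only picks
which clusters enter one bootstrap sample; it does not estimate
anything and does not apply any rescaling of the resulting variance.
The helper name draw_cluster_ids and the toy ids are made up for the
illustration.

```python
import random

def draw_cluster_ids(cluster_ids, n_draw=None, seed=None):
    """Draw cluster ids with replacement.

    n_draw defaults to the number of distinct clusters; pass a smaller
    number (e.g. number of clusters minus 3) to resample fewer clusters.
    """
    unique_ids = sorted(set(cluster_ids))
    if n_draw is None:
        n_draw = len(unique_ids)
    if n_draw < 1:
        raise ValueError("need at least one cluster per bootstrap sample")
    rng = random.Random(seed)
    return [rng.choice(unique_ids) for _ in range(n_draw)]

# Hypothetical panel identifiers: 10 clusters, 3 observations each.
ids = [g for g in range(1, 11) for _ in range(3)]
n_clusters = len(set(ids))

# One bootstrap draw of (#clusters - 3) clusters, as in the suggestion above.
drawn = draw_cluster_ids(ids, n_draw=n_clusters - 3, seed=2006)
print(len(drawn), "clusters drawn out of", n_clusters)
```

The point of the sketch is only that the sample size is counted in
clusters, not in observations, which is what I would want -size()- to
mean when -cluster- is specified.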
And finally to make your results reproducible, you would want to
specify -set seed- right before your -xtpoisson- command.
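
The reason the seed matters is that the bootstrap replications are
pseudo-random draws, so fixing the starting state of the generator
fixes every replication. A small Python sketch of that idea (not Stata
code; bootstrap_draws is a made-up helper and the seed value is
arbitrary):

```python
import random

def bootstrap_draws(n_units, reps, seed):
    """Return `reps` lists of resampled unit indices, drawn with replacement."""
    rng = random.Random(seed)  # the seed fixes the whole sequence of draws
    return [[rng.randrange(n_units) for _ in range(n_units)]
            for _ in range(reps)]

first = bootstrap_draws(n_units=5, reps=3, seed=12345)
second = bootstrap_draws(n_units=5, reps=3, seed=12345)

# Same seed, same replications, hence the same bootstrap standard errors.
print(first == second)
```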
I'd be hugely surprised if the bootstrap standard errors were way off
the analytical standard errors; and if they were, I would still trust
the analytical vce better than the bootstrap vce, as it is more
difficult to get the proper bootstrap vce in a clustered situation.
Remember that the bootstrap sampling should exactly reproduce the
sampling process and the dependencies in your data; if you fail to do
so, the bootstrap will be severely biased. So the panel bootstrap
should resample the panels as a whole, and probably mimic the
attrition within the panels if you were really fair :)).
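
As an illustration of resampling the panels as a whole, here is a
minimal Python sketch (again not Stata, and not a description of what
Stata does internally). Each drawn panel keeps all of its own
observations, so whatever attrition it had travels with it, and each
draw gets a fresh id so that a panel drawn twice is treated as two
separate panels. The function name, the (id, time, value) layout, and
the toy data are assumptions made for the example.

```python
import random
from collections import defaultdict

def resample_panels(rows, seed=None):
    """Resample whole panels with replacement.

    rows: list of (panel_id, time, value) tuples.
    Returns a list of (new_id, time, value) tuples with as many
    panels as the original data.
    """
    panels = defaultdict(list)
    for panel_id, time, value in rows:
        panels[panel_id].append((time, value))
    ids = sorted(panels)
    rng = random.Random(seed)
    sample = []
    for new_id in range(1, len(ids) + 1):
        chosen = rng.choice(ids)
        # Copy every observation of the chosen panel under a fresh id.
        for time, value in panels[chosen]:
            sample.append((new_id, time, value))
    return sample

# Hypothetical unbalanced panel: panel 3 drops out after the first period.
data = [(1, 1, 4), (1, 2, 6), (2, 1, 0), (2, 2, 1), (3, 1, 2)]
boot = resample_panels(data, seed=1)
for row in boot:
    print(row)
```

Resampling individual observations instead would break up the panels
and lose exactly the within-panel dependence that the standard errors
are supposed to reflect.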