Statalist



Re: st: STATA: multivariate probit w/ bootstrap


From   "Stas Kolenikov" <skolenik@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: STATA: multivariate probit w/ bootstrap
Date   Thu, 1 May 2008 09:28:12 -0500

On 5/1/08, Stas Kolenikov <skolenik@gmail.com> wrote:
> Fixing the seed looks like a very reasonable thing for those
>  procedures, and it explains the same bootstrap results, too.

Ha-ha, how about RTFMing the help file first? There's the -seed-
option of -mvprobit- that explicitly allows one to fix the seed for
the simulation. So I wonder if the following will do:

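cap pro drop mymvprobit
pro def mymvprobit, eclass
   * draw a fresh integer seed from the running random number generator
   local seed = ceil(1e6*uniform())
   mvprobit <your existing syntax> , seed(`seed')
   * mvprobit's e() results stay in memory for -bootstrap- to collect
end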

bootstrap, ... : mymvprobit

This will keep the random # generator going, but it will allow each
bootstrap replication to have its own fixed seed, distinct from the
default 123456789.
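
To see the mechanism in isolation, here is a minimal sketch that does not
call -mvprobit- at all. It assumes a recent Stata in which runiform() is
the current name for uniform(); the master seed 20080501 and the loop over
three "replications" are made up for illustration. Each pass draws a
different integer seed from the running generator, and rerunning the whole
block reproduces the same sequence because the master seed is set first.

```stata
* Illustration only: hypothetical master seed, three pretend replications
set seed 20080501
forvalues i = 1/3 {
    * same draw as in the wrapper, rounded up to an integer
    local seed = ceil(1e6*runiform())
    display "replication `i': seed = `seed'"
}
```

In the wrapper, the same draw happens once per bootstrap replication, so
setting the master seed (or using the seed() option of -bootstrap-) before
the run should make the whole exercise reproducible.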

Daniel, what was your original reason for wanting the bootstrap
for -mvprobit- to begin with? Bootstrap-based procedures are only
rarely worth the trouble when the
linearization/sandwich standard errors are available (and they are in
-mvprobit-); the two types of standard errors are asymptotically
equivalent, both are approximations to the true unknown variances of
the parameter estimates, and there is little to no telling which one
is better in finite samples (the bootstrap usually is, but here it is
confounded with having to reset the seed for each simulation, so there
is not only variability due to random resampling, but also due to
different seeds for the multinormal probability evaluation). You are
mounting a computationally intensive bootstrap on top of iterative
maximization; for each iteration, you would probably need a dozen or
so evaluations of the likelihood, for the three equations you have; and
within each likelihood evaluation, there's a simulation for the
multinormal probabilities. You would need hours before you get
anything reasonable; you would want to find the optimal number of
-draws- (I think the default 5 is ridiculously low, but I have little
to no experience with simulated ML)... and at any rate your -bootstrap,
rep(5)- would not give you much. If anything, I would suggest
drastically reducing the number of observations to be bootstrapped
(say 10% of the data), to speed up each individual -mvprobit-; I would
expect Stata to take care of the necessary square root corrections for
the standard errors.
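
The subsampling idea can be tried on a toy problem before committing hours
to -mvprobit-. The sketch below is an illustration under stated
assumptions, not Daniel's model: it uses the auto dataset shipped with
Stata, -regress- as a fast stand-in estimator, and a made-up subsample
size of half the data. For a root-n consistent estimator, a standard error
obtained from bootstrap samples of size m would be multiplied by
sqrt(m/n) to put it on the scale of the full sample of size n. Whether
-bootstrap, size()- applies that factor by itself is not asserted here;
printing the three numbers side by side is one way to check.

```stata
* Toy data and estimator standing in for the real -mvprobit- model
sysuse auto, clear
local n = _N
* hypothetical subsample size: half of the data
local m = 37

* bootstrap with resamples of the full size n
bootstrap _b, reps(200) seed(12345) nodots: regress mpg weight
local se_full = _se[weight]

* bootstrap with resamples of the smaller size m
bootstrap _b, reps(200) seed(12345) size(`m') nodots: regress mpg weight
local se_sub = _se[weight]

display "full-size bootstrap SE:    " `se_full'
display "subsample SE as reported:  " `se_sub'
display "subsample SE * sqrt(m/n):  " `se_sub'*sqrt(`m'/`n')
```

If the reported subsample standard error is already close to the full-size
one, the correction is being applied internally; if only the rescaled
figure is close, it has to be applied by hand.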

And if you also have some sort of cluster structure (if say you
sampled your settlements, and then sampled individuals within those),
then forget about the bootstrap, it is almost impossible to do it
right in this situation, and be content with -mvprobit, cluster()-
standard errors. You really won't be able to improve upon those.

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: Please do not reply to my Gmail address as I don't check
it regularly.


