Statalist



Re: st: STATA: multivariate probit w/ bootstrap


From   "Daniel R. Petrolia" <Petrolia@agecon.msstate.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   Re: st: STATA: multivariate probit w/ bootstrap
Date   Thu, 01 May 2008 09:41:56 -0500

Stas,
  Thanks a lot for your detailed suggestions.  We are interested in willingness-to-pay (WTP) estimates, the mean of which requires both the estimated coefficients and the variable means to calculate.  What we would like to get is a confidence interval on our mean WTP.  The research we have read suggests bootstrapping to derive a sample of anywhere from 50 to 1000 mean WTP estimates, then dropping 2.5% of the observations in each tail for a 95% C.I.  To do this, we need to estimate our model over a number of bootstrapped samples.  This is easy enough with probit or even biprobit, but in our case we have 3 probit equations whose errors are correlated and thus, ideally, should be estimated together.  Hence the need for mvprobit or triprobit.
  Our latest attempt is to generate the required number of samples using SAS, then estimate the regression on each sample manually.
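
  A minimal sketch of the percentile interval described above, assuming the bootstrapped mean WTP estimates have already been collected, one per observation, in a dataset.  The file name results_wtp.dta and the variable name wtp are hypothetical and do not appear in the code in this thread.

```stata
* Hypothetical dataset: one bootstrapped mean WTP estimate per observation
use results_wtp.dta, clear

* 2.5th and 97.5th percentiles of the bootstrap distribution
_pctile wtp, percentiles(2.5 97.5)
display "95% percentile C.I. for mean WTP: [" r(r1) ", " r(r2) "]"
```

  Cutting 2.5% from each tail leaves the middle 95% of the estimates, which is the interval reported by the last line.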

FYI, this is the more complete code we were using to collect the data of interest:
set seed 123456
bootstrap _b, reps(10) saving(results_b1.dta): mvprobit (option1_1 = option1 q34  high_low hurricane env rec hur_res env_res) (option2_1 = option2 q34  high_low hurricane env rec hur_res env_res) (option3_1 = option3 q34  high_low hurricane env rec hur_res env_res)
set seed 123456
bootstrap _b, reps(10) saving(results_mean1.dta): mean option1_1 option1 q34  high_low hurricane env rec hur_res env_res option2_1 option2 option3_1 option3

Regarding your concern on the number of draws and reps, we simply put in "low" values to speed up the process just to see if it worked.  If and when it worked, we would then adjust these according to our needs.

Thanks again for your help,
Dan Petrolia



>>> "Stas Kolenikov" <skolenik@gmail.com> 5/1/2008 9:28 AM >>>
On 5/1/08, Stas Kolenikov <skolenik@gmail.com> wrote:
> Fixing the seed looks like a very reasonable thing for those
>  procedures, and it explains the same bootstrap results, too.

Ha-ha, how about RTFMing the help file first? There's the -seed-
option of -mvprobit- that explicitly allows one to fix the seed for
the simulation. So I wonder if the following will do:

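cap pro drop mymvprobit
pro def mymvprobit, eclass
   // seed() takes an integer, so round the random draw down
   local seed = floor(1e6*uniform())
   mvprobit <your existing syntax> , seed(`seed')
   // mvprobit leaves its results in e(), where -bootstrap- picks them up
end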

bootstrap, ... : mymvprobit

This will keep the random # generator going, but it will give each
bootstrap replication its own fixed seed, distinct from the default
123456789.
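
Filled in with the three equations from Daniel's post, the wrapper might
look like the sketch below. It is untested here; rounding the seed to an
integer and the replace suboption of saving() are assumptions of this
sketch, and the /// line continuations mean it has to be run from a
do-file.

```stata
capture program drop mymvprobit
program define mymvprobit, eclass
    // draw a fresh integer seed for the simulator from the main random-number stream
    local seed = floor(1e6*uniform())
    mvprobit (option1_1 = option1 q34 high_low hurricane env rec hur_res env_res) ///
             (option2_1 = option2 q34 high_low hurricane env rec hur_res env_res) ///
             (option3_1 = option3 q34 high_low hurricane env rec hur_res env_res), ///
             seed(`seed')
    // mvprobit's estimates stay in e(), which is where -bootstrap- reads _b from
end

set seed 123456
bootstrap _b, reps(10) saving(results_b1.dta, replace): mymvprobit
```

The -set seed- before -bootstrap- keeps the whole run reproducible, while
each replication still gets a different seed for the simulator.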

Daniel, what was your original reason that you wanted the bootstrap
for -mvprobit-, to begin with? On very rare occasions will the
bootstrap-based procedures be worthwhile dealing with if the
linearization/sandwich standard errors are available (and they are in
-mvprobit-); the two types of standard errors are asymptotically
equivalent, both are approximations to the true unknown variances of
the parameter estimates, and there is little to no telling which one
is better in finite samples (the bootstrap usually is, but here it is
confounded with having to reset the seed for each simulation, so there
is not only variability due to random resampling, but also due to
different seeds for the multinormal probability evaluation). You are
mounting a computationally intensive bootstrap on top of iterative
maximization; for each iteration, you would probably need a dozen or
so evaluations of the likelihood, for the three equations you have; and
within each likelihood evaluation, there's a simulation for the
multinormal probabilities. You would need hours before you get
anything reasonable; you would want to find the optimal number of
-draws- (I think the default 5 is ridiculously low, but I have little
to no experience with simulated ML)... and at any rate your -bootstrap,
rep(5)- would not give you much. If anything, I would suggest
drastically reducing the number of observations to be bootstrapped
(say 10% of the data), to speed up each individual -mvprobit-; I would
expect Stata to take care of the necessary square root corrections for
the standard errors.
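
A rough sketch of the subsampling idea, assuming the mymvprobit wrapper
above has been defined. The size() option of -bootstrap- draws samples
of m observations. The last line applies the square root correction
sqrt(m/n) by hand to one coefficient, on the assumption that -bootstrap-
reports the standard error at the subsample size without rescaling it
(worth checking before relying on it). The 10% fraction, reps(200), the
file name, and the assumption that the equations are named after their
dependent variables are all illustrative.

```stata
quietly count
local n = r(N)
local m = ceil(0.10*`n')          // resample about 10% of the data

set seed 123456
bootstrap _b, reps(200) size(`m') saving(results_sub.dta, replace): mymvprobit

* a standard error from samples of size m is roughly sqrt(n/m) times too
* large for the full sample, so shrink it by sqrt(m/n)
display "rescaled s.e.: " [option1_1]_se[option1]*sqrt(`m'/`n')
```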

And if you also have some sort of cluster structure (if say you
sampled your settlements, and then sampled individuals within those),
then forget about the bootstrap, it is almost impossible to do it
right in this situation, and be content with -mvprobit, cluster()-
standard errors. You really won't be able to improve upon those.
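
As a sketch of that alternative, with psu standing in as a hypothetical
variable that identifies the sampled settlements (it is not in Daniel's
data as posted), run from a do-file:

```stata
* cluster-robust standard errors instead of the bootstrap;
* psu is a hypothetical cluster identifier
mvprobit (option1_1 = option1 q34 high_low hurricane env rec hur_res env_res) ///
         (option2_1 = option2 q34 high_low hurricane env rec hur_res env_res) ///
         (option3_1 = option3 q34 high_low hurricane env rec hur_res env_res), ///
         cluster(psu)
```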

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name 
Small print: Please do not reply to my Gmail address as I don't check
it regularly.
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html 
*   http://www.stata.com/support/statalist/faq 
*   http://www.ats.ucla.edu/stat/stata/



