# Re: st: STATA: multivariate probit w/ bootstrap

From: "Arne Risa Hole"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: STATA: multivariate probit w/ bootstrap
Date: Thu, 1 May 2008 17:16:59 +0100

```
Dan,

If your WTP estimates are given by the ratio of two coefficients you
could also consider the -wtp- module which is downloadable from ssc
(type -ssc install wtp- to install). This will give you parametric
bootstrap (also called Krinsky Robb) CIs which are much faster to
calculate than non-parametric bootstrap CIs, especially when you are
using maximum simulated likelihood to fit your model.

Arne

On 01/05/2008, Daniel R. Petrolia <Petrolia@agecon.msstate.edu> wrote:
> Stas,
>  Thanks a lot for your detailed suggestions.  We are interested in willingness-to-pay (WTP) estimates, the mean of which requires both the estimated coefficients and the variable means to calculate.  What we would like to get is a confidence interval on our mean WTP.  The literature suggests bootstrapping to obtain a sample of anywhere from 50 to 1000 mean WTP estimates, then dropping 2.5% of the observations in each tail for a 95% C.I.  To do this, we need to estimate our model over a number of bootstrapped samples.  This is easy enough with probit or even biprobit, but in our case we have 3 probit equations whose errors are correlated and thus, ideally, should be estimated together.  Hence the need for mvprobit or triprobit.
>  Our latest attempt is to generate the required number of samples using SAS, then estimate the regression on each sample manually.
>
> FYI,  this is the more complete code we were using to collect the data of interest
> set seed 123456
> bootstrap _b, reps(10) saving(results_b1.dta): mvprobit (option1_1 = option1 q34  high_low hurricane env rec hur_res env_res) (option2_1 = option2 q34  high_low hurricane env rec hur_res env_res) (option3_1 = option3 q34  high_low hurricane env rec hur_res env_res)
> set seed 123456
> bootstrap _b, reps(10) saving(results_mean1.dta): mean option1_1 option1 q34  high_low hurricane env rec hur_res env_res option2_1 option2 option3_1 option3
>
> Regarding your concern on the number of draws and reps, we simply put in "low" values to speed up the process just to see if it worked.  If and when it worked, we would then adjust these according to our needs.
>
> Thanks again for your help,
> Dan Petrolia
>
>
>
> >>> "Stas Kolenikov" <skolenik@gmail.com> 5/1/2008 9:28 AM >>>
> On 5/1/08, Stas Kolenikov <skolenik@gmail.com> wrote:
> > Fixing the seed looks like a very reasonable thing for those
> >  procedures, and it explains the same bootstrap results, too.
>
> Ha-ha, how about RTFMing the help file first? There's the -seed-
> option of -mvprobit- that explicitly allows one to fix the seed for
> the simulation. So I wonder if the following will do:
>
> cap pro drop mymvprobit
> pro def mymvprobit, eclass
>   * draw a fresh integer seed for this bootstrap replication
>   local seed = 1 + floor(1e6*uniform())
>   mvprobit <your existing syntax> , seed(`seed')
> end
>
> bootstrap, ... : mymvprobit
>
> This will keep the random # generator going, but it will allow each
> bootstrap replication to have fixed seed distinct from the default
> 123456789.
>
> Daniel, what was your original reason that you wanted the bootstrap
> for -mvprobit-, to begin with? On very rare occasions will the
> bootstrap-based procedures be worthwhile dealing with if the
> linearization/sandwich standard errors are available (and they are in
> -mvprobit-); the two types of standard errors are asymptotically
> equivalent, both are approximations to the true unknown variances of
> the parameter estimates, and there is little to no telling which one
> is better in finite samples (the bootstrap usually is, but here it is
> confounded with having to reset the seed for each simulation, so there
> is not only variability due to random resampling, but also due to
> different seeds for the multinormal probability evaluation). You are
> mounting a computationally intensive bootstrap on top of iterative
> maximization; for each iteration, you would probably need a dozen or
> so evaluations of the likelihood, for the three equations you have; and
> within each likelihood evaluation, there's a simulation for the
> multinormal probabilities. You would need hours before you get
> anything reasonable; you would want to find the optimal number of
> -draws- (I think the default 5 is ridiculously low, but I have little
> to no experience with simulated ML)... and at any rate your -bootstrap,
> rep(5)- would not give you much. If anything, I would suggest
> drastically reducing the number of observations to be bootstrapped
> (say 10% of the data), to speed up each individual -mvprobit-; I would
> expect Stata to take care of the necessary square root corrections for
> the standard errors.
>
> And if you also have some sort of cluster structure (if say you
> sampled your settlements, and then sampled individuals within those),
> then forget about the bootstrap, it is almost impossible to do it
> right in this situation, and be content with -mvprobit, cluster()-
> standard errors. You really won't be able to improve upon those.
>
> --
> Stas Kolenikov, also found at http://stas.kolenikov.name
> Small print: Please do not reply to my Gmail address as I don't check
> it regularly.
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
```
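The Krinsky Robb (parametric bootstrap) idea from the reply at the top of the thread can be sketched by hand as follows. This is only a rough illustration under stated assumptions, not the code behind -wtp- and not the model from the thread: it uses the shipped auto.dta, a single -probit- rather than -mvprobit-, and treats `weight` as a stand-in for the cost variable and `mpg` as a stand-in for the attribute of interest. The seed and the 1000 draws are arbitrary choices. The point is that coefficient vectors are drawn from the estimated sampling distribution, so the model is fitted only once, and that the 95% interval comes from cutting 2.5% of the simulated ratios from each tail.

```stata
* Fit the model once (illustrative model on a shipped dataset)
sysuse auto, clear
probit foreign weight mpg

* Point estimate of the ratio of two coefficients
scalar wtp_hat = -_b[mpg]/_b[weight]

* Keep the coefficient vector and its estimated covariance matrix
matrix b = e(b)
matrix V = e(V)
local k = colsof(b)

* Draw coefficient vectors from N(b, V) instead of refitting the model
preserve
drop _all
set seed 20080501
drawnorm d1-d`k', n(1000) means(b) cov(V)

* Columns follow the order of e(b): d1 = weight, d2 = mpg, d3 = _cons
generate double wtp = -d2/d1

* 95% percentile interval: cut 2.5% of the draws from each tail
_pctile wtp, percentiles(2.5 97.5)
display "ratio = " wtp_hat "   95% CI: [" r(r1) ", " r(r2) "]"
restore
```

With a multi-equation model the same steps would apply to the full e(b) and e(V), provided the columns of the draws are matched to the right coefficients; if the mean WTP also depends on variable means, the `generate` line would need to be adapted accordingly.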