The following question and answer is based on an exchange that started on Statalist.

How do I bootstrap a vector of results?

Title   Bootstrapping vectors
Author Jeffrey Pitblado, StataCorp
Date February 2003; updated July 2011; minor revisions June 2013

Question:

I have a program that calculates many statistics for each income quintile in my sample. To make this manageable, I store the estimates as variables—each statistic that I've calculated is a variable, and there are 5 observations, one for each quintile. I want to bootstrap the results, but it appears that bootstrap works only for scalars. I could break up the variables into scalars so that the call to bootstrap would be

    . bootstrap (stat1[1]) (stat1[2]) (stat1[3]) (stat1[4]) ///
      (stat1[5]) (stat2[1]) (stat2[2]): command ...

but that would be incredibly tedious because there are a lot of statistics. Is there any way to simplify this by posting variables (or vectors) of results?

Answer:

The bootstrap command understands _b to mean all elements of the e(b) vector (the coefficient vector posted by estimation commands). For example, you can easily bootstrap all the coefficients from a regression:

    . bootstrap  _b: regress mpg weight length ...

To take advantage of this syntax, you will have to modify your program so that it is an e-class command that posts the values of interest into e(b) instead of placing them in variables. Then, you can do something like

    . bootstrap _b, reps(100): command

Here is an example that posts the vector (1,2,3,4) to e(b):

 capture program drop myepost
 program myepost, eclass
     version 13.0
     tempname bb
     matrix `bb' = 1,2,3,4
     ereturn post `bb'
 end

 myepost
 matrix list e(b)

Here is a log of the result:

 . myepost
        
 . matrix list e(b)
        
 e(b)[1,4]
     c1  c2  c3  c4
 y1   1   2   3   4
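
The default column names c1 through c4 are not very informative once there are many statistics. As a minimal sketch, the vector can be labeled with matrix colnames before it is posted. The program name mynamed and the labels q1 to q4 are hypothetical and chosen only for illustration:

```stata
capture program drop mynamed
program mynamed, eclass
    version 13.0
    tempname bb
    matrix `bb' = 1,2,3,4
    * hypothetical labels; any valid Stata names would do
    matrix colnames `bb' = q1 q2 q3 q4
    ereturn post `bb'
end

mynamed
matrix list e(b)
```

The names given to the columns of e(b) are the names under which the elements are stored, so they should identify each statistic when the vector is later passed to bootstrap.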

Now you can use the above idea to pass a vector of results to bootstrap. To see that the method is working, you can pass the coefficients of a regression:

 capture program drop myreg
 program myreg, eclass
     version 13.0
     tempname bb
     quietly regress mpg turn
     matrix `bb'=e(b)
     ereturn post `bb'
     ereturn local cmd="bootstrap"
 end

 clear
 sysuse auto
 set seed 12345
 bootstrap _b, reps(50) nowarn: myreg 
 set seed 12345
 bootstrap _b, reps(50): regress mpg turn

Here is a log of the result:

. bootstrap _b, reps(50) nowarn: myreg
(running myreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Bootstrap results                               Number of obs      =        74
                                                Replications       =        50

      command:  myreg
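
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        turn |  -.9457877   .0973972    -9.71   0.000    -1.136683   -.7548928
       _cons |    58.7965   4.100355    14.34   0.000     50.75995    66.83305
------------------------------------------------------------------------------

. set seed 12345

. bootstrap _b, reps(50): regress mpg turn
(running regress on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Linear regression                               Number of obs      =        74
                                                Replications       =        50
                                                Wald chi2(1)       =     94.30
                                                Prob > chi2        =    0.0000
                                                R-squared          =    0.5172
                                                Adj R-squared      =    0.5105
                                                Root MSE           =    4.0477

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
         mpg |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        turn |  -.9457877   .0973972    -9.71   0.000    -1.136683   -.7548928
       _cons |    58.7965   4.100355    14.34   0.000     50.75995    66.83305
------------------------------------------------------------------------------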

There is one difference between the first program, myepost, and myreg: myreg also saves e(cmd)="bootstrap". This setting tells bootstrap how to display the results. When you bootstrap an official Stata estimation command, bootstrap uses that command's replay feature to display the coefficient table; the table shows the bootstrapped standard errors because bootstrap posts the bootstrapped covariance matrix in e(V). Because myreg doesn't have a replay feature, bootstrap must display the results itself, and setting e(cmd)="bootstrap" makes it do so.
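
To connect this back to the original question, here is a rough sketch of a program that computes one statistic per quintile and posts all five values to e(b). It is only an illustration under stated assumptions: it uses the auto dataset, treats weight as a stand-in for income, and takes the mean of mpg as the statistic. The program name quintmeans and the labels q1 to q5 are hypothetical.

```stata
capture program drop quintmeans
program quintmeans, eclass
    version 13.0
    tempvar q
    tempname bb
    * assumption: weight stands in for income; quintiles are
    * recomputed within each bootstrap sample
    xtile `q' = weight, nquantiles(5)
    matrix `bb' = J(1, 5, .)
    forvalues i = 1/5 {
        * illustrative statistic: mean of mpg within quintile `i'
        quietly summarize mpg if `q' == `i', meanonly
        matrix `bb'[1, `i'] = r(mean)
    }
    * hypothetical labels for the five elements of e(b)
    matrix colnames `bb' = q1 q2 q3 q4 q5
    ereturn post `bb'
    * let bootstrap display the results, as in myreg
    ereturn local cmd "bootstrap"
end

sysuse auto, clear
set seed 12345
bootstrap _b, reps(50) nowarn: quintmeans
estat bootstrap, percentile
```

Whether quintiles should be recomputed in every replication, as this sketch does, or fixed from the original sample is a design choice that depends on the analysis. Additional statistics could be appended as further columns of the same row vector before it is posted.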
