The following question and answer is based on an exchange that started on
Statalist.
How do I bootstrap a vector of results?
| Title | Bootstrapping vectors |
| Author | Jeffrey Pitblado, StataCorp |
| Date | February 2003; updated July 2011 |
Question:
I have a program that calculates many statistics for each income quintile in
my sample. To make this manageable, I store the estimates as
variables—each statistic that I've calculated is a variable, and there
are 5 observations, one for each quintile. I want to bootstrap the results,
but it appears that bootstrap works only for scalars. I could break
up the variables into scalars so that the call to bootstrap would be
. bootstrap (stat1[1]) (stat1[2]) (stat1[3]) (stat1[4]) ///
(stat1[5]) (stat2[1]) (stat2[2]): command ...
but that would be incredibly tedious because there are a lot of statistics.
Is there any way to simplify this by posting variables (or vectors) of
results?
Answer:
The bootstrap
command understands _b to mean all elements in the e(b) vector
(coefficients vector posted by estimation commands). For example, you can
now easily bootstrap all the coefficients from a regression:
. bootstrap _b: regress mpg weight length ...
To take advantage of this syntax, you will have to modify your program so
that it is an e-class command that posts the values of interest into
e(b) instead of placing them in variables. Then, you can do
something like
. bootstrap _b, reps(100): command
Here is an example that posts the vector (1,2,3,4) to e(b):
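capture program drop myepost
program myepost, eclass
    version 12.0
    // build the row vector of results under a temporary name
    tempname bb
    matrix `bb' = 1,2,3,4
    // post the vector so that it becomes e(b)
    ereturn post `bb'
end
myepost
matrix list e(b)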
Here is a log of the result:
. myepost
. matrix list e(b)
e(b)[1,4]
    c1  c2  c3  c4
y1   1   2   3   4
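The default column names c1 through c4 shown above become the row labels in the bootstrap table, so it may help to name the columns before posting. The following is a minimal sketch, not part of the original exchange: the program name myepost2 and the labels q1 through q4 are illustrative assumptions, and the only addition to myepost is a matrix colnames line.

```stata
capture program drop myepost2
program myepost2, eclass
    version 12.0
    tempname bb
    matrix `bb' = 1,2,3,4
    // hypothetical labels; use names that describe your own statistics
    matrix colnames `bb' = q1 q2 q3 q4
    ereturn post `bb'
end
myepost2
matrix list e(b)
```

With this change, matrix list e(b) should show q1 through q4 in place of c1 through c4.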
Now you can use the above idea to pass a vector of results to
bootstrap. To see that the method is working, you can pass the
coefficients of a regression:
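capture program drop myreg
program myreg, eclass
    version 12.0
    tempname bb
    quietly regress mpg turn
    // copy the coefficient vector and repost it as e(b)
    matrix `bb'=e(b)
    ereturn post `bb'
    // have bootstrap display the results (explained below)
    ereturn local cmd="bootstrap"
end
clear
sysuse auto
// same seed for both calls so the replications match
set seed 12345
bootstrap _b, reps(50) nowarn: myreg
set seed 12345
bootstrap _b, reps(50): regress mpg turn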
Here is a log of the result:
. bootstrap _b, reps(50) nowarn: myreg
(running myreg on estimation sample)
Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
Bootstrap results                               Number of obs      =        74
                                                Replications       =        50
      command:  myreg
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        turn |  -.9457877   .0973972    -9.71   0.000    -1.136683   -.7548928
       _cons |    58.7965   4.100355    14.34   0.000     50.75995    66.83305
------------------------------------------------------------------------------
. set seed 12345
. bootstrap _b, reps(50): regress mpg turn
(running regress on estimation sample)
Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
Linear regression                               Number of obs      =        74
                                                Replications       =        50
                                                Wald chi2(1)       =     94.30
                                                Prob > chi2        =    0.0000
                                                R-squared          =    0.5172
                                                Adj R-squared      =    0.5105
                                                Root MSE           =    4.0477
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
         mpg |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        turn |  -.9457877   .0973972    -9.71   0.000    -1.136683   -.7548928
       _cons |    58.7965   4.100355    14.34   0.000     50.75995    66.83305
------------------------------------------------------------------------------
There is one difference between the first program, myepost, and
myreg: myreg also saves e(cmd)="bootstrap". This
is necessary so that bootstrap knows how to display the
results. When you bootstrap an official Stata estimation command,
bootstrap uses the estimation command's replay feature to display the
coefficient table. The table shows the bootstrapped standard errors because
bootstrap posts the bootstrapped covariance matrix in e(V).
Because myreg does not have a replay feature, you need bootstrap
itself to display the results, which you request by setting
e(cmd)="bootstrap".
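To connect this back to the original question, here is a hedged sketch of how one statistic per quintile might be posted to e(b) and bootstrapped in a single call. Everything specific in it is an assumption made for illustration: the program name myquint, the use of the auto dataset, price standing in for income, and the mean of mpg standing in for the statistic of interest. Your own program would compute its statistics in the loop and could post more than five columns.

```stata
capture program drop myquint
program myquint, eclass
    version 12.0
    tempname bb
    tempvar q
    // assign each observation to a quintile of price (stand-in for income)
    xtile `q' = price, nquantiles(5)
    // one column per quintile, filled in by the loop below
    matrix `bb' = J(1, 5, .)
    forvalues i = 1/5 {
        quietly summarize mpg if `q' == `i', meanonly
        matrix `bb'[1, `i'] = r(mean)
    }
    // hypothetical labels for the five results
    matrix colnames `bb' = q1 q2 q3 q4 q5
    ereturn post `bb'
    // no replay feature here, so let bootstrap display the results
    ereturn local cmd "bootstrap"
end

sysuse auto, clear
set seed 12345
bootstrap _b, reps(50) nowarn: myquint
```

Because the quintiles are recomputed inside the program, each bootstrap replication forms its own quintile groups from the resampled data. Whether that is appropriate depends on your application.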