help bootstrap dialog: bootstrap
also see: bootstrap postestimation
-------------------------------------------------------------------------------
Title
[R] bootstrap -- Bootstrap sampling and estimation
Syntax
bootstrap exp_list [, options eform_option] : command
options                   description
-------------------------------------------------------------------------
Main
  reps(#)                 perform # bootstrap replications; default is
                            reps(50)
Options
  strata(varlist)         variables identifying strata
  size(#)                 draw samples of size #; default is _N
  cluster(varlist)        variables identifying resampling clusters
  idcluster(newvar)       create new cluster ID variable
  saving(filename, ...)   save results to filename; save statistics in
                            double precision; save results to filename
                            every # replications
  bca                     compute acceleration for BCa confidence
                            intervals
  mse                     use MSE formula for variance estimate
Reporting
  level(#)                set confidence level; default is level(95)
  notable                 suppress table of results
  noheader                suppress table header
  nolegend                suppress table legend
  verbose                 display the full table legend
  nodots                  suppress the replication dots
  noisily                 display any output from command
  trace                   trace the command
  title(text)             use text as title for bootstrap results
  display_options         control spacing and display of omitted
                            variables and base and empty cells
Advanced
  nodrop                  do not drop observations
  nowarn                  do not warn when e(sample) is not set
  force                   do not check for weights or svy commands;
                            seldom used
  reject(exp)             identify invalid results
  seed(#)                 set random-number seed to #
+ group(varname)          ID variable for groups within cluster()
+ jackknifeopts(jkopts)   options for jackknife
+ coeflegend              display coefficients' legend instead of
                            coefficient table
-------------------------------------------------------------------------
+ group(), jackknifeopts(), and coeflegend do not appear in the dialog
box.
weights are not allowed in command.
See [R] bootstrap postestimation for features available after estimation.
Menu
Statistics > Resampling > Bootstrap estimation
Description
bootstrap performs bootstrap estimation. Typing
. bootstrap exp_list, reps(#): command
executes command multiple times, bootstrapping the statistics in exp_list
by resampling observations (with replacement) from the data in memory #
times. This method is commonly referred to as the nonparametric
bootstrap.
command defines the statistical command to be executed. Most Stata
commands and user-written programs can be used with bootstrap, as long as
they follow standard Stata syntax. If the bca option is supplied,
command must also work with jackknife; see [R] jackknife. The by prefix
may not be part of command.
exp_list specifies the statistics to be collected from the execution of
command. If command changes the contents of e(b), exp_list is optional
and defaults to _b.
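As a minimal sketch of how exp_list collects results, the following assumes the auto dataset shipped with Stata and bootstraps two results that summarize leaves in r(). The labels mean and sd are illustrative names chosen here, not part of the original text.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* Collect two r() results from summarize in each replication;
* "mean" and "sd" are labels chosen for this illustration
bootstrap mean=r(mean) sd=r(sd), reps(100) seed(12345) nodots: summarize mpg
```

Because summarize is not an estimation command, it does not set e(b), so exp_list must be given explicitly in this case.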
Because bootstrapping is a random process, if you want to be able to
reproduce results, set the random-number seed by specifying the seed(#)
option or by typing
. set seed #
where # is a seed of your choosing, before running bootstrap; see [R] set
seed.
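The effect of fixing the seed can be checked with a short sketch such as the one below, which assumes the auto dataset and an arbitrary seed value. It runs the same bootstrap twice and compares the stored standard errors; the matrix names se1 and se2 are hypothetical.

```stata
sysuse auto, clear

* First run with a fixed seed
bootstrap mean=r(mean), reps(100) seed(2024) nodots: summarize mpg
matrix se1 = e(se)

* Second run with the same seed draws the same bootstrap samples
bootstrap mean=r(mean), reps(100) seed(2024) nodots: summarize mpg
matrix se2 = e(se)

* Relative difference between the two sets of standard errors
display mreldif(se1, se2)
```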
Many estimation commands allow the vce(bootstrap) option. For those
commands, we recommend using vce(bootstrap) over bootstrap because the
estimation command already handles clustering and other model-specific
details for you. The bootstrap prefix command is intended for use with
nonestimation commands, such as summarize, user-written commands, or
functions of coefficients.
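The two forms can be contrasted in a brief sketch, assuming the auto dataset; the variable choices, seed, and the label ratio are illustrative only.

```stata
sysuse auto, clear

* Estimation command: request bootstrap standard errors through vce()
regress mpg weight foreign, vce(bootstrap, reps(100) seed(12345) nodots)

* Nonestimation command: use the bootstrap prefix with an explicit exp_list
bootstrap ratio=(r(mean)/r(sd)), reps(100) seed(12345) nodots: summarize mpg
```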
bs and bstrap are synonyms for bootstrap.
Options
+------+
----+ Main +-------------------------------------------------------------
reps(#) specifies the number of bootstrap replications to be performed.
The default is 50. Between 50 and 200 replications are generally
adequate for estimating standard errors and therefore for
normal-approximation confidence intervals. Confidence intervals based
on the percentile or bias-corrected methods typically require 1,000 or
more replications.
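A sketch of the larger replication count suggested for percentile intervals, assuming the auto dataset and an arbitrary seed; estat bootstrap is the postestimation command described in [R] bootstrap postestimation.

```stata
sysuse auto, clear

* Use 1,000 replications because percentile intervals are wanted
bootstrap mean=r(mean), reps(1000) seed(12345) nodots: summarize mpg

* Display the percentile confidence interval instead of the default table
estat bootstrap, percentile
```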
+---------+
----+ Options +----------------------------------------------------------
strata(varlist) specifies the variables that identify strata. If this
option is specified, bootstrap samples are taken independently within
each stratum.
size(#) specifies the size of the samples to be drawn. The default is
_N, meaning to draw samples of the same size as the data. If
specified, # must be less than or equal to the number of observations
within strata().
If cluster() is specified, the default size is the number of clusters
in the original dataset. For unbalanced clusters, resulting sample
sizes will differ from replication to replication. For cluster
sampling, # must be less than or equal to the number of clusters
within strata().
cluster(varlist) specifies the variables that identify resampling
clusters. If this option is specified, the sample drawn during each
replication is a bootstrap sample of clusters.
idcluster(newvar) creates a new variable containing a unique identifier
for each resampled cluster. This option requires that cluster() also
be specified.
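The cluster options can be illustrated with a small sketch. The auto dataset has no natural cluster variable, so the example builds a hypothetical one (grp) purely for illustration; newgrp is likewise a made-up name for the identifier that idcluster() creates during resampling.

```stata
sysuse auto, clear

* Hypothetical cluster identifier: 10 artificial groups of cars
generate byte grp = mod(_n, 10)

* Resample whole clusters; newgrp gives each resampled copy a unique ID
bootstrap mean=r(mean), reps(100) seed(12345) nodots ///
    cluster(grp) idcluster(newgrp): summarize mpg
```

summarize itself does not use newgrp; the identifier matters when command needs to tell apart repeated copies of the same cluster, for example in panel-data estimators. The /// line continuation works in do-files, not at the interactive prompt.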
saving(filename[, suboptions]) creates a Stata data file (.dta file)
consisting of, for each statistic in exp_list, a variable containing
bootstrap replicates.
double specifies that the results for each replication be stored as
doubles, meaning 8-byte reals. By default, they are stored as
floats, meaning 4-byte reals. This option may be used without the
saving() option to compute the variance estimates by using double
precision.
every(#) specifies that results be written to disk every #th
replication. every() should be specified only in conjunction
with saving() when command takes a long time for each
replication. This option will allow recovery of partial results
should some other software crash your computer. See [P]
postfile.
replace specifies that filename be overwritten, if it exists. This
option is not shown in the dialog box.
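A sketch of saving() with its suboptions, assuming the auto dataset; bsmean is a hypothetical filename and mean a label chosen for this illustration.

```stata
sysuse auto, clear

* Save the replicates in double precision, writing to disk every 10
* replications, and overwrite bsmean.dta if it already exists
bootstrap mean=r(mean), reps(100) seed(12345) nodots ///
    saving(bsmean, replace double every(10)): summarize mpg

* The saved file holds one observation per replication and one
* variable per statistic in exp_list
use bsmean, clear
summarize mean
```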
bca specifies that bootstrap estimate the acceleration of each statistic
in exp_list. This estimate is used to construct BCa confidence
intervals. Type estat bootstrap, bca to display the BCa confidence
interval generated by the bootstrap command.
mse specifies that bootstrap compute the variance by using deviations of
the replicates from the observed value of the statistics based on the
entire dataset. By default, bootstrap computes the variance by using
deviations from the average of the replicates.
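The bca and mse options might be used as in the following sketch, which assumes the auto dataset and an arbitrary seed and replication count.

```stata
sysuse auto, clear

* Estimate accelerations so that BCa intervals can be reported afterward
bootstrap, bca reps(200) seed(12345) nodots: regress mpg weight foreign
estat bootstrap, bca

* Same model, with the variance computed from deviations around the
* observed statistics rather than around the mean of the replicates
bootstrap, mse reps(200) seed(12345) nodots: regress mpg weight foreign
```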
+-----------+
----+ Reporting +--------------------------------------------------------
level(#); see [R] estimation options.
notable suppresses the display of the table of results.
noheader suppresses the display of the table header. This option implies
nolegend. This option may also be specified when replaying
estimation results.
nolegend suppresses the display of the table legend. This option may
also be specified when replaying estimation results.
verbose specifies that the full table legend be displayed. By default,
the legend omits coefficients and standard errors. This option may
also be specified when replaying estimation results.
nodots suppresses display of the replication dots. By default, one dot
character is displayed for each successful replication. A red 'x' is
displayed if command returns an error or if one of the values in
exp_list is missing.
noisily specifies that any output from command be displayed. This option
implies the nodots option.
trace causes a trace of the execution of command to be displayed. This
option implies the noisily option.
title(text) specifies a title to be displayed above the table of
bootstrap results. The default title is the title saved in e(title)
by an estimation command, or if e(title) is not filled in, Bootstrap
results is used. title() may also be specified when replaying
estimation results.
display_options: noomitted, vsquish, noemptycells, baselevels,
allbaselevels; see [R] estimation options.
+----------+
----+ Advanced +---------------------------------------------------------
nodrop prevents observations outside e(sample) and the if and in
conditions from being dropped before the data are resampled.
nowarn suppresses the display of a warning message when command does not
set e(sample).
force suppresses the restriction that command not specify weights or be a
svy command. This is a rarely used option. Use it only if you know
what you are doing.
reject(exp) identifies an expression that indicates when results should
be rejected. When exp is true, the resulting values are reset to
missing values.
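One possible use of reject() is to discard replications in which an iterative estimator fails to converge. The sketch below assumes the auto dataset and relies on logit storing e(converged); treating nonconvergence as the rejection criterion is an illustrative choice, not something prescribed by the text above.

```stata
sysuse auto, clear

* Reset results to missing for any replication where logit reports
* that it did not converge (e(converged) is stored by logit)
bootstrap, reps(100) seed(12345) nodots reject(e(converged) != 1): ///
    logit foreign mpg weight

* Number of replications that produced complete results
display e(N_reps)
```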
seed(#) sets the random-number seed. Specifying this option is
equivalent to typing the following command prior to calling
bootstrap:
. set seed #
The following options are available with bootstrap but are not shown in
the dialog box:
eform_option causes the coefficient table to be displayed in
exponentiated form; see [R] eform_option. command determines which
forms of eform_option are allowed (eform(string) and eform are always
allowed).
group(varname) re-creates varname containing a unique identifier for each
group across the resampled clusters. This option requires that
idcluster() also be specified.
This option is useful for maintaining unique group identifiers when
sampling clusters with replacement. Suppose that cluster 1 contains
3 groups. If the idcluster(newclid) option is specified and cluster
1 is sampled multiple times, newclid uniquely identifies each copy of
cluster 1. If group(newgroupid) is also specified, newgroupid
uniquely identifies each copy of each group.
jackknifeopts(jkopts) identifies options that are to be passed to
jackknife when it computes the acceleration values for the BCa
confidence intervals. This option requires the bca option and is
mostly used for passing the eclass, rclass, or n(#) option to
jackknife.
coeflegend; see [R] estimation options.
Examples
Setup
. sysuse auto
Compute bootstrap estimates
. bootstrap: regress mpg weight gear foreign
Same as above command
. bootstrap _b: regress mpg weight gear foreign
Change number of replications to 100
. bootstrap, reps(100): regress mpg weight gear foreign
Compute acceleration to obtain BCa confidence intervals
. bootstrap, bca: regress mpg weight gear foreign
Save results to bsauto file
. bootstrap, saving(bsauto): regress mpg weight gear foreign
Run bootstrap on difference in coefficients of weight and gear
. bootstrap diff=(_b[weight]-_b[gear]): regress mpg weight gear foreign
Bootstrap the t statistic using 1,000 replications, stratifying on foreign,
and saving results in bsauto file
. bootstrap t=r(t), rep(1000) strata(foreign) saving(bsauto, replace): ttest mpg, by(foreign) unequal
Saved results
bootstrap saves the following in e():
Scalars
  e(N)                sample size
  e(N_reps)           number of complete replications
  e(N_misreps)        number of incomplete replications
  e(N_strata)         number of strata
  e(N_clust)          number of clusters
  e(k_eq)             number of equations
  e(k_exp)            number of standard expressions
  e(k_eexp)           number of extended expressions (i.e., _b)
  e(k_extra)          number of extra equations beyond the original
                        ones from e(b)
  e(level)            confidence level for bootstrap CIs
  e(bs_version)       version for bootstrap results
  e(rank)             rank of e(V)
Macros
  e(cmdname)          command name from command
  e(cmd)              same as e(cmdname) or bootstrap
  e(command)          command
  e(cmdline)          command as typed
  e(prefix)           bootstrap
  e(title)            title in estimation output
  e(strata)           strata variables
  e(cluster)          cluster variables
  e(seed)             initial random-number seed
  e(size)             from the size(#) option
  e(exp#)             expression for the #th statistic
  e(mse)              mse, if specified
  e(vce)              bootstrap
  e(vcetype)          title used to label Std. Err.
  e(properties)       b V
Matrices
  e(b)                observed statistics
  e(b_bs)             bootstrap estimates
  e(reps)             number of nonmissing results
  e(bias)             estimated biases
  e(se)               estimated standard errors
  e(z0)               median biases
  e(accel)            estimated accelerations
  e(ci_normal)        normal-approximation CIs
  e(ci_percentile)    percentile CIs
  e(ci_bc)            bias-corrected CIs
  e(ci_bca)           bias-corrected and accelerated CIs
  e(V)                bootstrap variance-covariance matrix
  e(V_modelbased)     model-based variance
When exp_list is _b, bootstrap will also carry forward most of the
results already in e() from command.
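The saved results can be inspected after estimation, as in the sketch below. It assumes the auto dataset; the label diff and the matrix name ci are illustrative.

```stata
sysuse auto, clear

* Bootstrap a function of two coefficients
bootstrap diff=(_b[weight]-_b[gear_ratio]), reps(200) seed(12345) nodots: ///
    regress mpg weight gear_ratio foreign

* Scalars can be displayed directly
display "complete replications: " e(N_reps)

* Matrices can be listed or copied for further use
matrix list e(b)
matrix list e(se)
matrix ci = e(ci_percentile)
matrix list ci
```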
Also see
Manual: [R] bootstrap
Help: [R] bootstrap postestimation;
[R] jackknife, [R] permute, [R] simulate