Stata 11 help for bootstrap


Title

[R] bootstrap -- Bootstrap sampling and estimation

Syntax

bootstrap exp_list [, options eform_option] : command

    options                     description
    -------------------------------------------------------------------------
    Main
      reps(#)                   perform # bootstrap replications; default is
                                  reps(50)

    Options
      strata(varlist)           variables identifying strata
      size(#)                   draw samples of size #; default is _N
      cluster(varlist)          variables identifying resampling clusters
      idcluster(newvar)         create new cluster ID variable
      saving(filename, ...)     save results to filename; save statistics in
                                  double precision; save results to filename
                                  every # replications
      bca                       compute acceleration for BCa confidence
                                  intervals
      mse                       use MSE formula for variance estimate

    Reporting
      level(#)                  set confidence level; default is level(95)
      notable                   suppress table of results
      noheader                  suppress table header
      nolegend                  suppress table legend
      verbose                   display the full table legend
      nodots                    suppress the replication dots
      noisily                   display any output from command
      trace                     trace the command
      title(text)               use text as title for bootstrap results
      display_options           control spacing and display of omitted
                                  variables and base and empty cells

    Advanced
      nodrop                    do not drop observations
      nowarn                    do not warn when e(sample) is not set
      force                     do not check for weights or svy commands;
                                  seldom used
      reject(exp)               identify invalid results
      seed(#)                   set random-number seed to #

    + group(varname)            ID variable for groups within cluster()
    + jackknifeopts(jkopts)     options for jackknife
    + coeflegend                display coefficients' legend instead of
                                  coefficient table
    -------------------------------------------------------------------------
    + group(), jackknifeopts(), and coeflegend do not appear in the dialog
      box.
    Weights are not allowed in command.
    See [R] bootstrap postestimation for features available after estimation.

Menu

Statistics > Resampling > Bootstrap estimation

Description

bootstrap performs bootstrap estimation. Typing

. bootstrap exp_list, reps(#): command

executes command multiple times, bootstrapping the statistics in exp_list by resampling observations (with replacement) from the data in memory # times. This method is commonly referred to as the nonparametric bootstrap.

command defines the statistical command to be executed. Most Stata commands and user-written programs can be used with bootstrap, as long as they follow standard Stata syntax. If the bca option is supplied, command must also work with jackknife; see [R] jackknife. The by prefix may not be part of command.

exp_list specifies the statistics to be collected from the execution of command. If command changes the contents in e(b), exp_list is optional and defaults to _b.
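As a minimal sketch of how exp_list works with a user-written program, the do-file fragment below defines a small r-class program and bootstraps the one statistic it returns. The program name myratio, the returned scalar r(ratio), and the choice of variables are hypothetical and serve only as an illustration.

```stata
* Hypothetical r-class program: ratio of the means of two variables
capture program drop myratio
program define myratio, rclass
    version 11
    args y x
    tempname ymean
    summarize `y', meanonly
    scalar `ymean' = r(mean)
    summarize `x', meanonly
    return scalar ratio = `ymean' / r(mean)
end

sysuse auto, clear
* myratio does not set e(b), so exp_list must name the statistic to collect
bootstrap ratio=r(ratio), reps(100) seed(1): myratio price weight
```

Because myratio is not an estimation command, bootstrap has no default exp_list to fall back on, which is why ratio=r(ratio) is spelled out here.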

Because bootstrapping is a random process, if you want to be able to reproduce results, set the random-number seed by specifying the seed(#) option or by typing

. set seed #

where # is a seed of your choosing, before running bootstrap; see [R] set seed.
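The following sketch illustrates the point about reproducibility: running the same bootstrap twice from the same seed should give identical standard errors. The seed value 12345 and the matrix names se1 and se2 are arbitrary choices made for this illustration.

```stata
sysuse auto, clear

set seed 12345
bootstrap mean=r(mean), reps(50) nodots: summarize mpg
matrix se1 = e(se)

* Reset the same seed and repeat the run
set seed 12345
bootstrap mean=r(mean), reps(50) nodots: summarize mpg
matrix se2 = e(se)

* Relative difference between the two sets of standard errors
display mreldif(se1, se2)
```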

Many estimation commands allow the vce(bootstrap) option. For those commands, we recommend using vce(bootstrap) over bootstrap because the estimation command already handles clustering and other model-specific details for you. The bootstrap prefix command is intended for use with nonestimation commands, such as summarize, user-written commands, or functions of coefficients.
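To make the distinction concrete, here is a brief sketch that uses vce(bootstrap) for an ordinary regression and the bootstrap prefix for a function of the coefficients. The ratio of coefficients is chosen only for illustration, as are the replication count and seed.

```stata
sysuse auto, clear

* Estimation command with bootstrap standard errors
regress mpg weight foreign, vce(bootstrap, reps(100) seed(1))

* Prefix form for a function of coefficients
bootstrap ratio=(_b[weight]/_b[foreign]), reps(100) seed(1) nodots: ///
    regress mpg weight foreign
```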

bs and bstrap are synonyms for bootstrap.

Options

    +------+
----+ Main +-------------------------------------------------------------

reps(#) specifies the number of bootstrap replications to be performed. The default is 50. A total of 50-200 replications are generally adequate for estimates of standard error and thus are adequate for normal-approximation confidence intervals. Estimates of confidence intervals using the percentile or bias-corrected methods typically require 1,000 or more replications.
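A short sketch of the larger-replication case: the run below uses 1,000 replications and then asks estat bootstrap for the percentile and bias-corrected intervals. The model and seed are arbitrary and serve only as an example.

```stata
sysuse auto, clear
bootstrap, reps(1000) seed(1) nodots: regress mpg weight foreign

* Percentile and bias-corrected confidence intervals
estat bootstrap, percentile
estat bootstrap, bc
```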

    +---------+
----+ Options +----------------------------------------------------------

strata(varlist) specifies the variables that identify strata. If this option is specified, bootstrap samples are taken independently within each stratum.

size(#) specifies the size of the samples to be drawn. The default is _N, meaning to draw samples of the same size as the data. If specified, # must be less than or equal to the number of observations within strata().

If cluster() is specified, the default size is the number of clusters in the original dataset. For unbalanced clusters, resulting sample sizes will differ from replication to replication. For cluster sampling, # must be less than or equal to the number of clusters within strata().
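As an illustration of strata() and size(), the sketch below resamples within the two levels of foreign in the auto dataset, first with the default sample size and then with size(20). The value 20 is an assumption chosen to stay below the 22 observations in the smaller stratum.

```stata
sysuse auto, clear

* Resample independently within each stratum, default size
bootstrap mean=r(mean), strata(foreign) reps(100) seed(1) nodots: ///
    summarize mpg

* Same, but with an explicit size no larger than the smallest stratum
bootstrap mean=r(mean), strata(foreign) size(20) reps(100) seed(1) nodots: ///
    summarize mpg
```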

cluster(varlist) specifies the variables that identify resampling clusters. If this option is specified, the sample drawn during each replication is a bootstrap sample of clusters.

idcluster(newvar) creates a new variable containing a unique identifier for each resampled cluster. This option requires that cluster() also be specified.
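A minimal sketch of cluster resampling follows. The auto dataset has no natural cluster variable, so the example builds an artificial one; grp and newgrp are hypothetical names, and grouping every five observations is purely illustrative.

```stata
sysuse auto, clear

* Artificial clusters of (up to) five consecutive observations
generate grp = ceil(_n/5)

* Resample whole clusters; newgrp identifies each resampled copy.
* regress does not use newgrp here; it is shown for the syntax only.
bootstrap, cluster(grp) idcluster(newgrp) reps(100) seed(1) nodots: ///
    regress mpg weight
```

idcluster() matters mainly when command itself needs a cluster or panel identifier that stays unique after clusters are drawn more than once.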

saving(filename[, suboptions]) creates a Stata data file (.dta file) consisting of, for each statistic in exp_list, a variable containing bootstrap replicates.

double specifies that the results for each replication be stored as doubles, meaning 8-byte reals. By default, they are stored as floats, meaning 4-byte reals. This option may be used without the saving() option to compute the variance estimates by using double precision.

every(#) specifies that results be written to disk every #th replication. every() should be specified only in conjunction with saving() when command takes a long time for each replication. This option will allow recovery of partial results should some other software crash your computer. See [P] postfile.

replace specifies that filename be overwritten, if it exists. This option is not shown in the dialog box.
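The suboptions above can be combined as in this sketch, which saves the replicates in double precision, writes to disk every 10 replications, and overwrites any existing file. The filename bsauto follows the Examples section; the other values are arbitrary.

```stata
sysuse auto, clear
bootstrap, reps(100) seed(1) nodots ///
    saving(bsauto, double every(10) replace): regress mpg weight

* Inspect the saved replicates: one observation per replication
use bsauto, clear
describe
summarize
```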

bca specifies that bootstrap estimate the acceleration of each statistic in exp_list. This estimate is used to construct BCa confidence intervals. Type estat bootstrap, bca to display the BCa confidence interval generated by the bootstrap command.
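For instance, a run that requests the acceleration and then displays the BCa intervals might look like the following sketch (the model, replication count, and seed are illustrative choices):

```stata
sysuse auto, clear
bootstrap, bca reps(1000) seed(1) nodots: regress mpg weight foreign

* Display the BCa confidence intervals
estat bootstrap, bca

* The estimated accelerations are stored in e(accel)
matrix list e(accel)
```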

mse specifies that bootstrap compute the variance by using deviations of the replicates from the observed value of the statistics based on the entire dataset. By default, bootstrap computes the variance by using deviations from the average of the replicates.

    +-----------+
----+ Reporting +--------------------------------------------------------

level(#); see [R] estimation options.

notable suppresses the display of the table of results.

noheader suppresses the display of the table header. This option implies nolegend. This option may also be specified when replaying estimation results.

nolegend suppresses the display of the table legend. This option may also be specified when replaying estimation results.

verbose specifies that the full table legend be displayed. By default, coefficients and standard errors are not displayed. This option may also be specified when replaying estimation results.

nodots suppresses display of the replication dots. By default, one dot character is displayed for each successful replication. A red 'x' is displayed if command returns an error or if one of the values in exp_list is missing.

noisily specifies that any output from command be displayed. This option implies the nodots option.

trace causes a trace of the execution of command to be displayed. This option implies the noisily option.

title(text) specifies a title to be displayed above the table of bootstrap results. The default title is the title saved in e(title) by an estimation command, or if e(title) is not filled in, Bootstrap results is used. title() may also be specified when replaying estimation results.

display_options: noomitted, vsquish, noemptycells, baselevels, allbaselevels; see [R] estimation options.

    +----------+
----+ Advanced +---------------------------------------------------------

nodrop prevents observations outside e(sample) and the if and in conditions from being dropped before the data are resampled.

nowarn suppresses the display of a warning message when command does not set e(sample).

force suppresses the restriction that command not specify weights or be a svy command. This is a rarely used option. Use it only if you know what you are doing.

reject(exp) identifies an expression that indicates when results should be rejected. When exp is true, the resulting values are reset to missing values.
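One plausible use of reject(), sketched here under the assumption that command stores a convergence flag in e(converged) (as logit does), is to discard replications in which the model did not converge. The model and rejection rule are illustrative only.

```stata
sysuse auto, clear

* Treat replications where logit failed to converge as missing
bootstrap, reps(100) seed(1) nodots reject(e(converged) != 1): ///
    logit foreign mpg weight

* Counts of complete and incomplete replications
display e(N_reps)
display e(N_misreps)
```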

seed(#) sets the random-number seed. Specifying this option is equivalent to typing the following command prior to calling bootstrap:

. set seed #

The following options are available with bootstrap but are not shown in the dialog box:

eform_option causes the coefficient table to be displayed in exponentiated form; see [R] eform_option. command determines which of the following are allowed (eform(string) and eform are always allowed).

group(varname) re-creates varname containing a unique identifier for each group across the resampled clusters. This option requires that idcluster() also be specified.

This option is useful for maintaining unique group identifiers when sampling clusters with replacement. Suppose that cluster 1 contains 3 groups. If the idcluster(newclid) option is specified and cluster 1 is sampled multiple times, newclid uniquely identifies each copy of cluster 1. If group(newgroupid) is also specified, newgroupid uniquely identifies each copy of each group.

jackknifeopts(jkopts) identifies options that are to be passed to jackknife when it computes the acceleration values for the BCa confidence intervals. This option requires the bca option and is mostly used for passing the eclass, rclass, or n(#) option to jackknife.

coeflegend; see [R] estimation options.

Examples

Setup
    . sysuse auto

Compute bootstrap estimates
    . bootstrap: regress mpg weight gear foreign

Same as above command
    . bootstrap _b: regress mpg weight gear foreign

Change number of replications to 100
    . bootstrap, reps(100): regress mpg weight gear foreign

Compute acceleration to obtain BCa confidence intervals
    . bootstrap, bca: regress mpg weight gear foreign

Save results to bsauto file
    . bootstrap, saving(bsauto): regress mpg weight gear foreign

Run bootstrap on difference in coefficients of weight and gear
    . bootstrap diff=(_b[weight]-_b[gear]): regress mpg weight gear foreign

Bootstrap t statistic using 1000 replications, stratifying on foreign, and saving results in bsauto file
    . bootstrap t=r(t), rep(1000) strata(foreign) saving(bsauto, replace): ttest mpg, by(foreign) unequal

Saved results

bootstrap saves the following in e():

Scalars
    e(N)                sample size
    e(N_reps)           number of complete replications
    e(N_misreps)        number of incomplete replications
    e(N_strata)         number of strata
    e(N_clust)          number of clusters
    e(k_eq)             number of equations
    e(k_exp)            number of standard expressions
    e(k_eexp)           number of extended expressions (i.e., _b)
    e(k_extra)          number of extra equations beyond the original ones
                          from e(b)
    e(level)            confidence level for bootstrap CIs
    e(bs_version)       version for bootstrap results
    e(rank)             rank of e(V)

Macros
    e(cmdname)          command name from command
    e(cmd)              same as e(cmdname) or bootstrap
    e(command)          command
    e(cmdline)          command as typed
    e(prefix)           bootstrap
    e(title)            title in estimation output
    e(strata)           strata variables
    e(cluster)          cluster variables
    e(seed)             initial random-number seed
    e(size)             from the size(#) option
    e(exp#)             expression for the #th statistic
    e(mse)              mse, if specified
    e(vce)              bootstrap
    e(vcetype)          title used to label Std. Err.
    e(properties)       b V

Matrices
    e(b)                observed statistics
    e(b_bs)             bootstrap estimates
    e(reps)             number of nonmissing results
    e(bias)             estimated biases
    e(se)               estimated standard errors
    e(z0)               median biases
    e(accel)            estimated accelerations
    e(ci_normal)        normal-approximation CIs
    e(ci_percentile)    percentile CIs
    e(ci_bc)            bias-corrected CIs
    e(ci_bca)           bias-corrected and accelerated CIs
    e(V)                bootstrap variance-covariance matrix
    e(V_modelbased)     model-based variance

When exp_list is _b, bootstrap will also carry forward most of the results already in e() from command.
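The saved results can be inspected directly after a run, as in this short sketch (the model, replication count, and seed are arbitrary):

```stata
sysuse auto, clear
bootstrap, reps(200) seed(1) nodots: regress mpg weight

* Scalars and macros
display e(N)
display e(N_reps)
display "`e(prefix)'"

* Matrices
matrix list e(se)
matrix list e(ci_percentile)
```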

Also see

Manual: [R] bootstrap

Help: [R] bootstrap postestimation; [R] jackknife, [R] permute, [R] simulate


© Copyright 1996–2009 StataCorp LP