Bootstrap estimation                                         (STB-21: ssi6.2)
--------------------

    ^bstrap^ progname [^, a^rgs^(^...^) c^luster^(^varnames^) d^ots ^r^eps^(^#^) si^ze^(^#^)^ ]


Description
-----------

^bstrap^ runs the user-defined program progname ^reps()^ times on bootstrap
samples of size ^size()^.

^bstrap^ calls progname in two ways.  At the outset, ^bstrap^ issues
"progname ^?^" and expects progname to set the global macro S_1 to contain a
list of variable names under which results are to be stored.

Thereafter, ^bstrap^ issues straight "progname" calls, having first set
memory to contain a bootstrap sample, and expects progname to perform the
statistical calculation and store the results using ^post^.  Details of
^post^ can be found in ^help post^, but enough information is provided below
to use ^post^ successfully.


Description, continued
----------------------

^bstrap^ is a faster and more convenient variation on ^boot^; see ^help boot^.

For those wishing to implement their own special-purpose bootstrapping
routines, ^bstrap^ performs its sampling using ^bootsamp^; also see
^help boot^.  ^bstrap^ collects the results using ^post^; see ^help post^.

In addition to ^bstrap^, ^bs^ provides an even faster and more convenient way
to obtain bootstrap standard errors for single statistics; see ^help bs^.


Options
-------

^args(^...^)^ specifies any arguments to be passed to progname on invocation.
    The query call is then of the form "progname ^?^ ..." and subsequent
    calls are of the form "progname ...".


Options, continued
------------------

^cluster(^varnames^)^ specifies the variable(s) identifying sampling clusters.
    The default is to treat each observation as representing its own
    cluster.  The sample drawn during a replication is actually a bootstrap
    sample of clusters.

^dots^ requests a dot be placed on the screen at the beginning of every call
    to progname, thus providing entertainment when a large number of
    ^reps()^ are requested.

^reps(^#^)^ specifies the number of bootstrap replications to be performed;
    the default is ^reps(50)^.  See ^help bs^ for more information.

^size(^#^)^ specifies the size of the samples to be drawn.  The default is
    ^_N^, meaning to draw samples of the same size as the data.  If
    ^cluster()^ is specified, the default ^_N^ means to draw samples
    containing the same number of clusters as the data.  Unless all the
    clusters contain the same number of observations, resulting sample sizes
    will differ from replication to replication.  If ^size(^#^)^ is
    specified, # must be less than or equal to the number of clusters or, if
    not clustered, the number of observations.


Remarks
-------

progname must have the following outline:

        ^program define^ progname
                ^if "`1'"=="?" {^
                        ^global S_1 "^variable names^"^
                        ^exit^
                ^}^
                perform estimation on sample in memory
                ^post^ results
        ^end^

There must be the same number of results following the ^post^ command as
variable names following the ^global S_1^ command.


Example 1
---------

Problem:  Obtain a bootstrap standard error for the median of mpg in the
          auto data.

Solution: ^summarize^ with the ^detail^ option calculates, among other
          things, medians and, according to Saved Results in [5s] summarize,
          stores the median in ^_result(10)^.  The program is thus:
          (see next screen)

(Note:  the lengthy solution below should be compared with the much shorter
solution to Example 1 in ^help bs^.)


Example 1, continued
--------------------

        ^program define mpgmed^
                ^version 3.1^
                ^if "`1'"=="?" {^
                        ^global S_1 "median"^
                        ^exit^
                ^}^
                ^summarize mpg, detail^
                ^post _result(10)^
        ^end^

The steps are then:

        . ^use auto^                      (load the data)
        . ^bstrap mpgmed^                 (perform bootstrap)

(continued ...)


Example 1, continued
--------------------

 . ^bstrap mpgmed^

 Bootstrap:
    Program:        mpgmed
    Arguments:
    Replications:   50
    Data set size:  74
    Sample size:    _N

 Variable |     Obs        Mean   Std. Dev.       Min        Max
 ---------+-----------------------------------------------------
   median |      50       20.12   .9066647         19         22

The standard deviation is the bootstrap estimate of the standard error of
the median.

(continued ...)


Example 1, continued
--------------------

After running ^bstrap^, the data in memory contains the bootstrapped results:

 . ^describe^

 Contains data
   Obs:    50 (max=  5040)                     mpgmed bootstrap
  Vars:     1 (max=    99)
 Width:     4 (max=   200)
   1. median    float  %9.0g
 Sorted by:

 . ^list in 1/4^

         median
   1.        20
   2.        21
   3.      20.5
   4.        20


Example 2
---------

Problem:  Obtain a bootstrap estimate of the SE of the SE of the mean of mpg
          using the auto data.

Solution: ^ci^ calculates standard errors of means.  According to [5s] ci,
          ^ci^ saves the standard error in the global macro $S_4.

        ^program define sesemean^
                ^version 3.1^
                ^if "`1'"=="?" {^
                        ^global S_1 "se"^
                        ^exit^
                ^}^
                ^ci mpg^
                ^post $S_4^
        ^end^

Then, interactively:

        . ^bstrap sesemean^


Example 3
---------

Problem:  Obtain bootstrap estimates of the standard errors of the
          coefficients in a regression of mpg on weight and displ.  Use 100
          replications.

Solution:

        ^program define myreg^
                ^version 3.1^
                ^if "`1'"=="?" {^
                        ^global S_1 "weight displ cons"^
                        ^exit^
                ^}^
                ^regress mpg weight displ^
                ^post _b[weight] _b[displ] _b[_cons]^
        ^end^

Then, interactively:

        . ^bstrap myreg, reps(100)^


Speeding execution
------------------

See ^help bs^.


Missing values
--------------

See ^help bs^.


Also see
--------

    STB:  ssi6.2 (STB-21)
On-line:  ^help^ for ^bs^; ^boot^
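

Addendum: passing arguments with args()
---------------------------------------

The examples above hard-code the variable name.  As a sketch only (it is not
part of the original insert and has not been run here), a single program
could serve any variable by using ^args()^.  The sketch assumes the calling
convention described under Options: the query call is "progname ^?^ ...", so
`1' is "?", and later calls are "progname ...", so `1' is the first
argument.  The program name varmed is hypothetical.

        ^program define varmed^
                ^version 3.1^
                ^* query call: name the variable that will hold the result^
                ^if "`1'"=="?" {^
                        ^global S_1 "median"^
                        ^exit^
                ^}^
                ^* replication call: `1' is the variable passed via args()^
                ^summarize `1', detail^
                ^post _result(10)^
        ^end^

Then, interactively:

        . ^bstrap varmed, args(mpg) reps(100)^

With ^args(mpg)^ this should perform the same calculation as mpgmed in
Example 1, here with 100 replications; passing a different variable name
reuses the program unchanged.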