Stata 15 help for jackknife

[R] jackknife -- Jackknife estimation


jackknife exp_list [, options eform_option] : command

    options                   Description
    -------------------------------------------------------------------------
    Main
      eclass                  number of observations used is stored in e(N)
      rclass                  number of observations used is stored in r(N)
      n(exp)                  specify exp that evaluates to the number of
                                observations used

    Options
      cluster(varlist)        variables identifying sample clusters
      idcluster(newvar)       create new cluster ID variable
      saving(filename, ...)   save results to filename; save statistics in
                                double precision; save results to filename
                                every # replications
      keep                    keep pseudovalues
      mse                     use MSE formula for variance estimation

    Reporting
      level(#)                set confidence level; default is level(95)
      notable                 suppress table of results
      noheader                suppress table header
      nolegend                suppress table legend
      verbose                 display the full table legend
      nodots                  suppress replication dots
      dots(#)                 display dots every # replications
      noisily                 display any output from command
      trace                   trace command
      title(text)             use text as title for jackknife results
      display_options         control columns and column formats, row
                                spacing, line width, display of omitted
                                variables and base and empty cells, and
                                factor-variable labeling
      eform_option            display coefficient table in exponentiated form

    Advanced
      nodrop                  do not drop observations
      reject(exp)             identify invalid results

      coeflegend              display legend instead of statistics
    -------------------------------------------------------------------------
    svy is allowed; see [SVY] svy jackknife.
    command is any command that follows standard Stata syntax.
    All weight types supported by command are allowed except aweights; see
      weight.
    coeflegend does not appear in the dialog box.
    See [R] jackknife postestimation for features available after estimation.


Statistics > Resampling > Jackknife estimation


jackknife performs jackknife estimation of the specified statistics (or expressions) for a Stata command or a user-written program. Statistics are jackknifed by estimating the command once for each observation or cluster in the dataset, leaving the associated observation or cluster out of the calculations. jackknife is designed for use with nonestimation commands, functions of coefficients, or user-written programs. To jackknife coefficients, we recommend using the vce(jackknife) option when allowed by the estimation command.

jknife is a synonym for jackknife.
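
As a minimal sketch of the distinction drawn above, the following uses the auto dataset shipped with Stata. The statistic name ratio is illustrative only. The first command jackknifes a function of coefficients with the prefix; the second obtains jackknife standard errors for the coefficients themselves through vce(jackknife).

```stata
sysuse auto, clear

* Function of coefficients: use the jackknife prefix.
* "ratio" is an illustrative name for the collected statistic.
jackknife ratio=(_b[weight]/_b[trunk]), eclass: regress mpg weight trunk

* Coefficients themselves: prefer the vce(jackknife) option.
regress mpg weight trunk, vce(jackknife)
```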


        +------+
    ----+ Main +-------------------------------------------------------------

eclass, rclass, and n(exp) specify where command stores the number of observations on which it based the calculated results. We strongly advise you to specify one of these options.

eclass specifies that command store the number of observations in e(N).

rclass specifies that command store the number of observations in r(N).

n(exp) specifies an expression that evaluates to the number of observations used. Specifying n(r(N)) is equivalent to specifying the rclass option. Specifying n(e(N)) is equivalent to specifying the eclass option. If command stores the number of observations in r(N1), specify n(r(N1)).

If you specify no options, jackknife will assume eclass or rclass, depending on which of e(N) and r(N) is not missing (in that order). If both e(N) and r(N) are missing, jackknife assumes that all observations in the dataset contribute to the calculated result. If that assumption is incorrect, the reported standard errors will be incorrect. For instance, say that you specify

. jackknife coef=_b[x2]: myreg y x1 x2 x3

where myreg uses e(n) instead of e(N) to identify the number of observations used in calculations. Further assume that observation 42 in the dataset has x3 equal to missing. The 42nd observation plays no role in obtaining the estimates, but jackknife has no way of knowing that and will use the wrong N. If, on the other hand, you specify

. jackknife coef=_b[x2], n(e(n)): myreg y x1 x2 x3

jackknife will notice that observation 42 plays no role. The n(e(n)) option is specified because myreg is an estimation command but it stores the number of observations used in e(n) (instead of the standard e(N)). When jackknife runs the regression omitting the 42nd observation, jackknife will observe that e(n) has the same value as when jackknife previously ran the regression using all the observations. Thus jackknife will know that myreg did not use the observation.
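
The same idea applies to r-class programs that store the observation count under a nonstandard name. The sketch below assumes a hypothetical user-written program, myratio, that returns the ratio of two means in r(ratio) and the number of observations used in r(N1); neither the program nor those result names come from Stata itself.

```stata
capture program drop myratio
program define myratio, rclass
    version 15
    // hypothetical program: ratio of the means of two variables
    syntax varlist(min=2 max=2 numeric) [if] [in]
    marksample touse              // drops observations with missing values
    tokenize `varlist'
    tempname m1
    quietly summarize `1' if `touse'
    scalar `m1' = r(mean)
    quietly summarize `2' if `touse'
    return scalar ratio = `m1'/r(mean)
    return scalar N1    = r(N)    // nonstandard name for the sample size
end

sysuse auto, clear
* Tell jackknife where the number of observations is stored.
jackknife ratio=r(ratio), n(r(N1)): myratio mpg weight
```

Because myratio accepts the if qualifier, jackknife can rerun it with one observation excluded at a time.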

        +---------+
    ----+ Options +----------------------------------------------------------

cluster(varlist) specifies the variables identifying sample clusters. If cluster() is specified, one cluster is left out of each call to command, instead of one observation.

idcluster(newvar) creates a new variable containing a unique integer identifier for each resampled cluster, starting at 1 and leading up to the number of clusters. This option may be specified only when the cluster() option is specified. idcluster() helps identify the cluster to which a pseudovalue belongs.

saving(filename [, suboptions]) creates a Stata data file (.dta file) consisting of (for each statistic in exp_list) a variable containing the replicates.

See prefix_saving_option for details about suboptions.
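
A short sketch of saving(), assuming a hypothetical output file named jkreps in the current working directory. The replace and double suboptions overwrite any existing file and store the replicates in double precision.

```stata
sysuse auto, clear

* "jkreps" is an illustrative filename; replace overwrites an existing file.
jackknife sd=r(sd), rclass saving(jkreps, replace double): summarize mpg

* Inspect the saved replicates.
use jkreps, clear
describe
summarize
```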

keep specifies that new variables be added to the dataset containing the pseudovalues of the requested statistics. See [R] jackknife for details. When the cluster() option is specified, each cluster is given at most one nonmissing pseudovalue. The keep option implies the nodrop option.
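
The following sketch combines cluster() with keep. The grouping variable grp is made up purely for illustration (the auto dataset has no natural clusters), and avg is an arbitrary name for the statistic. After estimation, the pseudovalues are in the new variable avg, with at most one nonmissing value per cluster.

```stata
sysuse auto, clear
generate grp = ceil(_n/10)        // hypothetical cluster ID, illustration only

jackknife avg=r(mean), rclass cluster(grp) keep: summarize mpg

* One nonmissing pseudovalue per cluster.
list grp avg if !missing(avg)

* The standard error of the mean of the pseudovalues should match the
* jackknife standard error reported above.
mean avg
```

Note that the point estimate shown by mean is the average of the pseudovalues, which corresponds to e(b_jk) rather than to the observed statistic in e(b).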

mse specifies that jackknife compute the variance by using deviations of the replicates from the observed value of the statistics based on the entire dataset. By default, jackknife computes the variance by using deviations of the pseudovalues from their mean.
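
To make the difference concrete, the sketch below computes both standard errors by hand for the standard deviation of mpg and then runs jackknife with and without mse for comparison. It assumes the usual definition of the pseudovalue for observation j, N*theta - (N-1)*theta_(j), where theta is the statistic on the full data and theta_(j) is the replicate with observation j omitted. The names s_full, loo, pseudo, and dev2 are illustrative.

```stata
sysuse auto, clear
quietly summarize mpg
local N = r(N)
scalar s_full = r(sd)             // observed statistic on the full data

* Leave-one-out replicates.
generate double loo = .
forvalues j = 1/`N' {
    quietly summarize mpg if _n != `j'
    quietly replace loo = r(sd) in `j'
}

* Default: deviations of the pseudovalues from their mean.
generate double pseudo = `N'*s_full - (`N' - 1)*loo
quietly summarize pseudo
display "default SE: " r(sd)/sqrt(r(N))

* mse: deviations of the replicates from the observed statistic.
generate double dev2 = (loo - s_full)^2
quietly summarize dev2
display "mse SE:     " sqrt((`N' - 1)/`N' * r(sum))

* Compare with jackknife itself.
jackknife sd=r(sd), rclass nodots: summarize mpg
jackknife sd=r(sd), rclass nodots mse: summarize mpg
```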

        +-----------+
    ----+ Reporting +--------------------------------------------------------

level(#); see [R] estimation options.

notable suppresses the display of the table of results.

noheader suppresses the display of the table header. This option implies nolegend.

nolegend suppresses the display of the table legend. The table legend identifies the rows of the table with the expressions they represent.

verbose specifies that the full table legend be displayed. By default, the legend does not list coefficients and standard errors.

nodots suppresses display of the replication dots. By default, one dot character is displayed for each successful replication. A red 'x' is displayed if command returns an error or if one of the values in exp_list is missing.

dots(#) displays dots every # replications. dots(0) is a synonym for nodots.

noisily specifies that any output from command be displayed. This option implies the nodots option.

trace causes a trace of the execution of command to be displayed. This option implies the noisily option.

title(text) specifies a title to be displayed above the table of jackknife results; the default title is Jackknife results or what is produced in e(title) by an estimation command.

display_options: noci, nopvalues, noomitted, vsquish, noemptycells, baselevels, allbaselevels, nofvlabel, fvwrap(#), fvwrapon(style), cformat(%fmt), pformat(%fmt), sformat(%fmt), and nolstretch; see [R] estimation options.

eform_option causes the coefficient table to be displayed in exponentiated form; see [R] eform_option. command determines which eform_option is allowed (eform(string) and eform are always allowed).
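
A brief sketch of a few reporting options, using the auto dataset. The first command requests 90% confidence intervals and suppresses the replication dots; the second displays exponentiated coefficients for a logit model using eform, which the text above notes is always allowed.

```stata
sysuse auto, clear

* 90% confidence intervals, no replication dots.
jackknife, level(90) nodots: regress mpg weight trunk

* Exponentiated coefficients.
jackknife, eform nodots: logit foreign mpg
```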

        +----------+
    ----+ Advanced +---------------------------------------------------------

nodrop prevents observations outside e(sample) and the if and in qualifiers from being dropped before the data are resampled.

reject(exp) identifies an expression that indicates when results should be rejected. When exp is true, the resulting values are reset to missing values.
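
One plausible use of reject(), offered as a sketch rather than a recommendation: discarding replications in which an iterative estimator did not converge. The example assumes the convergence flag e(converged) that logit stores after estimation.

```stata
sysuse auto, clear

* Replications for which logit reports nonconvergence are set to missing.
jackknife, reject(e(converged) != 1) nodots: logit foreign mpg weight
```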

The following option is available with jackknife but is not shown in the dialog box:

coeflegend; see [R] estimation options.



. jackknife exp_list: command

executes command once for each observation in the dataset, leaving the associated observation out of the calculations that make up exp_list.

command defines the statistical command to be executed. Most Stata commands and user-written programs can be used with jackknife, as long as they follow standard Stata syntax and allow the if qualifier; see [U] 11 Language syntax. The by prefix may not be part of command.

exp_list specifies the statistics to be collected from the execution of command. If command changes the contents in e(b), exp_list is optional and defaults to _b.
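
For illustration, exp_list may hold several expressions, named or unnamed. In the sketch below, sd and skew are arbitrary names chosen for the collected statistics, and the second command collects a single unnamed expression enclosed in parentheses.

```stata
sysuse auto, clear

* Two named statistics from one command.
jackknife sd=r(sd) skew=r(skewness), rclass nodots: summarize mpg, detail

* One unnamed expression; jackknife assigns a default name.
jackknife (r(p50) - r(mean)), rclass nodots: summarize mpg, detail
```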

When the cluster() option is given, clusters are omitted instead of observations, and N is the number of clusters instead of the sample size.


    Setup
        . sysuse auto

    Jackknifed standard error of the sample mean
        . jackknife r(mean): summarize mpg

    Jackknifed standard errors of the coefficients from a regression
        . jackknife: regress mpg weight trunk

Stored results

jackknife stores the following in e():

    Scalars
      e(N)                sample size
      e(N_reps)           number of complete replications
      e(N_misreps)        number of incomplete replications
      e(N_clust)          number of clusters
      e(k_eq)             number of equations in e(b)
      e(k_extra)          number of extra equations
      e(k_exp)            number of expressions
      e(k_eexp)           number of extended expressions (_b or _se)
      e(df_r)             degrees of freedom

    Macros
      e(cmdname)          command name from command
      e(cmd)              same as e(cmdname) or jackknife
      e(command)          command
      e(cmdline)          command as typed
      e(prefix)           jackknife
      e(wtype)            weight type
      e(wexp)             weight expression
      e(title)            title in estimation output
      e(cluster)          cluster variables
      e(pseudo)           new variables containing pseudovalues
      e(nfunction)        e(N), r(N), n() option, or empty
      e(exp#)             expression for the #th statistic
      e(mse)              from mse option
      e(vce)              jackknife
      e(vcetype)          title used to label Std. Err.
      e(properties)       b V

    Matrices
      e(b)                observed statistics
      e(b_jk)             jackknife estimates
      e(V)                jackknife variance-covariance matrix
      e(V_modelbased)     model-based variance

When exp_list is _b, jackknife will also carry forward most of the results already in e() from command.
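
A short sketch of inspecting these results after estimation, using the auto dataset:

```stata
sysuse auto, clear
jackknife sd=r(sd), rclass nodots: summarize mpg

ereturn list                      // everything stored in e()
display "complete replications: " e(N_reps)
matrix list e(b)                  // observed statistic
matrix list e(b_jk)               // jackknife estimate
matrix list e(V)                  // jackknife variance
```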
