Stata 11 help for jackknife


Title

[R] jackknife -- Jackknife estimation

Syntax

jackknife exp_list [, options eform_option ] : command

    options               description
    -------------------------------------------------------------------------
    Main
      eclass              number of observations used is stored in e(N)
      rclass              number of observations used is stored in r(N)
      n(exp)              specify exp that evaluates to the number of
                            observations used

    Options
      cluster(varlist)    variables identifying sample clusters
      idcluster(newvar)   create new cluster ID variable
      saving(filename, ...)
                          save results to filename; save statistics in
                            double precision; save results to filename
                            every # replications
      keep                keep pseudovalues
      mse                 use MSE formula for variance estimation

    Reporting
      level(#)            set confidence level; default is level(95)
      notable             suppress table of results
      noheader            suppress table header
      nolegend            suppress table legend
      verbose             display the full table legend
      nodots              suppress the replication dots
      noisily             display any output from command
      trace               trace the command
      title(text)         use text as title for jackknife results
      display_options     control spacing and display of omitted variables
                            and base and empty cells

    Advanced
      nodrop              do not drop observations
      reject(exp)         identify invalid results

    + eform_option        display coefficient table in exponentiated form
    + coeflegend          display coefficients' legend instead of
                            coefficient table
    -------------------------------------------------------------------------
    + eform_option and coeflegend do not appear in the dialog box.
    svy is allowed; see [SVY] svy jackknife.
    All weight types supported by command are allowed except aweights; see
      weight.
    See [R] jackknife postestimation for features available after
      estimation.

Menu

Statistics > Resampling > Jackknife estimation

Description

jackknife performs jackknife estimation. Typing

. jackknife exp_list: command

executes command once for each observation in the dataset, leaving the associated observation out of the calculations that make up exp_list.

command defines the statistical command to be executed. Most Stata commands and user-written programs can be used with jackknife, as long as they follow standard Stata syntax and allow the if qualifier. The by prefix may not be part of command; see [U] 11 Language syntax.

exp_list specifies the statistics to be collected from the execution of command. If command changes the contents in e(b), exp_list is optional and defaults to _b.

Many estimation commands allow the vce(jackknife) option. For those commands, we recommend using vce(jackknife) over jackknife because the estimation command already handles clustering and other model-specific details for you. The jackknife prefix command is intended for use with nonestimation commands, such as summarize, user-written commands, or functions of coefficients.
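
A minimal sketch of the two approaches, assuming the auto dataset shipped with Stata is loaded. Both commands request jackknife standard errors for the same regression; the first uses the vce(jackknife) option recommended above, and the second uses the prefix form.

    . sysuse auto
    . regress mpg weight trunk, vce(jackknife)
    . jackknife: regress mpg weight trunk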

jknife is a synonym for jackknife.

Options

    +------+
----+ Main +-------------------------------------------------------------

eclass, rclass, and n(exp) specify where command saves the number of observations on which it based the calculated results. We strongly advise you to specify one of these options.

eclass specifies that command save the number of observations in e(N).

rclass specifies that command save the number of observations in r(N).

n(exp) specifies an expression that evaluates to the number of observations used. Specifying n(r(N)) is equivalent to specifying the rclass option. Specifying n(e(N)) is equivalent to specifying the eclass option. If command saved the number of observations in r(N1), specify n(r(N1)).

If you specify none of these options, jackknife will assume eclass or rclass, checking e(N) first and then r(N) and using whichever is not missing. If both e(N) and r(N) are missing, jackknife assumes that all observations in the dataset contribute to the calculated result. If that assumption is incorrect, the reported standard errors will be incorrect. For instance, say that you specify

. jackknife coef=_b[x2]: myreg y x1 x2 x3

where myreg uses e(n) instead of e(N) to identify the number of observations used in calculations. Further assume that observation 42 in the dataset has x3 equal to missing. The 42nd observation plays no role in obtaining the estimates, but jackknife has no way of knowing that and will use the wrong N. If, on the other hand, you specify

. jackknife coef=_b[x2], n(e(n)): myreg y x1 x2 x3

jackknife will notice that observation 42 plays no role. The n(e(n)) option is specified because myreg is an estimation command but it saves the number of observations used in e(n) (instead of the standard e(N)). When jackknife runs the regression omitting the 42nd observation, jackknife will observe that e(n) has the same value as when jackknife previously ran the regression using all the observations. Thus jackknife will know that myreg did not use the observation.
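
As an illustration with a built-in command, the following sketch assumes the auto dataset (sysuse auto) is in memory. summarize is an r-class command that stores its observation count in r(N), so the two commands below should be equivalent. The name mean is an arbitrary label chosen for this example.

    . jackknife mean=r(mean), rclass: summarize mpg
    . jackknife mean=r(mean), n(r(N)): summarize mpg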

    +---------+
----+ Options +----------------------------------------------------------

cluster(varlist) specifies the variables identifying sample clusters. If cluster() is specified, one cluster is left out of each call to command, instead of 1 observation.

idcluster(newvar) creates a new variable containing a unique integer identifier for each resampled cluster, starting at 1 and leading up to the number of clusters. This option may be specified only when the cluster() option is specified. idcluster() helps identify the cluster to which a pseudovalue belongs.

saving(filename [, suboptions]) creates a Stata data file (.dta file) consisting of (for each statistic in exp_list) a variable containing the replicates.

See prefix_saving_option for details about suboptions.

replace specifies that filename be overwritten, if it exists. This option is not shown in the dialog box.

keep specifies that new variables are to be added to the dataset containing the pseudovalues of the requested statistics. See [R] jackknife for details. When the cluster() option is specified, each cluster is given at most one nonmissing pseudovalue. This option implies the nodrop option.

mse specifies that jackknife compute the variance by using deviations of the replicates from the observed value of the statistics based on the entire dataset. By default, jackknife computes the variance by using deviations of the pseudovalues from their mean.
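
The following sketch shows how the options in this group might be used. The first three commands assume the auto dataset (sysuse auto) is in memory; the filename jkmean and the expression names mean and sd are arbitrary choices for this example. The last command assumes a different dataset containing an outcome y, a regressor x, and a cluster identifier school; those variable names, along with slope and jkid, are hypothetical.

    . jackknife mean=r(mean), rclass saving(jkmean, replace): summarize mpg
    . jackknife sd=r(sd), rclass keep: summarize mpg
    . jackknife mean=r(mean), rclass mse: summarize mpg
    . jackknife slope=_b[x], cluster(school) idcluster(jkid) keep: regress y x

After the second command, the dataset should contain a new variable, sd, holding the pseudovalues. After the last command, jkid should identify the cluster to which each pseudovalue in slope belongs.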

    +-----------+
----+ Reporting +--------------------------------------------------------

level(#); see [R] estimation options.

notable suppresses the display of the table of results.

noheader suppresses display of the table header. This option implies nolegend.

nolegend suppresses display of the table legend. The table legend identifies the rows of the table with the expressions they represent.

verbose specifies that the full table legend be displayed. By default, coefficients and standard errors are not displayed in the legend.

nodots suppresses display of the replication dots. By default, one dot character is displayed for each successful replication. A red `x' is displayed if command returns an error or if one of the values in exp_list is missing.

noisily specifies that any output from command be displayed. This option implies the nodots option.

trace causes a trace of the execution of command to be displayed. This option implies the noisily option.

title(text) specifies a title to be displayed above the table of jackknife results; the default title is Jackknife results or what is produced in e(title) by an estimation command.

display_options: noomitted, vsquish, noemptycells, baselevels, allbaselevels; see [R] estimation options.
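
For instance, assuming the auto dataset (sysuse auto) is in memory, the sketch below requests 90% confidence intervals, suppresses the replication dots, and supplies a custom title. The title text is an arbitrary choice for this example.

    . jackknife r(mean), rclass level(90) nodots title(Mean of mpg): summarize mpg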

    +----------+
----+ Advanced +---------------------------------------------------------

nodrop prevents observations outside e(sample) and the if and in qualifiers from being dropped before the data are resampled.

reject(exp) identifies an expression that indicates when results should be rejected. When exp is true, the resulting values are reset to missing values.
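
A hedged sketch of reject(), assuming the auto dataset (sysuse auto) is in memory and that command stores a convergence flag in e(converged), as maximum likelihood estimators such as logit generally do. Under that assumption, any replication in which the model fails to converge would have its results reset to missing.

    . jackknife _b, reject(e(converged) != 1): logit foreign mpg weight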

The following options are available with jackknife but are not shown in the dialog box:

eform_option causes the coefficient table to be displayed in exponentiated form; see [R] eform_option. command determines which eform_option is allowed (eform(string) and eform are always allowed).

coeflegend; see [R] estimation options.
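
For example, assuming the auto dataset (sysuse auto) is in memory, the following sketch displays jackknifed logit results in exponentiated form. eform is used here because, as noted above, it is always allowed.

    . jackknife _b, eform: logit foreign mpg weight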

Examples

Setup
    . sysuse auto

Jackknifed standard error of the sample mean
    . jackknife r(mean): summarize mpg

Jackknifed standard errors of the coefficients from a regression
    . jackknife: regress mpg weight trunk

Saved results

jackknife saves the following in e():

Scalars
    e(N)               sample size
    e(N_reps)          number of complete replications
    e(N_misreps)       number of incomplete replications
    e(N_clust)         number of clusters
    e(df_r)            degrees of freedom

Macros
    e(cmdname)         command name from command
    e(cmd)             same as e(cmdname) or jackknife
    e(command)         command
    e(cmdline)         command as typed
    e(prefix)          jackknife
    e(wtype)           weight type
    e(wexp)            weight expression
    e(title)           title in estimation output
    e(cluster)         cluster variables
    e(pseudo)          new variables containing pseudovalues
    e(nfunction)       e(N), r(N), n() option, or empty
    e(exp#)            expression for the #th statistic
    e(mse)             from mse option
    e(vce)             jackknife
    e(vcetype)         title used to label Std. Err.
    e(properties)      b V

Matrices
    e(b)               observed statistics
    e(b_jk)            jackknife estimates
    e(V)               jackknife variance-covariance matrix
    e(V_modelbased)    model-based variance

When exp_list is _b, jackknife will also carry forward most of the results already in e() from command.
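
As a quick way to inspect these results, the sketch below assumes the auto dataset (sysuse auto) is in memory and uses only standard commands for listing the contents of e().

    . jackknife: regress mpg weight trunk
    . ereturn list
    . matrix list e(b_jk)
    . matrix list e(V)
    . display e(N_reps)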

Also see

Manual: [R] jackknife

Help: [R] jackknife postestimation; [R] bootstrap, [R] permute, [R] simulate, [SVY] svy jackknife


© Copyright 1996–2009 StataCorp LP