**[R] jackknife** -- Jackknife estimation

__Syntax__

**jackknife** *exp_list* [**,** *options* *eform_option*] **:** *command*

*options*                             Description
-------------------------------------------------------------------------
Main
__e__**class**                        number of observations used is stored in **e(N)**
__r__**class**                        number of observations used is stored in **r(N)**
**n(***exp***)**                      specify *exp* that evaluates to the number of observations used

Options
__cl__**uster(***varlist***)**        variables identifying sample clusters
__id__**cluster(***newvar***)**       create new cluster ID variable
__sa__**ving(***filename***, ...)**   save results to *filename*; save statistics in double precision; save results to *filename* every *#* replications
**keep**                              keep pseudovalues
**mse**                               use MSE formula for variance estimation

Reporting
__l__**evel(***#***)**                set confidence level; default is **level(95)**
**notable**                           suppress table of results
__noh__**eader**                      suppress table header
__nol__**egend**                      suppress table legend
__v__**erbose**                       display the full table legend
**nodots**                            suppress replication dots
**dots(***#***)**                     display dots every *#* replications
__noi__**sily**                       display any output from *command*
__tr__**ace**                         trace *command*
__ti__**tle(***text***)**             use *text* as title for jackknife results
*display_options*                     control columns and column formats, row spacing, line width, display of omitted variables and base and empty cells, and factor-variable labeling
*eform_option*                        display coefficient table in exponentiated form

Advanced
**nodrop**                            do not drop observations
**reject(***exp***)**                 identify invalid results

__coefl__**egend**                    display legend instead of statistics
-------------------------------------------------------------------------
**svy** is allowed; see **[SVY] svy jackknife**.
*command* is any command that follows standard Stata syntax. All weight
types supported by *command* are allowed except **aweight**s; see weight.
**coeflegend** does not appear in the dialog box.
See **[R] jackknife postestimation** for features available after estimation.

__Menu__

**Statistics > Resampling > Jackknife estimation**

__Description__

**jackknife** performs jackknife estimation of the specified statistics (or
expressions) for a Stata command or a user-written program. Statistics
are jackknifed by estimating the command once for each observation or
cluster in the dataset, leaving the associated observation or cluster out
of the calculations. **jackknife** is designed for use with nonestimation
commands, functions of coefficients, or user-written programs. To
jackknife coefficients, we recommend using the **vce(jackknife)** option when
allowed by the estimation command.
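
The two routes described above can be sketched as follows. This is a minimal illustration that assumes the auto dataset shipped with Stata; the label **ratio** is an arbitrary name chosen for this sketch.

```stata
sysuse auto, clear

* Coefficients: let the estimation command do the jackknifing.
regress mpg weight trunk, vce(jackknife)

* A function of coefficients: use the jackknife prefix instead.
* "ratio" is a hypothetical label for the expression.
jackknife ratio=(_b[weight]/_b[trunk]): regress mpg weight trunk
```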

**jknife** is a synonym for **jackknife**.

__Options__

+------+
----+ Main +-------------------------------------------------------------

**eclass**, **rclass**, and **n(***exp***)** specify where *command* stores the number of
observations on which it based the calculated results. We strongly
advise you to specify one of these options.

**eclass** specifies that *command* store the number of observations in
**e(N)**.

**rclass** specifies that *command* store the number of observations in
**r(N)**.

**n(***exp***)** specifies an expression that evaluates to the number of
observations used. Specifying **n(r(N))** is equivalent to specifying
the **rclass** option. Specifying **n(e(N))** is equivalent to specifying
the **eclass** option. If *command* stores the number of observations in
**r(N1)**, specify **n(r(N1))**.

If you specify none of these options, **jackknife** will assume **eclass** or **rclass**,
depending on which of **e(N)** and **r(N)** is not missing (in that order).
If both **e(N)** and **r(N)** are missing, **jackknife** assumes that all
observations in the dataset contribute to the calculated result. If
that assumption is incorrect, the reported standard errors will be
incorrect. For instance, say that you specify

**. jackknife coef=_b[x2]: myreg y x1 x2 x3**

where **myreg** uses **e(n)** instead of **e(N)** to identify the number of
observations used in calculations. Further assume that observation
42 in the dataset has **x3** equal to missing. The 42nd observation
plays no role in obtaining the estimates, but **jackknife** has no way of
knowing that and will use the wrong *N*. If, on the other hand, you
specify

**. jackknife coef=_b[x2], n(e(n)): myreg y x1 x2 x3**

**jackknife** will notice that observation 42 plays no role. The **n(e(n))**
option is specified because **myreg** is an estimation command but it
stores the number of observations used in **e(n)** (instead of the
standard **e(N)**). When **jackknife** runs the regression omitting the 42nd
observation, **jackknife** will observe that **e(n)** has the same value as
when **jackknife** previously ran the regression using all the
observations. Thus **jackknife** will know that **myreg** did not use the
observation.
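
For a command that follows the standard conventions, the same idea can be sketched with **summarize**, which is r-class and stores the number of observations in **r(N)**. The example assumes the auto dataset shipped with Stata; the label **sd** is chosen here for illustration.

```stata
sysuse auto, clear

* summarize stores the sample size in r(N), so rclass applies.
jackknife sd=r(sd), rclass: summarize mpg

* Equivalent: name the stored result explicitly with n().
jackknife sd=r(sd), n(r(N)): summarize mpg
```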

+---------+
----+ Options +----------------------------------------------------------

**cluster(***varlist***)** specifies the variables identifying sample clusters. If
**cluster()** is specified, one cluster is left out of each call to
*command*, instead of 1 observation.
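
A minimal sketch of a clustered jackknife follows. The auto dataset has no natural cluster variable, so the variable **grp** below is an artificial grouping created only for this illustration.

```stata
sysuse auto, clear

* Hypothetical cluster identifier: 10 artificial groups of cars.
generate byte grp = mod(_n, 10)

* Each replication omits one whole group rather than one car.
jackknife mean=r(mean), rclass cluster(grp): summarize mpg

* Number of clusters, which replaces the sample size as N.
display e(N_clust)
```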

**idcluster(***newvar***)** creates a new variable containing a unique integer
identifier for each resampled cluster, starting at **1** and leading up
to the number of clusters. This option may be specified only when
the **cluster()** option is specified. **idcluster()** helps identify the
cluster to which a pseudovalue belongs.

**saving(***filename* [**,** *suboptions*]**)** creates a Stata data file (**.dta** file)
consisting of (for each statistic in *exp_list*) a variable containing
the replicates.

See prefix_saving_option for details about *suboptions*.
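
As a rough illustration, the sketch below saves the replicates to a file and inspects it. The filename **jkmean** is hypothetical, and the sketch assumes the **replace** suboption described under prefix_saving_option.

```stata
sysuse auto, clear

* jkmean is a hypothetical filename; replace overwrites an existing file.
jackknife mean=r(mean), rclass saving(jkmean, replace): summarize mpg

* Inspect the saved replicates without disturbing the data in memory.
describe using jkmean
```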

**keep** specifies that new variables be added to the dataset containing the
pseudovalues of the requested statistics. See **[R] jackknife** for
details. When the **cluster()** option is specified, each cluster is
given at most one nonmissing pseudovalue. The **keep** option implies
the **nodrop** option.
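
A short sketch of **keep**, assuming the auto dataset shipped with Stata. Rather than guessing the names of the new variables, it reads them from **e(pseudo)**.

```stata
sysuse auto, clear

* keep adds one pseudovalue variable per statistic to the dataset.
jackknife sd=r(sd), rclass keep: summarize mpg

* e(pseudo) holds the names of the new pseudovalue variables.
display "`e(pseudo)'"
summarize `e(pseudo)'
```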

**mse** specifies that **jackknife** compute the variance by using deviations of
the replicates from the observed value of the statistics based on the
entire dataset. By default, **jackknife** computes the variance by using
deviations of the pseudovalues from their mean.
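
One way to see the effect of **mse** is to run the same jackknife with and without it and compare the stored variances. The sketch assumes the auto dataset; the matrix names **V_default** and **V_mse** are chosen here for illustration.

```stata
sysuse auto, clear

* Default: deviations of the pseudovalues from their mean.
jackknife sd=r(sd), rclass: summarize mpg
matrix V_default = e(V)

* mse: deviations of the replicates from the full-sample statistic.
jackknife sd=r(sd), rclass mse: summarize mpg
matrix V_mse = e(V)

matrix list V_default
matrix list V_mse
```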

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**; see **[R] estimation options**.

**notable** suppresses the display of the table of results.

**noheader** suppresses the display of the table header. This option implies
**nolegend**.

**nolegend** suppresses the display of the table legend. The table legend
identifies the rows of the table with the expressions they represent.

**verbose** specifies that the full table legend be displayed. By default,
the legend does not list coefficients and standard errors.

**nodots** suppresses display of the replication dots. By default, one dot
character is displayed for each successful replication. A red 'x' is
displayed if *command* returns an error or if one of the values in
*exp_list* is missing.

**dots(***#***)** displays dots every *#* replications. **dots(0)** is a synonym for
**nodots**.

**noisily** specifies that any output from *command* be displayed. This option
implies the **nodots** option.

**trace** causes a trace of the execution of *command* to be displayed. This
option implies the **noisily** option.

**title(***text***)** specifies a title to be displayed above the table of
jackknife results; the default title is **Jackknife results** or what is
produced in **e(title)** by an estimation command.

*display_options*: **noci**, __nopv__**alues**, __noomit__**ted**, **vsquish**, __noempty__**cells**,
__base__**levels**, __allbase__**levels**, __nofvlab__**el**, **fvwrap(***#***)**, **fvwrapon(***style***)**,
**cformat(%***fmt***)**, **pformat(%***fmt***)**, **sformat(%***fmt***)**, and **nolstretch**; see **[R]**
**estimation options**.

*eform_option* causes the coefficient table to be displayed in
exponentiated form; see **[R]** *eform_option*. *command* determines which
*eform_option* is allowed (**eform(***string***)** and **eform** are always allowed).

+----------+
----+ Advanced +---------------------------------------------------------

**nodrop** prevents observations outside **e(sample)** and the **if** and **in**
qualifiers from being dropped before the data are resampled.

**reject(***exp***)** identifies an expression that indicates when results should
be rejected. When *exp* is true, the resulting values are reset to
missing values.
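
As a hedged sketch, **reject()** might be used to discard replications in which a maximum likelihood fit did not converge. The example assumes the auto dataset and assumes that **logit** reports convergence in **e(converged)**; the rejection rule itself is illustrative.

```stata
sysuse auto, clear

* Assumption: logit stores 1 in e(converged) when the fit converged.
* Replications where the expression is true are set to missing.
jackknife _b, eclass reject(e(converged) != 1): logit foreign mpg weight
```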

The following option is available with **jackknife** but is not shown in the
dialog box:

**coeflegend**; see **[R] estimation options**.

__Remarks__

Typing

**. jackknife** *exp_list***:** *command*

executes *command* once for each observation in the dataset, leaving the
associated observation out of the calculations that make up *exp_list*.

*command* defines the statistical command to be executed. Most Stata
commands and user-written programs can be used with **jackknife**, as long as
they follow standard Stata syntax and allow the **if** qualifier; see **[U] 11**
**Language syntax**. The **by** prefix may not be part of *command*.

*exp_list* specifies the statistics to be collected from the execution of
*command*. If *command* changes the contents in **e(b)**, *exp_list* is optional
and defaults to **_b**.

When the **cluster()** option is given, clusters are omitted instead of
observations, and N is the number of clusters instead of the sample size.

__Examples__

Setup
**. sysuse auto**

Jackknifed standard error of the sample mean
**. jackknife r(mean): summarize mpg**

Jackknifed standard errors of the coefficients from a regression
**. jackknife: regress mpg weight trunk**

__Stored results__

**jackknife** stores the following in **e()**:

Scalars
**e(N)** sample size
**e(N_reps)** number of complete replications
**e(N_misreps)** number of incomplete replications
**e(N_clust)** number of clusters
**e(k_eq)** number of equations in **e(b)**
**e(k_extra)** number of extra equations
**e(k_exp)** number of expressions
**e(k_eexp)** number of extended expressions (**_b** or **_se**)
**e(df_r)** degrees of freedom

Macros
**e(cmdname)** command name from *command*
**e(cmd)** same as **e(cmdname)** or **jackknife**
**e(command)** *command*
**e(cmdline)** command as typed
**e(prefix)** **jackknife**
**e(wtype)** weight type
**e(wexp)** weight expression
**e(title)** title in estimation output
**e(cluster)** cluster variables
**e(pseudo)** new variables containing pseudovalues
**e(nfunction)** **e(N)**, **r(N)**, **n()** option, or empty
**e(exp***#***)** expression for the *#*th statistic
**e(mse)** from **mse** option
**e(vce)** **jackknife**
**e(vcetype)** title used to label Std. Err.
**e(properties)** **b V**

Matrices
**e(b)** observed statistics
**e(b_jk)** jackknife estimates
**e(V)** jackknife variance-covariance matrix
**e(V_modelbased)** model-based variance

When *exp_list* is **_b**, **jackknife** will also carry forward most of the
results already in **e()** from *command*.
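
A brief sketch of how some of the stored results listed above might be inspected after estimation, assuming the auto dataset shipped with Stata:

```stata
sysuse auto, clear
jackknife mean=r(mean), rclass: summarize mpg

* Scalars and macros
display e(N_reps)
display "`e(prefix)'"

* Matrices: observed statistic, jackknife estimate, and variance
matrix list e(b)
matrix list e(b_jk)
matrix list e(V)
```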