**[R] predictnl** -- Obtain nonlinear predictions, standard errors, etc., after
estimation

__Syntax__

**predictnl** [*type*] *newvar* = *pnl_exp* [*if*] [*in*] [**,** *options*]

*options* Description
-------------------------------------------------------------------------
Main
**se(***newvar***)** create *newvar* containing standard errors
__var__**iance(***newvar***)** create *newvar* containing variances
__w__**ald(***newvar***)** create *newvar* containing the Wald test statistic
**p(***newvar***)** create *newvar* containing the p-value for the Wald
test
**ci(***newvars***)** create *newvars* containing lower and upper
confidence limits
__l__**evel(***#***)** set confidence level; default is **level(95)**
**g(***stub***)** create *stub***1**, *stub***2**, ..., *stub***k** variables
containing observation-specific derivatives

Advanced
__iter__**ate(***#***)** maximum iterations for finding optimal step size;
default is 100
**force** calculate standard errors, etc., even when possibly
inappropriate

**df(***#***)** use F distribution with *#* denominator degrees of
freedom for the reference distribution of the
test statistic
-------------------------------------------------------------------------
**df(***#***)** does not appear in the dialog box.

__Menu__

**Statistics > Postestimation**

__Description__

**predictnl** calculates (possibly) nonlinear predictions after any Stata
estimation command and optionally calculates the variances, standard
errors, Wald test statistics, p-values, and confidence limits for these
predictions. Unlike its companion nonlinear postestimation commands
**testnl** and **nlcom**, **predictnl** generates functions of the data (that is,
predictions), not scalars. The quantities generated by **predictnl** are
thus vectorized over the observations in the data.

Consider some general prediction, g(theta, x_i), for i = 1, ..., n, where
theta are the model parameters and x_i are some data for the ith
observation; x_i is assumed fixed. Typically, g(theta, x_i) is estimated
by g(theta-hat, x_i), where theta-hat are the estimated model
parameters, which are stored in **e(b)** following any Stata estimation
command.

In its most common use, **predictnl** generates two variables: one
containing the estimated prediction, g(theta-hat, x_i), the other
containing the estimated standard error of g(theta-hat, x_i). The
calculation of standard errors (and other obtainable quantities that are
based on the standard errors, such as test statistics) is based on the
delta method, an approximation appropriate in large samples; see *Methods*
*and formulas* in **[R] predictnl**.

**predictnl** can be used with **svy** estimation results (assuming that **predict**
is also allowed); see **[SVY] svy postestimation**.

The specification of g(theta-hat, x_i) is handled by specifying *pnl_exp*,
and the values of g(theta-hat, x_i) are stored in the new variable *newvar*
of storage type *type*. *pnl_exp* is any valid Stata expression and may also
contain calls to two special functions unique to **predictnl**:

1. **predict(**[*predict_options*]**)**: When you are evaluating *pnl_exp*,
**predict()** is a convenience function that replicates the
calculation performed by the command

**predict ...,** *predict_options*

As such, the **predict()** function may be used either as a shorthand
for the formula used to make this prediction or when the
formula is not readily available. When used without
arguments, **predict()** replicates the default prediction for
that particular estimation command.

2. **xb(**[*eqno*]**)**: The **xb()** function replicates the calculation of
the linear predictor x_i*b for equation *eqno*. If **xb()** is
specified without *eqno*, the linear predictor for the first
equation (or the only equation in single-equation estimation)
is obtained.

For example, **xb(#1)** (or equivalently, **xb()** with no arguments)
translates to the linear predictor for the first equation,
**xb(#2)** for the second, and so on. You could also refer to the
equations by their names, such as **xb(income)**.

When specifying *pnl_exp*, both of these functions may be used
repeatedly, in combination, and in combination with other
Stata functions and expressions. See *Remarks and examples* in
**[R] predictnl** for examples that use both of these functions.
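
As a minimal sketch of how the two functions relate, the commands below
fit the probit model from the *Examples* section and compute the same
predicted probability twice: once with **predict()** and once by applying
**normal()** to **xb()**. The variable names (*p_pred*, *p_xb*, and so on) are
illustrative only.

```stata
webuse lbw, clear
probit low lwt smoke ptl ht

* Default prediction after probit: probability of a positive outcome
predictnl double p_pred = predict(), se(p_pred_se)

* The same probability written out from the linear predictor
predictnl double p_xb = normal(xb()), se(p_xb_se)

* The two versions should agree up to numerical precision
summarize p_pred p_xb p_pred_se p_xb_se
```

Because both expressions are functions of e(b), standard errors are
available for either form.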

__Options__

+------+
----+ Main +-------------------------------------------------------------

**se(***newvar***)** adds *newvar* of storage type *type*, where for each **i** in the
prediction sample, *newvar***[i]** contains the estimated standard error of
*pnl_exp***[i]**.

**variance(***newvar***)** adds *newvar* of storage type *type*, where for each **i** in
the prediction sample, *newvar***[i]** contains the estimated variance of
*pnl_exp***[i]**.

**wald(***newvar***)** adds *newvar* of storage type *type*, where for each **i** in the
prediction sample, *newvar***[i]** contains the Wald test statistic for the
test of the hypothesis H0:*pnl_exp***[i]**=0.

**p(***newvar***)** adds *newvar* of storage type *type*, where *newvar***[i]** contains the
p-value for the Wald test H0:*pnl_exp***[i]**=0 versus the two-sided
alternative.

**ci(***newvars***)** requires the specification of two *newvars*, such that the *i*th
observation of each will contain the left and right endpoints
(respectively) of a confidence interval for *pnl_exp***[i]**. The level of
the confidence intervals is determined by **level(***#***)**.

**level(***#***)** specifies the confidence level, as a percentage, for confidence
intervals. The default is **level(95)** or as set by **set level**.
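
The options above are all derived from the same estimated standard error.
The sketch below, which reuses the probit model from the *Examples* section,
requests them together and then rebuilds each one by hand. It assumes the
default normal-based case (no **df()** option), where the Wald statistic is
compared with a chi-squared distribution with 1 degree of freedom; the
*_check* variable names are illustrative only.

```stata
webuse lbw, clear
probit low lwt smoke ptl ht

* Request every inference quantity in one call, with 90% intervals
predictnl double phat = predict(), se(phat_se) variance(phat_v) ///
    wald(phat_w) p(phat_p) ci(phat_lb phat_ub) level(90)

* Rebuild each quantity from the prediction and its standard error
generate double v_check  = phat_se^2
generate double w_check  = (phat/phat_se)^2
generate double p_check  = chi2tail(1, phat_w)
generate double lb_check = phat - invnormal(0.95)*phat_se
generate double ub_check = phat + invnormal(0.95)*phat_se

summarize phat_v v_check phat_w w_check phat_p p_check
```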

**g(***stub***)** specifies that new variables, *stub***1**, *stub***2**, ..., *stub***k** be
created, where **k** is the dimension of e(b). *stub***1** will contain the
observation-specific derivatives of *pnl_exp* with respect to the first
element listed in e(b); *stub***2** will contain the derivatives of *pnl_exp*
with respect to the second element of e(b); etc. If the derivative of
*pnl_exp* with respect to a particular coefficient in e(b) equals zero
for all observations in the prediction sample, the *stub* variable for
that coefficient is not created. The derivatives are ordered exactly
as the parameters appear in the stored vector of parameter estimates
e(b).
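
The derivatives saved by **g()** are the ingredients of the delta-method
variance. As a sketch, the commands below reuse the probit model from the
*Examples* section, save the five derivatives (one per coefficient,
including the constant), and combine them with e(V) to rebuild the
standard error by hand. The names *d*, *v_manual*, and *se_manual* are
illustrative, and the loop bounds assume that all five derivative
variables were created.

```stata
webuse lbw, clear
probit low lwt smoke ptl ht

* Save the prediction, its standard error, and derivatives d1-d5
predictnl double phat = predict(), se(phat_se) g(d)

* Delta method by hand: variance = G*V*G' evaluated per observation
matrix V = e(V)
generate double v_manual = 0
forvalues j = 1/5 {
    forvalues k = 1/5 {
        quietly replace v_manual = v_manual + d`j' * V[`j',`k'] * d`k'
    }
}
generate double se_manual = sqrt(v_manual)

* The hand-built standard error should match the se() variable
summarize phat_se se_manual
```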

+----------+
----+ Advanced +---------------------------------------------------------

**iterate(***#***)** specifies the maximum number of iterations used to find the
optimal step size in the calculation of numerical derivatives of
*pnl_exp* with respect to estimated model coefficients. By default,
the maximum number of iterations is 100, but convergence is usually
achieved after only a few iterations. You should rarely have to use
this option.

**force** forces the calculation of standard errors and other
inference-related quantities in situations where **predictnl** would
otherwise refuse to do so. The calculation of standard errors takes
place by evaluating the numerical derivative of *pnl_exp* with respect
to the coefficient vector e(b). If **predictnl** detects that *pnl_exp* is
possibly a function of random quantities other than e(b), it will
refuse to calculate standard errors or any other quantity derived
from them. The **force** option forces the calculation to take place
anyway. If you use the **force** option, there is no guarantee that any
inference quantities (for example, standard errors) will be correct
or that the values obtained can be interpreted.

The following option is available with **predictnl** but is not shown in the
dialog box:

**df(***#***)** specifies that the F distribution with *#* denominator degrees of
freedom be used for the reference distribution of the test statistic.

__Remark on the manipulability of nonlinear Wald tests__

In contrast to likelihood-ratio tests, different -- mathematically
equivalent -- formulations of a hypothesis may lead to different results
for a nonlinear Wald test (lack of invariance). For instance, the two
hypotheses

H0: *pnl_exp***[i]** = 0

H0: exp(*pnl_exp***[i]**) - 1 = 0

are mathematically equivalent expressions but do not yield the same test
statistic and p-value. In extreme cases, under one formulation, one would
reject H0, whereas under an equivalent formulation one would not reject
H0.
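
The lack of invariance can be seen directly. In this sketch, which reuses
the probit model from the *Examples* section, the linear predictor plays
the role of *pnl_exp*, and the two equivalent hypotheses are tested
observation by observation. The variable names are illustrative only.

```stata
webuse lbw, clear
probit low lwt smoke ptl ht

* H0: xb = 0
predictnl double h_lin = xb(), wald(W_lin) p(p_lin)

* H0: exp(xb) - 1 = 0, which holds exactly when xb = 0
predictnl double h_exp = exp(xb()) - 1, wald(W_exp) p(p_exp)

* The test statistics and p-values differ between the two formulations
summarize W_lin W_exp p_lin p_exp
```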

__Remark on the use of the functions predict() and xb()__

When calculating inference-related quantities such as standard errors,
*pnl_exp* is evaluated repeatedly for different values of the model
parameters. Therefore, think of **predict()** and **xb()** as a means of
substituting for the *formula* of the calculation and not a means of
substituting the value of the calculation that is obtained when the model
parameters are set to any specific values. For example,

**. predict double ***pred_var, predict_options*
**. predictnl ***newvar*** = ***pred_var***, se(***newvar_se***)**

will give standard errors (*newvar_se*) equal to zero, since once
evaluated, *pred_var* will contain values that are fixed with respect to
e(b). Instead,

**. predictnl ***newvar*** = predict(***predict_options***), se(***newvar_se***)**

will produce what is intended.

__Examples__

---------------------------------------------------------------------------
Setup
**. webuse lbw**

Fit maximum-likelihood probit model
**. probit low lwt smoke ptl ht**

Compute predictions and their standard errors
**. predictnl phat = normal(_b[_cons] + _b[ht]*ht + _b[ptl]*ptl +**
**_b[smoke]*smoke + _b[lwt]*lwt), se(phat_se)**

---------------------------------------------------------------------------
Setup
**. webuse sysdsn1, clear**

Fit maximum-likelihood multinomial logit model
**. mlogit insure age male nonwhite i.site**

Compute observation-specific relative risks of selecting a prepaid plan
over an indemnity plan (with standard errors)
**. predictnl RRppaid = exp(xb(Prepaid)), se(SERRppaid)**

Same command as above
**. predictnl RRppaid2 = exp(xb(#1)), se(SERRppaid2)**

Calculate relative risk directly as ratio of two predicted probabilities
**. predictnl RRppaid3 =**
**predict(outcome(Prepaid))/predict(outcome(Indemnity)),**
**se(SERRppaid3)**

For each observation, test whether the relative risk of choosing a
prepaid plan over an indemnity plan is different from one
**. predictnl RRm1 = exp(xb(Prepaid)) - 1, wald(W_RRm1) p(sig_RRm1)**

---------------------------------------------------------------------------
Setup
**. webuse drugtr, clear**

Fit parametric survival model
**. streg drug age, dist(weibull)**

Calculate predicted mean survival times and their standard errors
**. predictnl t_hat = predict(mean time), se(t_hat_se)**
---------------------------------------------------------------------------