Stata 15 help for predictnl

[R] predictnl -- Obtain nonlinear predictions, standard errors, etc., after estimation


predictnl [type] newvar = pnl_exp [if] [in] [, options]

    options               Description
    -------------------------------------------------------------------------
    Main
      se(newvar)          create newvar containing standard errors
      variance(newvar)    create newvar containing variances
      wald(newvar)        create newvar containing the Wald test statistic
      p(newvar)           create newvar containing the p-value for the Wald
                            test
      ci(newvars)         create newvars containing lower and upper
                            confidence intervals
      level(#)            set confidence level; default is level(95)
      g(stub)             create stub1, stub2, ..., stubk variables
                            containing observation-specific derivatives

    Advanced
      iterate(#)          maximum iterations for finding optimal step size;
                            default is 100
      force               calculate standard errors, etc., even when
                            possibly inappropriate

      df(#)               use F distribution with # denominator degrees of
                            freedom for the reference distribution of the
                            test statistic
    -------------------------------------------------------------------------
    df(#) does not appear in the dialog box.


Statistics > Postestimation


predictnl calculates (possibly) nonlinear predictions after any Stata estimation command and optionally calculates the variances, standard errors, Wald test statistics, p-values, and confidence limits for these predictions. Unlike its companion nonlinear postestimation commands testnl and nlcom, predictnl generates functions of the data (that is, predictions), not scalars. The quantities generated by predictnl are thus vectorized over the observations in the data.

Consider some general prediction, g(theta, x_i), for i = 1, ..., n, where theta are the model parameters and x_i are some data for the ith observation; x_i is assumed fixed. Typically, g(theta, x_i) is estimated by g(theta-hat, x_i), where theta-hat are the estimated model parameters, which are stored in e(b) following any Stata estimation command.

In its most common use, predictnl generates two variables: one containing the estimated prediction, g(theta-hat, x_i), the other containing the estimated standard error of g(theta-hat, x_i). The calculation of standard errors (and other obtainable quantities that are based on the standard errors, such as test statistics) is based on the delta method, an approximation appropriate in large samples; see Methods and formulas in [R] predictnl.
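As a rough sketch of the delta-method calculation mentioned above, the formulas below use notation introduced here for illustration only: G_i for the row vector of derivatives of g with respect to theta, evaluated at theta-hat, and V-hat for the estimated variance-covariance matrix of theta-hat (stored in e(V)). The exact expressions predictnl uses are given in Methods and formulas in [R] predictnl.

```latex
\[
\widehat{\operatorname{Var}}\{g(\hat\theta, x_i)\} \approx G_i \,\widehat{V}\, G_i',
\qquad
G_i = \left.\frac{\partial g(\theta, x_i)}{\partial \theta}\right|_{\theta=\hat\theta}
\]
\[
\widehat{\operatorname{se}}\{g(\hat\theta, x_i)\} = \sqrt{G_i \,\widehat{V}\, G_i'},
\qquad
W_i = \frac{g(\hat\theta, x_i)^2}{G_i \,\widehat{V}\, G_i'}
\]
```

Here W_i is the Wald statistic for H0: g(theta, x_i) = 0. Because x_i is treated as fixed, all of the sampling variability comes from theta-hat.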

predictnl can be used with svy estimation results (assuming that predict is also allowed); see [SVY] svy postestimation.

The specification of g(theta-hat, x_i) is handled by specifying pnl_exp, and the values of g(theta-hat, x_i) are stored in the new variable newvar of storage type type. pnl_exp is any valid Stata expression and may also contain calls to two special functions unique to predictnl:

1. predict([predict_options]): When you are evaluating pnl_exp, predict() is a convenience function that replicates the calculation performed by the command

predict ..., predict_options

As such, the predict() function may be used either as a shorthand for the formula used to make this prediction or when the formula is not readily available. When used without arguments, predict() replicates the default prediction for that particular estimation command.

2. xb([eqno]): The xb() function replicates the calculation of the linear predictor x_j*b for equation eqno. If xb() is specified without eqno, the linear predictor for the first equation (or the only equation in single-equation estimation) is obtained.

For example, xb(#1) (or equivalently, xb() with no arguments) translates to the linear predictor for the first equation, xb(#2) for the second, and so on. You could also refer to the equations by their names, such as xb(income).

When specifying pnl_exp, both of these functions may be used repeatedly, in combination, and in combination with other Stata functions and expressions. See Remarks and examples in [R] predictnl for examples that use both of these functions.
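The following is a minimal sketch of how the two special functions can stand in for each other. It assumes the lbw dataset and probit model from the examples in this help file; the variable names p_xb, p_pr, p_xb_se, and p_pr_se are made up for illustration.

```stata
* Assumed setup: the lbw data and probit model from the examples in this help file
webuse lbw, clear
probit low lwt smoke ptl ht

* Predicted probability written with xb(): normal() applied to the linear predictor
predictnl double p_xb = normal(xb()), se(p_xb_se)

* The same quantity requested through predict(); pr is the probability option of predict after probit
predictnl double p_pr = predict(pr), se(p_pr_se)

* The two sets of predictions and standard errors should agree up to numerical precision
summarize p_xb p_pr p_xb_se p_pr_se
```

Using xb() keeps the formula explicit, whereas predict() is convenient when the formula behind a prediction is long or not readily available.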


        +------+
    ----+ Main +-------------------------------------------------------------

se(newvar) adds newvar of storage type type, where for each i in the prediction sample, newvar[i] contains the estimated standard error of pnl_exp[i].

variance(newvar) adds newvar of storage type type, where for each i in the prediction sample, newvar[i] contains the estimated variance of pnl_exp[i].

wald(newvar) adds newvar of storage type type, where for each i in the prediction sample, newvar[i] contains the Wald test statistic for the test of the hypothesis H0:pnl_exp[i]=0.

p(newvar) adds newvar of storage type type, where newvar[i] contains the p-value for the Wald test H0:pnl_exp[i]=0 versus the two-sided alternative.

ci(newvars) requires the specification of two newvars, such that the ith observation of each will contain the left and right endpoints (respectively) of a confidence interval for pnl_exp[i]. The level of the confidence intervals is determined by level(#).

level(#) specifies the confidence level, as a percentage, for confidence intervals. The default is level(95) or as set by set level.

g(stub) specifies that new variables, stub1, stub2, ..., stubk be created, where k is the dimension of e(b). stub1 will contain the observation-specific derivatives of pnl_exp with respect to the first element listed in e(b); stub2 will contain the derivatives of pnl_exp with respect to the second element listed in e(b); etc. If the derivative of pnl_exp with respect to a particular coefficient in e(b) equals zero for all observations in the prediction sample, the stub variable for that coefficient is not created. The ordering of the parameters in e(b) is precisely that of the stored vector of parameter estimates e(b).
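As an illustration of how the Main options fit together, here is a hedged sketch that again assumes the lbw data and probit model from the examples in this help file. All new variable names (xbhat, xb_se, xb_w, xb_p, xb_lb, xb_ub, w_check, phat, phat_se, dphat) are hypothetical.

```stata
* Assumed setup: the lbw data and probit model from the examples in this help file
webuse lbw, clear
probit low lwt smoke ptl ht

* Linear predictor with its standard error, Wald statistic, p-value, and 90% confidence limits
predictnl double xbhat = xb(), se(xb_se) wald(xb_w) p(xb_p) ci(xb_lb xb_ub) level(90)

* For a single prediction, the Wald statistic for H0: pnl_exp[i]=0 is expected to
* equal the squared ratio of the prediction to its standard error
generate double w_check = (xbhat/xb_se)^2
summarize xb_w w_check

* Observation-specific derivatives of the predicted probability with respect to
* each element of e(b); creates dphat1, dphat2, ...
predictnl double phat = predict(pr), se(phat_se) g(dphat)
describe dphat*
```

Comparing xb_w with w_check is a quick sanity check on the relationship between se() and wald(); it is not part of the command's required workflow.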

        +----------+
    ----+ Advanced +---------------------------------------------------------

iterate(#) specifies the maximum number of iterations used to find the optimal step size in the calculation of numerical derivatives of pnl_exp with respect to estimated model coefficients. By default, the maximum number of iterations is 100, but convergence is usually achieved after only a few iterations. You should rarely have to use this option.

force forces the calculation of standard errors and other inference-related quantities in situations where predictnl would otherwise refuse to do so. The calculation of standard errors takes place by evaluating the numerical derivative of pnl_exp with respect to the coefficient vector e(b). If predictnl detects that pnl_exp is possibly a function of random quantities other than e(b), it will refuse to calculate standard errors or any other quantity derived from them. The force option forces the calculation to take place anyway. If you use the force option, there is no guarantee that any inference quantities (for example, standard errors) will be correct or that the values obtained can be interpreted.

The following option is available with predictnl but is not shown in the dialog box:

df(#) specifies that the F distribution with # denominator degrees of freedom be used for the reference distribution of the test statistic.

Remark on the manipulability of nonlinear Wald tests

In contrast to likelihood-ratio tests, different -- mathematically equivalent -- formulations of a hypothesis may lead to different results for a nonlinear Wald test (lack of invariance). For instance, the two hypotheses

H0: pnl_exp[i] = 0

H0: exp(pnl_exp[i]) - 1 = 0

are mathematically equivalent expressions but do not yield the same test statistic and p-value. In extreme cases, under one formulation, one would reject H0, whereas under an equivalent formulation one would not reject H0.
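To see where the discrepancy comes from, the sketch below applies the delta method to both formulations. The notation is introduced here for illustration: g denotes pnl_exp[i] evaluated at the estimates, and s^2 denotes its delta-method variance.

```latex
\[
W_1 = \frac{g^2}{s^2},
\qquad
W_2 = \frac{(e^{g} - 1)^2}{\left(e^{g}\right)^2 s^2}
    = \frac{(1 - e^{-g})^2}{s^2}
\]
\[
\frac{W_2}{W_1} = \left(\frac{1 - e^{-g}}{g}\right)^{2}
\]
```

The ratio tends to 1 only as g approaches 0, so the two statistics, and hence their p-values, generally differ whenever the estimate is away from the hypothesized value.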

Remark on the use of the functions predict() and xb()

When calculating inference-related quantities such as standard errors, pnl_exp is evaluated repeatedly for different values of the model parameters. Therefore, think of predict() and xb() as a means of substituting for the formula of the calculation and not a means of substituting the value of the calculation that is obtained when the model parameters are set to any specific values. For example,

. predict double pred_var, predict_options
. predictnl newvar = pred_var, se(newvar_se)

will give standard errors (newvar_se) equal to zero, since once evaluated, pred_var will contain values that are fixed with respect to e(b). Instead,

. predictnl newvar = predict(predict_options), se(newvar_se)

will produce what is intended.


---------------------------------------------------------------------------
Setup
    . webuse lbw

Fit maximum-likelihood probit model
    . probit low lwt smoke ptl ht

Compute predictions and their standard errors
    . predictnl phat = normal(_b[_cons] + _b[ht]*ht + _b[ptl]*ptl + _b[smoke]*smoke + _b[lwt]*lwt), se(phat_se)

---------------------------------------------------------------------------
Setup
    . webuse sysdsn1, clear

Fit maximum-likelihood multinomial logit model
    . mlogit insure age male nonwhite

Compute observation-specific relative risks of selecting a prepaid plan over an indemnity plan (with standard errors)
    . predictnl RRppaid = exp(xb(Prepaid)), se(SERRppaid)

Same command as above
    . predictnl RRppaid2 = exp(xb(#1)), se(SERRppaid2)

Calculate relative risk directly as ratio of two predicted probabilities
    . predictnl RRppaid3 = predict(outcome(Prepaid))/predict(outcome(Indemnity)), se(SERRppaid3)

For each observation, test whether the relative risk of choosing a prepaid plan over an indemnity plan is different from one
    . predictnl RRm1 = exp(xb(Prepaid)) - 1, wald(W_RRm1) p(sig_RRm1)

---------------------------------------------------------------------------
Setup
    . webuse drugtr, clear

Fit parametric survival model
    . streg drug age, dist(weibull)

Calculate predicted mean survival times and their standard errors
    . predictnl t_hat = predict(mean time), se(t_hat_se)
---------------------------------------------------------------------------

© Copyright 1996–2018 StataCorp LLC