Stata 15 help for predict

[R] predict -- Obtain predictions, residuals, etc., after estimation

Syntax

After single-equation (SE) models

predict [type] newvar [if] [in] [, single_options]

After multiple-equation (ME) models

predict [type] newvar [if] [in] [, multiple_options]

predict [type] {stub*|newvar1 ... newvarq} [if] [in] , scores

single_options      Description
-------------------------------------------------------------------------
Main
  xb                calculate linear prediction
  stdp              calculate standard error of the prediction
  score             calculate first derivative of the log likelihood
                      with respect to xb

Options
  nooffset          ignore any offset() or exposure() variable
  other_options     command-specific options
-------------------------------------------------------------------------

multiple_options        Description
-------------------------------------------------------------------------
Main
  equation(eqno[,eqno]) specify equations
  xb                    calculate linear prediction
  stdp                  calculate standard error of the prediction
  stddp                 calculate standard error of the difference in
                          linear predictions

Options
  nooffset              ignore any offset() or exposure() variable
  other_options         command-specific options
-------------------------------------------------------------------------

Menu for predict

Statistics > Postestimation

Description

predict calculates predictions, residuals, influence statistics, and the like after estimation. Exactly what predict can do is determined by the previous estimation command; command-specific options are documented with each estimation command. Regardless of command-specific options, the actions of predict share certain similarities across estimation commands:

1. predict newvar creates newvar containing "predicted values" -- numbers related to E(y|x). For instance, after linear regression, predict newvar stores xb in newvar; after probit, it stores the probability F(xb).

2. predict newvar, xb creates newvar containing xb. This may be the same result as option 1 (for example, linear regression) or different (for example, probit), but regardless, option xb is allowed.

3. predict newvar, stdp creates newvar containing the standard error of the linear prediction xb.

4. predict newvar, other_options may create newvar containing other useful quantities; see help or the reference manual entry for the particular estimation command to find out about other available options.

5. nooffset added to any of the above commands requests that the calculation ignore any offset or exposure variable specified by including the offset(varname_o) or exposure(varname_e) option when you fit the model.
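Items 1 through 3 can be sketched as follows. This is a minimal illustration, not part of the original help text: it assumes the auto dataset shipped with Stata and a probit model chosen only for demonstration, and the variable names p, idx, se, and p_check are hypothetical.

```stata
* Assumed setup: auto dataset and an arbitrary probit model
sysuse auto, clear
probit foreign mpg weight

* Item 1: default prediction, the probability F(xb) after probit
predict double p

* Item 2: the linear prediction xb
predict double idx, xb

* Item 3: standard error of the linear prediction
predict double se, stdp

* After probit, F() is the standard normal CDF, so p should equal normal(xb)
generate double p_check = normal(idx)
assert reldif(p, p_check) < 1e-6
```

After linear regression the same comparison would show the default prediction and the xb prediction to be identical.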

predict can be used to make in-sample or out-of-sample predictions:

6. predict calculates the requested statistic for all possible observations, whether they were used in fitting the model or not. predict does this for the standard options (items 1 through 3) and generally does this for the estimator-specific options (item 4).

7. predict newvar if e(sample), ... restricts the prediction to the estimation subsample.

8. Some statistics make sense only with respect to the estimation subsample. In such cases, the calculation is automatically restricted to the estimation subsample, and the documentation for the specific option states this. Even so, you can still specify if e(sample) if you are uncertain.

9. predict can make out-of-sample predictions even using other datasets. In particular, you can

    . use ds1
      (fit a model)
    . use two                  /* another dataset */
    . predict yhat, ...        /* fill in the predictions */
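The difference between items 6 and 7 can be seen in a short sketch. It assumes the auto dataset shipped with Stata; the model and the variable names yhat_all and yhat_in are hypothetical choices made for illustration.

```stata
* Assumed setup: fit the model on the foreign cars only
sysuse auto, clear
regress mpg weight if foreign

* Item 6: prediction for every observation, in or out of the estimation sample
predict double yhat_all

* Item 7: prediction restricted to the estimation subsample
predict double yhat_in if e(sample)

* yhat_in is missing outside e(sample), so the two counts differ
count if !missing(yhat_all)
count if !missing(yhat_in)

* Within the estimation subsample the two predictions coincide
assert yhat_all == yhat_in if e(sample)
```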

Options

        +------+
    ----+ Main +-------------------------------------------------------------

xb calculates the linear prediction from the fitted model. That is, all models can be thought of as estimating a set of parameters b1, b2, ..., bk, and the linear prediction is y = b1*x1 + b2*x2 + ... + bk*xk, written compactly as y = xb. For linear regression, the values y are called the predicted values or, for out-of-sample predictions, the forecast. For logit and probit, for example, y is called the logit or probit index.

x1, x2, ..., xk are obtained from the data currently in memory and do not necessarily correspond to the data on the independent variables used to fit the model (that is, to obtain b1, b2, ..., bk).
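To make the definition of xb concrete, the sketch below rebuilds the linear prediction by hand from the stored coefficients and compares it with what predict returns. It assumes the auto dataset shipped with Stata; the model and the names xb_hat and xb_manual are hypothetical.

```stata
* Assumed setup: a simple linear regression on the auto dataset
sysuse auto, clear
regress mpg weight foreign

* Linear prediction as computed by predict
predict double xb_hat, xb

* The same quantity built from the coefficients and the data in memory
generate double xb_manual = _b[_cons] + _b[weight]*weight + _b[foreign]*foreign

* The two should agree up to rounding error
assert reldif(xb_hat, xb_manual) < 1e-6
```

Because the x values come from the data in memory, running the generate line after loading a different dataset with the same variable names would yield out-of-sample values.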

stdp calculates the standard error of the linear prediction. Here the prediction means the same thing as the "index", namely, xb. The statistic produced by stdp can be thought of as the standard error of the predicted expected value, or mean index, for the observation's covariate pattern. The standard error of the prediction is also commonly referred to as the standard error of the fitted value. The calculation can be made in or out of sample.
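For a model with one regressor and a constant, the standard error of the linear prediction can be reproduced from the coefficient covariance matrix as sqrt(x V x'). The sketch below is an illustration under that assumption; it uses the auto dataset shipped with Stata, and the names se_hat, se_manual, and V are hypothetical.

```stata
* Assumed setup: regression with a single covariate plus constant
sysuse auto, clear
regress mpg weight

* Standard error of the linear prediction as computed by predict
predict double se_hat, stdp

* e(V) is ordered (weight, _cons); expand x V x' with x = (weight, 1)
matrix V = e(V)
generate double se_manual = sqrt(V[1,1]*weight^2 + 2*V[1,2]*weight + V[2,2])

* The two should agree up to rounding error
assert reldif(se_hat, se_manual) < 1e-6
```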

stddp is allowed only after you have previously fit a multiple-equation model. The standard error of the difference in linear predictions between two equations is calculated. This option requires that equation(eqno1,eqno2) be specified.

score calculates the equation-level score; this is usually the derivative of the log likelihood with respect to the linear prediction.
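For the logit model, the derivative of the log likelihood with respect to xb is the outcome minus the predicted probability, so the score can be checked directly. The sketch below assumes the auto dataset shipped with Stata; the model and the names sc and p are hypothetical.

```stata
* Assumed setup: a logit model on the auto dataset
sysuse auto, clear
logit foreign mpg weight

* Equation-level score and predicted probability
predict double sc, score
predict double p

* For logit, d lnL / d xb = y - p for each observation
assert reldif(sc, foreign - p) < 1e-6

* With a constant in the model, the scores sum to approximately zero
summarize sc
```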

scores is the ME model equivalent of the score option, resulting in multiple equation-level score variables. An equation-level score variable is created for each equation in the model; ancillary parameters -- such as ln(sigma) and atanh(rho) -- make up separate equations.

equation(eqno[,eqno]) -- synonym outcome() -- is relevant only when you have previously fit a multiple-equation model. It specifies the equation to which you are referring.

equation() is typically filled in with one eqno -- it would be filled in that way with options xb and stdp, for instance. equation(#1) would mean the calculation is to be made for the first equation, equation(#2) would mean the second, and so on. You could also refer to the equations by their names. equation(income) would refer to the equation named income and equation(hours) to the equation named hours.

If you do not specify equation(), results are the same as if you specified equation(#1).

Other statistics, such as stddp, refer to between-equation concepts. In those cases, you might specify equation(#1,#2) or equation(income,hours). When two equations must be specified, equation() is required.
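The ways of filling in equation() described above can be sketched after a two-equation model. The example assumes the auto dataset shipped with Stata and a sureg specification chosen for illustration; the new variable names are hypothetical.

```stata
* Assumed setup: a two-equation model; equations are named price and weight
sysuse auto, clear
sureg (price foreign displacement) (weight foreign length)

* Refer to the first equation by number, by name, or by default
predict double xb_num, xb equation(#1)
predict double xb_name, xb equation(price)
predict double xb_default, xb

* All three refer to the same equation
assert xb_num == xb_name
assert xb_num == xb_default

* A between-equation statistic requires two equations in equation()
predict double se_diff, stddp equation(price, weight)
summarize se_diff
```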

        +---------+
    ----+ Options +----------------------------------------------------------

nooffset may be combined with most statistics and specifies that the calculation should be made, ignoring any offset or exposure variable specified when the model was fit.

This option is available, even if not documented for predict after a specific command. If neither the offset(varname_o) option nor the exposure(varname_e) option was specified when the model was fit, specifying nooffset does nothing.
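The effect of nooffset can be illustrated with a Poisson model fit with an exposure variable, where the linear prediction includes ln(exposure) unless nooffset is specified. The sketch assumes the airline dataset available through webuse and treats its variable n as the exposure; the names xb_off and xb_no are hypothetical.

```stata
* Assumed setup: Poisson model with n as the exposure variable
webuse airline, clear
poisson injuries XYZowned, exposure(n)

* Linear prediction including the offset ln(n)
predict double xb_off, xb

* Linear prediction ignoring the exposure variable
predict double xb_no, xb nooffset

* The two differ by exactly ln(exposure)
assert reldif(xb_off, xb_no + ln(n)) < 1e-6
```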

other_options refers to command-specific options that are documented with each command.

Examples

---------------------------------------------------------------------------
Setup
    . sysuse auto
    . regress mpg weight if foreign

Obtain predictions for just the sample on which we fit the model
    . predict pmpg if e(sample)

Obtain out-of-sample prediction using all 74 observations of same dataset
    . predict pmpg2

cooksd is a regression-specific option; see [R] regress postestimation
    . predict c, cooksd

---------------------------------------------------------------------------
Setup
    . sysuse auto, clear
    . generate weight2 = weight^2
    . regress mpg weight weight2 foreign
    . webuse newautos, clear
    . generate weight2 = weight^2

Obtain out-of-sample prediction using another dataset
    . predict mpg

---------------------------------------------------------------------------
Setup
    . sysuse auto, clear
    . generate weight2 = weight^2
    . regress mpg weight weight2 foreign

Obtain residuals
    . predict double resid, residuals
    . summarize resid

---------------------------------------------------------------------------
Setup
    . sysuse auto, clear
    . logistic foreign mpg weight

Obtain probability of a positive outcome; see [R] logistic postestimation
    . predict phat

Obtain linear prediction
    . predict idxhat, xb
    . summarize foreign phat idxhat

---------------------------------------------------------------------------
Setup
    . webuse airline, clear
    . poisson injuries XYZowned

Obtain predicted count; see [R] poisson postestimation
    . predict injhat

Obtain linear prediction
    . predict idx, xb
    . generate exp_idx = exp(idx)
    . summarize injuries injhat exp_idx idx

---------------------------------------------------------------------------
Setup
    . sysuse auto, clear
    . logistic foreign mpg weight

Obtain single-equation model scores
    . predict double sc, score
    . summarize sc

---------------------------------------------------------------------------
Setup
    . sysuse auto, clear
    . sureg (price foreign displ) (weight foreign length)

Obtain linear prediction for price equation
    . predict pred_p, equation(price)

Obtain linear prediction for weight equation
    . predict pred_w, equation(weight)
    . summarize price pred_p weight pred_w

---------------------------------------------------------------------------
Setup
    . sysuse auto, clear
    . ologit rep78 mpg weight

Obtain multiple-equation model scores
    . predict double sc*, scores
    . summarize sc*
---------------------------------------------------------------------------


© Copyright 1996–2018 StataCorp LLC