**[R] heckman postestimation** -- Postestimation tools for heckman

__Postestimation commands__

The following postestimation commands are available after **heckman**:

  Command              Description
-------------------------------------------------------------------------
  **contrast**         contrasts and ANOVA-style joint tests of estimates
* **estat ic**         Akaike's and Schwarz's Bayesian information criteria
                       (AIC and BIC)
  **estat summarize**  summary statistics for the estimation sample
  **estat vce**        variance-covariance matrix of the estimators (VCE)
  **estat** (svy)      postestimation statistics for survey data
  **estimates**        cataloging estimation results
+ **hausman**          Hausman's specification test
  **lincom**           point estimates, standard errors, testing, and
                       inference for linear combinations of coefficients
+ **lrtest**           likelihood-ratio test; not available with two-step
                       estimator
  **margins**          marginal means, predictive margins, marginal effects,
                       and average marginal effects
  **marginsplot**      graph the results from margins (profile plots,
                       interaction plots, etc.)
  **nlcom**            point estimates, standard errors, testing, and
                       inference for nonlinear combinations of coefficients
  **predict**          predictions, residuals, influence statistics, and
                       other diagnostic measures
  **predictnl**        point estimates, standard errors, testing, and
                       inference for generalized predictions
  **pwcompare**        pairwise comparisons of estimates
* **suest**            seemingly unrelated estimation
  **test**             Wald tests of simple and composite linear hypotheses
  **testnl**           Wald tests of nonlinear hypotheses
-------------------------------------------------------------------------
* **estat ic** and **suest** are not appropriate after **heckman, twostep**.
+ **hausman** and **lrtest** are not appropriate with **svy** estimation results.
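
As a minimal sketch of how two of these commands are typically used after an ML fit, the lines below report information criteria and compare two nested fits with a likelihood-ratio test. The dataset and the full model are the ones used under Examples; the restricted model and the stored names `full` and `restricted` are illustrative choices, not part of the original text.

```stata
* Fit the full model by ML (the default) and store the results
webuse womenwk, clear
heckman wage educ age, select(married children educ age)
estimates store full

* Information criteria; not appropriate after heckman, twostep
estat ic

* Illustrative restricted model: drop age from the outcome equation only
heckman wage educ, select(married children educ age)
estimates store restricted

* Likelihood-ratio test of the restricted fit against the full fit
lrtest full restricted
```

Both fits must come from the ML estimator, because **lrtest** is not available with the two-step estimator.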

__Syntax for predict__

After ML or twostep

**predict** [*type*] *newvar* [*if*] [*in*] [**,** *statistic* __nooff__**set**]

After ML

**predict** [*type*] {*stub**|*newvar_reg* *newvar_sel* *newvar_athrho*
*newvar_lnsigma*} [*if*] [*in*] **,** __sc__**ores**

  *statistic*                        Description
-------------------------------------------------------------------------
Main
  **xb**                             linear prediction; the default
  **stdp**                           standard error of the prediction
  **stdf**                           standard error of the forecast
  __xbs__**el**                      linear prediction for selection equation
  __stdps__**el**                    standard error of the linear prediction for
                                     selection equation
  __p__**r(***a***,***b***)**        Pr(y | *a* < y < *b*)
  **e(***a***,***b***)**             *E*(y | *a* < y < *b*)
  __ys__**tar(***a***,***b***)**     *E*(y*), y* = max{*a*,min(y,*b*)}
  __yc__**ond**                      *E*(y | y observed)
  __ye__**xpected**                  *E*(y*), y taken to be 0 where unobserved
  __ns__**hazard** or __m__**ills**  nonselection hazard (also called inverse of
                                     Mills's ratio)
  __ps__**el**                       Pr(y observed)
-------------------------------------------------------------------------
These statistics are available both in and out of sample; type **predict**
*...* **if e(sample)** *...* if wanted only for the estimation sample.
**stdf** is not allowed with **svy** estimation results.

where *a* and *b* may be numbers or variables; *a* missing (*a* >= **.**) means minus
infinity, and *b* missing (*b* >= **.**) means plus infinity; see **missing**.

__Menu for predict__

**Statistics > Postestimation**

__Description for predict__

**predict** creates a new variable containing predictions such as linear
predictions, standard errors, probabilities, expected values, and
nonselection hazards.

__Options for predict__

    +------+
----+ Main +-------------------------------------------------------------

**xb**, the default, calculates the linear prediction.

**stdp** calculates the standard error of the prediction, which can be
thought of as the standard error of the predicted expected value or
mean for the observation's covariate pattern. The standard error of
the prediction is also referred to as the standard error of the
fitted value.

**stdf** calculates the standard error of the forecast, which is the standard
error of the point prediction for 1 observation. It is commonly
referred to as the standard error of the future or forecast value.
By construction, the standard errors produced by **stdf** are always
larger than those produced by **stdp**; see *Methods and formulas* in **[R]**
**regress postestimation**.
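
The relationship between **stdp** and **stdf** can be checked directly. The sketch below assumes the dataset and model from the Examples section; the variable names `se_p` and `se_f` are arbitrary.

```stata
webuse womenwk, clear
heckman wage educ age, select(married children educ age)

* Standard error of the prediction and of the forecast, stored as doubles
predict double se_p, stdp
predict double se_f, stdf

* Compare the two side by side
summarize se_p se_f

* Count observations where the forecast SE does not exceed the prediction SE
count if se_f <= se_p & !missing(se_p, se_f)
```

If the statement above holds for the fitted model, the final count should be zero.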

**xbsel** calculates the linear prediction for the selection equation.

**stdpsel** calculates the standard error of the linear prediction for the
selection equation.

**pr(***a***,***b***)** calculates Pr(*a* < xb + u < *b*), the probability that y|x would be
observed in the interval (*a*,*b*).

*a* and *b* may be specified as numbers or variable names; *lb* and *ub* are
variable names;
**pr(20,30)** calculates Pr(20 < xb + u < 30);
**pr(***lb***,***ub***)** calculates Pr(*lb* < xb + u < *ub*); and
**pr(20,***ub***)** calculates Pr(20 < xb + u < *ub*).

*a* missing (*a* >= .) means minus infinity; **pr(.,30)** calculates
Pr(xb + u < 30);
**pr(***lb***,30)** calculates Pr(xb + u < 30) in observations for which *lb* >= .
and calculates Pr(*lb* < xb + u < 30) elsewhere.

*b* missing (*b* >= .) means plus infinity; **pr(20,.)** calculates
Pr(xb + u > 20);
**pr(20,***ub***)** calculates Pr(xb + u > 20) in observations for which *ub* >= .
and calculates Pr(20 < xb + u < *ub*) elsewhere.
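
The bound rules above can be sketched as follows, assuming the dataset and model from the Examples section. The variable `ub` is hypothetical, constructed here only so that the upper bound is missing in some observations.

```stata
webuse womenwk, clear
heckman wage educ age, select(married children educ age)

* Numeric bounds: Pr(20 < xb + u < 30)
predict double p_20_30, pr(20,30)

* Missing upper bound means plus infinity: Pr(xb + u > 20)
predict double p_gt20, pr(20,.)

* Hypothetical variable bound: 30 for married women, missing otherwise
generate double ub = cond(married == 1, 30, .)
predict double p_20_ub, pr(20,ub)

* p_20_ub should track p_20_30 where ub is 30 and p_gt20 where ub is missing
summarize p_20_30 p_gt20 p_20_ub
```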

**e(***a***,***b***)** calculates *E*(xb + u | *a* < xb + u < *b*), the expected value of y|x
conditional on y|x being in the interval (*a*,*b*), meaning that y|x is
truncated. *a* and *b* are specified as they are for **pr()**.

**ystar(***a***,***b***)** calculates *E*(y*), where y* = *a* if xb + u <= *a*, y* = *b* if
xb + u >= *b*, and y* = xb + u otherwise, meaning that y* is censored
at *a* and *b*. *a* and *b* are specified as they are for **pr()**.
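
Because y* equals *a* below the interval, xb + u inside it, and *b* above it, *E*(y*) can be rebuilt from **pr()** and **e()**. The sketch below checks this decomposition for the interval (20,30), assuming the dataset and model from the Examples section; all new variable names are arbitrary.

```stata
webuse womenwk, clear
heckman wage educ age, select(married children educ age)

* Truncated mean and censored mean on (20,30)
predict double e_20_30,  e(20,30)
predict double ys_20_30, ystar(20,30)

* Probabilities of falling below, inside, and above the interval
predict double p_le20,  pr(.,20)
predict double p_20_30, pr(20,30)
predict double p_ge30,  pr(30,.)

* E(y*) = a*Pr(y <= a) + Pr(a < y < b)*E(y | a < y < b) + b*Pr(y >= b)
generate double ys_check = 20*p_le20 + p_20_30*e_20_30 + 30*p_ge30

* The two variables should agree up to rounding
summarize ys_20_30 ys_check
```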

**ycond** calculates the expected value of the dependent variable conditional
on the dependent variable being observed, that is, selected.

**yexpected** calculates the expected value of the dependent variable (y*),
where that value is taken to be 0 when it is expected to be
unobserved.

The assumption of 0 is valid for many cases where nonselection
implies nonparticipation (for example, unobserved wage levels,
insurance claims from those who are uninsured) but may be
inappropriate for some problems (for example, unobserved disease
incidence).
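
The two expectations can be related to the other statistics by hand. The sketch below assumes the dataset and model from the Examples section, an ML fit with no offset, and that **heckman** leaves the estimates of rho and sigma in `e(rho)` and `e(sigma)`. It uses the standard selection-model identities *E*(y | y observed) = xb + rho*sigma*(nonselection hazard) and *E*(y*) = Pr(y observed) * *E*(y | y observed); the check variables are illustrative.

```stata
webuse womenwk, clear
heckman wage educ age, select(married children educ age)

* Building blocks
predict double xb_hat, xb
predict double lambda, mills
predict double p_sel,  psel

* Statistics to be reproduced
predict double y_cond, ycond
predict double y_exp,  yexpected

* Assumption: e(rho) and e(sigma) hold the ML estimates of rho and sigma
generate double y_cond_check = xb_hat + e(rho)*e(sigma)*lambda

* Unobserved outcomes count as 0, so E(y*) = Pr(observed) * E(y | observed)
generate double y_exp_check = p_sel*y_cond

summarize y_cond y_cond_check y_exp y_exp_check
```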

**nshazard** and **mills** are synonyms; both calculate the nonselection hazard
-- what Heckman (1979) referred to as the inverse of the Mills ratio
-- from the selection equation.

**psel** calculates the probability of selection (or being observed).
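
Both **psel** and **nshazard** are functions of the selection-equation linear prediction. As a minimal sketch, assuming the dataset and model from the Examples section and no offset, the lines below rebuild them from **xbsel** using the standard normal cdf and density; the variable names are arbitrary.

```stata
webuse womenwk, clear
heckman wage educ age, select(married children educ age)

* Selection index, selection probability, and nonselection hazard
predict double zg,     xbsel
predict double p_sel,  psel
predict double lambda, nshazard

* Pr(y observed) is the normal cdf evaluated at the selection index
generate double p_check = normal(zg)

* Nonselection hazard is the normal density over the normal cdf
generate double lambda_check = normalden(zg)/normal(zg)

summarize p_sel p_check lambda lambda_check
```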

**nooffset** is relevant when you specify **offset(***varname***)** for **heckman**. It
modifies the calculations made by **predict** so that they ignore the
offset variable; the linear prediction is treated as xb rather than
as xb + offset.

**scores**, not available with **twostep**, calculates equation-level score
variables.

The first new variable will contain the derivative of the log
likelihood with respect to the regression equation.

The second new variable will contain the derivative of the log
likelihood with respect to the selection equation.

The third new variable will contain the derivative of the log
likelihood with respect to the third equation (**athrho**).

The fourth new variable will contain the derivative of the log
likelihood with respect to the fourth equation (**lnsigma**).
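
A short sketch of the stub form, assuming the dataset and model from the Examples section; the stub `sc` is arbitrary. With a stub, **predict** names the four score variables by appending 1 through 4, in the equation order listed above.

```stata
webuse womenwk, clear
heckman wage educ age, select(married children educ age)

* One score variable per equation: regression, selection, athrho, lnsigma
predict double sc*, scores

* Inspect the new variables sc1-sc4
describe sc*
summarize sc*
```

Because each equation includes a constant, each score should average close to zero at a converged maximum, which makes **summarize** a quick sanity check.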

__Syntax for margins__

**margins** [*marginlist*] [**,** *options*]

**margins** [*marginlist*] **,** __pr__**edict(***statistic *...**)** [__pr__**edict(***statistic *...**)**
...] [*options*]

  *statistic*                        Description
-------------------------------------------------------------------------
  **xb**                             linear prediction; the default
  __xbs__**el**                      linear prediction for selection equation
  __p__**r(***a***,***b***)**        Pr(y | *a* < y < *b*)
  **e(***a***,***b***)**             *E*(y | *a* < y < *b*)
  __ys__**tar(***a***,***b***)**     *E*(y*), y* = max{*a*,min(y,*b*)}
* __yc__**ond**                      *E*(y | y observed)
* __ye__**xpected**                  *E*(y*), y taken to be 0 where unobserved
  __ns__**hazard** or __m__**ills**  nonselection hazard (also called inverse of
                                     Mills's ratio)
  __ps__**el**                       Pr(y observed)
  **stdp**                           not allowed with **margins**
  **stdf**                           not allowed with **margins**
  **stdpsel**                        not allowed with **margins**
-------------------------------------------------------------------------
* **ycond** and **yexpected** are not allowed with **margins** after **heckman,**
**twostep**.

Statistics not allowed with **margins** are functions of stochastic
quantities other than **e(b)**.

For the full syntax, see **[R] margins**.

__Menu for margins__

**Statistics > Postestimation**

__Description for margins__

**margins** estimates margins of response for linear predictions,
probabilities, expected values, and nonselection hazards.
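
A minimal sketch of **margins** with the statistics above, assuming the dataset and ML model from the Examples section. The particular margins requested are illustrative choices.

```stata
webuse womenwk, clear
heckman wage educ age, select(married children educ age)

* Average predicted probability that the wage is observed
margins, predict(psel)

* Average marginal effects on the wage conditional on being observed
margins, dydx(*) predict(ycond)

* Two statistics in one call, using repeated predict() options
margins, predict(ycond) predict(yexpected)
```

The model is fit by ML here because **ycond** and **yexpected** are not allowed with **margins** after **heckman, twostep**.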

__Examples__

Setup
**. webuse womenwk**
**. heckman wage educ age, select(married children educ age)**

Predicted wage conditional on it being observed
**. predict ycond, ycond**

Probability of wage being observed
**. predict probseen, psel**

__Reference__

Heckman, J. 1979. Sample selection bias as a specification error.
*Econometrica* 47: 153-161.