**[R] ivregress postestimation** -- Postestimation tools for ivregress

__Postestimation commands__

The following postestimation commands are of special interest after
**ivregress**:

Command                Description
-------------------------------------------------------------------------
**estat endogenous**   perform tests of endogeneity
**estat firststage**   report "first-stage" regression statistics
**estat overid**       perform tests of overidentifying restrictions
* **estat sbknown**    perform tests for a structural break with a known
                       break date
* **estat sbsingle**   perform tests for a structural break with an
                       unknown break date
-------------------------------------------------------------------------
These commands are not appropriate after the **svy** prefix.
* **estat** **sbknown** and **estat** **sbsingle** work only after **ivregress** **2sls**.

The following standard postestimation commands are also available:

Command               Description
-------------------------------------------------------------------------
**contrast**          contrasts and ANOVA-style joint tests of estimates
**estat summarize**   summary statistics for the estimation sample
**estat vce**         variance-covariance matrix of the estimators (VCE)
**estat** (svy)       postestimation statistics for survey data
**estimates**         cataloging estimation results
+ **forecast**        dynamic forecasts and simulations
+ **hausman**         Hausman's specification test
**lincom**            point estimates, standard errors, testing, and
                      inference for linear combinations of coefficients
**margins**           marginal means, predictive margins, marginal effects,
                      and average marginal effects
**marginsplot**       graph the results from margins (profile plots,
                      interaction plots, etc.)
**nlcom**             point estimates, standard errors, testing, and
                      inference for nonlinear combinations of coefficients
**predict**           predictions, residuals, influence statistics, and
                      other diagnostic measures
**predictnl**         point estimates, standard errors, testing, and
                      inference for generalized predictions
**pwcompare**         pairwise comparisons of estimates
**test**              Wald tests of simple and composite linear hypotheses
**testnl**            Wald tests of nonlinear hypotheses
-------------------------------------------------------------------------
+ **forecast** and **hausman** are not appropriate with **svy** estimation results.

__Syntax for predict__

**predict** [*type*] *newvar* [*if*] [*in*] [**,** *statistic*]

**predict** [*type*] {*stub**|*newvarlist*} [*if*] [*in*] **,** __sc__**ores**

*statistic*                      Description
-------------------------------------------------------------------------
Main
**xb**                           linear prediction; the default
__r__**esiduals**                residuals
**stdp**                         standard error of the prediction
**stdf**                         standard error of the forecast
__p__**r(***a***,***b***)**      Pr(*a* < y < *b*) under exogeneity and normal errors
**e(***a***,***b***)**           *E*(y | *a* < y < *b*) under exogeneity and normal errors
__ys__**tar(***a***,***b***)**   *E*(y*), y* = max{*a*,min(y,*b*)} under exogeneity and
                                 normal errors
-------------------------------------------------------------------------
These statistics are available both in and out of sample; type **predict**
*...* **if e(sample)** *...* if wanted only for the estimation sample.
**stdf** is not allowed with **svy** estimation results.

where *a* and *b* may be numbers or variables; *a* missing (*a* >= **.**) means minus
infinity, and *b* missing (*b* >= **.**) means plus infinity; see missing.

__Menu for predict__

**Statistics > Postestimation**

__Description for predict__

**predict** creates a new variable containing predictions such as linear
predictions, residuals, standard errors, probabilities, and expected
values.

__Options for predict__

    +------+
----+ Main +-------------------------------------------------------------

**xb**, the default, calculates the linear prediction.

**residuals** calculates the residuals, that is, y - xb. These are based on
the estimated equation when the observed values of the endogenous
variables are used -- not the projections of the instruments onto the
endogenous variables.

**stdp** calculates the standard error of the prediction, which can be
thought of as the standard error of the predicted expected value or
mean for the observation's covariate pattern. This is also referred
to as the standard error of the fitted value.

**stdf** calculates the standard error of the forecast, which is the standard
error of the point prediction for 1 observation. It is commonly
referred to as the standard error of the future or forecast value.
By construction, the standard errors produced by **stdf** are always
larger than those produced by **stdp**; see *Methods and formulas* in **[R]**
**regress postestimation**.

**pr(***a***,***b***)** calculates Pr(*a* < xb + u < *b*), the probability that y|x would be
observed in the interval (*a*,*b*) under exogeneity and assuming errors
are normally distributed.

*a* and *b* may be specified as numbers or variable names; *lb* and *ub* are
variable names;
**pr(20,30)** calculates Pr(20 < xb + u < 30);
**pr(***lb***,***ub***)** calculates Pr(*lb* < xb + u < *ub*); and
**pr(20,***ub***)** calculates Pr(20 < xb + u < *ub*).

*a* missing (*a* >= .) means minus infinity; **pr(.,30)** calculates
Pr(-infinity < xb + u < 30);
**pr(***lb***,30)** calculates Pr(-infinity < xb + u < 30) in observations for
which *lb* >= . and calculates Pr(*lb* < xb + u < 30) elsewhere.

*b* missing (*b* >= .) means plus infinity; **pr(20,.)** calculates
Pr(+infinity > xb + u > 20);
**pr(20,***ub***)** calculates Pr(+infinity > xb + u > 20) in observations for
which *ub* >= . and calculates Pr(20 < xb + u < *ub*) elsewhere.

**e(***a***,***b***)** calculates E(xb + u | *a* < xb + u < *b*), the expected value of y|x
conditional on y|x being in the interval (*a*,*b*), meaning y|x is
truncated. *a* and *b* are specified as they are for **pr()**. Exogeneity
and normally distributed errors are assumed.

**ystar(***a***,***b***)** calculates E(y*), where y* = *a* if xb + u <= *a*, y* = *b* if
xb + u >= *b*, and y* = xb + u otherwise, meaning that y* is censored.
*a* and *b* are specified as they are for **pr()**. Exogeneity and normally
distributed errors are assumed.
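
Under the stated assumptions (exogeneity and normally distributed errors), the three statistics above have the familiar closed forms for a normal variate. The following is a sketch of those textbook results, not a quotation of the formulas Stata implements; the symbol sigma for the standard deviation of u is notation introduced here, and in practice an estimate of it would be used. When *a* is missing, the terms in alpha vanish (Phi(alpha) = phi(alpha) = 0), and when *b* is missing, Phi(beta) = 1 and phi(beta) = 0.

```latex
% Notation assumed here: \sigma is the standard deviation of u;
% \Phi and \phi are the standard normal cdf and pdf.
\alpha = \frac{a - xb}{\sigma}, \qquad \beta = \frac{b - xb}{\sigma}

\Pr(a < xb + u < b) = \Phi(\beta) - \Phi(\alpha)

E(xb + u \mid a < xb + u < b)
    = xb + \sigma\,\frac{\phi(\alpha) - \phi(\beta)}{\Phi(\beta) - \Phi(\alpha)}

E(y^*) = a\,\Phi(\alpha) + b\,\{1 - \Phi(\beta)\}
    + \{\Phi(\beta) - \Phi(\alpha)\}\,xb
    + \sigma\,\{\phi(\alpha) - \phi(\beta)\}
```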

**scores** calculates the scores for the model. A new score variable is
created for each endogenous regressor, as well as an equation-level
score that applies to all exogenous variables and constant term (if
present).
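
The options above can be tried out with a short do-file. This is a minimal sketch that reuses the **hsng2** model from the Examples section; the new variable names and the cutoffs 200 and 300 are illustrative choices, not part of this entry.

```stata
* Sketch for a do-file; variable names and the 200/300 cutoffs are illustrative
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)

predict double rent_hat                 // xb, the default
predict double uhat, residuals          // y - xb, using observed hsngval
predict double se_fit, stdp             // standard error of the prediction
predict double se_fcast, stdf           // standard error of the forecast
predict double p_mid, pr(200,300)       // Pr(200 < rent < 300)
predict double e_low, e(.,300)          // E(rent | rent < 300); missing a = -infinity
predict double y_cens, ystar(200,300)   // E(rent*), rent* censored at 200 and 300
predict double sc*, scores              // score variables, using the stub sc

* The residual is y - xb computed with the observed endogenous regressor
generate double uhat_check = rent - rent_hat
assert reldif(uhat, uhat_check) < 1e-8
```

The final **assert** simply confirms that **residuals** equals the outcome minus the default linear prediction, which is what the description of **residuals** states.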

__Syntax for margins__

**margins** [*marginlist*] [**,** *options*]

**margins** [*marginlist*] **,** __pr__**edict(***statistic *...**)** [__pr__**edict(***statistic *...**)**
...] [*options*]

*statistic*                      Description
-------------------------------------------------------------------------
**xb**                           linear prediction; the default
__p__**r(***a***,***b***)**      Pr(*a* < y < *b*) under exogeneity and normal errors
**e(***a***,***b***)**           *E*(y | *a* < y < *b*) under exogeneity and normal errors
__ys__**tar(***a***,***b***)**   *E*(y*), y* = max{*a*,min(y,*b*)} under exogeneity and
                                 normal errors
**stdp**                         not allowed with **margins**
**stdf**                         not allowed with **margins**
__r__**esiduals**                not allowed with **margins**
-------------------------------------------------------------------------

Statistics not allowed with **margins** are functions of stochastic
quantities other than **e(b)**.

For the full syntax, see **[R] margins**.

__Menu for margins__

**Statistics > Postestimation**

__Description for margins__

**margins** estimates margins of response for linear predictions,
probabilities, and expected values.
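
As an illustration, **margins** might be used after the 2SLS model from the Examples section as follows. This is a sketch only; the **at()** values and the interval (200,300) are arbitrary choices made for the example.

```stata
* Sketch for a do-file; at() values and the (200,300) interval are illustrative
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)

margins                              // average linear prediction
margins, dydx(hsngval pcturban)      // average marginal effects on xb
margins, at(pcturban=(50 70 90))     // predictive margins at chosen values
margins, predict(pr(200,300))        // average Pr(200 < rent < 300)
```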

__Syntax for estat__

Perform tests of endogeneity

**estat** __endog__**enous** [*varlist*] [**,** __l__**ags(***#***)** **forceweights** **forcenonrobust**]

Report "first-stage" regression statistics

**estat** __first__**stage** [**,** **all** **forcenonrobust**]

Perform tests of overidentifying restrictions

**estat** __over__**id** [**,** __l__**ags(***#***)** **forceweights** **forcenonrobust**]

__Menu for estat__

**Statistics > Postestimation**

__Description for estat__

**estat endogenous** performs tests to determine whether endogenous
regressors in the model are in fact exogenous. After GMM estimation, the
C (difference-in-Sargan) statistic is reported. After 2SLS estimation
with an unadjusted VCE, the Durbin (1954) and Wu-Hausman (Wu 1974;
Hausman 1978) statistics are reported. After 2SLS estimation with a robust VCE,
Wooldridge's (1995) robust score test and a robust regression-based test
are reported. In all cases, if the test statistic is significant, then
the variables being tested must be treated as endogenous. **estat**
**endogenous** is not available after LIML estimation.

**estat firststage** reports various statistics that measure the relevance of
the excluded exogenous variables. By default, whether the equation has
one or more than one endogenous regressor determines what statistics are
reported.

**estat overid** performs tests of overidentifying restrictions. If the 2SLS
estimator was used, Sargan's (1958) and Basmann's (1960) chi-squared
tests are reported, as is Wooldridge's (1995) robust score test; if the
LIML estimator was used, Anderson and Rubin's (1950) chi-squared test and
Basmann's F test are reported; and if the GMM estimator was used,
Hansen's (1982) J statistic chi-squared test is reported. A
statistically significant test statistic always indicates that the
instruments may not be valid.

__Options for estat endogenous__

**lags(***#***)** specifies the number of lags to use for prewhitening when
computing the heteroskedasticity- and autocorrelation-consistent
(HAC) version of the score test of endogeneity. Specifying **lags(0)**
requests no prewhitening. This option is valid only when the model
was fit via 2SLS and an HAC covariance matrix was requested when the
model was fit. The default is **lags(1)**.

**forceweights** requests that the tests of endogeneity be computed even
though **aweight**s, **pweight**s, or **iweight**s were used in the previous
estimation. By default, these tests are conducted only after
unweighted or frequency-weighted estimation. The reported critical
values may be inappropriate for weighted data, so the user must
determine whether the critical values are appropriate for a given
application.

**forcenonrobust** requests that the Durbin and Wu-Hausman tests be performed
after 2SLS estimation even though a robust VCE was used at estimation
time. This option is available only if the model was fit by 2SLS.
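
A minimal sketch of how **forcenonrobust** might be used, assuming the **hsng2** model from the Examples section refit with **vce(robust)** (that refit is an addition made here for illustration):

```stata
* Sketch for a do-file; the vce(robust) refit is an illustrative assumption
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)

estat endogenous                    // robust score and regression-based tests
estat endogenous, forcenonrobust    // requests the Durbin and Wu-Hausman tests
```

The **lags()** option is not shown because it applies only when an HAC covariance matrix was requested at estimation, which this cross-sectional example does not do.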

__Options for estat firststage__

**all** requests that all first-stage goodness-of-fit statistics be reported
regardless of whether the model contains one or more endogenous
regressors. By default, if the model contains one endogenous
regressor, then the first-stage R-squared, adjusted R-squared,
partial R-squared, and F statistics are reported, whereas if the
model contains multiple endogenous regressors, then Shea's partial
R-squared and adjusted partial R-squared are reported instead.

**forcenonrobust** requests that the minimum eigenvalue statistic and its
critical values be reported even though a robust VCE was used at
estimation time. The reported critical values assume that the errors
are independent and identically distributed normal, so the user must
determine whether the critical values are appropriate for a given
application.
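
The two options can be combined. The sketch below again assumes the **hsng2** model refit with a robust VCE, then lists two of the results that **estat firststage** leaves in **r()**:

```stata
* Sketch for a do-file; the vce(robust) refit is an illustrative assumption
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)

estat firststage, all forcenonrobust
display "minimum eigenvalue statistic = " r(mineig)
matrix list r(mineigcv)             // critical values for that statistic
```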

__Options for estat overid__

**lags(***#***)** specifies the number of lags to use for prewhitening when
computing the heteroskedasticity- and autocorrelation-consistent
(HAC) version of the score test of overidentifying restrictions.
Specifying **lags(0)** requests no prewhitening. This option is valid
only when the model was fit via 2SLS and an HAC covariance matrix was
requested when the model was fit. The default is **lags(1)**.

**forceweights** requests that the tests of overidentifying restrictions be
computed even though **aweight**s, **pweight**s, or **iweight**s were used in the
previous estimation. By default, these tests are conducted only
after unweighted or frequency-weighted estimation. The reported
critical values may be inappropriate for weighted data, so the user
must determine whether the critical values are appropriate for a
given application.

**forcenonrobust** requests that the Sargan and Basmann tests of
overidentifying restrictions be performed after 2SLS or LIML
estimation even though a robust VCE was used at estimation time.
These tests assume that the errors are independent and identically
distributed normal, so the user must determine whether the critical
values are appropriate for a given application.
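
For illustration, the following sketch runs **estat overid** after LIML and after 2SLS with a robust VCE. Both fits are variations on the Examples section introduced here, not examples from this entry.

```stata
* Sketch for a do-file; both model fits are illustrative variations
webuse hsng2, clear

ivregress liml rent pcturban (hsngval = faminc i.region)
estat overid                        // Anderson-Rubin and Basmann F tests
display "Anderson-Rubin chi2(" r(ar_df) ") = " %6.3f r(ar) "  p = " %5.3f r(p_ar)

ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)
estat overid                        // Wooldridge's robust score test
estat overid, forcenonrobust        // requests the Sargan and Basmann tests
```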

__Examples__

Setup
**. webuse hsng2**

Fit a model via 2SLS and obtain first-stage regression diagnostics
**. ivregress 2sls rent pcturban (hsngval = faminc i.region)**
**. estat firststage**

Obtain the Sargan and Basmann tests of overidentifying restrictions
**. estat overid**

Test whether **hsngval** can be treated as exogenous
**. estat endogenous**

Fit a model with two endogenous regressors via GMM and obtain all
first-stage regression diagnostics
**. ivregress gmm rent (hsngval pcturban = faminc i.region)**
**. estat firststage, all**

Obtain Hansen's J statistic
**. estat overid**

Test whether **hsngval** can be treated as exogenous
**. estat endogenous hsngval**

__Stored results__

After 2SLS estimation, **estat endogenous** stores the following in **r()**:

Scalars
**r(durbin)** Durbin chi-squared statistic
**r(p_durbin)** p-value for Durbin chi-squared statistic
**r(wu)** Wu-Hausman F statistic
**r(p_wu)** p-value for Wu-Hausman F statistic
**r(df)** degrees of freedom
**r(wudf_r)** denominator degrees of freedom for Wu-Hausman F
**r(r_score)** robust score statistic
**r(p_r_score)** p-value for robust score statistic
**r(hac_score)** HAC score statistic
**r(p_hac_score)** p-value for HAC score statistic
**r(lags)** lags used in prewhitening
**r(regF)** regression-based F statistic
**r(p_regF)** p-value for regression-based F statistic
**r(regFdf_n)** regression-based F numerator degrees of freedom
**r(regFdf_r)** regression-based F denominator degrees of freedom

After GMM estimation, **estat endogenous** stores the following in **r()**:

Scalars
**r(C)** C chi-squared statistic
**r(p_C)** p-value for C chi-squared statistic
**r(df)** degrees of freedom

**estat firststage** stores the following in **r()**:

Scalars
**r(mineig)** minimum eigenvalue statistic

Matrices
**r(mineigcv)** critical values for minimum eigenvalue statistic
**r(multiresults)** Shea's partial R-squared statistics
**r(singleresults)** first-stage R-squared and F statistics

After 2SLS estimation, **estat overid** stores the following in **r()**:

Scalars
**r(lags)** lags used in prewhitening
**r(df)** chi-squared degrees of freedom
**r(score)** score chi-squared statistic
**r(p_score)** p-value for score chi-squared statistic
**r(basmann)** Basmann chi-squared statistic
**r(p_basmann)** p-value for Basmann chi-squared statistic
**r(sargan)** Sargan chi-squared statistic
**r(p_sargan)** p-value for Sargan chi-squared statistic

After LIML estimation, **estat overid** stores the following in **r()**:

Scalars
**r(ar)** Anderson-Rubin chi-squared statistic
**r(p_ar)** p-value for Anderson-Rubin chi-squared statistic
**r(ar_df)** chi-squared degrees of freedom
**r(basmann)** Basmann F statistic
**r(p_basmann)** p-value for Basmann F statistic
**r(basmann_df_n)** F numerator degrees of freedom
**r(basmann_df_d)** F denominator degrees of freedom

After GMM estimation, **estat overid** stores the following in **r()**:

Scalars
**r(HansenJ)** Hansen's J chi-squared statistic
**r(p_HansenJ)** p-value for Hansen's J chi-squared statistic
**r(J_df)** chi-squared degrees of freedom
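
Because each **estat** subcommand leaves its own results in **r()**, results needed later should be copied before the next r-class command runs. A minimal sketch, assuming the 2SLS model from the Examples section; the local macro names are hypothetical:

```stata
* Sketch for a do-file; local macro names are illustrative
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)

estat overid
local sargan   = r(sargan)
local p_sargan = r(p_sargan)
local df       = r(df)

estat endogenous                    // replaces the r() results left by estat overid
display "Sargan chi2(`df') = " %6.3f `sargan' "  p = " %5.3f `p_sargan'
display "Durbin chi2(" r(df) ") = " %6.3f r(durbin) "  p = " %5.3f r(p_durbin)
```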

__References__

Anderson, T. W., and H. Rubin. 1950. The asymptotic properties of
estimates of the parameters of a single equation in a complete system
of stochastic equations. *Annals of Mathematical Statistics* 21:
570-582.

Basmann, R. L. 1960. On finite sample distributions of generalized
classical linear identifiability test statistics. *Journal of the*
*American Statistical Association* 55: 650-659.

Durbin, J. 1954. Errors in variables. *Review of the International*
*Statistical Institute* 22: 23-32.

Hansen, L. P. 1982. Large sample properties of generalized method of
moments estimators. *Econometrica* 50: 1029-1054.

Hausman, J. A. 1978. Specification tests in econometrics. *Econometrica*
46: 1251-1271.

Sargan, J. D. 1958. The estimation of economic relationships using
instrumental variables. *Econometrica* 26: 393-415.

Wooldridge, J. M. 1995. Score diagnostics for linear models estimated by
two stage least squares. In *Advances in Econometrics and*
*Quantitative Economics: Essays in Honor of Professor C. R. Rao*, ed.
G. S. Maddala, P. C. B. Phillips, and T. N. Srinivasan, 66-87.
Oxford: Blackwell.

Wu, D.-M. 1974. Alternative tests of independence between stochastic
regressors and disturbances: Finite sample results. *Econometrica* 42:
529-546.