Title
[R] ivregress postestimation -- Postestimation tools for ivregress
Description
The following postestimation commands are of special interest after
ivregress:
command            description
-------------------------------------------------------------------------
estat endogenous   perform tests of endogeneity
estat firststage   report "first-stage" regression statistics
estat overid       perform tests of overidentifying restrictions
-------------------------------------------------------------------------
These commands are not appropriate after the svy prefix.
The following standard postestimation commands are also available:
command       description
-------------------------------------------------------------------------
estat         VCE and estimation sample summary
estat (svy)   postestimation statistics for survey data
estimates     cataloging estimation results
hausman       Hausman's specification test
lincom        point estimates, standard errors, testing, and inference
              for linear combinations of coefficients
margins       marginal means, predictive margins, marginal effects, and
              average marginal effects
nlcom         point estimates, standard errors, testing, and inference
              for nonlinear combinations of coefficients
predict       predictions, residuals, influence statistics, and other
              diagnostic measures
predictnl     point estimates, standard errors, testing, and inference
              for generalized predictions
test          Wald tests of simple and composite linear hypotheses
testnl        Wald tests of nonlinear hypotheses
-------------------------------------------------------------------------
Special-interest postestimation commands
estat endogenous performs tests to determine whether endogenous
regressors in the model are in fact exogenous. After GMM estimation, the
C (difference-in-Sargan) statistic is reported. After 2SLS estimation
with an unadjusted VCE, the Durbin (1954) and Wu-Hausman (Wu 1974;
Hausman 1978) statistics are reported. After 2SLS with a robust,
cluster-robust, or HAC VCE, Wooldridge's (1995) robust score test and a
robust regression-based test are reported. In all cases, if the test
statistic is significant, the variables being tested must be treated as
endogenous. estat endogenous is not available after LIML estimation.
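The dependence of the reported tests on the VCE can be seen by fitting the
same 2SLS model twice. The following is a minimal sketch, assuming the hsng2
example dataset used in the Examples section; the scalar names p_durbin and
p_rscore are illustrative only.

```stata
* Same 2SLS model, fit with two different VCEs.
webuse hsng2, clear

* Unadjusted VCE: Durbin and Wu-Hausman statistics are reported.
ivregress 2sls rent pcturban (hsngval = faminc i.region)
estat endogenous
scalar p_durbin = r(p_durbin)

* Robust VCE: robust score and regression-based tests are reported.
ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)
estat endogenous
scalar p_rscore = r(p_r_score)

display "Durbin p-value:       " scalar(p_durbin)
display "robust score p-value: " scalar(p_rscore)
```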
estat firststage reports various statistics that measure the relevance of
the excluded exogenous variables. By default, whether the equation has
one or more than one endogenous regressor determines which diagnostics
for weak instruments are reported.
estat overid performs tests of overidentifying restrictions. If the 2SLS
estimator was used, Sargan's (1958) and Basmann's (1960) chi-squared
tests are reported, as is Wooldridge's (1995) robust score test; if the
LIML estimator was used, Anderson and Rubin's (1950) chi-squared test and
Basmann's F test are reported; and if the GMM estimator was used,
Hansen's (1982) J statistic chi-squared test is reported. A
statistically significant test statistic always indicates that the
instruments may not be valid.
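The Examples section covers estat overid after 2SLS and GMM. As a
complement, the sketch below shows the LIML case, again assuming the hsng2
example dataset and the same model specification used there.

```stata
* After LIML, estat overid reports the Anderson-Rubin chi-squared test
* and Basmann's F test.
webuse hsng2, clear
ivregress liml rent pcturban (hsngval = faminc i.region)
estat overid

display "Anderson-Rubin chi2(" r(ar_df) ") = " r(ar) ", p = " r(p_ar)
display "Basmann F(" r(basmann_df_n) "," r(basmann_df_d) ") = " r(basmann) ", p = " r(p_basmann)
```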
Syntax for predict
predict [type] newvar [if] [in] [, statistic]
predict [type] {stub*|newvarlist} [if] [in] , scores
statistic     description
-------------------------------------------------------------------------
Main
  xb          linear prediction; the default
  residuals   residuals
  stdp        standard error of the prediction
  stdf        standard error of the forecast
  pr(a,b)     Pr(a < y < b)
  e(a,b)      E(y | a < y < b)
  ystar(a,b)  E(y*), y* = max{a,min(y,b)}
-------------------------------------------------------------------------
These statistics are available both in and out of sample; type predict
... if e(sample) ... if wanted only for the estimation sample.
stdf is not allowed with svy estimation results.
where a and b may be numbers or variables; a missing (a >= .) means minus
infinity, and b missing (b >= .) means plus infinity; see missing.
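A minimal sketch of the basic predict statistics follows, assuming the hsng2
example dataset from the Examples section. The new variable names
(rent_hat, uhat, se_p, se_f) are illustrative only.

```stata
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)

predict double rent_hat                  // xb is the default statistic
predict double uhat, residuals
predict double se_p if e(sample), stdp   // estimation sample only
predict double se_f if e(sample), stdf

* Residuals are built from the observed hsngval, so they equal rent - xb.
assert reldif(uhat, rent - rent_hat) < 1e-8

* The forecast standard error exceeds the prediction standard error.
assert se_f > se_p if e(sample)
```

Restricting with if e(sample) matters only when the data in memory contain
observations that were not used in estimation.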
Menu
Statistics > Postestimation > Predictions, residuals, etc.
Options for predict
    +------+
----+ Main +-------------------------------------------------------------
xb, the default, calculates the linear prediction.
residuals calculates the residuals, that is, y - xb. These are based on
the estimated equation when the observed values of the endogenous
variables are used, not the first-stage projections of the endogenous
variables onto the instruments.
stdp calculates the standard error of the prediction, which can be
thought of as the standard error of the predicted expected value or
mean for the observation's covariate pattern. This is also referred
to as the standard error of the fitted value.
stdf calculates the standard error of the forecast, which is the standard
error of the point prediction for 1 observation. It is commonly
referred to as the standard error of the future or forecast value.
By construction, the standard errors produced by stdf are always
larger than those produced by stdp; see Methods and formulas in [R]
regress.
pr(a,b) calculates Pr(a < xb + u < b), the probability that y|x would be
observed in the interval (a,b).
a and b may be specified as numbers or variable names; lb and ub are
variable names;
pr(20,30) calculates Pr(20 < xb + u < 30);
pr(lb,ub) calculates Pr(lb < xb + u < ub); and
pr(20,ub) calculates Pr(20 < xb + u < ub).
a missing (a >= .) means minus infinity; pr(.,30) calculates
Pr(-infinity < xb + u < 30);
pr(lb,30) calculates Pr(-infinity < xb + u < 30) in observations for
which lb >= . and calculates Pr(lb < xb + u < 30) elsewhere.
b missing (b >= .) means plus infinity; pr(20,.) calculates
Pr(+infinity > xb + u > 20);
pr(20,ub) calculates Pr(+infinity > xb + u > 20) in observations for
which ub >= . and calculates Pr(20 < xb + u < ub) elsewhere.
e(a,b) calculates E(xb + u | a < xb + u < b), the expected value of y|x
conditional on y|x being in the interval (a,b), meaning that y|x is
truncated. a and b are specified as they are for pr().
ystar(a,b) calculates E(y*), where y* = a if xb + u < a, y* = b if
xb + u > b, and y* = xb + u otherwise, meaning that y* is censored.
a and b are specified as they are for pr().
scores calculates the scores for the model. A new score variable is
created for each endogenous regressor, as well as an equation-level
score that applies to all exogenous variables and the constant term (if
present).
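The interval statistics and scores can be sketched as follows, assuming the
hsng2 example dataset from the Examples section. The bounds 200 and 250, the
variable lb, and all new variable names are hypothetical choices made for
illustration.

```stata
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)

* Hypothetical lower bound: 200 in most observations, missing in the first 10.
generate double lb = 200
replace lb = . in 1/10

predict double p_fixed, pr(200,250)      // Pr(200 < xb + u < 250)
predict double p_var,   pr(lb,250)       // Pr(xb + u < 250) where lb is missing
predict double e_trunc, e(200,250)       // mean of y|x truncated to (200,250)
predict double y_cens,  ystar(200,250)   // mean of y|x censored at 200 and 250

* One score variable per endogenous regressor plus an equation-level score.
predict double sc*, scores

summarize p_fixed p_var e_trunc y_cens sc*
```

In the ten observations where lb is missing, p_var uses minus infinity as the
lower limit and is therefore at least as large as p_fixed.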
Syntax for estat endogenous
estat endogenous [varlist] [, lags(#) forceweights forcenonrobust]
Menu
Statistics > Postestimation > Reports and statistics
Options for estat endogenous
lags(#) specifies the number of lags to use for prewhitening when
computing the heteroskedasticity- and autocorrelation-consistent
(HAC) version of the score test of endogeneity. Specifying lags(0)
requests no prewhitening. This option is valid only when the model
was fit via 2SLS and an HAC covariance matrix was requested when the
model was fit. The default is lags(1).
forceweights requests that the tests of endogeneity be computed even
though aweights, pweights, or iweights were used in the previous
estimation. By default, these tests are conducted only after
unweighted or frequency-weighted estimation. The reported critical
values may be inappropriate for weighted data, so the user must
determine whether the critical values are appropriate for a given
application.
forcenonrobust requests that the Durbin and Wu-Hausman statistics be
reported even though a robust VCE was used at estimation time. This
option is available only if the model was fit by 2SLS.
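A minimal sketch of forcenonrobust follows, assuming the hsng2 example
dataset from the Examples section. The model is fit with a robust VCE, and
the option then requests the Durbin and Wu-Hausman statistics anyway.

```stata
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)

* Without the option, only the robust tests are reported.
estat endogenous

* With the option, the Durbin and Wu-Hausman statistics are also computed.
estat endogenous, forcenonrobust
display "Durbin p-value:     " r(p_durbin)
display "Wu-Hausman p-value: " r(p_wu)
```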
Syntax for estat firststage
estat firststage [, all forcenonrobust]
Menu
Statistics > Postestimation > Reports and statistics
Options for estat firststage
all requests that all first-stage goodness-of-fit statistics be reported
regardless of whether the model contains one or more endogenous
regressors. By default, if the model contains one endogenous
regressor, then the first-stage R-squared, adjusted R-squared,
partial R-squared, and F statistics are reported, whereas if the
model contains multiple endogenous regressors, then Shea's partial
R-squared and adjusted partial R-squared are reported instead.
forcenonrobust requests that the minimum eigenvalue statistic and its
critical values be reported even though a robust VCE was used at
estimation time. The reported critical values assume that the errors
are independent and identically distributed normal, so the user must
determine whether the critical values are appropriate for a given
application.
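The following sketch requests the minimum eigenvalue statistic after a
robust fit, assuming the hsng2 example dataset from the Examples section.
Whether the i.i.d. normal critical values are meaningful for these data is
left to the user, as noted above.

```stata
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)

* Report the minimum eigenvalue statistic despite the robust VCE.
estat firststage, forcenonrobust

display "minimum eigenvalue statistic: " r(mineig)
matrix list r(mineigcv)
```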
Syntax for estat overid
estat overid [, lags(#) forceweights forcenonrobust]
Menu
Statistics > Postestimation > Reports and statistics
Options for estat overid
lags(#) specifies the number of lags to use for prewhitening when
computing the heteroskedasticity- and autocorrelation-consistent
(HAC) version of the score test of overidentifying restrictions.
Specifying lags(0) requests no prewhitening. This option is valid
only when the model was fit via 2SLS and an HAC covariance matrix was
requested when the model was fit. The default is lags(1).
forceweights requests that the tests of overidentifying restrictions be
computed even though aweights, pweights, or iweights were used in the
previous estimation. By default, these tests are conducted only
after unweighted or frequency-weighted estimation. The reported
critical values may be inappropriate for weighted data, so the user
must determine whether the critical values are appropriate for a
given application.
forcenonrobust requests that the Sargan and Basmann tests of
overidentifying restrictions be performed after 2SLS or LIML
estimation even though a robust VCE was used at estimation time.
These tests assume that the errors are independent and identically
distributed normal, so the user must determine whether the critical
values are appropriate for a given application.
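A minimal sketch of forcenonrobust with estat overid follows, assuming the
hsng2 example dataset from the Examples section. After a robust 2SLS fit,
the default output is the robust score test; the option adds the Sargan and
Basmann tests.

```stata
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)

estat overid                    // robust score test
estat overid, forcenonrobust    // also performs the Sargan and Basmann tests

display "Sargan p-value:  " r(p_sargan)
display "Basmann p-value: " r(p_basmann)
```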
Examples
Setup
. webuse hsng2
Fit a model via 2SLS and obtain first-stage regression diagnostics
. ivregress 2sls rent pcturban (hsngval = faminc i.region)
. estat firststage
Obtain the Sargan and Basmann tests of overidentifying restrictions
. estat overid
Test whether hsngval can be treated as exogenous
. estat endogenous
Fit a model with two endogenous regressors via GMM and obtain all
first-stage regression diagnostics
. ivregress gmm rent (hsngval pcturban = faminc i.region)
. estat firststage, all
Obtain Hansen's J statistic
. estat overid
Test whether hsngval can be treated as exogenous
. estat endogenous hsngval
Saved results
After 2SLS estimation, estat endogenous saves the following in r():
Scalars
r(durbin)        Durbin chi-squared statistic
r(p_durbin)      p-value for Durbin chi-squared statistic
r(wu)            Wu-Hausman F statistic
r(p_wu)          p-value for Wu-Hausman F statistic
r(df)            degrees of freedom
r(wudf_r)        denominator degrees of freedom for Wu-Hausman F
r(r_score)       robust score statistic
r(p_r_score)     p-value for robust score statistic
r(hac_score)     HAC score statistic
r(p_hac_score)   p-value for HAC score statistic
r(lags)          lags used in prewhitening
r(regF)          regression-based F statistic
r(p_regF)        p-value for regression-based F statistic
r(regFdf_n)      regression-based F numerator degrees of freedom
r(regFdf_r)      regression-based F denominator degrees of freedom
After GMM estimation, estat endogenous saves the following in r():
Scalars
r(C)             C chi-squared statistic
r(p_C)           p-value for C chi-squared statistic
r(df)            degrees of freedom
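Saved results can be used programmatically once estat endogenous has run.
The sketch below assumes the hsng2 example dataset from the Examples
section; the scalar names, the local macro verdict, and the 5% threshold are
illustrative assumptions, not part of the command.

```stata
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)
estat endogenous
return list

* Copy results before another r-class command overwrites r().
scalar wu_F = r(wu)
scalar wu_p = r(p_wu)

local verdict = cond(scalar(wu_p) < 0.05, "treat hsngval as endogenous", "no evidence against exogeneity")
display "Wu-Hausman F(" r(df) "," r(wudf_r) ") = " %9.4f scalar(wu_F) "  ->  `verdict'"
```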
estat firststage saves the following in r():
Scalars
r(mineig)         minimum eigenvalue statistic
Matrices
r(mineigcv)       critical values for minimum eigenvalue statistic
r(multiresults)   Shea's partial R-squared statistics
r(singleresults)  first-stage R-squared and F statistics
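The matrices saved by estat firststage can be copied into named Stata
matrices for later use. This sketch assumes the hsng2 example dataset from
the Examples section and a model with one endogenous regressor, so that
r(singleresults) is saved by default; the names S, CV, and mineig are
illustrative.

```stata
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)
estat firststage

* Copy results out of r() so that later commands cannot overwrite them.
scalar mineig = r(mineig)
matrix S  = r(singleresults)
matrix CV = r(mineigcv)

matrix list S
matrix list CV
display "minimum eigenvalue statistic = " scalar(mineig)
```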
After 2SLS estimation, estat overid saves the following in r():
Scalars
r(lags)          lags used in prewhitening
r(df)            chi-squared degrees of freedom
r(score)         score chi-squared statistic
r(p_score)       p-value for score chi-squared statistic
r(basmann)       Basmann chi-squared statistic
r(p_basmann)     p-value for Basmann chi-squared statistic
r(sargan)        Sargan chi-squared statistic
r(p_sargan)      p-value for Sargan chi-squared statistic
After LIML estimation, estat overid saves the following in r():
Scalars
r(ar)            Anderson-Rubin chi-squared statistic
r(p_ar)          p-value for Anderson-Rubin chi-squared statistic
r(ar_df)         chi-squared degrees of freedom
r(basmann)       Basmann F statistic
r(p_basmann)     p-value for Basmann F statistic
r(basmann_df_n)  F numerator degrees of freedom
r(basmann_df_d)  F denominator degrees of freedom
After GMM estimation, estat overid saves the following in r():
Scalars
r(HansenJ)       Hansen's J chi-squared statistic
r(p_HansenJ)     p-value for Hansen's J chi-squared statistic
r(J_df)          chi-squared degrees of freedom
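As a sketch of how the GMM results relate to the model, the example below
assumes the hsng2 example dataset from the Examples section and that region
has four categories, so that i.region contributes three indicator
instruments.

```stata
webuse hsng2, clear
ivregress gmm rent pcturban (hsngval = faminc i.region)
estat overid

* Four excluded instruments (faminc plus three region indicators) minus
* one endogenous regressor leaves three overidentifying restrictions.
display "Hansen J chi2(" r(J_df) ") = " r(HansenJ) ", p = " r(p_HansenJ)
```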
References
Anderson, T. W., and H. Rubin. 1950. The asymptotic properties of
estimates of the parameters of a single equation in a complete system
of stochastic equations. Annals of Mathematical Statistics 21:
570-582.
Basmann, R. L. 1960. On finite sample distributions of generalized
classical linear identifiability test statistics. Journal of the
American Statistical Association 55: 650-659.
Durbin, J. 1954. Errors in variables. Review of the International
Statistical Institute 22: 23-32.
Hansen, L. P. 1982. Large sample properties of generalized method of
moments estimators. Econometrica 50: 1029-1054.
Hausman, J. A. 1978. Specification tests in econometrics. Econometrica
46: 1251-1271.
Sargan, J. D. 1958. The estimation of economic relationships using
instrumental variables. Econometrica 26: 393-415.
Wooldridge, J. M. 1995. Score diagnostics for linear models estimated by
two stage least squares. In Advances in Econometrics and
Quantitative Economics: Essays in Honor of Professor C. R. Rao, ed.
G. S. Maddala, P. C. B. Phillips, and T. N. Srinivasan, 66-87.
Oxford: Blackwell.
Wu, D.-M. 1974. Alternative tests of independence between stochastic
regressors and disturbances: Finite sample results. Econometrica 42:
529-546.
Also see
Manual: [R] ivregress postestimation
Help: [R] ivregress