Stata 15 help for ivregress_postestimation

[R] ivregress postestimation -- Postestimation tools for ivregress

Postestimation commands

The following postestimation commands are of special interest after ivregress:

    Command            Description
    -------------------------------------------------------------------------
    estat endogenous   perform tests of endogeneity
    estat firststage   report "first-stage" regression statistics
    estat overid       perform tests of overidentifying restrictions
  * estat sbknown      perform tests for a structural break with a known
                         break date
  * estat sbsingle     perform tests for a structural break with an unknown
                         break date
    -------------------------------------------------------------------------
    These commands are not appropriate after the svy prefix.
    * estat sbknown and estat sbsingle work only after ivregress 2sls.

The following standard postestimation commands are also available:

    Command            Description
    -------------------------------------------------------------------------
    contrast           contrasts and ANOVA-style joint tests of estimates
    estat summarize    summary statistics for the estimation sample
    estat vce          variance-covariance matrix of the estimators (VCE)
    estat (svy)        postestimation statistics for survey data
    estimates          cataloging estimation results
  + forecast           dynamic forecasts and simulations
  + hausman            Hausman's specification test
    lincom             point estimates, standard errors, testing, and
                         inference for linear combinations of coefficients
    margins            marginal means, predictive margins, marginal effects,
                         and average marginal effects
    marginsplot        graph the results from margins (profile plots,
                         interaction plots, etc.)
    nlcom              point estimates, standard errors, testing, and
                         inference for nonlinear combinations of coefficients
    predict            predictions, residuals, influence statistics, and
                         other diagnostic measures
    predictnl          point estimates, standard errors, testing, and
                         inference for generalized predictions
    pwcompare          pairwise comparisons of estimates
    test               Wald tests of simple and composite linear hypotheses
    testnl             Wald tests of nonlinear hypotheses
    -------------------------------------------------------------------------
    + forecast and hausman are not appropriate with svy estimation results.

Syntax for predict

predict [type] newvar [if] [in] [, statistic]

predict [type] {stub*|newvarlist} [if] [in] , scores

    statistic          Description
    -------------------------------------------------------------------------
    Main
      xb               linear prediction; the default
      residuals        residuals
      stdp             standard error of the prediction
      stdf             standard error of the forecast
      pr(a,b)          Pr(a < y < b) under exogeneity and normal errors
      e(a,b)           E(y | a < y < b) under exogeneity and normal errors
      ystar(a,b)       E(y*), y* = max{a,min(y,b)} under exogeneity and
                         normal errors
    -------------------------------------------------------------------------
    These statistics are available both in and out of sample; type
    predict ... if e(sample) ... if wanted only for the estimation sample.
    stdf is not allowed with svy estimation results.

where a and b may be numbers or variables; a missing (a >= .) means minus infinity, and b missing (b >= .) means plus infinity; see missing.
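The handling of numeric, variable, and missing bounds can be sketched as follows. This is a minimal illustration, not part of the official help: the cutoffs 200 and 250, the variable lb, and the new variable names are arbitrary choices made for the example.

```stata
* Illustrative sketch; cutoffs and variable names are assumptions
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)

* Numeric bounds: Pr(200 < rent < 250)
predict double p_mid, pr(200,250)

* Missing lower bound means minus infinity: Pr(rent < 200)
predict double p_low, pr(.,200)

* Missing upper bound means plus infinity: Pr(rent > 250)
predict double p_high, pr(250,.)

* A variable bound: missing in the first 10 observations, 200 elsewhere
generate double lb = cond(_n <= 10, ., 200)
predict double p_var, pr(lb,250)

summarize p_low p_mid p_high p_var
```

Because the three intervals partition the real line, p_low + p_mid + p_high should equal 1 in every observation, up to rounding.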

Menu for predict

Statistics > Postestimation

Description for predict

predict creates a new variable containing predictions such as linear predictions, residuals, standard errors, probabilities, and expected values.
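As a minimal sketch of typical usage (the new variable names are arbitrary and chosen only for illustration), the following obtains the linear prediction, residuals, and both standard errors after a 2SLS fit:

```stata
* Illustrative sketch; variable names are assumptions
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)

predict double rent_hat, xb          // linear prediction (the default)
predict double u_hat, residuals      // rent minus the linear prediction
predict double se_fit, stdp          // standard error of the prediction
predict double se_fcast, stdf        // standard error of the forecast

* Residuals use the observed hsngval, so they equal rent - rent_hat
generate double u_check = rent - rent_hat
summarize u_hat u_check se_fit se_fcast
```

The summaries of u_hat and u_check should agree, and se_fcast should exceed se_fit in every observation.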

Options for predict

        +------+
    ----+ Main +-------------------------------------------------------------

xb, the default, calculates the linear prediction.

residuals calculates the residuals, that is, y - xb. These are based on the estimated equation when the observed values of the endogenous variables are used -- not the projections of the instruments onto the endogenous variables.

stdp calculates the standard error of the prediction, which can be thought of as the standard error of the predicted expected value or mean for the observation's covariate pattern. This is also referred to as the standard error of the fitted value.

stdf calculates the standard error of the forecast, which is the standard error of the point prediction for 1 observation. It is commonly referred to as the standard error of the future or forecast value. By construction, the standard errors produced by stdf are always larger than those produced by stdp; see Methods and formulas in [R] regress postestimation.

pr(a,b) calculates Pr(a < xb + u < b), the probability that y|x would be observed in the interval (a,b) under exogeneity and assuming errors are normally distributed.

a and b may be specified as numbers or variable names; lb and ub are variable names; pr(20,30) calculates Pr(20 < xb + u < 30); pr(lb,ub) calculates Pr(lb < xb + u < ub); and pr(20,ub) calculates Pr(20 < xb + u < ub).

a missing (a >= .) means minus infinity; pr(.,30) calculates Pr(-infinity < xb + u < 30); pr(lb,30) calculates Pr(-infinity < xb + u < 30) in observations for which lb >= . and calculates Pr(lb < xb + u < 30) elsewhere.

b missing (b >= .) means plus infinity; pr(20,.) calculates Pr(+infinity > xb + u > 20); pr(20,ub) calculates Pr(+infinity > xb + u > 20) in observations for which ub >= . and calculates Pr(20 < xb + u < ub) elsewhere.

e(a,b) calculates E(xb + u | a < xb + u < b), the expected value of y|x conditional on y|x being in the interval (a,b), meaning y|x is truncated. a and b are specified as they are for pr(). Exogeneity and normally distributed errors are assumed.

ystar(a,b) calculates E(y*), where y* = a if xb + u < a, y* = b if xb + u > b, and y* = xb + u otherwise, meaning that y* is censored. a and b are specified as they are for pr(). Exogeneity and normally distributed errors are assumed.

scores calculates the scores for the model. A new score variable is created for each endogenous regressor, as well as an equation-level score that applies to all exogenous variables and constant term (if present).
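The relationship among pr(), e(), and ystar() described above can be checked numerically. Under the stated normality assumption, the censored mean decomposes as E(y*) = a*Pr(y < a) + E(y | a < y < b)*Pr(a < y < b) + b*Pr(y > b). The sketch below uses the arbitrary cutoffs a = 200 and b = 260; the variable names are also illustrative.

```stata
* Illustrative sketch; cutoffs and variable names are assumptions
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)

predict double p_lo, pr(.,200)       // Pr(rent < 200)
predict double p_in, pr(200,260)     // Pr(200 < rent < 260)
predict double p_hi, pr(260,.)       // Pr(rent > 260)
predict double e_in, e(200,260)      // truncated mean on (200,260)
predict double ys, ystar(200,260)    // censored mean

* Rebuild the censored mean from its three pieces
generate double ys_check = 200*p_lo + e_in*p_in + 260*p_hi
summarize ys ys_check
```

The two summarized variables should agree up to rounding.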

Syntax for margins

margins [marginlist] [, options]

margins [marginlist] , predict(statistic ...) [predict(statistic ...) ...] [options]

    statistic          Description
    -------------------------------------------------------------------------
    xb                 linear prediction; the default
    pr(a,b)            Pr(a < y < b) under exogeneity and normal errors
    e(a,b)             E(y | a < y < b) under exogeneity and normal errors
    ystar(a,b)         E(y*), y* = max{a,min(y,b)} under exogeneity and
                         normal errors
    stdp               not allowed with margins
    stdf               not allowed with margins
    residuals          not allowed with margins
    -------------------------------------------------------------------------

Statistics not allowed with margins are functions of stochastic quantities other than e(b).

For the full syntax, see [R] margins.

Menu for margins

Statistics > Postestimation

Description for margins

margins estimates margins of response for linear predictions, probabilities, and expected values.
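A minimal sketch of margins after ivregress follows. The interval (200,260) is an arbitrary illustrative choice. Because the default statistic is the linear prediction, the average marginal effect of hsngval should simply reproduce its estimated coefficient.

```stata
* Illustrative sketch; the interval (200,260) is an assumption
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)

* Average predicted probability that rent lies in (200,260)
margins, predict(pr(200,260))

* Average marginal effect of hsngval on the linear prediction
margins, dydx(hsngval)
matrix M = r(b)
display "AME of hsngval = " M[1,1] "   coefficient = " _b[hsngval]
```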

Syntax for estat

Perform tests of endogeneity

estat endogenous [varlist] [, lags(#) forceweights forcenonrobust]

Report "first-stage" regression statistics

estat firststage [, all forcenonrobust]

Perform tests of overidentifying restrictions

estat overid [, lags(#) forceweights forcenonrobust]

Menu for estat

Statistics > Postestimation

Description for estat

estat endogenous performs tests to determine whether endogenous regressors in the model are in fact exogenous. After GMM estimation, the C (difference-in-Sargan) statistic is reported. After 2SLS estimation with an unadjusted VCE, the Durbin (1954) and Wu-Hausman (Wu 1974; Hausman 1978) statistics are reported. After 2SLS estimation with a robust VCE, Wooldridge's (1995) robust score test and a robust regression-based test are reported. In all cases, if the test statistic is significant, then the variables being tested must be treated as endogenous. estat endogenous is not available after LIML estimation.

estat firststage reports various statistics that measure the relevance of the excluded exogenous variables. By default, whether the equation has one or more than one endogenous regressor determines what statistics are reported.

estat overid performs tests of overidentifying restrictions. If the 2SLS estimator was used, Sargan's (1958) and Basmann's (1960) chi-squared tests are reported, as is Wooldridge's (1995) robust score test; if the LIML estimator was used, Anderson and Rubin's (1950) chi-squared test and Basmann's F test are reported; and if the GMM estimator was used, Hansen's (1982) J statistic chi-squared test is reported. A statistically significant test statistic always indicates that the instruments may not be valid.

Options for estat endogenous

lags(#) specifies the number of lags to use for prewhitening when computing the heteroskedasticity- and autocorrelation-consistent (HAC) version of the score test of endogeneity. Specifying lags(0) requests no prewhitening. This option is valid only when the model was fit via 2SLS and an HAC covariance matrix was requested when the model was fit. The default is lags(1).

forceweights requests that the tests of endogeneity be computed even though aweights, pweights, or iweights were used in the previous estimation. By default, these tests are conducted only after unweighted or frequency-weighted estimation. The reported critical values may be inappropriate for weighted data, so the user must determine whether the critical values are appropriate for a given application.

forcenonrobust requests that the Durbin and Wu-Hausman tests be performed after 2SLS estimation even though a robust VCE was used at estimation time. This option is available only if the model was fit by 2SLS.
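The effect of forcenonrobust can be sketched as follows, assuming the hsng2 example dataset and a model fit with vce(robust):

```stata
* Illustrative sketch using the hsng2 example data
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)

* With a robust VCE, the robust score and regression-based tests are reported
estat endogenous

* Request the Durbin and Wu-Hausman tests despite the robust VCE
estat endogenous, forcenonrobust
display "Durbin chi2 = " r(durbin) ", p = " r(p_durbin)
```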

Options for estat firststage

all requests that all first-stage goodness-of-fit statistics be reported regardless of whether the model contains one or more endogenous regressors. By default, if the model contains one endogenous regressor, then the first-stage R-squared, adjusted R-squared, partial R-squared, and F statistics are reported, whereas if the model contains multiple endogenous regressors, then Shea's partial R-squared and adjusted partial R-squared are reported instead.

forcenonrobust requests that the minimum eigenvalue statistic and its critical values be reported even though a robust VCE was used at estimation time. The reported critical values assume that the errors are independent and identically distributed normal, so the user must determine whether the critical values are appropriate for a given application.

Options for estat overid

lags(#) specifies the number of lags to use for prewhitening when computing the heteroskedasticity- and autocorrelation-consistent (HAC) version of the score test of overidentifying restrictions. Specifying lags(0) requests no prewhitening. This option is valid only when the model was fit via 2SLS and an HAC covariance matrix was requested when the model was fit. The default is lags(1).

forceweights requests that the tests of overidentifying restrictions be computed even though aweights, pweights, or iweights were used in the previous estimation. By default, these tests are conducted only after unweighted or frequency-weighted estimation. The reported critical values may be inappropriate for weighted data, so the user must determine whether the critical values are appropriate for a given application.

forcenonrobust requests that the Sargan and Basmann tests of overidentifying restrictions be performed after 2SLS or LIML estimation even though a robust VCE was used at estimation time. These tests assume that the errors are independent and identically distributed normal, so the user must determine whether the critical values are appropriate for a given application.
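As a minimal sketch, again assuming the hsng2 example dataset, the following contrasts the default output after a robust 2SLS fit with the output requested by forcenonrobust:

```stata
* Illustrative sketch using the hsng2 example data
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)

* With a robust VCE, the robust score test is reported
estat overid

* Request the Sargan and Basmann tests despite the robust VCE
estat overid, forcenonrobust
display "Sargan chi2 = " r(sargan) ", p = " r(p_sargan)
```

Whether the reported critical values are appropriate here remains the user's judgment, as noted above.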

Examples

    Setup
        . webuse hsng2

    Fit a model via 2SLS and obtain first-stage regression diagnostics
        . ivregress 2sls rent pcturban (hsngval = faminc i.region)
        . estat firststage

    Obtain the Sargan and Basmann tests of overidentifying restrictions
        . estat overid

    Test whether hsngval can be treated as exogenous
        . estat endogenous

    Fit a model with two endogenous regressors via GMM and obtain all
    first-stage regression diagnostics
        . ivregress gmm rent (hsngval pcturban = faminc i.region)
        . estat firststage, all

    Obtain Hansen's J statistic
        . estat overid

    Test whether hsngval can be treated as exogenous
        . estat endogenous hsngval

Stored results

After 2SLS estimation, estat endogenous stores the following in r():

    Scalars
      r(durbin)        Durbin chi-squared statistic
      r(p_durbin)      p-value for Durbin chi-squared statistic
      r(wu)            Wu-Hausman F statistic
      r(p_wu)          p-value for Wu-Hausman F statistic
      r(df)            degrees of freedom
      r(wudf_r)        denominator degrees of freedom for Wu-Hausman F
      r(r_score)       robust score statistic
      r(p_r_score)     p-value for robust score statistic
      r(hac_score)     HAC score statistic
      r(p_hac_score)   p-value for HAC score statistic
      r(lags)          lags used in prewhitening
      r(regF)          regression-based F statistic
      r(p_regF)        p-value for regression-based F statistic
      r(regFdf_n)      regression-based F numerator degrees of freedom
      r(regFdf_r)      regression-based F denominator degrees of freedom
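A short sketch of retrieving these results after a 2SLS fit with an unadjusted VCE, using the hsng2 example data. Only the Durbin and Wu-Hausman results are expected to be filled in for this fit.

```stata
* Illustrative sketch using the hsng2 example data
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)
estat endogenous

* Inspect everything left behind in r()
return list

display "Durbin chi2(" r(df) ") = " %9.4f r(durbin) ", p = " %6.4f r(p_durbin)
display "Wu-Hausman F(" r(df) "," r(wudf_r) ") = " %9.4f r(wu) ", p = " %6.4f r(p_wu)
```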

After GMM estimation, estat endogenous stores the following in r():

    Scalars
      r(C)             C chi-squared statistic
      r(p_C)           p-value for C chi-squared statistic
      r(df)            degrees of freedom
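The GMM case can be sketched the same way, assuming the hsng2 example data and the default GMM settings:

```stata
* Illustrative sketch using the hsng2 example data
webuse hsng2, clear
ivregress gmm rent pcturban (hsngval = faminc i.region)
estat endogenous

display "C chi2(" r(df) ") = " %9.4f r(C) ", p = " %6.4f r(p_C)
```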

estat firststage stores the following in r():

    Scalars
      r(mineig)        minimum eigenvalue statistic

    Matrices
      r(mineigcv)      critical values for minimum eigenvalue statistic
      r(multiresults)  Shea's partial R-squared statistics
      r(singleresults) first-stage R-squared and F statistics
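A minimal sketch of copying these results out of r() before they are overwritten by a later command. The scalar and matrix names (mineig, cv, fs) are arbitrary, and the example assumes a model with a single endogenous regressor so that r(singleresults) is the relevant matrix.

```stata
* Illustrative sketch; scalar and matrix names are assumptions
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)
estat firststage

* Copy results before another command replaces r()
scalar mineig = r(mineig)
matrix cv = r(mineigcv)
matrix fs = r(singleresults)

display "Minimum eigenvalue statistic = " mineig
matrix list cv
matrix list fs
```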

After 2SLS estimation, estat overid stores the following in r():

    Scalars
      r(lags)          lags used in prewhitening
      r(df)            chi-squared degrees of freedom
      r(score)         score chi-squared statistic
      r(p_score)       p-value for score chi-squared statistic
      r(basmann)       Basmann chi-squared statistic
      r(p_basmann)     p-value for Basmann chi-squared statistic
      r(sargan)        Sargan chi-squared statistic
      r(p_sargan)      p-value for Sargan chi-squared statistic
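As a sketch with the hsng2 example data: the model below has four excluded instruments (faminc and three region indicators) and one endogenous regressor, so three overidentifying restrictions are tested.

```stata
* Illustrative sketch using the hsng2 example data
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)
estat overid

display "Sargan chi2(" r(df) ") = " %9.4f r(sargan) ", p = " %6.4f r(p_sargan)
display "Basmann chi2(" r(df) ") = " %9.4f r(basmann) ", p = " %6.4f r(p_basmann)
```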

After LIML estimation, estat overid stores the following in r():

    Scalars
      r(ar)            Anderson-Rubin chi-squared statistic
      r(p_ar)          p-value for Anderson-Rubin chi-squared statistic
      r(ar_df)         chi-squared degrees of freedom
      r(basmann)       Basmann F statistic
      r(p_basmann)     p-value for Basmann F statistic
      r(basmann_df_n)  F numerator degrees of freedom
      r(basmann_df_d)  F denominator degrees of freedom
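A brief sketch of the LIML case, assuming the hsng2 example data. Note that the Basmann statistic is an F statistic here rather than a chi-squared statistic.

```stata
* Illustrative sketch using the hsng2 example data
webuse hsng2, clear
ivregress liml rent pcturban (hsngval = faminc i.region)
estat overid

display "Anderson-Rubin chi2(" r(ar_df) ") = " %9.4f r(ar) ", p = " %6.4f r(p_ar)
display "Basmann F(" r(basmann_df_n) "," r(basmann_df_d) ") = " %9.4f r(basmann) ", p = " %6.4f r(p_basmann)
```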

After GMM estimation, estat overid stores the following in r():

    Scalars
      r(HansenJ)       Hansen's J chi-squared statistic
      r(p_HansenJ)     p-value for Hansen's J chi-squared statistic
      r(J_df)          chi-squared degrees of freedom
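A brief sketch of the GMM case, assuming the hsng2 example data and the default GMM settings:

```stata
* Illustrative sketch using the hsng2 example data
webuse hsng2, clear
ivregress gmm rent pcturban (hsngval = faminc i.region)
estat overid

display "Hansen's J chi2(" r(J_df) ") = " %9.4f r(HansenJ) ", p = " %6.4f r(p_HansenJ)
```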

References

Anderson, T. W., and H. Rubin. 1950. The asymptotic properties of estimates of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 21: 570-582.

Basmann, R. L. 1960. On finite sample distributions of generalized classical linear identifiability test statistics. Journal of the American Statistical Association 55: 650-659.

Durbin, J. 1954. Errors in variables. Review of the International Statistical Institute 22: 23-32.

Hansen, L. P. 1982. Large sample properties of generalized method of moments estimators. Econometrica 50: 1029-1054.

Hausman, J. A. 1978. Specification tests in econometrics. Econometrica 46: 1251-1271.

Sargan, J. D. 1958. The estimation of economic relationships using instrumental variables. Econometrica 26: 393-415.

Wooldridge, J. M. 1995. Score diagnostics for linear models estimated by two stage least squares. In Advances in Econometrics and Quantitative Economics: Essays in Honor of Professor C. R. Rao, ed. G. S. Maddala, P. C. B. Phillips, and T. N. Srinivasan, 66-87. Oxford: Blackwell.

Wu, D.-M. 1974. Alternative tests of independence between stochastic regressors and disturbances: Finite sample results. Econometrica 42: 529-546.

