Note: This FAQ is for users of Stata 5 and older versions of Stata. It is not relevant for Stata 6,
which includes the hausman command to perform the Hausman specification test.
Stata 5: How do I test endogeneity? How do I perform a Durbin–Wu–Hausman test?
Title: Stata 5: Durbin–Wu–Hausman test (augmented regression test) for endogeneity
Author: Ronna Cong, StataCorp
Date: November 1999
Before estimating the following simultaneous equations,
z = a0 + a1*x1 + a2*x2 + epsilon1
y = b0 + b1*z + b2*x3 + epsilon2
one should decide whether an instrumental-variables estimator is needed,
i.e., whether ordinary least squares (OLS) estimates of the second equation
are consistent.
Davidson and MacKinnon (1993) suggest an augmented regression test (DWH
test). To form it, regress each endogenous right-hand-side variable on all
exogenous variables, and then include the residuals from those regressions
as additional regressors in the original model. In our example, we would
first perform the regression
z = c0 + c1*x1 + c2*x2 + c3*x3 + epsilon3
obtain the residuals z_res, and then perform the augmented regression:
y = d0 + d1*z + d2*x3 + d3*z_res + epsilon4
If d3 is significantly different from zero, then OLS is not consistent.
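The two steps above can be sketched as a short do-file. This is only an illustration under stated assumptions: the data are simulated, and the seed, sample size, and coefficient values are hypothetical choices that do not come from the original example. The variable names follow the generic equations above, and the shock u is added to both z and y so that z is endogenous by construction.

```stata
* Hypothetical simulated data: u enters both z and y, making z endogenous
clear
set obs 500
set seed 12345
generate x1 = invnorm(uniform())
generate x2 = invnorm(uniform())
generate x3 = invnorm(uniform())
generate u  = invnorm(uniform())
generate z  = 1 + x1 + x2 + u + invnorm(uniform())
generate y  = 2 + 0.5*z + x3 + u + invnorm(uniform())

* Step 1: regress the endogenous variable on all exogenous variables
regress z x1 x2 x3
predict z_res, res

* Step 2: augmented regression; test whether the residual term is zero
regress y z x3 z_res
test z_res
```

With data built this way, one would expect the test on z_res to reject, but the exact statistics depend on the random draws.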
For example, let us assume that you wish to estimate
hsngval = a0 + a1*faminc + a2*reg2 + a3*reg3 + a4*reg4 + epsilon1
rent = b0 + b1*hsngval + b2*pcturban + epsilon2
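To test the endogeneity of hsngval, regress it on all exogenous variables
(the instruments and pcturban), save the residuals, and add them to the
rent equation: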
. regress hsngval faminc reg2-reg4 pcturban
      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  5,    44) =   19.66
       Model |  8.4187e+09     5  1.6837e+09           Prob > F      =  0.0000
    Residual |  3.7676e+09    44  85626930.6           R-squared     =  0.6908
-------------+------------------------------           Adj R-squared =  0.6557
       Total |  1.2186e+10    49   248700555           Root MSE      =  9253.5
------------------------------------------------------------------------------
     hsngval |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      faminc |   2.731324   .6818931     4.01   0.000     1.357058    4.105589
        reg2 |  -5095.038   4122.112    -1.24   0.223    -13402.61    3212.533
        reg3 |   -1778.05   4072.691    -0.44   0.665    -9986.019    6429.919
        reg4 |   13413.79   4048.141     3.31   0.002     5255.296    21572.28
    pcturban |   182.2201   115.0167     1.58   0.120    -49.58092    414.0211
       _cons |  -18671.87   11995.48    -1.56   0.127    -42847.17    5503.438
------------------------------------------------------------------------------
. predict hsng_res, res
. regress rent hsngval pcturban hsng_res
      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  3,    46) =   47.05
       Model |  46189.1513     3  15396.3838           Prob > F      =  0.0000
    Residual |  15053.9687    46   327.26019           R-squared     =  0.7542
-------------+------------------------------           Adj R-squared =  0.7382
       Total |    61243.12    49  1249.85959           Root MSE      =   18.09
------------------------------------------------------------------------------
        rent |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     hsngval |   .0022398   .0002681     8.36   0.000     .0017003    .0027794
    pcturban |    .081516   .2438355     0.33   0.740    -.4092993    .5723313
    hsng_res |  -.0015889   .0003984    -3.99   0.000    -.0023908    -.000787
       _cons |   120.7065   12.42856     9.71   0.000     95.68912    145.7239
------------------------------------------------------------------------------
. test hsng_res
 ( 1)  hsng_res = 0.0
       F(  1,    46) =   15.91
            Prob > F =    0.0002
The small p-value (0.0002) rejects the null hypothesis that the coefficient
on hsng_res is zero, which indicates that OLS is not consistent.
To perform an IV regression, use the ivreg command:
. ivreg rent pcturban (hsngval = faminc reg2-reg4)
Instrumental variables (2SLS) regression
      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  2,    47) =   42.66
       Model |  36677.4033     2  18338.7017           Prob > F      =  0.0000
    Residual |  24565.7167    47  522.674823           R-squared     =  0.5989
-------------+------------------------------           Adj R-squared =  0.5818
       Total |    61243.12    49  1249.85959           Root MSE      =  22.862
------------------------------------------------------------------------------
        rent |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     hsngval |   .0022398   .0003388     6.61   0.000     .0015583    .0029213
    pcturban |    .081516   .3081528     0.26   0.793    -.5384074    .7014394
       _cons |   120.7065   15.70688     7.68   0.000     89.10834    152.3047
------------------------------------------------------------------------------
Instrumented:  hsngval
Instruments:   pcturban faminc reg2 reg3 reg4
------------------------------------------------------------------------------
Note that the coefficients on hsngval, pcturban, and the constant are
identical in the augmented regression and the IV regression; only the
standard errors differ.
Reference
Davidson, R., and J. G. MacKinnon. 1993. Estimation and Inference in
Econometrics. New York: Oxford University Press.