Note: This FAQ is for users of Stata 5.

It is not relevant for Stata 6, which includes the hausman command to perform the Hausman specification test.

Stata 5: How do I test endogeneity? How do I perform a Durbin–Wu–Hausman test?

Title   Stata 5: Durbin–Wu–Hausman test (augmented regression test) for endogeneity
Author Ronna Cong, StataCorp

Before estimating the following simultaneous equations,

    z = a0 + a1*x1 + a2*x2 + epsilon1
    y = b0 + b1*z + b2*x3 + epsilon2

one should decide whether instrumental-variables estimation is necessary, i.e., whether ordinary least squares (OLS) applied to the equation for y yields consistent estimates when z may be endogenous.

Davidson and MacKinnon (1993) suggest an augmented regression test (the DWH test). It is formed by regressing each endogenous right-hand-side variable on all exogenous variables and then adding the residuals from those regressions to the original model. In our example, we would first run the regression

    z = c0 + c1*x1 + c2*x2 + c3*x3 + epsilon3

obtain the residuals z_res, and then run the augmented regression

    y = d0 + d1*z + d2*x3 + d3*z_res + epsilon4

If d3 is significantly different from zero, we reject the null hypothesis that z is exogenous, and OLS is not consistent.
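
One way to see why the coefficient on z_res carries the test is the following sketch. It is not part of the original FAQ: the symbols v, rho, and nu are introduced here only for illustration, and the linear-projection step is an assumption of the sketch rather than a result quoted from Davidson and MacKinnon.

```latex
% v is the error term of the regression of z on all exogenous variables
% (epsilon3 in the text above).
\begin{align*}
z &= c_0 + c_1 x_1 + c_2 x_2 + c_3 x_3 + v \\
% Assumption: project the structural error on v; nu is uncorrelated
% with v and with x_1, x_2, x_3.
\epsilon_2 &= \rho\, v + \nu \\
% Substituting into the equation for y gives the augmented form,
% in which d_3 plays the role of rho.
y &= b_0 + b_1 z + b_2 x_3 + \rho\, v + \nu
\end{align*}
```

Under these assumptions, z is correlated with epsilon2 only through v, so rho = 0 corresponds to z being exogenous. Replacing the unobserved v with the residuals z_res gives the augmented regression, and a test of d3 = 0 acts as a test of rho = 0.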

For example, let us assume that you wish to estimate

    hsngval = a0 + a1*faminc + a2*reg2 + a3*reg3 + a4*reg4 + epsilon1
    rent = b0 + b1*hsngval + b2*pcturban + epsilon2

To test the endogeneity of hsngval, regress it on all exogenous variables, save the residuals, and add them to the rent equation:

 . regress hsngval faminc reg2-reg4 pcturban
 
       Source |       SS       df       MS              Number of obs =      50
 -------------+------------------------------           F(  5,    44) =   19.66
        Model |  8.4187e+09     5  1.6837e+09           Prob > F      =  0.0000
     Residual |  3.7676e+09    44  85626930.6           R-squared     =  0.6908
 -------------+------------------------------           Adj R-squared =  0.6557
        Total |  1.2186e+10    49   248700555           Root MSE      =  9253.5
 
 ------------------------------------------------------------------------------
      hsngval |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
       faminc |   2.731324   .6818931     4.01   0.000     1.357058    4.105589
         reg2 |  -5095.038   4122.112    -1.24   0.223    -13402.61    3212.533
         reg3 |   -1778.05   4072.691    -0.44   0.665    -9986.019    6429.919
         reg4 |   13413.79   4048.141     3.31   0.002     5255.296    21572.28
     pcturban |   182.2201   115.0167     1.58   0.120    -49.58092    414.0211
        _cons |  -18671.87   11995.48    -1.56   0.127    -42847.17    5503.438
 ------------------------------------------------------------------------------
 
 . predict hsng_res, res
 
 . regress rent hsngval pcturban hsng_res
 
       Source |       SS       df       MS              Number of obs =      50
 -------------+------------------------------           F(  3,    46) =   47.05
        Model |  46189.1513     3  15396.3838           Prob > F      =  0.0000
     Residual |  15053.9687    46   327.26019           R-squared     =  0.7542
 -------------+------------------------------           Adj R-squared =  0.7382
        Total |    61243.12    49  1249.85959           Root MSE      =   18.09
 
 ------------------------------------------------------------------------------
         rent |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
      hsngval |   .0022398   .0002681     8.36   0.000     .0017003    .0027794
     pcturban |    .081516   .2438355     0.33   0.740    -.4092993    .5723313
     hsng_res |  -.0015889   .0003984    -3.99   0.000    -.0023908    -.000787
        _cons |   120.7065   12.42856     9.71   0.000     95.68912    145.7239
 ------------------------------------------------------------------------------
 
 . test hsng_res
 
  ( 1)  hsng_res = 0.0
 
        F(  1,    46) =   15.91
             Prob > F =    0.0002

The small p-value (0.0002) leads us to reject the null hypothesis that hsngval is exogenous, so OLS estimates of the rent equation are not consistent.
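
As a quick arithmetic check on the output above: with a single restriction, the F statistic reported by test should equal the square of the t statistic on hsng_res. Using the displayed coefficient and standard error (so the agreement holds only up to rounding):

```latex
\[
t = \frac{-0.0015889}{0.0003984} \approx -3.9882,
\qquad
t^2 \approx 15.906 \approx 15.91 = F(1, 46)
\]
```

The same test can therefore be read directly from the t statistic and p-value on the hsng_res row of the regression table.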

To perform an IV regression, use the ivreg command:

 . ivreg rent pcturban (hsngval = faminc reg2-reg4)
    
 Instrumental variables (2SLS) regression
 
       Source |       SS       df       MS              Number of obs =      50
 -------------+------------------------------           F(  2,    47) =   42.66
        Model |  36677.4033     2  18338.7017           Prob > F      =  0.0000
     Residual |  24565.7167    47  522.674823           R-squared     =  0.5989
 -------------+------------------------------           Adj R-squared =  0.5818
        Total |    61243.12    49  1249.85959           Root MSE      =  22.862
     
 ------------------------------------------------------------------------------
         rent |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
      hsngval |   .0022398   .0003388     6.61   0.000     .0015583    .0029213
     pcturban |    .081516   .3081528     0.26   0.793    -.5384074    .7014394
        _cons |   120.7065   15.70688     7.68   0.000     89.10834    152.3047
 ------------------------------------------------------------------------------
 Instrumented:  hsngval
 Instruments:   pcturban faminc reg2 reg3 reg4
 ------------------------------------------------------------------------------

Note that the coefficients on hsngval, pcturban, and the constant are the same in the augmented regression and in the ivreg estimation; however, the standard errors differ.
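
The following sketch offers one interpretation of this pattern; it is not stated in the original FAQ. The first-stage residuals hsng_res are orthogonal to pcturban and to the fitted values of hsngval, so both estimations yield the same coefficients and differ only in the residuals used to estimate the error variance. If that reasoning holds, the ratio of the standard errors should match the ratio of the two Root MSE values, and the displayed output agrees up to rounding:

```latex
% Ratio of Root MSE values: ivreg over augmented regression
\[
\frac{22.862}{18.09} \approx 1.264
\]
% Ratios of standard errors for hsngval, pcturban, and _cons
\[
\frac{0.0003388}{0.0002681} \approx 1.264,
\qquad
\frac{0.3081528}{0.2438355} \approx 1.264,
\qquad
\frac{15.70688}{12.42856} \approx 1.264
\]
```

For inference on the rent equation itself, use the standard errors reported by ivreg; the augmented regression is used here only to test hsng_res.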

Reference

Davidson, R., and J. G. MacKinnon. 1993. Estimation and Inference in Econometrics. New York: Oxford University Press.