Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: RE: ivreg2 and xtoverid error

From: John Antonakis
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: ivreg2 and xtoverid error
Date: Sun, 04 Apr 2010 12:27:44 +0200

Hi Kit and Mark:

On another note, I was thinking more about the power issue and I had an idea. One way to get around this problem with too many dummy independent variables is to use Mundlak's trick to model the fixed effects:

Mundlak, Y. (1978). On the Pooling of Time Series and Cross Section Data. Econometrica, 46(1), 69-85.

That is, taking the cluster means of the endogenous regressors and using them as instruments, instead of the fixed effects, gives almost exactly the same results. Here they are, for a larger sample and with two more instruments chucked in to overidentify the equation (so here I have plenty of power), and the stats look fine:
```
. xi: ivreg2 y (x1-x13 = mean_x1-mean_x13 z1 z2) , cluster(lead_n) endog(iia-lf em_new-om_new)

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on lead_number

Number of clusters (lead_number) = 345                Number of obs =     2616
                                                      F( 13,   344) =    80.76
                                                      Prob > F      =   0.0000
Total (centered) SS     =  1870.655581                Centered R2   =   0.6287
Total (uncentered) SS   =      25133.5                Uncentered R2 =   0.9724
Residual SS             =  694.5273064                Root MSE      =    .5153

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .4514196   .0571024     7.91   0.000     .3395009    .5633382
          x2 |   -.094049    .058022    -1.62   0.105    -.2077699     .019672
          x3 |  -.0285798   .0428113    -0.67   0.504    -.1124884    .0553289
          x4 |   .0784366   .0525558     1.49   0.136    -.0245709     .181444
          x5 |   .0810408   .0509977     1.59   0.112    -.0189129    .1809944
          x6 |    .148615   .0593743     2.50   0.012     .0322436    .2649865
          x7 |  -.1724113   .0287107    -6.01   0.000    -.2286832   -.1161394
          x8 |    .080845   .0375199     2.15   0.031     .0073073    .1543827
          x9 |  -.1967606   .0653847    -3.01   0.003    -.3249123   -.0686088
         x10 |   .1361337   .0557109     2.44   0.015     .0269424     .245325
         x11 |   .1213476   .0483033     2.51   0.012     .0266748    .2160204
         x12 |   .1441845   .0439615     3.28   0.001     .0580214    .2303475
         x13 |   .0320405   .0391324     0.82   0.413    -.0446575    .1087385
       _cons |   .5197885   .1609105     3.23   0.001     .2044097    .8351672
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):             90.529
                                                   Chi-sq(3) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               26.380
                         (Kleibergen-Paap rk Wald F statistic):       3037.437
Stock-Yogo weak ID test critical values:                       <not available>
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         3.070
                                                   Chi-sq(2) P-val =    0.2154
-endog- option:
Endogeneity test of endogenous regressors:                              27.751
                                                  Chi-sq(13) P-val =    0.0098
```
The results are essentially the same when using my old estimation procedure (so it is probably rounding that explains the very slight differences):
```
Warning - collinearities detected
Vars dropped:       [snipped]

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on lead_number

Number of clusters (lead_number) = 345                Number of obs =     2616
                                                      F( 13,   344) =    80.99
                                                      Prob > F      =   0.0000
Total (centered) SS     =  1870.655581                Centered R2   =   0.6289
Total (uncentered) SS   =      25133.5                Uncentered R2 =   0.9724
Residual SS             =  694.2693929                Root MSE      =    .5152

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .4503576   .0572067     7.87   0.000     .3382346    .5624806
          x2 |   -.096804   .0580199    -1.67   0.095    -.2105209     .016913
          x3 |  -.0242708   .0428203    -0.57   0.571     -.108197    .0596553
          x4 |   .0851285   .0524755     1.62   0.105    -.0177215    .1879785
          x5 |    .076775   .0506549     1.52   0.130    -.0225068    .1760568
          x6 |   .1495938   .0591696     2.53   0.011     .0336235     .265564
          x7 |  -.1706418   .0287424    -5.94   0.000    -.2269759   -.1143076
          x8 |   .0777304   .0375514     2.07   0.038     .0041309    .1513299
          x9 |  -.1975138   .0649414    -3.04   0.002    -.3247966    -.070231
         x10 |   .1376252   .0556762     2.47   0.013     .0285017    .2467486
         x11 |   .1159876   .0473009     2.45   0.014     .0232795    .2086957
         x12 |   .1450098   .0438989     3.30   0.001     .0589695      .23105
         x13 |   .0325769   .0387352     0.84   0.400    -.0433428    .1084965
       _cons |   .5153171   .1612287     3.20   0.001     .1993147    .8313194
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):       330.961
                                                 Chi-sq(331) P-val =    0.4903
```
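
A minimal simulated sketch of why the two instrument sets behave so similarly. It is not a rerun of the models above. With cluster dummies as the excluded instruments, the first-stage fitted values are the cluster means of x. With the cluster mean as the only excluded instrument, the fitted values are again the cluster means, so the 2SLS point estimates should coincide up to numerical precision. The data, the seed, and the variable names (id, u, e, x, y, mean_x) are all made up for illustration, and the built-in -ivregress- is used so that nothing needs to be installed.

```stata
* Simulated panel: 50 clusters of 8 observations (hypothetical data)
clear
set seed 12345
set obs 400
generate id = ceil(_n/8)

* Cluster-level component of x, constant within each cluster
bysort id: generate u = rnormal() if _n == 1
bysort id: replace u = u[1]

* x is endogenous through the observation-level error e
generate e = rnormal()
generate x = u + 0.5*e + rnormal()
generate y = 1 + 0.5*x + e

* Cluster mean of the endogenous regressor
egen mean_x = mean(x), by(id)

* (a) cluster dummies as the excluded instruments
ivregress 2sls y (x = i.id), vce(cluster id)
scalar b_dummies = _b[x]

* (b) the cluster mean as the single excluded instrument
ivregress 2sls y (x = mean_x), vce(cluster id)
scalar b_mean = _b[x]

display "difference in point estimates: " scalar(b_dummies) - scalar(b_mean)
```

If the same equivalence carries over to the models above, the slight differences between the two sets of estimates may owe more to the extra instruments z1 and z2 than to rounding. That is a conjecture and has not been checked on the original data.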
```
Best,
J.

____________________________________________________

Prof. John Antonakis, Associate Dean
Department of Organizational Behavior
University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Switzerland

Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305

Faculty page:
http://www.hec.unil.ch/people/jantonakis

Personal page:
http://www.hec.unil.ch/jantonakis
____________________________________________________

On 03.04.2010 22:46, John Antonakis wrote:
> Thanks, Kit.
>
```
> One small bit of evidence that the fixed effects don't correlate with the error might come from the -xtoverid- test for random vs. fixed effects. The classic interpretation of the test is that, if it is significant, the endogenous regressors correlate with y when the fixed effects are not included. Equally, however, a significant test also means that the fixed effects correlate with y when controlling for the endogenous regressors (that is, the fixed effects correlate with the residual variance of y when controlling for the endogenous regressors). This test is akin to a mediation test as follows, where x is the endogenous regressor and z is exogenous:
>
> 1. regress y on x (obtain significant coefficient)
> 2. regress y on z (obtain significant coefficient)
> 3. regress y on x and z (obtain significant coefficient only for x)
>
> If in step 3 the coefficient of z becomes non-significant (when it was significant before), then we have evidence of mediation; that is, the correlation of x with y is stronger than that of z with y while controlling for the relation of x and z. The -xtoverid- test does an analogous thing: if it is non-significant, we know that the endogenous regressors account for all the variance in y and that the instruments don't correlate with y when controlling for the regressors; thus, as exogenous instruments, they should not correlate with the residual. The Sargan-Hansen statistic I got from -xtoverid- is 12.979, Chi-sq(13), P-value = 0.4494.
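
The three mediation steps listed above can be sketched on simulated data. This is only an illustration under assumed values: the seed, sample size, and coefficients are hypothetical, and z is built to affect y only through x (full mediation).

```stata
* Simulated full mediation: z affects y only through x (hypothetical values)
clear
set seed 2010
set obs 500
generate z = rnormal()
generate x = 0.8*z + rnormal()
generate y = 0.6*x + rnormal()

* Step 1: y on x
regress y x

* Step 2: y on z (picks up the indirect effect of z through x)
regress y z

* Step 3: y on x and z (z has no direct effect in this design)
regress y x z
```

In this design the coefficient on z in step 3 should be close to zero, because z has no direct path to y once x is held fixed.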
>
> Also, I estimated the following fixed-effects model, a direct analog of the above mediation effect model:
>
```
> est store fe
> hausman fe, force
```
>
> This test is non-significant too (though I should not be using the Hausman test with a robust estimator). Thus controlling for the endogenous variable, the fixed-effects do not correlate with y. I hope that what I have said makes sense.
>
> Also, concerning the power issue, on one hand, with more instruments the model has more ways to go wrong so ceteris paribus, power to detect misspecification goes up with more degrees of freedom, correct? On the other hand, with weak instruments the power of the test is reduced. I guess a simulation would be needed to settle this.
>
> Anyway, you are right in that it is possible that my instruments are weak and thus introduce bias. I have taken note of this limitation. I actually have direct measures of the leader's ability, personality, and other things, though I am saving them for another publication. I will check though to see what they give too in comparison to the fixed-effects instruments.
>
```
> Best regards,
> John.
>
> On 03.04.2010 17:14, Kit Baum wrote:
>> John said
>>
```
>> I get exactly the same estimates and standard errors with -ivreg- and -ivregress-, with the cluster robust variance estimator. When using -ivreg2- with the -noid- option it works and I get the same estimates; more importantly, I also get the Hansen J-test, which is what interests me most (the -ivregress- estimator does not report an overid for cluster-robust vce's):
>>
>> Hansen J statistic (overidentification test of all instruments): 402.476, Chi-sq(404) P-val = 0.5121
>>
>> The one thing to worry about here is that which arises with Sargan-Hansen tests after xtabond or user-written xtabond2: the overid test may not have much power when confronted with hundreds of instruments.
>>
>> You also mention the test provided by 'estat endogenous', which could be done in ivreg2 via the endog() option. This Durbin-Wu-Hausman test is merely telling you that you shouldn't use OLS on this model. But you're probably convinced of that in any event. Rejecting OLS as inconsistent does not imply that IV is consistent; that depends on the overid test of the excluded instruments (which you pass, but as mentioned may have low power to detect a problem) and the proper specification of the model. You might want to use ivreg2's orthog() option to consider just the non-dummy instruments as a group, and check to see that that Hansen "GMM distance" test also supports the notion that those excluded instruments are suitably orthogonal to the error.
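
A rough sketch of the orthog() check suggested above, run on simulated data. It assumes the user-written -ivreg2- and its dependency -ranktest- are installed from SSC. The instruments z1, z2, and z3, the cluster identifier id, and all coefficients are hypothetical; z3 stands in for the group of instruments whose orthogonality is in question.

```stata
* Assumes: ssc install ivreg2 and ssc install ranktest (user-written commands)
* All data below are simulated for illustration
clear
set seed 404
set obs 2000
generate id = ceil(_n/10)

* e is the structural error; x is endogenous because it loads on e
generate e  = rnormal()
generate z1 = rnormal()
generate z2 = rnormal()
generate z3 = rnormal()
generate x  = 0.5*z1 + 0.5*z2 + 0.5*z3 + 0.7*e + rnormal()
generate y  = 1 + 0.5*x + e

* Hansen J for all instruments, plus a C (GMM distance) test on z3 alone
ivreg2 y (x = z1 z2 z3), cluster(id) orthog(z3)
```

Because z3 is generated independently of e here, the C test should typically fail to reject. For the model discussed in this thread, the variables listed in orthog() would be the non-dummy instruments.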
>>
```
>> Kit Baum | Boston College Economics & DIW Berlin | http://ideas.repec.org/e/pba1.html
>> An Introduction to Stata Programming | http://www.stata-press.com/books/isp.html
>> An Introduction to Modern Econometrics Using Stata | http://www.stata-press.com/books/imeus.html
```
```
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```