# st: RE: Xtivreg2 with two endogenous variables

 From "Schaffer, Mark E" <[email protected]>
Subject st: RE: Xtivreg2 with two endogenous variables
Date Sat, 16 Apr 2011 14:29:26 +0100

Vikram,

> Hi all,
>
> I am using xtivreg2 for running my model with two
> regressions. My model is
>
> xtivreg2 y1 x1 x2 x3 x4 (x5 x6 = z1 z2 z3 z4) yeardummy industrydummy, fe endog(x5 x6) first
>
> The problem I am having is that the instruments for
> endogenous variable x5 are not significant for endogenous
> variable x6, and the instruments for x6 are not significant
> for x5, in the first-stage regressions. Because of this I get
> a very low Cragg-Donald F stat in the 2SLS regression.

That's not why the CD stat is low.

Say that you've chosen z1 and z2 because you think they're correlated
with x5, and z3 and z4 because they're correlated with x6.

It is perfectly possible for you to get a huge CD statistic even if z3
and z4 are completely uncorrelated with x5, and z1 and z2 are completely
uncorrelated with x6.

The CD stat is a measure of the rank of the matrix of reduced-form (RF)
coefficients for x5 and x6 (see, e.g., -help ranktest- for a short
discussion and some examples).  For your model to be identified, the
matrix has to be full column rank - in this case, 2.  Intuitively, if
you can separately identify both (i.e., 2) sets of RF coefficients, the
model is identified.  The situation where z3 and z4 get zero coeffs in
the x5 eqn, and z1 and z2 get zero coeffs in the x6 eqn, is not a
problem at all.

Your low CD stat is telling you that the model is underidentified.  In
other words, you cannot separately identify the 2 sets of RF
coefficients.  One or the other, maybe, but not both.
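
To make the rank condition concrete, here is a minimal sketch in plain Python (not Stata, and not what -ranktest- or xtivreg2 actually compute). The coefficient matrices are hypothetical population values made up for illustration, with rows for z1-z4 and columns for x5 and x6, and `matrix_rank` is a small helper written for this example. The CD statistic works with noisy sample estimates of this matrix; the sketch only shows the underlying rank idea.

```python
def matrix_rank(rows, tol=1e-12):
    """Rank of a small matrix via Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in r] for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Use the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot: this column adds nothing to the rank
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
        if rank == n_rows:
            break
    return rank


# Hypothetical RF coefficient matrices (assumed values, not estimates).
# Rows: z1, z2, z3, z4.  Columns: x5 eqn, x6 eqn.

# z1, z2 matter only for x5 and z3, z4 only for x6: still full column rank.
pi_block = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

# No instrument moves x6 at all: the x6 column is zero.
pi_only_x5 = [[1.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

# Every instrument moves x6 exactly twice as much as x5: columns collinear.
pi_collinear = [[1.0, 2.0], [0.5, 1.0], [1.0, 2.0], [0.5, 1.0]]

for name, pi in [("block", pi_block),
                 ("only x5", pi_only_x5),
                 ("collinear", pi_collinear)]:
    print(name, "rank =", matrix_rank(pi))
```

The block pattern has rank 2, so the zero cross-coefficients do no harm. The other two matrices have rank 1, which is the underidentified case: the instruments cannot separately pin down both columns, even though in the collinear case every instrument is "relevant" for both x5 and x6.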

You should look at the Angrist-Pischke (AP) first-stage F statistics.
These are reported by (xt)ivreg2, with one for each endogenous
regressor.  You may well find that, say, the one for x5 is large and the
one for x6 is small, suggesting, in effect, that you are identifying the
RF coefficients for x5 but not the ones for x6.

--Mark

