Statalist The Stata Listserver


st: ST: IV list in ivreg and ivreg2: procedure to test for endogeneity of added variable if one of the original variables is endogenous?

From   "Vergeer, Robert" <>
To   <>
Subject   st: ST: IV list in ivreg and ivreg2: procedure to test for endogeneity of added variable if one of the original variables is endogenous?
Date   Wed, 30 Aug 2006 16:30:11 +0200

While browsing through the Statalist archives, I found that I am facing
a problem similar to the one discussed below by Mark and Eddy. Maybe I missed
it, but I couldn't find a suitable solution to the problem. Below I
propose a possible solution.
I am interested in any comments on the validity of this solution (or
other ideas to tackle this problem).

The problem concerns the limited control over the variables used as
instruments in the IV-regression.

I, like Eddy, want to replicate, and then augment, a regression
performed by others. These others, seemingly erroneously, did not deal
with an endogeneity problem (say, for variable X_OriginalEndogenous).
Because I am interested in comparing an augmented regression with the
original regression, I don't want to deal with the endogeneity problem
of X_OriginalEndogenous; my interest is in the coefficient of an
added variable (say, X_added).
I do want to test for the possible endogeneity of this added explanatory
variable X_added, but I do not want to use X_OriginalEndogenous as an
(included) instrument, because doing so adversely affects the overall
validity of the instruments used for X_added (in my case, it leads to
rejection of the validity of the instruments).

To summarize:

The original regression, with which I want to compare my augmented
regression, reads:

Y = X_originalEndogenous, X_1, X_...

My augmented regression reads:

Y = X_originalEndogenous, X_1, X_..., X_added

To test for possible endogeneity of X_added, I used:

ivreg2 Y X_1 X_... X_added (X_originalEndogenous = INSTRUMENTS), orthog(X_added)

and used the C statistic of the difference-in-Sargan/Hansen
test as an indicator of the possible endogeneity of X_added. (So that
with this test, X_originalEndogenous is not used as an instrument for

Then I used:

ivreg2 Y X_1 X_... (X_added X_originalEndogenous = INSTRUMENTS)

and used the J statistic to test the validity of the instruments (so
X_originalEndogenous is not an included instrument when I test the
validity of the instruments for X_added).
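
The two steps above can be sketched on simulated data. This is only an illustration under stated assumptions: the variables z1-z3, x1, x_orig and x_added and the data-generating process are hypothetical stand-ins for INSTRUMENTS, X_1, X_originalEndogenous and X_added, and ivreg2 is a user-written command that has to be installed first (ssc install ivreg2). It shows the syntax of the two calls, not whether the procedure is valid.

```stata
* Sketch only: simulated data, all variable names are hypothetical.
* Requires the user-written command ivreg2 (ssc install ivreg2).
clear
set seed 12345
set obs 1000

generate z1 = rnormal()
generate z2 = rnormal()
generate z3 = rnormal()
generate x1 = rnormal()
generate u  = rnormal()

* x_orig is endogenous by construction (it depends on the disturbance u)
generate x_orig  = 0.5*z1 + 0.5*z2 + 0.5*u + rnormal()
* x_added is exogenous by construction in this simulation
generate x_added = 0.5*z3 + rnormal()
generate y = 1 + x_orig + x1 + x_added + u

* Step 1: x_added treated as exogenous; orthog() requests the
* C (difference-in-Sargan) statistic for x_added
ivreg2 y x1 x_added (x_orig = z1 z2 z3), orthog(x_added)

* Step 2: both regressors treated as endogenous; the reported
* Sargan/Hansen J statistic tests the overidentifying restrictions
ivreg2 y x1 (x_added x_orig = z1 z2 z3)
```

As far as I understand the ivreg2 help file, the C statistic in step 1 is computed as the difference between the J statistic of the step 1 specification and that of a specification in which x_added is no longer used as an instrument, so the two steps are closely related rather than independent checks.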

Does anyone know if this procedure is valid?

Thanks a lot,

Robert Vergeer

> Hi Mark,
> Thanks for your reply; please see my response below.
> Monday, April 19, 2004, Mark Schaffer wrote:
>>> Dear listers,
>>> When using ivreg or ivreg2 to do a 2SLS estimation, all the RHS
>>> variables except those explicitly specified as endogenous are
>>> assumed to be exogenous and valid IV. Call those the "included
>>> exogenous variables". However, I happen to have a case in which not
>>> all the "included exogenous variables" are valid IV, and I am asking
>>> whether users can have better control over the list of IV to be used
>>> in the 2SLS estimation.
>> I'm not sure this makes sense.  A "valid" IV is one that satisfies
>> the orthogonality conditions; this is synonymous with "exogenous".
>> If one of your regressors isn't a valid IV, then it isn't exogenous
>> and you need to treat it as endogenous.  This is the way that IV
>> works (or, in modern presentations, GMM with IV as a special case).
>> In your example, ln(P) might or might not be orthogonal to the
>> disturbance term.  If it is, it's a valid IV and you can treat it as
>> exogenous; if it isn't, it's not a valid IV and you should treat it
>> as endogenous.  It sounds like you lean towards the latter, which
>> looks like a reasonable way to proceed (so long as you have enough
>> other valid excluded instruments to identify the equation, and they
>> are "relevant" as well as "valid").
>   ln(W/P) = a0 + a1*ln(P) + a2*y + B*X,
> It's correct that ln(P) is endogenous because the dependent variable
> is ln(W/P)=ln(W) - ln(P),

This isn't necessarily the case.  It's quite possible that ln(P) could
be exogenous even though it's used to calculate ln(W/P).  It depends on
your priors and the data generating process (and everything else).
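
A short worked version of this point, as a sketch: write u for the disturbance in Eddy's equation (the symbol u is an assumption, it does not appear in the original posts). Adding ln(P) to both sides changes only the coefficient on ln(P) and leaves the disturbance unchanged, so whether ln(P) is exogenous depends on its correlation with u, not on the fact that it enters the dependent variable.

```latex
\begin{align*}
\ln(W/P) &= a_0 + a_1 \ln(P) + a_2\, y + B X + u \\
\ln(W)   &= a_0 + (1 + a_1) \ln(P) + a_2\, y + B X + u \\
\text{exogeneity of } \ln(P):\quad & E[\ln(P)\, u] = 0
\end{align*}
```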

> but the problem is that I do not want to deal with the endogeneity of
> ln(P), and I don't want it to be part of the IV for the other
> endogenous variable, y,

I'm not sure what you mean here.  If ln(P) is an exogenous regressor,
then it's an "included IV" by definition.  Not a problem for you.

> whose coefficient is the
> ultimate concern of this study.
> I do not want to treat the endogeneity of ln(P) because (1) the ln(P)
> variable is only to control for heterogeneous preference, and we do not
> really care about the coefficient of ln(P), (2) to the extent that
> ln(P) is independent of y and X, the endogeneity problem of ln(P)
> does not have an adverse effect on the coefficients of y and X, (3) the
> work I want to follow/replicate does not treat its endogeneity
> (Carroll and Samwick, "The Nature of Precautionary Wealth", Journal of
> Monetary Economics, 1997), and (4) good IV for ln(P) may be difficult
> to come by.

If the work that you want to replicate treats ln(P) as exogenous, and
good IVs for ln(P) are hard to come by, then the decision to treat it as
exogenous seems to be defensible in your case.

As I said in my previous posting, either you treat ln(P) as exogenous or
endogenous.  There isn't any third way, at least in IV-GMM.  It looks
like either approach is legitimate in your case.
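
The two alternatives can be sketched in ivreg2 syntax. This is a hedged illustration on simulated data: lnwp, lnp, y, x and the instruments z1-z3 are hypothetical names standing in for ln(W/P), ln(P), y, X and whatever excluded instruments are actually available, and ivreg2 must be installed (ssc install ivreg2).

```stata
* Sketch only: simulated data, all variable names are hypothetical.
* Requires the user-written command ivreg2 (ssc install ivreg2).
clear
set seed 2004
set obs 1000

generate z1 = rnormal()
generate z2 = rnormal()
generate z3 = rnormal()
generate x  = rnormal()
generate u  = rnormal()

generate lnp  = 0.5*z3 + rnormal()
generate y    = 0.5*z1 + 0.5*z2 + 0.5*u + rnormal()
generate lnwp = 1 + 0.3*lnp + 0.8*y + 0.5*x + u

* Option A: lnp treated as exogenous; it is then automatically an
* included instrument and is listed outside the parentheses
ivreg2 lnwp lnp x (y = z1 z2)

* Option B: lnp treated as endogenous; it moves inside the parentheses
* and needs at least one additional excluded instrument (here z3)
ivreg2 lnwp x (y lnp = z1 z2 z3)
```

In this syntax there is no way to list a regressor that is neither instrumented nor used as an instrument, which matches the point above that a regressor is treated as either exogenous or endogenous.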

Hope this helps.


> I know this is a rather uncommon case, so I would appreciate any
> suggestion.
> Eddy

Ir. Robert Vergeer
Department of Economics of Innovation
TU Delft, faculty of Technology, Policy and Management
TU Delft/ faculteit Techniek, Bestuur en Management Jaffalaan 5
2628 BX  Delft
015 2788928

-----Original Message-----
[] On Behalf Of Marcello
Sent: Monday, 7 August 2006 23:32
Subject: st: 2 parameter MLIRT polychotomous model simulation

I am trying to simulate 2-Parameter MLIRT polytomous models in Stata.
Could someone pls suggest how I can do it? I came across simirt program
but I dont understand how to specify that each item should have, say, 4
categories and specify its discrimination parameter (because the rsm
option cannot be used with the disc option).

Could someone clarify/give suggestions please?


Prathiba Natesan