
# st: RE: Multiple endogenous variables IV. Estimating the first stage regression with only a subset of the instruments

From: "Millimet, Daniel"
To: "statalist@hsphsun2.harvard.edu"
Subject: st: RE: Multiple endogenous variables IV. Estimating the first stage regression with only a subset of the instruments
Date: Tue, 25 Oct 2011 13:39:23 +0000

```No, the system is under-identified.  The easiest way to see that your solution would still fail (even if it were acceptable to only include a subset of the exogenous vars in each first-stage) is that the fitted values for x1 and x2 will be linearly dependent since z1 and z2 are linearly dependent.

Daniel

****************************************************
Daniel L. Millimet
Research Fellow, IZA
Professor, Department of Economics
Box 0496
SMU
Dallas, TX 75275-0496
phone: 214.768.3269
fax: 214.768.1821
web: http://faculty.smu.edu/millimet
****************************************************

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nicolai Borgen
Sent: Tuesday, October 25, 2011 8:00 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: Multiple endogenous variables IV. Estimating the first stage regression with only a subset of the instruments

I have a question regarding the use of instrumental variables when having multiple endogenous variables. I am estimating the economic return of attending different educational institutions X1, X2 and X3.
For simplicity, the models are presented without control variables and constant.

[A1]		Y= δ1*X1 + δ2*X2 + δ3*X3 + ε

Since I have strong reasons to suspect that X1, X2 and X3 are correlated with ε, it is necessary to use an instrumental variable approach to estimate the return. I therefore use proximity (in km) between municipality of adolescence and these institutions as instruments (Z1, Z2 and Z3).
Using regression commands such as ivregress and ivreg2 in STATA, all instruments are included in the first stage for each endogenous variable X1, X2 and X3:

[B1]		X1 = B1*Z1 + B2*Z2 + B3*Z3 + ε
[C1]		X2 = B1*Z1 + B2*Z2 + B3*Z3 + ε
[D1]		X3 = B1*Z1 + B2*Z2 + B3*Z3 + ε

My problem occurs because X1 and X2 are located in the same city, and
Z1 and Z2 are therefore perfectly collinear. Thus, in [B1] and [C1] either Z1 or Z2 is dropped. My model is therefore basically under-identified.  Based on this, I have the following two questions:

(1) Is it possible to estimate the first stage regression using a subset of the instruments?

[B2]		X1 = B1*Z1 + ε
[C2]		X2 = B2*Z2 + ε
[D2]		X3 = B3*Z3 + ε

(2) This STATA page http://www.stata.com/support/faqs/stat/ivr_faq.html
shows an example of how to perform the two-step computations for the instrumental variable estimator without using ivregress or ivreg2. Is this a feasible solution? Are there any STATA commands I can use that do this?

Many thanks,
Nicolai Borgen
University of Oslo
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
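Daniel's point about the fitted values can be checked numerically. The following is only a rough sketch, not anything from the thread itself: the data are simulated, the coefficients (0.5 and -0.3), the sample size and the helper names `dot` and `fitted_no_constant` are all made up for illustration, and it uses plain Python rather than Stata. It runs the single-instrument first stages [B2] and [C2] (no constant, as in the post) with Z1 and Z2 identical, then looks at whether the two sets of fitted values are linearly dependent.

```python
import random

random.seed(0)
n = 200

# Hypothetical data: X1 and X2 sit in the same city, so the two distance
# instruments are identical (perfectly collinear).
z1 = [random.uniform(1, 100) for _ in range(n)]
z2 = list(z1)

# Hypothetical endogenous regressors, each driven by its "own" instrument plus noise.
x1 = [0.5 * z + random.gauss(0, 5) for z in z1]
x2 = [-0.3 * z + random.gauss(0, 5) for z in z2]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(u * v for u, v in zip(a, b))


def fitted_no_constant(x, z):
    """Fitted values from OLS of x on the single regressor z, without a constant."""
    b = dot(z, x) / dot(z, z)
    return [b * v for v in z]


# First stages [B2] and [C2]: each endogenous variable on its own instrument only.
x1_hat = fitted_no_constant(x1, z1)
x2_hat = fitted_no_constant(x2, z2)

# If the fitted values are linearly dependent, x2_hat is a constant multiple of x1_hat ...
ratios = [b / a for a, b in zip(x1_hat, x2_hat)]
ratio_spread = max(ratios) - min(ratios)

# ... and the Gram matrix of (x1_hat, x2_hat) is singular, so the second-stage
# regression cannot separate delta1 from delta2.
g11, g12, g22 = dot(x1_hat, x1_hat), dot(x1_hat, x2_hat), dot(x2_hat, x2_hat)
rel_det = (g11 * g22 - g12 ** 2) / (g11 * g22)

print("spread of x2_hat / x1_hat:", ratio_spread)
print("relative determinant of Gram matrix:", rel_det)
```

Both printed numbers should be zero up to floating-point error, which is the under-identification Daniel describes: dropping instruments from each first stage does not create the independent variation that is missing. If a constant were included in the first stages, the same argument would go through with x2_hat being a linear combination of the constant and x1_hat.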