# st: Testing for instrument relevance and overidentification when the endogenous variable is used in interaction terms

From: Jason Wichert
To: statalist@hsphsun2.harvard.edu
Subject: st: Testing for instrument relevance and overidentification when the endogenous variable is used in interaction terms
Date: Wed, 29 May 2013 22:18:01 +0200

```Dear Statalisters,

I have encountered some difficulties with 2SLS estimation when the
endogenous variable is also used to construct interaction terms.

After digging through the archives, I found a lot of helpful comments
concerning the procedure:

http://www.stata.com/statalist/archive/2012-05/msg00970.html
http://www.stata.com/statalist/archive/2011-08/msg01485.html
http://www.stata.com/statalist/archive/2011-12/msg00705.html
http://www.stata.com/statalist/archive/2010-04/msg00759.html
http://www.stata.com/statalist/archive/2005-05/msg00150.html
http://www.stata.com/statalist/archive/2008-10/msg01009.html
http://www.stata.com/statalist/archive/2004-08/msg00779.html

Following this advice, I am running an equation of the basic form

ivreg2 y ex (en en_ex = z ex_z)

In my case, there are two exogenous variables interacted with the
endogenous variable. Furthermore, I need interactions of the squares
of those exogenous variables with the endogenous variable. Leaving
additional control variables and further instruments aside, this gives:

ivreg2 y ex1 ex2 (en ex1_en ex2_en (ex1)^2_en (ex2)^2_en = z ex1_z
ex2_z (ex1)^2_z (ex2)^2_z)

So far, so good. However, I am not sure how exactly to examine
instrument relevance and exogeneity, or which statistics/tests to
report.

Regarding instrument relevance, as assessed by the first-stage
F statistic: the F statistics on “en” clearly differ depending on
whether I instrument solely for “en”, or whether I also instrument for
the linear and non-linear interaction terms built around “en”. Which F
statistic is the correct one to refer to?

Given that I have multiple instruments Z, I am also not sure which
overidentification tests and results I should rely on and report. As
with the F statistics, the tests of overidentifying restrictions
(Sargan N*R-sq test as well as Basmann test) provided by both ivreg2
and overid differ depending on whether I instrument solely for “en”
or also for the interaction terms built around “en”.

Any help is greatly appreciated!
Jason

