
# Re: st: RE: Testing for instrument relevance and overidentification when the endogeneous variable is used in interaction terms

From: Jason Wichert

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: RE: Testing for instrument relevance and overidentification when the endogeneous variable is used in interaction terms

Date: Tue, 4 Jun 2013 20:49:05 +0200

```
Mark,

Again, thank you so much for your feedback.

As regards the endogeneity tests, I’m actually using the endog option
in ivreg2. On a side note, thank you guys for this excellent tool and
the detailed explanations in the articles/versions of 2003 and 2007.

In my case, ex1 and ex2 are distinct constructs, measures of good (ex1)
and poor (ex2) company performance in a certain sense, similar to
Herzberg’s two-factor theory of motivators and hygiene factors. They
both have (different) non-linear associations to my measure of
financial performance (y), which luckily has been largely established
in empirical research, as have the influences of both ex1 and ex2 on
the endogenous variable. Less established so far, however, is the
moderating effect of my endogenous variable on either link between
ex1/ex2 and y. This moderating effect (at least according to my humble
analyses), which my research focuses on, differs between the levels of
ex1 and ex2, as indicated by the significant interaction terms of
different sign between say ex1_en and (ex1)^2_en; hence my
interactions of the linear and quadratic terms of both ex1 and ex2
with en. Leaving ex2, en and controls aside, my results indicate

y = 0.363 ex1 – 0.032 (ex1)^2 – 0.007 ex1_en + 0.001 (ex1)^2_en

With all coefficients highly significant, I interpret these results as
decreasing marginal returns to ex1 or an inverted U-shaped
relationship between ex1 and y with the turning point in the first
quadrant. While the moderating effect of en on ex1 is largely negative
(as indicated by the negative coefficient on ex1_en), this negative
effect is attenuated for high levels of ex1 (as indicated by the
positive coefficient on (ex1)^2_en). Unrelated to my initial
questions, does this interpretation seem to make sense?

In various preliminary analyses, luckily(?!) I did not find any
non-linear associations between en and y, at the very least saving me
additional nightmares of the econometric and economic kinds.

Kind regards,

Jason

On Tue, Jun 4, 2013 at 8:07 PM, Schaffer, Mark E <M.E.Schaffer@hw.ac.uk> wrote:
> Jason,
>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-
>> statalist@hsphsun2.harvard.edu] On Behalf Of Jason Wichert
>> Sent: 04 June 2013 12:47
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: RE: Testing for instrument relevance and overidentification
>> when the endogeneous variable is used in interaction terms
>>
>> Mark,
>>
>> Thank you very much for your feedback (and all the other excellent
>> comments on 2SLS you made on statalist). It's not the usual regression 101,
>> so it actually took me a couple of days to work through all the respective IV
>> statistics, hence my late reply.
>>
>> My analyses start with just the one endogenous regressor and are
>> subsequently extended to incorporate the endogenous interaction terms.
>>
>> In the base case of just that one endogenous variable, i.e.
>>
>>  ivreg2 y ex1 ex2 (en = z1 z2)
>>
>> I intend to present the first stage F-statistics (to reject weak identification of
>> my endogenous variable), results from the Sargan/Hansen overidentification
>> test (to test whether the instruments are jointly exogenous), as well as a
>> partial R² (to assess instrument relevance), and a Hausman test for
>> endogeneity.
>
> That sounds fine.  Two minor suggestions:
>
> 1.  The first-stage F stat makes the partial R-sq redundant.  No need to report it or anything like it in the case of a single endogenous regressor.
>
> 2.  You can get ivreg2 to report an endogeneity test for you by using the endog option.
>
>>
>> In the extended case of (*gasp*)
>>
>>  ivreg2 y ex1 ex2 (ex1)^2 (ex2)^2 (en ex1_en ex2_en (ex1)^2_en (ex2)^2_en
>> = z1 z2 ex1_z1 ex1_z2 (ex1)^2_z1 (ex1)^2_z2 ex2_z1 ex2_z2
>> (ex2)^2_z1 (ex2)^2_z2)
>>
>> I intend to present results from the Sargan/Hansen overidentification test,
>> results from the Anderson/Rubin (1949) [or potentially Stock/Wright (2000)]
>> test to indicate that all the endogenous regressors are jointly significant in
>> the second stage, the Kleibergen/Paap (2006) statistic of underidentification
>> of the model (i.e. the joint endogenous regressors) , the Cragg/Donald
>> (1993) statistic of weak identification of the model, the Angrist/Pischke
>> (2009) statistics for identification of each of the endogenous regressors, as
>> well as a Hausman test for endogeneity.
>>
>> Is there something blatantly obvious I’m missing or anything I could well leave
>> out, particularly in the extended case? In particular I’m wondering about
>>
>> a) the necessity of the A/R-test, considering almost all of my endogenous
>> variables are highly significant in the second stage, as indicated by their
>> respective t- and p-values,
>>
>> b) the necessity of presenting both K/P as well as C/D statistics,
>>
>> c) the necessity of the Hausman test in the extended case.
>>
>
> Let's see...
>
> 1.  The K-P test for underidentification is reported by ivreg2 mostly for completeness.  If you reject weak identification based on C-D, you are also rejecting underidentification.  So you could omit K-P.
>
> 2.  Most people probably wouldn't report the A-R test unless there were signs of weak identification (in which case they might consider using weak-instrument-robust methods, e.g., rivtest).
>
> 3.  On Hausman, same point above applies - you can get ivreg2 to report the endog test by using the endog option.  Maybe you have priors about whether one or a subset of your endogenous regressors should be tested rather than the whole lot at once.
>
> 4.  You didn't ask about this but worth mentioning anyway - when people introduce quadratics in the way you are doing, they often include the interactions.  In your case that means the interaction of ex1 and ex2 and similarly for the other regressors and instruments (and if you were really serious about it, you'd probably interact the endogenous and exogenous regressors too).  The slightly hand-wavey justification would be a Taylor approximation.
>
> HTH,
> Mark
>
>>
>> On Fri, May 31, 2013 at 8:30 PM, Schaffer, Mark E <M.E.Schaffer@hw.ac.uk>
>> wrote:
>> > Jason,
>> >
>> > I think the key point is that in your estimation
>> >
>> > ivreg2 y ex (en en_ex = z ex_z)
>> >
>> > just looking at the two standard first-stage F stats isn't enough.  You can
>> > easily get 2 large first-stage F stats, and yet the model is underidentified
>> > because there isn't enough information in your instruments to
>> > simultaneously identify the coeffs on both your endogenous regressors.
>> >
>> > To see if both coeffs are identified, you should use either the weak- or the
>> > under-identification statistic reported by ivreg2.  You can also use the
>> > Angrist-Pischke (A-P) first-stage F stats to see whether one or the other
>> > coeffs is identified.  More details about these in the ivreg2 help file and the
>> > references therein.
>> >
>> > HTH,
>> > Mark
>> >
>> >> -----Original Message-----
>> >> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-
>> >> statalist@hsphsun2.harvard.edu] On Behalf Of Jason Wichert
>> >> Sent: 29 May 2013 21:18
>> >> To: statalist@hsphsun2.harvard.edu
>> >> Subject: st: Testing for instrument relevance and overidentification
>> >> when the endogeneous variable is used in interaction terms
>> >>
>> >> Dear Statalisters,
>> >>
>> >> I have encountered some difficulties concerning 2SLS estimation when
>> >> the endogeneous variable is also used to construct interaction terms.
>> >>
>> >> After digging through the archives, I found a lot of helpful comments
>> >> concerning the procedure:
>> >>
>> >> http://www.stata.com/statalist/archive/2012-05/msg00970.html
>> >> http://www.stata.com/statalist/archive/2011-08/msg01485.html
>> >> http://www.stata.com/statalist/archive/2011-12/msg00705.html
>> >> http://www.stata.com/statalist/archive/2010-04/msg00759.html
>> >> http://www.stata.com/statalist/archive/2005-05/msg00150.html
>> >> http://www.stata.com/statalist/archive/2008-10/msg01009.html
>> >> http://www.stata.com/statalist/archive/2004-08/msg00779.html
>> >>
>> >> Following this advice, I am running an equation of the basic form
>> >>
>> >> ivreg2 y ex (en en_ex = z ex_z)
>> >>
>> >> In my case, there are two exogeneous variables interacted with the
>> >> endogeneous variable. Furthermore, I need interactions of those
>> >> squared exogeneous variables and the endogeneous variables. Leaving
>> >> additional control variables and further instruments aside, this
>> >>
>> >> ivreg2 y ex1 ex2 (en ex1_en ex2_en (ex1)^2_en (ex2)^2_en = z ex1_z
>> >> ex2_z (ex1)^2_z (ex2)^2_z)
>> >>
>> >> So far, so good. However, I’m not sure how exactly to examine
>> >> instrument relevance and exogeneity, and which statistics/tests to report.
>> >>
>> >> As regards instrument relevance, as to be assessed by the first stage
>> >> F statistic, the F-statistics on “en” clearly differ depending on
>> >> whether I instrument solely for “en”, or whether I also instrument
>> >> for the linear and non-linear interaction terms built around “en”.
>> >> Which F statistic is the correct one to refer to?
>> >>
>> >> Considering I have multiple instruments Z, I am also not sure which
>> >> overidentification tests and results I should rely on and report.  As
>> >> holds for the F statistics, the tests of overidentifying restrictions
>> >> (Sargan N*R-sq test as well as Basmann test) provided by both ivreg2
>> >> and overid differ between instrumenting solely “en” or also for the
>> >> interaction terms built around “en”.
>> >>
>> >> Any help is greatly appreciated!
>> >> Jason
>> >>
>> >> *
>> >> *   For searches and help try:
>> >> *   http://www.stata.com/help.cgi?search
>> >> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> >> *   http://www.ats.ucla.edu/stat/stata/
>> >
>> >
>> > -----
>> > Sunday Times Scottish University of the Year 2011-2013 Top in the UK
>> > for student experience Fourth university in the UK and top in Scotland
>> > (National Student Survey 2012)
>> >
>> > We invite research leaders and ambitious early career researchers to
>> > how to apply.
>> >
>> > Heriot-Watt University is a Scottish charity registered under charity
>> > number SC000278.
>> >
```
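
The interpretation question in Jason's reply can be checked with a short derivation. The following is only a sketch based on the four reported coefficients; the terms in ex2, the main effect of en, and the controls are left out (as in the email), and the scales of ex1 and en are not given in the thread, so whether the thresholds below fall inside the observed data range is an open assumption.

```latex
\begin{align*}
y &= 0.363\,ex_1 - 0.032\,ex_1^{2} - 0.007\,ex_1\,en + 0.001\,ex_1^{2}\,en + \dots \\[4pt]
% marginal effect of ex1, grouping terms by their dependence on en
\frac{\partial y}{\partial ex_1}
  &= (0.363 - 0.007\,en) - 2\,(0.032 - 0.001\,en)\,ex_1 \\[4pt]
% turning point: set the marginal effect to zero (a maximum while en < 32)
ex_1^{*} &= \frac{0.363 - 0.007\,en}{2\,(0.032 - 0.001\,en)}, \qquad en < 32 \\[4pt]
% how en shifts the marginal effect of ex1
\frac{\partial^{2} y}{\partial ex_1\,\partial en} &= -0.007 + 0.002\,ex_1
\end{align*}
```

Under these assumptions, the relationship between ex1 and y is an inverted U as long as en < 32, with the maximum at about ex1 = 5.67 when en = 0. The cross-partial is negative for ex1 < 3.5 and positive above that, which is consistent with the reading that en dampens the return to ex1 at low levels of ex1 and that this dampening fades, and eventually reverses, as ex1 grows.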