Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: Testing for instrument relevance and overidentification when the endogeneous variable is used in interaction terms

From	Jason Wichert <[email protected]>
To	[email protected]
Subject	Re: st: RE: Testing for instrument relevance and overidentification when the endogeneous variable is used in interaction terms
Date	Fri, 7 Jun 2013 13:35:21 +0200
Mark,

Alright, that’s a relief. However, as promised (threatened?), I’ve
come up with some additional questions.

Isn’t the approach discussed most recently, i.e.

[1] ivreg2 y controls ex1 ex2 (en en_ex1 en_ex2 = enhat enhat_ex1 enhat_ex2)
 (notation: ex1, ex2 = exogenous variables; en = endogenous variable;
z1, z2 = instruments for en)

in effect the same as running additional first-stage regressions
(FSRs, for lack of a better term) of the type

[2] regress en_ex1 controls ex1 ex2 enhat enhat_ex1

and

[3] regress en_ex2 controls ex1 ex2 enhat enhat_ex2

just with the additional/unnecessary respective instruments enhat_ex2
and enhat_ex1 in [2] and [3]?

Besides being tedious and error-prone, approach [1] doesn’t seem to
“cure” the 2SLS test results, since I’m again instrumenting several
interaction terms with instruments that are only weakly correlated or
uncorrelated with the instrumented terms (such as enhat_ex2 with en_ex1).
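
For concreteness, approach [1] could be sketched as follows on simulated
data. This is only an illustration under assumptions: the data-generating
process and all coefficient values are made up, the controls are left out,
and ivreg2 (user-written, together with ranktest) is assumed to be
installed from SSC.

```stata
* Simulated data; every coefficient below is an arbitrary illustrative choice
clear
set seed 12345
set obs 2000
foreach v in ex1 ex2 z1 z2 u e {
    generate double `v' = rnormal()
}
* en is endogenous because its error (0.5*u + e) shares u with y
generate double en     = 0.5*ex1 - 0.5*ex2 + 0.7*z1 + 0.7*z2 + 0.5*u + e
generate double en_ex1 = en*ex1
generate double en_ex2 = en*ex2
generate double y      = 1 + ex1 + ex2 + en + 0.5*en_ex1 - 0.5*en_ex2 + u

* Preliminary regression of en on all included and excluded instruments
regress en ex1 ex2 z1 z2
predict double enhat, xb

* Generated instruments: fitted value and its interactions with ex1, ex2
generate double enhat_ex1 = enhat*ex1
generate double enhat_ex2 = enhat*ex2

* Just-identified 2SLS; ffirst prints the first-stage summary statistics
ivreg2 y ex1 ex2 (en en_ex1 en_ex2 = enhat enhat_ex1 enhat_ex2), ffirst
```

The ffirst output makes it possible to see directly how the generated
instruments fare on the identification statistics discussed in this thread.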

I read up on the “control function approach” suggested largely by
Wooldridge as an alternative, which he mentions in his 2002 version of
"Econometric analysis of cross section and panel data", a comment he
made in http://www.stata.com/statalist/archive/2011-03/msg00187.html ,
as well as lecture slides I found online (
http://www.eief.it/files/2011/10/slides_3_controlfuncs.pdf ). In the
latter, regarding forbidden regressions and the control function
approach, he states “Danger with plugging in fitted values for y2 [the
endog. variable] is that one might be tempted to plug y2_hat into
nonlinear functions, say (y2)^2 or y2_z1. This does not result in
consistent estimation of the scaled parameters or the partial effects.
If we believe y2 has a linear RF with additive normal error
independent of z, the addition of v2_hat solves the endogeneity
problem *regardless* of how y2 appears.”

I’m considering running a single such control function regression,
predicting the residual, and including it in my OLS regression (what
would otherwise be the second stage of my 2SLS), with all the
interaction terms (for the sake of brevity, leaving quadratic terms of
ex1 and ex2 aside), i.e.

regress en ex1 ex2 z1 z2 ex1_z1 ex1_z2 ex2_z1 ex2_z2

predict en_resid, resid

regress y ex1 ex2 en ex1_en ex2_en en_resid

This almost sounds too simple to be true to my naïve understanding of
the matter. So again, any feedback is highly appreciated.
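
The control function steps above can be sketched on simulated data as
follows. The data-generating process is invented for illustration, the
first step uses only the levels of the instruments for brevity, and the
program name cf_boot is hypothetical. Because en_resid is a generated
regressor, the second-step OLS standard errors are not valid in general;
bootstrapping both steps together is one common remedy, and that is what
the sketch does.

```stata
* Simulated data; every coefficient below is an arbitrary illustrative choice
clear
set seed 2013
set obs 2000
foreach v in ex1 ex2 z1 z2 u e {
    generate double `v' = rnormal()
}
* en is endogenous because its error (0.5*u + e) shares u with y
generate double en     = 0.5*ex1 - 0.5*ex2 + 0.7*z1 + 0.7*z2 + 0.5*u + e
generate double ex1_en = ex1*en
generate double ex2_en = ex2*en
generate double y      = 1 + ex1 + ex2 + en + 0.5*ex1_en - 0.5*ex2_en + u

* Step 1: reduced form for en, keep the residual
regress en ex1 ex2 z1 z2
predict double en_resid, residuals

* Step 2: OLS with the residual added as the control function
* (standard errors here ignore that en_resid is estimated)
regress y ex1 ex2 en ex1_en ex2_en en_resid

* Both steps wrapped in one program so that each bootstrap draw redoes step 1
capture program drop cf_boot
program define cf_boot, rclass
    tempvar vhat
    quietly regress en ex1 ex2 z1 z2
    quietly predict double `vhat', residuals
    quietly regress y ex1 ex2 en ex1_en ex2_en `vhat'
    return scalar b_en     = _b[en]
    return scalar b_ex1_en = _b[ex1_en]
    return scalar b_ex2_en = _b[ex2_en]
end

bootstrap b_en=r(b_en) b_ex1_en=r(b_ex1_en) b_ex2_en=r(b_ex2_en), ///
    reps(200) nodots: cf_boot
```

Under the null that en is exogenous, the t-statistic on en_resid in step 2
can also be read as a regression-based endogeneity test, at least under
homoskedasticity.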

Jason


On Thu, Jun 6, 2013 at 10:26 PM, Schaffer, Mark E <[email protected]> wrote:
> Jason,
>
> I was just generalizing - in your case there's only one preliminary regression, namely the one to get en_hat.
>
> Cheers,
> Mark
>
>> -----Original Message-----
>> From: [email protected] [mailto:owner-
>> [email protected]] On Behalf Of Jason Wichert
>> Sent: 06 June 2013 13:27
>> To: [email protected]
>> Subject: Re: st: RE: Testing for instrument relevance and overidentification
>> when the endogeneous variable is used in interaction terms
>>
>> Mark,
>>
>> As multiple times before, thank you very much. However, you got me a little
>> confused with your statement “when you do the various preliminary
>> regressions” for getting fitted values. My understanding was to solely get
>> fitted values for my one truly endogenous variable “en” from a single
>> regression of “en” on all included and excluded instruments (including ex1
>> and ex2, which are to be interacted with “en”), to then form interactions of
>> the fitted/predicted values of “en” with ex1 and ex2, and ultimately use
>> those interactions (enhat, enhat_ex1, enhat_ex2) as instruments for en,
>> en_ex1, en_ex2 in ivreg2.
>> Apologies for asking again, but considering the difficulties encountered and
>> discussed so far, I want to make sure to follow the correct procedure and
>> stay away from any territory of forbidden regressions and the likes.
>>
>> I’m afraid you and statalist won’t have heard from me and this issue for the
>> last time just yet.
>>
>> Kind regards,
>> Jason
>>
>> On Thu, Jun 6, 2013 at 1:30 PM, Schaffer, Mark E <[email protected]>
>> wrote:
>> > Jason,
>> >
>> > I think that's right.  If it's the procedure I think you have in mind, the
>> > intuition behind it is that you are fairly confident that your included and
>> > excluded instruments (ex and z in your notation, I think) in various forms
>> > (levels, squares, interactions with each other, etc.) are all valid instruments.
>> > When you do the various preliminary regressions ("first stage" is probably
>> > the wrong term - it's not the same thing as the 1st stage of 2SLS) and get
>> > fitted values, those fitted values are linear combinations of various valid
>> > instruments.  Since they're linear combinations of exogenous things, they're
>> > also exogenous and can be used as excluded instruments or interacted with
>> > other exogenous variables to get still more instruments.  The reason to use
>> > linear combinations instead of the variables separately is to avoid the various
>> > problems that come from using a large number of excluded instruments.  Of
>> > course, there's a lot of other stuff you also have to believe for this to
>> > work, but that's your call ... good luck!
>> >
>> > HTH,
>> > Mark
>> >
>> >> -----Original Message-----
>> >> From: [email protected] [mailto:owner-
>> >> [email protected]] On Behalf Of Jason Wichert
>> >> Sent: 06 June 2013 08:50
>> >> To: [email protected]
>> >> Subject: Re: st: RE: Testing for instrument relevance and
>> >> overidentification when the endogeneous variable is used in
>> >> interaction terms
>> >>
>> >> Mark,
>> >>
>> >> Unfortunately I need both of them. It’s been established both
>> >> empirically as well as analytically to use both ex1 and ex2 as
>> >> distinct constructs of good and bad characteristics at the same
>> >> time, and I try to examine whether “en” moderates the links between
>> >> ex1, ex2, and y.
>> >>
>> >> As alternatives to using levels or quadratic levels of the ex1 and
>> >> ex2 interacted with en, I already played around with classifications
>> >> into quartiles and quintiles, corroborating my results (I didn’t do
>> >> deciles due to the size of my sample). However, since “en” is highly
>> >> influenced by both ex1 and ex2, I’m afraid every reviewer is
>> >> inevitably going to ask for remedies for endogeneity that go beyond
>> >> a set of appropriate control variables or fixed effects.
>> >> Depending on how you look at it, (un)fortunately, not too many
>> >> researchers have taken on this endogeneity issue; the ones that do
>> >> largely instrument “en”
>> >> and simply control for ex1 and/or ex2 absent any interactions.
>> >>
>> >> The one paper most closely related, in that it uses "en" in
>> >> interaction terms, instruments "en" and several interaction terms
>> >> using the fitted first-stage values: the author runs FSRs of "en" on
>> >> all exogenous variables, then instruments "en" as well as the
>> >> interaction terms by enhat and by interactions of the exogenous
>> >> variables with enhat, and lastly uses the fitted values in the second
>> >> stage. Therefore, his model is just-identified and the only test
>> >> statistic he reports is the A/P (2009) F-stat.
>> >>
>> >> Such an approach has been mentioned here on statalist before (e.g.
>> >> in http://www.stata.com/statalist/archive/2011-08/msg01496.html in a
>> >> reply to http://www.stata.com/statalist/archive/2011-08/msg01485.html
>> >> ; in http://www.stata.com/statalist/archive/2011-12/msg00718.html in
>> >> a reply to
>> >> http://www.stata.com/statalist/archive/2011-12/msg00705.html
>> >> ), but then again, just because something has been published before
>> >> and maybe even cited, it doesn’t necessarily have to be the only,
>> >> correct, or only correct way, I guess.
>> >>
>> >> Searching for alternatives, however, I’m tempted to give this
>> >> approach at least a try in my setting. Please correct me where I’m
>> >> wrong, but the procedure (leaving quadratic variables aside) would
>> >> look like the
>> >> following:
>> >>
>> >> 1. I regress “en” on all instruments:
>> >>
>> >> regress en ex1 ex2 controls z1 z2
>> >>
>> >> 2. I predict enhat:
>> >>
>> >> predict enhat
>> >>
>> >> 3. I form interactions of ex1_enhat ex2_enhat
>> >>
>> >> 4. I run a 2SLS and instrument for the endogenous interaction terms
>> >> by way of the generated interactions:
>> >>
>> >> ivreg2 y controls ex1 ex2 (en en_ex1 en_ex2 = enhat enhat_ex1
>> >> enhat_ex2)
>> >>
>> >> Is there anything I’m missing or that I should be cautious of?
>> >>
>> >> Again, thank you very much in advance.
>> >>
>> >> On Thu, Jun 6, 2013 at 1:29 AM, Schaffer, Mark E
>> >> <[email protected]>
>> >> wrote:
>> >> > Jason,
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: [email protected] [mailto:owner-
>> >> >> [email protected]] On Behalf Of Jason Wichert
>> >> >> Sent: 05 June 2013 23:07
>> >> >> To: [email protected]
>> >> >> Subject: Re: st: RE: Testing for instrument relevance and
>> >> >> overidentification when the endogeneous variable is used in
>> >> >> interaction terms
>> >> >>
>> >> >> Mark,
>> >> >>
>> >> >> Thanks much and apologies for the lack of clarification. Yes, from
>> >> >> model [1] to model [2] the C-D F-stat drastically declines, from
>> >> >> well rejecting weak identification (depending on the choice of
>> >> >> instruments, the F-stat ranges between 15 and 20) to values of
>> >> >> around
>> >> >> 4 or 5. I assume this decline, as well as the increase in the
>> >> >> Sargan-statistic (from clear rejection of the null to failure to
>> >> >> reject), to result from the strong (weak) correlations between my
>> >> >> excluded instruments and the dependent variable (some of the
>> >> >> endogenous regressors); e.g.
>> >> >> the excluded instrument ex1_z1 is strongly correlated to y,
>> >> >> whereas it is barely correlated to the endogenous interaction term
>> >> >> ex2_en. I will definitely look closer into the Sargan-Hansen
>> >> >> statistic and try to get a feel for what the tests show when I
>> >> >> completely drop either
>> >> >> ex1 or ex2, thus at the very least increasing the correlation
>> >> >> between the excluded instruments and the endogenous regressors.
>> >> >>
>> >> >> The forbidden regressions I already did read up on amidst my
>> >> >> crusade through the statalist archives, searching for guidance on my
>> problems.
>> >> >> While I’m admittedly nervous about the error-proneness, I thought
>> >> >> the procedure suggested by Jeffrey Wooldridge
>> >> >> (http://www.stata.com/statalist/archive/2011-03/msg00188.html )
>> >> >> might allow me to instrument only “en” in the third step. Do you
>> >> >> see any other feasible way to reduce the ivreg2 command and the
>> >> >> respective tests to ultimately just one endogenous variable?
>> >> >
>> >> > Are both the endogenous regressors actually of interest?  Could you
>> >> > drop one or the other?
>> >> >
>> >> > If one of the regressors is the one you really care about, and the
>> >> > other is there because you're worried about omitted variable bias, a
>> >> > halfway house that might work would be a semi-reduced form: drop the
>> >> > endogenous regressor that isn't interesting, and add selected
>> >> > instruments to the regression as regressors.  Hard to tell whether
>> >> > this is appropriate in your case - it probably isn't - but worth
>> >> > mentioning anyway.
>> >> >
>> >> > --Mark
>> >> >
>> >> >> On Wed, Jun 5, 2013 at 11:37 PM, Schaffer, Mark E
>> >> >> <[email protected]>
>> >> >> wrote:
>> >> >> > Jason,
>> >> >> >
>> >> >> >> -----Original Message-----
>> >> >> >> From: [email protected] [mailto:owner-
>> >> >> >> [email protected]] On Behalf Of Jason Wichert
>> >> >> >> Sent: 05 June 2013 21:49
>> >> >> >> To: [email protected]
>> >> >> >> Subject: Re: st: RE: Testing for instrument relevance and
>> >> >> >> overidentification when the endogeneous variable is used in
>> >> >> >> interaction terms
>> >> >> >>
>> >> >> >> Alright, now here are some more issues I have encountered.
>> >> >> >>
>> >> >> >> Using just one endogenous variable “en” in the model
>> >> >> >>
>> >> >> >> [1] ivreg2 y ex1 ex2 controls (en = z1 z2),
>> >> >> >>
>> >> >> >> the respective test statistics are just fine. However, when
>> >> >> >> also incorporating interaction terms of the kind
>> >> >> >>
>> >> >> >> [2] ivreg2 y ex1 ex2 controls (en en_ex1 en_ex2 = z1 z2 z1_ex1
>> >> >> >> z1_ex2
>> >> >> >> z2_ex1 z2_ex2)
>> >> >> >>
>> >> >> >> as well as quadratic interaction terms, I’m having issues with
>> >> >> >> the test
>> >> >> results.
>> >> >> >> In particular:
>> >> >> >>
>> >> >> >> - Stock/Yogo (2005) have calculated critical values for the
>> >> >> >> Cragg-Donald (1993) F-statistic only for up to three endogenous
>> >> >> >> variables. While the critical values provided don’t differ too
>> >> >> >> much among 1, 2, and 3 endogenous variables and such references
>> >> >> >> might be eyeballed, does anyone know about exact critical
>> >> >> >> values in the case of
>> >> >> more than three endogenous regressors?
>> >> >> >
>> >> >> > I don't think they've been compiled.  But no one should mind if
>> >> >> > you are a
>> >> >> bit hand-wavey in your writeup at this point.  Rough magnitudes
>> >> >> are still informative.
>> >> >> >
>> >> >> >> - as further regards the Cragg-Donald (1993) F-statistic to
>> >> >> >> test for weak identification, I notice an implosion of the
>> >> >> >> F-statistic from model [1] to model [2].
>> >> >> >
>> >> >> > Do you mean it gets very small, so the model becomes weakly
>> >> identified?
>> >> >> >
>> >> >> >> Since the null of C-D states that the instruments are *jointly*
>> >> >> >> only weakly correlated with the endogenous regressors, I
>> >> >> >> naively assume the small F- statistic results from the 2SLS
>> >> >> >> procedure in my case, since many of the instruments are
>> >> >> >> strongly correlated to the endogenous variables, i.e. the
>> >> >> >> interaction terms, by construction (e.g. z1_ex1 is highly
>> >> >> >> correlated to en_ex1). Could somebody confirm
>> >> >> this?
>> >> >> >
>> >> >> > Not sure what you mean, to be honest.  My gut feeling is that
>> >> >> > you are
>> >> >> expecting a lot of your instruments for them to be correlated with
>> >> >> the endogenous regressors not just in levels but also via interactions.
>> >> >> >
>> >> >> >> - on a side note, the Kleibergen-Paap (2006) statistic of
>> >> >> >> underidentification does just fine in each model.
>> >> >> >>
>> >> >> >> - a similar concern regards the Sargan/Hansen statistic of
>> >> >> >> overidentification, which tests whether *any* of the
>> >> >> >> instruments fail the orthogonality criterion. Since I know ex1
>> >> >> >> and ex2 are highly correlated to y, so should the constructed
>> >> >> >> instruments z1_ex1, z1_ex2, etc., right? Therefore, I naively
>> >> >> >> interpret the exploding Sargan-statistic (from failure to
>> >> >> >> reject the null of overidentification with a p-value of around
>> >> >> >> 0.7 to complete rejection at 0.00) as a mere by-product of my
>> >> >> >> model specification, correct?
>> >> >> >
>> >> >> > Again, not sure what you mean.  The only thing I can suggest
>> >> >> > here is that
>> >> >> perhaps you can work out which of the orthogonality conditions
>> >> >> you're violating.  One of the ways to think about a failure of the
>> >> >> Sargan-Hansen test is that your instruments are identifying
>> >> >> "different betas", in the same way that a Hausman test gives you a
>> >> >> big test stat when the two estimated betas are very different.  It
>> >> >> might be worth comparing the results using the full set of IVs
>> >> >> based on z1 and z2 and their interactions, the results using just
>> >> >> z1 and its
>> >> interactions, and the results using just z2 and its interactions.
>> >> >> >
>> >> >> >> If my naïve assumptions were true, would it allow for a
>> >> >> >> stricter testing procedure to use a different approach to the
>> >> >> >> model
>> >> specification?
>> >> >> >>
>> >> >> >> In particular, instead of the setup as indicated by [2], I
>> >> >> >> might be tempted to try a different approach, such as
>> >> >> >> regressing “en” on all instruments (included exogenous controls
>> >> >> >> as well as excluded
>> >> >> >> instruments) to get predictions enhat and then forming
>> >> >> >> interactions enhat_ex1, enhat_ex2, enhat_(ex1)^2,
>> >> >> >> enhat_(ex2)^2, taking into account the incorrect standard
>> >> >> >> errors. Would that seem
>> >> likely to help?
>> >> >> >
>> >> >> > This sounds an awful lot like the "Forbidden Regression".  (And
>> >> >> > the name
>> >> >> pretty much tells you how this is going to pan out.)  If you
>> >> >> google that term you'll find it very quickly, or if you have
>> >> >> Angrist and Pischke's "Mostly Harmless Econometrics" it's covered in
>> there.
>> >> >> >
>> >> >> > --Mark
>> >> >> >
>> >> >> >>
>> >> >> >> Again, thanks much in advance for anyone (putting my hopes on
>> >> >> >> Mark
>> >> >> >> here) providing useful advice!
>> >> >> >>
>> >> >> >>
>> >> >> >> On Tue, Jun 4, 2013 at 8:49 PM, Jason Wichert
>> >> >> >> <[email protected]>
>> >> >> >> wrote:
>> >> >> >> > Mark,
>> >> >> >> >
>> >> >> >> > Again, thank you so much for your feedback.
>> >> >> >> >
>> >> >> >> > As regards the endogeneity tests, I’m actually using the
>> >> >> >> > endog option in ivreg2. On a side note, thank you guys for
>> >> >> >> > this excellent tool and the detailed explanations in the
>> >> >> >> > articles/versions of 2003 and
>> >> >> 2007.
>> >> >> >> >
>> >> >> >> > As regards your fourth point, concerning additional interactions:
>> >> >> >> > in my case, ex1 and ex2 are distinct constructs, measures of
>> >> >> >> > good
>> >> >> >> > (ex1) and poor (ex2) company performance in a certain sense,
>> >> >> >> > similar to Herzberg’s two-factor theory of motivators and
>> >> >> >> > hygiene factors. They both have (different) non-linear
>> >> >> >> > associations to my measure of financial performance (y),
>> >> >> >> > which luckily has been largely established in empirical
>> >> >> >> > research, as have the influences of both ex1 and ex2 on the
>> >> >> >> > endogenous variable. Less established so far, however, is the
>> >> >> >> > moderating effect of my endogenous variable on either link
>> >> >> >> > between
>> >> >> >> > ex1/ex2 and y. This moderating effect (at least according to
>> >> >> >> > my humble analyses), which my research focuses on, differs
>> >> >> >> > between the levels of
>> >> >> >> > ex1 and ex2, as indicated by the significant interaction
>> >> >> >> > terms of different sign between say ex1_en and (ex1)^2_en;
>> >> >> >> > hence my interactions of the linear and quadratic terms of
>> >> >> >> > both ex1 and
>> >> >> >> > ex2 with en. Leaving ex2, en and controls aside, my results
>> >> >> >> > indicate
>> >> >> >> >
>> >> >> >> > y = 0.363 ex1 – 0.032 (ex1)^2 – 0.007 ex1_en + 0.001
>> >> >> >> > (ex1)^2_en
>> >> >> >> >
>> >> >> >> > With all coefficients highly significant, I interpret these
>> >> >> >> > results as decreasing marginal returns to ex1 or an inverted
>> >> >> >> > U-shaped relationship between ex1 and y with the inflection
>> >> >> >> > point in the first quadrant. While the moderating effect of
>> >> >> >> > en on ex1 is largely negative (as indicated by the negative
>> >> >> >> > coefficient on ex1_en), this negative effect is attenuated
>> >> >> >> > for high levels of ex1 (as indicated by the positive
>> >> >> >> > coefficient on (ex1)^2_en). Unrelated to my initial
>> >> >> >> > questions, does this
>> >> interpretation seem to make sense?
>> >> >> >> >
>> >> >> >> > In various preliminary analyses, luckily(?!) I did not find
>> >> >> >> > any non-linear associations between en and y, at the very
>> >> >> >> > least saving me additional nightmares of the econometric and
>> >> >> >> > economic
>> >> kinds.
>> >> >> >> >
>> >> >> >> > Kind regards,
>> >> >> >> >
>> >> >> >> > Jason
>> >> >> >> >
>> >> >> >> > On Tue, Jun 4, 2013 at 8:07 PM, Schaffer, Mark E
>> >> >> >> > <[email protected]>
>> >> >> >> wrote:
>> >> >> >> >> Jason,
>> >> >> >> >>
>> >> >> >> >>> -----Original Message-----
>> >> >> >> >>> From: [email protected] [mailto:owner-
>> >> >> >> >>> [email protected]] On Behalf Of Jason Wichert
>> >> >> >> >>> Sent: 04 June 2013 12:47
>> >> >> >> >>> To: [email protected]
>> >> >> >> >>> Subject: Re: st: RE: Testing for instrument relevance and
>> >> >> >> >>> overidentification when the endogeneous variable is used in
>> >> >> >> >>> interaction terms
>> >> >> >> >>>
>> >> >> >> >>> Mark,
>> >> >> >> >>>
>> >> >> >> >>> Thank you very much for your feedback (and all the other
>> >> >> >> >>> excellent comments on 2SLS you made on statalist). It's not
>> >> >> >> >>> the usual regression 101, so it actually took me a couple
>> >> >> >> >>> of days to work through all the respective IV statistics,
>> >> >> >> >>> hence my late
>> >> reply.
>> >> >> >> >>>
>> >> >> >> >>> My analyses start with just the one endogenous regressor
>> >> >> >> >>> and are subsequently extended to incorporate the
>> endogenous
>> >> >> >> >>> interaction
>> >> >> >> terms.
>> >> >> >> >>>
>> >> >> >> >>> In the base case of just that one endogenous variable, i.e.
>> >> >> >> >>>
>> >> >> >> >>>  ivreg2 y ex1 ex2 (en = z1 z2)
>> >> >> >> >>>
>> >> >> >> >>> I intend to present the first stage F-statistics (to reject
>> >> >> >> >>> weak identification of my endogenous variable), results
>> >> >> >> >>> from the Sargan/Hansen overidentification test (to test
>> >> >> >> >>> whether the instruments are jointly exogenous), as well as
>> >> >> >> >>> a partial R² (to assess instrument relevance), and a
>> >> >> >> >>> Hausman test for
>> >> endogeneity.
>> >> >> >> >>
>> >> >> >> >> That sounds fine.  Two minor suggestions:
>> >> >> >> >>
>> >> >> >> >> 1.  The first-stage F stat makes the partial R-sq redundant.
>> >> >> >> >> No need to
>> >> >> >> report it or anything like it in the case of a single
>> >> >> >> endogenous
>> >> regressor.
>> >> >> >> >>
>> >> >> >> >> 2.  You can get ivreg2 to report an endogeneity test for you
>> >> >> >> >> by using the
>> >> >> >> endog option.
>> >> >> >> >>
>> >> >> >> >>>
>> >> >> >> >>> In the extended case of (*gasp*)
>> >> >> >> >>>
>> >> >> >> >>>  ivreg2 y ex1 ex2 (ex1)^2 (ex2)^2 (en ex1_en ex2_en
>> >> >> >> >>> (ex1)^2_en (ex2)^2_en = z1 z2 ex1_z1 ex1_z2 (ex1)^2_z1
>> >> >> >> >>> (ex1)^2_z2 ex2_z1
>> >> >> >> >>> ex2_z2
>> >> >> >> >>> (ex2)^2_z1 (ex2)^2_z2)
>> >> >> >> >>>
>> >> >> >> >>> I intend to present results from the Sargan/Hansen
>> >> >> >> >>> overidentification test, results from the Anderson/Rubin
>> >> >> >> >>> (1949) [or potentially Stock/Wright (2000)] test to
>> >> >> >> >>> indicate that all the endogenous regressors are jointly
>> >> >> >> >>> significant in the second stage, the Kleibergen/Paap (2006)
>> >> >> >> >>> statistic of underidentification of the model (i.e. the
>> >> >> >> >>> joint endogenous
>> >> >> >> >>> regressors) , the Cragg/Donald
>> >> >> >> >>> (1993) statistic of weak identification of the model, the
>> >> >> >> >>> Angrist/Pischke
>> >> >> >> >>> (2009) statistics for identification of each of the
>> >> >> >> >>> endogenous regressors, as well as a Hausman test for
>> endogeneity.
>> >> >> >> >>>
>> >> >> >> >>> Is there something blatantly obvious I’m missing or anything
>> >> >> >> >>> I could well leave out, particularly in the extended case?
>> >> >> >> >>> In particular I’m wondering about
>> >> >> >> >>>
>> >> >> >> >>> a) the necessity of the A/R-test, considering almost all of
>> >> >> >> >>> my endogenous variables are highly significant in the
>> >> >> >> >>> second stage, as indicated by their respective t- and
>> >> >> >> >>> p-values,
>> >> >> >> >>>
>> >> >> >> >>> b) the necessity of presenting both K/P as well as C/D
>> >> >> >> >>> statistics,
>> >> >> >> >>>
>> >> >> >> >>> c) the necessity of the Hausman test in the extended case.
>> >> >> >> >>>
>> >> >> >> >>> Again, thank you very much in advance for your feedback!
>> >> >> >> >>
>> >> >> >> >> Let's see...
>> >> >> >> >>
>> >> >> >> >> 1.  The K-P test for underidentification is reported by
>> >> >> >> >> ivreg2 mostly for
>> >> >> >> completeness.  If you reject weak identification based on C-D,
>> >> >> >> you are also rejecting underidentification.  So you could omit K-P.
>> >> >> >> >>
>> >> >> >> >> 2.  Most people probably wouldn't report the A-R test unless
>> >> >> >> >> there were
>> >> >> >> signs of weak identification (in which case they might consider
>> >> >> >> using
>> >> >> >> weak- instrument-robust methods, e.g., rivtest).
>> >> >> >> >>
>> >> >> >> >> 3.  On Hausman, same point as above applies - you can get
>> >> >> >> >> ivreg2 to report
>> >> >> >> the endog test by using the endog option.  Maybe you have
>> >> >> >> priors about whether one or a subset of your endogenous
>> >> >> >> regressors should be tested rather than the whole lot at once.
>> >> >> >> >>
>> >> >> >> >> 4.  You didn't ask about this but worth mentioning anyway -
>> >> >> >> >> when people
>> >> >> >> introduce quadratics in the way you are doing, they often
>> >> >> >> include the interactions.  In your case that means the
>> >> >> >> interaction of ex1 and ex2 and similarly for the other
>> >> >> >> regressors and instruments (and if you were really serious
>> >> >> >> about it, you'd probably interact the endogenous and exogenous
>> >> >> >> regressors too).  The slightly hand-wavey justification would be a
>> Taylor approximation.
>> >> >> >> >>
>> >> >> >> >> HTH,
>> >> >> >> >> Mark
>> >> >> >> >>
>> >> >> >> >>>
>> >> >> >> >>> On Fri, May 31, 2013 at 8:30 PM, Schaffer, Mark E
>> >> >> >> >>> <[email protected]>
>> >> >> >> >>> wrote:
>> >> >> >> >>> > Jason,
>> >> >> >> >>> >
>> >> >> >> >>> > I think the key point is that in your estimation
>> >> >> >> >>> >
>> >> >> >> >>> > ivreg2 y ex (en en_ex = z ex_z)
>> >> >> >> >>> >
>> >> >> >> >>> > just looking at the two standard first-stage F stats isn't
>> enough.
>> >> >> >> >>> > You can
>> >> >> >> >>> easily get 2 large first-stage F stats, and yet the model
>> >> >> >> >>> is underidentified because there isn't enough information
>> >> >> >> >>> in your instruments to simultaneously identify the coeffs
>> >> >> >> >>> on both your
>> >> >> >> endogenous regressors.
>> >> >> >> >>> >
>> >> >> >> >>> > To see if both coeffs are identified, you should use
>> >> >> >> >>> > either the
>> >> >> >> >>> > weak- or the
>> >> >> >> >>> under-identification statistic reported by ivreg2.  You can
>> >> >> >> >>> also use the Angrist-Pischke (A-P) first-stage F stats to
>> >> >> >> >>> see whether one or the other coeffs is identified.  More
>> >> >> >> >>> details about these in the
>> >> >> >> >>> ivreg2 help file and the references therein.
>> >> >> >> >>> >
>> >> >> >> >>> > HTH,
>> >> >> >> >>> > Mark
>> >> >> >> >>> >
>> >> >> >> >>> >> -----Original Message-----
>> >> >> >> >>> >> From: [email protected]
>> >> >> >> >>> >> [mailto:owner- [email protected]] On Behalf
>> >> >> >> >>> >> Of Jason Wichert
>> >> >> >> >>> >> Sent: 29 May 2013 21:18
>> >> >> >> >>> >> To: [email protected]
>> >> >> >> >>> >> Subject: st: Testing for instrument relevance and
>> >> >> >> >>> >> overidentification when the endogeneous variable is used
>> >> >> >> >>> >> in interaction terms
>> >> >> >> >>> >>
>> >> >> >> >>> >> Dear Statalisters,
>> >> >> >> >>> >>
>> >> >> >> >>> >> I have encountered some difficulties concerning 2SLS
>> >> >> >> >>> >> estimation when the endogeneous variable is also used to
>> >> >> >> >>> >> construct interaction
>> >> >> >> terms.
>> >> >> >> >>> >>
>> >> >> >> >>> >> After digging through the archives, I found a lot of
>> >> >> >> >>> >> helpful comments concerning the procedure:
>> >> >> >> >>> >>
>> >> >> >> >>> >> http://www.stata.com/statalist/archive/2012-05/msg00970.html
>> >> >> >> >>> >> http://www.stata.com/statalist/archive/2011-08/msg01485.html
>> >> >> >> >>> >> http://www.stata.com/statalist/archive/2011-12/msg00705.html
>> >> >> >> >>> >> http://www.stata.com/statalist/archive/2010-04/msg00759.html
>> >> >> >> >>> >> http://www.stata.com/statalist/archive/2005-05/msg00150.html
>> >> >> >> >>> >> http://www.stata.com/statalist/archive/2008-10/msg01009.html
>> >> >> >> >>> >> http://www.stata.com/statalist/archive/2004-08/msg00779.html
>> >> >> >> >>> >>
>> >> >> >> >>> >> Following this advice, I am running an equation of the
>> >> >> >> >>> >> basic form
>> >> >> >> >>> >>
>> >> >> >> >>> >> ivreg2 y ex (en en_ex = z ex_z)
>> >> >> >> >>> >>
>> >> >> >> >>> >> In my case, there are two exogeneous variables
>> >> >> >> >>> >> interacted with the endogeneous variable. Furthermore, I
>> >> >> >> >>> >> need interactions of those squared exogeneous variables
>> >> >> >> >>> >> and the endogeneous
>> >> >> variables.
>> >> >> >> >>> >> Leaving additional control variables and further
>> >> >> >> >>> >> instruments aside, this already leads to the following
>> >> >> >> >>> >> simplified
>> >> regression:
>> >> >> >> >>> >>
>> >> >> >> >>> >> ivreg2 y ex1 ex2 (en ex1_en ex2_en (ex1)^2_en (ex2)^2_en
>> >> >> >> >>> >> = z ex1_z ex2_z (ex1)^2_z (ex2)^2_z)
>> >> >> >> >>> >>
>> >> >> >> >>> >> So far, so good. However, I’m not sure how exactly to
>> >> >> >> >>> >> examine instrument relevance and exogeneity, or which
>> >> >> >> >>> >> statistics/tests to report.
>> >> >> >> >>> >>
>> >> >> >> >>> >> As regards instrument relevance, as to be assessed by
>> >> >> >> >>> >> the first stage F statistic, the F-statistics on “en”
>> >> >> >> >>> >> clearly differ depending on whether I instrument solely
>> >> >> >> >>> >> for “en”, or whether I also instrument for the linear
>> >> >> >> >>> >> and non-linear interaction terms built
>> >> >> >> around “en”.
>> >> >> >> >>> >> Which F statistic is the correct one to refer to?
>> >> >> >> >>> >>
>> >> >> >> >>> >> Considering I have multiple instruments Z, I am also not
>> >> >> >> >>> >> sure which overidentification tests and results I should
>> >> >> >> >>> >> rely on and report.  As holds for the F statistics, the
>> >> >> >> >>> >> tests of overidentifying restrictions (Sargan N*R-sq
>> >> >> >> >>> >> test as well as Basmann test) provided by both ivreg2
>> >> >> >> >>> >> and overid differ between instrumenting solely “en” or
>> >> >> >> >>> >> also for the interaction terms built around “en”.
>> >> >> >> >>> >>
>> >> >> >> >>> >> Any help is greatly appreciated!
>> >> >> >> >>> >> Jason
>> >> >> >> >>> >>
>> >> >> >> >>> >> *
>> >> >> >> >>> >> *   For searches and help try:
>> >> >> >> >>> >> *   http://www.stata.com/help.cgi?search
>> >> >> >> >>> >> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> >> >> >> >>> >> *   http://www.ats.ucla.edu/stat/stata/
>> >> >> >> >>> >
>> >> >> >> >>> >
>> >> >> >> >>> > -----
>> >> >> >> >>> > Sunday Times Scottish University of the Year 2011-2013
>> >> >> >> >>> > Top in the UK for student experience
>> >> >> >> >>> > Fourth university in the UK and top in Scotland (National
>> >> >> >> >>> > Student Survey 2012)
>> >> >> >> >>> >
>> >> >> >> >>> > We invite research leaders and ambitious early career
>> >> >> >> >>> > researchers to join us in leading and driving research in
>> >> >> >> >>> > key inter-disciplinary themes.
>> >> >> >> >>> > Please see www.hw.ac.uk/researchleaders for further
>> >> >> >> >>> > information and how to apply.
>> >> >> >> >>> >
>> >> >> >> >>> > Heriot-Watt University is a Scottish charity registered
>> >> >> >> >>> > under charity number SC000278.