RE: st: RE: Testing for instrument relevance and overidentification when the endogeneous variable is used in interaction terms


From   "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: Testing for instrument relevance and overidentification when the endogeneous variable is used in interaction terms
Date   Mon, 10 Jun 2013 17:14:07 +0000

Jason,

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner- 
> statalist@hsphsun2.harvard.edu] On Behalf Of Jason Wichert
> Sent: 07 June 2013 12:35
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: RE: Testing for instrument relevance and 
> overidentification when the endogeneous variable is used in 
> interaction terms
> 
> Mark,
> 
> Alright, that’s a relief. However, as promised (threatened?), I’ve 
> come up with some additional questions.
> 
> Isn’t the approach discussed most recently, i.e.
> 
> [1] ivreg2 y controls ex1 ex2 (en en_ex1 en_ex2 = enhat enhat_ex1 
> enhat_ex2)
>  (notation: ex1, ex2 = exogenous variables; en = endogenous variable; 
> z1, z2 = instruments for en)
> 
> in effect the same as running additional first-stage regressions (for 
> lack of a better term) of the type
> 
> [2] regress en_ex1 controls ex1 ex2 enhat enhat_ex1
> 
> and
> 
> [3] regress en_ex2 controls ex1 ex2 enhat enhat_ex2
> 
> just with the additional/unnecessary respective instruments enhat_ex2 
> and enhat_ex1 in [2] and [3]?

I'm not sure (to be honest, I've lost track of what is what).  You have equivalence with straight IV only if enhat_ex2 and enhat_ex1 are "unnecessary" because of perfect collinearity.  If you don't have perfect collinearity, and your IVs enhat, enhat_ex1 and enhat_ex2 are linear combinations of a longer list of variables you think are exogenous, then you may have taken a step towards solving your problem: by collapsing your full set of IVs into a few linear combinations, you end up with a smaller number of IVs.
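
For reference, the procedure behind approach [1] might be sketched roughly as follows. This is only an illustration in the thread's notation, not a tested recipe: it assumes ivreg2 is installed, that y, en, ex1, ex2, z1 and z2 exist in the data, and it uses hypothetical control variables c1 and c2 in place of "controls".

```stata
* Sketch only: assumes ivreg2 is installed (ssc install ivreg2).
* c1 c2 are hypothetical stand-ins for the "controls" in the thread.
local controls c1 c2

* Preliminary regression (not the 2SLS first stage): en on everything
* treated as exogenous. Further terms, e.g. interactions of ex1 and ex2
* with z1 and z2, could be added to the right-hand side.
regress en `controls' ex1 ex2 z1 z2
predict double enhat, xb

* Interactions of the endogenous variable and of its fitted value
generate double en_ex1    = en*ex1
generate double en_ex2    = en*ex2
generate double enhat_ex1 = enhat*ex1
generate double enhat_ex2 = enhat*ex2

* IV with three endogenous regressors and three excluded instruments
ivreg2 y `controls' ex1 ex2 (en en_ex1 en_ex2 = enhat enhat_ex1 enhat_ex2), robust first
```

With three excluded instruments for three endogenous regressors, the equation is exactly identified, so no overidentification test is available from this specification.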

As for the control function approach, that looks promising.  But maybe others on the list who use it have something to contribute here....
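
One practical point if you go that route: the residual entering the second regression is itself estimated, so the OLS standard errors from that regression are generally not valid as they stand. A common workaround is to bootstrap both steps together. A minimal sketch, assuming the interaction variables ex1_en and ex2_en already exist and using a hypothetical program name cf_est (the reduced form here omits any instrument interactions for brevity):

```stata
* Sketch only: cf_est is a hypothetical name; variables follow the
* thread's notation and are assumed to exist in the data.
capture program drop cf_est
program define cf_est, eclass
    * Reduced form for the endogenous variable, then its residual
    capture drop en_resid
    regress en ex1 ex2 z1 z2
    predict double en_resid, residuals
    * Outcome equation with the residual as an additional regressor;
    * this last regress leaves its results in e() for bootstrap
    regress y ex1 ex2 en ex1_en ex2_en en_resid
end

* Resample and repeat both steps so the standard errors reflect
* the estimation of en_resid
bootstrap _b, reps(500) seed(12345): cf_est
```

The number of replications and the seed are arbitrary choices for illustration.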

--Mark

> 
> Besides being tedious and error-prone, approach [1] doesn’t seem to 
> “cure” the 2SLS test results, since I’m again instrumenting several 
> interaction terms with instruments that are weakly correlated or 
> uncorrelated with the instrumented terms (such as enhat_ex2 for 
> ex1_en).
> 
> I read up on the “control function approach”, suggested largely by 
> Wooldridge as an alternative. He mentions it in the 2002 edition of 
> "Econometric Analysis of Cross Section and Panel Data", in a comment he 
> made in http://www.stata.com/statalist/archive/2011-03/msg00187.html , 
> and in lecture slides I found online ( 
> http://www.eief.it/files/2011/10/slides_3_controlfuncs.pdf ). In the 
> latter, regarding forbidden regressions and the control function 
> approach, he states “Danger with plugging in fitted values for y2 [the 
> endog. variable] is that one might be tempted to plug y2_hat into 
> nonlinear functions, say (y2)^2 or y2_z1. This does not result in 
> consistent estimation of the scaled parameters or the partial effects. 
> If we believe y2 has a linear RF with additive normal error 
> independent of z, the addition of v2_hat solves the endogeneity 
> problem *regardless* of how y2 appears.”
> 
> I’m considering running a single such control function regression, 
> predicting the residual, and including it in my OLS (or what would 
> otherwise be the second stage of my 2SLS) together with all the 
> interaction terms (for the sake of brevity, leaving quadratic terms of 
> ex1 and ex2 aside), i.e.
> 
> regress en ex1 ex2 z1 z2 en1_z1 en1_z2 en2_z1 en2_z2
> 
> predict en_resid, resid
> 
> regress y ex1 ex2 en ex1_en ex2_en en_resid
> 
> This almost sounds too simple to be true to my naïve understanding of 
> the matter. So again, any feedback is highly appreciated.
> 
> Jason
> 
> 
> On Thu, Jun 6, 2013 at 10:26 PM, Schaffer, Mark E 
> <M.E.Schaffer@hw.ac.uk>
> wrote:
> > Jason,
> >
> > I was just generalizing - in your case there's only one preliminary 
> > regression, namely the one to get en_hat.
> >
> > Cheers,
> > Mark
> >
> >> -----Original Message-----
> >> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner- 
> >> statalist@hsphsun2.harvard.edu] On Behalf Of Jason Wichert
> >> Sent: 06 June 2013 13:27
> >> To: statalist@hsphsun2.harvard.edu
> >> Subject: Re: st: RE: Testing for instrument relevance and 
> >> overidentification when the endogeneous variable is used in 
> >> interaction terms
> >>
> >> Mark,
> >>
> >> As many times before, thank you very much. However, you got me a 
> >> little confused with your statement “when you do the various 
> >> preliminary regressions” for getting fitted values. My 
> >> understanding was to get fitted values solely for my one truly 
> >> endogenous variable “en” from a single regression of “en” on all 
> >> included and excluded instruments (including ex1 and ex2, which are 
> >> to be interacted with “en”), then to form interactions of the 
> >> fitted/predicted values of “en” with ex1 and ex2, and ultimately to 
> >> use the fitted values and those interactions (enhat, enhat_ex1, 
> >> enhat_ex2) as instruments for en, en_ex1, en_ex2 in ivreg2.
> >> Apologies for asking again, but considering the difficulties 
> >> encountered and discussed so far, I want to make sure I follow the 
> >> correct procedure and stay away from any territory of forbidden 
> >> regressions and the like.
> >>
> >> I’m afraid you and Statalist haven’t heard the last of me and this 
> >> issue just yet.
> >>
> >> Kind regards,
> >> Jason
> >>
> >> On Thu, Jun 6, 2013 at 1:30 PM, Schaffer, Mark E 
> >> <M.E.Schaffer@hw.ac.uk>
> >> wrote:
> >> > Jason,
> >> >
> >> > I think that's right.  If it's the procedure I think you have in 
> >> > mind, the intuition behind it is that you are fairly confident 
> >> > that your included and excluded instruments (ex and z in your 
> >> > notation, I think) in various forms (levels, squares, interactions 
> >> > with each other, etc.) are all valid instruments.  When you do the 
> >> > various preliminary regressions ("first stage" is probably the 
> >> > wrong term - it's not the same thing as the 1st stage of 2SLS) and 
> >> > get fitted values, those fitted values are linear combinations of 
> >> > various valid instruments.  Since they're linear combinations of 
> >> > exogenous things, they're also exogenous and can be used as 
> >> > excluded instruments or interacted with other exogenous variables 
> >> > to get still more instruments.  The reason to use linear 
> >> > combinations instead of the variables separately is to avoid the 
> >> > various problems that come from using a large number of excluded 
> >> > instruments.  Of course, there's a lot of other stuff you also 
> >> > have to believe for this to work, but that's your call ... good 
> >> > luck!
> >> >
> >> > HTH,
> >> > Mark

<snip - getting too long for the list server>


