Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Christine Scheef <christine.scheef@unisg.ch>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Using ivregress when the endogenous variable is used in an interaction term in the main regression
Date: Thu, 22 Dec 2011 20:19:36 +0100

Hey,

I am following your discussion since I am working on a similar problem at the moment. However, my endogenous variable in the interaction is binary. I have 3 questions:

- Since the endogenous variable is binary, is it right to use logit instead of regress in the first stage?
- I also want to calculate a 3-way interaction with 2 continuous exogenous variables and the endogenous binary variable. Can I form the interactions of X2hat with X1 and X3, that is, X2hat*X1*X3?
- When calculating the third step with ivregress, do I still need to check for the exogeneity of the instrument variables X2hat and X2hat*X1?

I very much appreciate your help.

Best,
Christine

Nick,

I don't have a specific reference in mind, but I suppose you should be able to construct a workable explanation from Prof. Wooldridge's reply (indeed, you can probably directly cite it) and some directed googling.

T

On Wed, Dec 21, 2011 at 10:01 AM, Nick Kohn <coffeemug.nick@gmail.com> wrote:
> My apologies for spamming but I also wanted to mention that I'm trying
> out the specification that includes the endogenous variables as stand
> alone terms.
>
> I'm not sure whether I'll use it in my paper though because I'll need
> to provide a justification of why I deviate from the paper I cite, and
> going into long winded econometric arguments is beyond the scope of
> what I'm doing.
>
> Is there a paper or book I can cite that explains why adding the
> levels is appropriate?
>
> On Wed, Dec 21, 2011 at 6:58 PM, Nick Kohn <coffeemug.nick@gmail.com> wrote:
>> Sorry for the confusion - X1 is included as a stand alone term.
>>
>> To be more detailed, my model looks like this (X is exogenous, E is endogenous):
>>
>> dY = X1 + X2
>>      + X1*X3
>>      + X1*X3*E1
>>      + X1*X3*E2
>>      + X1*X3*E3
>>      + controls
>>
>> X3 is an indicator variable that is equal to 1 when X1 <= 0
>>
>> On Wed, Dec 21, 2011 at 6:44 PM, Austin Nichols <austinnichols@gmail.com> wrote:
>>> Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>:
>>> I don't see anywhere that the X1 is included as a main effect as
>>> opposed to just being included in the product X1*X2. (Though it is
>>> not clear what is included in "+controls" in the post.) It seems that
>>> X1 is exogenous by assumption, i.e. X1 is uncorrelated with e while X2
>>> is correlated with e. There are no quadratic terms in Z in my
>>> suggestion. Note that you suggested instrumenting with X2hat*X1 and
>>> X2hat is linear in Z.
>>>
>>> On Wed, Dec 21, 2011 at 12:15 PM, Tirthankar Chakravarty
>>> <tirthankar.chakravarty@gmail.com> wrote:
>>>> " It does not seem too much of a stretch to assume Z*X1
>>>> uncorrelated with e as well (which implies X2hat*X1 uncorrelated with
>>>> e)"
>>>>
>>>> This part is the problem. When you form cross-products of the
>>>> instrument matrix, you will end up with quadratic terms in Z, coming
>>>> from terms like the one you mention, which will need to be
>>>> uncorrelated with the structural errors, hence the independence
>>>> requirement.
>>>>
>>>> Again, note that X1 is included so there is no overidentification (or,
>>>> at best, the same degree of overidentification as without the
>>>> interaction term).
>>>>
>>>> T
>>>>
>>>> On Wed, Dec 21, 2011 at 8:57 AM, Austin Nichols <austinnichols@gmail.com> wrote:
>>>>> Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>:
>>>>> No conditional independence assumed, though of course an independence
>>>>> assumption lets you form all kinds of transformations of Z to use as
>>>>> excluded instruments.
>>>>>
>>>>> We need Z, Z*X1, and X1 uncorrelated with e, but Z and e were already
>>>>> assumed uncorrelated and X1 is exogenous by assumption as well, in the
>>>>> original post. It does not seem too much of a stretch to assume Z*X1
>>>>> uncorrelated with e as well (which implies X2hat*X1 uncorrelated with
>>>>> e), but if we use all 3 as instruments we will see evidence of any
>>>>> violations of assumptions in the overid test (assuming no weak
>>>>> instruments problem).
>>>>>
>>>>> On Wed, Dec 21, 2011 at 11:44 AM, Tirthankar Chakravarty
>>>>> <tirthankar.chakravarty@gmail.com> wrote:
>>>>>> Austin,
>>>>>>
>>>>>> I agree re: well-cited papers.
>>>>>>
>>>>>> Note that the efficiency you mention comes at a cost. As I pointed out
>>>>>> in my previous Statalist reply:
>>>>>> http://www.stata.com/statalist/archive/2011-08/msg01496.html
>>>>>> the instrumenting strategy you suggest requires the instruments to be
>>>>>> conditionally independent rather than just uncorrelated with the
>>>>>> structural errors.
>>>>>>
>>>>>> T
>>>>>>
>>>>>> On Wed, Dec 21, 2011 at 7:57 AM, Austin Nichols <austinnichols@gmail.com> wrote:
>>>>>>> Nick Kohn <coffeemug.nick@gmail.com>:
>>>>>>> Or better, instrument for X1*X2 using Z, Z*X1, and X1.
>>>>>>> For maximal efficiency given your assumptions you may prefer
>>>>>>> to instrument for X1*X2 using Z*X1, or even
>>>>>>> to instrument for X1*X2 using X2hat*X1,
>>>>>>> but you should build in an overid test whenever feasible.
>>>>>>>
>>>>>>> Just because a well-cited paper does something wrong does not mean you
>>>>>>> have to, though.
>>>>>>>
>>>>>>> Including the main effects of X1 and X2 makes for harder interpretation, but
>>>>>>> will make you a lot more confident of your answers once you have worked out the
>>>>>>> interpretation.
>>>>>>>
>>>>>>> On Wed, Dec 21, 2011 at 9:20 AM, Tirthankar Chakravarty
>>>>>>> <tirthankar.chakravarty@gmail.com> wrote:
>>>>>>>> In that case, none of this is necessary. Just instrument for X1*X2
>>>>>>>> using Z. All standard results apply.
>>>>>>>>
>>>>>>>> T
>>>>>>>>
>>>>>>>> On Wed, Dec 21, 2011 at 6:03 AM, Nick Kohn <coffeemug.nick@gmail.com> wrote:
>>>>>>>>> Hmmm I see what you mean, but I'm following the methodology of a well
>>>>>>>>> cited paper that does the same thing.
>>>>>>>>>
>>>>>>>>> I'll be sure to discuss this limitation, but in terms of using this
>>>>>>>>> model, would the 3 steps in my last message be correct?
>>>>>>>>>
>>>>>>>>> On Wed, Dec 21, 2011 at 2:56 PM, Tirthankar Chakravarty
>>>>>>>>> <tirthankar.chakravarty@gmail.com> wrote:
>>>>>>>>>> I wanted to indirectly confirm that you did have the main effect in
>>>>>>>>>> the regression because even though I don't know the nature of your
>>>>>>>>>> study, a hard-to-defend methodological position arises when you
>>>>>>>>>> include interaction terms without including the main effect. You might
>>>>>>>>>> want to take that on the authority of someone who (literally) wrote
>>>>>>>>>> the book on the subject:
>>>>>>>>>>
>>>>>>>>>> http://www.stata.com/statalist/archive/2011-03/msg00188.html
>>>>>>>>>>
>>>>>>>>>> and reconsider your decision to not include the main effect.
>>>>>>>>>>
>>>>>>>>>> T
>>>>>>>>>>
>>>>>>>>>> On Wed, Dec 21, 2011 at 5:46 AM, Nick Kohn <coffeemug.nick@gmail.com> wrote:
>>>>>>>>>>> My model doesn't have X2 as a separate term, so in terms of the model
>>>>>>>>>>> you had it looks like:
>>>>>>>>>>> Y = b*X1*X2 + controls
>>>>>>>>>>> So the only place the endogenous variable comes up is the interaction term
>>>>>>>>>>>
>>>>>>>>>>> At the risk of being repetitive, would these be the correct steps (so
>>>>>>>>>>> essentially only step 3 changes from what you said):
>>>>>>>>>>> 1) regress X2 on all instruments, exogenous variables and controls
>>>>>>>>>>> 2) Form interactions of X2hat with the exogenous variable X1, that is, X2hat*X1
>>>>>>>>>>> 3) ivregress instrumenting for X2*X1 using X2hat*X1.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Dec 21, 2011 at 1:44 PM, Tirthankar Chakravarty
>>>>>>>>>>> <tirthankar.chakravarty@gmail.com> wrote:
>>>>>>>>>>>> Not quite; here is the recommended procedure (I am assuming that you
>>>>>>>>>>>> have the main effect of the endogenous variable in there as in Y =
>>>>>>>>>>>> a*X2 + b*X1*X2 + controls):
>>>>>>>>>>>>
>>>>>>>>>>>> 1) -regress- X2 on _all_ instruments (included exogenous controls and
>>>>>>>>>>>> excluded instruments) and get predictions X2hat.
>>>>>>>>>>>>
>>>>>>>>>>>> 2) Form interactions of X2hat with the exogenous variable X1, that is, X2hat*X1.
>>>>>>>>>>>>
>>>>>>>>>>>> 3) -ivregress- instrumenting for X2 and X2*X1 using X2hat and X2hat*X1.
>>>>>>>>>>>>
>>>>>>>>>>>> Note that there is distinction between two calls to -regress- and
>>>>>>>>>>>> using -ivregress- for 3).
>>>>>>>>>>>>
>>>>>>>>>>>> T
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Dec 21, 2011 at 3:43 AM, Nick Kohn <coffeemug.nick@gmail.com> wrote:
>>>>>>>>>>>>> Thanks for the reply.
>>>>>>>>>>>>>
>>>>>>>>>>>>> My simplified model is (X2 is endogenous):
>>>>>>>>>>>>> Y = b*X1*X2 + controls
>>>>>>>>>>>>>
>>>>>>>>>>>>> In regards to the third option you suggest, would I do the following?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) First stage regression to get X2hat using the instrument Z
>>>>>>>>>>>>> 2) Run the first stage again but use X1*X2hat as the instrument for
>>>>>>>>>>>>> X1*X2 (so Z is no longer used)
>>>>>>>>>>>>> 3) Run the second stage using (X1*X2)hat (so the whole product is
>>>>>>>>>>>>> fitted from step 2))
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Dec 21, 2011 at 12:24 PM, Tirthankar Chakravarty
>>>>>>>>>>>>> <tirthankar.chakravarty@gmail.com> wrote:
>>>>>>>>>>>>>> You can see my previous reply to a similar question here:
>>>>>>>>>>>>>> http://www.stata.com/statalist/archive/2011-08/msg01496.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> T
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Dec 21, 2011 at 2:24 AM, Nick Kohn <coffeemug.nick@gmail.com> wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have a specification in which the endogenous variable is interacted
>>>>>>>>>>>>>>> with an exogenous variable. Since I cannot multiply the variables
>>>>>>>>>>>>>>> directly in the regression, I create a new variable. In ivregress it
>>>>>>>>>>>>>>> makes no sense to use the entire interaction term as the endogenous
>>>>>>>>>>>>>>> variable.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I can do the first stage manually (and then use the fitted value in
>>>>>>>>>>>>>>> the main regression), however, from what I remember the standard
>>>>>>>>>>>>>>> errors will be wrong when doing it manually.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is there a way to overcome this?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
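
For readers who want to try the steps discussed in this thread, here is a minimal Stata sketch under stated assumptions. The data are simulated, and every variable name (y, y0, x1, x2, z, c1) and every coefficient value is hypothetical, chosen only for illustration; none of it comes from any poster's data. The first part follows the three-step procedure recommended in the thread for a model with the main effect of X2 (-regress-, interaction with X2hat, then -ivregress-). The second part follows the suggestion to instrument X1*X2 with Z and Z*X1 in the model without the X2 main effect, which leaves one overidentifying restriction to test.

```stata
* Simulated data: all names and coefficients are illustrative assumptions
clear
set seed 12345
set obs 1000
generate double z  = rnormal()                    // excluded instrument
generate double x1 = 1 + rnormal()                // exogenous regressor
generate double c1 = rnormal()                    // exogenous control
generate double u  = rnormal()                    // structural error
generate double x2 = 0.8*z + 0.5*u + rnormal()    // endogenous: depends on u
generate double x2_x1 = x2*x1                     // interaction, built by hand

* Outcome with the main effect of x2, and one without it
generate double y  = 1 + 0.5*x1 + 1.0*x2 + 0.7*x2_x1 + 0.3*c1 + u
generate double y0 = 1 + 0.5*x1 + 0.7*x2_x1 + 0.3*c1 + u

* 1) -regress- X2 on all instruments (included and excluded), get X2hat
regress x2 z x1 c1
predict double x2hat, xb

* 2) interact X2hat with the exogenous variable X1
generate double x2hat_x1 = x2hat*x1

* 3) -ivregress-, instrumenting X2 and X2*X1 with X2hat and X2hat*X1
*    (not a second manual -regress-, whose standard errors would be wrong)
ivregress 2sls y x1 c1 (x2 x2_x1 = x2hat x2hat_x1)

* Variant without the X2 main effect: instrument X1*X2 with Z and Z*X1,
* with X1 as an included regressor, then test the overidentifying restriction
generate double z_x1 = z*x1
ivregress 2sls y0 x1 c1 (x2_x1 = z z_x1)
estat overid
```

Step 3 uses -ivregress- rather than plugging fitted values into a second -regress- because, as the original question notes, the standard errors from a manual second stage are not correct. The first specification is exactly identified, so -estat overid- is only informative in the second one.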

**Follow-Ups**:
- **Re: st: Using ivregress when the endogenous variable is used in an interaction term in the main regression** *From:* Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>
