Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Using ivregress when the endogenous variable is used in an interaction term in the main regression
From: Nick Kohn <[email protected]>
To: [email protected]
Subject: Re: st: Using ivregress when the endogenous variable is used in an interaction term in the main regression
Date: Wed, 21 Dec 2011 18:58:44 +0100
Sorry for the confusion - X1 is included as a stand-alone term.
To be more detailed, my model looks like this (the X variables are exogenous, the E variables are endogenous):
dY = X1 + X2
+ X1*X3
+ X1*X3*E1
+ X1*X3*E2
+ X1*X3*E3
+ controls
X3 is an indicator variable that equals 1 when X1 <= 0 and 0 otherwise.
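
A minimal sketch of how this specification might be set up with -ivregress-, under assumptions that are not in the post: the data hold excluded instruments Z1-Z3 (one per endogenous variable), the controls are called C1 and C2, and X3 already exists as a variable. All of these names are hypothetical. The sketch instruments the three endogenous interactions with the Z variables and their interactions with X1*X3, so the model is overidentified and an overid test is available. Whether the interacted instruments are valid is exactly the question debated in the quoted thread below, so treat this as one option, not a settled recommendation.

```stata
* Hypothetical names: Z1-Z3 (excluded instruments), C1 C2 (controls)
local controls C1 C2

* Exogenous interaction X1*X3
generate double X1X3 = X1*X3

* Endogenous interactions and the matching interacted instruments
foreach k in 1 2 3 {
    generate double X1X3E`k' = X1X3*E`k'
    generate double X1X3Z`k' = X1X3*Z`k'
}

* 2SLS: X1, X2, X1X3 and the controls are treated as exogenous;
* the three products with E1-E3 are treated as endogenous
ivregress 2sls dY X1 X2 X1X3 `controls' ///
    (X1X3E1 X1X3E2 X1X3E3 = Z1 Z2 Z3 X1X3Z1 X1X3Z2 X1X3Z3), vce(robust)

* First-stage diagnostics and test of overidentifying restrictions
estat firststage
estat overid
```

Doing everything in a single -ivregress- call, rather than running the first stage by hand and plugging in fitted values, keeps the reported standard errors correct. The overid test is only informative if the instruments are not weak.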
On Wed, Dec 21, 2011 at 6:44 PM, Austin Nichols <[email protected]> wrote:
> Tirthankar Chakravarty <[email protected]>:
> I don't see anywhere that the X1 is included as a main effect as
> opposed to just being included in the product X1*X2. (Though it is
> not clear what is included in "+controls" in the post.) It seems that
> X1 is exogenous by assumption, i.e. X1 is uncorrelated with e while X2
> is correlated with e. There are no quadratic terms in Z in my
> suggestion. Note that you suggested instrumenting with X2hat*X1 and
> X2hat is linear in Z.
>
> On Wed, Dec 21, 2011 at 12:15 PM, Tirthankar Chakravarty
> <[email protected]> wrote:
>> " It does not seem too much of a stretch to assume Z*X1
>> uncorrelated with e as well (which implies X2hat*X1 uncorrelated with
>> e)"
>>
>> This part is the problem. When you form cross-products of the
>> instrument matrix, you will end up with quadratic terms in Z, coming
>> from terms like the one you mention, which will need to be
>> uncorrelated with the structural errors, hence the independence
>> requirement.
>>
>> Again, note that X1 is included so there is no overidentification (or,
>> at best, the same degree of overidentification as without the
>> interaction term).
>>
>> T
>>
>> On Wed, Dec 21, 2011 at 8:57 AM, Austin Nichols <[email protected]> wrote:
>>> Tirthankar Chakravarty <[email protected]>:
>>> No conditional independence assumed, though of course an independence
>>> assumption lets you form all kinds of transformations of Z to use as
>>> excluded instruments.
>>>
>>> We need Z, Z*X1, and X1 uncorrelated with e, but Z and e were already
>>> assumed uncorrelated and X1 is exogenous by assumption as well, in the
>>> original post. It does not seem too much of a stretch to assume Z*X1
>>> uncorrelated with e as well (which implies X2hat*X1 uncorrelated with
>>> e), but if we use all 3 as instruments we will see evidence of any
>>> violations of assumptions in the overid test (assuming no weak
>>> instruments problem).
>>>
>>> On Wed, Dec 21, 2011 at 11:44 AM, Tirthankar Chakravarty
>>> <[email protected]> wrote:
>>>> Austin,
>>>>
>>>> I agree re: well-cited papers.
>>>>
>>>> Note that the efficiency you mention comes at a cost. As I pointed out
>>>> in my previous Statalist reply:
>>>> http://www.stata.com/statalist/archive/2011-08/msg01496.html
>>>> the instrumenting strategy you suggest requires the instruments to be
>>>> conditionally independent rather than just uncorrelated with the
>>>> structural errors.
>>>>
>>>> T
>>>>
>>>> On Wed, Dec 21, 2011 at 7:57 AM, Austin Nichols <[email protected]> wrote:
>>>>> Nick Kohn <[email protected]>:
>>>>> Or better, instrument for X1*X2 using Z, Z*X1, and X1.
>>>>> For maximal efficiency given your assumptions you may prefer
>>>>> to instrument for X1*X2 using Z*X1, or even
>>>>> to instrument for X1*X2 using X2hat*X1,
>>>>> but you should build in an overid test whenever feasible.
>>>>>
>>>>> Just because a well-cited paper does something wrong does not mean you
>>>>> have to, though.
>>>>>
>>>>> Including the main effects of X1 and X2 makes for harder interpretation, but
>>>>> will make you a lot more confident of your answers once you have worked out the
>>>>> interpretation.
>>>>>
>>>>> On Wed, Dec 21, 2011 at 9:20 AM, Tirthankar Chakravarty
>>>>> <[email protected]> wrote:
>>>>>> In that case, none of this is necessary. Just instrument for X1*X2
>>>>>> using Z. All standard results apply.
>>>>>>
>>>>>> T
>>>>>>
>>>>>> On Wed, Dec 21, 2011 at 6:03 AM, Nick Kohn <[email protected]> wrote:
>>>>>>> Hmmm I see what you mean, but I'm following the methodology of a
>>>>>>> well-cited paper that does the same thing.
>>>>>>>
>>>>>>> I'll be sure to discuss this limitation, but in terms of using this
>>>>>>> model, would the 3 steps in my last message be correct?
>>>>>>>
>>>>>>> On Wed, Dec 21, 2011 at 2:56 PM, Tirthankar Chakravarty
>>>>>>> <[email protected]> wrote:
>>>>>>>> I wanted to indirectly confirm that you did have the main effect in
>>>>>>>> the regression because even though I don't know the nature of your
>>>>>>>> study, a hard-to-defend methodological position arises when you
>>>>>>>> include interaction terms without including the main effect. You might
>>>>>>>> want to take that on the authority of someone who (literally) wrote
>>>>>>>> the book on the subject:
>>>>>>>>
>>>>>>>> http://www.stata.com/statalist/archive/2011-03/msg00188.html
>>>>>>>>
>>>>>>>> and reconsider your decision to not include the main effect.
>>>>>>>>
>>>>>>>> T
>>>>>>>>
>>>>>>>> On Wed, Dec 21, 2011 at 5:46 AM, Nick Kohn <[email protected]> wrote:
>>>>>>>>> My model doesn't have X2 as a separate term, so in terms of the model
>>>>>>>>> you had it looks like:
>>>>>>>>> Y = b*X1*X2 + controls
>>>>>>>>> So the only place the endogenous variable comes up is the interaction term.
>>>>>>>>>
>>>>>>>>> At the risk of being repetitive, would these be the correct steps (so
>>>>>>>>> essentially only step 3 changes from what you said):
>>>>>>>>> 1) regress X2 on all instruments, exogenous variables and controls
>>>>>>>>> 2) Form interactions of X2hat with the exogenous variable X1, that is, X2hat*X1
>>>>>>>>> 3) ivregress instrumenting for X2*X1 using X2hat*X1.
>>>>>>>>>
>>>>>>>>> On Wed, Dec 21, 2011 at 1:44 PM, Tirthankar Chakravarty
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>> Not quite; here is the recommended procedure (I am assuming that you
>>>>>>>>>> have the main effect of the endogenous variable in there as in Y =
>>>>>>>>>> a*X2 + b*X1*X2 + controls):
>>>>>>>>>>
>>>>>>>>>> 1) -regress- X2 on _all_ instruments (included exogenous controls and
>>>>>>>>>> excluded instruments) and get predictions X2hat.
>>>>>>>>>>
>>>>>>>>>> 2) Form interactions of X2hat with the exogenous variable X1, that is, X2hat*X1.
>>>>>>>>>>
>>>>>>>>>> 3) -ivregress- instrumenting for X2 and X2*X1 using X2hat and X2hat*X1.
>>>>>>>>>>
>>>>>>>>>> Note that there is a distinction between making two calls to
>>>>>>>>>> -regress- and using -ivregress- for 3).
>>>>>>>>>>
>>>>>>>>>> T
>>>>>>>>>>
>>>>>>>>>> On Wed, Dec 21, 2011 at 3:43 AM, Nick Kohn <[email protected]> wrote:
>>>>>>>>>>> Thanks for the reply.
>>>>>>>>>>>
>>>>>>>>>>> My simplified model is (X2 is endogenous):
>>>>>>>>>>> Y = b*X1*X2 + controls
>>>>>>>>>>>
>>>>>>>>>>> With regard to the third option you suggest, would I do the following?
>>>>>>>>>>>
>>>>>>>>>>> 1) First stage regression to get X2hat using the instrument Z
>>>>>>>>>>> 2) Run the first stage again but use X1*X2hat as the instrument for
>>>>>>>>>>> X1*X2 (so Z is no longer used)
>>>>>>>>>>> 3) Run the second stage using (X1*X2)hat (so the whole product is
>>>>>>>>>>> fitted from step 2))
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Dec 21, 2011 at 12:24 PM, Tirthankar Chakravarty
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>> You can see my previous reply to a similar question here:
>>>>>>>>>>>> http://www.stata.com/statalist/archive/2011-08/msg01496.html
>>>>>>>>>>>>
>>>>>>>>>>>> T
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Dec 21, 2011 at 2:24 AM, Nick Kohn <[email protected]> wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a specification in which the endogenous variable is interacted
>>>>>>>>>>>>> with an exogenous variable. Since I cannot multiply the variables
>>>>>>>>>>>>> directly in the regression, I create a new variable. In ivregress it
>>>>>>>>>>>>> makes no sense to use the entire interaction term as the endogenous
>>>>>>>>>>>>> variable.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I can do the first stage manually (and then use the fitted value in
>>>>>>>>>>>>> the main regression), however, from what I remember the standard
>>>>>>>>>>>>> errors will be wrong when doing it manually.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there a way to overcome this?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/