Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Using ivregress when the endogenous variable is used in an interaction term in the main regression

From: Tirthankar Chakravarty
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Using ivregress when the endogenous variable is used in an interaction term in the main regression
Date: Wed, 21 Dec 2011 09:15:14 -0800

"It does not seem too much of a stretch to assume Z*X1
uncorrelated with e as well (which implies X2hat*X1 uncorrelated with
e)"

This part is the problem. When you form cross-products of the
instrument matrix, you will end up with quadratic terms in Z, coming
from terms like the one you mention, which will need to be
uncorrelated with the structural errors, hence the independence
requirement.

Again, note that X1 is included so there is no overidentification (or,
at best, the same degree of overidentification as without the
interaction term).

T
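
One way to spell out the distinction being argued above, as a sketch rather than a statement of either poster's exact setup: write e for the structural error and take the model with the main effect included. The coefficient symbols a and b follow the thread; the conditioning notation is an assumption added here for illustration.

```latex
% Structural equation (X_2 endogenous, X_1 exogenous, Z the excluded instrument)
Y = a X_2 + b X_1 X_2 + \text{controls} + e

% Assumed in the original post: Z and X_1 are each uncorrelated with e
E[Z e] = 0, \qquad E[X_1 e] = 0

% Using Z X_1 as an instrument needs one further moment condition,
% which the two conditions above do not imply
E[Z X_1 e] = 0

% It does follow from conditional mean independence, by iterated expectations
E[e \mid Z, X_1] = 0
\;\Longrightarrow\;
E[Z X_1 e] = E\big[ Z X_1 \, E[e \mid Z, X_1] \big] = 0
```

The disagreement is therefore over whether the third condition is a mild extra assumption or needs the stronger conditional statement to be credible.
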

On Wed, Dec 21, 2011 at 8:57 AM, Austin Nichols <austinnichols@gmail.com> wrote:
> Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>:
> No conditional independence assumed, though of course an independence
> assumption lets you form all kinds of transformations of Z to use as
> excluded instruments.
>
> We need Z, Z*X1, and X1 uncorrelated with e, but Z and e were already
> assumed uncorrelated and X1 is exogenous by assumption as well, in the
> original post.  It does not seem too much of a stretch to assume Z*X1
> uncorrelated with e as well (which implies X2hat*X1 uncorrelated with
> e), but if we use all 3 as instruments we will see evidence of any
> violations of assumptions in the overid test (assuming no weak
> instruments problem).
>
> On Wed, Dec 21, 2011 at 11:44 AM, Tirthankar Chakravarty
> <tirthankar.chakravarty@gmail.com> wrote:
>> Austin,
>>
>> I agree re: well-cited papers.
>>
>> Note that the efficiency you mention comes at a cost. As I pointed out
>> in my previous Statalist reply:
>> http://www.stata.com/statalist/archive/2011-08/msg01496.html
>> the instrumenting strategy you suggest requires the instruments to be
>> conditionally independent rather than just uncorrelated with the
>> structural errors.
>>
>> T
>>
>> On Wed, Dec 21, 2011 at 7:57 AM, Austin Nichols <austinnichols@gmail.com> wrote:
>>> Nick Kohn <coffeemug.nick@gmail.com>:
>>> Or better, instrument for X1*X2 using Z, Z*X1, and X1.
>>> For maximal efficiency given your assumptions you may prefer
>>> to instrument for X1*X2 using Z*X1, or even
>>> to instrument for X1*X2 using X2hat*X1,
>>> but you should build in an overid test whenever feasible.
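
A minimal Stata sketch of the first suggestion above, with the overidentification test built in. The variable names y, x1, x2, z, w1, and w2 are hypothetical, and the model is assumed to be Y = b*X1*X2 + controls with X1 itself left out of the main equation, so that X1 can serve as an excluded instrument.

```stata
* Hypothetical names: y outcome, x1 exogenous, x2 endogenous,
* z excluded instrument, w1 w2 controls
generate double x1x2 = x1 * x2
generate double zx1  = z * x1

* One endogenous regressor (x1x2), three excluded instruments (z, zx1, x1)
ivregress 2sls y w1 w2 (x1x2 = z zx1 x1), vce(robust)

* Test of the two overidentifying restrictions
estat overid

* First-stage diagnostics, to check for weak instruments
estat firststage
```

If X1 also enters the main equation as a regressor, it is no longer an excluded instrument and the number of overidentifying restrictions falls by one.
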
>>>
>>> Just because a well-cited paper does something wrong does not mean you
>>> have to, though.
>>>
>>> Including the main effects of X1 and X2 makes for harder interpretation, but
>>> will make you a lot more confident of your answers once you have worked out the
>>> interpretation.
>>>
>>> On Wed, Dec 21, 2011 at 9:20 AM, Tirthankar Chakravarty
>>> <tirthankar.chakravarty@gmail.com> wrote:
>>>> In that case, none of this is necessary. Just instrument for X1*X2
>>>> using Z. All standard results apply.
>>>>
>>>> T
>>>>
>>>> On Wed, Dec 21, 2011 at 6:03 AM, Nick Kohn <coffeemug.nick@gmail.com> wrote:
>>>>> Hmm, I see what you mean, but I'm following the methodology of a
>>>>> well-cited paper that does the same thing.
>>>>>
>>>>> I'll be sure to discuss this limitation, but in terms of using this
>>>>> model, would the 3 steps in my last message be correct?
>>>>>
>>>>> On Wed, Dec 21, 2011 at 2:56 PM, Tirthankar Chakravarty
>>>>> <tirthankar.chakravarty@gmail.com> wrote:
>>>>>> I wanted to indirectly confirm that you did have the main effect in
>>>>>> the regression because even though I don't know the nature of your
>>>>>> study, a hard-to-defend methodological position arises when you
>>>>>> include interaction terms without including the main effect. You might
>>>>>> want to take that on the authority of someone who (literally) wrote
>>>>>> the book on the subject:
>>>>>>
>>>>>> http://www.stata.com/statalist/archive/2011-03/msg00188.html
>>>>>>
>>>>>> and reconsider your decision to not include the main effect.
>>>>>>
>>>>>> T
>>>>>>
>>>>>> On Wed, Dec 21, 2011 at 5:46 AM, Nick Kohn <coffeemug.nick@gmail.com> wrote:
>>>>>>> My model doesn't have X2 as a separate term, so in terms of the model
>>>>>>> you had it looks like:
>>>>>>>  Y = b*X1*X2 + controls
>>>>>>> So the only place the endogenous variable comes up is the interaction term.
>>>>>>>
>>>>>>> At the risk of being repetitive, would these be the correct steps (so
>>>>>>> essentially only step 3 changes from what you said):
>>>>>>> 1) regress X2 on all instruments, exogenous variables and controls
>>>>>>> 2) Form interactions of X2hat with the exogenous variable X1, that is, X2hat*X1
>>>>>>> 3) ivregress instrumenting for X2*X1 using X2hat*X1.
>>>>>>>
>>>>>>> On Wed, Dec 21, 2011 at 1:44 PM, Tirthankar Chakravarty
>>>>>>> <tirthankar.chakravarty@gmail.com> wrote:
>>>>>>>> Not quite; here is the recommended procedure (I am assuming that you
>>>>>>>> have the main effect of the endogenous variable in there as in Y =
>>>>>>>> a*X2 + b*X1*X2 + controls):
>>>>>>>>
>>>>>>>> 1) -regress- X2 on _all_ instruments (included exogenous controls and
>>>>>>>> excluded instruments) and get predictions X2hat.
>>>>>>>>
>>>>>>>> 2) Form interactions of X2hat with the exogenous variable X1, that is, X2hat*X1.
>>>>>>>>
>>>>>>>> 3) -ivregress- instrumenting for X2 and X2*X1 using X2hat and X2hat*X1.
>>>>>>>>
>>>>>>>> Note that there is a distinction between two calls to -regress- and
>>>>>>>> using -ivregress- for 3).
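
The three steps above can be sketched in Stata as follows. This is an illustration under assumptions, not code from the thread: the variable names y, x1, x2, z, w1, and w2 are hypothetical, and X1 is assumed to be one of the included exogenous controls.

```stata
* Hypothetical names: y outcome, x1 exogenous, x2 endogenous,
* z excluded instrument, w1 w2 other controls

* 1) Regress X2 on all instruments and save the linear prediction
regress x2 z x1 w1 w2
predict double x2hat, xb

* 2) Interact the prediction with the exogenous variable
generate double x2hat_x1 = x2hat * x1
generate double x2_x1    = x2 * x1

* 3) Instrument for X2 and X2*X1 using X2hat and X2hat*X1
ivregress 2sls y x1 w1 w2 (x2 x2_x1 = x2hat x2hat_x1), vce(robust)
```

Step 3 uses -ivregress- rather than a second -regress- on the fitted values, so the reported standard errors are based on residuals computed from the actual X2 and X2*X1.
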
>>>>>>>>
>>>>>>>> T
>>>>>>>>
>>>>>>>> On Wed, Dec 21, 2011 at 3:43 AM, Nick Kohn <coffeemug.nick@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> My simplified model is (X2 is endogenous):
>>>>>>>>> Y = b*X1*X2 + controls
>>>>>>>>>
>>>>>>>>> Regarding the third option you suggest, would I do the following?
>>>>>>>>>
>>>>>>>>>  1) First stage regression to get X2hat using the instrument Z
>>>>>>>>>  2) Run the first stage again but use X1*X2hat as the instrument for
>>>>>>>>> X1*X2 (so Z is no longer used)
>>>>>>>>>  3) Run the second stage using (X1*X2)hat (so the whole product is
>>>>>>>>> fitted from step 2))
>>>>>>>>>
>>>>>>>>> On Wed, Dec 21, 2011 at 12:24 PM, Tirthankar Chakravarty
>>>>>>>>> <tirthankar.chakravarty@gmail.com> wrote:
>>>>>>>>>> You can see my previous reply to a similar question here:
>>>>>>>>>> http://www.stata.com/statalist/archive/2011-08/msg01496.html
>>>>>>>>>>
>>>>>>>>>> T
>>>>>>>>>>
>>>>>>>>>> On Wed, Dec 21, 2011 at 2:24 AM, Nick Kohn <coffeemug.nick@gmail.com> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I have a specification in which the endogenous variable is interacted
>>>>>>>>>>> with an exogenous variable. Since I cannot multiply the variables
>>>>>>>>>>> directly in the regression, I create a new variable. In ivregress it
>>>>>>>>>>> makes no sense to use the entire interaction term as the endogenous
>>>>>>>>>>> variable.
>>>>>>>>>>>
>>>>>>>>>>> I can do the first stage manually (and then use the fitted value in
>>>>>>>>>>> the main regression); however, from what I remember, the standard
>>>>>>>>>>> errors will be wrong when doing it manually.
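
The standard-error concern above can be seen by comparing the two approaches directly. The sketch below uses the simplest case, a single endogenous regressor with no interaction, and hypothetical variable names y, x2, z, w1, and w2.

```stata
* Hypothetical names: y outcome, x2 endogenous,
* z excluded instrument, w1 w2 controls

* Manual two-step: the coefficient on x2hat matches the 2SLS estimate,
* but the standard errors use residuals computed with x2hat, not x2
regress x2 z w1 w2
predict double x2hat, xb
regress y x2hat w1 w2

* Single call: same point estimates, with standard errors based on
* residuals computed from the actual x2
ivregress 2sls y w1 w2 (x2 = z)
```
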
>>>>>>>>>>>
>>>>>>>>>>> Is there a way to overcome this?
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

--
Tirthankar Chakravarty
tchakravarty@ucsd.edu
tirthankar.chakravarty@gmail.com