Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Using ivregress when the endogenous variable is used in an interaction term in the main regression


From   Tirthankar Chakravarty <[email protected]>
To   [email protected]
Subject   Re: st: Using ivregress when the endogenous variable is used in an interaction term in the main regression
Date   Thu, 22 Dec 2011 11:30:13 -0800

Christine,

1) For the two-way interaction with a binary endogenous regressor, your
question is fully answered in Prof. Wooldridge's reply here:
http://www.stata.com/statalist/archive/2011-03/msg00188.html

This is analogous to the procedure suggested earlier in this thread for
the interaction with a continuous endogenous regressor. In fact, that
reply also discusses a three-way interaction.

2) The three-way interaction poses no new challenges if the other two
terms are both exogenous.

3) If Z, X1 and X3 are exogenous, the interaction X2hat*X1*X3 is
necessarily exogenous.
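
A minimal sketch of the three steps described earlier in this thread,
run on simulated data. The variable names (y, x1, x2, z), the
data-generating process and the use of -regress- in the first stage are
illustrative assumptions only; the linked reply is the place to check
how the first stage should be specified for a binary regressor.

```stata
* Sketch only: simulated data and hypothetical variable names
* (y, x1, x2, z are assumptions, not taken from the thread).
clear
set seed 12345
set obs 2000
generate double z  = rnormal()          // excluded instrument
generate double x1 = rnormal()          // exogenous regressor
generate double e  = rnormal()          // structural error
* Binary endogenous regressor: depends on z and on the error e
generate byte x2 = (0.8*z + 0.5*e + rnormal() > 0)
generate double x2_x1 = x2*x1           // endogenous interaction
generate double y = 1 + 0.5*x1 + 1.0*x2 + 0.7*x2_x1 + e

* Step 1: regress x2 on all instruments (included and excluded)
* and get predictions x2hat
regress x2 z x1
predict double x2hat, xb

* Step 2: interact the fitted value with the exogenous variable
generate double x2hat_x1 = x2hat*x1

* Step 3: -ivregress- instrumenting for x2 and x2*x1
* using x2hat and x2hat*x1
ivregress 2sls y x1 (x2 x2_x1 = x2hat x2hat_x1), vce(robust)
```

Under the same assumptions, a three-way interaction would follow the
same pattern: add x2*x1*x3 as a further endogenous regressor and
x2hat*x1*x3 as its instrument.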

T

On Thu, Dec 22, 2011 at 11:19 AM, Christine Scheef
<[email protected]> wrote:
> Hey,
>
> I am following your discussion since I am working on a similar problem at
> the moment. However, my endogenous variable in the interaction is binary.
> I have 3 questions:
> - Since the endogenous variable is binary, is it right to use logit
> instead of regress in the first stage?
> - I also want to calculate a 3-way interaction with 2 continuous exogenous
> variables and the endogenous binary variable. Can I form the interactions
> of X2hat with X1 and X3, that is X2hat*X1*X3?
> - When calculating the third step with ivregress, do I still need to
> check for the exogeneity of the instrument variables X2hat and X2hat*X1?
>
> I very much appreciate your help.
>
> Best,
> Christine
>
>
> Nick,
>
> I don't have a specific reference in mind, but I suppose you should be
> able to construct a workable explanation from Prof. Wooldridge's reply
> (indeed, you can probably directly cite it) and some directed
> googling.
>
> T
>
> On Wed, Dec 21, 2011 at 10:01 AM, Nick Kohn <[email protected]>
> wrote:
>> My apologies for spamming but I also wanted to mention that I'm trying
>> out the specification that includes the endogenous variables as stand
>> alone terms.
>>
>> I'm not sure whether I'll use it in my paper though because I'll need
>> to provide a justification of why I deviate from the paper I cite, and
>> going into long winded econometric arguments is beyond the scope of
>> what I'm doing.
>>
>> Is there a paper or book I can cite that explains why adding the
>> levels is appropriate?
>>
>> On Wed, Dec 21, 2011 at 6:58 PM, Nick Kohn <[email protected]> wrote:
>>> Sorry for the confusion - X1 is included as a stand alone term.
>>>
>>> To be more detailed, my model looks like this (X is exogenous, E is
>>> endogenous):
>>>
>>> dY = X1 + X2
>>>     + X1*X3
>>>     + X1*X3*E1
>>>     + X1*X3*E2
>>>     + X1*X3*E3
>>>     + controls
>>>
>>> X3 is an indicator variable that is equal to 1 when X1 <= 0
>>>
>>> On Wed, Dec 21, 2011 at 6:44 PM, Austin Nichols <[email protected]> wrote:
>>>> Tirthankar Chakravarty <[email protected]>:
>>>> I don't see anywhere that the X1 is included as a main effect as
>>>> opposed to just being included in the product X1*X2.  (Though it is
>>>> not clear what is included in "+controls" in the post.) It seems that
>>>> X1 is exogenous by assumption, i.e. X1 is uncorrelated with e while X2
>>>> is correlated with e. There are no quadratic terms in Z in my
>>>> suggestion. Note that you suggested instrumenting with X2hat*X1 and
>>>> X2hat is linear in Z.
>>>>
>>>> On Wed, Dec 21, 2011 at 12:15 PM, Tirthankar Chakravarty
>>>> <[email protected]> wrote:
>>>>> " It does not seem too much of a stretch to assume Z*X1
>>>>> uncorrelated with e as well (which implies X2hat*X1 uncorrelated with
>>>>> e)"
>>>>>
>>>>> This part is the problem. When you form cross-products of the
>>>>> instrument matrix, you will end up with quadratic terms in Z, coming
>>>>> from terms like the one you mention, which will need to be
>>>>> uncorrelated with the structural errors, hence the independence
>>>>> requirement.
>>>>>
>>>>> Again, note that X1 is included so there is no overidentification
>>>>> (or, at best, the same degree of overidentification as without the
>>>>> interaction term).
>>>>>
>>>>> T
>>>>>
>>>>> On Wed, Dec 21, 2011 at 8:57 AM, Austin Nichols <[email protected]> wrote:
>>>>>> Tirthankar Chakravarty <[email protected]>:
>>>>>> No conditional independence assumed, though of course an independence
>>>>>> assumption lets you form all kinds of transformations of Z to use as
>>>>>> excluded instruments.
>>>>>>
>>>>>> We need Z, Z*X1, and X1 uncorrelated with e, but Z and e were already
>>>>>> assumed uncorrelated and X1 is exogenous by assumption as well, in the
>>>>>> original post.  It does not seem too much of a stretch to assume Z*X1
>>>>>> uncorrelated with e as well (which implies X2hat*X1 uncorrelated with
>>>>>> e), but if we use all 3 as instruments we will see evidence of any
>>>>>> violations of assumptions in the overid test (assuming no weak
>>>>>> instruments problem).
>>>>>>
>>>>>> On Wed, Dec 21, 2011 at 11:44 AM, Tirthankar Chakravarty
>>>>>> <[email protected]> wrote:
>>>>>>> Austin,
>>>>>>>
>>>>>>> I agree re: well-cited papers.
>>>>>>>
>>>>>>> Note that the efficiency you mention comes at a cost. As I pointed out
>>>>>>> in my previous Statalist reply:
>>>>>>> http://www.stata.com/statalist/archive/2011-08/msg01496.html
>>>>>>> the instrumenting strategy you suggest requires the instruments to be
>>>>>>> conditionally independent rather than just uncorrelated with the
>>>>>>> structural errors.
>>>>>>>
>>>>>>> T
>>>>>>>
>>>>>>> On Wed, Dec 21, 2011 at 7:57 AM, Austin Nichols <[email protected]> wrote:
>>>>>>>> Nick Kohn <[email protected]>:
>>>>>>>> Or better, instrument for X1*X2 using Z, Z*X1, and X1.
>>>>>>>> For maximal efficiency given your assumptions you may prefer
>>>>>>>> to instrument for X1*X2 using Z*X1, or even
>>>>>>>> to instrument for X1*X2 using X2hat*X1,
>>>>>>>> but you should build in an overid test whenever feasible.
>>>>>>>>
>>>>>>>> Just because a well-cited paper does something wrong does not mean you
>>>>>>>> have to, though.
>>>>>>>>
>>>>>>>> Including the main effects of X1 and X2 makes for harder interpretation,
>>>>>>>> but will make you a lot more confident of your answers once you have
>>>>>>>> worked out the interpretation.
>>>>>>>>
>>>>>>>> On Wed, Dec 21, 2011 at 9:20 AM, Tirthankar Chakravarty
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> In that case, none of this is necessary. Just instrument for X1*X2
>>>>>>>>> using Z. All standard results apply.
>>>>>>>>>
>>>>>>>>> T
>>>>>>>>>
>>>>>>>>> On Wed, Dec 21, 2011 at 6:03 AM, Nick Kohn <[email protected]> wrote:
>>>>>>>>>> Hmmm I see what you mean, but I'm following the methodology of a well
>>>>>>>>>> cited paper that does the same thing.
>>>>>>>>>>
>>>>>>>>>> I'll be sure to discuss this limitation, but in terms of using this
>>>>>>>>>> model, would the 3 steps in my last message be correct?
>>>>>>>>>>
>>>>>>>>>> On Wed, Dec 21, 2011 at 2:56 PM, Tirthankar Chakravarty
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> I wanted to indirectly confirm that you did have the main effect in
>>>>>>>>>>> the regression because even though I don't know the nature of your
>>>>>>>>>>> study, a hard-to-defend methodological position arises when you
>>>>>>>>>>> include interaction terms without including the main effect. You might
>>>>>>>>>>> want to take that on the authority of someone who (literally) wrote
>>>>>>>>>>> the book on the subject:
>>>>>>>>>>>
>>>>>>>>>>> http://www.stata.com/statalist/archive/2011-03/msg00188.html
>>>>>>>>>>>
>>>>>>>>>>> and reconsider your decision to not include the main effect.
>>>>>>>>>>>
>>>>>>>>>>> T
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Dec 21, 2011 at 5:46 AM, Nick Kohn <[email protected]> wrote:
>>>>>>>>>>>> My model doesn't have X2 as a separate term, so in terms of the model
>>>>>>>>>>>> you had it looks like:
>>>>>>>>>>>>  Y = b*X1*X2 + controls
>>>>>>>>>>>> So the only place the endogenous variable comes up is the interaction term
>>>>>>>>>>>>
>>>>>>>>>>>> At the risk of being repetitive, would these be the correct steps (so
>>>>>>>>>>>> essentially only step 3 changes from what you said):
>>>>>>>>>>>> 1) regress X2 on all instruments, exogenous variables and controls
>>>>>>>>>>>> 2) Form interactions of X2hat with the exogenous variable X1, that is, X2hat*X1
>>>>>>>>>>>> 3) ivregress instrumenting for X2*X1 using X2hat*X1.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Dec 21, 2011 at 1:44 PM, Tirthankar Chakravarty
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> Not quite; here is the recommended procedure (I am assuming that you
>>>>>>>>>>>>> have the main effect of the endogenous variable in there as in Y =
>>>>>>>>>>>>> a*X2 + b*X1*X2 + controls):
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) -regress- X2 on _all_ instruments (included exogenous controls and
>>>>>>>>>>>>> excluded instruments) and get predictions X2hat.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2) Form interactions of X2hat with the exogenous variable X1, that is, X2hat*X1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3) -ivregress- instrumenting for X2 and X2*X1 using X2hat and X2hat*X1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Note that there is a distinction between two calls to -regress- and
>>>>>>>>>>>>> using -ivregress- for 3).
>>>>>>>>>>>>>
>>>>>>>>>>>>> T
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Dec 21, 2011 at 3:43 AM, Nick Kohn <[email protected]> wrote:
>>>>>>>>>>>>>> Thanks for the reply.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My simplified model is (X2 is endogenous):
>>>>>>>>>>>>>> Y = b*X1*X2 + controls
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In regards to the third option you suggest, would I do the following?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  1) First stage regression to get X2hat using the instrument Z
>>>>>>>>>>>>>>  2) Run the first stage again but use X1*X2hat as the instrument for
>>>>>>>>>>>>>> X1*X2 (so Z is no longer used)
>>>>>>>>>>>>>>  3) Run the second stage using (X1*X2)hat (so the whole product is
>>>>>>>>>>>>>> fitted from step 2))
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Dec 21, 2011 at 12:24 PM, Tirthankar Chakravarty
>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>> You can see my previous reply to a similar question here:
>>>>>>>>>>>>>>> http://www.stata.com/statalist/archive/2011-08/msg01496.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> T
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Dec 21, 2011 at 2:24 AM, Nick Kohn <[email protected]> wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have a specification in which the endogenous variable is interacted
>>>>>>>>>>>>>>>> with an exogenous variable. Since I cannot multiply the variables
>>>>>>>>>>>>>>>> directly in the regression, I create a new variable. In ivregress it
>>>>>>>>>>>>>>>> makes no sense to use the entire interaction term as the endogenous
>>>>>>>>>>>>>>>> variable.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I can do the first stage manually (and then use the fitted value in
>>>>>>>>>>>>>>>> the main regression), however, from what I remember the standard
>>>>>>>>>>>>>>>> errors will be wrong when doing it manually.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is there a way to overcome this?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



-- 
Tirthankar Chakravarty
[email protected]
[email protected]


