Re: RE: st: ivreg2: interacting the endogenous regressor

From   Christopher Baum <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: RE: st: ivreg2: interacting the endogenous regressor
Date   Fri, 25 May 2012 13:40:27 -0400

On May 25, 2012, at 8:33 AM, statalist-digest wrote:

> Kit, from what I understand, your suggestion is to treat the two
> variables (endogenous and interaction) as independent, thus running
> something like:
> 		ivreg2 y ex (en en_ex = z z_ex)
> where 'ex' is the exogenous variable, 'en' is the endogenous variable,
> 'z' is the instrument (excluded variable), 'en_ex' is the interaction
> between the endogenous and the exogenous variables and 'z_ex' is the
> interaction between the exogenous variable and the instrument. 
> My question is: doesn't this assume that 'en' and 'en_ex' are
> independent? In other words, isn't it that the first-stage regressions
> predict 'en' independently of the prediction of 'en_ex' (and, inversely,
> predict 'en_ex' independently of the prediction of 'en')? This would be
> more of a problem (I think) the higher the variance of 'en' relative to
> that of 'ex'. But, importantly, if it is the other way round (i.e., if
> the endogenous variable has a much lower variance than the exogenous
> one, so that 'en' and 'en_ex' are not too collinear), isn't it that my
> prediction of 'en_ex' using 'z_ex' will essentially be a regression of
> 'ex' on 'ex' - and thus basically irrelevant??
> I was thinking that ideally one ought to run a first-stage regression of
> the form "reg  en z ex" (plus lags, as appropriate) and then use the
> prediction to create the interaction term between the exogenous
> variable, on the one hand, and the prediction of the endogenous
> variable, on the other, before moving on with the second-stage
> regression. Is this the wrong way of thinking about it? And, if not, is
> there a way to implement this in Stata??

I don't quite understand your concern. That is indeed the equation I proposed should be estimated, and the RHS endogenous
variables (the dependent variables in the nonexistent first-stage regressions, or FSRs) are indeed en and en*ex. In writing
down a regression equation, we never assume that the regressors are independent (presumably you mean uncorrelated, or
orthogonal?). In the textbook case, if they were, you wouldn't need multiple regression. So we imagine that ex, en and en_ex
are by construction correlated to some degree, as are z and z_ex.

If you were to run FSRs (you can see them by specifying the -first- option in -ivregress- or Baum-Schaffer-Stillman -ivreg2-),
we would naturally expect the predicted values of those FSRs to be correlated, just as en and en_ex are themselves correlated.
So what? If the variance of ex is very small, it is a lousy regressor, by the simple logic of regression: the smaller the
variance (or, properly, variation) of the regressor, the larger the standard error of its coefficient, cet. par. The same goes
for the en variable; if it has a very small variance/variability, how likely is it to explain much in the relationship?
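
For concreteness, here is a minimal sketch of the specification discussed above, with the -first- option added so the FSRs are displayed. The simulated data, the seed, and all coefficient values are assumptions made up purely for illustration; only the variable names (y, ex, en, z, en_ex, z_ex) follow the thread.

```stata
* Hypothetical simulated data, for illustration only
clear
set seed 12345
set obs 1000
generate z  = rnormal()                 // excluded instrument
generate ex = rnormal()                 // exogenous regressor
generate u  = rnormal()                 // structural error
generate en = 0.8*z + 0.5*u + rnormal() // endogenous: correlated with u
generate y  = 1 + 0.5*ex + en + 0.3*en*ex + u

* Interactions: en_ex is treated as a second endogenous regressor,
* z_ex as a second excluded instrument
generate en_ex = en*ex
generate z_ex  = z*ex

* One-step IV estimation; -first- displays both first-stage regressions
ivregress 2sls y ex (en en_ex = z z_ex), first

* Equivalent call with -ivreg2- (user-written; needs: ssc install ivreg2)
* ivreg2 y ex (en en_ex = z z_ex), first
```

Both endogenous regressors are instrumented jointly in a single command, so the reported standard errors are the appropriate IV ones rather than those from a second-stage regression run by hand.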

I would emphatically NOT recommend "rolling your own" by running FSRs by hand and then trying to do something sensible with
them. That almost surely will come to grief, and in the worst case you will have entered the land of the 'forbidden regression': e.g., see Mark's comments in


Kit Baum   |   Boston College Economics & DIW Berlin
An Introduction to Stata Programming
An Introduction to Modern Econometrics Using Stata
