st: re: two stage least squares questions
We are interested in the impact of commuting on health.
So the regression is:
Y = a + b1X1 + b2X2 + e
Here Y is the health status of subjects, a continuous variable.
X1 is commuting time, a continuous and endogenous variable.
We would like to instrument X1 with Z1 and Z2, so the second regression is:
X1 = a + b1Z1 + b2Z2 + b3X2 + e
However, we think Z1, Z2 and X2 are associated with the decision on
whether subjects take up commuting instead of on commuting time, so we
prefer to use X1 as a binary variable indicating whether to commute, say
X1', in the second regression:
X1' = a + b1Z1 + b2Z2 + b3X2 + u
We wonder if this is possible or CORRECT as we try to use the same
variable (commuting) differently in the first (as binary) and second (as
No, it would not be correct to do so. If the amount of commuting time
is endogenously determined with health status -- that is, if X1 is
correlated with e -- then a binary transformation of X1 will also be
correlated with e. But why not just do
ivreg2 y x2 (x1 = z1 z2)
and be done with it? If z1 and z2 are correlated with a binary
transformation of x1, they will be correlated with the level of x1.
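Both points can be checked on simulated data. The following is only a minimal sketch in Python (standard library only, not Stata). Everything in it is an assumption made for illustration: the data-generating process, the coefficient values, the names u, latent and commutes, and the rule that commuting time is a latent index censored at zero.

```python
import random

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5

random.seed(42)
n = 5000
z1, e, x1, commutes = [], [], [], []
for _ in range(n):
    z = random.gauss(0, 1)            # instrument (hypothetical)
    u = random.gauss(0, 1)            # unobserved factor shared by x1 and e
    latent = 0.5 * z + u + random.gauss(0, 1)
    z1.append(z)
    e.append(-u + random.gauss(0, 1))         # health equation error
    x1.append(max(0.0, latent))               # commuting time, censored at 0
    commutes.append(1.0 if latent > 0 else 0.0)  # binary "commutes at all"

r_x1_e = corr(x1, e)        # endogeneity of the level
r_d_e = corr(commutes, e)   # endogeneity of the binary transformation
r_z_x1 = corr(z1, x1)       # instrument relevance for the level
r_z_d = corr(z1, commutes)  # instrument relevance for the binary version

print("corr(x1, e)       =", round(r_x1_e, 3))
print("corr(commutes, e) =", round(r_d_e, 3))
print("corr(z1, x1)      =", round(r_z_x1, 3))
print("corr(z1, commutes)=", round(r_z_d, 3))
```

Under these assumptions the binary indicator is correlated with e just as the level is, and the instrument is correlated with both versions of the variable.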
Note that, despite its name, 'two-stage least squares' is not
implemented by running two regressions. This is a FAQ on this list.
The instrumental variables estimator is defined by a set of
regressors X and a set of instruments Z. See Ch. 8 of my book or the
Baum-Schaffer-Stillman Stata Journal 2003 article (available in
working paper form from the URL below).
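In matrix terms the estimator is b = (X'P_Z X)^-1 X'P_Z y, where P_Z is the projection onto the columns of Z. As a rough illustration of why the two-regression recipe is not the same thing, here is a minimal Python sketch (standard library only, not Stata) on simulated data. The data-generating process, the coefficient values, and the helper names solve, cross, xty and ols are all assumptions made for this example. The manual two-step reproduces the point estimates, but the residual variance taken from the manual second stage is built on the fitted x1 rather than the actual x1, so standard errors based on it would be wrong.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def cross(A, B):
    """A'B for two lists of rows with the same number of rows."""
    return [[sum(a[i] * b[j] for a, b in zip(A, B))
             for j in range(len(B[0]))] for i in range(len(A[0]))]

def xty(A, y):
    """A'y for a list of rows A and a list y."""
    return [sum(a[i] * v for a, v in zip(A, y)) for i in range(len(A[0]))]

def ols(A, y):
    """OLS coefficients from the normal equations A'A b = A'y."""
    return solve(cross(A, A), xty(A, y))

random.seed(1)
n = 2000
y, x1, X, Z = [], [], [], []
for _ in range(n):
    z1, z2, x2 = (random.gauss(0, 1) for _ in range(3))
    u = random.gauss(0, 1)                      # makes x1 endogenous
    x = 1 + 0.8 * z1 + 0.8 * z2 + 0.3 * x2 + u + random.gauss(0, 1)
    y.append(2 + 1.0 * x + 0.5 * x2 - 2.0 * u + random.gauss(0, 1))
    x1.append(x)
    X.append([1.0, x, x2])          # regressors: constant, x1, x2
    Z.append([1.0, z1, z2, x2])     # instruments: constant, z1, z2, x2

# Manual "two regressions": regress x1 on Z, then y on fitted x1 and x2.
g = ols(Z, x1)
x1hat = [sum(c * v for c, v in zip(g, row)) for row in Z]
Xhat = [[1.0, h, row[2]] for h, row in zip(x1hat, X)]
b_manual = ols(Xhat, y)

# IV estimator written directly: (Xhat'X)^-1 Xhat'y, with Xhat = P_Z X.
b_iv = solve(cross(Xhat, X), xty(Xhat, y))

def resid_var(M, b):
    """Mean squared residual of y on the rows of M with coefficients b."""
    return sum((yi - sum(c * v for c, v in zip(b, row))) ** 2
               for yi, row in zip(y, M)) / n

s2_iv = resid_var(X, b_iv)            # uses actual x1
s2_naive = resid_var(Xhat, b_manual)  # uses fitted x1

print("coefficients, manual two-step:", [round(v, 4) for v in b_manual])
print("coefficients, IV formula:     ", [round(v, 4) for v in b_iv])
print("residual variance, IV vs naive:", round(s2_iv, 3), round(s2_naive, 3))
```

The design choice here is to solve small normal equations by hand so the sketch has no dependencies; in practice a single IV command such as the one above does all of this in one step.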
Kit Baum, Boston College Economics and DIW Berlin
An Introduction to Modern Econometrics Using Stata: