No, you cannot take fitted values or residuals from a first stage whose
dependent variable (binary X1') differs from the endogenous regressor
(continuous X1) that appears in the second stage, as you propose.  You
can, however, simply model the effect of "any_commute" (binary X1' in
your notation) on Y directly; see
http://www.stata.com/statalist/archive/2007-05/msg00338.html

though you undoubtedly still have a selection problem--commuting and
wages can be observed only for the employed.
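
For what it's worth, here is a minimal Python sketch of that suggestion on
simulated data: the binary indicator is treated as the endogenous regressor
itself and instrumented with Z1 and Z2 through a linear first stage. The
data-generating process, the coefficients, and the helper name two_sls are
all made up for illustration; they are not your data or a Stata routine,
and the sketch does nothing about the selection problem.

```python
import numpy as np

def two_sls(y, endog, exog, instruments):
    """2SLS point estimates for y on [const, endog, exog].

    Hypothetical helper for illustration only; returns coefficients in the
    order (constant, endog, exog) and no standard errors.
    """
    const = np.ones(len(y))
    X = np.column_stack([const, endog, exog])        # second-stage regressors
    Z = np.column_stack([const, instruments, exog])  # instruments + exogenous vars
    # First stage: linear projection of every column of X on Z.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress y on the projected regressors.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Simulated data (all numbers below are assumptions, not estimates).
rng = np.random.default_rng(0)
n = 20000
z1, z2, x2 = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
u = rng.normal(size=n)                 # unobservable driving the commute decision
e = 0.8 * u + rng.normal(size=n)       # outcome error, correlated with u
any_commute = (0.8 * z1 + 0.6 * z2 + 0.3 * x2 + u > 0).astype(float)
y = 1.0 + 2.0 * any_commute + 0.5 * x2 + e   # assumed true effect: 2.0

beta_iv = two_sls(y, any_commute, x2, np.column_stack([z1, z2]))
X_ols = np.column_stack([np.ones(n), any_commute, x2])
beta_ols = np.linalg.lstsq(X_ols, y, rcond=None)[0]

print("OLS coefficient on any_commute: ", beta_ols[1])
print("2SLS coefficient on any_commute:", beta_iv[1])
```

In this setup the OLS coefficient should drift away from the assumed effect
because any_commute is correlated with the outcome error, while the 2SLS
coefficient should land near it. The same binary variable is used in both
stages, which is the point.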

On 5/21/07, Zhiqiang Feng <zf2@st-andrews.ac.uk> wrote:
```Hi, everyone,

We would like to run a two-stage least squares (2SLS) regression. We have looked
at some information in the forum. For example, the email communications
on "Re: st: 2SLS with Probit in the first-stage regression" in 2004.
However, our questions are somewhat different.

We are interested in the impact of commuting on health.

So the regression is:

Y = a + b1X1 + b2X2 + e

Here Y is the health status of subjects, a continuous variable.
X1 is commuting time, a continuous and endogenous variable.

We would like to instrument X1 with Z1 and Z2, so the second regression (the first stage) is:

X1 = a + b1Z1 + b2Z2 + b3X2 + e

However, we think Z1, Z2 and X2 are associated with the decision on
whether subjects commute at all rather than with commuting time, so we
would prefer to use a binary variable indicating whether the subject
commutes, say X1', as the dependent variable in that regression:

X1' = a + b1Z1 + b2Z2 + b3X2 + u

We wonder whether this is possible or CORRECT, given that we would be
using the same variable (commuting) differently in the first (as binary)
and second (as continuous) regressions.

Any help would be appreciated.
