Re: st: IV Regression with many zeros in 1st stage

Michael,

> Dear Statalist colleagues,
>
> I would like to estimate the following regression:
>
> BD=aX+bS+e
>
> where BD=birth defect and S is a measure of smoking (either 0/1 or number
> of cigarettes).  I would like to instrument S with cigarette prices and
> other variables (local smoking policy variables, e.g.).  The problem, I
> think, is that > 50% of respondents didn't smoke.  Does this affect the
> appropriateness of the IV approach?  I.e., can my 1st stage regression:
>
> S = cZ+dP+e
>
> be estimated if >50% of S is zero?

IV is a one-step estimator, and it doesn't matter if the "first-stage
regression" is correctly specified or not because it's not being
estimated as a structural equation.  All you need is for S to be
correlated with the excluded instruments.  You can check this with
the 1st-stage F statistic for the joint significance of the excluded
instruments.
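
As a rough sketch of that check, assuming X, Z and P each stand for a
single variable already in memory (in practice each may be a list of
variables), the joint test can be run by hand:

```stata
* Regress the endogenous regressor on ALL exogenous variables:
* the included one (X) and the excluded instruments (Z, P).
regress S X Z P

* Joint F test that the coefficients on the excluded instruments are zero.
* A small F here points to weak instruments.
test Z P
```

The zeros in S do not stop this regression from running; it is used only
to gauge whether the instruments are correlated with S.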

- If you want to do IV as 2SLS and estimate the first stage by hand,
you have to include *all* the exogenous variables.  That means in
your case you would include X along with Z and P.

- ...but there's no need to do it in two stages by hand; just use
ivreg (or ivreg2, the extended IV command written by Kit Baum, Steve Stillman
and myself, which will provide the 1st-stage F-stat automatically if
you want it).

- One issue that you haven't raised is whether your birth defect
equation should be linear.  If BD is dichotomous, you might want to
use Joe Harkness' ivprob (probit with endogenous regressors) instead.
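
To make the first two points concrete, here is a hedged sketch that uses
the names from your equations as placeholders (X, Z and P are assumed to
be single variables; substitute your own variable lists, and S_hat is
just an illustrative name for the fitted values):

```stata
* One-step IV; the "first" option also displays the first-stage regression.
ivreg BD X (S = Z P), first

* Same model with ivreg2 (user-written, so it has to be installed first,
* e.g. with "ssc install ivreg2").
ivreg2 BD X (S = Z P), first

* By-hand 2SLS, for comparison only.  The first stage includes X as well
* as Z and P.  The coefficients match ivreg, but the second-stage standard
* errors do not, because they are computed from the wrong residuals.
regress S X Z P
predict double S_hat, xb
regress BD X S_hat
```

The by-hand version is shown only to illustrate why all the exogenous
variables belong in the first stage; for reported results the one-step
command is the safer choice.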

Hope this helps.

--Mark

>
> Any help would be greatly appreciated.
>
> Thanks,
> Michael
>
