# RE: st: RE: RE: RE: Re: RE: RE: RE: RE: Re: several endogenous dummies

 From "Rodrigo Alfaro" To statalist@hsphsun2.harvard.edu Subject RE: st: RE: RE: RE: Re: RE: RE: RE: RE: Re: several endogenous dummies Date Wed, 06 Sep 2006 20:05:16 +0000

```Hi,

It seems to me that Marijke has a difficult problem to solve.

1. Wooldridge's procedure is explained for one endogenous
variable. I read further in the book and the results are valid in
a first-order asymptotic framework, which means that it is fine
to use it. Clearly the referee didn't pay attention to the
applicability of the procedure in contexts other than
treatment effects. It would be interesting to see the
sensitivity of your results to changing the underlying model
(logit, probit, etc.). You can then convince people that your
results are valid no matter what the distribution is.

2. Using the same idea, you can define a new variable
that takes all the possible combinations. For K=3, I am
thinking of y=1 if x1=x2=x3=0, y=2 if x1=x2=0 and
x3=1, and so on. Then y can be instrumented by
-mlogit- or -mprobit- (I don't see how you can use the
Poisson distribution). Again, this is a validity test.

3. I don't remember whether you have a cross-sectional or
panel data model. In the latter case, you could try the
Hausman-Taylor model, which seems to work with
dummy variables (at least someone has already done it).
With cross-sectional data, maybe you can run the
regression for some specific groups and try
a non-parametric test to compare the coefficients.
But I expect a very poor fit.

4. It seems unreasonable to run 100^7 regressions.
For that number you can get a finite-sample interval.

I hope this helps you
Rodrigo.

```
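Point 2 above can be sketched in Stata for K=3. This is only an illustration under stated assumptions: the names `y`, `w`, `x1`-`x3`, and `z1`-`z3` are placeholders, and summing the predicted outcome probabilities into one marginal probability per dummy (then using those as instruments in `ivreg2`) is my reading of the suggestion, not something spelled out in the thread or taken verbatim from Wooldridge.

```stata
* Sketch only. Placeholder names: y (outcome), w (exogenous regressor),
* x1 x2 x3 (endogenous 0/1 dummies), z1 z2 z3 (excluded instruments).

* Code the 2^3 = 8 combinations as 1..8:
* 1 = (0,0,0), 2 = (0,0,1), ..., 8 = (1,1,1), matching the numbering in point 2.
generate byte combo = 1 + x3 + 2*x2 + 4*x1

* Multinomial logit of the combination on the exogenous variables.
mlogit combo w z1 z2 z3, baseoutcome(1)

* One predicted probability per outcome (all 8 combinations must occur in the data).
predict double p1 p2 p3 p4 p5 p6 p7 p8, pr

* Assumption: the implied marginal probabilities act as generated instruments.
generate double px1 = p5 + p6 + p7 + p8    // Pr(x1 = 1)
generate double px2 = p3 + p4 + p7 + p8    // Pr(x2 = 1)
generate double px3 = p2 + p4 + p6 + p8    // Pr(x3 = 1)

* ivreg2 is user-written (ssc install ivreg2).
ivreg2 y w (x1 x2 x3 = px1 px2 px3), robust
```

With seven dummies the same coding would give 2^7 = 128 outcomes, so many cells are likely to be sparse or empty; the sketch is easier to defend for a small K.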
```From: "Verpoorten, Marijke" <Marijke.Verpoorten@econ.kuleuven.be>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: RE: Re: RE: RE: RE: RE: Re: several endogenous dummies
Date: Wed, 6 Sep 2006 12:22:00 +0200

Dear Mark,

I don't see a big difference between a treatment effect model
and a model with an endogenous dummy either. Maybe the only difference lies in
the fact that I have several endogenous dummies?

It is indeed computationally cumbersome to construct the AR confidence
regions. Christian Hansen has posted a Stata do-file on his website:

http://faculty.chicagogsb.edu/christian.hansen/research/index.htm

This is however for only one endogenous variable. In my case, where I
have to deal with 7 endogenous variables, constructing AR confidence
regions over a space of 100 values for each variable would entail
(100)^7 regressions and Wald tests.

However, I merely used the procedure to verify the values calculated in
the condivreg procedure.

So, suppose that in the condivreg procedure only one out of seven shocks
has a coefficient different from zero.
Then, I test:

* X: included instruments; Z: excluded instruments
generate Yprime0 = Y - (0)*shock1 - (0)*shock2 - (0)*shock3 - (0)*shock4 ///
    - (0)*shock5 - (0)*shock6 - (0)*shock7
regress Yprime0 X Z, robust
test Z

If the Wald test rejects that the coefficients on Z equal zero, then the
vector of coefficients (0,0,0,0,0,0,0) for the shock variables is
rejected.

I repeat this test using the values obtained in the condivreg procedure
(0.70,0,0,0,0,0,0).

* X: included instruments; Z: excluded instruments
generate Yprime70 = Y - (0.70)*shock1 - (0)*shock2 - (0)*shock3 - (0)*shock4 ///
    - (0)*shock5 - (0)*shock6 - (0)*shock7
regress Yprime70 X Z, robust
test Z

In this case, the Wald test does not reject that the coefficients on Z
equal zero. Therefore, I claim that the AR test does not contradict the
CLR results.

Marijke

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Schaffer,
Mark E
Sent: Tuesday 5 September 2006 17:42
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RE: Re: RE: RE: RE: RE: Re: several endogenous dummies

Marijke,

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of
> Verpoorten, Marijke
> Sent: Monday, September 04, 2006 8:08 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: Re: RE: RE: RE: RE: Re: several endogenous dummies
>
> Dear Rodrigo,
> Dear Mark,
>
> Indeed, this is how I proceeded at first, so
>
> Step 1: use a probit/poisson to calculate the predicted
> probabilities for the dummies/count variables Step 2: use
> these predicted probabilities in the first stage of the usual
> ivreg2 Step 3: second stage of the usual ivreg2 procedure
>
> BTW, the weak instrument problem in this procedure turns out
> to be as severe as for the usual ivreg2
>
> Also, I received the following comment (from a referee):
>
> > "I went to look at pages 623-25 of Wooldridge but I confess I did not
> > see how it applies to your case, which is not a treatment effect. In
> > the first part of the book, Wooldridge makes it clear that endogenous
> > variables must be instrumented in a linear way, that is, using OLS as
> > first stage regression, irrespective of whether the endogenous
> > variable is dichotomous or even Poisson. The reasoning is that the
> > purpose of instrumentation is to eliminate the possible correlation
> > between the regressors and the error term. Correlation is a linear
> > concept. Instrumenting in a non-linear way (e.g., the
> > probability from a first stage probit) introduces the
> > possibility of not completely eliminating correlation, which
> > is linear by construction. This is why the ivreg command (or
> > ivreg2) should be used for all instrumentation."

I'm not sure what the referee is getting at.  The procedure on pp.
623-25 requires that you construct instruments by using probit and
getting the predicted values.  You then use these predicted values as
instruments in ivreg (or ivreg2), which is what s/he is asking for.  I
also don't understand the point about treatment effects.  What is the
formal difference between a dummy endogenous variable (your model) and a
simple treatment effect model?

> Therefore I tried to present an alternative:
>
> 1) instrument using the usual ivreg2
> 2) instrument using the condivreg procedure, even though it's
> for one shock at a time (Mikusheva and Poi, 2006) (this was a
> suggestion of the referee, but I found it troublesome that it
> was for one shock at a time, cf. explanation of Mark in
> previous email)
> 3) To verify whether the findings of the condivreg procedure
> only stem from the fact that I instrument one shock at a
> time, I subjected the confidence intervals obtained in the
> condivreg procedure to another test that is robust to weak
> instruments, i.e. the (heteroskedasticity and autocorrelation
> consistent) Anderson and Rubin (AR) test (Anderson and Rubin,
> 1949; Chernozhukov and Hansen, 2005). I find that the results
> of the AR test do not contradict the CLR results. (To use the
> AR test to construct confidence intervals instead of merely
> verifying the ones obtained by condivreg requires a lot of
> computations since I have 7 endogenous variables)

My understanding of this procedure is that it requires you to generate
confidence regions (not intervals), i.e., the technique generates joint
regions for the 7 coefficients together.  Just out of curiosity, how
did you do this?  "A lot of computations" sounds like it could be an
understatement!

Cheers,
Mark

Prof. Mark E. Schaffer
Director
Centre for Economic Reform and Transformation
Department of Economics
School of Management & Languages
Heriot-Watt University
Edinburgh EH14 4AS  UK
44-131-451-3494 direct
44-131-451-3296 fax
http://www.sml.hw.ac.uk/cert

>
> Anderson, T. W. and H. Rubin (1949), "Estimation of the
> Parameters of a Single Equation in a Complete System of
> Stochastic Equations", Annals of Mathematical Statistics, 20: 46-63.
>
> Chernozhukov, V. and Ch. Hansen (2005), "The Reduced Form: a
> Simple Approach to Inference with Weak Instruments",
> Unpublished Manuscript.
>
> Mikusheva, A. and B. Poi. (2006). "Tests and Confidence sets
> with correct size in the simultaneous equations model with
> potentially weak instruments." Stata Journal.
>
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Rodrigo A.
> Alfaro
> Sent: Friday 1 September 2006 20:02
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Re: RE: RE: RE: RE: Re: several endogenous dummies
>
> Dear Mark,
>
> I am concerned that Marijke has endogenous DUMMY variables.
> Do you think that Procedure 18.1 (Wooldridge) could help in
> this case? I know that this is not a treatment problem, but
> it would be hard to get strong instruments in a linear framework.
>
> Rodrigo.
>
>
> ----- Original Message -----
> From: "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>
> To: <statalist@hsphsun2.harvard.edu>
> Sent: Friday, September 01, 2006 11:15 AM
> Subject: st: RE: RE: RE: RE: Re: several endogenous dummies
>
>
> Marijke,
>
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of
> > Verpoorten, Marijke
> > Sent: Friday, September 01, 2006 3:45 PM
> > To: statalist@hsphsun2.harvard.edu
> > Subject: st: RE: RE: RE: Re: several endogenous dummies
> >
> > Dear Mark,
> >
> > Thank you for pointing out the 3SLS, I wasn't aware of this
> > procedure.
> >
> > Wrt the second issue, there is a misunderstanding. I'm not
> > omitting any endogenous variable from the equation. I do
> > not estimate
> >
> > ivreg2 y x1 (x2=z1)
> > ivreg2 y x1 (x3=z1)
> >
> > but rather
> >
> > condivreg y x1 x2 (x3=z1 z2), ar lm
> > condivreg y x1 x3 (x2=z1 z3), ar lm
> >
> > with the set of instruments (z1 z2) and (z1 z3) a relevant
> > subset of the full set of instruments (z1 z2 z3). I do so,
> > because I have weak instruments and condivreg only allows for
> > instrumenting one endogenous variable.
> >
> > Are these equations also misspecified?
>
> Yes.  It all comes down to the same problem.  In the full
> specification,
>
> ivreg2 y x1 (x2 x3=z1 z2)
>
> but you have a weak instrument problem.  You are suggesting that you
> deal with this by reducing the number of endogenous variables.  You can
> try
>
> ivreg2 y x1 (x3=z1 z2)    [my suggestion]
>
> or
>
> ivreg2 y x1 x2 (x3=z1 z2)   [your suggestion]
>
> but neither is well specified.  In my case, you have an endogeneity
> problem via omitted variable bias; in yours, via the
> endogeneity of x2.
> There's no direct escape, I'm afraid.
>
> Cheers,
> Mark
>

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
```
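The AR-style check Marijke describes (subtract the hypothesised coefficients times the shocks from the outcome, regress on all instruments, test the excluded ones) can be put in a loop over one coefficient. This is a rough sketch, assuming placeholder names `y`, `x`, `z1`-`z3`, and `shock1`, a 5% level, and a grid from 0 to 1. The six other shock coefficients are held at zero, so the output is a slice through the joint region, not a confidence interval for the first coefficient.

```stata
* Sketch only. Placeholder names: y (outcome), x (included exogenous regressors),
* z1 z2 z3 (excluded instruments), shock1 (the endogenous variable being varied).
* The coefficients on the other six shocks are held at zero throughout.
tempvar ytilde
generate double `ytilde' = .

forvalues i = 0/100 {
    local b1 = `i'/100
    * Outcome net of the hypothesised effect of shock1
    quietly replace `ytilde' = y - `b1'*shock1
    quietly regress `ytilde' x z1 z2 z3, robust
    * Joint Wald test that the excluded instruments have zero coefficients
    quietly test z1 z2 z3
    if r(p) >= 0.05 {
        display "beta1 = " %4.2f `b1' "  not rejected at 5% (p = " %6.4f r(p) ")"
    }
}
```

This loop runs 101 regressions. A full grid of 100 points on each of the seven coefficients would need 100^7 = 10^14 of them, which is why the thread treats the joint region as impractical to compute by brute force.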