Dear Rodrigo,
Dear Mark,
Indeed, this is how I proceeded at first:
Step 1: use a probit/Poisson model to calculate the predicted probabilities
(for the dummies) or predicted counts (for the count variables)
Step 2: use these predicted values in the first stage of the
usual ivreg2
Step 3: second stage of the usual ivreg2 procedure
BTW, the weak instrument problem in this procedure turns out to be as
severe as in the usual ivreg2.
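A minimal Stata sketch of the three steps above, for a single dummy on simulated data. All variable names (y, x1, d, z1, z2, phat) and the data-generating process are made up for illustration, and ivreg2 is user-written, so it is assumed to be installed (ssc install ivreg2). The fitted probability enters as the excluded instrument, which is one reading of Step 2 and of Wooldridge's Procedure 18.1, not necessarily the exact code used here.

```stata
* Simulated data; every name and coefficient below is a placeholder
clear
set seed 12345
set obs 2000
generate z1 = rnormal()
generate z2 = rnormal()
generate x1 = rnormal()
* u is the structural error; d depends on u, so d is endogenous
generate u  = rnormal()
generate d  = (0.8*z1 + 0.8*z2 + 0.5*u + rnormal() > 0)
generate y  = 1 + x1 + d + u

* Step 1: probit of the dummy on the exogenous regressor and the instruments
probit d x1 z1 z2
predict double phat, pr

* Steps 2-3: linear IV with the fitted probability as the excluded instrument
ivreg2 y x1 (d = phat), first
```

For a count variable, the analogous first step would presumably be poisson followed by predict with the n option, with the predicted count used in place of phat.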
Also, I received the following comment (from a referee):
"I went to look at pages 623-25 of Wooldridge but I confess I did not
see how it applies to your case, which is not a treatment effect. In
the first part of the book, Wooldridge makes it clear that endogenous
variables must be instrumented in a linear way, that is, using OLS as
first stage regression, irrespective of whether the endogenous
variable is dichotomous or even Poisson. The reasoning is that the
purpose of instrumentation is to eliminate the possible correlation
between the regressors and the error term. Correlation is a linear
concept. Instrumenting in a non-linear way (e.g., the probability from a
first stage probit) introduces the possibility of not completely
eliminating correlation, which is linear by construction. This is why
the ivreg command (or ivreg2) should be used for all instrumentation."
Therefore I tried to present an alternative:
1) instrument using the usual ivreg2
2) instrument using the condivreg procedure, even though it handles only one
shock at a time (Mikusheva and Poi, 2006) (this was a suggestion of the
referee, but I found it troublesome that it handles only one shock at a time,
cf. Mark's explanation in a previous email)
3) To verify whether the findings of the condivreg procedure only stem
from the fact that I instrument one shock at a time, I subjected the
confidence intervals obtained in the condivreg procedure to another test
that is robust to weak instruments, i.e. the (heteroskedasticity and
autocorrelation consistent) Anderson and Rubin (AR) test (Anderson and
Rubin, 1949; Chernozhukov and Hansen, 2005). I find that the results of
the AR test do not contradict the CLR results. (Using the AR test to
construct confidence intervals, instead of merely verifying the ones
obtained by condivreg, would require a lot of computation, since I have 7
endogenous variables.)
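As a rough illustration of what checking a confidence interval with the AR test involves, the sketch below uses the reduced-form version of the test for a single endogenous regressor: for each hypothesized value b0, regress y - b0*x2 on the exogenous regressor and the instruments, and test whether the instruments are jointly zero. The data are simulated, all names and the grid are hypothetical, and only heteroskedasticity-robust (not HAC) standard errors are used.

```stata
* Simulated data; names, coefficients and grid are placeholders
clear
set seed 1949
set obs 500
generate z1 = rnormal()
generate z2 = rnormal()
generate x1 = rnormal()
generate u  = rnormal()
* x2 is endogenous (depends on u) with fairly weak instruments
generate x2 = 0.3*z1 + 0.3*z2 + 0.8*u + rnormal()
generate y  = 1 + x1 + x2 + u

* Grid of hypothesized values b0 for the coefficient on x2 (-1 to 3)
forvalues i = 0/40 {
    local b0 = -1 + `i'*0.1
    tempvar ytil
    quietly generate double `ytil' = y - `b0'*x2
    * Under H0: beta = b0, the instruments should not explain ytil
    quietly regress `ytil' x1 z1 z2, vce(robust)
    quietly test z1 z2
    if r(p) >= 0.05 {
        display "b0 = " %5.2f `b0' "  not rejected, p = " %5.3f r(p)
    }
    drop `ytil'
}
```

The values of b0 that are not rejected form an approximate 95% AR confidence set. With several endogenous regressors the grid becomes multi-dimensional, which is the computational burden mentioned above.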
Anderson, T. W. and H. Rubin (1949), "Estimation of the Parameters of a
Single Equation in a Complete System of Stochastic Equations", Annals of
Mathematical Statistics, 20: 46-63.
Chernozhukov, V. and Ch. Hansen (2005), "The Reduced Form: a Simple
Approach to Inference with Weak Instruments", Unpublished Manuscript.
Mikusheva, A. and B. Poi. (2006). "Tests and Confidence sets with
correct size in the simultaneous equations model with potentially weak
instruments." Stata Journal.
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Rodrigo A.
Alfaro
Sent: vrijdag 1 september 2006 20:02
To: [email protected]
Subject: st: Re: RE: RE: RE: RE: Re: several endogenous dummies
Dear Mark,
I am concerned that Marijke has endogenous DUMMY
variables. Do you think that Procedure 18.1 (Wooldridge)
could help in this case? I know that this is not a treatment
problem, but it would be hard to get strong instruments
in a linear framework.
Rodrigo.
----- Original Message -----
From: "Schaffer, Mark E" <[email protected]>
To: <[email protected]>
Sent: Friday, September 01, 2006 11:15 AM
Subject: st: RE: RE: RE: RE: Re: several endogenous dummies
Marijke,
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Verpoorten, Marijke
> Sent: Friday, September 01, 2006 3:45 PM
> To: [email protected]
> Subject: st: RE: RE: RE: Re: several endogenous dummies
>
> Dear Mark,
>
> Thank you for pointing out the 3SLS, I wasn't aware of this procedure.
>
> Wrt the second issue, there is a misunderstanding. I'm not
> omitting any endogenous variable from the equation. I do not estimate
>
> ivreg2 y x1 (x2=z1)
> ivreg2 y x1 (x3=z1)
>
> Instead, I estimate
>
> condivreg y x1 x2 (x3=z1 z2), ar lm
> condivreg y x1 x3 (x2=z1 z3), ar lm
>
> with the set of instruments (z1 z2) and (z1 z3) a relevant
> subset of the full set of instruments (z1 z2 z3). I do so,
> because I have weak instruments and condivreg only allows for
> instrumenting one endogenous variable.
>
> Are these equations also misspecified?
Yes. It all comes down to the same problem. The full specification is

ivreg2 y x1 (x2 x3=z1 z2)

but you have a weak instrument problem. You are suggesting that you
deal with this by reducing the number of endogenous variables. You can
try

ivreg2 y x1 (x3=z1 z2) [my suggestion]

or

ivreg2 y x1 x2 (x3=z1 z2) [your suggestion]

but neither is well specified. In my case, you have an endogeneity
problem via omitted variable bias; in yours, via the endogeneity of x2.
There's no direct escape, I'm afraid.
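A small simulation can make the point concrete. The sketch below is only an illustration under assumed values: the data-generating process, coefficients and sample size are invented, the instruments are deliberately strong so that the full specification works, and ivreg2 is assumed to be installed (ssc install ivreg2). All true slopes are set to 1, so the three sets of estimates can be compared against that value.

```stata
* Simulated data; every name and coefficient below is a placeholder
clear
set seed 2006
set obs 5000
generate z1 = rnormal()
generate z2 = rnormal()
generate x1 = rnormal()
* u is the structural error; x2 and x3 both load on it, so both are endogenous
generate u  = rnormal()
generate x2 = z1 + 0.5*z2 + 0.8*u + rnormal()
generate x3 = 0.5*z1 + z2 + 0.8*u + rnormal()
generate y  = 1 + x1 + x2 + x3 + u

* Full specification: both endogenous regressors instrumented
ivreg2 y x1 (x2 x3 = z1 z2)

* x2 omitted: z1 and z2 are correlated with the error term through x2
ivreg2 y x1 (x3 = z1 z2)

* x2 treated as exogenous: x2 itself is correlated with u
ivreg2 y x1 x2 (x3 = z1 z2)
```

In this setup only the first regression should recover slopes close to 1; the other two illustrate the omitted-variable and endogeneity problems respectively.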
Cheers,
mark
Prof. Mark E. Schaffer
Director
Centre for Economic Reform and Transformation
Department of Economics
School of Management & Languages
Heriot-Watt University
Edinburgh EH14 4AS UK
44-131-451-3494 direct
44-131-451-3296 fax
http://www.sml.hw.ac.uk/cert
>
> Marijke
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Schaffer, Mark E
> Sent: vrijdag 1 september 2006 16:30
> To: [email protected]
> Subject: st: RE: RE: Re: several endogenous dummies
>
> Marijke,
>
> Two reactions to your post:
>
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf Of
> > Verpoorten, Marijke
> > Sent: Friday, September 01, 2006 1:28 PM
> > To: [email protected]
> > Subject: st: RE: Re: several endogenous dummies
> >
> > Hi Rodrigo,
> >
> > Thanks a lot for your answer. I'm sorry for my late reply; I was
> > traveling.
> >
> > To answer your first three questions: (1) I have a set of 17
> > instruments, among which are several cross-products and squares of the
> > exogenous RHS variables, (2) I use the same set of instruments for
> > each of the variables, though for some variables some instruments are
> > not relevant. I did not find a way to use different subsets of
> > instruments in the ivreg2 procedure,
>
> This is a misunderstanding about single-equation IV that
> comes up from time to time on Statalist. In single-equation
> IV, there is no way to limit sets of instruments to apply to
> particular sets of endogenous regressors. This is basically
> by definition - you can do this, but then you are in the land
> of system estimation, 3SLS, FIML and the like. For example, in
>
> ivreg2 y x1 (x2 x3 = z1 z2 z3 z4)
>
> you might think that z1 and z2 instrument for x2, and z3 and
> z4 instrument for x3. But that is another way of saying that
> you want to specify 3 equations - y, x2, and x3 - and get
> efficiency gains from system estimation. No problem - use
> reg3 or whatever - but then it's not single-equation IV.
>
> > (3) I need to instrument five dummies and two count variables. These
> > variables give information on whether or not a household was hit by a
> > particular war-related shock, such as the death/illness of a household
> > member, imprisonment of a member, months taken refuge abroad etc. I
> > want to analyze which type of shock has a long term effect on the
> > household's welfare.
> >
> > Household welfare in 2002 = f(household welfare in 1990, household
> > characteristics in 1990, shocks occurring between 1990-2002)
> >
> > When I use the usual ivreg2 procedure to solve for the possible
> > endogeneity of the war-related shocks, I face the weak instrument
> > problem. Therefore I also use the condivreg procedure, for each shock
> > separately, while using the most relevant set of instruments for each
> > shock (those significant at 10% in the first stage of ivreg2).
> > However, I don't know whether it makes sense to instrument for each
> > shock separately.
>
> This is probably not legitimate, at least as you describe it.
> The problem is that you can't identify an equation in this
> kind of piece-by-piece manner. It's like the following
> example. You want to estimate
>
> ivreg2 y x1 (x2 x3=z1)
>
> but it's not identified because you don't have enough
> excluded instruments. You can't solve the problem by
> estimating the following two equations:
>
> ivreg2 y x1 (x2=z1)
>
> ivreg2 y x1 (x3=z1)
>
> The two equations are identified but misspecified, because in
> each case z1 will be correlated with the error term via the omitted
> endogenous variable. You will have the same problem if you
> instrument for each of your endogenous regressors separately.
>
> HTH.
>
> --Mark
>
> Prof. Mark E. Schaffer
> Director
> Centre for Economic Reform and Transformation Department of
> Economics School of Management & Languages Heriot-Watt
> University Edinburgh EH14 4AS UK
> 44-131-451-3494 direct
> 44-131-451-3296 fax
> http://www.sml.hw.ac.uk/cert
>
>
> > I'm not sure I understand your suggestion about using the mlogit or
> > mprobit procedure. Is this to be used in the first stage? Is it
> > possible when the dummies may overlap, i.e. a household may face
> > several shocks.
> > How may the first stage information be used in the second stage? As
> > predicted probabilities?
> >
> > I also have to admit that I don't know what you mean by a full
> > characterization of the problem using ml. Could you put me on the
> > right track with a reference to a stata code, a textbook or an
> > article?
> >
> > Thank you very much,
> >
> > Marijke
> >
> >
> >
> >
> >
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf Of
> > Rodrigo A. Alfaro
> > Sent: zaterdag 26 augustus 2006 6:22
> > To: [email protected]
> > Subject: st: Re: several endogenous dummies
> >
> > Marijke,
> >
> > I don't know the answer to your question but I can give you some
> > questions that you can explore. Note that the reference that you wrote
> > describes 1 dummy variable, for which it sounds reasonable to use that
> > procedure instead of linear IV. Moreover, Wooldridge said that the
> > estimation of the parameters and the specification of the model in the
> > first stage do not affect the standard errors of 2SLS. Great!!!
> >
> > How many instruments are you going to use for these dummies?
> > Same set for each one? What number does "several" mean? Why not combine
> > the choices into a multinomial problem (solving by mlogit or mprobit)?
> > Once you feel comfortable with your entire model (the equations for the
> > dummies plus your 2SLS one), I think that the non-effect on the
> > standard errors is no longer valid when you are trying to solve for
> > several endogenous dummies.
> >
> > Maybe a full characterization of the problem is the way to go. You can
> > describe the whole process (endogenous dummies plus your continuous
> > variable) in a maximum likelihood framework. You will pay with additional
> > assumptions on the model, but the reward will be a complete system
> > with "no-better" standard errors.
> >
> > Rodrigo.
> >
> >
> >
> > ----- Original Message -----
> > From: "Verpoorten, Marijke" <[email protected]>
> > To: <[email protected]>;
> > <[email protected]>; <[email protected]>
> > Sent: Friday, August 25, 2006 3:38 PM
> > Subject: st: several endogenous dummies
> >
> >
> > Dear Statalisters,
> >
> > I wonder whether, when having a continuous variable as a dependent
> > variable and several endogenous dummies, it's better to use the usual
> > 2SLS (ivreg2), instead of instrumenting the dummies non-linearly (as
> > in Wooldridge, 2002, p. 623-625). Could you help me with this question?
> >
> > Kind regards,
> > Marijke
> >
> > Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
> >
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/support/faqs/res/findit.html
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/