Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: rare event problem in first stage of IV2SLS
From: Erkan Duman <[email protected]>
To: [email protected]
Subject: st: rare event problem in first stage of IV2SLS
Date: Wed, 22 Jan 2014 18:15:04 +0200

I have a binary choice model with a binary endogenous regressor. I am
investigating the impact of migration experience on the school
attendance of children in migrant households. I decided to use IV2SLS
and bivariate probit, where the instrument is the historical
migration network at the state level, which is supposed to explain
why one household engages in migration and another, similar
household does not. I could not use bivariate probit because in every
one of my specifications the assumption of bivariate normal errors is
violated. IV2SLS does not need such an assumption; however, the
predicted school attendance rates fall outside the [0,1] range, and the
estimated migration coefficient is also out of range (outside [-1,1]),
at around 4. I checked for multicollinearity and tried to control for
as many variables as possible that may threaten the instrument's
exogeneity. Neither helped: the estimated migration coefficient is
still around 3. Below you can find the two stages:

School attendance(i) = a + b*migration_hat(i) + error(i)          (second stage)
migration(i) = c + d*historical migration rate(ij) + error2(i)    (first stage)

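
To make the two stages concrete, here is a minimal sketch in Python on simulated data. It is not the Stata code behind the estimates reported in this post. The data-generating process, the helper names (ols, two_stage_least_squares, simulate) and all numeric settings are illustrative assumptions, chosen only so that the simulated treatment is rare, as in the data described here.

```python
import random


def ols(x, y):
    """Return (intercept, slope) of a simple least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(y, x, z):
    """Linear IV with one instrument z for one endogenous regressor x."""
    # First stage: migration(i) = c + d*z(i) + error2(i)
    c, d = ols(z, x)
    x_hat = [c + d * zi for zi in z]
    # Second stage: attendance(i) = a + b*migration_hat(i) + error(i)
    a, b = ols(x_hat, y)
    return {"a": a, "b": b, "c": c, "d": d}


def clip01(p):
    """Keep a simulated probability inside [0, 1]."""
    return min(max(p, 0.0), 1.0)


def simulate(n=20000, seed=0):
    """Simulate (attendance, migration, instrument); all values are hypothetical."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.uniform(0.0, 0.05)   # assumed historical migration rate
        u = rng.gauss(0.0, 1.0)       # unobserved confounder
        # Rare treatment: probability of roughly 1.5% on average.
        xi = 1 if rng.random() < clip01(0.005 + 0.4 * zi + 0.005 * u) else 0
        # Binary outcome that depends on the treatment and the confounder.
        yi = 1 if rng.random() < clip01(0.7 + 0.1 * xi + 0.05 * u) else 0
        y.append(yi)
        x.append(xi)
        z.append(zi)
    return y, x, z


if __name__ == "__main__":
    y, x, z = simulate()
    fit = two_stage_least_squares(y, x, z)
    print("share treated:", sum(x) / len(x))
    print("first-stage slope d:", fit["d"])
    print("second-stage slope b:", fit["b"])
```

With a single instrument, the second-stage slope b equals the ratio of the reduced-form slope (attendance on the instrument) to the first-stage slope d, so a small d inflates b, and nothing in the linear setup restricts b to [-1, 1].
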
Chiburis et al. (2011) argue that when the treatment probability (in
our case the share of remittance-receiving households) is low, where
low means below 0.1, linear IV estimation becomes very
uninformative. When I searched for this problem, I came across King
and Zeng (2001), which provides a way to correct for rare events. King
and Zeng (2001) deal with a logit regression where the dependent
variable is a rare event. In my case the first stage is a logit
regression where the dependent variable is a rare event, and I believe
that this rare event problem in the first stage is what pushes the
estimated coefficients out of the [-1, 1] range. The share of
remittance-receiving households in my data is 1.55%, which fits the
rare event definition of King and Zeng (2001): 1529 remittance-receiving
households and 97038 non-receiving households (1529 1s and 97038 0s). I
thought of using the King and Zeng (2001) correction method (relogit in
Stata) in the first-stage regression and plugging the predicted values
from the first stage into the second stage; however, in that case the
standard errors need to be corrected. Moreover, I am not sure whether
this way of handling the problem is correct, and I do not know how to
correct the standard errors. I could not find any relevant material
that deals with rare events in an instrumental variable estimation
setting. Can you please help me solve the rare event problem in the
first stage of an instrumental variable estimation strategy?
--
Erkan Duman
Graduate student - PhD
Faculty of Art and Social Sciences
Sabancı University