Re: st: Sample selection correction models with fixed-effects
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: Sample selection correction models with fixed-effects
Date: Mon, 31 Jan 2011 12:10:26 -0500
Ivy Kodzi <[email protected]> :
You have missing data, not unobservable data, so you should not jump
to a parametric model like -heckman-, where you have to make strong
assumptions about unobservables. If the missing values were in
regressors, you could jump to -mi- to impute them. But you have missing
data on birthweights, right? And that is your outcome (AKA depvar)? If
so, your first step is simply to limit your sample to cases without
missing data. Perhaps you will later
want to limit your estimation sample to types of cases with lower
rates of missingness, for example by running separate logits of low
birthweight on parity dummies within narrow bands of reported income
and discarding results with small numbers of observations (effectively
allowing a nonparametric dependence of low birthweight incidence on
parity and income).
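
A minimal sketch of those two steps, assuming hypothetical variable names (bweight for recorded birth weight, lbw for the low-birthweight dummy, parity, income) and arbitrary choices for the number of income bands and the minimum cell size:

```stata
* Hypothetical variable names: bweight, lbw, parity, income
* Step 1: keep only births with a recorded birth weight
keep if !missing(bweight)

* Step 2: cut reported income into narrow bands
* (10 quantile bands is an arbitrary choice for illustration)
xtile incband = income, nq(10)

* Separate logit of low birth weight on parity dummies within each band,
* skipping bands with few observations (the cutoff of 500 is arbitrary)
levelsof incband, local(bands)
foreach b of local bands {
    quietly count if incband == `b'
    if r(N) >= 500 {
        logit lbw i.parity if incband == `b'
    }
}
```

Comparing the parity coefficients across bands gives a rough picture of how the parity effect varies with income without imposing a functional form on that dependence.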
You have repeated observations on mothers? If you run logits, the
conditional fixed-effects model will run very slowly with that many
obs, but will in the end give you marginal effects very similar to a
fixed-effects linear regression (linear probability model) most of the
time, so you might start with -xtreg, fe- or -areg- instead (or
-xtivreg2- from SSC). The reason you can't include dummies and run a
regular logit is not that you have 100K mothers; it is that you have
only a few obs per mother. The approach of including dummies and running
a regular logit works when you have enough obs to run a separate logit
in each group, so your asymptotics work out OK (i.e. you are assuming
that the N within each group of obs goes to infinity, not just the
overall N, so that your individual dummies are consistently estimable).
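
As a rough sketch of the commands mentioned above, assuming a hypothetical mother identifier motherid and outcome dummy lbw (neither name comes from the original post):

```stata
* Hypothetical variable names: motherid, lbw, parity
* Linear probability model with mother fixed effects
xtset motherid
xtreg lbw i.parity, fe vce(cluster motherid)

* The same within estimator via -areg-, absorbing the mother dummies
areg lbw i.parity, absorb(motherid) vce(cluster motherid)

* Conditional fixed-effects logit; expect this to be much slower,
* and note that mothers with no variation in lbw drop out
xtlogit lbw i.parity, fe
```

The -xtreg- and -areg- coefficients are differences in probability, while the -xtlogit- coefficients are on the log-odds scale, so the two sets of estimates are not directly comparable without converting to marginal effects.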
On Mon, Jan 31, 2011 at 10:18 AM, Ivy Kodzi <[email protected]> wrote:
> Hi Statalisters,
>
> My name is Ivy Kodzi. I am new to the list and find it very helpful!
>
> I am doing some analysis where I want to estimate the effect of parity
> (how many births a woman has had) on the chance of having a child
> with low birth weight. I also want to adjust for all constant
> observed and unobserved factors at the mother level e.g. some aspects
> of her health/genetics, household environment, community etc., so I
> am running a fixed-effects regression model with a dummy for whether
> or not a child was born with low birth weight as the outcome variable.
> In my data file, I have child characteristics nested under "mother".
> The birth weight variable, however, (from DHS data-sets) has a serious
> sample selection problem – the data predominantly come from more
> educated, richer, urban women who had their babies in hospitals. In my
> child level file, about two-thirds of children are missing birth
> weights largely because their mothers (less affluent) had the birth
> at home - hence no birth weight recorded. I could adjust for the
> selection problem with a Heckman model, e.g. a bivariate probit
> model with selection, but fundamentally I want to conduct a
> within-mother analysis, so I want to include fixed effects for the
> mother. The commands heckprob and heckprob2 in Stata allow for
> adjusting for the clustering of children within mother, but not the
> inclusion of fixed effects. I have nearly 100 thousand mothers, so
> including dummies for mother-specific fixed effects will not be
> efficient, and may be incorrect.
>
> I am also not very sure if I should approach the problem as a Heckman
> type sample selection problem or as a non-ignorable missing data
> problem (even with so much missingness?). I think the problem is more
> of the former, that is why I am considering a Heckman correction
> model. Can anyone recommend how to do the estimation, especially in
> Stata?