Re: st: Sample selection correction models with fixed-effects

From   Ivy Kodzi <[email protected]>
To   [email protected]
Subject   Re: st: Sample selection correction models with fixed-effects
Date   Mon, 31 Jan 2011 12:49:35 -0500

Thank you, Austin, for the feedback. So far, I have limited the
analysis to cases with birthweight data. I have also tested
interaction effects (parity and "income") and have found that the
parity effects do not depend on income levels. So, if I understand
you correctly, it sounds like there might be little additional benefit
in doing further "correction" for the missingness problem using a
parametric method like the Heckman model.

Thank you also for the insights about why it would be wrong to include
dummies for mothers. I totally get you.

On Mon, Jan 31, 2011 at 12:10 PM, Austin Nichols
<[email protected]> wrote:
> Ivy Kodzi <[email protected]> :
> You have missing data, not unobservable data, so you should not jump
> to a parametric model like -heckman- where you have to make strong
> assumptions about unobservables; instead jump to -mi- to impute
> regressors.  But you have missing data on birthweights, right?  And
> that is your outcome (AKA depvar)?  Your first step is simply to limit
> your sample to cases without missing data.   Perhaps you will later
> want to limit your estimation sample to types of cases with lower
> rates of missingness, for example by running separate logits of low
> birthweight on parity dummies within narrow bands of reported income
> and discarding results with small numbers of observations (effectively
> allowing a nonparametric dependence of low birthweight incidence on
> parity and income).
> You have repeated observations on mothers? Running logits, the
> conditional fixed effect model will run very slowly with that many
> obs, but will in the end give you marginal effects very similar to a
> fixed-effect linear regression (linear probability model) most of the
> time, so you might start with -xtreg, fe- or -areg- instead (or
> -xtivreg2- from SSC).  The reason you can't include dummies and run a
> regular logit is not that you have 100K mothers; it is that you have a
> few obs per mother--the approach of including dummies and running a
> regular logit works when you have enough obs to run a separate logit
> in each group, so your asymptotics work out OK (i.e. you are assuming
> the N within each group of obs is close to infinity, not the overall
> N, so your individual dummies are estimable).
> On Mon, Jan 31, 2011 at 10:18 AM, Ivy Kodzi <[email protected]> wrote:
>> Hi Statalisters,
>> My name is Ivy Kodzi. I am new to the list and find it very helpful!
>> I am doing some analysis where I want to estimate the effect of parity
>> (how many births a woman has had) on the chance of having a child
>> with low birth weight.  I also want to adjust for all constant
>> observed and unobserved factors at the mother level, e.g. some aspects
>> of her health/genetics, household environment, community etc., so I
>> am running a fixed-effects regression model with a dummy for whether
>> or not a child was born with low birth weight as the outcome variable.
>> In my data file, I have child characteristics nested under "mother".
>> The birth weight variable, however, (from DHS data-sets) has a serious
>> sample selection problem – the data predominantly come from more
>> educated, richer, urban women who had their babies in hospitals. In my
>> child level file, about two-thirds of children are missing birth
>> weights, largely because their (less affluent) mothers gave birth
>> at home, hence no birth weight was recorded. I could adjust for the
>> selection problem with a Heckman model, e.g. a bivariate probit
>> model with selection, but fundamentally I want to conduct a
>> within-mother analysis, so I want to include fixed effects for the
>> mother. The commands heckprob and heckprob2 in Stata allow for
>> adjusting for the clustering of children within mother, but not the
>> inclusion of fixed effects.  I have nearly 100 thousand mothers, so
>> including dummies for mother-specific fixed effects will not be
>> efficient, and may be incorrect.
>> I am also not very sure if I should approach the problem as a Heckman
>> type sample selection problem or as a non-ignorable missing data
>> problem (even with so much missingness?).  I think the problem is more
>> of the former, which is why I am considering a Heckman correction
>> model. Can anyone recommend how to do the estimation, especially in
>> Stata?
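
The estimators discussed in this thread could be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: the variable names (lowbw, parity, motherid, incband) are hypothetical and do not come from the original messages, incband is assumed to be a numeric income-band code, and clustering the standard errors by mother is a choice made here, not something the thread prescribes.

```stata
* Hypothetical variables (not from the thread):
*   lowbw    0/1 indicator for low birth weight
*   parity   birth order of the child
*   motherid mother identifier
*   incband  numeric code for a narrow band of reported income

* Limit the sample to births with a recorded outcome
keep if !missing(lowbw)

* Fixed-effects linear probability model (within-mother estimator)
xtset motherid
xtreg lowbw i.parity, fe vce(cluster motherid)

* Same coefficients from -areg-, absorbing the mother dummies
areg lowbw i.parity, absorb(motherid) vce(cluster motherid)

* Conditional (fixed-effects) logit for comparison; can be slow on large data
clogit lowbw i.parity, group(motherid)

* Separate logits of low birthweight on parity dummies within income bands
levelsof incband, local(bands)
foreach b of local bands {
    display as text "Income band `b'"
    * capture keeps the loop running if a band has too few observations
    capture noisily logit lowbw i.parity if incband == `b', vce(cluster motherid)
}
```

Mothers with only one observed birth, or with no variation in lowbw across births, contribute nothing to the conditional logit, so its estimation sample will typically be smaller than that of the linear fixed-effects models.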