Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

random effects logit model [was: st: From: Emma Gorman ...]

From   Nick Cox <>
Subject   random effects logit model [was: st: From: Emma Gorman ...]
Date   Wed, 24 Aug 2011 07:27:21 +0100

Please use meaningful titles for your postings and full references.
Both are recommended in the Statalist FAQ.

-xt- is set up to be general enough to apply to all sorts of problems,
including but not only those with longitudinal data. -xt- is in fact
quite often applied to problems without a time variable. It is a
user's decision whether their problem requires panels of a certain
minimum length.
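
If a minimum panel length is wanted, it has to be imposed by hand. The following is a minimal sketch of one way to do that, not the original poster's code: the names id, wave, x1 and x2 are hypothetical stand-ins for the panel identifier, the wave variable and the covariates, and only WKRT and the -xtlogit- options are taken from the question quoted below.

```stata
* Sketch only: id, wave, x1, x2 are hypothetical variable names
xtset id wave

* Flag observations that are complete on the outcome and all covariates
egen nmiss = rowmiss(WKRT x1 x2)
gen byte usable = (nmiss == 0)

* Count the usable observations within each panel
bysort id: egen n_usable = total(usable)

* Fit the model only on panels with at least two usable waves
xtlogit WKRT x1 x2 if usable & n_usable >= 2, re or intpoints(20)
```

Changing the threshold in n_usable >= 2 (or dropping that condition) makes the choice of minimum panel length explicit rather than leaving it to a default.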

In any case singleton panels with known time can contribute
information about the overall change.
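
One way to see this is through the standard textbook form of the random-effects logit likelihood, sketched here under the usual assumption of a Gaussian random intercept. The notation (T_i for the number of usable observations in panel i, \Lambda for the logistic function) is introduced for illustration and does not come from the thread.

```latex
L_i(\beta, \sigma_u) =
  \int_{-\infty}^{\infty}
  \left[ \prod_{t=1}^{T_i}
    \Lambda(x_{it}'\beta + u)^{y_{it}}
    \left\{ 1 - \Lambda(x_{it}'\beta + u) \right\}^{1 - y_{it}}
  \right]
  \frac{1}{\sigma_u} \phi\!\left( \frac{u}{\sigma_u} \right) du
```

With T_i = 1 the product has a single factor, but L_i still depends on \beta (including any coefficients on time indicators), so a singleton panel still enters the likelihood.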


On Wed, Aug 24, 2011 at 3:02 AM, Emma Gorman <> wrote:

> I am estimating a random effects logit model, augmented with cluster
> means  to account for correlation between effects and varying
> covariates (a la Mundlak 1978), using xtlogit.
> xtlogit WKRT $tmeans if allyears==1, or intpoints(20) re ;
> Where tmeans is regular covariates + cluster means.
> I have three waves of data (individuals over time) and would ideally like to
> use all longitudinal respondents (those who are in all three waves in this
> analysis). However, a fair few individuals have missing data for some
> covariates, so these observations are dropped from the regression model.
> I initially ended up with
>     *minimum observation per group: 1*
>     avg obs per group: 2.5
>     max obs per group: 3
> I found that there were many individuals who were observed in all waves, but
> had non-missing information for all covariates (and the dependent variable)
> in only one wave. I had assumed such people, who have only one usable wave of
> observation, would be automatically dropped from estimation by Stata as they
> provide no longitudinal info for the model.
> So I isolated and removed these people from the estimation command manually,
> to end up with:
> Random effects u_i ~ Gaussian                   Obs per group: min = 2
>                                                                avg = 2.7
>                                                                max = 3
> My question is essentially: why are such cases, which provide only
> cross-sectional information, not dropped automatically (/should/ they be
> dropped)? Or is there an option to use only longitudinal information? How
> is this consistent with theory? It seems strange that the default should be
> to include everyone.
> My understanding of random effects models is that they use the most
> efficient combination of between and within variation; the time-invariant
> individual effects are integrated out of the likelihood function and are
> assumed to be independent (in the non-linear case).
> So we don't want to know about those who don't have longitudinal information
> for estimation. (??)
> NB a complication with inclusion of cluster means is that  if there
> are individuals who only have one usable wave of information due to a
> missing dependent variable for the other waves, these guys still have
> valid cluster means for the explanatory variables, so in some sense
> there is still within and between information even with just one
> 'wave' of usable information. So perhaps these guys should not be
> gotten rid of...
