


st: RE: logistic regression with clustered SE vs. xtlogit

From   Timothy Mak <>
To   "" <>
Subject   st: RE: logistic regression with clustered SE vs. xtlogit
Date   Tue, 18 Jun 2013 10:08:20 +0800

Hi Adam, 

This sounds like an Item Response Theory problem to me. 

However, IRT models usually assume an underlying continuous score rather than an underlying binary factor. If you definitely want a binary underlying factor, then the model to consider is a latent class model. 
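As a rough sketch of what a two-class latent class model could look like in Stata: this assumes Stata 15 or later (where -gsem- accepts the lclass() option) and data in wide format, one row per person. The variable names q1-q15 and the class name C are placeholders, not taken from your data.

```stata
* Assumes Stata 15+ and wide-format data: one row per respondent,
* with hypothetical 0/1 item variables q1-q15.

* Two-class latent class model with logit links for the 15 items
gsem (q1-q15 <- ), logit lclass(C 2)

* Estimated share of respondents in each class
estat lcprob

* Estimated probability of answering 1 on each item, by class
estat lcmean
```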

If you want to assume an underlying continuous score, then the Rasch model may be appropriate. Your random-effects logistic model can be thought of as a very simplistic Rasch model - i.e. it assumes all questions are equally difficult, so that a given respondent has the same probability of answering 1 on every question. See for a very useful discussion of the Rasch model and how to do it in Stata. 
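To make the contrast concrete, here is a minimal sketch, assuming long-format data (one row per person-item). The variable names id, item, y, age and female are hypothetical. Adding item indicators to the random-effects logit is one common way to let each question have its own difficulty, which is the Rasch-type structure described above.

```stata
* Hypothetical long-format data: one row per person-item,
* with variables id, item (1-15), y (0/1), age, female.
xtset id

* Random-intercept logit with no item terms:
* all 15 questions share a single difficulty
xtlogit y age female, re

* Rasch-type model: item indicators give each question its own
* difficulty; the random intercept plays the role of the person's score
xtlogit y i.item age female, re
```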

The clustered SE approach is the same as doing -xtlogit, pa- with the corr(independent) and vce(robust) options. The difference between -xtlogit, re- and -xtlogit, pa- is explained in 

I wouldn't recommend using -xtlogit, pa-, since (a) it's not commonly (if ever) used for this kind of analysis, and (b) it's inefficient even if it's appropriate. 
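For reference, the three fits discussed above can be put side by side. This is only a sketch; the variable names id, y, age and female are assumed, and the data are taken to be in long format.

```stata
* Hypothetical long-format data: one row per person-item,
* with variables id, y (0/1), age, female.
xtset id

* Pooled logit with standard errors clustered on respondent
logit y age female, vce(cluster id)

* Population-averaged (GEE) logit with an independent working
* correlation and robust standard errors
xtlogit y age female, pa corr(independent) vce(robust)

* Random-effects (subject-specific) logit for comparison
xtlogit y age female, re
```

The first two target population-averaged effects, while -xtlogit, re- targets subject-specific effects, so the re coefficients will typically be somewhat larger in magnitude.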

That's my twopence... 


-----Original Message-----
From: [] On Behalf Of Adam Olszewski
Sent: 16 June 2013 08:13
Subject: st: logistic regression with clustered SE vs. xtlogit

Dear listers,
I have a dataset with results from a 15-item questionnaire, with a
binary response to each question (the questions are felt to measure
the same underlying binary factor). I want to study the association of
demographic variables (age, gender) with whether the answer is 0 or 1.
The answers are obviously correlated across the 15 items filled out
by the same person.
After reading about different models, it seems that logistic
regression with clustered standard errors (-logit varlist, vce(cluster
ID)-) or random-effects logistic model (-xtset ID-, then: -xtlogit
varlist, re-) might be appropriate, and give similar, although not
identical results. I am not sure what is the conceptual difference
between them. Would one be preferred over the other in some
circumstances? Or did I even pick wrong tools for the problem? I
thought about regressing the mean of answers, but such a dependent
variable does not meet the assumptions of any model that I know of.
Sorry if this sounds basic, but I rarely wander beyond routine
logistic regression and I am a little puzzled by xtlogit.
Best regards,
Adam Olszewski