Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: RE: logistic regression with clustered SE vs. xtlogit

From: Timothy Mak <[email protected]>

To: "[email protected]" <[email protected]>

Subject: st: RE: logistic regression with clustered SE vs. xtlogit

Date: Tue, 18 Jun 2013 10:08:20 +0800

```Hi Adam,

This sounds like an Item Response Theory problem to me. http://en.wikipedia.org/wiki/Item_response_theory

However, IRT models usually assume an underlying continuous score rather than an underlying binary factor. If you definitely want a binary underlying factor, then the model to consider is a latent class model.

If you want to assume an underlying continuous score, then the Rasch model may be appropriate. Your random-effects logistic model can be thought of as a very simplistic Rasch model - i.e. it assumes all questions have the same difficulty, so that for a given person every question has the same probability of being answered 1. See http://www.stata.com/support/faqs/statistics/rasch-model/ for a very useful discussion of the Rasch model and how to do it in Stata.

The clustered SE approach is the same as doing -xtlogit, pa- with the corr(independent) and vce(robust) options. The difference between -xtlogit, re- and -xtlogit, pa- is explained in http://www.stata.com/support/faqs/statistics/random-effects-versus-population-averaged/

I wouldn't recommend using -xtlogit, pa-, since (a) it's not commonly (if ever) used for this kind of analysis, and (b) it's inefficient even if it's appropriate.

That's my twopence...

Tim

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Adam Olszewski
Sent: 16 June 2013 08:13
To: [email protected]
Subject: st: logistic regression with clustered SE vs. xtlogit

Dear listers,
I have a dataset with results from a 15-item questionnaire, with a
binary response to each question (the questions are felt to measure
the same underlying binary factor). I want to study the association of
demographic variables (age, gender) with whether the answer is 0 or 1.
The responses are obviously correlated across the 15 items filled out
by the same person.
regression with clustered standard errors (-logit varlist, vce(cluster
ID)-) or random-effects logistic model (-xtset ID-, then: -xtlogit
varlist, re-) might be appropriate, and give similar, although not
identical results. I am not sure what is the conceptual difference
between them. Would one be preferred over the other in some
circumstances? Or did I even pick wrong tools for the problem? I
variable does not meet assumption for any model that I know of.
Sorry if this sounds basic, but I rarely wander beyond routine
logistic regression and I am a little puzzled by xtlogit.
Best regards,
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
```
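
The commands discussed in the thread can be lined up side by side. What follows is only a minimal sketch, not the original posters' code: it assumes a wide dataset with one row per respondent and hypothetical variable names `id`, `age`, `female`, and `q1`-`q15` for the 15 binary items. The last command is one simple way to relax the "all questions equally difficult" assumption by adding item-specific intercepts, in the spirit of the Rasch FAQ linked above.

```stata
* Hypothetical variable names (assumptions, not from the thread):
*   id = respondent identifier, age, female = covariates,
*   q1-q15 = the 15 binary item responses in wide layout

* Stack the items so that each row is one respondent-item response
reshape long q, i(id) j(item)

* (1) Pooled logit with standard errors clustered on respondent
logit q age female, vce(cluster id)

* (2) Population-averaged (GEE) logit with independent working
*     correlation and robust SEs; point estimates should match (1)
xtset id
xtlogit q age female, pa corr(independent) vce(robust)

* (3) Random-effects logit: subject-specific coefficients,
*     one random intercept per respondent
xtlogit q age female, re

* (4) Random-effects logit with item dummies, so that each
*     question gets its own difficulty (a simple Rasch-type model)
xtlogit q i.item age female, re
```

Comparing (1) and (2) shows the equivalence Tim describes, while (3) typically gives coefficients that are larger in absolute value because they are conditional on the respondent's random intercept rather than averaged over the population.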