st: RE: logistic regression with clustered SE vs. xtlogit


From   Timothy Mak <tshmak@hku.hk>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: logistic regression with clustered SE vs. xtlogit
Date   Tue, 18 Jun 2013 10:08:20 +0800

Hi Adam, 

This sounds like an Item Response Theory problem to me. http://en.wikipedia.org/wiki/Item_response_theory 

However, IRT models usually assume an underlying continuous score rather than an underlying binary factor. If you definitely want a binary underlying factor, then the model to consider is a latent class model. 
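
For illustration, a two-class latent class model could be sketched as below. This is only a sketch under assumptions: it needs Stata 15 or later (where -gsem- accepts the lclass() option), the data are in wide format, and q1 to q15 are hypothetical names for the 15 binary items.

```stata
* Sketch only: q1-q15 are hypothetical names for the 15 binary items.
* Requires Stata 15 or later for the lclass() option.
gsem (q1-q15 <- ), logit lclass(C 2)

* Estimated share of respondents in each latent class
estat lcprob

* Class-specific probability of answering 1 on each item
estat lcmean
```

Relating class membership to age and gender would require adding them as predictors of the class variable, which is not shown here.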

If you want to assume an underlying continuous score, then the Rasch model may be appropriate. Your random-effects logistic model can be thought of as a very simplistic Rasch model, i.e. one that assumes all questions are equally difficult (the same probability of a 1 for any given respondent). See http://www.stata.com/support/faqs/statistics/rasch-model/ for a very useful discussion of the Rasch model and how to do it in Stata. 
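
As a rough sketch of the difference, the two models can be fitted side by side. The variable names (id, age, female, q1 to q15) are assumptions for illustration, and the data are assumed to start in wide format with one row per respondent.

```stata
* Sketch only: assumes wide data with hypothetical variables
* id, age, female and items q1-q15 (each coded 0/1).
reshape long q, i(id) j(item)
xtset id

* Random-intercept logit with a common difficulty for every item
xtlogit q age i.female, re

* Rasch-type version: item indicators give each item its own difficulty
xtlogit q i.item age i.female, re
```

In both fits the person-level random intercept plays the role of the underlying continuous score; only the second lets the items differ in how often they are answered 1.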

The clustered SE approach is the same as doing -xtlogit, pa- with the corr(independent) and vce(robust) options. The difference between -xtlogit, re- and -xtlogit, pa- is explained in http://www.stata.com/support/faqs/statistics/random-effects-versus-population-averaged/ 
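
A quick way to check this on your own data is to run both commands and compare the output. The sketch below assumes long-format data (one row per person-item) and hypothetical variable names y, age, female and id. The coefficients should match; the standard errors should agree up to any finite-sample adjustment that the two commands apply differently.

```stata
* Sketch only: y is the 0/1 answer, id identifies the respondent.
* Pooled logit with standard errors clustered on respondent
logit y age i.female, vce(cluster id)

* Population-averaged (GEE) logit with an independent working
* correlation and robust standard errors
xtset id
xtlogit y age i.female, pa corr(independent) vce(robust)
```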

I wouldn't recommend using -xtlogit, pa-, since (a) it's not commonly (if ever) used for this kind of analysis, and (b) it's inefficient even if it's appropriate. 

That's my twopence... 

Tim



-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Adam Olszewski
Sent: 16 June 2013 08:13
To: statalist@hsphsun2.harvard.edu
Subject: st: logistic regression with clustered SE vs. xtlogit

Dear listers,
I have a dataset with results from a 15-item questionnaire, with a
binary response to each question (the questions are felt to measure
the same underlying binary factor). I want to study the association of
demographic variables (age, gender) with whether the answer is 0 or 1.
The answers to the 15 items filled out by the same person are
obviously correlated.
After reading about different models, it seems that logistic
regression with clustered standard errors (-logit varlist, vce(cluster
ID)-) or random-effects logistic model (-xtset ID-, then: -xtlogit
varlist, re-) might be appropriate, and they give similar, although not
identical, results. I am not sure what the conceptual difference
between them is. Would one be preferred over the other in some
circumstances? Or did I even pick the wrong tools for the problem? I
thought about regressing the mean of the answers, but such a dependent
variable does not meet the assumptions of any model that I know of.
Sorry if this sounds basic, but I rarely wander beyond routine
logistic regression and I am a little puzzled by xtlogit.
Best regards,
Adam Olszewski
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


