
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Hosmer-Lemeshow and other Pseudo Rsquares


From   Cameron McIntosh <[email protected]>
To   STATA LIST <[email protected]>
Subject   RE: st: Hosmer-Lemeshow and other Pseudo Rsquares
Date   Mon, 14 May 2012 21:08:49 -0400

Joseph,

To give reviewers a well-informed write-up, you should also have a look at:
DeMaris, A. (2002). Explained Variance in Logistic Regression: A Monte Carlo Study of Proposed Measures. Sociological Methods & Research, 31(1), 27-74. http://gabarrot.psychologie-sociale.org/documents/DM2002.pdf

Menard, S. (2000). Coefficients of Determination for Multiple Logistic Regression Analysis. The American Statistician, 54(1), 17-24. 

Liao, J.G., & McGee, D. (2003). Adjusted Coefficients of Determination for Logistic Regression. The American Statistician, 57(3), 161-165. 

Mittlböck, M., & Schemper, M. (1996). Explained variation for logistic regression. Statistics in Medicine, 15(19), 1987-1997.

Mittlböck, M. (1998). Computing measures of explained variation for logistic regression models. Computer Methods and Programs in Biomedicine, 58(1), 17-24.

Allen, J., & Le, H. (2008). An Additional Measure of Overall Effect Size for Logistic Regression Models. Journal of Educational and Behavioral Statistics, 33(4), 416-441.

Heinzl, H., Waldhör, T., & Mittlböck, M. (2005). Careful use of pseudo R-squared measures in epidemiological studies. Statistics in Medicine, 24(18), 2867-2872. http://www.meduniwien.ac.at/msi/biometrie/publikationen/Separata/Heinzl_Waldhoer_Mittlboeck_2005_SiM.pdf

Cameron, A.C., & Windmeijer, F.A.G. (1997). An R-squared measure of goodness of fit for some common nonlinear regression models. Journal of Econometrics, 77(2), 329-342. http://cameron.econ.ucdavis.edu/research/je97preprint.pdf

Veall, M.R. & Zimmermann, K.F. (1996). Pseudo-R2 Measures for Some Common Limited Dependent Variable Models. Journal of Economic Surveys, 10(3), 241-259.

Cam

> Date: Mon, 14 May 2012 10:45:45 -0400
> Subject: Re: st: Hosmer-Lemeshow and other Pseudo Rsquares
> From: [email protected]
> To: [email protected]
> 
> Yeah, I second the 'tribal' thing.  I've been largely learning a lot
> of this on my own in order to be thorough on this particular project
> and from one discipline's literature to the next the terminology alone
> is night and day, never mind the exact method and type of reporting they
> prefer.
> 
> I believe I'm on track now.  Thanks for the suggestions!
> 
> On Mon, May 14, 2012 at 10:43 AM, Nick Cox <[email protected]> wrote:
> > I find these things to be highly tribal. One large part of the
> > statistical world doesn't know at all about what another large part
> > regards as utterly standard. So, anything that might surprise your
> > reviewers might need to be explained very carefully.
> >
> > On Mon, May 14, 2012 at 3:34 PM, Joseph Padgett <[email protected]> wrote:
> >> Thanks, Nick. That's helpful.  I've seen these suggestions before, but
> >> wrapped in bigger discussions and not nearly as succinct.
> >>
> >> I am aware that the R square measures for logistic models are only
> >> guides and not sole determining factors, but it seems that researchers
> >> commonly report some form of it (sociology background here btw).
> >>
> >> So I've calculated both of your suggestions.  Any advice on reporting
> >> those?  Does either have an associated line of research that you're
> >> aware of that I should be referring to/citing when I'm reporting the
> >> calculation and results?
> >>
> >> On Mon, May 14, 2012 at 10:10 AM, Nick Cox <[email protected]> wrote:
> >>> I suggest a few meta-rules for yourself:
> >>>
> >>> 1. Whatever you calculate should be defined and calculated
> >>> consistently across different models.
> >>>
> >>> 2. Whatever you calculate you promise to use with extreme caution,
> >>> always flagging precisely how it is calculated.
> >>>
> >>> 3. You don't decide which model is "best" from these measures; you
> >>> just treat them as descriptive statistics.
> >>>
> >>> #1 sounds easy but can bite quite hard. I find the idea of R^2 as
> >>>
> >>> square of correlation between observed and predicted
> >>>
> >>> as the sense of R^2 that I like best but this grows out of a long
> >>> personal history of working with correlation and regression and one
> >>> that is dominated by working with continuous outcomes. People with a
> >>> long history the other way round might want you to look for
> >>>
> >>> 1 - (log likelihood for model) / (log likelihood for same model with
> >>> only a constant term)
> >>>
> >>> and could have similar warm feelings for that. Others would find the
> >>> whole idea of looking at goodness of fit without also assessing number
> >>> of parameters or model complexity in general quite misguided, but
> >>> those others can't agree on which of various *IC you should use, and
> >>> even those who have a favourite often say, "You should use ?IC except
> >>> that it usually favours over-simplified models" or some such.
> >>>
> >>> On the first option see
> >>>
> >>> FAQ     . . . . . . . . . . . . . . . . . . . . . . . Do-it-yourself R-squared
> >>>        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
> >>>        9/03    How can I get an R-squared value when a Stata command
> >>>                does not supply one?
> >>>                http://www.stata.com/support/faqs/stat/rsquared.html
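
The two definitions mentioned above (the squared correlation between observed and predicted, and 1 minus the ratio of log likelihoods) can be sketched in a few lines. This is only an illustration in plain Python, not Stata output and not the FAQ's code: the data are made up, the function names are hypothetical, and the "constant-only" log likelihood here is assumed to come from a simple Bernoulli model with a single common probability. For a multilevel model, the relevant null would be the same model with only a constant term, and its log likelihood would have to come from the fitted model itself.

```python
import math

def squared_correlation(y, p):
    """Square of the Pearson correlation between observed outcomes and predictions."""
    n = len(y)
    mean_y = sum(y) / n
    mean_p = sum(p) / n
    sxy = sum((a - mean_y) * (b - mean_p) for a, b in zip(y, p))
    sxx = sum((a - mean_y) ** 2 for a in y)
    syy = sum((b - mean_p) ** 2 for b in p)
    return sxy ** 2 / (sxx * syy)

def bernoulli_loglik(y, p):
    """Log likelihood of binary outcomes y given predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1 - pi)
               for yi, pi in zip(y, p))

def likelihood_ratio_r2(ll_model, ll_null):
    """1 - (log likelihood for model) / (log likelihood for constant-only model)."""
    return 1 - ll_model / ll_null

# Made-up outcomes and predicted probabilities, for illustration only.
y = [0, 0, 1, 0, 1, 1, 0, 1]
p = [0.1, 0.3, 0.6, 0.4, 0.8, 0.7, 0.2, 0.9]

# Assumed null: every observation gets the overall sample proportion.
p_null = [sum(y) / len(y)] * len(y)

r2_corr = squared_correlation(y, p)
ll_model = bernoulli_loglik(y, p)
ll_null = bernoulli_loglik(y, p_null)
r2_ll = likelihood_ratio_r2(ll_model, ll_null)

print(f"squared correlation:       {r2_corr:.4f}")
print(f"1 - ll(model) / ll(null):  {r2_ll:.4f}")
```

The two numbers generally differ even on the same data, which is one reason for rule #1: whichever definition is chosen should be applied identically to every model being compared.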
> >>>
> >>> On Mon, May 14, 2012 at 2:47 PM, Joseph Padgett <[email protected]> wrote:
> >>>
> >>>> I am working with a data set where students are nested within schools.
> >>>>
> >>>> I have completed a thorough run of models starting with nulls and
> >>>> ending with full fixed- and random-effects with all controls and
> >>>> predictors and several models in between with various combinations of
> >>>> controls.  My dependent variable is a binary outcome.
> >>>>
> >>>> I have Hausman tests, LR tests, and Wald taken care of, but I would
> >>>> like to report some goodness-of-fit results for my models.  I am aware
> >>>> of the Hosmer-Lemeshow test statistic and its interpretation, but I'm
> >>>> having a difficult time finding out how to compute it from my model
> >>>> results.  I would also like to consider alternatives such as Cox and
> >>>> Snell.
> >>>>
> >>>> I have run my models with each of xtlogit, xtmelogit, and gllamm.  I
> >>>> did this mostly to be able to learn a bit about the post estimation
> >>>> commands and different options with each command.  That being said, I
> >>>> don't know how to get the pseudo Rsquare measures after any of these
> >>>> and most explanations that I find refer only to the logit command and
> >>>> give examples using very simplistic models.
> >>>>
> >>>> I'm fairly certain there's something terribly obvious that I'm
> >>>> overlooking.  Any help would be greatly appreciated.
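
For reference, the textbook forms of the two statistics named in the question can be computed by hand from a vector of outcomes, a vector of predicted probabilities, and two log likelihoods. The sketch below is plain Python with made-up numbers and hypothetical function names. It assumes the usual Hosmer-Lemeshow construction (equal-sized groups ordered by predicted probability) and the usual Cox and Snell formula. Both were developed for ordinary logistic regression; whether, and with which predictions, they are meaningful after a random-effects model is not something this thread settles.

```python
import math

def hosmer_lemeshow(y, p, groups=10):
    """Return (chi-square statistic, degrees of freedom).

    Observations are sorted by predicted probability and split into
    `groups` roughly equal-sized bins; each bin contributes
    (observed - expected)^2 / (expected * (1 - expected / bin size)).
    Predicted probabilities are assumed to lie strictly between 0 and 1.
    """
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    used = 0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # can happen when there are fewer observations than groups
        size = len(chunk)
        observed = sum(yi for _, yi in chunk)
        expected = sum(pi for pi, _ in chunk)
        stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
        used += 1
    return stat, used - 2

def cox_snell_r2(ll_model, ll_null, n):
    """Cox and Snell: 1 - exp(2 * (ll_null - ll_model) / n)."""
    return 1 - math.exp(2 * (ll_null - ll_model) / n)

# Made-up outcomes and predicted probabilities, for illustration only.
y = [0, 0, 0, 1, 0, 1, 1, 1]
p = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

hl_stat, hl_df = hosmer_lemeshow(y, p, groups=4)
print(f"Hosmer-Lemeshow chi-square = {hl_stat:.4f} on {hl_df} df")

# Made-up log likelihoods for a fitted model and a constant-only model.
cs = cox_snell_r2(ll_model=-2.392, ll_null=-5.5452, n=8)
print(f"Cox and Snell R-squared = {cs:.4f}")
```

With real data the number of groups is conventionally 10, and the statistic is referred to a chi-square distribution with (groups - 2) degrees of freedom.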
> >>>
> >>> *
> >>> *   For searches and help try:
> >>> *   http://www.stata.com/help.cgi?search
> >>> *   http://www.stata.com/support/statalist/faq
> >>> *   http://www.ats.ucla.edu/stat/stata/

