The Stata listserver

Re: st: Residuals in Logistic Regression


From   Jhilbe@aol.com
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Residuals in Logistic Regression
Date   Mon, 12 Apr 2004 12:11:48 -0400

The -logistic- command was based on a program called -logiodds- that was made available to Stata users in the January 1991 "The Stata News". It was a stimulus to the creation of the Stata Technical Bulletin, which began in May of 1991. In fact, the first issue had a revised edition of -logiodds-. It was written to implement the Hosmer-Lemeshow recommendations regarding covariate patterns and various GOF statistics, which were detailed in their then fairly new text. It was meant to be an alternative to Stata's -logit- command, which kept the observation-based residuals.

The current post-estimation commands for -logistic- (-lfit-, -lstat-, and -lroc-) were originally provided as options to the -logiodds- command (I believe that the 2nd version, the one preceding Stata's official -logistic- command, was called -logiodd2- in STB-1).

As it is now, -logistic- still retains the residuals-by-covariate-pattern approach to diagnostics. This underlies the fit statistics as well. Most other commercial software does not do this, hence possible differences in output. In my opinion, the Hosmer-Lemeshow approach of basing fit statistics on covariate patterns is preferable to simply using unadjusted individual observations as the basis of residual and fit statistics.
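
To see the covariate-pattern behavior directly, something like the following sketch may help. The variable names y, x1, and x2 are hypothetical placeholders, not from the original thread; the -predict- options used (number, residuals, deviance) are the ones documented for -logistic-.

```stata
* Sketch only: covariate-pattern residuals after -logistic-.
* y (0/1 outcome), x1 and x2 are hypothetical variable names.
logistic y x1 x2

predict pattern, number          // sequential number of the covariate pattern
predict double rpat, residuals   // Pearson residual, by covariate pattern
predict double dpat, deviance    // deviance residual, by covariate pattern

* Observations that share a covariate pattern carry the same values
sort pattern
list pattern y x1 x2 rpat dpat, sepby(pattern)
```

With continuous covariates most patterns contain a single observation, so the grouping is easiest to see when the covariates are categorical.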

The way to get what you want (statistics based on observations rather than covariate patterns) is to use -logit-, obtain the linear predictor and fit (mu) statistics using -predict-, and calculate the residuals and fit statistics using the appropriate formulae. You can find them in the manual, or in Hardin & Hilbe (2001, Stata Press). Calculating the residuals is really quite easy.
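
As a rough sketch of that calculation, assuming a 0/1 outcome y and covariates x1 and x2 (hypothetical names), and using the standard Pearson and deviance residual formulae for Bernoulli data rather than any built-in residual option:

```stata
* Sketch only: observation-level residuals computed by hand after -logit-.
* y (0/1 outcome), x1 and x2 are hypothetical variable names.
logit y x1 x2

predict double xb, xb            // linear predictor
predict double mu, pr            // fitted probability (mu)

* Pearson residual for each observation
generate double rp = (y - mu) / sqrt(mu * (1 - mu)) if e(sample)

* Deviance residual for each observation
generate double rd = sign(y - mu) * ///
    sqrt(-2 * (y * ln(mu) + (1 - y) * ln(1 - mu))) if e(sample)

* Observation-based fit statistics: sums of squared residuals
generate double rp2 = rp^2
generate double rd2 = rd^2
quietly summarize rp2
display "Pearson chi2 = " r(sum)
quietly summarize rd2
display "Deviance     = " r(sum)
```

For ungrouped 0/1 data the deviance computed this way should agree with -2 times the log likelihood reported by -logit- (display -2*e(ll)), which gives a quick check on the arithmetic.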

I hope that this helps - and gives a bit of background. 

Joe Hilbe

> 
> At 06:55 PM 4/9/2004 +0100, Nick Cox wrote:
> >To give this pot another stir, and to use
> >mutually accessible data, -logit- and -glm-
> >with logit link and binomial family give quite
> >different deviance residuals.
> >
> >The pattern of -glm-'s makes sense,
> >but that of -logit-'s is more puzzling.
> 
> Ok, after looking in several wrong places, I finally found an explanation 
> in the Stata Reference Manual G-M, pp. 315-316.  It says that "All the 
> residual and diagnostic statistics calculated by Stata [NOTE: I think it 
> really means Stata logistic regression and some related routines] are in 
> terms of covariate patterns, not observations.  That is, all observations 
> with the same covariate patterns are given the same residual and diagnostic 
> statistics."  It says that Hosmer and Lemeshow argue that this is the 
> better way to do it.
> 
> They may be right, but even Stata isn't consistent across routines in the 
> handling of this.  I'd like for -predict- to offer residual stats that were 
> based on the individual observations and not the covariate 
> patterns.
> 
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> FAX:    (574)288-4373
> HOME:   (574)289-5227
> EMAIL:  Richard.A.Williams.5@ND.Edu
> WWW (personal):    http://www.nd.edu/~rwilliam
> WWW (department):    http://www.nd.edu/~soc


