# st: Residuals in Logistic Regression

From: Richard Williams
To: statalist@hsphsun2.harvard.edu
Subject: st: Residuals in Logistic Regression
Date: Fri, 09 Apr 2004 09:26:35 -0500

I obviously don't understand how the residual statistics work in logistic regression. I ran a logistic regression of happymar (coded 0, 1) on church and female (both also coded 0, 1) and educ (years of education). I then used predict to get the deviance residuals (I get similar results if I use the rstandard or residuals options on predict). Listing the highest values, I get the following:

. extremes dev p happymar church female educ, nolabel high

+----------------------------------------------------------------+
| obs:        dev          p   happymar   church   female   educ |
|----------------------------------------------------------------|
|  43.   1.170689   .5039612          1        1        0     10 |
|   2.   1.394511   .3782007          1        0        1     10 |
|   6.     2.4859   .0858805          0        0        0     11 |
|  13.     2.4859   .0858805          1        0        0     11 |
|  36.     2.4859   .0858805          1        0        0     11 |
+----------------------------------------------------------------+

note: 6 values of -1.331003

What I don't understand is, why does case 6 stand out as an outlier? The observed value of happymar is 0, and the predicted probability of a 1 is only .0858805. So, I would think case 6 is anything but an outlier. In addition, cases 13 and 36 have the same values as 6 on the Xs, but for them happymar = 1. So, I would expect them to be outliers, but I would not expect case 6 to be.

Incidentally, if I cheat and run OLS regression instead, case 6 is not identified as an outlier.
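
For reference, the arithmetic behind the question can be checked by hand. This is only a sketch: it assumes the usual textbook per-observation definition of the deviance residual, which the post does not state, with $y_i$ the observed outcome and $\hat p_i$ the predicted probability.

$$
d_i = \operatorname{sign}(y_i - \hat p_i)\,\sqrt{-2\left[\,y_i \ln \hat p_i + (1 - y_i)\ln(1 - \hat p_i)\,\right]}
$$

With $\hat p = .0858805$, this gives $d \approx -0.42$ for case 6 ($y = 0$) and $d \approx 2.22$ for cases 13 and 36 ($y = 1$), which matches the intuition in the post. Neither value equals the reported 2.4859.

The reported value is reproduced if cases 6, 13 and 36, which share the same X values, are treated as a single covariate pattern with $m_j = 3$ observations and $y_j = 2$ successes. That grouping is an assumption about how the residual was computed, not something shown in the output above.

$$
d_j = \operatorname{sign}(y_j - m_j \hat p_j)\,\sqrt{2\left[\,y_j \ln\frac{y_j}{m_j \hat p_j} + (m_j - y_j)\ln\frac{m_j - y_j}{m_j(1 - \hat p_j)}\right]}
= \sqrt{2\left[\,2\ln\frac{2}{0.2576} + \ln\frac{1}{2.7424}\right]} \approx \sqrt{6.180} \approx 2.486
$$

Under that reading, every case in the pattern receives the same residual regardless of its own happymar value, which would account for case 6 being listed alongside cases 13 and 36.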
