I obviously don't understand how the residual statistics work in logistic
regression. I have run a logistic regression of happymar (coded 0, 1) on
church and female (also both 0, 1) and educ (years of education). I then
use predict to get the deviance residuals (I get similar results if I use
the rstandard or residuals options on predict). I get the following:

. extremes dev p happymar church female educ, nolabel high

What I don't understand is, why does case 6 stand out as an outlier? The
observed value of happymar is 0, and the predicted probability of a 1 is
only .0858805. So, I would think case 6 is anything but an outlier. In
addition, cases 13 and 36 have the same values as case 6 on the Xs, but for
them happymar = 1. So, I would expect them to be the outliers, and I would
not expect case 6 to be.
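
To make my expectation concrete, here is a minimal sketch of the hand
calculation I have in mind. It assumes the usual per-observation definition
of the deviance residual, -sqrt(-2*ln(1 - p)) when the outcome is 0 and
sqrt(-2*ln(p)) when the outcome is 1. That formula is my assumption; it may
not be how predict actually computes the statistic. The scalar name p is
just for illustration.

```stata
* Hand calculation under the assumed per-observation formula,
* using the predicted probability reported for case 6
scalar p = .0858805

* happymar = 0 (case 6): residual = -sqrt(-2*ln(1 - p))
display -sqrt(-2*ln(1 - scalar(p)))

* happymar = 1 (cases 13 and 36, same Xs so same p): residual = sqrt(-2*ln(p))
display sqrt(-2*ln(scalar(p)))
```

By this calculation the residual is roughly -0.42 when happymar = 0 and
roughly 2.2 when happymar = 1, which is why I expected cases 13 and 36,
not case 6, to stand out.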
Incidentally, if I cheat and run OLS regression instead, case 6 is not
identified as an outlier.