Allan Reese (Cefas) <r.a.reese@cefas.co.uk> wrote:
> I've been fitting a logistic glm to some count data: six treatments, each
> duplicated, response is r/n:
>
> treat r n
> 1 8 13
> 1 9 15
> ...
> Since the effect is due, in this and similar examples, to groups having zero
> variance, is it not possible to modify glm (and other estimating commands?) to
> detect this and either issue a warning or switch automatically to robust
> estimators?
What Allan observes in these data is what we refer to as "perfect predictors"
in logit models. Categories 4 and 6 of treat (or the corresponding indicator
variables defining these categories) predict success perfectly; that is, if we
expand these data to the equivalent binary representation
treat y
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 0
1 0
1 0
1 0
1 0
2 1
...
then Pr(y = 0 | treat==4) = 0 and Pr(y = 0 | treat==6) = 0. Stata's -logit- and
-logistic- commands report the appropriate warning message in such situations
and drop perfect predictors, along with the observations they predict, from
the model. A more detailed description of this may be found in [R] logit p.96.
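
The expansion above can be sketched in Stata as follows. This is only an
illustration, assuming the grouped data are in memory with the variables
treat, r and n from Allan's listing; the names id and y are made up here.

```stata
* one row per group; id keeps track of the original grouped observation
generate long id = _n

* replace each grouped observation with n copies of itself
expand n

* within each group, mark the first r copies as successes (y = 1)
bysort id: generate byte y = (_n <= r)

* cells with no y == 0 rows reveal the categories that predict success perfectly
tabulate treat y

* -logit- on the expanded data should report the perfect predictors
xi: logit y i.treat
```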
Since Allan has grouped data, he can use Stata's -blogit- command to fit the
logit model:
. xi: blogit r n i.treat
i.treat           _Itreat_1-6         (naturally coded; _Itreat_1 omitted)

note: _Itreat_4 != 0 predicts success perfectly
      _Itreat_4 dropped and 2 obs not used

note: _Itreat_6 != 0 predicts success perfectly
      _Itreat_6 dropped and 2 obs not used

Logistic regression for grouped data              Number of obs   =        118
                                                  LR chi2(3)      =      13.50
                                                  Prob > chi2     =     0.0037
Log likelihood = -54.186571                       Pseudo R2       =     0.1108

------------------------------------------------------------------------------
    _outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   _Itreat_2 |   2.203739   .8279165     2.66   0.008     .5810527    3.826426
   _Itreat_3 |   .4119798   .5553943     0.74   0.458    -.6765729    1.500533
   _Itreat_5 |   1.761907   .7211817     2.44   0.015     .3484164    3.175397
       _cons |   .4353181    .386953     1.12   0.261    -.3230959    1.193732
------------------------------------------------------------------------------
Note that -blogit- identified the "perfect predictor problem" and issued the
corresponding warning messages.
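
As a quick sanity check on the table, the _cons row can be reproduced by
hand. This sketch assumes the two treat==1 rows in Allan's listing (8/13 and
9/15) are the only observations in the base category, giving 17 successes and
11 failures out of 28.

```stata
* log odds of success in the base category: ln(17/11)
display logit(17/28)

* usual large-sample standard error of a log odds: sqrt(1/successes + 1/failures)
display sqrt(1/17 + 1/11)
```

The two results should agree with the Coef. and Std. Err. reported for _cons
(.4353181 and .386953).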
You can fit the logit model equivalently using -glm- with family(binomial) and
link(logit). However, -glm- does not replicate the perfect-predictor handling
of the dedicated logit commands, since it is designed as a general facility
for fitting generalized linear models. Therefore, we tend not to specialize it
for particular members of this family, such as, in this case, logit models.
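
For reference, a minimal sketch of the -glm- call for these grouped data,
again assuming the variables treat, r and n from Allan's listing:

```stata
* r is the number of successes; n is supplied as the binomial denominator
xi: glm r i.treat, family(binomial n) link(logit)
```

Here -glm- keeps _Itreat_4 and _Itreat_6 in the model and issues no
perfect-predictor note, so one would expect very large coefficients and
standard errors for those two terms rather than dropped variables.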
-- Yulia
ymarchenko@stata.com