Statalist The Stata Listserver



Re: st: GLMs that fail - an effect (Hauck-Donner)


From   ymarchenko@stata.com (Yulia Marchenko, StataCorp)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: GLMs that fail - an effect (Hauck-Donner)
Date   Fri, 17 Mar 2006 14:44:10 -0600

Allan Reese (Cefas) <r.a.reese@cefas.co.uk> wrote:

> I've been fitting a logistic glm to some count data: six treatments, each
> duplicated, response is r/n:
>
>	treat     r        n
>	1         8       13 
>	1         9       15 
> ...
> Since the effect is due, in this and similar examples, to groups having zero
> variance, is it not possible to modify glm (and other estimating commands?) to
> detect this and either issue a warning or switch automatically to robust
> estimators?

What Allan observes in these data is what we refer to as "perfect
predictors" in logit models. Categories 4 and 6 of treat (or the
corresponding indicator variables defining these categories) predict success
perfectly; that is, if we expand these data to the equivalent binary
representation

	treat	y
	1	1
	1	1
	1	1
	1	1
	1	1
	1	1
	1	1
	1	1
	1	0
	1	0
	1	0
	1	0
	1	0
	2	1
	...

then Pr(y = 0 | treat==4) = 0 and Pr(y = 0 | treat==6) = 0. Stata's -logit- and
-logistic- commands report the appropriate warning message in such situations
and drop the perfect predictors from the model.  A more detailed description of
this may be found in [R] logit p.96.
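
As a rough sketch of that expansion, assuming the grouped data are in memory
with the variables treat, r, and n shown above (the names id and y are
introduced here only for illustration), something along these lines should
build the binary representation and fit the same model with -logit-:

```stata
* one row per trial: r successes (y = 1) followed by n - r failures (y = 0)
generate long id = _n
expand n
bysort id: generate byte y = (_n <= r)

* -logit- should note that treat 4 and 6 predict success perfectly
xi: logit y i.treat
```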

Since Allan has grouped data, he can use Stata's -blogit- command to fit the
logit model:

. xi: blogit r n i.treat
i.treat           _Itreat_1-6         (naturally coded; _Itreat_1 omitted)
note: _Itreat_4 != 0 predicts success perfectly
      _Itreat_4 dropped and 2 obs not used

note: _Itreat_6 != 0 predicts success perfectly
      _Itreat_6 dropped and 2 obs not used


Logistic regression for grouped data              Number of obs   =        118
                                                  LR chi2(3)      =      13.50
                                                  Prob > chi2     =     0.0037
Log likelihood = -54.186571                       Pseudo R2       =     0.1108

------------------------------------------------------------------------------
    _outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   _Itreat_2 |   2.203739   .8279165     2.66   0.008     .5810527    3.826426
   _Itreat_3 |   .4119798   .5553943     0.74   0.458    -.6765729    1.500533
   _Itreat_5 |   1.761907   .7211817     2.44   0.015     .3484164    3.175397
       _cons |   .4353181    .386953     1.12   0.261    -.3230959    1.193732
------------------------------------------------------------------------------

Note that -blogit- identified the "perfect predictor problem" and issued the
corresponding warning messages.

You can fit the logit model equivalently using -glm- with family(binomial) and
link(logit). However, -glm- does not replicate the behavior of the
corresponding logit commands, since it is thought of as a facility for fitting
a whole set of generalized linear models. Therefore, we tend not to specialize
it for particular members of this family such as, in this case, logit models.
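
For reference, a sketch of the equivalent -glm- call on the grouped data,
together with a simple manual check for groups in which every trial is a
success or every trial is a failure. The variable names rtot and ntot are
made up for this illustration:

```stata
* binomial family with n as the denominator, logit link
xi: glm r i.treat, family(binomial n) link(logit)

* total successes and trials per treatment; list groups with no variation
egen long rtot = total(r), by(treat)
egen long ntot = total(n), by(treat)
list treat rtot ntot if rtot == 0 | rtot == ntot
```

Because -glm- keeps all six treatments, the coefficients for any groups
listed by the check can be expected to be very large with very large standard
errors, so Wald tests for those terms should not be trusted.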


-- Yulia
   ymarchenko@stata.com


