
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: st: RE: -binreg-
Date: Thu, 21 Nov 2002 19:18:06 -0000

Jay Kaufman

> The -binreg- routine fits generalized linear models for the
> binomial family. It is presumably preferred over fitting the same
> model in -glm-, not only for the convenience of not having to
> specify the distributional family in the command line, but also
> because in iteratively seeking the estimates it checks to make sure
> that they are consistent with the range of allowable probabilities
> (i.e. 0 to 1), as described on page 138 of the manual [Ref A-G].
> So my question is, why does -binreg- appear to be so
> bad at this checking?
>
> Take a very simple model using the auto.dta.
>
> . use "C:\Stata\auto.dta", clear
> (1978 Automobile Data)
>
> . binreg foreign mpg, rr
>
> Residual df  =        72                    No. of obs =        74
> Pearson X2   =  73.88014                    Deviance   =  78.99933
> Dispersion   =  1.026113                    Dispersion =  1.097213
>
> Bernoulli distribution, log link
> ----------------------------------------------------------------------
>          |                 EIM
>  foreign | Risk Ratio   Std. Err.      z    P>|z|  [95% Conf. Interval]
> ---------+------------------------------------------------------------
>      mpg |   1.097213   .0109901     9.26   0.000   1.075883   1.118966
> ----------------------------------------------------------------------
>
> . predict phat, mu
>
> . sum phat
>
>     Variable |     Obs        Mean   Std. Dev.       Min        Max
> -------------+-----------------------------------------------------
>         phat |      74    .3008965     .22691   .1072727   1.580984
>
> Clearly a predicted probability > 1.5 is not a good estimate. Did
> I do something wrong? Or did -binreg- do something wrong? Or is
> this simply another example of why linear models of the logit and
> probit have dominated analysis of binary data for decades?
>
> By the way, note that if I fit the exact same model using -glm-,
> this same observation gets a predicted probability of 1.43, so
> -binreg- actually seems to do worse.

This kind of comment could be extended indefinitely and made
whenever a model yields predictions that violate limits known
to the modeller.
Yet models with inappropriate limiting behaviour remain
in the repertoire for various reasons, some force of habit
or tradition, but one often being that they may well be
adequate or even best for the range of data found in practice.

My interpretation is that, as in some of those children's
stories, you got exactly what you asked for, and that's
your punishment. Specifically, in the case of -binreg-
there is no such check -- according to my scan of the code.

As you imply, a prerogative of the modeller, and usually
a mark of good modelling taste, is to prefer a model that
will always yield qualitatively correct predictions,
as is guaranteed here by appropriate choice of link.
With the -or- option (i.e. logit link) predictions
are bounded appropriately and very close to what the
equivalent -glm- yields.

Nick
[email protected]

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
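
The contrast between the two links described in the reply can be
checked directly. What follows is a minimal sketch, not the code
either poster ran: it assumes auto.dta can be loaded with -sysuse-
(otherwise -use- the file by path, as in the quoted message), and
the variable names phat_rr, phat_or and phat_glm are made up for
illustration. Under the log link the fitted mean is exp(xb), which
has no upper bound; under the logit link it is exp(xb)/(1 + exp(xb)),
which always lies strictly between 0 and 1.

```stata
* Assumption: auto.dta is available through -sysuse-
sysuse auto, clear

* Log link, as in the quoted message: fitted values are exp(xb)
binreg foreign mpg, rr
predict phat_rr, mu

* Logit link: fitted values are bounded between 0 and 1
binreg foreign mpg, or
predict phat_or, mu

* The equivalent -glm- fit, for comparison with -binreg, or-
glm foreign mpg, family(binomial) link(logit)
predict phat_glm, mu

* Compare the ranges of the three sets of fitted values
summarize phat_rr phat_or phat_glm
```

If the reply is right, the maximum of phat_rr should exceed 1, as in
the quoted output, while phat_or and phat_glm stay inside the unit
interval and nearly coincide.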

