
From: rgutierrez@stata.com (Roberto G. Gutierrez, StataCorp.)
To: statalist@hsphsun2.harvard.edu
Subject: st: Re: -binreg-
Date: Thu, 21 Nov 2002 13:26:52 -0600

Jay Kaufman <Jay_Kaufman@unc.edu> asks:

> The -binreg- routine fits generalized linear models for the binomial family.
> It is presumably preferred over fitting the same model in -glm-, not only
> for the convenience of not having to specify the distributional family in
> the command line, but also because in iteratively seeking the estimates it
> checks to make sure that they are consistent with the range of allowable
> probabilities (i.e. 0 to 1), as described on page 138 of the manual [Ref
> A-G]. So my question is, why does -binreg- appear to be so bad at this
> checking?

Actually, with Stata 7 -binreg- and -glm- are one and the same; -binreg- now
only serves as a front end to a -glm- call. The only advantage of using
-binreg- (besides not having to specify -family(binomial)- to -glm-) is that
you do not have to know which link goes with which type of estimate, e.g.
that you use a log link to get risk ratios.

With Stata 7, -glm- was overhauled, and one of the improvements was the
inclusion of all the -binreg- special code to bump predicted probabilities
back into the range (0,1) before applying the link function during the
iterative estimation. This made -binreg- obsolete, except for serving as a
front end to the more powerful -glm-.

The reason you saw a difference in output between -binreg- and -glm- is that
-binreg- uses IRLS exclusively, yet the default for -glm- is to use
Newton-Raphson maximum likelihood. If you add the -irls- option to your
-glm- commands you'll see that there is no difference at all.

> Take a very simple model using the auto.dta.

> . use "C:\Stata\auto.dta", clear
> (1978 Automobile Data)

> . binreg foreign mpg, rr

This is precisely the same as

. glm for mpg, fam(binom) link(log) irls eform

[output omitted]

> . predict phat, mu
> . sum phat
>
>     Variable |     Obs        Mean   Std. Dev.       Min        Max
> -------------+-----------------------------------------------------
>         phat |      74    .3008965     .22691   .1072727   1.580984

> Clearly a predicted probability > 1.5 is not a good estimate. Did I do
> something wrong? Or did -binreg- do something wrong? Or is this simply
> another example of why linear models of the logit and probit have dominated
> analysis of binary data for decades?

The "bumping predicted probabilities back into the range (0,1)" only occurs
during the iterative estimation. One of the dangers of this is that you can
end up with parameter estimates whose linear predictors have inverse link
transformations that really want to be outside the range (0,1). These
inverse links are bumped back into (0,1) so that you are able to calculate a
likelihood (or deviance) and have something to work with. The only
alternative is to produce an error. At the convergence step it is hoped that
no "bumping" was necessary, but if it was, then you get the behavior above.

Since an exponentiated linear predictor can take on any positive value, such
difficulty is just a fact of life when using the log link. You have a good
point: this is one good reason that logit and probit links are dominant.

In your analysis, I'll note there is only one problematic point, observation
71; all other observations give predicted probabilities in the proper range.
This observation has the largest value of -mpg- (41), and it can be argued
that this value is an outlier (or influential point in the context of this
model).

In summary, such behavior is not uncommon when using the binomial family
with a log link, especially in the case of outliers/influential points.

> By the way, note that if I fit the exact same model using -glm-, this same
> observation gets a predicted probability of 1.43, so -binreg- actually seems
> to do worse.

See the above.

--Bobby
rgutierrez@stata.com

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
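
The comparison described in this reply can be tried with a short do-file. What follows is only a sketch and is not part of the original message. It assumes a Stata release that provides -sysuse- (otherwise load your own copy of auto.dta with -use-, as in the original post). The variable names phat_irls, phat_ml, and phat_logit are made up for illustration. No output is shown, since the numbers may depend on the release and on how the fit converges.

```stata
* Sketch only: assumes -sysuse- is available; otherwise -use- your own auto.dta
sysuse auto, clear

* Log link fit by IRLS, the model the reply says -binreg ..., rr- passes to -glm-
glm foreign mpg, family(binomial) link(log) irls eform
predict phat_irls, mu

* Same model with the -glm- default (Newton-Raphson maximum likelihood)
glm foreign mpg, family(binomial) link(log) eform
predict phat_ml, mu

* List any observation whose fitted probability falls above 1
* (the !missing() guards matter because missing compares as larger than any number)
list make mpg phat_irls phat_ml ///
    if (phat_irls > 1 & !missing(phat_irls)) | (phat_ml > 1 & !missing(phat_ml))

* For contrast: with a logit link the fitted values cannot leave (0,1)
glm foreign mpg, family(binomial) link(logit)
predict phat_logit, mu
summarize phat_irls phat_ml phat_logit
```

The first two fits differ only in the optimizer, so any gap between phat_irls and phat_ml reflects the IRLS versus Newton-Raphson difference discussed above. The logit fit is included only to illustrate why that link avoids the out-of-range problem.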

