[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
rgutierrez@stata.com (Roberto G. Gutierrez, StataCorp LP) |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: Binomial regression |

Date |
Fri, 03 Aug 2007 16:23:53 -0500 |

Vince Wiggins and I responded on this thread yesterday, in which one of our main points was that forcing convergence when an identity link may not be appropriate results in a loss of model interpretation. This is a case where not only is there no free lunch, but the exact price of that lunch is unclear. We mentioned various ways that convergence could be achieved. One method I did not mention is the result of similar work I have done working with the log link function. This method carries with it an ease of interpretation (akin to knowing the price of your lunch) and involves using a modified version of the identity link, something I call the "Identity/logit" link. The Identity/Logit link is a simple modification of the standard identity link that turns into a logit link outside a prespecified range, say outside (0.01, 0.99), the default. Inside this range, you have the identity function; outside you have a logit function translated and scaled so as to be continuous and have continuous derivatives at the endpoints. Since the link function turns to logit at the endpoints, the range of the identity/logit inverse link is within (0,1), meaning that you won't get inadmissible predicted probabilities and convergence can be achieved over a much wider variety of datasets. The only drawback is one of interpretation, however, the interpretation here is clear: When the predicted probability is in the range (0.01, 0.99), your coefficients reflect risk differences, otherwise, they are (rescaled) log(odds ratio)'s. In other words, you need only qualify your treatment of risk differences to state that they are valid for risks in the range (0.01, 0.99) rather than (0,1). To me this seems better than trying to interpret the fruits of an ad hoc adjustment. I attach the code for -identitylogit- below. A simple demonstration . sysuse auto . glm for price mpg, fam(bin) link(identity) < no convergence > . glm for price mpg, fam(bin) link(identitylogit) nolog Generalized linear models No. of obs = 74 Optimization : ML Residual df = 71 Scale parameter = 1 Deviance = 70.50849514 (1/df) Deviance = .9930774 Pearson = 68.72329876 (1/df) Pearson = .9679338 Variance function: V(u) = u*(1-u) [Bernoulli] Link function : g(u) = u -> logit(u) [Identity (0.010,0.990)] AIC = 1.033899 Log likelihood = -35.25424757 BIC = -235.0801 ------------------------------------------------------------------------------ | OIM foreign | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- price | .0000516 .0000157 3.29 0.001 .0000209 .0000823 mpg | .0471231 .0093466 5.04 0.000 .0288041 .0654421 _cons | -1.046464 .2299337 -4.55 0.000 -1.497126 -.5958023 ------------------------------------------------------------------------------ The identity/logit link also accepts an optional argument, p, for controlling the range of the identity part of this link. For example, . glm for price mpg, fam(bin) link(identitylogit 0.05) would give a valid range for risk differences of (0.05, 0.95) rather than the default (0.01, 0.99). --Bobby rgutierrez@stata.com --------------------------------begin indentitylogit.ado--------------------- *! version 1.0.0 03aug2007 program define identitylogit version 8.2 args todo eta mu return if `todo' == -1 { /* Title */ if "$SGLM_p" == "" { global SGLM_p 0.01 } else { confirm number $SGLM_p if $SGLM_p <= 0 | $SGLM_p >= 0.5 { di as error /// "link parameter p must be between 0 and 0.5" exit 459 } } if "$SGLM_m" == "1" { global SGLM_lf "u -> logit(u)" } else { global SGLM_lf "u/$SGLM_m -> logit(u/$SGLM_m)" } local a1 : di %5.3f $SGLM_p local a2 : di %5.3f 1-$SGLM_p global SGLM_lt "Identity (`a1',`a2')" exit } tempname a b c d gp p scalar `p' = $SGLM_p scalar `gp' = logit(`p') scalar `b' = `p'*(1-`p') scalar `a' = `p' - `b'*`gp' scalar `d' = `b' scalar `c' = 1-`p' - `d'*logit(1-`p') if `todo' == 0 { // eta = g(mu) tempvar eta2 gen double `eta' = `mu'/$SGLM_m gen double `eta2' = `eta' replace `eta' = logit(`mu'/$SGLM_m)*`b' + `a' /// if `eta2' <= `p' replace `eta' = logit(`mu'/$SGLM_m)*`d' + `c' /// if `eta2' >= 1-`p' exit } if `todo' == 1 { // mu = g^-1(eta) gen double `mu' = $SGLM_m*`eta' replace `mu' = $SGLM_m*invlogit((`eta'-`a')/`b') /// if `eta' <= `p' replace `mu' = $SGLM_m*invlogit((`eta'-`c')/`d') /// if `eta' >= 1-`p' exit } if `todo' == 2 { // (d mu)/(d eta) gen double `return' = $SGLM_m replace `return' = (`mu'*(1-`mu'/$SGLM_m)) / `b' /// if `eta' <= `p' replace `return' = (`mu'*(1-`mu'/$SGLM_m)) / `d' /// if `eta' >= 1-`p' exit } if `todo' == 3 { // (d^2 mu)(d eta^2) gen double `return' = 0 replace `return' = (`mu'*(1-`mu'/$SGLM_m)* /// (1-2*`mu'/$SGLM_m)) / (`b'^2) /// if `eta' <= `p' replace `return' = (`mu'*(1-`mu'/$SGLM_m)* /// (1-2*`mu'/$SGLM_m)) / (`d'^2) /// if `eta' >= 1-`p' exit } noi di as err "Unknown call to glm link function" exit 198 end ---------------------------------end identitylogit.ado----------------------- * * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

- Prev by Date:
**Re: st: Binomial regression** - Next by Date:
**st: Constrained ARCH in STATA** - Previous by thread:
**Re: st: Binomial regression** - Next by thread:
**Re: st: Binomial regression** - Index(es):

© Copyright 1996–2015 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |