
From: Constantine Daskalakis <C_Daskalakis@mail.jci.tju.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Binomial regression
Date: Fri, 03 Aug 2007 11:35:23 -0400

Hey Marcello.

Shouldn't you be on vacation? :)

Certainly, you can recover the absolute risk for each covariate pattern (or observation) from a logistic regression, but what about the risk DIFFERENCE (RD)?

The point is this:

If the Xs are additive on the log-odds scale, we can fit a logistic regression and report a single odds ratio (OR) for each X as a summary measure. But additivity on the log-odds scale implies non-additivity on the probability scale, so there is no single RD for X: it depends on the value of X itself and on the values of the other covariates.
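To make this concrete, here is a minimal sketch. The coefficients are hypothetical, picked only for illustration, and `x` and `z` stand for two binary covariates in a main-effects logistic model. The OR for `x` is the same at both levels of `z`, while the RD for `x` is not.

```python
import math

def expit(eta):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    return p / (1.0 - p)

# Hypothetical main-effects logistic model: logit(p) = b0 + b1*x + b2*z
b0, b1, b2 = -2.0, 1.0, 1.5

def risk(x, z):
    return expit(b0 + b1 * x + b2 * z)

# Effect of x (1 vs 0) at each level of the other covariate z
or_by_z = {z: odds(risk(1, z)) / odds(risk(0, z)) for z in (0, 1)}
rd_by_z = {z: risk(1, z) - risk(0, z) for z in (0, 1)}

for z in (0, 1):
    print(f"z={z}: OR for x = {or_by_z[z]:.3f}, RD for x = {rd_by_z[z]:.3f}")
```

The OR equals exp(b1) in both strata by construction; the RD changes because the baseline risk changes with `z`.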

But suppose the Xs are (more nearly) additive on the original probability scale rather than the log-odds scale. Then it would be best to report a single RD for each X as the summary measure of effect (rather than a single OR, which is not really appropriate in this situation). Now, from a logistic regression without interactions, we get a single OR for each X, but a different RD for that X at every covariate pattern. That's not very good. On the other hand, we can retrieve a single summary RD from a main-effects-only binomial regression model with the identity link. And, by the way, a logistic regression would need a bunch of interactions to have comparable fit to this binomial regression, and would still not provide us with a single summary RD.
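The reverse situation can be sketched the same way. Assume (hypothetically) that the true risks are exactly additive in two binary covariates `x` and `z`; the numbers below are made up for illustration. The RD for `x` is then a single number, but reproducing these risks on the log-odds scale requires a nonzero interaction term.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

# Hypothetical identity-link (additive-risk) model: p = a0 + a1*x + a2*z
a0, a1, a2 = 0.10, 0.10, 0.20

def risk(x, z):
    return a0 + a1 * x + a2 * z

# RD for x at z=0 and at z=1: both equal a1
rd_x = [risk(1, z) - risk(0, z) for z in (0, 1)]

# Log OR for x at z=0 and at z=1: these differ
log_or_x = [logit(risk(1, z)) - logit(risk(0, z)) for z in (0, 1)]

# The x-by-z interaction a saturated logistic model would need
# in order to reproduce these four risks exactly
interaction = log_or_x[1] - log_or_x[0]

print("RD for x by z:    ", [round(v, 4) for v in rd_x])
print("log OR for x by z:", [round(v, 4) for v in log_or_x])
print("logit-scale interaction:", round(interaction, 4))
```

With more covariates, the number of interaction terms needed on the logit scale grows, which is the "bunch of interactions" point above.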

There's also the causal inference literature, which argues for causal interpretations of the RD but not of the risk ratio (RR) or OR. So, that's additional motivation for focusing on the RD.

Bottom line: The goal here (get a single summary RD) CANNOT be achieved via logistic regression.

Finally, there's nothing about a "uniform distribution" here. It's just a generalized linear model with the identity link -- a different model for how the risk changes as a function of covariates (with the same binomial error structure that logistic regression uses). There's no a priori substantive reason to prefer one link over another. Why would the risk follow the logistic function rather than any other curve (including a line)? The appropriate link function will depend on the data at hand. Sometimes it will be a line, sometimes a logistic, sometimes some other unknown beast.
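As a rough sketch of "same binomial likelihood, different link", the code below fits both links to one small grouped data set by Fisher scoring (IRLS). Everything here is an assumption made for illustration: the data are invented so that the observed risks are exactly linear in `x`, and `fit_binomial_glm` is a hypothetical helper, not any package's API. It has no safeguard against fitted risks leaving (0, 1) under the identity link, which a real implementation would need.

```python
import math

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def fit_binomial_glm(x, events, trials, inv_link, dmu_deta, start, n_iter=25):
    """Fisher scoring for a binomial GLM with an intercept and one covariate."""
    b0, b1 = start
    for _ in range(n_iter):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi, ni in zip(x, events, trials):
            eta = b0 + b1 * xi
            mu = inv_link(eta)
            d = dmu_deta(eta)                    # d(mu)/d(eta) for this link
            w = ni * d * d / (mu * (1.0 - mu))   # IRLS weight
            z = eta + (yi / ni - mu) / d         # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 weighted least-squares normal equations
        det = s_w * s_wxx - s_wx * s_wx
        b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        b1 = (s_w * s_wxz - s_wx * s_wz) / det
    return b0, b1

def log_lik(x, events, trials, inv_link, coef):
    """Binomial log-likelihood (constant term dropped); identical for every link."""
    b0, b1 = coef
    total = 0.0
    for xi, yi, ni in zip(x, events, trials):
        mu = inv_link(b0 + b1 * xi)
        total += yi * math.log(mu) + (ni - yi) * math.log(1.0 - mu)
    return total

# Invented grouped data: observed risks 0.1, 0.2, 0.3, 0.4 are linear in x
x = [0, 1, 2, 3]
events = [10, 20, 30, 40]
trials = [100, 100, 100, 100]
p_bar = sum(events) / sum(trials)

identity_fit = fit_binomial_glm(
    x, events, trials,
    inv_link=lambda eta: eta, dmu_deta=lambda eta: 1.0,
    start=(p_bar, 0.0))
logit_fit = fit_binomial_glm(
    x, events, trials,
    inv_link=expit, dmu_deta=lambda eta: expit(eta) * (1.0 - expit(eta)),
    start=(math.log(p_bar / (1.0 - p_bar)), 0.0))

ll_identity = log_lik(x, events, trials, lambda eta: eta, identity_fit)
ll_logit = log_lik(x, events, trials, expit, logit_fit)

print("identity link: RD per unit x =", round(identity_fit[1], 4))
print("logit link:    OR per unit x =", round(math.exp(logit_fit[1]), 4))
print("log-likelihoods (identity, logit):", round(ll_identity, 3), round(ll_logit, 3))
```

On data like these the identity-link slope is the summary RD directly, and the main-effects logit model fits slightly worse. On data generated from a logistic curve, the comparison would go the other way.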

I think that we are getting to the point where logistic has become so ingrained that many people think of it (unconsciously?) as "have screwdriver, will always use screwdriver (with binary outcome)." The logit is the canonical link function for the Bernoulli/binomial, and implementing regression with the logit link is easier than with anything else. But that's just ease of programming and custom.

For history buffs, this goes back to Cornfield (Bull Int Statist Inst, 1961), who 'tricked' programs designed for discriminant analysis into doing logistic regression, and to Walker & Duncan (Biometrika, 1967), who looked at the maximum likelihood approach. At the time, lack of computing resources made such things impractical for other types of regression for binary outcomes. But that was nearly half a century ago.

CD

On 8/2/2007 10:45 PM, Marcello Pagano wrote:

Sorry to disagree with your first sentence, Constantine.

Logistic regression stipulates a linear relationship of covariates with the log of the odds of an event (not odds ratios). From this it is straightforward to recover the probability (or risk, if you prefer that label) of the event.

Don't understand your aversion to logistic regression to achieve what you want to achieve.

If you don't like the shape of the logistic, then any other cdf will provide you with a transformation to obey the constraints inherent in modeling a probability. The uniform distribution that you wish to use has to be curtailed, as others have pointed out.

m.p.

Constantine Daskalakis wrote:

No argument about logistic regression. But that gives you odds ratios. What if you want risk differences instead?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

--
Constantine Daskalakis, ScD
Assistant Professor, Thomas Jefferson University, Division of Biostatistics
1015 Chestnut St., Suite M100, Philadelphia, PA 19107
Tel: 215-955-5695
Fax: 215-503-3804
Email: c_daskalakis@mail.jci.tju.edu
Webpage: http://www.jefferson.edu/clinpharm/biostatistics/

