Re: st: Simplification of formula in logistic regression

From   Maarten Buis
Subject   Re: st: Simplification of formula in logistic regression
Date   Mon, 16 May 2011 09:35:06 +0200

--- On Sun, May 15, 2011 at 4:23 PM, Mikkel Brabrand wrote:
> If I want clinicians to use my model, it needs to be simple. I cannot expect them to use a piece of software to calculate the risk score and it is virtually impossible to have it incorporated in the programs used at my department. I therefore need to simplify it and make the variables categorized or dichotomous.

Splitting up your variable is not the only way to make your results
understandable to a lay public. Adding the variable as a continuous
covariate and then tabulating a few well-chosen example values, or
showing a carefully designed graph, can do just as well or even
better. See for example:
Nicola Orsini and Sander Greenland (2011) "A procedure to tabulate and
plot results after flexible modeling of a quantitative covariate",
The Stata Journal, 11(1): 1--29.
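
As a rough illustration of the tabulation idea, here is a minimal
Python sketch that turns a fitted logistic model with one continuous
covariate into a small lookup table of predicted risks. The
coefficients and the example values are made up for illustration only;
they do not come from any model discussed in this thread.

```python
import math

def predicted_risk(intercept, slope, x):
    """Inverse logit of the linear predictor for one continuous covariate."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def risk_table(intercept, slope, values):
    """Return (value, predicted risk) pairs for the chosen example values."""
    return [(x, predicted_risk(intercept, slope, x)) for x in values]

# Hypothetical coefficients on the log-odds scale (assumptions, not estimates).
INTERCEPT = -5.0
SLOPE = 0.05

# Hypothetical example values a clinician might want to look up.
EXAMPLE_VALUES = [40, 60, 80, 100]

table = risk_table(INTERCEPT, SLOPE, EXAMPLE_VALUES)

print(" value   risk")
for x, p in table:
    print(f"{x:>6}  {100 * p:5.1f}%")
```

A table like this can be printed on a pocket card, so the clinician
never has to evaluate the formula, and the model keeps the covariate
continuous.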

> I have previously used the trial and error way, and come up with a model that seems reasonable (and tested it in an independent cohort, and am now testing it in two external cohorts at other hospitals). However, there must be a correct way to select the cut-off levels, I just cannot find out how. I have asked most statisticians I have met on my way, but no one seems to know how. I hoped that some of you might have a suggestion...

Choosing the cut-off can be thought of as fitting a zeroth-order
spline with an unknown knot location. Within linear regression there
is a tradition of using -nl- to find such knot locations. The problem
is that in order to generalize that to logistic regression you would
want to use -ml-, and the first and second derivatives of the
likelihood function with respect to the knot location are not
continuous functions. This makes it impossible to use the standard
machinery for finding the maximum-likelihood solution, and even harder
to do inference on it. So I would just forget about that.
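
To make the problem with the derivatives concrete, here is a minimal
Python sketch on made-up toy data. It relies on the fact that a
logistic regression with a constant and a single dummy reproduces the
observed proportions in the two groups, so the maximized
log-likelihood for a given cut-off can be computed directly. The data
and the function names are assumptions for illustration, not anything
from -nl- or -ml-.

```python
import math

def group_loglik(events, n):
    """Maximized Bernoulli log-likelihood for one group (p-hat = events / n).

    For an empty group, or one with all 0s or all 1s, the supremum is 0.
    """
    if n == 0 or events == 0 or events == n:
        return 0.0
    p = events / n
    return events * math.log(p) + (n - events) * math.log(1 - p)

def profile_loglik(x, y, cut):
    """Maximized log-likelihood of a logit model with the dummy (x >= cut)."""
    low = [yi for xi, yi in zip(x, y) if xi < cut]
    high = [yi for xi, yi in zip(x, y) if xi >= cut]
    return group_loglik(sum(low), len(low)) + group_loglik(sum(high), len(high))

# Made-up toy data: x is the continuous covariate, y the binary outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 1, 0, 1]

# Moving the cut-off between two adjacent observations changes nothing:
# the profile log-likelihood is a step function of the cut-off.
flat_left = profile_loglik(x, y, 4.1)
flat_right = profile_loglik(x, y, 4.9)

# It only jumps when the cut-off crosses an observed value.
after_jump = profile_loglik(x, y, 5.5)

# Without usable derivatives, a search over the midpoints between
# adjacent sorted values is what remains.
xs = sorted(set(x))
midpoints = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
best_cut = max(midpoints, key=lambda c: profile_loglik(x, y, c))

print(flat_left, flat_right, after_jump, best_cut)
```

The flat stretches have derivative zero and the jumps have no
derivative at all, so a gradient-based maximizer gets no information
about where to move the cut-off. The grid search does return a number,
but it says nothing about the uncertainty in that cut-off, which is
the inference problem mentioned above.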

Hope this helps,

Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
