


Re: st: Simplification of formula in logistic regression

From   "Ariel Linden, DrPH" <>
To   <>
Subject   Re: st: Simplification of formula in logistic regression
Date   Mon, 16 May 2011 10:40:29 -0700

As a health services researcher, I get frustrated by these requests. On the
one hand, we develop tools to maximize the accuracy of measurement, and on
the other hand, there is this constant desire to "dummy down" the
measurement instrument so that it can be "simple" for clinicians to use. 

No matter that by dummying down the instrument, the accuracy likewise
diminishes.

I would suggest to Mikkel that you either remodel the data using "simple"
dichotomous terms, and accept that the accuracy of the model (e.g.
sensitivity/specificity) may be diminished, or more reasonably, you train
your clinicians how to use the instrument as it stands in its (presumably)
more accurate yet complex form.
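
The trade-off described above can be checked directly by computing sensitivity and specificity for both versions of the instrument against the same outcomes. The following is a minimal sketch in Python, not anything from the thread: the function name and the toy data are invented purely for illustration and are not results from any real cohort.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary outcomes coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative outcome")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: observed outcomes and the classifications produced
# by a full model and by a simplified (dichotomized) model.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
full_pred = [1, 1, 1, 0, 0, 0, 0, 1]
simple_pred = [1, 1, 0, 0, 0, 0, 1, 1]

print("full model:  ", sensitivity_specificity(y_true, full_pred))
print("simple model:", sensitivity_specificity(y_true, simple_pred))
```

Running the same comparison on an independent cohort, rather than on the data used to build the model, gives a fairer picture of how much accuracy the simplification costs.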


Date: Sun, 15 May 2011 17:48:41 +0100
From: Nick Cox <>
Subject: Re: st: Simplification of formula in logistic regression

Sorry, but I think you will continue to find this "correct way" to be elusive.


On Sun, May 15, 2011 at 4:23 PM, Mikkel Brabrand <>
> If I want clinicians to use my model, it needs to be simple. I cannot
expect them to use a piece of software to calculate the risk score and it is
virtually impossible to have it incorporated in the programs used at my
department. I therefore need to simplify it and make the variables
categorized or dichotomous. I have previously used the trial and error way,
and come up with a model that seems reasonable (and tested it in an
independent cohort, and am now testing it in two external cohorts at other
hospitals). However, there must be a correct way to select the cut-off
levels, I just cannot find out how. I have asked most statisticians I have
met on my way, but no one seems to know how. I hoped that some of you might
have a suggestion...
> Mikkel
> Den 15/05/2011 kl. 16.49 skrev Nick Cox:
>> I don't know what "statistically correct" would mean here. If you
>> think your model is useful, there are no grounds for coarsening it. If
>> the implication is that clinicians can't understand or don't need to
>> understand the internals of the formula you can think of encapsulating
>> the details in a Stata do-file or some equivalent in other software.
>> A broad issue is that detailed models optimised to fit particular
>> datasets often perform poorly on other data.
>> Nick
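
As a rough illustration of the "encapsulate the details" suggestion above, here is a minimal sketch in Python rather than a Stata do-file. The intercept and coefficients are only the placeholder numbers from the "something like" formula quoted further down the thread, not fitted values, and the names `linear_predictor` and `predicted_risk` are hypothetical.

```python
import math

# Placeholder values copied from the illustrative formula in the thread;
# a real model would supply its own fitted intercept and coefficients.
INTERCEPT = -12.22
COEFFICIENTS = {"systolic_bp": 2.33, "temperature": -1.21}


def linear_predictor(measurements):
    """Intercept plus the weighted sum of the measured values (the logit)."""
    return INTERCEPT + sum(
        coef * measurements[name] for name, coef in COEFFICIENTS.items()
    )


def predicted_risk(measurements):
    """Inverse-logit of the linear predictor, written to avoid overflow."""
    lp = linear_predictor(measurements)
    if lp >= 0:
        return 1.0 / (1.0 + math.exp(-lp))
    e = math.exp(lp)
    return e / (1.0 + e)
```

The person entering the measurements only calls `predicted_risk`; the formula itself stays hidden inside the function.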
>> On Sun, May 15, 2011 at 3:43 PM, Mikkel Brabrand <>
>>> I have performed a logistic regression analysis including five variables
and one outcome. However, I would like to simplify the formula significantly
for clinical use. So, instead of the formula being something like
-12.22+2.33*systolic blood pressure-1.21*temperature etc., I would like to
make a scoring system where the score is calculated on basis of the measured
values of the vital signs.
>>> An example could be something like this
>>>            2 points   1 point   0 points   1 point    2 points
>>> Pulse      -30        31-50     51-100     101-200    201-
>>> Sys. BP    -60        61-100    101-200    201-
>>> However, I have no idea how to find the optimal cut-off points. Do any
of you have a suggestion how to do this statistically correct?
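
To make the mechanics of such a scoring table concrete, here is a minimal sketch in Python of how points could be looked up and summed. It uses only the example bands from the table above, which were offered as an illustration and are not validated cut-offs; the names `BANDS`, `points` and `total_score` are hypothetical, and whole-number measurements are assumed. It does not address how the cut-off points should be chosen.

```python
import bisect

# For each variable: inclusive upper bounds of the bands, then the points
# for each band (one more entry than bounds, for the open-ended top band).
# Bands are the illustrative ones from the example table, nothing more.
BANDS = {
    "pulse": ([30, 50, 100, 200], [2, 1, 0, 1, 2]),
    "sys_bp": ([60, 100, 200], [2, 1, 0, 1]),
}


def points(variable, value):
    """Look up the points for one measurement."""
    upper_bounds, band_points = BANDS[variable]
    # bisect_left returns the index of the first band whose upper bound
    # is >= value, so a value equal to a bound stays in the lower band.
    return band_points[bisect.bisect_left(upper_bounds, value)]


def total_score(measurements):
    """Sum the points over all measured variables."""
    return sum(points(name, value) for name, value in measurements.items())


print(total_score({"pulse": 120, "sys_bp": 55}))  # 1 point + 2 points
```

Keeping the bands in a single table like `BANDS` means alternative cut-offs can be swapped in and compared on a validation cohort without changing the scoring code.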

