Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Simplification of formula in logistic regression

From	Marcello Pagano <[email protected]>
To	[email protected]
Subject	Re: st: Simplification of formula in logistic regression
Date	Mon, 16 May 2011 14:21:48 -0400

Before knocking this request too much further, one should consider the accuracy of the variables going into the equation. Something like blood pressure, which can be measured very accurately at any instant, can vary tremendously a minute later. One should not be fooled by the apparent accuracy of clinical measures. The granddaddy (or grandmom??) of all these is the Apgar score. Virginia Apgar wanted a measure of babies at birth based on what we would consider very, very loose measures (e.g. Reflex irritability, the response to skin stimulation of the feet: No response (score of 0); Some motion (score of 1); and Cry (score of 2); or Color: Blue or pale (score of 0); Body pink, extremities blue (score of 1); and Completely pink (score of 2)), and yet the use of this score has proven to be a great advance in pediatrics. An excellent read:

"The Score" by Atul Gawande
http://www.newyorker.com/archive/2006/10/09/061009fa_fact

m.p.



On 5/16/2011 1:40 PM, Ariel Linden, DrPH wrote:

As a health services researcher, I get frustrated by these requests. On the
one hand, we develop tools to maximize the accuracy of measurement, and on
the other hand, there is this constant desire to "dummy down" the
measurement instrument so that it can be "simple" for clinicians to use.

No matter that by dummying down the instrument, the accuracy likewise
diminishes.

I would suggest to Mikkel that you either remodel the data using "simple"
dichotomous terms, and accept that the accuracy of the model (e.g.
sensitivity/specificity) may be diminished, or more reasonably, you train
your clinicians how to use the instrument as it stands in its (presumably)
more accurate yet complex form.
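
The trade-off described above can be checked directly by comparing sensitivity and specificity at candidate cut-offs. The following is only a minimal sketch in Python: the function names `dichotomize` and `sensitivity_specificity` are hypothetical, and any data passed to them would have to come from the actual cohort.

```python
def dichotomize(values, cutoff):
    """Flag every value at or above the cut-off (hypothetical helper)."""
    return [v >= cutoff for v in values]


def sensitivity_specificity(flagged, outcome):
    """Return (sensitivity, specificity) for boolean flags vs. boolean outcomes."""
    pairs = list(zip(flagged, outcome))
    true_pos = sum(1 for f, y in pairs if f and y)
    false_neg = sum(1 for f, y in pairs if not f and y)
    true_neg = sum(1 for f, y in pairs if not f and not y)
    false_pos = sum(1 for f, y in pairs if f and not y)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one event and one non-event")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity
```

Running both functions over a range of cut-offs shows how much sensitivity is traded for specificity at each one, which is the loss a dichotomized model has to accept.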



Date: Sun, 15 May 2011 17:48:41 +0100
From: Nick Cox<[email protected]>
Subject: Re: st: Simplification of formula in logistic regression

Sorry, but I think you will continue to find this "correct way" to be elusive.

Nick

On Sun, May 15, 2011 at 4:23 PM, Mikkel Brabrand<[email protected]>
wrote:

If I want clinicians to use my model, it needs to be simple. I cannot
expect them to use a piece of software to calculate the risk score and it is
virtually impossible to have it incorporated in the programs used at my
department. I therefore need to simplify it and make the variables
categorized or dichotomous. I have previously used the trial and error way,
and come up with a model that seems reasonable (and tested it in an
independent cohort, and am now testing it in two external cohorts at other
hospitals). However, there must be a correct way to select the cut-off
levels, I just cannot find out how. I have asked most statisticians I have
met on my way, but no one seems to know how. I hoped that some of you might
have a suggestion...

Mikkel

Den 15/05/2011 kl. 16.49 skrev Nick Cox:

I don't know what "statistically correct" would mean here. If you
think your model is useful, there are no grounds for coarsening it. If
the implication is that clinicians can't understand or don't need to
understand the internals of the formula you can think of encapsulating
the details in a Stata do-file or some equivalent in other software.
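
As a rough illustration of "some equivalent in other software", the full formula can be wrapped in a small function so that clinicians only enter the measurements. This is a sketch in Python rather than a Stata do-file. The coefficients are the "something like" values quoted in the original question, not fitted estimates, the terms hidden behind "etc." are left out, and all names are hypothetical.

```python
import math

# Hypothetical coefficients echoing the illustrative formula in the question.
# They are not fitted values, and the remaining ("etc.") terms are omitted.
INTERCEPT = -12.22
COEFFICIENTS = {"systolic_bp": 2.33, "temperature": -1.21}


def linear_predictor(measurements):
    """Return the log-odds: intercept plus the sum of coefficient * value."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * measurements[name] for name in COEFFICIENTS
    )


def predicted_risk(measurements):
    """Convert the log-odds to a probability with the inverse logit."""
    xb = linear_predictor(measurements)
    # Use the algebraically equivalent form for negative xb so that
    # math.exp never receives a large positive argument.
    if xb >= 0:
        return 1.0 / (1.0 + math.exp(-xb))
    return math.exp(xb) / (1.0 + math.exp(xb))
```

A call such as `predicted_risk({"systolic_bp": ..., "temperature": ...})` then hides the arithmetic entirely, which is the point of encapsulating the model instead of coarsening it.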

A broad issue is that detailed models optimised to fit particular
datasets often perform poorly on other data.

Nick

On Sun, May 15, 2011 at 3:43 PM, Mikkel Brabrand<[email protected]>
wrote:

I have performed a logistic regression analysis including five variables
and one outcome. However, I would like to simplify the formula significantly
for clinical use. So, instead of the formula being something like
-12.22+2.33*systolic blood pressure-1.21*temperature etc., I would like to
make a scoring system where the score is calculated on the basis of the
measured values of the vital signs.

An example could be something like this

           2 points   1 point   0 points   1 point    2 points
Pulse      -30        31-50     51-100     101-200    201-
Sys. BP    -60        61-100    101-200    201-

However, I have no idea how to find the optimal cut-off points. Do any
of you have a suggestion for how to do this in a statistically correct way?
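
For readers who want to see the mechanics of such a scoring system, here is a minimal sketch in Python. The bands are copied from the example table in this message and are purely illustrative; they are not validated cut-offs. The names `SCORE_BANDS`, `points_for`, and `total_score` are hypothetical.

```python
import bisect

# Illustrative bands from the example table in the post. Each entry holds
# the upper bounds of every band except the last, then the points per band.
SCORE_BANDS = {
    "pulse": ([30, 50, 100, 200], [2, 1, 0, 1, 2]),
    "systolic_bp": ([60, 100, 200], [2, 1, 0, 1]),
}


def points_for(variable, value):
    """Look up the points for one measurement."""
    upper_bounds, points = SCORE_BANDS[variable]
    # bisect_left gives the index of the first bound >= value, so a value
    # equal to a bound stays in the lower band (e.g. pulse 30 scores 2).
    return points[bisect.bisect_left(upper_bounds, value)]


def total_score(measurements):
    """Sum the points over all measured vital signs."""
    return sum(points_for(name, value) for name, value in measurements.items())
```

Keeping the bands in a single table means candidate cut-offs can be changed and re-evaluated without touching the scoring logic. It does not answer the question of which cut-offs are optimal.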
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


Follow-Ups:
- RE: st: Simplification of formula in logistic regression
  - From: "David Radwin" <[email protected]>

References:
- Re: st: Simplification of formula in logistic regression
  - From: "Ariel Linden, DrPH" <[email protected]>
