
Re: st: Simplification of formula in logistic regression


From   Maarten Buis <maartenlbuis@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Simplification of formula in logistic regression
Date   Mon, 16 May 2011 09:35:06 +0200

--- On Sun, May 15, 2011 at 4:23 PM, Mikkel Brabrand wrote:
> If I want clinicians to use my model, it needs to be simple. I cannot expect them to use a piece of software to calculate the risk score and it is virtually impossible to have it incorporated in the programs used at my department. I therefore need to simplify it and make the variables categorized or dichotomous.

Splitting up your variable is not the only way to make your results
understandable to a lay public. Adding it as a continuous variable and
then tabulating a couple of well-chosen example values, or showing a
carefully designed graph, can do just as well or even better. See for
example:
Nicola Orsini and Sander Greenland (2011) "A procedure to tabulate and
plot results after flexible modeling of a quantitative covariate"
The Stata Journal, 11(1): 1--29.
http://www.stata-journal.com/article.html?article=st0215
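
To illustrate what tabulating a few example values could look like, here is a minimal sketch in Python (not Stata, and not the procedure from the article above). The coefficients, the covariate, and the example values are all hypothetical; in practice they would come from your fitted model.

```python
import math

def predicted_risk(x, intercept, slope):
    """Predicted probability from a logistic model with one continuous covariate."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

# Hypothetical coefficients, for illustration only (not estimated from any data)
INTERCEPT = -4.0
SLOPE = 0.05

# A few example values of the hypothetical covariate (say, age in years)
example_values = [40, 60, 80, 100]

# Print a small lookup table that could be handed to clinicians
print(f"{'value':>6}  {'risk':>6}")
for value in example_values:
    risk = predicted_risk(value, INTERCEPT, SLOPE)
    print(f"{value:>6}  {risk:>6.1%}")
```

A table like this keeps the variable continuous in the model while still giving readers a handful of concrete numbers to look up.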

> I have previously used the trial and error way, and come up with a model that seems reasonable (and tested it in an independent cohort, and am now testing it in two external cohorts at other hospitals). However, there must be a correct way to select the cut-off levels, I just cannot find out how. I have asked most statisticians I have met on my way, but no one seems to know how. I hoped that some of you might have a suggestion...

This can be thought of as a zero-th order spline with unknown knot
location. Within linear regression there is a tradition of using -nl-
to find such knot locations. The problem is that in order to
generalize that to logistic regression you would want to use -ml- and
the first and second derivatives of the likelihood function with
respect to the knot location are not continuous functions. This makes
it impossible to use the standard machinery of finding the
maximum-likelihood solution, and even harder to do inference on it. So
I would just forget about that.
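
The problem can be seen in a small sketch, written in Python rather than Stata and using made-up data. With a single dichotomized predictor, the maximum-likelihood fitted probabilities of a logistic regression are just the observed proportions in the two groups, so the log likelihood can be profiled over the cut-off directly. The function names and the data are hypothetical.

```python
import math

def group_loglik(successes, n):
    """Maximized Bernoulli log likelihood for one group (fitted p = successes / n)."""
    if n == 0 or successes == 0 or successes == n:
        return 0.0  # fitted p is 0 or 1 (or the group is empty): contribution is 0
    p = successes / n
    return successes * math.log(p) + (n - successes) * math.log(1 - p)

def profile_loglik(x, y, cut):
    """Log likelihood of a logistic model with the single predictor (x > cut)."""
    low = [yi for xi, yi in zip(x, y) if xi <= cut]
    high = [yi for xi, yi in zip(x, y) if xi > cut]
    return group_loglik(sum(low), len(low)) + group_loglik(sum(high), len(high))

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

# Moving the cut-off between two adjacent observed values changes nothing;
# the log likelihood only jumps when the cut-off crosses an observation.
for cut in [3.2, 3.5, 3.8, 4.5]:
    print(f"cut = {cut:3.1f}  log likelihood = {profile_loglik(x, y, cut):.4f}")
```

Because the profile is flat between observed values and jumps at them, gradient-based maximizers have nothing to work with. A grid search over candidate cut-offs would still return a maximizing value, but that does not solve the inference problem.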

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany


http://www.maartenbuis.nl
--------------------------


