st: Unobserved heterogeneity in logistic regression
--- "daniel waxman" <firstname.lastname@example.org> wrote:
Maarten Buis directed me to a short paper of his: "Unobserved heterogeneity
in logistic regression":
The concept makes sense--the question is what to do about it.
I am using in-hospital mortality as an outcome in a multivariable logistic
model, focusing on a particular laboratory test (troponin I) as a predictor
(either with simple log transformation, or using -mfp-). I test
independence by doing nested logistic models with every other mortality
predictor that I can find (some continuous, some dichotomous), and the odds
ratio for the test of interest remains stable (and Hosmer Lemeshow goodness
of fit stats do not reject the models). My sample sizes are on the order of
10,000-30,000 observations per data set.
The overall mortality is 3.3% and the predictor of interest is strongly
skewed to the left (see below).
--- end of quote ---
I would just add that another way to deal with a skewed predictor is to
generate a categorical variable, and from it a set of indicator variables, so
that you can look for clinically meaningful cut points in the relationship
between the outcome and the predictor. You have enough data not to worry about
the added degrees of freedom. Also, an area under the ROC curve of 0.85 is
getting pretty darn close to the limit of what you can get with a real-world
data set. Finally, with big data sets the H-L test statistic may sometimes
become significant even when your model fits well across the range of risk,
because with that many observations the test can pick up very small
departures from perfect calibration.
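To make the categorization idea concrete, here is a minimal sketch in Python (standard library only). It is not taken from the original analysis: the data are simulated, and the function names `simulate` and `mortality_by_group`, the lognormal parameters, and the choice of five equal-size groups are all assumptions made for illustration. It sorts a skewed predictor, splits it into groups, and tabulates observed mortality in each group, which is the kind of table one might inspect before choosing clinically meaningful cut points. In Stata, something like -xtile- followed by factor-variable notation would probably do a similar job.

```python
import math
import random


def simulate(n, seed=0):
    """Simulate hypothetical data: a skewed lab value and a rare binary outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Lognormal draw gives a strongly skewed predictor (made-up parameters).
        x = rng.lognormvariate(-3.0, 1.5)
        # Assumed true model: log-odds of death are linear in log(x).
        logit = -2.0 + 0.5 * math.log(x)
        p = 1.0 / (1.0 + math.exp(-logit))
        died = 1 if rng.random() < p else 0
        rows.append((x, died))
    return rows


def mortality_by_group(rows, n_groups=5):
    """Split observations into equal-size groups by predictor value and
    return the range, size, deaths, and observed death rate of each group."""
    ordered = sorted(rows)
    n = len(ordered)
    table = []
    for g in range(n_groups):
        chunk = ordered[g * n // n_groups:(g + 1) * n // n_groups]
        deaths = sum(d for _, d in chunk)
        table.append({
            "group": g + 1,
            "lo": chunk[0][0],
            "hi": chunk[-1][0],
            "n": len(chunk),
            "deaths": deaths,
            "rate": deaths / len(chunk),
        })
    return table


rows = simulate(20000)
table = mortality_by_group(rows)
for r in table:
    print(f"group {r['group']}: {r['lo']:.4f} to {r['hi']:.4f}  "
          f"n={r['n']}  deaths={r['deaths']}  rate={r['rate']:.3f}")
```

Each group would then enter the logistic model as an indicator variable, with one group left out as the reference category. Equal-size groups are only a starting point; cut points chosen on clinical grounds may be more useful.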
Dartmouth Hitchcock Medical Center
Clinical Research/Department of Medicine
One Medical Center Drive
Lebanon, NH 03756
Fax (603) 653-3554