The Stata listserver

Re: st: Transformed values in logistic regression


From   Fred Wolfe <fwolfe@arthritis-research.org>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Transformed values in logistic regression
Date   Fri, 27 Aug 2004 06:10:18 -0500

There are two interesting problems with creatinine. The first is that a doubling of creatinine at any level is considered to represent roughly a 50% loss of kidney function, so a change in creatinine is most meaningful relative to a baseline value.

The second problem is the distribution.

. qtiles creat,n(10)
. tabstat creat,by(q10_creat) s(min max n)

Summary for variables: creatinine
     by categories of: q10_creatinine (10 quantiles of creatinine)

q10_creatinine |       min       max         N
---------------+------------------------------
             1 |        .2        .7      1290
             3 |        .8        .8      1012
             5 |        .9        .9       934
             6 |         1         1       798
             8 |     1.025       1.2       879
             9 |       1.3       1.3       274
            10 |       1.4      15.7       533
---------------+------------------------------
         Total |        .2      15.7      5720
----------------------------------------------
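Groups 2, 4 and 7 are missing from the table because creatinine is recorded to one decimal place and tied values cannot be split across groups. A minimal sketch of the same effect, using the official -xtile- command on made-up data (the values and the variable name q10_creat are assumptions for illustration, and -xtile- may number tied groups differently from -qtiles-):

```stata
* Toy data, not the clinical data above: 100 made-up creatinine values
* recorded to one decimal place, so there are only six distinct values.
clear
set obs 100
generate double creat = cond(_n <= 20, 0.7, cond(_n <= 45, 0.8, ///
    cond(_n <= 65, 0.9, cond(_n <= 80, 1.0, cond(_n <= 90, 1.2, 1.5)))))

* Ask for 10 quantile groups with the official -xtile- command
xtile q10_creat = creat, nquantiles(10)

* Tied values always fall in the same group, so fewer than 10 groups appear
tabstat creat, by(q10_creat) statistics(min max n)
```

With only six distinct values there can be at most six non-empty groups, however many quantiles are requested.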

. su creat,d

                       Creatinine-Lab
-------------------------------------------------------------
      Percentiles      Smallest
 1%           .5             .2
 5%           .6             .2
10%           .7             .2       Obs                5720
25%           .8             .2       Sum of Wgt.        5720

50%           .9                      Mean           .9718925
                        Largest       Std. Dev.      .4218341
75%          1.1            5.3
90%          1.3             10       Variance        .177944
95%          1.5             11       Skewness       12.87933
99%          2.1           15.7       Kurtosis       361.7305

All of the change takes place (in my clinical data) in the 10th quantile group, and really only at around the 99th percentile.


So, depending on the purpose of the analyses and the distribution of your creatinines, perhaps excluding the highest values would be a valid approach.

Fred

At 04:29 AM 8/27/2004, you wrote:

I would try expressing creatinine in deciles. This gives a more intuitively
appealing measure than taking the log. You can also use -adjust- to output
predicted values for, say, the first and last deciles, which give a clear
idea of how important creatinine is in your model.
I'd agree with Ricardo and Richard Williams that the referee's argument appears odd, especially as the range of creatinine in your data spans more than a factor of 10. The referee's point seems to be that no intervention could conceivably change creatinine by a factor as large as 2.7 (roughly e, i.e. one unit of ln(creatinine)), but unless you're proposing such an intervention I don't see that that's relevant.

I think the simplest way around the referee may be to multiply your coefficients by, e.g., ln(1.5) to give log odds ratios for a 50% increase in creatinine. (Equivalently, divide the variable holding ln(creatinine) by ln(1.5) to give log-creatinine to base 1.5.)
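As a rough sketch of that rescaling, on simulated data only: the variable names y, creat, lncreat and log15creat, the seed, and the true slope of 1 are all assumptions made for illustration, not anything from the original analysis.

```stata
* Simulated data only; y, creat and the true slope of 1 are made up.
clear
set seed 20040827
set obs 5000
generate double creat = exp(rnormal(-0.1, 0.3))
generate double lncreat = ln(creat)
generate byte y = runiform() < invlogit(-2 + 1*lncreat)

* Fit on natural-log creatinine, then rescale the coefficient by ln(1.5)
quietly logit y lncreat
scalar b_e = _b[lncreat]
display "OR per 50% increase in creatinine: " exp(b_e*ln(1.5))

* Equivalent: refit with log-creatinine to base 1.5
generate double log15creat = lncreat/ln(1.5)
logit y log15creat, or
```

The odds ratio reported by the second model should match the displayed value, since the two fits differ only in the units of the covariate.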

The other alternative would be quantiles, but personally I'd prefer quarters or fifths to tenths. One problem with tenths is that a comparison between two tenths of creatinine uses only a fifth of your patients, so it loses power and your CIs will be wider. Another is that presenting results for all the tenths takes up a lot of space, while presenting only top vs bottom tenth hides a lot of information. A complication for quantiles in general is that you have to decide whether to base them on just the controls or on everyone studied; in a case-control study, the former is often preferred.
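One possible way to base fifths on the controls only is sketched below, again on simulated data. The variable names case, creat and fifth, the scalars cut1 to cut4, and the 500/1500 split are all hypothetical.

```stata
* Simulated case-control-style data; all names and values are made up.
clear
set seed 20040827
set obs 2000
generate byte case = _n <= 500
generate double creat = exp(rnormal(-0.1 + 0.2*case, 0.3))

* Cut points for fifths taken from the controls only
_pctile creat if case == 0, nquantiles(5)
forvalues k = 1/4 {
    scalar cut`k' = r(r`k')
}

* Assign everyone (cases and controls) to a fifth using those cut points
generate byte fifth = 1 + (creat > cut1) + (creat > cut2) ///
    + (creat > cut3) + (creat > cut4) if !missing(creat)

tabulate fifth case, column
```

By construction the controls split about evenly across the fifths, while the cases need not, which is what makes the comparison informative.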

Roger.

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/

Fred Wolfe
National Data Bank for Rheumatic Diseases
Wichita, Kansas
Tel (316) 263-2125     Fax (316) 263-0761
fwolfe@arthritis-research.org




