# Re: st: Transformed values in logistic regression

From: Roger Harbord <[email protected]>
To: [email protected]
Subject: Re: st: Transformed values in logistic regression
Date: Fri, 27 Aug 2004 10:29:33 +0100

--On 27 August 2004 09:14 +0100 "Ronán Conroy" <[email protected]> wrote:

> Ricardo Ovaldia wrote:
>
>> Specifically we were interested in modeling
>> case-control status as a function of several patient
>> covariates including serum creatinine which in our
>> data ranges from 0.11 to 1.98.
>> Because of skewness and to make the odds ratio
>> independent of the units measurement, we decided to
>> log-transform the creatinine values before entering
>> them into our logistic model. However the reviewer
>> wrote "Using a log-transform for creatine is absurd
>> because a 1-unit increase in ln(x) is equivalent to
>> increasing x by a factor of 2.718 which is in the
>> realm of impossibility"
>
> I would try expressing creatinine in deciles. This gives a more intuitively
> appealing measure than taking the log. You can also output, using -adjust-
> predicted values for, say, the first and last deciles, which give a clear
> idea of how important creatinine is in your model.

I'd agree with Ricardo and Richard Williams that the referee's argument appears odd, especially as the range of creatinine in your data (0.11 to 1.98) is more than a factor of 10. The referee's point seems to be that no intervention could conceivably change creatinine by a factor as large as 2.7, but unless you're proposing such an intervention I don't see that it's relevant. I think the simplest way around the referee may be to multiply the coefficient for ln(creatinine), and its confidence limits, by e.g. ln(1.5) to give the log odds ratio for e.g. a 50% increase in creatinine. (Equivalently, divide the variable holding ln(creatinine) by ln(1.5) to give log-creatinine to base 1.5, so that a one-unit increase in the new variable corresponds to a 50% increase in creatinine.)
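The rescaling arithmetic can be checked with a short sketch. This is Python rather than Stata, the function name `rescale_log_coefficient` is made up for illustration, and the coefficient and standard error are invented numbers, not results from Ricardo's model.

```python
import math

def rescale_log_coefficient(beta, se, factor=1.5, z=1.96):
    """Turn a logit coefficient on ln(x) into an odds ratio for a
    `factor`-fold increase in x, with an approximate 95% CI.

    beta and se are the coefficient and standard error for ln(x).
    """
    scale = math.log(factor)      # ln(1.5) for a 50% increase
    log_or = beta * scale         # log odds ratio per factor-fold increase
    log_se = se * scale           # the standard error scales the same way
    return (
        math.exp(log_or),
        math.exp(log_or - z * log_se),
        math.exp(log_or + z * log_se),
    )

# Hypothetical values, for illustration only
beta, se = 0.80, 0.25
odds_ratio, ci_low, ci_high = rescale_log_coefficient(beta, se)
print(f"OR per 50% increase: {odds_ratio:.2f} ({ci_low:.2f} to {ci_high:.2f})")
```

Setting `factor=math.e` recovers the usual exp(beta), i.e. the odds ratio per 2.718-fold increase that the referee objected to. The fit itself is unchanged; only the reporting unit differs.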

The other alternative would be quantiles, but personally I'd prefer quarters or fifths to tenths. One problem with tenths is that a comparison between any two tenths of creatinine uses only a fifth of your patients, so it loses power and your CIs will be wider. Another is that presenting results for all the tenths takes up a lot of space, while presenting only top vs bottom tenth hides a lot of information. A complication for quantiles in general is that you have to decide whether to base the cutpoints on just the controls or on everyone studied; in a case-control study, the former is often preferred.
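As a rough sketch of basing the cutpoints on the controls only, the following Python (not Stata) computes fifths from the controls and then assigns every subject, case or control, to a group. The function name `quantile_groups` and the creatinine values are invented for illustration. The default method of `statistics.quantiles` is one of several quantile definitions, so cutpoints may differ slightly from those given by other software.

```python
import bisect
import statistics

def quantile_groups(values, is_control, n_groups=5):
    """Assign each value to a quantile group (1..n_groups) using
    cutpoints estimated from the controls only."""
    controls = [v for v, c in zip(values, is_control) if c]
    # n_groups - 1 cutpoints, computed from the control distribution
    cutpoints = statistics.quantiles(controls, n=n_groups)
    # A value equal to a cutpoint is placed in the higher group
    groups = [bisect.bisect_right(cutpoints, v) + 1 for v in values]
    return cutpoints, groups

# Hypothetical creatinine values: 10 controls followed by 5 cases
creatinine = [0.11, 0.25, 0.40, 0.55, 0.70, 0.85, 1.00, 1.20, 1.50, 1.98,
              0.90, 1.30, 1.60, 1.75, 1.90]
control = [True] * 10 + [False] * 5

cutpoints, groups = quantile_groups(creatinine, control)
print(cutpoints)
print(groups)
```

With control-based cutpoints the controls are split roughly evenly across the groups, while the cases need not be, which is the contrast the analysis is meant to show.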

Roger.
