# st: RE: Log transform- SE or std

From: Roger Newson
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Log transform- SE or std
Date: Mon, 23 Jun 2003 20:42:19 +0100

```At 19:52 23/06/03 +0100, Nick Cox wrote (in reply to Ricardo Ovaldia):
```
```Ricardo Ovaldia

> I used a log transform to normalize the distribution
> of a biochemical substance. I then use a ttest to
> compare the transformed mean of cases to controls. I
> now want to present my results listing the means, stds
> and the p-value. I could present the original
> (untransformed) means and stds, but I think that is
> misleading. Now, I know that the transformed mean is
> the geometric mean, so no problem there, but what
> about the standard error. How can I "untransform" it?

I am not at all clear that the se has an analogue
on the original scale.

I suggest that it is easier, and possibly
closer to the underlying problem, to exponentiate
the confidence intervals given by -ttest- for
the transformed means.

Another way to approach this is by

by analogy with the principle (e.g. Conroy Stata Journal
2002) that a t test is equivalent to a regression
on a binary explanatory variable.
```
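As a rough illustration of the suggestion to exponentiate the -ttest- confidence limits, here is a minimal sketch. It assumes the auto dataset shipped with Stata as stand-in data, with -price- playing the role of the biochemical measurement and -foreign- the case/control indicator; the names -logprice-, -hw-, -gm1-, -lo1- and -hi1- are made up for this example.

```stata
* Stand-in data (an assumption): substitute your own response and group variable
sysuse auto, clear
generate double logprice = ln(price)

* t test on the log scale; -ttest- displays a CI for each group's mean log
ttest logprice, by(foreign)

* Rebuild the 95% CI for the first group from the stored results,
* then exponentiate to get the geometric mean and its confidence limits
scalar hw  = invttail(r(N_1) - 1, 0.025) * r(sd_1) / sqrt(r(N_1))
scalar gm1 = exp(r(mu_1))
scalar lo1 = exp(r(mu_1) - scalar(hw))
scalar hi1 = exp(r(mu_1) + scalar(hw))
display "Group 1 geometric mean: " scalar(gm1) "  95% CI: " scalar(lo1) " to " scalar(hi1)
```

The exponentiated limits are not symmetric about the geometric mean, which is expected: the interval is symmetric only on the log scale.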
Yet another way to do this is

generate byte baseline=1
regress response casecontrol baseline, noconst eform(GM/Ratio)

which displays confidence intervals for the baseline geometric mean and the geometric mean ratio directly, assuming -response- holds the log-transformed values. This works by creating a constant X-variable -baseline- and including it in the model, fooling Stata into thinking that the model has no constant. (If you use the -eform- option without -noconst-, then the constant parameter is not shown.)
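A self-contained sketch of the same trick, again assuming the auto dataset as stand-in data (-logprice- is the logged response and -foreign- stands in for -casecontrol-; both are choices made for this example only):

```stata
* Stand-in data (an assumption): substitute your own variables
sysuse auto, clear
generate double logprice = ln(price)

* Constant X-variable that takes the place of the intercept
generate byte baseline = 1

* Exponentiated coefficient on -baseline-: geometric mean where foreign == 0
* Exponentiated coefficient on -foreign-:  geometric mean ratio between groups
regress logprice foreign baseline, noconstant eform(GM/Ratio)
```

The confidence limits shown here rest on the pooled residual SD from the regression, so they need not match limits computed separately within each group.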

Some very useful information on the lognormal distribution and its parameters can be found on Stas Kolenikov's web page at

http://www.komkon.org/~tacik/

For instance, you will find out there that the coefficient of variation of the lognormal distribution is given by

CV = sqrt(exp(sigma^2) - 1)

or, alternatively,

sigma = sqrt(log(CV^2 + 1))

where sigma is the SD of the natural logs. I personally tend to use the coefficient of variation as my favourite dispersion measure for a lognormal distribution, especially when I am reporting power calculations. However, other people seem to like the geometric SD, defined as the natural antilog of the SD of the natural logs, or inter-percentile ratios.
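These dispersion measures can be computed from the SD of the logs in a few lines. The sketch below assumes the auto dataset as stand-in data; the scalar names -sigma-, -cv- and -gsd- are made up for this example.

```stata
* Stand-in data (an assumption): substitute your own variable for -price-
sysuse auto, clear
generate double logprice = ln(price)

* sigma is the SD of the natural logs
quietly summarize logprice
scalar sigma = r(sd)

* Lognormal coefficient of variation and geometric SD implied by sigma
scalar cv  = sqrt(exp(scalar(sigma)^2) - 1)
scalar gsd = exp(scalar(sigma))
display "sigma = " scalar(sigma) "  CV = " scalar(cv) "  geometric SD = " scalar(gsd)

* Round trip: recover sigma from the CV using the second formula
display "sigma from CV = " sqrt(ln(scalar(cv)^2 + 1))
```

Note that this CV is the one implied by the lognormal model, not the sample SD divided by the sample mean on the original scale; the two can differ noticeably in small or skewed samples.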

I hope this helps.

Roger

--
Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
King's College London
5th Floor, Capital House
42 Weston Street
London SE1 3QD
United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648
Fax: 020 7848 6620 International +44 20 7848 6620
or 020 7848 6605 International +44 20 7848 6605
Email: roger.newson@kcl.ac.uk
Website: http://www.kcl-phs.org.uk/rogernewson

Opinions expressed are those of the author, not the institution.
