The Stata listserver

st: RE: Log transform- SE or std

From   Roger Newson <>
Subject   st: RE: Log transform- SE or std
Date   Mon, 23 Jun 2003 20:42:19 +0100

At 19:52 23/06/03 +0100, Nick Cox wrote (in reply to Ricardo Ovaldia):

> I used a log transform to normalize the distribution
> of a biochemical substance. I then use a ttest to
> compare the transformed mean of cases to controls. I
> now want to present my results listing the means, stds
> and the p-value. I could present the original
> (untransformed) means and stds, but I think that is
> misleading. Now, I know that the transformed mean is
> the geometric mean, so no problem there, but what
> about the standard error. How can I "untransform" it?

I am not at all clear that the se has an analogue
on the original scale.

I suggest that it is easier, and possibly
closer to the underlying problem, to exponentiate
the confidence intervals given by -ttest- for
the transformed means.
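
A minimal sketch of that back-transformation, assuming a raw positive outcome called -response- and a 0/1 group indicator called -casecontrol- (both names are placeholders, not from the original data). It takes the difference of log-scale means and its standard error from the results stored by -ttest- and exponentiates the confidence limits, which gives a ratio of geometric means:

```stata
* Sketch only: -response- and -casecontrol- are placeholder variable names.
generate double logresp = ln(response)
ttest logresp, by(casecontrol)

* Save what is needed from r() before another command overwrites it
scalar lndiff = r(mu_1) - r(mu_2)
scalar se     = r(se)
scalar tcrit  = invttail(r(df_t), 0.025)

* Ratio of geometric means (lower-coded group over higher-coded group)
display "GM ratio = " exp(lndiff)
display "95% CI   = " exp(lndiff - tcrit*se) " to " exp(lndiff + tcrit*se)
```

The same idea applies to each group separately: exponentiating the limits that -ttest- prints for a group mean of the logs gives limits for that group's geometric mean.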

Another way to approach this is by

glm response casecontrol, link(log)

by analogy with the principle (e.g. Conroy Stata Journal
2002) that a t test is equivalent to a regression
on a binary explanatory variable.

Yet another way to do this is

gene byte baseline=1
regress response casecontrol baseline,noconst eform(GM/Ratio)

which displays confidence intervals directly for the baseline geometric mean and the geometric mean ratio, provided that the dependent variable is on the log scale, so that the exponentiated coefficients are geometric means and ratios. This works by creating a constant X-variable -baseline- and including it in the model with -noconst-, fooling Stata into thinking that the model has no constant. (If you use the -eform- option without -noconst-, then the constant parameter is not shown.)
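
As a hedged illustration of the same trick from start to finish, here is a sketch that builds the log outcome explicitly and then also asks for the geometric mean in the non-baseline group. The variable names -response-, -logresp- and -casecontrol- are assumptions for the example, and the use of -lincom- with its -eform- option is my addition rather than part of the commands above:

```stata
* Sketch only: -response- (raw, positive) and -casecontrol- (0/1) are placeholders.
generate double logresp = ln(response)
generate byte baseline = 1

* Exponentiated coefficients: baseline = GM where casecontrol==0,
* casecontrol = ratio of GMs (casecontrol==1 over casecontrol==0)
regress logresp casecontrol baseline, noconstant eform(GM/Ratio)

* GM where casecontrol==1, with its confidence interval
lincom baseline + casecontrol, eform
```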

Some very useful information on the lognormal distribution and its parameters can be found on Stas Kolenikov's web page.

For instance, you will find there that the coefficient of variation of the lognormal distribution is given by

CV = sqrt(exp(sigma^2) - 1)

or, alternatively,

sigma = sqrt(log(CV^2 + 1))

where sigma is the SD of the natural logs. I personally tend to use the coefficient of variation as my favourite dispersion measure for a lognormal distribution, especially when I am reporting power calculations. However, other people seem to like the geometric SD, defined as the natural antilog of the SD of the natural logs, or inter-percentile ratios.
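
For anyone who wants these dispersion measures from their own data, a minimal sketch follows. It assumes a raw positive variable called -response- (a placeholder name) and that a lognormal model is reasonable. It estimates sigma as the sample SD of the natural logs and then applies the formulas above:

```stata
* Sketch only: -response- is a placeholder for a strictly positive variable.
generate double logresp = ln(response)
quietly summarize logresp
scalar sigma = r(sd)

* Coefficient of variation implied by a lognormal model
display "CV           = " sqrt(exp(sigma^2) - 1)

* Geometric SD: the natural antilog of the SD of the natural logs
display "Geometric SD = " exp(sigma)
```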

I hope this helps.


Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
King's College London
5th Floor, Capital House
42 Weston Street
London SE1 3QD
United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648
Fax: 020 7848 6620 International +44 20 7848 6620
or 020 7848 6605 International +44 20 7848 6605

Opinions expressed are those of the author, not the institution.
