From: Roger Newson <roger.newson@kcl.ac.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Log transform of skewed data
Date: Wed, 02 Jun 2004 21:56:18 +0100

At 14:53 02/06/04 -0400, you wrote:

I have data on the "cost" (actually transformed hours) of various types of caretaking for Alzheimer's patients. I'm interested in a regression model to test treatment effects in a multisite study. As is usual for cost data, it is positively skewed. So I contemplated a log transform, either through a direct transformation of the response or through a log link in a glm, gee, or something similar. I am actually using "xt" commands to allow for nonindependence among caretakers treated at the same site. The problem is that the modal cost is $0, so that the distribution is bimodal. This, of course, remains true if I do a log transform. Any ideas on how to analyze such data would be appreciated.

Log-transformed data can often be understood in terms of geometric means and their ratios. If in Stata you type

findit gmratio

then you should be taken to my website, where you can download my Stata Tip on the -eform- option of -regress- (Newson, 2003), which shows how to use this to calculate confidence intervals for geometric means and their ratios.
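As a minimal sketch of this kind of calculation (not a reproduction of the Stata Tip itself), the following uses the auto dataset shipped with Stata as a stand-in for real data. The variable names logmpg and baseline are illustrative assumptions. Entering a constant variable together with -noconst- makes the exponentiated output show a baseline geometric mean alongside the geometric mean ratio.

```stata
* Stand-in data: the auto dataset shipped with Stata (an assumption, not the poster's data)
sysuse auto, clear

* Log-transform a strictly positive outcome
generate double logmpg = log(mpg)

* Hypothetical constant variable, entered explicitly in place of the intercept
generate byte baseline = 1

* Exponentiated coefficients:
*   baseline = geometric mean of mpg for domestic cars (foreign == 0)
*   foreign  = ratio of geometric means, foreign / domestic
regress logmpg foreign baseline, noconstant eform("GM/Ratio") vce(robust)
```

The confidence limits in the output are exponentiated too, so they apply directly to the geometric mean and the ratio.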

If there are zeros, however, then there is a problem, because the log of zero is not defined. In this case, you either have to transform the zeros to something else, or use arithmetic means instead of geometric means, with a log link function, in a glm or gee, usually using the -eform- option. The parameters will then be arithmetic means and their ratios, instead of geometric means and their ratios. Arithmetic means are still defined if the outcome can be zero, as is the case with loglinear modelling of count data, and the principle is the same with non-count data such as your caretaker-hours. The trick with the -noconst- option, mentioned in Newson (2003), may still be useful if you want a baseline arithmetic mean for a baseline patient.
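A minimal sketch of the log-link approach with zeros present, run on simulated data. The variable names (hours, treat, site, baseline) and the data-generating process are invented for illustration. The Poisson family with cluster-robust standard errors is one possible choice here, an assumption rather than something prescribed above.

```stata
* Simulated stand-in data: 10 sites of 20 caretakers, about 30% with zero hours
clear
set seed 12345
set obs 200
generate byte site  = ceil(_n/20)
generate byte treat = mod(_n, 2)
generate double hours = cond(runiform() < 0.3, 0, exp(rnormal(3 + 0.2*treat, 1)))

* Hypothetical constant variable, entered explicitly in place of the intercept
generate byte baseline = 1

* Log link on the untransformed outcome, so zeros need no special handling.
* Exponentiated coefficients:
*   baseline = arithmetic mean of hours in the untreated group
*   treat    = ratio of arithmetic means, treated / untreated
* Stata notes that hours has noninteger values; the model is still fitted.
glm hours treat baseline, noconstant family(poisson) link(log) eform vce(cluster site)
```

Because the log link models the log of the mean rather than the mean of the logs, the zero observations stay in the analysis unchanged.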

Hope this helps.

Roger

References

Newson R. Stata tip 1: The eform() option of regress. The Stata Journal 2003; 3(4): 445.

--

Roger Newson

Lecturer in Medical Statistics

Department of Public Health Sciences

King's College London

5th Floor, Capital House

42 Weston Street

London SE1 3QD

United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648

Fax: 020 7848 6620 International +44 20 7848 6620

or 020 7848 6605 International +44 20 7848 6605

Email: roger.newson@kcl.ac.uk

Website: http://www.kcl-phs.org.uk/rogernewson

Opinions expressed are those of the author, not the institution.

In reply to: st: Log transform of skewed data, from "Stephen Soldz" <ssoldz@soldzresearch.com>
