
From |
"Allan Reese (Cefas)" <allan.reese@cefas.co.uk> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
RE: st: Handling 0 values when using logs of a dependent variable (statalist-digest V4 #3487) |

Date |
Thu, 16 Jul 2009 10:27:06 +0100 |

I've seen two examples very recently where someone used ln(x+1) as a reflex action.

I don't agree that ln(0) is simply "undefined": ln(x) tends to minus infinity as x approaches zero, so the behaviour is well defined, but as an asymptote rather than a finite value. I do agree that the answer to Dana's question requires more background information.

The first point to check is whether the zeros are actual data values or are themselves codes for missing values. ln(.) really is . (missing).

A second possibility is that zero is the code for "too small to measure" or "below limit of detection". In that case, replacing the zeros with a very small positive value or using tobit may be the best choice.

The usual reason for using ln(x+1) is that x is a count variable, which is bounded below at zero, discrete, and can be indefinitely large. Counts typically show a unimodal, highly skewed distribution (Poisson or otherwise). Mapping zero to zero, since ln(0+1) = ln(1) = 0, does not offend, and ln(x+1) is then often treated as a normal variate; i.e., you can do anova on it.

If, however, x is a ratio (y/z), then 1 represents the mid-point, and both y=0 and z=0 cause problems. When y is less than z, ln(y/z) is a negative number; by taking ln(x+1) in one of my examples, half the data were effectively dropped, because every ratio below 1 was mapped into the narrow range 0 to ln(2). If the raw data are y and z, you have options in GLMs. If you have only x, a tobit with left- and right-censored values is appropriate.

Allan
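The point about ratios can be illustrated with a minimal sketch, written here in Python rather than Stata and using made-up ratio values (the list `ratios` and both helper functions are hypothetical, not taken from any data set discussed in the thread). It compares ln(x) with ln(x+1) for ratios on either side of 1.

```python
import math

# Hypothetical ratios x = y/z: values below 1 mean y < z, values above 1 mean y > z.
ratios = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0, 10.0]


def log_ratio(x):
    """Plain log of the ratio; reciprocal ratios map to equal and opposite values."""
    return math.log(x)


def log_ratio_plus_one(x):
    """The ln(x + 1) transform; log1p computes ln(1 + x)."""
    return math.log1p(x)


for x in ratios:
    print(f"x={x:>5}  ln(x)={log_ratio(x):+.3f}  ln(x+1)={log_ratio_plus_one(x):.3f}")

# Spread given to the reciprocal pair 0.1 and 10 on either side of the mid-point x = 1.
below = log_ratio_plus_one(1.0) - log_ratio_plus_one(0.1)
above = log_ratio_plus_one(10.0) - log_ratio_plus_one(1.0)
print(f"spread below 1: {below:.3f}  spread above 1: {above:.3f}")
```

Under ln(x) the pair 0.1 and 10 sits symmetrically about zero, whereas under ln(x+1) every ratio below 1 falls between 0 and ln(2) and gets far less spread than the ratios above 1.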

