Statalist



RE: st: Handling 0 values when using logs of a dependent variable (statalist-digest V4 #3487)


From   "Allan Reese (Cefas)" <[email protected]>
To   <[email protected]>
Subject   RE: st: Handling 0 values when using logs of a dependent variable (statalist-digest V4 #3487)
Date   Thu, 16 Jul 2009 10:27:06 +0100

I've seen two examples very recently where someone used ln(x+1) as a
reflex action.  I don't agree that ln(0) is simply undefined; the limit
is mathematically well defined, since ln(x) tends to minus infinity as x
approaches zero.  I do agree that the answer to Dana's question requires
more background information.

The first point to check is whether the zeros are actual data values or
are themselves codes for missing values. ln(.) really is .

Second option is that zero is the code for "too small to measure" or
"below limit of detection".  In that case replacing the zeros with
teeny-weeny values or using tobit may be the best choice.
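
A minimal sketch of the "replace the zeros" route, in Python purely for
illustration.  The measurements, the helper name log_with_floor and both
replacement values are assumptions invented for this example, not anything
from Dana's data.

```python
import math

# Hypothetical measurements; 0 is the code for "below limit of detection".
measurements = [0, 0.8, 1.5, 0, 3.2]

def log_with_floor(values, floor):
    """Replace zeros with `floor` (an assumed small positive value), then take ln."""
    return [math.log(v if v > 0 else floor) for v in values]

# Two candidate replacement values, both arbitrary choices for illustration.
for floor in (0.1, 0.001):
    print(floor, [round(v, 3) for v in log_with_floor(measurements, floor)])
```

The logged value assigned to each zero depends directly on the replacement
chosen (ln(0.1) is about -2.3, ln(0.001) is about -6.9), so the choice is
worth stating alongside the results.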

The usual reason for using ln(x+1) is when x is a count variable, which
is bounded below at zero but discrete and can go indefinitely large.
Counts typically show a unimodal, highly skewed distribution (Poisson or
otherwise).  Mapping zero to zero, ln(1), does not offend and ln(x+1) is
often then treated as a normal variate; i.e., you can do ANOVA on it.
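
As a rough illustration of the count case (Python used only for the sketch;
the counts are made up), ln(x+1) sends a zero count to exactly zero and pulls
in the long right tail:

```python
import math

# Hypothetical counts, invented for illustration: many small values, a long right tail.
counts = [0, 0, 1, 2, 3, 5, 8, 40, 250]

# math.log1p(x) computes ln(x + 1); a zero count maps to ln(1) = 0.
logged = [math.log1p(c) for c in counts]

for c, v in zip(counts, logged):
    print(f"{c:4d} -> {v:.3f}")
```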

If, however, x is a ratio (y/z) then 1 represents the mid-point and both
y=0 and z=0 cause problems.  When y is less than z, ln(y/z) is a
negative number, and by taking ln(x+1) in one of my examples half the
data were effectively discarded by being squeezed into the range 0 to
ln(2).  If the raw data are y and z, you have options in GLMs.  If you
have only x, a tobit with left and right censored values is appropriate.
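
The ratio problem can be seen in a small sketch (again Python for
illustration only, with hypothetical ratios chosen as reciprocal pairs around
1).  ln(x) treats 0.1 and 10 symmetrically, while ln(x+1) pushes every ratio
below 1 into the narrow interval from 0 to ln(2):

```python
import math

# Hypothetical ratios x = y/z, chosen as reciprocal pairs around the mid-point 1.
ratios = [0.1, 0.5, 1.0, 2.0, 10.0]

log_x = [math.log(x) for x in ratios]           # symmetric about 0
log_x_plus_1 = [math.log1p(x) for x in ratios]  # every x < 1 lands in [0, ln 2)

for x, a, b in zip(ratios, log_x, log_x_plus_1):
    print(f"x={x:5.1f}  ln(x)={a:7.3f}  ln(x+1)={b:6.3f}")
```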

Allan


