Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: About taking log on zero values
From: Daniel Feenberg <[email protected]>
To: [email protected]
Subject: Re: st: About taking log on zero values
Date: Thu, 20 Feb 2014 14:42:58 -0500 (EST)

On Thu, 20 Feb 2014, Maarten Buis wrote:

> On Thu, Feb 20, 2014 at 4:16 PM, Alfonso Sánchez-Peñalver wrote:
>> In any case, the best solution, if feasible, would be to estimate
>> the values that ln(sales) would take for those zeros using either
>> a Tobit or a Heckman sample selection model.
>
> This might work if we were talking about a dependent variable (though
> in that case I would just use a glm with a log link). However, we are
> talking about an independent variable. What should happen with the 0s
> depends on the functional form of the relationship between sales (the
> independent variable, which Sebastian wants to log) and y (the
> dependent variable). Tobit and Heckman models are not tools for
> determining such a functional form, and thus cannot be a solution to
> this problem.

If the to-be-logged variable is on the RHS, can't you put in an
arbitrary value instead of log(0) and add a dummy variable that is one
for that case and zero otherwise? Then you have a functional form that
encompasses zero/non-zero and has a log relation for all non-zero
values. The size of the arbitrary value won't actually affect the
predicted values, the t-statistics on the other regressors, or
regression summary statistics such as R**2 (only the dummy's own
coefficient shifts), so it shouldn't be a source of controversy.
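
The invariance claimed above can be checked numerically. What follows is only a minimal sketch in plain Python, not anything from the thread: the data are synthetic, the helper names ols and fit are hypothetical, and the two fill values (0 and -5) are arbitrary choices. It fits y on a constant, log(sales) with a fill value where sales is zero, and a zero-sales dummy, then compares the two fits.

```python
import math
import random

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def fit(sales, y, fill):
    """Regress y on [1, log(sales) or `fill` when sales == 0, zero dummy]."""
    X = [[1.0,
          math.log(s) if s > 0 else fill,
          1.0 if s == 0 else 0.0] for s in sales]
    beta = ols(X, y)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return beta, fitted

# Synthetic data (an assumption for illustration): about 30% zero sales
random.seed(0)
sales = [0.0 if random.random() < 0.3 else random.uniform(1, 100)
         for _ in range(200)]
y = [2.0 + (0.5 * math.log(s) if s > 0 else -1.0) + random.gauss(0, 1)
     for s in sales]

beta_a, fitted_a = fit(sales, y, fill=0.0)    # log(0) replaced by 0
beta_b, fitted_b = fit(sales, y, fill=-5.0)   # log(0) replaced by -5

max_fit_gap = max(abs(a - b) for a, b in zip(fitted_a, fitted_b))
print("slope on log(sales):", beta_a[1], beta_b[1])
print("dummy coefficient:  ", beta_a[2], beta_b[2])
print("largest gap in fitted values:", max_fit_gap)
```

Under these assumptions the slope on log(sales) and the fitted values should agree up to rounding, while the dummy's coefficient absorbs the change in the fill value. Because the fitted values and residuals are the same, R**2 is the same too.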
Daniel Feenberg

> -- Maarten
>
> ---------------------------------
> Maarten L. Buis
> WZB
> Reichpietschufer 50
> 10785 Berlin
> Germany
> http://www.maartenbuis.nl
> ---------------------------------