
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


Re: st: About taking log on zero values

From   Daniel Feenberg
Subject   Re: st: About taking log on zero values
Date   Thu, 20 Feb 2014 14:42:58 -0500 (EST)

On Thu, 20 Feb 2014, Maarten Buis wrote:

On Thu, Feb 20, 2014 at 4:16 PM, Alfonso Sánchez-Peñalver wrote:
In any case, the best solution, if feasible, would be to estimate the values that ln(sales) would take for those zeros using either a Tobit or a Heckman sample selection model.

This might work if we were talking about a dependent variable (though
in that case I would just use a glm with a log link). However, we are
talking about an independent variable. What should happen with the 0s
depends on the functional form of the relationship between sales (the
independent variable, which Sebastian wants to log) and y (the
dependent variable). A Tobit or a Heckman model is not a tool for
determining such a functional form, and thus cannot be a solution to
this problem.

If the variable to be logged is on the RHS, can't you put in an arbitrary value in place of log(0) and add a dummy variable that is one for those cases and zero otherwise? Then you have a functional form that distinguishes zero from non-zero sales and has a log relationship for all non-zero values. The size of the arbitrary value won't affect the predicted values, the coefficient and t-statistic on the log term, or regression summary statistics such as R**2 (only the dummy's own coefficient shifts to absorb it), so it shouldn't be a source of controversy.
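The invariance claim can be checked numerically. The following is a minimal sketch in plain Python, not Stata, and the data, the variable names (sales, y, fill) and the small OLS helper are all made up for illustration. It fits y on a constant, the filled-in log of sales, and a zero-sales dummy, once with 0 and once with -5 standing in for log(0), and compares the slope on the log term and the fitted values.

```python
import math

def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def fit(sales, y, fill):
    """Regress y on [constant, log(sales) with `fill` for zeros, zero dummy]."""
    X = []
    for s in sales:
        is_zero = 1.0 if s == 0 else 0.0
        log_sales = fill if s == 0 else math.log(s)
        X.append([1.0, log_sales, is_zero])
    beta = ols(X, y)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return beta, fitted

# Hypothetical data: three observations with zero sales.
sales = [0, 0, 0, 5, 12, 40, 100, 250, 900]
y = [1.0, 1.4, 0.9, 2.1, 2.8, 3.9, 5.2, 5.8, 7.1]

beta_a, fitted_a = fit(sales, y, fill=0.0)   # log(0) replaced by 0
beta_b, fitted_b = fit(sales, y, fill=-5.0)  # log(0) replaced by -5

print("slope on log(sales):", beta_a[1], beta_b[1])
print("dummy coefficient:  ", beta_a[2], beta_b[2])
print("max fitted-value gap:",
      max(abs(a - b) for a, b in zip(fitted_a, fitted_b)))
```

In this sketch the slope and fitted values agree across the two fill values up to rounding, while the dummy coefficient moves by the slope times the change in the fill value. The example does not compute standard errors, so it says nothing directly about the t-statistics.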

Daniel Feenberg

-- Maarten

Maarten L. Buis
Reichpietschufer 50
10785 Berlin
