Re: st: About taking log on zero values

Stata interprets "." as a missing value and drops those observations from the estimation, so you would be regressing only on the observations with positive values of the original variable. (The same happens if you leave the zeros in and take logs, since ln(0) evaluates to missing in Stata.) A simple trick to avoid losing observations is to add a very small constant (say 0.00000001) to the zero values before taking logs. That keeps all observations in the sample. I'm sure this will have many detractors too.
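
To make this concrete, here is a minimal sketch on made-up data. The numbers and the names sales_adj and logsales_adj are mine, purely for illustration, and I am assuming zero sales are stored as 0. It shows the zero-sales rows dropping out of the regression, and then what the small-constant trick does to them.

```stata
clear
input y x1 x2 sales
10 1 5 0
12 2 3 100
15 3 8 250
11 4 2 0
18 5 7 900
20 6 1 1500
14 7 6 400
17 8 4 700
end

* ln(0) evaluates to missing, so the two zero-sales rows get "."
generate logsales = ln(sales)
count if missing(logsales)

* regress uses complete cases only, so the two rows with missing logsales are dropped
regress y x1 x2 logsales
display e(N)

* small-constant trick: replace only the zeros before taking logs
generate sales_adj = cond(sales == 0, 0.00000001, sales)
generate logsales_adj = ln(sales_adj)
regress y x1 x2 logsales_adj
display e(N)

* the zeros now sit at ln(1e-8), about -18.4, far below the other values
summarize logsales_adj
```

Note how far the adjusted zeros end up from the rest of the data on the log scale. The choice of constant is arbitrary and can drive the estimated coefficient, which is probably why the trick has its critics.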

In your case, I guess the point of entering the log of sales as an explanatory variable is to capture nonlinearities in the relationship? If so, to avoid the problem with the zeros, have you thought of entering sales as a quadratic (sales and sales squared) instead of linearly?
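
A minimal sketch of the quadratic specification, again on made-up data. I am assuming the raw variable is called sales, that zero sales are stored as 0 rather than ".", and that you are on Stata 11 or later, which the factor-variable notation requires. The values in at() are placeholders.

```stata
clear
input y x1 x2 sales
10 1 5 0
12 2 3 100
15 3 8 250
11 4 2 0
18 5 7 900
20 6 1 1500
14 7 6 400
17 8 4 700
end

* c.sales##c.sales expands to sales and sales squared;
* zero is an ordinary value here, so no observations are lost
regress y x1 x2 c.sales##c.sales
display e(N)

* marginal effect of sales at a few illustrative values
margins, dydx(sales) at(sales = (0 100 1000))
```

Using the ## notation rather than generating the squared term by hand lets margins know that the two terms move together when it computes the marginal effect.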

> On Feb 19, 2014, at 2:44 PM, Sebastian Say <sebastian.statalist@gmail.com> wrote:
> Hi my question is about how stata treats a log-transformed variable
> that draws upon an original variable that contains zero.
> In my dataset, i have firm sales data but some of them have zero. I
> indicated as a "."
> I plan to run a regression, e.g.
> reg y x1 x2 logsales
> My question is, how would stata treat these "." if I do not remove them?
> Technically the "." should be undefined.
> I've read some papers and they usually put a 1 for those sales data
> with zeros in them. Is this a usual practice?
> Thank you very much.
