Thanks for the suggestion and the reference.
My help file for -transint-, which Austin cited in the email
I replied to (still quoted below), also refers to this
transformation.
I agree that, in general, switching to a quite different
transformation is one of the options, but it doesn't
really answer the specific question I posed in the
pseudo-FAQ below.
Nick
n.j.cox@durham.ac.uk
Benito, Andrew
> I would suggest looking at the following reference:
>
> Burbidge, J.B., Magee, L. and Robb, L. (1988),
> `Alternative Transformations to Handle Extreme Values of the
> Dependent Variable', Journal of the American Statistical
> Association, 83, 123-7.
>
> It suggests using an inverse hyperbolic sine function to
> replace the log in such cases (related to Nick's point (d)).
> The functional form, with the dampening factor set to 1, is
> sinh⁻¹(x) = ln(x + √(1 + x²)). More generally (i.e., without the
> dampening factor fixed at 1), it involves non-linear estimation.
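A quick numerical check of the inverse hyperbolic sine form quoted above may help. This is only a sketch in Python, not anything taken from Burbidge, Magee and Robb: the function names are made up for illustration, and `theta` is my assumed reading of the "dampening factor" as a scale parameter in asinh(theta*x)/theta. It shows that the transformation is defined at zero and that, for large x, it behaves like ln(2x), i.e. ln(x) plus a constant.

```python
import math

def ihs(x, theta=1.0):
    """Inverse hyperbolic sine, asinh(theta * x) / theta.

    theta is a hypothetical name for the scale ("dampening") parameter,
    an assumption for illustration; theta = 1 gives plain asinh(x).
    """
    return math.asinh(theta * x) / theta

def ihs_log_form(x):
    """Same function at theta = 1, written as ln(x + sqrt(1 + x^2))."""
    return math.log(x + math.sqrt(1.0 + x * x))

if __name__ == "__main__":
    for x in [0.0, 0.5, 1.0, 10.0, 1000.0]:
        # ln(2x) is undefined at zero, which is the whole point of the thread
        ln_2x = f"{math.log(2 * x):.6f}" if x > 0 else "undefined"
        print(f"x={x:>7}: asinh={ihs(x):.6f}  "
              f"log form={ihs_log_form(x):.6f}  ln(2x)={ln_2x}")
```

With theta fixed at 1 nothing needs estimating; treating theta as unknown is what leads to the non-linear estimation mentioned above.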
Nick Cox
> Let me summarise the situation as I see it. This tries a
> slightly more general pitch than the current thread.
>
> I want to work with the logarithm of a variable, but that
> variable contains zero values. What should I do?
> ---------------------------------------------------------
< snip >
Austin Nichols
> > In re: adding alpha to X to make ln(X) nonmissing. Why does this
> > operation come up so often, when it is so often a bad idea? I have
> > seen several papers this week that add some constant to X so that
> > ln(X) can be regressed on some variables, or some variable can be
> > regressed on it. Wouldn't you be just as well off imputing
> > 2*atan(X)-2*atan(1) or somesuch? Is there a well-known good
> > reference on this subject?
> >
> > Just now, when looking up the ref for an adjacent thread on
> > btscs.ado, I ran across Oneal & Russett (2001), which acknowledges
> > that Beck, Katz, and Tucker (1998) pointed out an error, and then
> > replies to another critique with this (p.480):
> > "
> > Before taking the logarithm [of trade volume in $millions] we
> > assigned a different value to the trade variable for dyads that
> > report no trade. Some value must be imputed because the logarithm
> > of zero is undefined. We use $100,000 [so really it was ln(0.1)];
> > Green, Kim, and Yoon used $1. It is this that accounts for most of
> > the differences between our results and theirs.
> > "
> > Oneal, John R. and Bruce Russett. 2001. Clear and Clean: The Fixed
> > Effects of the Liberal Peace. International Organization, Vol. 55,
> > No. 2. (Spring, 2001), pp. 469-485.
> > http://links.jstor.org/sici?sici=0020-8183%28200121%2955%3A2%3C469%3ACACTFE%3E2.0.CO%3B2-A
> >
> > Green, Donald P., Soo Yeon Kim, and David H. Yoon. 2001. "Dirty
> > Pool." International Organization, Vol. 55, No. 2. (Spring, 2001),
> > pp. 441-468.
> > http://links.jstor.org/sici?sici=0020-8183%28200121%2955%3A2%3C441%3ADP%3E2.0.CO%3B2-N
>
> Beck, Nathaniel, Jonathan N. Katz and Richard Tucker. 1998.
> Taking Time Seriously: Time-Series-Cross-Section Analysis
> with a Binary Dependent Variable. American Journal of
> Political Science, 42: 1260-1288.
>
> See also:
> ssc install transint
> h transint
> http://www.stata.com/statalist/archive/2006-11/msg00294.html
>
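To put rough numbers on the Oneal and Russett passage quoted above, here is a small Python sketch of how much the logged value for a zero-trade dyad depends on the constant imputed. It assumes both papers measure trade in $ millions, so that $100,000 becomes 0.1 (as Austin's bracketed note says) and $1 becomes 1e-6; the second figure is my inference, not something stated in the quotation. The function names are made up for illustration. Austin's bounded alternative, 2*atan(X)-2*atan(1), is included for comparison: it is zero at X = 1, as ln(X) is, but finite at X = 0.

```python
import math

# Imputed values for zero trade, in $ millions (units are an assumption).
IMPUTED = {
    "Oneal & Russett ($100,000)": 0.1,
    "Green, Kim & Yoon ($1)": 1e-6,
}

def log_with_imputation(x, imputed):
    """ln(x) for positive x; ln(imputed) when x is zero."""
    return math.log(x if x > 0 else imputed)

def atan_transform(x):
    """Austin's suggestion: 2*atan(x) - 2*atan(1), defined at x = 0."""
    return 2 * math.atan(x) - 2 * math.atan(1)

if __name__ == "__main__":
    for label, value in IMPUTED.items():
        print(f"{label}: ln = {log_with_imputation(0.0, value):.4f}")
    gap = math.log(0.1) - math.log(1e-6)
    print(f"gap between the two imputations on the log scale: {gap:.4f}")
    print(f"atan transform at 0: {atan_transform(0.0):.4f}")
```

The two imputations differ by ln(100000) on the log scale, which is large relative to the spread of most logged positive values, so it is plausible that the choice alone could drive differences in results, as the quotation says.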