
Notice: On April 23, 2014, Statalist moved from an email list to a forum.


Re: st: Reconcile Log Transformed with Untransformed Results

From   Austin Nichols <>
Subject   Re: st: Reconcile Log Transformed with Untransformed Results
Date   Thu, 25 Feb 2010 16:58:36 -0500

Erasmo Giambona <> :
No need to add "in economic terms" as the result is simply not interpretable.
To restate my objection from Feb 13:
a regression of ln(y+1) on ln(x+1) does not estimate
an elasticity, and a change from -0.45 to +0.4 does not correspond to
any well-defined percentage point change.  If you are unsure of the
correct functional form, consider -lpoly- or -fracpoly- or -mkspline-
or -pspline- (on SSC).
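
To spell out the elasticity point as a sketch (a and b are illustrative symbols, not estimates from this thread): suppose the fitted relationship is ln(1+y) = a + b ln(1+x). Differentiating gives the implied elasticity of y with respect to x:

```latex
\ln(1+y) = a + b\,\ln(1+x)
\;\Rightarrow\;
\frac{dy}{dx} = b\,\frac{1+y}{1+x},
\qquad
\frac{d\ln y}{d\ln x} = \frac{dy}{dx}\cdot\frac{x}{y}
  = b\,\frac{x\,(1+y)}{y\,(1+x)} .
```

So b equals the elasticity only where x = y. Everywhere else the implied elasticity varies with the levels of x and y, and it is undefined at y = 0.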

Why not simply estimate a linear regression with OLS and plot your 16
points as well, both with and without the outlier you don't like?

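* Rescale mpg and weight so both take values on either side of zero
sysuse auto, clear
keep in 1/16
replace mpg=mpg/20-1
replace weight=weight/3300-1
* Scatter with linear fits, with and without observation 13
sc mpg weight ||lfit mpg weight||lfit mpg weight if _n!=13
* Same plot after adding one and taking logs
g y=ln(mpg+1)
g x=ln(weight+1)
sc y x ||lfit y x||lfit y x if _n!=13, name(why)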

I can't see why you are ever adding one and taking logs--there is no
justification for it that I have seen.
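
To put numbers on that comparison, a minimal sketch using the same rescaled auto data as above stores the OLS slope with and without observation 13. The scalar names b_all and b_drop are made up for illustration.

```stata
* Same rescaled data as in the plotting example
sysuse auto, clear
keep in 1/16
replace mpg = mpg/20 - 1
replace weight = weight/3300 - 1

* OLS slope using all 16 observations
regress mpg weight
scalar b_all = _b[weight]

* OLS slope after dropping observation 13
regress mpg weight if _n != 13
scalar b_drop = _b[weight]

display "slope, all 16 obs:      " b_all
display "slope, dropping obs 13: " b_drop
```

Both slopes stay in the original units of y per unit of x, so they can be compared directly without any back-transformation.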

On Thu, Feb 25, 2010 at 12:32 PM, Erasmo Giambona <> wrote:
> Thanks Tony. Actually, I take the log of 1+y. Yes, I tried glm with a
> log link and that helps as well. The issue is that I found it
> difficult to interpret the results in economic terms. All the details
> are in the previous emails.
> Erasmo
> On Thu, Feb 25, 2010 at 6:24 PM, Lachenbruch, Peter
> <> wrote:
>> Since one of your y's is negative, -0.03, why should taking logs help? Would a glm with a log link help?
>> Tony
>> Peter A. Lachenbruch
>> Department of Public Health
>> Oregon State University
>> Corvallis, OR 97330
>> Phone: 541-737-3832
>> FAX: 541-737-4001
>> -----Original Message-----
>> From: [] On Behalf Of Erasmo Giambona
>> Sent: Thursday, February 25, 2010 4:32 AM
>> To:
>> Subject: Re: st: Reconcile Log Transformed with Untransformed Results
>> Thanks Austin. I have been traveling so it has been difficult to look
>> into this issue. To answer your question: I am using a two-step
>> procedure that is sometimes used in monetary policy research. My y is a
>> coefficient estimated from a panel regression using firm level data.
>> This is the first step. y ranges from -0.03 to +0.07 (with mean=0.023,
>> median=0.024, st dev=0.028, skew=-.37, kurt= 2.52). I have 16 y's, one
>> per year. In the second step I regress y on x, where x is an annual
>> interest rate spread ranging from -.95% to 1.15% (with mean=3.96e-07,
>> median=.0004551, st dev=.6426913, skew=.1102487, kurt= 2.15). The
>> scatter of y on x clearly shows that y increases with x, but there is
>> one obs (out of the 16) with a very low x and a very high y. I am
>> taking the logs to try to reduce the effect of this obs. I thought this
>> was more parsimonious than the alternative of dropping the obs,
>> and winsorizing seems infeasible with 16 obs.
>> Any additional thoughts would be appreciated,
>> Erasmo
>> On Tue, Feb 16, 2010 at 6:11 PM, Austin Nichols <> wrote:
>>> Erasmo Giambona <>:
>>> As I already pointed out, I doubt your estimates correspond to any
>>> well-defined percentage point change.  Perhaps you can give us a
>>> better sense of the distributions of the untransformed y and x (and
>>> what they measure and in what units), and what the scatterplot of y
>>> against x looks like.  You may also prefer to state your effects in
>>> terms of standard deviations rather than the interquartile range.
>>> On Tue, Feb 16, 2010 at 9:39 AM, Erasmo Giambona <> wrote:
>>>> Thanks Maarten. In this example, OLS and GLM give very similar
>>>> economic effects. In fact, 74 cents for the OLS is really 9.52%
>>>> relative to the mean wage of 7.77. This 9.52% is very much in line
>>>> with the 9.7% found with GLM. In my case, the coeff. on X for the OLS
>>>> is 0.0064. Relative to the mean of the LHS variable of 0.02, this is
>>>> an economic effect of about 28%. With the GLM, using exactly your
>>>> code, X gets a coefficient of 2.025 or a 102.5% increase in Y. Or
>>>> perhaps, I am misinterpreting this coefficient.
>>>> Thanks,
>>>> Erasmo
>>>> On Mon, Feb 15, 2010 at 9:22 AM, Maarten Buis <> wrote:
>>>>> --- On Sun, 14/2/10, Erasmo Giambona wrote:
>>>>>> I ran the regressions with both RHS and LHS untransformed
>>>>>> using both OLS and GLM with link(log). With the OLS the
>>>>>> coeff on X is 0.006 while with the GLM the coefficient is
>>>>>> 0.700. I find it a bit hard to interpret the GLM coefficient.
>>>>> Consider the example below:
>>>>> *--------------- begin example -----------------
>>>>> sysuse nlsw88, clear
>>>>> gen byte baseline =1
>>>>> reg wage grade
>>>>> glm wage grade baseline,  ///
>>>>>    link(log) eform nocons
>>>>> *--------------- end example --------------------
>>>>> The -regress- results are interpreted as follows:
>>>>> People without education can expect a wage of
>>>>> -1.96 dollars an hour (substantively we know that
>>>>> people hardly ever pay for the privilege to work,
>>>>> so this is a sign of bad model fit), and they get
>>>>> 74 cents an hour more for every additional year of
>>>>> education.
>>>>> The -glm- results are interpreted as follows:
>>>>> People without education can expect a wage of
>>>>> 2.25 dollars an hour, and for every additional
>>>>> year of education they can expect an increase
>>>>> of 9.7%.
>>>>> Hope this helps,
>>>>> Maarten
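
The percentage reading of the -glm- output above follows from the log link. As a sketch, with \beta_0 and \beta_1 denoting the untransformed coefficients (symbols introduced here for illustration):

```latex
E[\text{wage}\mid\text{grade}] = \exp(\beta_0 + \beta_1\,\text{grade})
\;\Rightarrow\;
\frac{E[\text{wage}\mid\text{grade}+1]}{E[\text{wage}\mid\text{grade}]} = \exp(\beta_1)
```

With -eform-, the reported number is exp(\beta_1), the factor by which the expected outcome is multiplied per one-unit increase in the regressor. A reported value of about 1.097 therefore corresponds to the 9.7% quoted above, and a reported 2.025 to a 102.5% increase per unit of x.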
