Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Reconcile Log Transformed with Untransformed Results

From: Austin Nichols
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Reconcile Log Transformed with Untransformed Results
Date: Thu, 25 Feb 2010 16:58:36 -0500

```Erasmo Giambona <e.giambona@gmail.com> :
No need to add "in economic terms" as the result is simply not interpretable.
To restate my objection from Feb 13:
a regression of ln(y+1) on ln(x+1) does not estimate
an elasticity, and a change from -0.45 to +0.4 does not correspond to
any well-defined percentage point change.  If you are unsure of the
correct functional form, consider -lpoly- or -fracpoly- or -mkspline-
or -pspline- (on SSC).

Why not simply estimate a linear regression with OLS and plot your 16
points as well, both with and without the outlier you don't like?

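* Illustration using the first 16 observations of the auto data
sysuse auto, clear
keep in 1/16
* Rescale both variables so they take negative and positive values near zero
replace mpg=mpg/20-1
replace weight=weight/3300-1
* Untransformed: scatter, fit on all 16 points, and fit without observation 13
scatter mpg weight || lfit mpg weight || lfit mpg weight if _n!=13
* The same plot after adding one and taking logs
generate y=ln(mpg+1)
generate x=ln(weight+1)
scatter y x || lfit y x || lfit y x if _n!=13, name(why)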
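* Sketch, not part of the original message: the same with/without comparison
* as numbers rather than fitted lines. Assumes the rescaled mpg and weight
* created above are still in memory.
regress mpg weight
display "slope, all 16 obs: " _b[weight]
regress mpg weight if _n!=13
display "slope, dropping obs 13: " _b[weight]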

I can't see why you are ever adding one and taking logs--there is no
justification for it that I have seen.

On Thu, Feb 25, 2010 at 12:32 PM, Erasmo Giambona <e.giambona@gmail.com> wrote:
> Thanks Tony. Actually, I take the log of 1+y. Yes, I tried glm with a
> log link and that helps as well. The issue is that I found it
> difficult to interpret the results in economic terms. All the details
> are in the previous emails.
> Erasmo
>
> On Thu, Feb 25, 2010 at 6:24 PM, Lachenbruch, Peter
> <Peter.Lachenbruch@oregonstate.edu> wrote:
>> Since one of your y's is negative, -0.03, why should taking logs help? Would a glm with a log link help?
>>
>> Tony
>>
>> Peter A. Lachenbruch
>> Department of Public Health
>> Oregon State University
>> Corvallis, OR 97330
>> Phone: 541-737-3832
>> FAX: 541-737-4001
>>
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Erasmo Giambona
>> Sent: Thursday, February 25, 2010 4:32 AM
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Reconcile Log Transformed with Untransformed Results
>>
>> Thanks Austin. I have been traveling so it has been difficult to look
>> into this issue. To answer your question: I am using a two-step
>> procedure that is sometimes used in monetary policy research. My y is a
>> coefficient estimated from a panel regression using firm-level data.
>> This is the first step. y ranges from -0.03 to +0.07 (with mean=0.023,
>> median=0.024, st dev=0.028, skew=-.37, kurt= 2.52). I have 16 y's, one
>> per year. In the second step I regress y on x, where x is an annual
>> interest rate spread ranging from -.95% to 1.15% (with mean=3.96e-07,
>> median=.0004551, st dev=.6426913, skew=.1102487, kurt= 2.15). The
>> scatter of y on x clearly shows that y increases with x, but there is
>> one obs (out of the 16) with a very low x and a very high y. I am
>> taking the logs to try to reduce the effect of this obs. I thought this
>> was more parsimonious than the alternative of dropping the obs,
>> and winsorizing seems infeasible with 16 obs.
>>
>> Any additional thoughts would be appreciated,
>>
>> Erasmo
>>
>> On Tue, Feb 16, 2010 at 6:11 PM, Austin Nichols <austinnichols@gmail.com> wrote:
>>> Erasmo Giambona <e.giambona@gmail.com>:
>>> As I already pointed out, I doubt your estimates correspond to any
>>> well-defined percentage point change.  Perhaps you can give us a
>>> better sense of the distributions of the untransformed y and x (and
>>> what they measure and in what units), and what the scatterplot of y
>>> against x looks like.  You may also prefer to state your effects in
>>> terms of standard deviations rather than the interquartile range.
>>>
>>> On Tue, Feb 16, 2010 at 9:39 AM, Erasmo Giambona <e.giambona@gmail.com> wrote:
>>>> Thanks Maarten. In this example, OLS and GLM give very similar
>>>> economic effects. In fact, 74 cents for the OLS is really 9.52%
>>>> relative to the mean wage of 7.77. This 9.52% is very much in line
>>>> with the 9.7% found with GLM. In my case, the coeff. on X for the OLS
>>>> is 0.0064. Relative to the mean for the LHS variable of 0.02, this is
>>>> an economic effect of about 28%. With the GLM, using exactly your
>>>> code, X gets a coefficient of 2.025 or a 102.5% increase in Y. Or
>>>> perhaps I am misinterpreting this coefficient.
>>>>
>>>> Thanks,
>>>>
>>>> Erasmo
>>>>
>>>> On Mon, Feb 15, 2010 at 9:22 AM, Maarten Buis <maartenbuis@yahoo.co.uk> wrote:
>>>>> --- On Sun, 14/2/10, Erasmo Giambona wrote:
>>>>>> I ran the regressions with both RHS and LHS untransformed
>>>>>> using both OLS and GLM with link(log). With the OLS the
>>>>>> coeff on X is 0.006 while with the GLM the coefficient is
>>>>>> 0.700. I find it a bit hard to interpret the GLM coefficient.
>>>>>
>>>>> Consider the example below:
>>>>>
>>>>> *--------------- begin example -----------------
>>>>> sysuse nlsw88, clear
>>>>> gen byte baseline =1
>>>>>
>>>>> glm wage grade baseline,  ///
>>>>> *--------------- end example --------------------
>>>>>
>>>>>
>>>>> The -regress- results are interpreted as follows:
>>>>> People without education can expect a wage of
>>>>> -1.96 dollars an hour (substantively we know that
>>>>> people hardly ever pay for the privilege to work,
>>>>> so this is a sign of bad model fit), and they get
>>>>> 74 cents an hour more for every additional year of
>>>>> education.
>>>>>
>>>>> The -glm- results are interpreted as follows:
>>>>> People without education can expect a wage of
>>>>> 2.25 dollars an hour, and for every additional
>>>>> year of education they can expect an increase
>>>>> of 9.7%.
>>>>>
>>>>> Hope this helps,
>>>>> Maarten

```
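The example quoted from Maarten Buis breaks off after the `///` continuation, so the full command cannot be recovered from this message. A minimal sketch of the comparison the thread describes might look like the following. It assumes the `nlsw88` data shipped with Stata; `link(log)` is mentioned in the thread, while the `eform` option and the scalar name `b_ols` are assumptions made here for illustration.

```stata
* Sketch only: not the original example, which is truncated in the thread.
sysuse nlsw88, clear

* OLS: the coefficient on grade is in dollars per hour per year of education
regress wage grade
scalar b_ols = _b[grade]

* Express the OLS slope relative to the mean wage in the estimation sample
quietly summarize wage if e(sample)
display "OLS slope as % of mean wage: " %5.2f 100*b_ols/r(mean)

* GLM with a log link: eform displays exp(b), a multiplicative effect per year
glm wage grade, link(log) eform

* _b[grade] is still on the log scale; convert it to a percent change
display "GLM % change per year of education: " %5.2f 100*(exp(_b[grade])-1)
```

If the figures quoted in the thread hold, the two displayed values should be close to the 9.52% and 9.7% discussed above. Read the same way, a coefficient of 2.025, if it was reported in exponentiated form, means expected y is multiplied by 2.025 for a one-unit increase in x, which is roughly what the earlier log-scale coefficient of 0.700 implies, since exp(0.700) is about 2.01.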