st: RE: RE: retransformation of ln(Y) coefficient and CI in regression
From: "Steve Rothenberg" <[email protected]>
To: <[email protected]>
Subject: st: RE: RE: retransformation of ln(Y) coefficient and CI in regression
Date: Mon, 6 Jun 2011 06:31:48 -0500
Solved!  Once again Nick (and through the magic of Statalist archives,
Martin Weiss) provide the key information.
My problem was using -predict- instead of -predictnl-, since the "exp"
operator is non-linear.  The syntax, after Martin's earlier post
(http://www.stata.com/statalist/archive/2009-04/msg00006.html) should be,
after estimation:
. predictnl expphat= exp(xb()), ci(lbexp ubexp)
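As a minimal sketch of the full sequence, assuming the auto dataset shipped with Stata as a stand-in for the real data (price plays the role of Y, and a recoded rep78 is a hypothetical three-level factor; neither comes from the original posts):

```stata
* Assumption: auto.dta stands in for the poster's data.
* price plays the role of Y; factor is a hypothetical three-level recode.
sysuse auto, clear
generate lnY = ln(price)
recode rep78 (1/2 = 0) (3 = 1) (4/5 = 2), generate(factor)

* Model in the ln(Y) metric with robust standard errors
regress lnY i.factor, vce(robust)

* Retransformed prediction with delta-method 95% limits
predictnl expphat = exp(xb()), ci(lbexp ubexp)

* Predictions are constant within a level, so the mean shows one row per level
tabstat expphat lbexp ubexp, by(factor) statistics(mean)
```

The limits from -predictnl- are built as prediction +/- multiplier * standard error on the Y scale, so they are symmetric around exp(xb) and need not match limits obtained by exponentiating the ends of the ln(Y) interval.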
Thanks again to Nick and Martin.
Steve Rothenberg
Date: Mon, 6 Jun 2011 07:14:00 +0100
From: Nick Cox <[email protected]>
Subject: Re: st: RE: retransformation of ln(Y) coefficient and CI in
regression
prediction +/- favoured multiplier * standard error.
Nick
-----Original Message-----
From: Steve Rothenberg [mailto:[email protected]] 
Sent: Sunday, June 05, 2011 11:55 AM
To: '[email protected]'
Subject: st: RE: retransformation of ln(Y) coefficient and CI in regression
Thank you for the glm suggestion, Nick.  
After 
. glm Y i.factor, vce(robust) family(Gaussian) link(log)
followed by
 
. predict xxx, mu
the command does indeed return the factor predictions in the original Y
metric.
However, the regression table with 95% CI is still in the original ln(Y)
units and I am still stuck not being able to calculate the 95% CI in the
original Y unit metric.  The predict command for returning prediction SE
(stdp) also only returns the SE in the ln(Y) metric.
I've read the manual on glm postestimation and can derive no hints on this
issue.
I'd welcome further suggestions for deriving the 95% confidence interval in
the original Y metric after either 
. regress ln(Y) ..., vce(robust)
or
. glm Y ..., link(log) vce(robust)
or any other estimation commands. 
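One possible route after -glm-, sketched here under the assumption that the auto dataset stands in for the real data (price as Y, a hypothetical three-level recode of rep78 as the factor), is to take the linear predictor and its standard error in the ln(Y) metric and exponentiate the interval endpoints. -margins- is shown as an alternative that reports delta-method limits for the predicted mean.

```stata
* Assumption: auto.dta stands in for the poster's data.
sysuse auto, clear
recode rep78 (1/2 = 0) (3 = 1) (4/5 = 2), generate(factor)

glm price i.factor, family(gaussian) link(log) vce(robust)

* Linear predictor and its standard error, both in the ln(Y) metric
predict double xbhat, xb
predict double sehat, stdp

* Back-transform the point estimate and the interval endpoints
generate double muhat = exp(xbhat)
generate double lb = exp(xbhat - invnormal(0.975)*sehat)
generate double ub = exp(xbhat + invnormal(0.975)*sehat)
tabstat muhat lb ub, by(factor) statistics(mean)

* Alternative: predicted means with delta-method limits in the Y metric
margins factor
```

The two sets of limits will generally differ a little: the exponentiated endpoints are asymmetric around the prediction, while the delta-method limits are symmetric.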
Steve Rothenberg
****************
If you recast your model as 
glm Y i.factor ... , link(log) 
no post-estimation fudges are required. -predict- automatically supplies
stuff in terms of Y, not ln Y. 
Nick 
[email protected]
-----Original Message-----
From: Steve Rothenberg [mailto:[email protected]] 
Sent: Sunday, June 05, 2011 10:27 AM
To: '[email protected]'
Subject: retransformation of ln(Y) coefficient and CI in regression
I have a simple model with a natural log dependent variable and a
three-level factor predictor.  I've used
. regress lnY i.factor, vce(robust)
to obtain estimates in the natural log metric.  I want to be able to display
the results in a graph as means and 95% CI for each level of the factor with
retransformed units in the original Y metric.
I've also calculated geometric means and 95% CI for each level of the factor
variable using 
. ameans Y if factor==x
simply as a check, though the 95% CI is not adjusted for the vce(robust)
standard error as calculated by the -regress- model.
Using naïve transformation (i.e. ignoring retransformation bias) with 
. display exp(coefficient)
from the output of -regress- for each level of the predictor, with the
classic formulation:
Level 0 = exp(constant)
Level 1 = exp(constant+coef(1))
Level 2 = exp(constant+coef(2))
the series of retransformations from the -regress- command is the same as
the geometric means from the series of -ameans- commands.
When I try to do the same with the lower and upper 95% CI (substituting the
limits of the 95% CI for the coefficients) from the -regress- command,
however, the retransformed CI is much wider than the one calculated from the
-ameans- command, much more so than the differences in standard errors from
-regress- with and without the vce(robust) option would indicate.
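A hedged sketch of one way to get limits for each level directly: -lincom- computes the standard error of constant + coefficient from the full variance-covariance matrix, including the covariance between the two estimates, and the eform option exponentiates the estimate and both limits. If the limits described above were formed by adding the separate limits of the constant and of the coefficient, that covariance would be ignored, which could explain the wider interval; that reading is an assumption, not something stated in the post. The auto dataset and the recoded factor are stand-ins, not the original data.

```stata
* Assumption: auto.dta stands in for the poster's data.
sysuse auto, clear
generate lnY = ln(price)
recode rep78 (1/2 = 0) (3 = 1) (4/5 = 2), generate(factor)

regress lnY i.factor, vce(robust)

* Retransformed level means with limits based on the SE of the whole sum
lincom _cons, eform
lincom _cons + 1.factor, eform
lincom _cons + 2.factor, eform

* Geometric mean for one level, as a check on the point estimate
ameans price if factor == 1
```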
I've discovered -levpredict- for unbiased retransformation of log dependent
variables in regression-type estimations by Christopher Baum in SSC but it
only outputs the bias-corrected means from the preceding -regress-.  To be
sure there is some small bias in the first or second decimal place of the
mean factor levels compared to naïve retransformation.
Am I doing something wrong by treating the 95% CI of each level of the
factor variable in the same way I treat the coefficients without correcting
for retransformation bias?  Is there any way I can obtain either the
retransformed CI or the bias-corrected retransformed CI for the different
levels of the factor variable in the original metric of Y?
I'd like to retain the robust SE from the above estimation as there is
considerable difference in variance in each level of the factor variable.
Steve Rothenberg
National Institute of Public Health
Cuernavaca, Morelos, Mexico
Stata/MP 11.2 for Windows (32-bit)
Born 30 Mar 2011  