
st: RE: retransformation of ln(Y) coefficient and CI in regression

From   "Steve Rothenberg" <>
To   <>
Subject   st: RE: retransformation of ln(Y) coefficient and CI in regression
Date   Sun, 5 Jun 2011 11:55:24 -0500

Thank you for the glm suggestion, Nick.  


. glm Y i.factor, vce(robust) family(gaussian) link(log)

followed by
. predict xxx, mu

the command does indeed return the factor predictions in the original Y
metric.

However, the regression table with its 95% CI is still in the original ln(Y)
units, and I am still stuck, unable to calculate the 95% CI in the original
Y metric.  The -predict- option for the prediction SE (stdp) likewise
returns only the SE in the ln(Y) metric.
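As a sketch of one possible workaround (illustrative numbers, not from the thread): since the log-link prediction is mu = exp(xb), the delta method converts the log-scale standard error that stdp reports into the original metric, se(mu) ≈ exp(xb) * se(xb).

```python
import math

# Hypothetical values from a log-link fit (made-up numbers):
# xb is the linear predictor for one factor level, se_xb its SE from stdp.
xb, se_xb = 1.8, 0.2

mu = math.exp(xb)    # prediction in the original Y metric
se_mu = mu * se_xb   # delta-method SE, since d/dxb exp(xb) = exp(xb)

print(mu, se_mu)
```

A symmetric interval mu ± 1.96 * se_mu built this way can dip below zero for noisy cells; transforming the endpoints of the log-scale interval instead avoids that.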

I've read the manual entry on glm postestimation and can find no hints on
this.

I'd welcome further suggestions for deriving the 95% confidence interval in
the original Y metric after either

. regress lnY ..., vce(robust)

or

. glm Y ..., link(log) vce(robust)

or any other estimation command.
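One standard route, sketched here with made-up numbers rather than anything from the thread: because exp() is strictly increasing, exponentiating the endpoints of the log-scale confidence interval yields a valid (asymmetric) interval for the geometric mean of Y in the original metric.

```python
import math

# Hypothetical log-scale estimate for one factor level (made-up numbers):
# b is the fitted constant + coefficient, se its (robust) standard error.
b, se = 2.30, 0.15
z = 1.96  # approximate 97.5th percentile of the standard normal

lo, hi = b - z * se, b + z * se

# exp() is monotone, so transforming the endpoints gives a valid CI for
# exp(E[ln Y]), i.e. the geometric mean of Y, in the original metric.
gm = math.exp(b)
gm_lo, gm_hi = math.exp(lo), math.exp(hi)

print(f"GM = {gm:.2f}, 95% CI = ({gm_lo:.2f}, {gm_hi:.2f})")
```

The resulting interval is asymmetric around the point estimate; that is expected, not an error. Note this covers the geometric mean only, not the arithmetic mean of Y.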

Steve Rothenberg

If you recast your model as 

glm Y i.factor ... , link(log) 

no post-estimation fudges are required. -predict- automatically supplies
stuff in terms of Y, not ln Y. 


-----Original Message-----
From: Steve Rothenberg [] 
Sent: Sunday, June 05, 2011 10:27 AM
To: ''
Subject: retransformation of ln(Y) coefficient and CI in regression

I have a simple model with a natural log dependent variable and a
three-level factor predictor.  I've used

 . regress lnY i.factor, vce(robust)

to obtain estimates in the natural log metric.  I want to be able to display
the results in a graph as means and 95% CI for each level of the factor with
retransformed units in the original Y metric.

I've also calculated geometric means and 95% CIs for each level of the factor
variable using 

. ameans Y if factor==x

simply as a check, though its 95% CI is not based on the vce(robust)
standard errors calculated by the -regress- model.

Using naïve retransformation (i.e. ignoring retransformation bias) with 

. display exp(coefficient)

from the output of -regress- for each level of the predictor, with the
classic formulation:

Level 0 = exp(constant)
Level 1 = exp(constant+coef(1))
Level 2 = exp(constant+coef(2))

the retransformed values from the -regress- command match the geometric
means from the series of -ameans- commands.
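This equivalence is exact: with only a saturated factor on the right-hand side, OLS on ln(Y) fits the mean of ln(Y) within each level, and exponentiating a mean of logs gives the geometric mean. A quick numeric check with made-up data (Python used purely for illustration):

```python
import math

# Made-up Y values for one level of the factor variable
y = [3.1, 4.7, 2.8, 5.2, 3.9]

log_mean = sum(math.log(v) for v in y) / len(y)  # what regressing ln(Y) on the factor fits
geo_mean = math.prod(y) ** (1 / len(y))          # what -ameans- reports as the geometric mean

print(math.exp(log_mean), geo_mean)  # identical up to floating-point rounding
```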

When I try to do the same with the lower and upper 95% limits (substituting
the limits of the 95% CI for the coefficients) from the -regress- command,
however, the retransformed CI is much wider than the one calculated by the
-ameans- command, much more so than the differences in standard errors from
-regress- with and without the vce(robust) option would indicate.

I've discovered -levpredict-, by Christopher Baum on SSC, for unbiased
retransformation of log dependent variables in regression-type estimations,
but it only outputs the bias-corrected means from the preceding -regress-.
To be sure, there is some small bias in the first or second decimal place of
the mean factor levels compared to naïve retransformation.

Am I doing something wrong by treating the 95% CI of each level of the
factor variable in the same way I treat the coefficients without correcting
for retransformation bias?  Is there any way I can obtain either the
retransformed CI or the bias-corrected retransformed CI for the different
levels of the factor variable in the original metric of Y?
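For the record, a standard bias correction here is Duan's (1983) smearing estimator, which multiplies the naïve retransformation by the average of the exponentiated log-scale residuals. A minimal simulation sketch (simulated data, not the poster's model):

```python
import math
import random

random.seed(1)

# Simulate ln(Y) = mu + e, e ~ N(0, sigma^2), so E[Y] = exp(mu + sigma^2 / 2)
mu, sigma, n = 1.0, 0.5, 10_000
lny = [mu + random.gauss(0, sigma) for _ in range(n)]

xb = sum(lny) / n              # the fitted value on the log scale
resid = [v - xb for v in lny]  # log-scale residuals

naive = math.exp(xb)                                 # targets the geometric mean
smear = naive * sum(math.exp(e) for e in resid) / n  # Duan smearing estimator
normal = math.exp(xb + sigma ** 2 / 2)               # normal-theory correction

print(naive, smear, normal)  # smear and normal exceed naive and agree closely here
```

Building a CI around the smeared mean is the harder part; the delta method or a bootstrap over the whole retransformation is the usual route, which fits with -levpredict- reporting only point estimates.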

I'd like to retain the robust SE from the above estimation as there is
considerable difference in variance in each level of the factor variable.

Steve Rothenberg
National Institute of Public Health
Cuernavaca, Morelos, Mexico

Stata/MP 11.2 for Windows (32-bit)
Born 30 Mar 2011  
