Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

st: RE: RE: retransformation of ln(Y) coefficient and CI in regression

From   "Steve Rothenberg" <>
To   <>
Subject   st: RE: RE: retransformation of ln(Y) coefficient and CI in regression
Date   Tue, 7 Jun 2011 17:48:28 -0500

I had already posted my rediscovery of the -predictnl- options, suggested
by Martin Weiss (a search prompted by Nick Cox's suggestion to use the
options with -predict- after -glm ln(Y) i.factor, vce(robust)- estimation),
before I discovered Roger Newson's and Maarten Buis's elegant treatments of
the problem, using the -eform- option for -regress-, listed below.
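
A minimal sketch of the -predictnl- route mentioned above, using the auto
data as a stand-in (price and foreign are assumptions standing in for Y and
factor; the new variable names muhat, lo and hi are hypothetical).
-predictnl- computes delta-method confidence limits, so the interval is
symmetric around the prediction in the original metric:

*-------------- begin example -----------------
sysuse auto, clear
* log-link GLM fitted on the untransformed outcome
glm price i.foreign, family(gaussian) link(log) vce(robust)
* predicted mean in the original metric, with delta-method 95% limits
predictnl muhat = predict(mu), ci(lo hi)
* summarize by factor level (values are constant within a level)
tabstat muhat lo hi, by(foreign)
*---------------- end example ----------------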

Thanks for the additional code, good folks, and for all the help.

Steve Rothenberg
Date: Mon, 6 Jun 2011 10:31:46 +0100
From: Roger Newson <>
Subject: Re: st: retransformation of ln(Y) coefficient and CI in regression

The -regress- command has an -eform- option, which gives the confidence 
limits of geometric means and their ratios. This is described in Newson 
(2003), and can be used together with -robust- to display 
unequal-variance confidence limits.

And, if you want to plot the confidence limits against the factor 
values, then you might like to use the -parmest-, -eclplot-, -fvregen- 
and -descsave- packages, downloadable from SSC. As in:

tempfile df0
descsave factor, do(`"`df0'"', replace)
regress lnY ibn.factor, vce(robust) noconst eform(GM/Ratio)
parmest, norestore eform
fvregen, do(`"`df0'"')
eclplot estimate min* max* factor

In this example, we start by defining a temporary file whose macro name 
is -df0-. We then use -descsave- (an extended version of -describe- 
which can create output do-files) to write a do-file to that temporary 
file, defining the variable attributes (storage type, format, variable 
label and value label) of the variable -factor-. We then use -regress-, 
with the -eform(GM/Ratio)- option to specify confidence limits for geometric 
means and/or their ratios, and the -noconst- option and the X-variable 
list -ibn.factor- to specify that the parameters will be geometric means 
instead of ratios.  We then use -parmest- to overwrite the existing 
dataset in memory with an output dataset (or resultsset), with 1 
observation per parameter and data on parameter names, estimates, 
confidence limits and other parameter attributes. In this new output 
dataset, we then use -fvregen- to regenerate the variable -factor- from 
the parameter names. Finally, we use -eclplot- to produce a confidence 
interval plot,  with the values of -factor- on the X-axis and the 
estimates and unequal-variance confidence limits for the corresponding 
geometric means on the Y-axis. More about all these packages can be 
found in the on-line help for -parmest-, which contains many hypertext 

I hope this helps.

Best wishes


On Sun, Jun 5, 2011 at 6:55 PM, Steve Rothenberg wrote:
> . glm Y i.factor, vce(robust) family(Gaussian) link(log)
> followed by
> . predict xxx, mu
> the command does indeed return the factor predictions in the original Y
> metric.
> However, the regression table with 95% CI is still in the original ln(Y)
> units and I am still stuck not being able to calculate the 95% CI in the
> original Y unit metric.

As for the regression table, you can get your coefficients in the y metric
by specifying the -eform- option:

*-------------- begin example -----------------
sysuse auto, clear
gen byte baseline = 1
gen c_mpg = mpg - 20
glm price c_mpg foreign baseline, ///
    link(log) nocons eform
*---------------- end example ----------------

In this example the domestic cars with 20 miles per gallon cost on
average 5,735 dollars. This price increases by a factor 1.36, i.e.
36%, when the car is foreign and decreases by a factor .93, i.e. -7%,
for every mile per gallon increase in mileage.
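
The -eform- output is the exponentiated coefficients together with the
exponentiated confidence limits. A sketch of the same calculation by hand
for the foreign ratio, assuming the normal-based limits that -glm- reports
by default (the scalar name z is hypothetical, and the model is fitted with
an ordinary constant rather than the baseline variable):

*-------------- begin example -----------------
sysuse auto, clear
gen c_mpg = mpg - 20
glm price c_mpg foreign, link(log)
* normal quantile for a two-sided 95% interval
scalar z = invnormal(0.975)
* exponentiate the coefficient and each confidence limit
display "ratio foreign/domestic: " exp(_b[foreign])
display "95% CI lower limit:     " exp(_b[foreign] - z*_se[foreign])
display "95% CI upper limit:     " exp(_b[foreign] + z*_se[foreign])
* mean price of a domestic car at 20 mpg
display "baseline price:         " exp(_b[_cons])
*---------------- end example ----------------

After -regress- on ln(Y) the same recipe should apply, with
invttail(e(df_r), 0.025) in place of the normal quantile.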

> The predict command for returning prediction SE
> (stdp) also only returns the SE in the ln(Y) metric.
> I'd welcome further suggestions for deriving the 95% confidence interval
> the original Y metric after either

For that type of problem I like the old -adjust- command, see: -help
adjust-. That help file says that it is superseded by the -margins-
command, but -adjust- is much easier to use if you want to create variables
(e.g. as preparation for creating graphs).
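
For comparison, a sketch of the -margins- route (Stata 11 or later), again
on the auto data. After -glm-, -margins- predicts the mean by default, so
the table is already in the original metric, with delta-method confidence
limits; unlike -adjust-, it displays results rather than creating
variables. The choice of model here is an assumption for illustration:

*-------------- begin example -----------------
sysuse auto, clear
gen c_mpg = mpg - 20
* factor-variable notation lets -margins- recognize foreign as categorical
glm price c.c_mpg i.foreign, link(log) vce(robust)
* predicted mean price for each level of foreign at c_mpg = 0 (20 mpg)
margins foreign, at(c_mpg = 0)
*---------------- end example ----------------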

Hope this helps,
