st: RE: RE: retransformation of ln(Y) coefficient and CI in regression

From	"Steve Rothenberg" <[email protected]>
To	<[email protected]>
Subject	st: RE: RE: retransformation of ln(Y) coefficient and CI in regression
Date	Tue, 7 Jun 2011 17:48:28 -0500

I had already posted my rediscovery of the -predictnl- options suggested
by Martin Weiss (a search prompted by Nick Cox's suggestion to use the
options of -predict- after -glm Y i.factor, link(log) vce(robust)-)
before I saw Roger Newson's and Maarten Buis's elegant treatments of
the problem, listed below, which use the -eform- option of -regress-
and -glm-.
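
For the record, a minimal sketch of the -predictnl- route, assuming the
shipped auto dataset stands in for the real data (-price- plays the role
of Y and -foreign- the role of the factor; the new variable names
mu_hat, mu_lb and mu_ub are made up for illustration):

```stata
* Sketch only: auto.dta stands in for the real data
sysuse auto, clear
glm price i.foreign, family(gaussian) link(log) vce(robust)

* predict(mu) is the fitted mean in the original Y metric;
* ci() stores delta-method confidence limits in two new variables
predictnl double mu_hat = predict(mu), ci(mu_lb mu_ub)

* one row per level of the factor: fitted mean and its limits
tabstat mu_hat mu_lb mu_ub, by(foreign) statistics(mean)
```

The limits come from a delta-method standard error on the Y scale, so
they are symmetric around the fitted mean, unlike limits obtained by
exponentiating the endpoints of an interval computed on the log scale.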

Thanks for the additional code, good folks, and for all the help.

Steve Rothenberg
*******************
Date: Mon, 6 Jun 2011 10:31:46 +0100
From: Roger Newson <[email protected]>
Subject: Re: st: retransformation of ln(Y) coefficient and CI in regression

The -regress- command has an -eform- option, which gives the confidence 
limits of geometric means and their ratios. This is described in Newson 
(2003), and can be used together with -robust- to display 
unequal-variance confidence limits.

And, if you want to plot the confidence limits against the factor 
values, then you might like to use the -parmest-, -eclplot-, -fvregen- 
and -descsave- packages, downloadable from SSC. As in:

tempfile df0
descsave factor, do(`"`df0'"', replace)
regress lnY ibn.factor, vce(robust) noconst eform(GM/Ratio)
parmest, norestore eform
fvregen, do(`"`df0'"')
eclplot estimate min* max* factor

In this example, we start by defining a temporary file whose macro name 
is -df0-. We then use -descsave- (an extended version of -describe- 
which can create output do-files) to write a do-file to that temporary 
file, defining the variable attributes (storage type, format, variable 
label and value label) of the variable -factor-. We then use -regress-, 
with the -eform(GM/Ratio)- option to specify confidence limits for geometric 
means and/or their ratios, and the -noconst- option and the X-variable 
list -ibn.factor- to specify that the parameters will be geometric means 
instead of ratios.  We then use -parmest- to overwrite the existing 
dataset in memory with an output dataset (or resultsset), with 1 
observation per parameter and data on parameter names, estimates, 
confidence limits and other parameter attributes. In this new output 
dataset, we then use -fvregen- to regenerate the variable -factor- from 
the parameter names. Finally, we use -eclplot- to produce a confidence 
interval plot, with the values of -factor- on the X-axis and the 
estimates and unequal-variance confidence limits for the corresponding 
geometric means on the Y-axis. More about all these packages can be 
found in the on-line help for -parmest-, which contains many hypertext 
references.
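
For readers without the SSC packages installed, the -regress- step on
its own can be sketched with official Stata commands only. This is an
illustration under assumptions, not part of the original code: the auto
dataset stands in for the real data, -lnprice- is a made-up variable
name, and -ameans- is used merely as a cross-check that the
exponentiated coefficients are the group geometric means.

```stata
* Sketch only: auto.dta stands in for the real data
sysuse auto, clear
generate double lnprice = ln(price)

* ibn. + noconstant: one parameter per level of the factor, so each
* exponentiated coefficient is a geometric mean rather than a ratio
regress lnprice ibn.foreign, noconstant vce(robust) eform(GM)

* cross-check for the first level: geometric mean computed directly
display "exp(_b) for foreign==0: " exp(_b[0.foreign])
ameans price if foreign == 0
```

The point estimates should agree; the confidence limits from -ameans-
will generally differ from the -regress- ones, because the latter use
the robust (unequal-variance) standard errors.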

I hope this helps.

Best wishes

Roger
*******************

On Sun, Jun 5, 2011 at 6:55 PM, Steve Rothenberg wrote:
> . glm Y i.factor, vce(robust) family(Gaussian) link(log)
>
> followed by
>
> . predict xxx, mu
>
> the command does indeed return the factor predictions in the original Y
> metric.
>
> However, the regression table with 95% CI is still in the original ln(Y)
> units and I am still stuck not being able to calculate the 95% CI in the
> original Y unit metric.

As for the regression table, you can get your coefficients in the y metric
by specifying the -eform- option:

*-------------- begin example -----------------
sysuse auto, clear
gen byte baseline = 1
gen c_mpg = mpg - 20
glm price c_mpg foreign baseline, ///
    link(log) nocons eform
*---------------- end example ----------------

In this example, domestic cars with 20 miles per gallon cost on
average 5,735 dollars. This price increases by a factor of 1.36, i.e.
36%, when the car is foreign, and decreases by a factor of .93, i.e. -7%,
for every mile-per-gallon increase in mileage.
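
The arithmetic behind that reading can be sketched as follows, refitting
the same model and exponentiating the stored coefficients by hand. The
-lincom, eform- line is an added illustration, not part of the example
above; it reports one ratio with its confidence limits in the y metric.

```stata
* Sketch only: same model as in the example, refitted without eform
sysuse auto, clear
gen byte baseline = 1
gen c_mpg = mpg - 20
glm price c_mpg foreign baseline, link(log) nocons

* exp(coefficient) is a mean for baseline and a ratio for the others
display "baseline mean price : " exp(_b[baseline])
display "foreign/domestic    : " exp(_b[foreign])
display "% change per mpg    : " 100*(exp(_b[c_mpg]) - 1)

* a single ratio with its CI, exponentiated from the log scale
lincom foreign, eform
```

Because the limits are exponentiated from the log scale, the interval
is asymmetric around the ratio.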

> The predict command for returning prediction SE
> (stdp) also only returns the SE in the ln(Y) metric.
>
> I'd welcome further suggestions for deriving the 95% confidence interval
> in the original Y metric after either

For that type of problem I like the old -adjust- command; see -help
adjust-. That help file says that -adjust- has been superseded by the
-margins- command, but -adjust- is much easier to use if you want to
create variables (e.g. as preparation for creating graphs).
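
For comparison, a minimal sketch of the -margins- route, assuming the
auto dataset again stands in for the real data (-price- for Y, -foreign-
for the factor). After -glm-, -margins- predicts the mean (mu) by
default, so the table is in the original Y metric.

```stata
* Sketch only: auto.dta stands in for the real data
sysuse auto, clear
glm price i.foreign, family(gaussian) link(log) vce(robust)

* predicted mean of price at each level of the factor,
* with delta-method standard errors and 95% confidence limits
margins foreign
```

-margins- only displays and returns these results; it does not create
variables, which is the convenience of -adjust- mentioned above.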

Hope this helps,
Maarten
