Thanks for all the useful comments.
Just to clarify the issue: for example, should the predictions based on
log(E[price]) = XG (fit with -glm-) be identical to the predictions
generated from E[log(price)] = XB (fit by -regress-, generating
B_hat), once the latter are adjusted properly?
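For intuition, here is a small simulation sketch of that gap, written in
Python rather than Stata purely as an illustration. The simulated
lognormal data and the Duan-type smearing factor used as the "proper
adjustment" are assumptions of the sketch, not something established in
this thread:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
x = rng.uniform(0, 2, n)
sigma = 0.8
# Assumed true model: log(y) = 1 + 0.5*x + e, with e ~ N(0, sigma^2)
y = np.exp(1.0 + 0.5 * x + rng.normal(0, sigma, n))

# OLS of log(y) on x: estimates B in E[log(y)] = XB
X = np.column_stack([np.ones(n), x])
b_hat, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
resid = np.log(y) - X @ b_hat

naive = np.exp(X @ b_hat)             # exp(XB_hat): biased low for E[y|x]
smear = naive * np.exp(resid).mean()  # Duan-style smearing retransformation

print(y.mean(), naive.mean(), smear.mean())
```

Under normal errors on the log scale, the naive exp(XB_hat) misses
E[y|x] by a factor of roughly exp(-sigma^2/2), which the smearing factor
recovers; without some such correction the two sets of predictions will
not match.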
What would you suggest for predictions based on a Box-Cox (left-hand-
side) transformation? A two-step procedure, first estimating the Box-Cox
transformation parameter and then using that parameter in a GLM to
generate predicted values?
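For the first step, one plausible sketch (in Python, with simulated
data): estimate lambda by maximizing the concentrated profile
likelihood. The grid search below is a generic stand-in for what Stata's
-boxcox- does directly, and the true lambda of 0.5 is an assumption of
the simulation; the second step, plugging lambda-hat into a power-link
GLM, is left as posed in the question.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
x = rng.uniform(0, 2, n)
X = np.column_stack([np.ones(n), x])
# Simulated prices whose square root is linear in x (true lambda = 0.5)
y = (2.0 + 1.0 * x + rng.normal(0, 0.3, n)) ** 2

def boxcox(y, lam):
    # Box-Cox transform; log(y) at lambda = 0
    return np.log(y) if abs(lam) < 1e-8 else (y ** lam - 1.0) / lam

def profile_ll(lam):
    # Profile log-likelihood: OLS on the transformed scale, plus Jacobian
    z = boxcox(y, lam)
    rss = np.sum((z - X @ np.linalg.lstsq(X, z, rcond=None)[0]) ** 2)
    return -0.5 * n * np.log(rss / n) + (lam - 1.0) * np.log(y).sum()

grid = np.linspace(-1, 2, 301)
lam_hat = grid[np.argmax([profile_ll(l) for l in grid])]
print("estimated lambda:", lam_hat)
```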
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Phil Schumm
> Sent: Wednesday, April 12, 2006 11:57 AM
> To: [email protected]
> Subject: st: Re: Generating predicted values for OLS with
> transformed dependent variables
>
>
> On Apr 12, 2006, at 9:31 AM, Nick Cox wrote:
> > As posted earlier, -glm- offers the back-to-basics Jacobin solution
> > as an alternative to this use of Jacobians.
>
> On Apr 12, 2006, at 11:00 AM, Rodrigo A. Alfaro wrote:
> > I hadn't explored the -glm- command before, so I tried the following:
> >
> > sysuse auto
> > g lnp=ln(price)
> > glm price, family(normal) link(log) nohead nolog
> > reg lnp, nohead
> >
> > and the coefficient (say, mu) is different. Is there something
> > that I am missing? A normalization issue?
>
>
> To expand on Nick's suggestion, one of the primary features of the
> GLM approach (as opposed to modeling a transformed variable) is to
> obtain predictions on the raw (i.e., untransformed) scale.
> So GLM is
> absolutely an important alternative to consider if this is a
> requirement.
>
> The reason your results are different is that you've fit two
> different models. They are:
>
> E[log(price)] = XB (fit by -regress-, generating B_hat)
>
> and
>
> log(E[price]) = XG (fit by -glm-)
>
> One can show that under certain conditions, you can consistently
> estimate G by B_hat (except for the intercept), but if those
> conditions aren't met, B_hat will be estimating something
> different.
> Naively assuming that B_hat estimates G is a common mistake people
> make when interpreting the results of a regression on a transformed
> variable.
>
> The documentation on -glm- in [R] is a good start, but if you're
> using this for anything important, I'd strongly suggest picking up a
> copy of Generalized Linear Models (by McCullagh and Nelder), in
> particular the chapters "An outline of generalized linear
> models" and
> "Models with constant coefficient of variation".
>
>
> -- Phil
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>