Statalist The Stata Listserver



st: RE: Re: Generating predicted values for OLS with transformed dependent variables


From   "Daniel Schneider" <[email protected]>
To   <[email protected]>
Subject   st: RE: Re: Generating predicted values for OLS with transformed dependent variables
Date   Wed, 12 Apr 2006 12:47:47 -0700

Thanks for all the useful comments.

Just to clarify the issue: for example, should the predictions based on
log(E[price]) = XG (fit by -glm-) be identical to the predictions
generated from E[log(price)] = XB (fit by -regress-, generating B_hat),
once the latter are adjusted properly?
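For concreteness, here is a rough sketch of the comparison I have in
mind (the covariates mpg and weight are only illustrative, and the
retransformation step assumes homoskedastic normal errors on the log
scale):

sysuse auto, clear
gen lnp = ln(price)

* Model 1: log(E[price]) = XG, fit directly by -glm- with a log link
glm price mpg weight, family(normal) link(log)
predict double p_glm, mu

* Model 2: E[log(price)] = XB, fit by -regress-, then retransformed
* with the normal-theory adjustment exp(XB_hat + 0.5*sigma_hat^2)
regress lnp mpg weight
predict double xb_ols, xb
gen double p_ols = exp(xb_ols + 0.5*e(rmse)^2)

list p_glm p_ols in 1/5

(In the auto data the two columns need not coincide, which I suppose is
exactly the point at issue.)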

What would you suggest for predictions based on a Box-Cox (left-hand
side) transformation? A two-step procedure, first estimating the
Box-Cox transformation parameter and then using that parameter in a GLM
to generate predicted values?
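A rough sketch of the two-step idea, just to make sure I am describing
it correctly (the covariates and the theta value below are placeholders,
not results):

* step 1: estimate the left-hand-side Box-Cox transformation parameter
sysuse auto, clear
boxcox price mpg weight, model(lhsonly)

* step 2: plug the estimated parameter (theta) into -glm- as a power
* link, so that predictions come back on the raw price scale;
* 0.5 below is only a placeholder -- substitute the actual estimate
local theta_hat = 0.5
glm price mpg weight, family(normal) link(power `theta_hat')
predict double p_hat, mu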

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Phil Schumm
> Sent: Wednesday, April 12, 2006 11:57 AM
> To: [email protected]
> Subject: st: Re: Generating predicted values for OLS with 
> transformed dependent variables
> 
> 
> On Apr 12, 2006, at 9:31 AM, Nick Cox wrote:
> > As posted earlier, -glm- offers the back-to-basics Jacobin solution
> > as an alternative to this use of Jacobians.
> 
> On Apr 12, 2006, at 11:00 AM, Rodrigo A. Alfaro wrote:
> > I hadn't explored the -glm- command before, so I tried the following:
> >
> > sysuse auto
> > g lnp=ln(price)
> > glm price, family(normal) link(log) nohead nolog
> > reg lnp, nohead
> >
> > and the coefficient (let's say mu) is different. Is there something
> > that I am missing? A normalization issue?
> 
> 
> To expand on Nick's suggestion, one of the primary features of the
> GLM approach (as opposed to modeling a transformed variable) is to
> obtain predictions on the raw (i.e., untransformed) scale.  So GLM is
> absolutely an important alternative to consider if this is a
> requirement.
> 
> The reason your results are different is that you've fit two  
> different models.  They are:
> 
> E[log(price)] = XB    (fit by -regress-, generating B_hat)
> 
> and
> 
> log(E[price]) = XG    (fit by -glm-)
> 
> One can show that under certain conditions, you can consistently
> estimate G by B_hat (except for the intercept), but if those
> conditions aren't met, B_hat will be estimating something different.
> Naively assuming that B_hat estimates G is a common mistake people
> make when interpreting the results of a regression on a transformed
> variable.
> 
> The documentation on -glm- in [R] is a good start, but if you're
> using this for anything important, I'd strongly suggest picking up a
> copy of Generalized Linear Models (by McCullagh and Nelder), in
> particular the chapters "An outline of generalized linear models" and
> "Models with constant coefficient of variation".
> 
> 
> -- Phil
> 

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
