Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: getting realistic fitted values from a regression

From	David Jacobs <[email protected]>
To	[email protected]
Subject	Re: st: getting realistic fitted values from a regression
Date	Thu, 22 Jul 2010 18:29:59 -0400

Maarten states the received wisdom on this issue, but in the econometrics text by Jeffrey Wooldridge (Introductory Econometrics, Thomson South-Western, 2003), on pp. 208-9, Wooldridge suggests a way to obtain unlogged predictions from a regression in which the regressand is in log form (there have been subsequent editions of this book, but the page numbers I give will be close in those newer editions). If one of the statistical experts on this list is familiar with this approach or is willing to look it up, I'd be interested in their reaction.

That said, I wholeheartedly agree with Maarten's recommendation. I found the article he suggests by Cox et al. to be extremely useful and I'm grateful to him for suggesting it on another occasion.


David Jacobs

At 03:08 AM 7/22/2010, you wrote:

--- On Wed, 21/7/10, Woolton Lee wrote:
> I have estimated a regression (OLS) using log of patient
> travel distance to a hospital predicted by patient, hospital
> and area characteristics.  I am going to report the results
> as marginal effects that I've computed by obtaining
> predictions from my estimated regression computed by fixing
> some variables and keeping others at their original values.
>  However after I compute the predictions, I am getting
> unrealistically large numbers.  When I examined the regression
> residuals it looks as though the obs with unrealistic fitted
> values have larger residuals.  Is there a way to adjust the
> regression to better account for this problem?

If you want to predict the travel distance you should use
-glm- with the -link(log)- option rather than -regress- on
a log-transformed dependent variable. The difference is that
with the former you are modeling log(E(y)), while with the latter
you are modeling E(log(y)). If you backtransform your
predictions using the antilog transformation you will get
exp(log(E(y))) = E(y) after -glm-, while after -regress-
you get exp(E(log(y))) != E(y). A nice discussion of this issue
can be found in:

Nicholas J. Cox, Jeff Warburton, Alona Armstrong, Victoria J. Holliday
(2007) "Fitting concentration and load rating curves with generalized
linear models" Earth Surface Processes and Landforms, 33(1):25--39.
<http://www3.interscience.wiley.com/journal/114281617/abstract>
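
The distinction between log(E(y)) and E(log(y)) can be checked numerically. What follows is a minimal sketch in Python (standard library only), not Stata code and not taken from the cited article. The data are simulated and hypothetical: log(y) is drawn as normal with mean mu and standard deviation sigma, both values picked for illustration, so y is lognormal and the antilog of the mean log falls below the mean of y.

```python
import math
import random

random.seed(12345)

# Hypothetical data (assumption): log(y) ~ Normal(mu, sigma^2), so y is
# lognormal and its theoretical mean is exp(mu + sigma^2 / 2).
mu, sigma = 2.0, 1.0
y = [math.exp(random.gauss(mu, sigma)) for _ in range(100_000)]

mean_y = sum(y) / len(y)                           # estimates E(y)
mean_log_y = sum(math.log(v) for v in y) / len(y)  # estimates E(log(y))

# Antilog of the mean log: what backtransforming a log-scale fit gives.
naive = math.exp(mean_log_y)

# Antilog of the log of the mean: recovers the mean itself.
direct = math.exp(math.log(mean_y))

print(f"exp(E(log y))    = {naive:.2f}")
print(f"exp(log(E(y)))   = {direct:.2f}")
print(f"theoretical E(y) = {math.exp(mu + sigma ** 2 / 2):.2f}")
```

In this intercept-only setting exp(E(log(y))) is the geometric mean of y, which cannot exceed the arithmetic mean, so the gap shown here is systematic rather than sampling noise.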

There exist approximations you can use after -regress- to fix
this problem, but why try to fix a problem if you can easily prevent
it?
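
For readers who want to see what one such approximation looks like, here is a minimal sketch of a smearing-type correction: exponentiate the fitted values from the log-scale regression, then multiply by the average of the exponentiated residuals. It is written in Python (standard library only), the data are simulated, and the coefficients and error spread are hypothetical. Whether this is exactly the approximation meant above is an assumption; it is shown only as one commonly used option.

```python
import math
import random

random.seed(2010)

# Hypothetical data (assumption): log(y) = 1.0 + 0.5*x + e, e ~ N(0, 0.8^2).
n = 20_000
x = [random.uniform(0, 4) for _ in range(n)]
log_y = [1.0 + 0.5 * xi + random.gauss(0, 0.8) for xi in x]
y = [math.exp(v) for v in log_y]

# OLS of log(y) on x: closed form for one regressor plus a constant.
x_bar = sum(x) / n
ly_bar = sum(log_y) / n
sxy = sum((xi - x_bar) * (li - ly_bar) for xi, li in zip(x, log_y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = ly_bar - slope * x_bar

fitted_log = [intercept + slope * xi for xi in x]
resid = [li - fi for li, fi in zip(log_y, fitted_log)]

# Smearing factor: mean of the exponentiated log-scale residuals.
smear = sum(math.exp(r) for r in resid) / n

naive = [math.exp(f) for f in fitted_log]   # plain antilog of fitted values
corrected = [smear * v for v in naive]      # antilog times smearing factor

mean_y = sum(y) / n
print(f"smearing factor        = {smear:.3f}")
print(f"mean of y              = {mean_y:.3f}")
print(f"mean naive prediction  = {sum(naive) / n:.3f}")
print(f"mean corrected predict = {sum(corrected) / n:.3f}")
```

The correction rescales every prediction by the same constant, so it relies on the log-scale errors having the same distribution at all values of the predictors. The log-link approach recommended above does not need that step.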

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------




*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


Follow-Ups:
- RE: st: getting realistic fitted values from a regression
  - From: "Nick Cox" <[email protected]>

References:
- st: getting realistic fitted values from a regression
  - From: Woolton Lee <[email protected]>
- Re: st: getting realistic fitted values from a regression
  - From: Maarten buis <[email protected]>
