Re: st: Comparison of the R-squared in a loglog and linear model


From   Austin Nichols <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Comparison of the R-squared in a loglog and linear model
Date   Fri, 18 Jun 2010 11:17:25 -0400

Kit et al.--
Duan's smearing method is one approach to dealing with a logged
depvar; a better approach is to use a regression technique that
respects the functional form, like -poisson- (or another member of the
-glm- family). But you still cannot compare the R-squared across
non-nested models and hope to conclude anything about which model is
better from that information alone.  Mean squared prediction error in
levels for the nonzero outcomes seems a reasonable criterion for
rejecting the log(y) regression model below.
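
For reference, the two retransformations being compared can be written out. This is a sketch in my own notation (the symbols are assumptions, not taken from the commands used in this post): x_i is the regressor vector, \hat\beta the OLS coefficients from the log regression, \hat\varepsilon_i its residuals, and \hat\sigma^2 its estimated error variance.

```latex
% Log-linear model and its implied conditional mean in levels
\ln y_i = x_i'\beta + \varepsilon_i
\quad\Longrightarrow\quad
E[y_i \mid x_i] = \exp(x_i'\beta)\, E[\exp(\varepsilon_i) \mid x_i]

% Normal-theory retransformation (homoskedastic normal errors)
\hat{y}_i^{\text{norm}} = \exp\!\left(x_i'\hat\beta + \hat\sigma^2/2\right)

% Duan's smearing estimator (errors i.i.d., not necessarily normal)
\hat{y}_i^{\text{duan}} = \exp(x_i'\hat\beta)\cdot \frac{1}{n}\sum_{j=1}^{n} \exp(\hat\varepsilon_j)
```

Both corrections multiply exp(x_i'b) by a single constant, so neither helps if the error variance depends on x; that is one reason to prefer modeling E[y|x] directly.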

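* load the example data
use http://fmwww.bc.edu/ec-p/data/mus/mus03data, clear
* OLS in levels
qui reg totexp suppins phylim actlim totchr age female income
predict xb
* OLS in logs, retransformed to levels with -levpredict- (findit levpredict)
qui reg ltotexp suppins phylim actlim totchr age female income
* default retransformation (normal theory), then Duan's smearing
levpredict tenorm
levpredict teduan, duan print
* count-data models fit to the level outcome
qui poisson totexp suppins phylim actlim totchr age female income
predict tepois
qui nbreg totexp suppins phylim actlim totchr age female income
predict tenbreg
* compare predictions to the observed outcome
su totexp xb te*
su totexp xb te* if totexp>0
corr totexp xb te*
* squared prediction errors in levels, scaled by 1e6; their means are the MSEs
g mse_xb=(totexp-xb)^2/1e6
g mse_norm=(totexp-tenorm)^2/1e6
g mse_duan=(totexp-teduan)^2/1e6
g mse_pois=(totexp-tepois)^2/1e6
g mse_nbreg=(totexp-tenbreg)^2/1e6
su mse*
su mse* if totexp>0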

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      mse_xb |      2955    127.0504    642.6503     .00005   12779.11
    mse_norm |      2955    142.4353    641.0374   3.32e-06   11744.09
    mse_duan |      2955    140.7604    644.1605   .0000549   11842.16
    mse_pois |      2955    128.3255    648.1356   4.52e-06   12841.78
   mse_nbreg |      2955    131.8694    642.3027   2.48e-06   12432.65
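
As a sketch of what "another member of the -glm- family" could look like here: a gamma GLM with a log link is one common choice for skewed positive outcomes. This was not run for the table above, so no results are shown. The gamma family, the restriction to totexp>0, and the names tegamma and mse_gamma are my assumptions for illustration.

```stata
* gamma GLM with log link, fit on the nonzero outcomes only (an assumption;
* the other models above were fit on their full estimation samples)
qui glm totexp suppins phylim actlim totchr age female income if totexp>0, ///
    family(gamma) link(log)
* -predict- after -glm- gives the predicted mean (mu) by default
predict tegamma
* squared prediction error on the same scale as the mse_* variables above
g mse_gamma=(totexp-tegamma)^2/1e6
su mse_gamma if totexp>0
```

The prediction is already in levels, so no retransformation step is needed before computing the squared errors.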

For those enamored of scatter plots for this kind of comparison, much
more work is required to get a good picture of fit.  This is one
approach:

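* cube roots compress the long right tail so the points are easier to see
g cr_te=totexp^(1/3)
* OLS predictions in levels can be negative, so take a signed cube root
g cr_xb=sign(xb)*abs(xb)^(1/3)
g cr_norm=tenorm^(1/3)
g cr_duan=teduan^(1/3)
g cr_pois=tepois^(1/3)
g cr_nbreg=tenbreg^(1/3)
* cr_* includes cr_te, so observed plotted against itself traces the 45-degree line
sc cr_* cr_te if totexp>0, msize(1 1 1 1 1 1)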

On Fri, Jun 18, 2010 at 9:47 AM, Christopher Baum <kit.baum@bc.edu> wrote:
> <>
> On Jun 18, 2010, at 2:33 AM, Natalie wrote:
>
>> Can I not obtain the antilog predicted values for the log-log
>> model and compute the R-squared between the antilog of the observed and
>> predicted values, and then compare this R-squared with the R-squared
>> obtained from OLS estimation of the linear model?
>>
>> There are other statistical programs that can do this automatically, but
>> as I work with Stata, I'd rather do it with this program.
>
>
> findit levpredict
>
> Generate the level form of the dependent variable (correctly, using this routine) and then
> compute the squared correlation between that and the original level variable. That will be the
> R^2 of the log form of the regression.
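
The calculation Kit describes in the quoted text can be sketched as follows. It assumes totexp and teduan already exist as created by the commands earlier in this message; the display text is mine.

```stata
* squared correlation between the observed level outcome and the
* retransformed prediction from the log regression
qui corr totexp teduan
di "squared correlation in levels: " %6.4f r(rho)^2
```

Replacing teduan with xb gives the matching figure for the linear model. As noted at the top of this message, the two numbers alone do not settle which model is better.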