

From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Comparison of the R-squared in a loglog and linear model
Date: Fri, 18 Jun 2010 11:17:25 -0400

Kit et al.--
Duan's smearing method is one approach to dealing with a logged depvar; a better approach is to use a regression technique that respects the functional form, like -poisson- (or another member of the -glm- family). But you still cannot compare the R-squared across non-nested models and hope to conclude anything about which model is better from that information alone. Mean squared prediction error in levels for the nonzero outcomes seems a reasonable criterion for rejecting the log(y) regression model below.

    use http://fmwww.bc.edu/ec-p/data/mus/mus03data, clear
    qui reg totexp suppins phylim actlim totchr age female income
    predict xb
    qui reg ltotexp suppins phylim actlim totchr age female income
    levpredict tenorm
    levpredict teduan, duan print
    qui poisson totexp suppins phylim actlim totchr age female income
    predict tepois
    qui nbreg totexp suppins phylim actlim totchr age female income
    predict tenbreg
    su totexp xb te*
    su totexp xb te* if totexp>0
    corr totexp xb te*
    g mse_xb=(totexp-xb)^2/1e6
    g mse_norm=(totexp-tenorm)^2/1e6
    g mse_duan=(totexp-teduan)^2/1e6
    g mse_pois=(totexp-tepois)^2/1e6
    g mse_nbreg=(totexp-tenbreg)^2/1e6
    su mse*
    su mse* if totexp>0

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
          mse_xb |      2955    127.0504    642.6503     .00005   12779.11
        mse_norm |      2955    142.4353    641.0374   3.32e-06   11744.09
        mse_duan |      2955    140.7604    644.1605   .0000549   11842.16
        mse_pois |      2955    128.3255    648.1356   4.52e-06   12841.78
       mse_nbreg |      2955    131.8694    642.3027   2.48e-06   12432.65

For those enamored of scatter plots for this kind of comparison, much more work is required to get a good picture of fit.
This is one approach:

    g cr_te=totexp^(1/3)
    g cr_xb=sign(xb)*abs(xb)^(1/3)
    g cr_norm=tenorm^(1/3)
    g cr_duan=teduan^(1/3)
    g cr_pois=tepois^(1/3)
    g cr_nbreg=tenbreg^(1/3)
    sc cr_* cr_te if totexp>0, msize(1 1 1 1 1 1)

On Fri, Jun 18, 2010 at 9:47 AM, Christopher Baum <kit.baum@bc.edu> wrote:
> <>
> On Jun 18, 2010, at 2:33 AM, Natalie wrote:
>
>> Can I not maybe obtain the antilog predicted values for the log log
>> model and compute the R-squared between the antilog of the observed and
>> predicted values. And then compare this R-square with the R-square
>> obtained from OLS estimation of the linear model?
>>
>> There are other statistical programs that can do this automatically, but
>> as I work with Stata, I'd rather do it with this program.
>
>
> findit levpredict
>
> Generate the level form of the dependent variable (correctly, using this routine) and then
> compute the squared correlation between that and the original level variable. That will be the
> R^2 of the log form of the regression.
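For readers who want to see what the quoted suggestion involves, here is a minimal by-hand sketch of a Duan-style retransformation followed by the squared correlation in levels. It is an illustration under assumptions, not the -levpredict- source: the variable names lxb, ehat, expe and te_byhand are made up for this example, and the result should only be expected to be close to what -levpredict, duan- returns, which I have not checked line by line.

```stata
* Sketch only: lxb, ehat, expe and te_byhand are hypothetical names
use http://fmwww.bc.edu/ec-p/data/mus/mus03data, clear
qui reg ltotexp suppins phylim actlim totchr age female income

* Linear prediction and residuals on the log scale, estimation sample only
predict double lxb if e(sample), xb
predict double ehat if e(sample), residuals

* Smearing factor: the sample mean of exp(residual)
gen double expe = exp(ehat)
qui su expe, meanonly

* Level prediction: exp(xb) scaled by the smearing factor
gen double te_byhand = exp(lxb)*r(mean)

* Squared correlation between the level prediction and observed totexp
qui corr totexp te_byhand
di "squared correlation in levels = " %6.4f r(rho)^2
```

The squared correlation is unchanged by the smearing factor, since that factor is a constant multiplier; it matters for the level of the predictions and hence for the squared prediction errors, not for the correlation.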

