
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: FW: Model SS/R-square in nl


From   Steven Samuels <[email protected]>
To   [email protected]
Subject   Re: st: FW: Model SS/R-square in nl
Date   Thu, 30 Jun 2011 22:03:54 -0400

One consequence of the fact that the mean-only model is not nested in the no-constant model is that SSE can exceed SST, so that R-square = 1 - SSE/SST < 0.

In this example y is constant, so SST = 0 whereas SSE > 0, and 1 - SSE/SST is not even defined. Thus I believe that Gordon is incorrect and that the traditional approach is correct.


**********************
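clear
scalar drop _all
* x = 0, 1, ..., 10; y is the same for every observation
range x 0 10 11
gen y = 10
sum x
sum y
* centered total SS = (n-1)*Var(y), which is 0 because y is constant
scalar n = r(N)
scalar var = r(Var)
scalar sstot = (n-1)*var

scalar list sstot
* fit through the origin: the residual SS is positive
reg y x, nocons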
******************
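
As a follow-up check, here is a minimal sketch that puts the three sums of squares for this example side by side. The scalar names (sst, sse, sst0) and the variable ysq are my own choices for illustration; e(rss) and r(sum) are the results stored by -regress- and -summarize-.

```stata
* Rebuild the example: x = 0, 1, ..., 10 and y = 10 for every observation
clear
set obs 11
generate x = _n - 1
generate y = 10

* Centered total SS, sum of (y - ybar)^2: zero because y never varies
quietly summarize y
scalar sst = (r(N) - 1) * r(Var)

* Residual SS from the fit through the origin, as stored by -regress-
quietly regress y x, noconstant
scalar sse = e(rss)

* Uncentered total SS, sum of y^2 (hypothetical helper variable ysq)
generate double ysq = y^2
quietly summarize ysq
scalar sst0 = r(sum)

display "centered SST   = " scalar(sst)
display "SSE            = " scalar(sse)
display "uncentered SST = " scalar(sst0)
display "1 - SSE/SST0   = " (1 - scalar(sse)/scalar(sst0))
```

With the centered total equal to zero, 1 - SSE/SST cannot be evaluated at all, while the version based on the uncentered total stays between 0 and 1.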


Steve




Some people agree with you, e.g.

H. A. Gordon.  Errors in Computer Packages. Least Squares Regression Through the Origin
Journal of the Royal Statistical Society. Series D (The Statistician) Vol. 30, No. 1 (Mar., 1981), pp. 23-29

But others don't, and the "error" is well-established. If you take your point of view, you have to justify an ANOVA table with the following d.f., taking p = 1 regressor.

SS      d.f.
Model   1
Error   n - 1
Total   n - 1 ?

This problem arises because the mean-only model is _not_ nested in the no-constant model as standard LS theory requires.

You can achieve a "nesting" by fitting no mean, getting:
Model   1
Error   n - 1
Total   n
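
For reference, the two total-SS conventions behind these tables can be written out. This is standard linear least-squares algebra rather than anything stated in the thread; the notation is assumed here, with \hat{y}_i the fitted values, e_i = y_i - \hat{y}_i the residuals, and p the number of fitted parameters.

```latex
% No-constant model: the residuals are orthogonal to the fitted values,
% so the uncentered total splits exactly, with d.f. n = p + (n - p).
\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} \hat{y}_i^2 + \sum_{i=1}^{n} e_i^2

% Model with a constant: the residuals also sum to zero,
% so the centered total splits, with d.f. n - 1 = (p - 1) + (n - p).
\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} e_i^2

% The two totals differ by the part of y carried by its mean.
\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 + n\bar{y}^2
```

Without a constant the residuals need not sum to zero, so the second identity can fail; that is how SSE can exceed the centered total.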

The main benefit to the "SST must be the same for all models" approach, I think, is that one can compare R2 consistently for the same data set as R2 =  1 - SSE/SST. 


Steve
[email protected]

On Jun 30, 2011, at 4:24 PM, CJ Lan wrote:

If you look at the residual SS, i.e., sum of (yi-yhat)^2, the 1st model renders 28315 and the 2nd model renders 28427, which sounds reasonable because one parameter is eliminated.  My point is the Total SS, i.e., sum of (yi-mean(y))^2, should not be changed (=39434).  Therefore, in the 2nd model, the Model SS = (Total SS)-(residual SS) = 39434-28427 = 11007 and the R2 should have been 0.2791, which is the answer I got from Matlab.
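
The arithmetic in this paragraph can be checked against the printed output. The sketch below uses only the sums of squares reported in the two nl tables in this thread; the scalar names are made up for illustration, and the "implied mean" line assumes the identity uncentered total = centered total + n*mean^2.

```stata
* Sums of squares copied from the two nl outputs in this thread
* residual SS, 3-parameter and 2-parameter models
scalar rss3  = 28315.6009
scalar rss2  = 28426.9672
* centered total SS, sum of (y - ybar)^2, from output (1)
scalar tss_c = 39433.9482
* uncentered total SS, sum of y^2, from output (2)
scalar tss_u = 205933.57

display "3-parameter, centered total:   R2 = " %6.4f (1 - scalar(rss3)/scalar(tss_c))
display "2-parameter, centered total:   R2 = " %6.4f (1 - scalar(rss2)/scalar(tss_c))
display "2-parameter, uncentered total: R2 = " %6.4f (1 - scalar(rss2)/scalar(tss_u))

* The gap between the two totals is n*ybar^2 with n = 152
display "implied |mean of passby| = " %6.2f sqrt((scalar(tss_u) - scalar(tss_c))/152)
```

The second line reproduces the 0.2791 quoted from Matlab and the third reproduces the 0.8620 that nl prints, so the two programs differ only in which total SS they divide by.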

The curve will not be forced through the origin.  At A = 0, the curve of the 1st model starts at b0+b1 = 45.7 and decreases at an exponential rate.  Similarly, the curve of the 2nd model starts at b1 = 44.5 and decreases at a similar exponential rate.
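
To compare the two fitted curves numerically, one can evaluate them from the printed coefficients. This is only a sketch: the coefficients are copied from the two nl outputs in this thread, the scalar names are invented for illustration, and the grid of A values is hypothetical because the range of A in the data is not shown.

```stata
* Coefficients copied from the two nl outputs in this thread
* 3-parameter model: passby = b0 + b1*b2^A
scalar b0_3 = 11.59292
scalar b1_3 = 34.10476
scalar b2_3 = .998132
* 2-parameter model: passby = b1*b2^A
scalar b1_2 = 44.54536
scalar b2_2 = .9988862

* Hypothetical grid of A values; the real range of A is not given
foreach a of numlist 0 100 500 1000 {
    display "A = " %5.0f `a'
    display "   3-parameter fit: " %7.3f (scalar(b0_3) + scalar(b1_3)*scalar(b2_3)^`a')
    display "   2-parameter fit: " %7.3f (scalar(b1_2)*scalar(b2_2)^`a')
}
```

At A = 0 the two expressions reduce to b0 + b1 and b1, the starting values discussed above.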

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
Sent: Thursday, June 30, 2011 4:05 PM
To: [email protected]
Subject: Re: st: FW: Model SS/R-square in nl

No, it is not a bug.

Your constant may not be significant by itself, but the model is
different. R-squares for different models are often difficult to
compare effectively.

Plot the fitted curves and the data to see what it is going on.

In my experience, especially with nonlinear models, it is far better
to rely on physical, biological, economic or other scientific
understanding to choose the better model and to compare fitted curves
with the data, rather than to rely blindly on a significance test.
Does it make sense to force the curve through the origin?

Nick

On Thu, Jun 30, 2011 at 6:06 PM, CJ Lan <[email protected]> wrote:
> I was using nl to estimate a 3-parameter NLS model and got R2=0.28
> (see the first output).  Since the parameter b0 is insignificant, I dropped
> it and re-estimated the model.  This time, I got the wrong R2 (=0.86 in
> the 2nd output).  It is apparent that either the "Model SS" or the "Total
> SS" is wrongly calculated.  Is this a bug?  Thank you for your help.
> 
> (1)
> . nl exp3 : passby A in 1/152
> (obs = 152)
> Iteration 0:  residual SS =  29741.65
> Iteration 1:  residual SS =  28448.53
> Iteration 2:  residual SS =  28316.37
> Iteration 3:  residual SS =  28315.61
> Iteration 4:  residual SS =   28315.6
> Iteration 5:  residual SS =   28315.6
> Iteration 6:  residual SS =   28315.6
> Iteration 7:  residual SS =   28315.6
>     Source |       SS       df       MS     Number of obs =152
> -------------+------------------------------  F(  2,   149) =29.25
>      Model |  11118.3472     2   5559.1736  Prob > F      =0.0000
>   Residual |  28315.6009   149   190.03759  R-squared     =0.2819
> -------------+------------------------------  Adj R-squared =0.2723
>      Total |  39433.9482   151  261.151975  Root MSE      =13.78541
>                                             Res. dev.     =1225.905
> 3-parameter asymptotic regression, passby = b0 + b1*b2^A
> ------------------------------------------------------------------------
>     passby |      Coef.   Std. Err.      t    P>|t| [95% Conf. Interval]
> -------------+----------------------------------------------------------
>         b0 |   11.59292   10.68695     1.08   0.280    -9.52 32.71048
>         b1 |   34.10476   9.433555     3.62   0.000     15.4 52.74559
>         b2 |    .998132   .0011685   854.19   0.000     .995 1.000441
> ------------------------------------------------------------------------
> * Parameter b0 taken as constant term in model & ANOVA table
> (SEs, P values, CIs, and correlations are asymptotic approximations)
> 
> (2)
> . nl exp2 : passby A in 1/152
> (obs = 152)
> Iteration 0:  residual SS =  29510.02
> Iteration 1:  residual SS =  28427.14
> Iteration 2:  residual SS =  28426.97
> Iteration 3:  residual SS =  28426.97
>     Source |       SS       df       MS     Number of obs =152
> -------------+------------------------------  F(  2,   150) =468.32
>      Model |  177506.602     2  88753.3012  Prob > F      =0.0000
>   Residual |  28426.9672   150  189.513115  R-squared     =0.8620
> -------------+------------------------------  Adj R-squared =0.8601
>      Total |   205933.57   152  1354.82612  Root MSE      =13.76638
>                                             Res. dev.     =1226.502
> 2-parameter exp. growth curve, passby = b1*b2^A
> ------------------------------------------------------------------------
>     passby |      Coef.   Std. Err.      t    P>|t| [95% Conf. Interval]
> -------------+----------------------------------------------------------
>         b1 |   44.54536   2.038308    21.85   0.000  40.51785 48.57286
>         b2 |   .9988862   .0001727  5783.22   0.000  .9985449 .9992275
> ------------------------------------------------------------------------
> (SEs, P values, CIs, and correlations are asymptotic approximations)
> 
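
The other summary statistics in the two tables follow the same pattern as R-squared: each is built from the mean squares of its own table, so the Total d.f. (151 versus 152) matters as well as the Total SS. A minimal check, using only the figures printed above and assuming the usual definitions F = MS(Model)/MS(Residual) and adjusted R-squared = 1 - MS(Residual)/MS(Total); the scalar names are invented for illustration.

```stata
* Sums of squares and d.f. copied from the two ANOVA tables above
* Output (1): constant in the model, centered total on n - 1 = 151 d.f.
scalar f1   = (11118.3472/2) / (28315.6009/149)
scalar adj1 = 1 - (28315.6009/149) / (39433.9482/151)
* Output (2): no constant, uncentered total on n = 152 d.f.
scalar f2   = (177506.602/2) / (28426.9672/150)
scalar adj2 = 1 - (28426.9672/150) / (205933.57/152)

display "Output (1): F = " %8.2f scalar(f1) "   Adj R-squared = " %6.4f scalar(adj1)
display "Output (2): F = " %8.2f scalar(f2) "   Adj R-squared = " %6.4f scalar(adj2)
```

Both lines agree with the printed tables, which suggests the second output is internally consistent and simply uses the uncentered total throughout.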

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

