Stata The Stata listserver

RE: st: definition of pseudo R^2 for dprobit or probit


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: definition of pseudo R^2 for dprobit or probit
Date   Tue, 28 Oct 2003 10:20:38 -0000

I agreed strongly with Richard before his last paragraph. 

My own bias is to try to steer the discussion in
the opposite direction, away from all ideas of "best": 

* Any discussion of "best" goes in a circle with a discussion of 
criteria for "best", and there are many such criteria, as everyone 
knows. After all, 
we go round and round on preferred measures of location, scale, 
shape, association in two-way tables, rank correlation, and so 
forth. 

* There are all sorts of theoretical and practical arguments for 
saying that in many fields far too much emphasis is already placed 
on single-number figures of merit (as compared with looking 
at graphs, looking at residuals, detailed discussion of the 
scientific and practical issues behind variable choice, model 
structure, etc.). Sometimes it seems that researchers will 
spend a very long time producing or collating data, formatting
it for software, writing programs, ..., and then expect to make a 
quick decision on model virtues based on a few magic numbers! 

* These questions of which measures to use 
seem to arise primarily when response variables are categorical 
(wide sense). The even wider context including measured responses 
is, I hope everyone will agree, vital. After all, the history 
presumably is that people wanted measures fulfilling the same 
role as R^2 in (say) multiple regression -- even if that role 
is often aggressive, not analytical, using R^2 to intimidate, 
rather than to inform. 

There are two simple ideals, it seems to me: that everyone 
should state clearly what definition of R^2 they are 
using; and that in principle enough information should be 
provided to allow other measures to be calculated. Beyond 
that, if measures fail to agree numerically, then choosing 
one as best requires a special argument (which, 
for all I know, could be "this is what people use in this 
field, so I'll use it too").  
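As a rough illustration of the second ideal, here is a minimal Python sketch showing that three numbers (the log likelihood of the fitted model, the log likelihood of the constant-only model, and the number of observations) are enough to compute more than one pseudo R^2. The function names and the input values are made up for illustration. The McFadden formula is the one quoted below in this thread; the Cox-Snell and Nagelkerke formulas are the usual textbook definitions, brought in here as examples of "other measures", and nothing in this sketch is what Stata itself runs.

```python
import math


def mcfadden_r2(ll_full, ll_null):
    """McFadden's pseudo R^2: 1 - L1/L0, with L1 and L0 log likelihoods."""
    return 1.0 - ll_full / ll_null


def cox_snell_r2(ll_full, ll_null, n):
    """Cox-Snell pseudo R^2: 1 - exp(2 * (L0 - L1) / n)."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_full) / n)


def nagelkerke_r2(ll_full, ll_null, n):
    """Nagelkerke pseudo R^2: Cox-Snell rescaled by its upper bound.

    Assumes a discrete response, so that ll_null < 0 and the bound
    1 - exp(2 * L0 / n) is strictly positive.
    """
    upper_bound = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_full, ll_null, n) / upper_bound


if __name__ == "__main__":
    # Hypothetical values, not taken from any real model fit.
    ll_null, ll_full, n = -100.0, -80.0, 200
    print("McFadden  ", mcfadden_r2(ll_full, ll_null))
    print("Cox-Snell ", cox_snell_r2(ll_full, ll_null, n))
    print("Nagelkerke", nagelkerke_r2(ll_full, ll_null, n))
```

The three measures will generally differ for the same fit, which is why stating the definition in use matters.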

There are more platitudes posing as homespun wisdom at 
http://www.stata.com/support/faqs/stat/rsquared.html
(and also some references and some code fragments). 

Nick 
[email protected] 

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Richard
> Williams
> Sent: 28 October 2003 02:32
> To: [email protected]
> Subject: Re: st: definition of pseudo R^2 for dprobit or probit
> 
> 
> At 08:04 PM 10/27/2003 -0600, Scott Merryman wrote:
> 
> >[R] maximize, Methods and Formulas section
> >
> >Pseudo R2 = 1 - L1/L0, where L1 is the log likelihood of the full
> >model and L0 is the log likelihood of the constant-only model.
> 
> That is one of a couple of equivalent formulas but probably the
> simplest to write in an email message!  Certainly clearer than what
> I wrote earlier.
> 
> As a sidelight, this is one of many statistics that claims the name
> of "Pseudo R2".  It would be nice if Stata explicitly labeled it as
> McFadden's R2, and perhaps reported a couple of the other
> alternatives in case anybody wants them.
> 
> Of the various alternatives, McFadden's R2 seems to have emerged as
> the favorite and best.  Anybody strongly disagree and think
> something else is better?
> 
