Statalist


Re: st: Gof for ologit/oprobit


From   "Clive Nicholas" <clivelists@googlemail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Gof for ologit/oprobit
Date   Wed, 31 Oct 2007 02:54:10 +0000

Dan Weitzenfeld wrote:

> What is the best way to communicate to non-statisticians the Goodness
> of Fit (gof) of an ordered logit/ordered probit model?
>
> For OLS, there is the trusty R2, letting you tell a non-statistician,
> "I can explain X% of the variation in the dependent variable."
>
> For logit/probit, I've used the probability of correct classification,
> type I and type II error rates as my go-to metric for gof.

Personally speaking, I've never found the 'percentage correctly
classified' summary statistic (which I think you're referring to)
particularly meaningful: when Y=1 in more than 50% of the
observations, simply predicting Y=1 for everyone already classifies
most cases correctly, so the statistic can hardly fail. No doubt I'll
get lynched for saying this.
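
To illustrate the point about the percentage correctly classified, here
is a minimal sketch in Python (not Stata) with made-up data. The
function name modal_baseline_accuracy and the 70/30 split are
assumptions chosen purely for illustration; the idea is only that a
"model" which always predicts the most common outcome sets the floor
against which any classification rate should be judged.

```python
from collections import Counter


def modal_baseline_accuracy(outcomes):
    """Share classified correctly by always predicting the most common outcome."""
    counts = Counter(outcomes)
    # most_common(1) returns [(value, count)] for the modal category
    _, modal_count = counts.most_common(1)[0]
    return modal_count / len(outcomes)


# Hypothetical sample: Y = 1 in 70 of 100 observations.
y = [1] * 70 + [0] * 30

baseline = modal_baseline_accuracy(y)
print(f"Always predicting the modal outcome: {baseline:.0%} correct")
```

A fitted model reporting, say, 72% correctly classified on such data
would be barely better than using no covariates at all.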

> Is there a corresponding metric for ordered logit/ordered probit?
> I've read about pseudo R2 and its faults.  Probability of correct
> classification doesn't seem fair given the multiple categories of the
> dependent variable - if my model predicts you'll be a 2 but you're a
> 3, I get no credit for being close.
>
> Please feel free to just reply with a link/manual reference that I
> should read and I'll do the reading.

I wouldn't go beyond McFadden's R-squared (see Pampel, 2000: 49 for a
crisp definition), which measures the proportional reduction in the
absolute value of the log-likelihood relative to the intercept-only
model, and which also appears to be the consensus choice of GOF
statistic for GLMs.
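
As a worked illustration of that definition, McFadden's R-squared is
1 - lnL(model)/lnL(null), where lnL(null) is the log-likelihood of the
intercept-only model. The sketch below is in Python rather than Stata,
and the two log-likelihood values are invented for the example, not
taken from any real model.

```python
def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo R-squared: 1 - lnL(model) / lnL(null)."""
    if ll_null >= 0:
        # For discrete-outcome models the null log-likelihood is negative
        raise ValueError("ll_null must be negative")
    return 1.0 - ll_model / ll_null


# Hypothetical log-likelihoods (assumed values for illustration only)
ll_null = -250.0   # intercept-only model
ll_model = -200.0  # model with covariates

print(f"McFadden's R-squared: {mcfadden_r2(ll_model, ll_null):.3f}")
```

Here the covariates reduce the absolute log-likelihood from 250 to 200,
a proportional reduction of 0.2.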

That said, I rather think you're tackling the question of model
evaluation the wrong way round. I wouldn't rely on one or two fit
statistics alone when evaluating the 'health' of my model. My first
and foremost concerns would be:

(1) Is this the most parsimonious model I can fit whilst remaining
comprehensive in explanatory scope?

(2) Are all the coefficients on my key variables of interest
statistically significant?

(3) Do all of those key coefficients have the correct sign?

(4) Does the model violate any key statistical or distributional
assumptions? (For me, always the toughest question to answer, and
sometimes even tougher to cook up an alternative in the light of any
violations.)

Only then do I think about goodness of fit; if your model is a 'good'
model, GOF normally takes care of itself, anyway.

-- 
Clive Nicholas

[Please DO NOT mail me personally here, but at
<clivenicholas@hotmail.com>. Thanks!]

Pampel F (2000) "Logistic Regression: A Primer", QASS Paper 132,
Thousand Oaks, CA: Sage.