
From: n j cox <n.j.cox@durham.ac.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Gof for ologit/oprobit
Date: Wed, 31 Oct 2007 10:01:24 +0000

Clive Nicholas and Richard Williams (both, like me, formally
non-statisticians) made many good points. I guess Dan didn't mean
to imply that "non-statistician" necessarily means statistically
uninformed, as that would put a question-mark on most of the members
of this list. I'll take it as shorthand for "someone who does not
know much statistics".

I'd add a few comments. First of all some marginal disagreement:

Richard wrote, and Clive agreed:

> The count measures can be pretty much useless when one outcome is
> rare, e.g. only 10% get a zero, because it will then often be the
> case that every case gets predicted as a 1.

I am not clear that this is an indictment of the measures concerned.
If your model can't predict rare outcomes, that is part of its
limitations, and you really should want to know that. Of course,
almost no model can predict rare outcomes, as most operate on
some kind of averaging, but that doesn't change the principle.

More generally:

"Goodness-of-fit" is in part a propaganda term. Less common
is "badness-of-fit", a term I believe is due to, or at least
was spread by, Joseph B. Kruskal. Google counts of GOF versus
BOF run about 10 to 1. That surprised me: I would have guessed
more than 100 to 1. A case of my uneven sampling of the literature,
no doubt.

In linear regression if forced to choose a single measure I would use
RMS error (~ SD of residuals), not R^2. It is on the scale of the
response, and it is less likely to impart false optimism. I'd
go for an RMS error in other models whenever it was computable
(and it is whenever predictions can be made on the scale of
the response variable).
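
A minimal sketch of that comparison after -regress-, using the auto
data shipped with Stata. The choice of mpg and weight is an assumption
made here purely for illustration. -regress- leaves the RMS error in
e(rmse); the SD of the residuals is close to it but not identical,
because its divisor is n - 1 rather than the residual degrees of
freedom.

```stata
sysuse auto, clear
regress mpg weight

* RMS error as reported by -regress-, in miles per gallon
display "RMS error       = " e(rmse)
* R^2, for comparison: a proportion, not on the scale of the response
display "R^2             = " e(r2)

* SD of residuals: divisor is n - 1, not the residual df
predict double resid, residuals
quietly summarize resid
display "SD of residuals = " r(sd)
```

The RMS error can be read directly as a typical size of miss in the
units of the response, which is the argument for reporting it.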

Nobody put in a word for graphical assessment. Scientists like
observed vs fitted plots (calibration plots). Statistically-minded
people usually start with residual vs fitted plots as a health check.
(No news is good news.) It's true, unfortunately, that many of these
plots are more difficult to define, or to work with, for discrete
response models. It's also true, unfortunately, that StataCorp
provided various graphical add-ons for use after -regress- and
-anova- but stopped about there. In the Stata Journal in 2004 I wrote
up a -modeldiag- package, but it doesn't really extend to ordered
logit or ordered probit because of the multiple outcomes. (-findit
modeldiag- for locations.) So, that just added a task to my to-do
list.
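
For the linear regression case, where these plots are easy, a minimal
sketch might run as follows. The model (mpg on weight in the auto
data) and the variable name fitted are assumptions made here for
illustration, not part of any package mentioned above.

```stata
sysuse auto, clear
regress mpg weight

* residual versus fitted: look for no structure around zero
rvfplot, yline(0)

* observed versus fitted, with the line of equality for reference
predict double fitted, xb
scatter mpg fitted || line fitted fitted, sort
```

In the second plot, points scattered evenly about the line of equality
are the graphical counterpart of a model that is calibrated on the
scale of the response.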

For ordered *it, I'd want first a cross-plot of observed and predicted
outcomes.

Here is a dopey example with Stata 8.2.

. sysuse auto, clear
. ologit rep78 mpg
. predict predicted, xb
. scatter rep78 predicted

Naturally, you might also want to round the predictions. -tabplot-
from SSC then provides a way of keeping the comparison graphical.
It's possible to add an R^2, naturally, but not necessarily useful.
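
One way to get predictions on the scale of the categories is sketched
below. It is a swapped-in alternative, not what the example above
does: instead of rounding the linear predictor from -predict, xb-, it
takes the outcome with the highest predicted probability and
cross-tabulates it against the observed outcome. The names p1 to p5,
pmax and predcat are made up for the sketch, and it relies on rep78
taking the values 1 to 5 in the auto data.

```stata
sysuse auto, clear
ologit rep78 mpg

* predicted probability of each of the five outcomes
predict double p1 p2 p3 p4 p5 if e(sample)

* most likely outcome for each observation (ties go to the lower one)
generate double pmax = max(p1, p2, p3, p4, p5)
generate byte predcat = .
forvalues k = 1/5 {
    replace predcat = `k' if p`k' == pmax & missing(predcat) & pmax < .
}

* observed versus predicted outcome as a table
tabulate rep78 predcat

* graphical version, once -tabplot- is installed from SSC:
* tabplot rep78 predcat
```

With a weak predictor, the most likely outcome can easily be the same
for most observations, which is the problem Richard raised about
count measures.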

In a loosely similar thread, Maarten Buis recently underlined
a simple but fundamental point he makes to social science
students. Paraphrasing, and he might dissent from this wording:

A perfect model could mean that I can predict your behaviour
or condition just from knowing a few things about you. Does that
tally with how you think you (and people (and society)) actually work?

Nick

n.j.cox@durham.ac.uk

Dan Weitzenfeld

What is the best way to communicate to non-statisticians the Goodness
of Fit (gof) of an ordered logit/ordered probit model?

For OLS, there is the trusty R2, letting you tell a non-statistician,
"I can explain X% of the variation in the dependent variable."

For logit/probit, I've used the probability of correct classification,
type I and type II error rates as my go-to metric for gof.

Is there a corresponding metric for ordered logit/ordered probit?

I've read about pseudo R2 and its faults. Probability of correct
classification doesn't seem fair given the multiple categories of the
dependent variable - if my model predicts you'll be a 2 but you're a
3, I get no credit for being close.
