
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


RE: st: criticisms of classical model selection methods

From   Nick Cox <>
To   "''" <>
Subject   RE: st: criticisms of classical model selection methods
Date   Thu, 19 Aug 2010 19:16:15 +0100

I agree generally with Maarten's comments and add a few more, mostly standard and perhaps much broader than was called for by the original question. 

1. I find it a bit droll that measures introduced fairly recently are described as "classical". "Classical" is what was old when you were young.... 

2. There seems to be widespread confusion between or conflation of "automated", "objective" and "correct". It seems widely considered that researchers should use judgment in choosing model form, predictor variables, whether or not to transform, whether to drop outliers or problem observations, what to do about missing data, how to regard error terms, etc., etc., while final model choice must be based on quantitative criteria. I can see that automated choice may be a compelling need if for some reason you are comparing many candidate models, but if that is the case a better strategy might be to think harder about cutting down the field. 

3. Most currently fashionable criteria quantify a trade-off between goodness of fit and simplicity, but the existence of multiple different criteria with the same broad aim shows that even that very restricted combination of two criteria can be quantified in different ways. Statements of the form that "?IC is known typically to over-fit or under-fit" show that even believers in these criteria cannot resist inserting their own (or other people's) judgements, which is indeed usually a good idea. In other words there can be (must be!) criteria to choose criteria, and so on. Even defining simplicity as the number of adjustable parameters is a very narrow view of simplicity. Quantifying lack of fit alone is of course very hard too. 

4. Many fields continue to neglect graphical assessment of models. The common attitude that complicated models cannot be assessed graphically is often implicit but just shows lack of imagination or experience. It is striking that so many researchers place massive emphasis on the elegant presentation of massive tabular output, which in at least some cases they do not appear to examine carefully, just homing in on the one magic number that supposedly captures overall model performance. 

5. In many fields model choice is, or should be, based on criteria such as whether the model makes good physical (medical, economic,...) sense, comparisons with other models, the need to explain a model to users or consumers, whether the model has sensible limiting or qualitative behaviour, or whether the model will yield good predictions beyond the data. It is elementary, but also fundamental, that most such criteria are difficult or impossible to program but nevertheless can be discussed within a field. 
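As a small illustration of point 3, here is a minimal sketch in Python of how two criteria with the same broad aim can rank the same pair of models differently. It uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. The two models, their log-likelihoods, their parameter counts and the sample size are invented purely for illustration and do not come from any real data set.

```python
import math


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L (smaller is better)."""
    return k * math.log(n) - 2 * loglik


# Hypothetical values, assumed for illustration only:
# name -> (maximised log-likelihood, number of fitted parameters)
n_obs = 200
models = {
    "small": (-100.0, 3),
    "large": (-96.0, 5),
}

# Each criterion picks the model with the smallest value.
preferred_by_aic = min(models, key=lambda m: aic(*models[m]))
preferred_by_bic = min(models, key=lambda m: bic(*models[m], n_obs))

print("AIC prefers:", preferred_by_aic)
print("BIC prefers:", preferred_by_bic)
```

With these made-up numbers AIC favours the larger model and BIC the smaller one, because BIC charges ln(200), roughly 5.3, per parameter where AIC charges 2. Which penalty is appropriate is itself a judgement, which is the point being made.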


Maarten Buis

--- On Thu, 19/8/10, Sam Brilleman wrote:
> Is anyone able to point me in the direction of references
> dealing with the discussion for and against traditional
> methods of model selection. I am particularly interested in
> criticisms of the assumptions on which commonly used model
> selection criteria are based (eg. AIC, BIC, etc).

Andrew Gelman just posted on his blog on a closely related issue.

I would have put this slightly differently: there may or may
not be a true world out there (I'll leave that question to the
philosophers), but the purpose of a model and model selection
is _not_ to find that true world, but to find a simplified 
version of it that helps us find the answer to a specific 
research question. 

As a consequence we need to have an idea of what the key parts
are that we need to get right in our model in order to answer 
our question, and check those. This really depends on the 
question: there can be no universally correct checklist/cookbook
we can work through in order to get the "correct/scientific"
model. A generic measure, like goodness of fit, just distracts 
us from the main issue, which is to answer our research 
question based on stuff we have seen, heard, felt, or otherwise

