



Re: st: outliers


From   Fabio Zona <fabio.zona@unibocconi.it>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: outliers
Date   Fri, 27 Aug 2010 17:42:41 +0200 (CEST)

Wow! It seems to be a very long-lived, old story...


----- Original message -----
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Sent: Friday, 27 August 2010 17:27:38 GMT +01:00 Amsterdam/Berlin/Bern/Rome/Stockholm/Vienna
Subject: RE: st: outliers

In 1827 Olbers asked Gauss "What should count as an unusual or too large a deviation? I would like to receive more precise directions." Gauss in his reply was disinclined to give any more directions and compared the situation to everyday life, where one often has to make intuitive judgments outside the reign of formal and explicit rules.

This is a paraphrase of a paraphrase from Gigerenzer, G. and five friends. 1989. The empire of chance: How probability changed science and everyday life. Cambridge University Press, p.83, who give the reference to Olbers' Leben und Werke. 

Nick 
n.j.cox@durham.ac.uk 

Maarten Buis

--- On Fri, 27/8/10, fabio.zona@unibocconi.it wrote:
> More broadly: when would you suggest using mmregress
> instead of regress (also with the robust option)? Can we say
> that mmregress is always better than simple OLS? Or can it
> be used only in the presence of a large number of
> outliers? And for how many outliers would you suggest
> mmregress instead of regress?

Unfortunately there can be no general recipe we can follow 
here. Remember that what we are trying to do is the following:
We have a question, we observe stuff, we summarize the stuff
using a model, we answer our question based on that summary.

Outliers are just observations that don't fit well in our 
model. This can mean two things, either there is something
wrong with the observations or there is something wrong with
the model. 

There are several ways in which a computer can quantify how 
well an observation fits within the model, but there is no way 
a computer can decide whether it is the model or the observation 
that is to blame.
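
As a minimal sketch of the first half of that point, the Python below fits a
straight line by ordinary least squares and computes internally studentized
residuals, one common way of quantifying how well each observation fits the
model. The data, the function names and the cutoff of 2 are all assumptions
made for illustration; none of them come from the thread. Note what the code
cannot do: it flags an observation, but it says nothing about whether the
observation or the straight-line model is at fault.

```python
import math

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def studentized_residuals(x, y):
    """Internally studentized residuals for a one-predictor OLS fit."""
    n = len(x)
    intercept, slope = ols_fit(x, y)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    s2 = sum(r * r for r in resid) / (n - 2)
    # Leverage of each observation in simple linear regression.
    leverage = [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    return [r / math.sqrt(s2 * (1.0 - h)) for r, h in zip(resid, leverage)]

# Made-up data: roughly y = 1 + 2x, except the seventh observation.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 25.0, 17.1, 18.9, 21.0]

CUTOFF = 2.0  # an arbitrary illustrative threshold, not a rule
t = studentized_residuals(x, y)
flagged = [i for i, ti in enumerate(t) if abs(ti) > CUTOFF]
print(flagged)
```

Whether the flagged observation is a recording error or a sign that the model
is missing something is a question only knowledge of the data can answer.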

The solution is to know your data and figure out why certain 
observations have been classified as outliers. If you have many
of those, don't only focus on various forms of "robust" 
regression, also consider that variables may have non-linear 
effects, i.e. try transformations. That is the art of using 
statistics for research.
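
The suggestion to try transformations can be sketched in the same way. In the
made-up data below, y grows roughly exponentially in x, so a straight line
fitted to the raw y fits poorly and the largest values look like outliers,
while the same line fitted to log(y) fits almost perfectly. The data and the
names are assumptions chosen for illustration only, and R-squared is used here
just as a convenient summary of fit.

```python
import math

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(x, y):
    """Share of the variance in y explained by the fitted line."""
    intercept, slope = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Made-up data: y = 2 * exp(0.5 * x), disturbed by small multiplicative factors.
x = list(range(10))
factors = [1.02, 0.97, 1.01, 0.99, 1.03, 0.98, 1.00, 1.02, 0.97, 1.01]
y = [2.0 * math.exp(0.5 * xi) * f for xi, f in zip(x, factors)]

r2_raw = r_squared(x, y)                         # line fitted to y itself
r2_log = r_squared(x, [math.log(v) for v in y])  # line fitted to log(y)
print(round(r2_raw, 3), round(r2_log, 3))
```

Here no observation is wrong; the poor fit on the raw scale comes entirely
from the choice of model.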


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


