RE: st: outliers

From	Nick Cox <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	RE: st: outliers
Date	Fri, 27 Aug 2010 16:27:38 +0100

In 1827 Olbers asked Gauss "What should count as an unusual or too large a deviation? I would like to receive more precise directions." Gauss in his reply was disinclined to give any more directions and compared the situation to everyday life, where one often has to make intuitive judgments outside the reign of formal and explicit rules.

This is a paraphrase of a paraphrase from Gigerenzer, G. and five friends. 1989. The empire of chance: How probability changed science and everyday life. Cambridge University Press, p.83, who give the reference to Olbers' Leben und Werke. 

Nick 
[email protected] 

Maarten Buis

--- On Fri, 27/8/10, [email protected] wrote:
> More broadly: when would you suggest to use mmregress
> instead of regress (also with robust option)? Can we say
> that mmregress is always better than simple OLS? Or
> should it be used only in the presence of a large number of
> outliers? And for how many outliers would you suggest
> mmregress instead of regress?

Unfortunately there can be no general recipe we can follow 
here. Remember that what we are trying to do is the following:
We have a question, we observe stuff, we summarize the stuff
using a model, and we answer our question based on that summary.

Outliers are just observations that don't fit well in our 
model. This can mean one of two things: either there is something
wrong with the observations or there is something wrong with
the model. 

There are several ways in which a computer can quantify how 
well an observation fits within the model, but there is no way 
a computer can decide whether it is the model or the observation 
that is to blame.
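
A minimal sketch of a few such measures, assuming the auto dataset that
ships with Stata stands in for your own data. The choice of variables and
the cutoffs (|studentized residual| > 2, leverage > 2k/n, Cook's D > 4/n)
are common rules of thumb picked for illustration, not recommendations
from this thread:

```stata
* illustrative data and model (assumption: any linear model would do)
sysuse auto, clear
regress price weight mpg

* three standard measures of how well each observation fits
predict double rstu, rstudent
predict double lev, leverage
predict double cook, cooksd

* flag observations beyond rule-of-thumb cutoffs (assumed thresholds)
local k = e(df_m) + 1
local n = e(N)
generate byte flag = (abs(rstu) > 2 | lev > 2*`k'/`n' | cook > 4/`n') if e(sample)

* the flag says "look here"; it does not say who is to blame
list make price weight mpg rstu lev cook if flag == 1, noobs
```

The listing only tells you which observations to inspect. Whether they
point to a data problem or a model problem still has to be judged by you.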

The solution is to know your data and figure out why certain 
observations have been classified as outliers. If you have many
of those, don't focus only on various forms of "robust" 
regression; also consider that variables may have non-linear 
effects, i.e. try transformations. That is the art of using 
statistics for research.
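
One way to put the alternatives side by side is sketched below, again
assuming the auto dataset as a stand-in. Two things to note: the robust
option of regress changes only the standard errors, not the coefficients,
and rreg is used here as one built-in form of robust regression because
mmregress is a user-written command that may not be installed. The log
transformation of price is an assumption made for illustration, not a
claim that it is the right model for these data:

```stata
sysuse auto, clear

* OLS with robust standard errors: same coefficients as plain OLS
regress price weight mpg, vce(robust)

* a built-in robust regression that downweights poorly fitting observations
rreg price weight mpg

* alternative: change the model instead (assumed transformation)
generate double lnprice = ln(price)
regress lnprice weight mpg

* inspect residuals against fitted values for the transformed model
predict double fit, xb
predict double res, residuals
scatter res fit, yline(0)
```

If the transformed model leaves few poorly fitting observations, that
suggests the original model, rather than the data, was the problem.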
