Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# RE: st: outliers

 From Nick Cox To "'statalist@hsphsun2.harvard.edu'" Subject RE: st: outliers Date Fri, 27 Aug 2010 16:27:38 +0100

```In 1827 Olbers asked Gauss "What should count as an unusual or too large a deviation? I would like to receive more precise directions." Gauss in his reply was disinclined to give any more directions and compared the situation to everyday life, where one often has to make intuitive judgments outside the reign of formal and explicit rules.

This is a paraphrase of a paraphrase from Gigerenzer, G. and five friends. 1989. The empire of chance: How probability changed science and everyday life. Cambridge University Press, p.83, who give the reference to Olbers' Leben und Werke.

Nick
n.j.cox@durham.ac.uk

Maarten Buis

--- On Fri, 27/8/10, fabio.zona@unibocconi.it wrote:
> More broadly: when would you suggest to use mmregress
> instead of regress (also with robust option)? Can we say
> that mmregress is always better than the simple OLS? Or it
> can be used only in the presence of a large number of
> outliers? and for how many outliers would you suggest the

Unfortunately there can be no general recipe we can follow
here. Remember that what we are trying to do is the following:
We have a question, we observe stuff, we summarize the stuff
using a model, we answer our question based on that summary.

Outliers are just observations that don't fit well in our
model. This can mean one of two things: either there is
something wrong with the observations or there is something
wrong with the model.

There are several ways in which a computer can quantify how
well an observation fits within the model, but there is no way
a computer can decide whether it is the model or the observation
that is to blame.

The solution is to know your data: figure out why certain
observations have been classified as outliers. If you have many
of those, don't focus only on various forms of "robust"
regression; also consider that variables may have non-linear
effects, i.e. try transformations. That is the art of using
statistics for research.

```
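A small illustration of the point about transformations, written as a sketch rather than as anything from the original thread. It uses plain Python instead of Stata so that it runs anywhere. The data are made up (exponential growth with a small alternating wobble), the helper names `ols_fit`, `scaled_residuals` and `flag` are hypothetical, and the cutoff of 2 on residual divided by root mean squared error is an arbitrary choice for the example, not a recommended rule.

```python
import math

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def scaled_residuals(x, y):
    """Residuals divided by the root mean squared error (a crude scaling)."""
    a, b = ols_fit(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    rmse = math.sqrt(sum(r * r for r in resid) / (len(x) - 2))
    return [r / rmse for r in resid]

def flag(x, y, cutoff=2.0):
    """Indices whose scaled residual exceeds the (arbitrary) cutoff."""
    return [i for i, r in enumerate(scaled_residuals(x, y)) if abs(r) > cutoff]

# Made-up data: log(y) is linear in x, plus a small alternating wobble.
x = list(range(20))
y = [math.exp(0.3 * xi + (0.1 if xi % 2 == 0 else -0.1)) for xi in x]

flagged_raw = flag(x, y)                            # straight line fitted to y
flagged_log = flag(x, [math.log(yi) for yi in y])   # straight line fitted to log(y)

print("flagged, y on x:     ", flagged_raw)
print("flagged, log(y) on x:", flagged_log)
```

With a straight line fitted to the raw response, the largest values are flagged even though nothing is wrong with those observations; after a log transformation nothing is flagged. The arithmetic can measure the misfit, but deciding that the model rather than the data was to blame still takes knowledge of the data.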