
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


Re: st: outliers

Subject   Re: st: outliers
Date   Fri, 27 Aug 2010 13:43:27 +0200 (CEST)

Dear Steve, thank you very much!
This new message is for you and everyone on Statalist:

To check for outliers, I ran:

predict df, dfits

I found three observations with df > 2 x sqrt(k/n).
(I did not count observations with negative df values, because the cutoff for df does not use absolute values.)
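
For reference, the cutoff usually quoted for DFITS (Belsley, Kuh, and Welsch) is applied to the absolute value, so large negative values are flagged as well. A minimal sketch of that check, assuming Stata's shipped auto dataset and an arbitrary model in place of the real data; the variable name df follows the predict command above, and the scalar name cutoff is made up for illustration:

```stata
* Illustrative data and model only; substitute your own regression
sysuse auto, clear
regress price weight mpg

* DFITS for every observation in the estimation sample
predict df, dfits

* k = number of coefficients including the constant, n = observations
scalar cutoff = 2*sqrt((e(df_m) + 1)/e(N))

* Flag both large positive and large negative DFITS values
list make price df if abs(df) > cutoff & !missing(df)
```

The !missing(df) condition matters because Stata treats missing values as larger than any number, so they would otherwise pass the comparison.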

One of the three values is very large (2.19, versus a threshold of 0.50). How would you treat this case? Do I have to drop this observation? Would you run mmregress?
More broadly: when would you suggest using mmregress instead of regress (even with the robust option)? Can we say that mmregress is always better than simple OLS? Or should it be used only in the presence of a large number of outliers? And for how many outliers would you suggest mmregress instead of regress?

Thanks a lot!

----- Original message -----
From: "Steve Samuels" <>
Sent: Monday, 23 August 2010 4:25:02 GMT +01:00 Amsterdam/Berlin/Bern/Rome/Stockholm/Vienna
Subject: Re: st: outliers

There are few rules about outliers, but the most important one is: OLS
is the worst way to detect them. Detection requires a robust
regression program; and a good program will not "reject" all outliers,
but will automatically downweight them.  For covariates, one wants to
identify not outliers per se, but those with high leverage.  But the
decision about what to do with these is not automatic; sometimes they
are the most important points and _must_ be kept.

See: "Robust regression in Stata" by Vincenzo Verardi and Christophe
Croux, The Stata Journal, Volume 9, Number 3, pp. 439-453.
Also available at:

See also Verardi and Croux's contributed programs -mmregress- (findit)
and Ben Jann's -robreg- (findit). These are superior to Stata's
long-time built-in command -rreg-.
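
One way to see the downweighting in practice is to fit the same model with OLS, with -rreg-, and with -mmregress-, and compare the coefficients across the three outputs. The following is only a sketch: it assumes -mmregress- has already been located and installed (it is user-written, not part of official Stata), and the auto dataset and the model are stand-ins for real data.

```stata
* Illustrative data and model only; substitute your own variables
sysuse auto, clear

* Ordinary least squares, as the baseline
regress price weight mpg

* Stata's built-in robust regression
rreg price weight mpg

* MM-estimator; user-written, so install it first (e.g. via: findit mmregress)
mmregress price weight mpg
```

If the robust fits differ markedly from OLS, that suggests a few observations are driving the OLS estimates and deserve a closer look before any decision to keep or drop them.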


Steven Samuels
18 Cantine's Island
Saugerties NY 12477
Voice: 845-246-0774
Fax:    206-202-4783

On Sun, Aug 22, 2010 at 4:04 PM, Fabio Zona <> wrote:

> In an OLS model, can I limit the outlier analysis to the predictors only? Or do I also have to check for possible outliers in the control variables?
