



Re: st: outliers


From   fabio.zona@unibocconi.it
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: outliers
Date   Fri, 27 Aug 2010 13:43:27 +0200 (CEST)

Dear Steve, thank you very much!
This new message is for you and everyone on Statalist:

To check for outliers, I ran:

predict df, dfits

I find three observations with df > 2*sqrt(k/n).
(I did not count the observations with negative df values, because the criterion for df does not use absolute values.)
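
A minimal sketch of this check, assuming the auto dataset shipped with Stata and an illustrative model (price on mpg and weight), since the actual variables are not given in this thread. It takes k as the number of estimated coefficients including the constant. Note that the usual rule of thumb compares the absolute value of DFITS with the cutoff, so large negative values are flagged as well; the sketch follows that convention.

```stata
* Illustrative data and model; the variables in the original analysis are not shown
sysuse auto, clear
regress price mpg weight

* DFITS for each observation in the estimation sample
predict df, dfits

* Cutoff 2*sqrt(k/n), with k = number of coefficients including the constant
scalar k = e(df_m) + 1
scalar n = e(N)
scalar cutoff = 2*sqrt(scalar(k)/scalar(n))

* Flag large values in either direction
generate byte flagged = abs(df) > scalar(cutoff) if !missing(df)
list make price df if flagged == 1
```

The flagged observations are candidates for a closer look, not for automatic deletion.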

One of the three values is very large (2.19, against a threshold of 0.50). How would you treat this case? Do I have to drop this observation? Would you run mmregress?
More broadly: when would you suggest using mmregress instead of regress (including regress with the robust option)? Can we say that mmregress is always better than simple OLS? Or should it be used only when there is a large number of outliers? And with how many outliers would you suggest mmregress instead of regress?

Thanks a lot!




----- Original Message -----
From: "Steve Samuels" <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Sent: Monday, 23 August 2010 4:25:02 GMT +01:00 Amsterdam/Berlin/Bern/Rome/Stockholm/Vienna
Subject: Re: st: outliers

There are few rules about outliers, but the most important one is: OLS
is the worst way to detect them. Detection requires a robust
regression program; and a good program will not "reject" all outliers,
but will automatically downweight them.  For covariates, one wants to
identify not outliers per se, but those with high leverage.  But the
decision about what to do with these is not automatic; sometimes they
are the most important points and _must_ be kept.

See: "Robust regression in Stata" by Vincenzo Verardi and Christophe
Croux, The Stata Journal
Volume 9 Number 3: pp. 439-453. Also available at:
https://lirias.kuleuven.be/bitstream/123456789/202142/1/KBI_0823.pdf

See also Verardi and Croux's contributed programs -mmregress- (findit)
and Ben Jann's -robreg- (findit). These are superior to Stata's
long-time built-in command -rreg-.
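
A rough illustration of what "downweighting" looks like in practice. It uses the built-in -rreg- only because it needs no installation; as noted above, -mmregress- and -robreg- are considered superior, and their syntax is not shown here because it should be checked in their own help files after installing them. The dataset, the model, and the 0.5 weight cutoff are assumptions made for the example.

```stata
* Illustrative data and model; not taken from the thread
sysuse auto, clear

* Ordinary least squares for reference
regress price mpg weight

* Built-in robust regression; genwt() saves the final weight given to each observation
rreg price mpg weight, genwt(w)

* Observations that were strongly downweighted (the 0.5 cutoff is arbitrary)
sort w
list make price mpg weight w if w < 0.5
```

Comparing the two coefficient tables shows how much the low-weight observations were driving the OLS fit.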

Steve

Steven Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax:    206-202-4783

On Sun, Aug 22, 2010 at 4:04 PM, Fabio Zona <fabio.zona@unibocconi.it> wrote:

> In an OLS model, can I limit the outlier analysis to the predictors only? Or do I also have to check for possible outliers in the control variables?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


