Statalist The Stata Listserver



RE: st: RE getting rid of the outliners


From   "Maarten Buis" <M.Buis@fsw.vu.nl>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE getting rid of the outliners
Date   Mon, 1 May 2006 12:10:17 +0200

Ronnie:

I had the same problem with sending to Vora instead of the
Statalist (so Vora received multiple copies of my email before
I found out what the problem was, sorry about that).

In my not overly humble opinion, determining outliers this way
is nothing more than applying rules of thumb, and it is bad
practice to let your analysis be influenced by a blind 
application of a single rule of thumb. I am a regression man,
so when I am looking for outliers I look at scatter plots, 
various plots involving residuals, Cook's distances, and 
leverages. I then try to identify points that worry me and 
try to find out why they are special. Then I decide what I 
am going to do about them, and in many cases the answer is 
nothing. 
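
A minimal sketch of that workflow in Stata might look like the
following. The auto dataset and the price/weight/mpg model are
assumptions chosen purely for illustration, as are the new variable
names; the 4/N cutoff is itself only a rule of thumb, used here to
shortlist points for inspection rather than to drop them.

```stata
* Illustrative only: dataset, model, and variable names are assumptions
sysuse auto, clear
regress price weight mpg

* Influence and leverage statistics for each observation
predict cook, cooksd
predict lev, leverage
predict rstud, rstudent

* Residual-versus-fitted and leverage-versus-squared-residual plots
rvfplot, yline(0)
lvr2plot, mlabel(make)

* Shortlist points worth a closer look (4/N is a common rule-of-thumb
* cutoff for Cook's distance; it flags cases, it does not judge them)
list make price weight mpg cook lev rstud if cook > 4/e(N) & !missing(cook)
```

The listing is the starting point for finding out why those
observations are special, not a list of observations to delete.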

HTH,
Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology 
Vrije Universiteit Amsterdam 
Boelelaan 1081 
1081 HV Amsterdam 
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z214 

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Ronnie Babigumira
Sent: Monday, 1 May 2006 11:24
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE getting rid of the outliners

Maarten, I had written in earlier suggesting -lv- (output below) or -iqr- (I just checked and, for some reason, my
response went to Vora N and not to the list). However, your response is more true to the original posting.

That said, I have a follow-up question for you.

Using the fences created by

local u = r(p75) + (3/2) * (r(p75) - r(p25))
local l = r(p25) - (3/2) * (r(p75) - r(p25))

would capture "mild" outliers. So my question is: how does this sit with the discussion in, for example, Hamilton,
Statistics with Stata, which distinguishes between mild and severe outliers, pointing out that it is severe outliers that
create problems for many statistical techniques?
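
For reference, the mild/severe distinction can be sketched with two
sets of fences: inner fences at 1.5 IQR beyond the quartiles (the
fences above) and outer fences at 3 IQR. The sketch below assumes the
auto dataset and the variable price purely for illustration; the macro
names and the variable outlier are likewise made up for this example.

```stata
* Illustrative only: dataset, variable, and macro names are assumptions
sysuse auto, clear
summarize price, detail

* Inner fences (1.5 IQR) bound "mild" outliers, outer fences (3 IQR)
* bound "severe" ones
local iqr      = r(p75) - r(p25)
local inner_lo = r(p25) - 1.5 * `iqr'
local inner_hi = r(p75) + 1.5 * `iqr'
local outer_lo = r(p25) - 3 * `iqr'
local outer_hi = r(p75) + 3 * `iqr'

* 0 = inside inner fences, 1 = mild, 2 = severe
generate byte outlier = 0 if !missing(price)
replace outlier = 1 if (price < `inner_lo' | price > `inner_hi') & !missing(price)
replace outlier = 2 if (price < `outer_lo' | price > `outer_hi') & !missing(price)
label define outlier 0 "none" 1 "mild" 2 "severe"
label values outlier outlier
tabulate outlier
```

This only classifies observations; whether any of them should be
treated differently is a separate decision.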
 





