st: Definition of "outside" in box plots - new reference
Sometimes one sends an email too early. I did so earlier today; apologies.
I managed to find time to do a small search on the topic of
"outlier/outside value" definition in relation to box plots.
"Frigge, M., D. C. Hoaglin, and B. Iglewicz. 1989. Some implementations of
the box plot. American Statistician 43: 50–54" documents eight different
definitions of the quartile used in software/algorithms.
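To illustrate why the quartile definition matters, here is a minimal
sketch in Python. It assumes the common 1.5 x IQR fences and uses the two
quartile rules offered by the standard library's statistics.quantiles
("exclusive" and "inclusive"); these are not claimed to be among the eight
definitions in the paper, and the sample is made up for the example.

```python
from statistics import quantiles

def outside_values(data, method):
    """Return the values beyond the 1.5 * IQR fences for one quartile rule."""
    q1, _, q3 = quantiles(data, n=4, method=method)
    step = 1.5 * (q3 - q1)          # assumed fence width: 1.5 * IQR
    lower, upper = q1 - step, q3 + step
    return [x for x in data if x < lower or x > upper]

# Hypothetical sample, chosen so that the two rules disagree about 15.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 15]

for method in ("exclusive", "inclusive"):
    q1, _, q3 = quantiles(sample, n=4, method=method)
    print(method, q1, q3, outside_values(sample, method))
# exclusive 2.75 8.25 []
# inclusive 3.25 7.75 [15]
```

With the same data and the same fence rule, the value 15 is "outside"
under one quartile definition and not under the other.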
A recent paper also discusses this: "Sim, C. H., F. F. Gan, and T. C.
Chang. 2005. Outlier labeling with boxplot procedures. Journal of the
American Statistical Association 100 (470): 642–652."
The authors have run large-scale simulations and give tables of suggested
outlier detection principles depending on the supposed underlying
distribution.
They say: "This article shows that the commonly constructed boxplot is
in general inappropriate for detecting outliers in the normal and
especially the exponential samples. We recommend that the graphical
boxplot be constructed based on the knowledge of the underlying
distribution of the dataset and by controlling the risk of labeling regular
observations as outliers."
Certainly this recommendation further underlines the need to state what
type of whiskers are shown in a given box plot. A quick search through a
number of publications showed that most did not include any definition.
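The quoted point about normal versus exponential samples can be
illustrated with a rough simulation. This is only a sketch under my own
assumptions (1.5 x IQR fences, the default quartile rule of
statistics.quantiles, one large sample per distribution, an arbitrary
seed); it is not the simulation design used by the authors.

```python
import random
from statistics import quantiles

def outside_rate(draws):
    """Fraction of draws beyond the 1.5 * IQR fences of the same sample."""
    q1, _, q3 = quantiles(draws, n=4)
    step = 1.5 * (q3 - q1)          # assumed fence width: 1.5 * IQR
    flagged = sum(1 for x in draws if x < q1 - step or x > q3 + step)
    return flagged / len(draws)

rng = random.Random(2005)           # arbitrary seed, for reproducibility
n = 20000                           # arbitrary sample size

# Every draw is a "regular" observation: nothing is contaminated.
normal_rate = outside_rate([rng.gauss(0.0, 1.0) for _ in range(n)])
expo_rate = outside_rate([rng.expovariate(1.0) for _ in range(n)])

print(f"normal:      {normal_rate:.4f}")
print(f"exponential: {expo_rate:.4f}")
```

Even though no observation is contaminated, the same rule labels a
noticeably larger share of the exponential draws as outside than of the
normal draws, which is why the fence rule has to be reported and, ideally,
matched to the assumed distribution.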
Consultant MD, ph.d. Associate professor