You are correct that outside values are defined as in Tukey (1977).
More precisely, each whisker ends at the last observed value that is
<= 1.5*IQR above the upper quartile (or below the lower quartile), and
observations beyond those points are plotted as outside values. The
Stata graphics manual ([G] graph box) defines this quite explicitly.
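To make the 1.5*IQR rule concrete, here is a minimal sketch in Python
(not Stata code). The function name tukey_box_stats is hypothetical, and
the quartiles use Python's "inclusive" interpolation method, which is an
assumption: Stata's quartile rule may give slightly different values for
small samples.

```python
from statistics import quantiles


def tukey_box_stats(values):
    """Return quartiles, whisker ends, and outside values (Tukey-style rule).

    Assumption: quartiles come from statistics.quantiles(method="inclusive");
    other packages may use a different quartile definition.
    """
    xs = sorted(values)
    q1, median, q3 = quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences: limits beyond which a point counts as "outside".
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observed values inside the fences.
    lower_whisker = min(x for x in xs if x >= lower_fence)
    upper_whisker = max(x for x in xs if x <= upper_fence)
    outside = [x for x in xs if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outside": outside,
    }


stats = tukey_box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

Note that the whisker stops at an observed value (9 in this example),
not at the fence itself (14.5), and 30 is the only outside value.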
For some purposes I have had cause to use variations on the box plot
(e.g. plotting the 10th and 90th percentiles). When I do this, I make
sure the legend states clearly what I have done.
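As an illustration of such a variation, the following Python sketch
computes whisker ends at the 10th and 90th percentiles and builds a
legend string that states the definition. The function name
percentile_whiskers and the legend wording are hypothetical, and the
"inclusive" percentile method is an assumption; other software may
interpolate percentiles differently.

```python
from statistics import quantiles


def percentile_whiskers(values):
    """Return (p10, p90, legend) for a box plot with percentile whiskers.

    Assumption: percentiles use statistics.quantiles(method="inclusive").
    """
    # n=10 gives nine cut points: the 10th, 20th, ..., 90th percentiles.
    deciles = quantiles(sorted(values), n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    # State the whisker definition explicitly so readers are not misled.
    legend = "Whiskers: 10th and 90th percentiles (not Tukey 1.5*IQR)"
    return p10, p90, legend


p10, p90, legend = percentile_whiskers(range(1, 12))
print(p10, p90, legend)
```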
David
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Jens
Lauritsen
Sent: 23 May 2006 16:59
To: statalist@hsphsun2.harvard.edu
Subject: st: Definition of "outside" in box plots
Defining box plots.
The Tukey definition is:
The box shows the interquartile range (25th to 75th percentile) with
the median highlighted.
The length of the whiskers is 1.5 * the interquartile range.
But sometimes in teaching medical professionals I see other definitions,
e.g. that whiskers are the 10th and 90th percentile.
I suggest that when box plot manuals are rewritten, the definition used
be added.
I did not manage to find in any Stata document how the "outside" limit
is defined in a box plot. But I assume it is the original Tukey
definition (1.5), since the documentation mentions the Tukey paper as
the origin.
Has anyone else experienced problems with varying understanding of the
definition of box plots?
Jens Lauritsen
Consultant MD, ph.d. Associate professor Denmark