The Stata listserver

st: RE: Adjacent values in graph_box

From   "Nick Cox" <>
To   <>
Subject   st: RE: Adjacent values in graph_box
Date   Wed, 2 Jun 2004 10:40:43 +0100

The adjacent values are the most extreme 
values within 1.5 iqr of the nearer 
quartile. This definition goes back 
to Tukey and is, I think, the one most 
commonly used in box plots. 

That is, calculate 

iqr = upper quartile - lower quartile 

and identify 

(1) the largest value <= (upper quartile + 1.5 iqr) 

(2) the smallest value >= (lower quartile - 1.5 iqr) 
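
As a minimal sketch of that calculation, assuming the -auto- dataset 
shipped with Stata and its variable -mpg- (both chosen here purely for 
illustration), the two adjacent values can be obtained from -summarize-. 
The scalar names are arbitrary, and the quartiles are whatever 
-summarize, detail- returns. 

```stata
* Illustrative only: dataset, variable and scalar names are assumptions
sysuse auto, clear

* Quartiles as computed by -summarize, detail-
quietly summarize mpg, detail
scalar q1  = r(p25)
scalar q3  = r(p75)
scalar iqr = scalar(q3) - scalar(q1)

* (1) largest value <= upper quartile + 1.5 iqr
quietly summarize mpg if mpg <= scalar(q3) + 1.5 * scalar(iqr)
scalar upper_adj = r(max)

* (2) smallest value >= lower quartile - 1.5 iqr
quietly summarize mpg if mpg >= scalar(q1) - 1.5 * scalar(iqr)
scalar lower_adj = r(min)

display "lower adjacent value = " scalar(lower_adj)
display "upper adjacent value = " scalar(upper_adj)
```

Any observations below the lower adjacent value or above the upper 
adjacent value are those a box plot would show as separate points. 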

This leaves open two overlapping issues: 

(a) exactly how the quartiles are calculated: [R] summarize
gives Stata's rule 

(b) other definitions of the box plot; see, e.g., 

Frigge, M., Hoaglin, D.C. and Iglewicz, B. 1989. 
Some implementations of the box plot.
The American Statistician 43: 50--54. 

Having said that, I don't think that 
-graph box- offers any handles for 
varying this definition. 

Recently a program -adjacent- was posted
on SSC which lists adjacent values according
to this definition. More useful, but not 
yet in the public domain, are -egen- functions
-adjl()-, -adju()- and -outside()- which 
emit the lower and upper adjacent values 
and (any) values outside those. These make 
it a little easier to program your own variants on 
the box plot. 


John Wallace wrote: 

> I've misplaced my graphics manual at the moment, and 
> I'm trying to figure out two things:
> 1)     what are the default values for the adjacent 
> values (location of the end of the whiskers on a boxplot)
> 2)     Can/how are they altered?

> I thought it was -graph box, cwhiskers(...)-, but that appears to be 
> for the appearance of the whiskers (line weight, cap appearance, etc)
