


st: RE: using percentiles for length of whiskers in box plots

From   "Nick Cox" <>
To   <>
Subject   st: RE: using percentiles for length of whiskers in box plots
Date   Sun, 9 May 2010 17:43:21 +0100

This was already answered earlier the same day in a different thread


Although the thread subject doesn't refer to box plots, the same key
reference was repeated, again on the same day, within 


in a thread whose subject does refer to box plots. 

The key ideas are 

1. -graph box- won't do this for you. 

2. You need to calculate all the ingredients explicitly and write your
own code. That's not as difficult as might be feared, as this example
should show. 

sysuse auto 

foreach p in 5 25 50 75 95 {
	egen p`p' = pctile(price), by(foreign) p(`p')
}

egen tag = tag(foreign)

// what follows is all one command; /// continues it across lines in a do-file
twoway rbar p50 p75 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) || ///
rbar p50 p25 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) || ///
rspike p75 p95 foreign if tag, lcolor(dknavy) || ///
rspike p25 p5 foreign if tag, lcolor(dknavy) || ///
scatter price foreign if !inrange(price,p5,p95), legend(off) ///
ytitle("`: var label price'") xla(0 1, valuelabel)
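The same recipe should carry over to other whisker percentiles, such as the 1st and 99th. What follows is only a sketch under that assumption: the locals lo and hi and the variable names q1, q25, ..., q99 and qtag are made up for illustration and are not part of the example above. Run it as one block in a do-file so that the locals stay defined.

```stata
* Sketch only: whisker percentiles held in locals so they are easy to change.
* lo, hi, q* and qtag are illustrative names, not from the example above.
sysuse auto, clear

local lo 1
local hi 99

* one variable per percentile, calculated separately for each group
foreach p in `lo' 25 50 75 `hi' {
	egen q`p' = pctile(price), by(foreign) p(`p')
}

* tag one observation per group so each box is drawn once
egen qtag = tag(foreign)

twoway rbar q50 q75 foreign if qtag, barw(0.4) bcolor(ltblue) blcolor(dknavy) || ///
rbar q50 q25 foreign if qtag, barw(0.4) bcolor(ltblue) blcolor(dknavy) || ///
rspike q75 q`hi' foreign if qtag, lcolor(dknavy) || ///
rspike q25 q`lo' foreign if qtag, lcolor(dknavy) || ///
scatter price foreign if !inrange(price, q`lo', q`hi'), legend(off) ///
ytitle("`: var label price'") xla(0 1, valuelabel)
```

With groups as small as those in the auto data (52 domestic and 22 foreign cars), the 1st and 99th percentiles will probably coincide with the group minimum and maximum, so the scatter layer may show no points at all. Whisker percentiles that extreme are more informative with larger groups.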

The key reference was 

SJ-9-3  gr0039  . . . . .  Speaking Stata: Creating and varying box plots
        . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q3/09   SJ 9(3):478--496                          (no commands)
        explains how to use egen to calculate the statistical
        ingredients needed for box plots and variations of box
        plots; shows the use of twoway to then create the plots


Daniel Koralek

I was wondering if anybody had any advice on using percentiles as
cutoffs for the length of whiskers in box plots.  That is, the default is a
distance of 1.5 times the IQR above the 75th percentile and 1.5 times
the IQR below the 25th percentile.  I'm more interested in using
percentiles of the data, such as the 5th and 95th or the 1st and 99th.
