st: RE: using percentiles for length of whiskers in box plots


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: using percentiles for length of whiskers in box plots
Date   Sun, 9 May 2010 17:43:21 +0100

This was already answered earlier the same day in a different thread
within 

<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1005/date/article-332.html>

Although the thread subject doesn't refer to box plots, the same key
reference was repeated, again on the same day, within 

<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1005/date/article-344.html>

in a thread whose subject does refer to box plots. 

The key ideas are 

1. -graph box- won't do this for you. 

2. You need to calculate all the ingredients explicitly and write your
own code. That's not as difficult as might be feared, as this example
should show. 

sysuse auto 

foreach p in 5 25 50 75 95 {
	egen p`p' = pctile(price), by(foreign) p(`p')
}

egen tag = tag(foreign)

// what follows is all one command 
twoway rbar p50 p75 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) ///
|| rbar p50 p25 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) ///
|| rspike p75 p95 foreign if tag, lcolor(dknavy) ///
|| rspike p25 p5 foreign if tag, lcolor(dknavy) ///
|| scatter price foreign if !inrange(price,p5,p95), legend(off) ///
ytitle("`: var label price'") xla(0 1, valuelabel)
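
The same recipe should carry over to other cutoffs. What follows is only a sketch, not part of the original example: it keeps the whisker percentiles in two local macros so that they can be changed in one place (say to 1 and 99, or 5 and 95). The names lo, hi, wlo, whi, q25, q50, q75 and pick are made up for illustration, and the code is meant to be run as a single do-file so that the local macros and the /// continuations are understood.

```stata
* Sketch only: whisker percentiles held in local macros.
* lo, hi, wlo, whi, q25, q50, q75 and pick are illustrative names.
sysuse auto, clear

local lo 10
local hi 90

* whisker ends, calculated separately for each group
egen wlo = pctile(price), by(foreign) p(`lo')
egen whi = pctile(price), by(foreign) p(`hi')

* quartiles and median for the box
foreach p in 25 50 75 {
	egen q`p' = pctile(price), by(foreign) p(`p')
}

* one observation per group, so that each box is drawn once
egen pick = tag(foreign)

twoway rbar q50 q75 foreign if pick, barw(0.4) bcolor(ltblue) blcolor(dknavy) ///
    || rbar q50 q25 foreign if pick, barw(0.4) bcolor(ltblue) blcolor(dknavy) ///
    || rspike q75 whi foreign if pick, lcolor(dknavy) ///
    || rspike q25 wlo foreign if pick, lcolor(dknavy) ///
    || scatter price foreign if !inrange(price, wlo, whi), legend(off) ///
    ytitle("`: var label price'") xla(0 1, valuelabel) ///
    note("Whiskers: percentiles `lo' and `hi'")
```

Values beyond the chosen percentiles are still shown as individual points by the final -scatter- call. With extreme choices and small groups there may be few or no such points.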


The key reference was 

SJ-9-3  gr0039  . . . . . .  Speaking Stata: Creating and varying box plots
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        Q3/09   SJ 9(3):478--496                            (no commands)
        explains how to use egen to calculate the statistical
        ingredients needed for box plots and variations of box
        plots; shows the use of twoway to then create the plots
Nick 
[email protected] 

Daniel Koralek

I was wondering if anybody had any advice on using some percentiles as
cutoffs for the length of whiskers in box plots.  That is, the default is a
distance of 1.5 times the IQR above the 75th percentile and 1.5 times
the IQR below the 25th percentile.  I'm more interested in using some
percentile, such as where the 5th and 95th or 1st and 99th percentiles
of the data are.  
