From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: using percentiles for length of whiskers in box plots
Date: Sun, 9 May 2010 17:43:21 +0100

This was already answered earlier the same day in a different thread within

<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1005/date/article-332.html>

Although the thread subject doesn't refer to box plots, the same key reference was repeated, again on the same day, within

<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1005/date/article-344.html>

in a thread whose subject does refer to box plots.

The key ideas are

1. -graph box- won't do this for you.

2. You need to calculate all the ingredients explicitly and write your own code.

That's not as difficult as might be feared, as this example should show.

    sysuse auto

    // percentiles of price within each category of foreign
    foreach p in 5 25 50 75 95 {
        egen p`p' = pctile(price), by(foreign) p(`p')
    }

    // tag one observation per category, so each box is drawn once
    egen tag = tag(foreign)

    // what follows is all one command, continued across lines with ///
    twoway rbar p50 p75 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) ///
        || rbar p50 p25 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) ///
        || rspike p75 p95 foreign if tag, lcolor(dknavy)                          ///
        || rspike p25 p5 foreign if tag, lcolor(dknavy)                           ///
        || scatter price foreign if !inrange(price,p5,p95),                       ///
           legend(off) ytitle("`: var label price'") xla(0 1, valuelabel)

The key reference was

SJ-9-3  gr0039  . . . . . . . .  Speaking Stata: Creating and varying box plots
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q3/09   SJ 9(3):478--496                                 (no commands)
        explains how to use egen to calculate the statistical
        ingredients needed for box plots and variations of box
        plots; shows the use of twoway to then create the plots

Nick
n.j.cox@durham.ac.uk

Daniel Koralek

I was wondering if anybody had any advice on using some percentiles as cutoffs for the length of whiskers in box plots, i.e., the default is a distance of 1.5 times the IQR above the 75th percentile and 1.5 times the IQR below the 25th percentile. I'm more interested in using some percentile, such as where the 5th and 95th or 1st and 99th percentiles of the data are.
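
For the 1st and 99th percentiles also mentioned in the question, the same recipe should carry over with only the percentile numbers changed. What follows is a minimal sketch, assuming the auto data again; the local macros lo and hi are illustrative names introduced here, not part of the original post.

    sysuse auto, clear

    // whisker percentiles (illustrative locals; change to taste)
    local lo 1
    local hi 99

    // percentiles of price within each category of foreign
    foreach p in `lo' 25 50 75 `hi' {
        egen p`p' = pctile(price), by(foreign) p(`p')
    }

    // tag one observation per category, so each box is drawn once
    egen tag = tag(foreign)

    // all one command, continued across lines with ///
    twoway rbar p50 p75 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) ///
        || rbar p50 p25 foreign if tag, barw(0.4) bcolor(ltblue) blcolor(dknavy) ///
        || rspike p75 p`hi' foreign if tag, lcolor(dknavy)                        ///
        || rspike p25 p`lo' foreign if tag, lcolor(dknavy)                        ///
        || scatter price foreign if !inrange(price, p`lo', p`hi'),                ///
           legend(off) ytitle("`: var label price'") xla(0 1, valuelabel)

With groups as small as those in the auto data, the 1st and 99th percentiles will typically coincide with the minimum and maximum, so few or no points are plotted individually beyond the whiskers.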

