Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Scale break in box plot

From: Scott Merryman
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Scale break in box plot
Date: Mon, 16 Dec 2013 14:47:56 -0600

```
You will need to construct the box plots using -twoway- graphs.

The following example is adapted from Nick Cox's 2009 Stata Journal
(9:3) article "Speaking Stata: Creating and varying box plots"

http://www.stata-journal.com/sjpdf.html?articlenum=gr0039

sysuse lifeexp, clear
replace lexp = 35 if country == "Haiti"
egen median = median(lexp), by(region)
egen upq = pctile(lexp), p(75) by(region)
egen loq = pctile(lexp), p(25) by(region)
egen iqr = iqr(lexp), by(region)
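* Whisker ends: the most extreme observations within 1.5 IQR of the quartiles
egen upper = max(cond(lexp <= upq + 1.5 * iqr, lexp, .)), by(region)
egen lower = min(cond(lexp >= loq - 1.5 * iqr, lexp, .)), by(region)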
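* Optional check, not in the original example: for these data the whisker
* ends (lower, upper) should bracket the box (loq, upq) in every region.
* -assert- exits with an error if the condition fails for any observation.
assert lower <= loq & loq <= median & median <= upq & upq <= upper
* List the values that the graph commands below will draw, one row per region
tabstat lower loq median upq upper, by(region) statistics(mean) nototal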
twoway rbar median upq region, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
rbar median loq region, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
rspike upq upper region, pstyle(p1) || ///
rspike loq lower region, pstyle(p1) || ///
rcap upper upper region, pstyle(p1) msize(*2) || ///
rcap lower lower region, pstyle(p1) msize(*2) || ///
scatter lexp region if !inrange(lexp, lower, upper) & lexp > 50, ///
ms(Oh)  mla(country) mlabcolor(gs8) xscale(off)  legend(off) ///
yla(, ang(h)) ytitle(Life expectancy (years)) xtitle("") name(gr1,replace)

scatter lexp region if !inrange(lexp, lower, upper) & lexp < 50, ylabel(35) ///
fysize(20) ytitle("") ms(Oh) mlabcolor(blue) mcolor(blue)  mla(country)  ///
xla(1 `" "Europe and" "Central Asia" "' 2 "North America" 3 "South America", ///
noticks) ytitle(Life expectancy (years), color(white))  yla(, ang(h))  ///
xtitle("")  name(gr2,replace)

graph combine gr1 gr2, col(1) xcommon imargin(zero)

Scott

On Mon, Dec 16, 2013 at 1:24 PM, Rakesh Ghosh <rakeshgh@usc.edu> wrote:
>>>> Dear Stata list members
>>>>
>>>> I have a box plot with many outliers. I would like to insert a scale break to enlarge the boxes and reduce the span taken up by the outliers. I tried both options in the Stata FAQ on scale breaks (http://www.stata.com/support/faqs/graphics/scale-breaks/). Inserting a line will not work in my case because there is no gap in my data points. The second option does work: I create a box plot and a scatter plot and then combine them.
>>>>
>>>> -graph box trafficdensity if trafficdensity>0 & trafficdensity<=125, over(county)-
>>>>
>>>> However, the median, p25, and p75 are underestimated because I restrict the upper limit of the box plot, so this is not good enough for me. I have to restrict the upper limit; otherwise the plot will not be a usable size. Can you think of any way to insert a break on the y axis?
>>>>
>>>> Thanks for any suggestion.
>>>>
>>>> Rakesh Ghosh

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
```
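
The reply computes the box summaries with -egen- before any plotting, so the median and quartiles come from all observations and only the outlier markers are split between the two panels. A minimal sketch of why that matters, using the same lifeexp data: it compares the median taken over every observation with the median taken after low values are dropped first, which is analogous to the -if- restriction on -graph box- in the question. The variable names med_all and med_cut are made up for this illustration and do not appear in the original post.

```stata
* Sketch only: med_all and med_cut are hypothetical names for illustration
sysuse lifeexp, clear
replace lexp = 35 if country == "Haiti"

* Median from every observation in the region, as in the reply
egen med_all = median(lexp), by(region)

* Median after first discarding the low values; observations excluded by
* the -if- qualifier get a missing value for med_cut
egen med_cut = median(lexp) if lexp > 50, by(region)

* Compare region by region; med_cut can only equal or exceed med_all,
* because only low values were removed
tabstat med_all med_cut, by(region) statistics(mean) nototal
```

If the two columns differ for a region, restricting the sample before summarizing has shifted the median there, which is the problem described in the question. Computing the summaries first and restricting only the outlier markers avoids it.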