# st: RE: RE: Plotting the MEAN on a box plot

From: "Nick Cox"
To:
Subject: st: RE: RE: Plotting the MEAN on a box plot
Date: Mon, 23 Feb 2009 12:28:43 -0000

```
Also, -stripplot- does not support multiple -over()-s. You'd need to
work out your own composite group variable first. That could allow gaps.

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: 23 February 2009 12:20
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Plotting the MEAN on a box plot

You can always check the archives to see what "got through". There is no
need for any doubt on that point.

On your question, this isn't any easier than it was. One reason is that
box plots are based on the assumption that you want medians, not means.
Another is that -graph box- and -graph hbox- are not members of the
-twoway- family, so you can't superimpose -twoway- graphs.

There is an alternative, however, in -stripplot- from SSC.

This shows some technique, although much else is possible:

sysuse auto, clear

egen mean = mean(mpg) , by(foreign)

gen foreign2 = foreign + 0.3
stripplot mpg, over(foreign) stack height(0.2) ///
box(barw(0.15)) boffset(0.3) vertical ///
addplot(scatter mean foreign2, ms(D))

egen loq = pctile(mpg), p(25) by(foreign)
egen upq = pctile(mpg), p(75) by(foreign)

stripplot mpg, over(foreign) ///
box(barw(0.15)) vertical ms(none) ///
addplot(scatter mean foreign, ms(D) || ///
scatter mpg foreign if !inrange(mpg, loq, upq))

This code produces two different designs, neither utterly standard. The
first is a dot plot with boxes and means offset. The second is a box
plot but with means added and all data points shown outside the boxes.
My own bias is that box plots too often let down both analyst and
audience by suppressing too much detail in the tails. Only if I have say
30 or 70 boxes rather than 3 or 7 do I want as much reduction as box
plots typically give.

-stripplot- has its own prejudices too, presumably to be blamed on its
author. In particular, it doesn't support a mix of colours for different
boxes -- "bladders are shown in maroon" --  presuming that to the alert
reader the text labelling is signal enough. I've not tried subverting
that via the Graph Editor, as I've never wanted it, but it may be
possible.

Nick
n.j.cox@durham.ac.uk

Neil Martin

**Posting to this list is new for me.  I thought I had submitted this
question two days ago but I can't find it in the archive... That is to
say, sorry if this is a repost.**

I am interested in producing a figure of vertical box plots comparing
groups of four variables across three strata and then across two more
strata.  I'm very happy with the figure I've managed to produce this way
and I would now like to add the mean values superimposed on the boxes
themselves.  I know this topic has been covered previously on this
listserve:
(http://www.stata.com/statalist/archive/2003-06/msg00080.html) but I'm
having trouble interpreting what the final answer was on this issue.  It
seemed like the implications were 1) try to overlay a figure of the
means, or 2) use another software package.  All things being equal, I
would very much like to use the figure I have now but I don't fully
understand how I would overlay the means.

I am using version 10.1 on the Mac.

I'm not sure how much this will help, but I am plotting my
variable (dose) first over the type of data collected (data1), second
over the type of treatment plan (plan1), and finally over a broad target
criterion (internal).  My code is as follows:

graph box dose if oar==1, ///
    over(data1, gap(50)) over(plan1, gap(300)) over(internal, gap(500)) ///
    box(1, fcolor(dknavy)) ///
    box(2, fcolor(forest_green) lcolor(forest_green)) ///
    box(3, fcolor(dkorange) lcolor(dkorange)) ///
    box(4, fcolor(maroon) lcolor(maroon)) ///
    medtype(marker) medmarker(msize(vtiny) msymbol(smplus)) ///
    marker(1, mcolor(dknavy)) marker(2, mcolor(forest_green)) ///
    marker(3, mcolor(dkorange)) marker(4, mcolor(maroon)) ///
    ytitle(% of total volume) caption(bladder) ///
    legend(rows(1)) scheme(s2color8) xsize(9) ysize(7)

Notably, I can also produce a dot plot of the means according to the
same grouping as above using:

graph dot (mean) dose if oar==1, over(data1, gap(50)) ///
    over(plan1, gap(300)) over(internal, gap(500))

But the figure is oriented horizontally and I don't know the command
which allows me to overlay the two (if such a thing is even possible).

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
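The follow-up remark about building a composite group variable, with gaps, can be sketched as below. This is an untested illustration and not code from the thread. It assumes -stripplot- is installed from SSC and borrows the auto data, with rep78 standing in for a second grouping variable. The names pos and meanmpg are hypothetical, and the factor 6 is chosen only because rep78 runs from 1 to 5.

```stata
* ssc install stripplot        (once, if not already installed)
sysuse auto, clear
drop if missing(rep78)          // keep only cars with a repair record

* Composite position: 1-5 for domestic cars, 7-11 for foreign cars.
* Position 6 is left empty, which gives a visible gap between the blocks.
gen pos = rep78 + 6 * foreign

* Group means, to be overlaid as diamonds at the same positions
egen meanmpg = mean(mpg), by(pos)

stripplot mpg, over(pos) ///
box(barw(0.15)) vertical ms(none) ///
addplot(scatter meanmpg pos, ms(D))
```

A wider or narrower gap follows from changing the multiplier on foreign. With three or more grouping variables the same idea should apply, at the cost of working out the positions (and probably the axis labels) by hand.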