Statalist The Stata Listserver

Re: st: RE: RE: Box Plot

From   "Ingo Brooks" <>
Subject   Re: st: RE: RE: Box Plot
Date   Thu, 23 Nov 2006 15:30:27 +0100

Dear Nick and Scott

Thanks a lot for your suggestions and the great program.

Nick: Your program delivers exactly the type of chart I need.
However, ... Would it be very complicated to add a second over()
option (and ideally also a -total- suboption for the first over()
option) to the program? Unfortunately, I've never programmed a
graphics command in Stata and really don't know how I could do this.

The box plot command of my dreams would look something like this:

gen O2 = (_n<30)

box1090_dream  mpg, over(foreign, total relabel(0 "d" 1 "f" 3 "all")) over(O2)
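
Something close to this can be approximated with the single over() option the program already has, by preparing the data first. A minimal sketch, assuming the auto data shipped with Stata and that box1090.ado from this thread has been saved on the adopath; the names origin3, origin3lbl and cell are made up for illustration and are not part of the program:

```stata
* Sketch only: emulate over(foreign, total) over(O2) through data preparation.
sysuse auto, clear
gen O2 = (_n < 30)

* Append one copy of every observation; the copies form the "all" category.
local N = _N
expand 2
gen byte origin3 = foreign
replace origin3 = 2 if _n > `N'
label define origin3lbl 0 "d" 1 "f" 2 "all"
label values origin3 origin3lbl

* Cross the two grouping variables into a single variable for over().
egen cell = group(O2 origin3), label

* Draw the plot only if the program is installed.
capture which box1090
if _rc == 0 {
        box1090 mpg, over(cell)
}
else {
        display as text "box1090 not found: save the program from this thread as box1090.ado"
}
```

The duplicated observations exist only to feed the plot, so this is best run on a copy of the data (for example between -preserve- and -restore-).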

Thanks again.


On 11/23/06, Nick Cox <> wrote:
True, but the recipe there given does not include
separate plotting of data points beyond the 10% and
90% percentiles, which I imagine is often desired.

The following has more pretensions to generality.
There are some hooks to use, but this does not
cover the common cases of horizontal alignment
and box plots for several (similar) variables.

----------------------------------------- box1090.ado
*! NJC 1.0.0 23 Nov 2006
* examples: box1090 mpg, over(rep78) box(barw(0.3)) ms(oh)
*           box1090 length, over(grade) ms(oh) yla(, ang(h)) xla(, noticks)
program box1090
        version 8
        syntax varname(numeric) [if] [in], over(varname) ///
        [box(str asis) spike(str asis) * ]
        local y "`varlist'"

        marksample touse
        markout `touse' `over', strok

        quietly {
                count if `touse'
                if r(N) == 0 error 2000

                tempvar catvar p10 p90 p25 p75 p50 out tag

                * percentiles of `y' within each category of over()
                foreach p in 10 25 50 75 90 {
                        egen `p`p'' = pctile(`y') if `touse', ///
                        by(`over') p(`p')
                }

                * values beyond the 10% and 90% percentiles are shown as points
                gen `out' = `y' if `touse' & (`y' < `p10' | `y' > `p90')
                * map categories to 1, 2, ... and tag one observation in each
                egen `catvar' = group(`over') if `touse', label
                su `catvar', meanonly
                local max = r(max)
                egen `tag' = tag(`catvar') if `touse'

                * axis titles: variable label if defined, else variable name
                local yttl : var label `y'
                if `"`yttl'"' == "" local yttl "`y'"

                local xttl : var label `over'
                if `"`xttl'"' == "" local xttl "`over'"
        }

        * box from p25 to p75 split at the median; spikes out to p10 and p90
        twoway                                             ///
        rbar `p50' `p75' `catvar' if `tag',                ///
        barw(0.4) bc(none) blc(green) blw(medium) `box' || ///
        rbar `p25' `p50' `catvar' if `tag',                ///
        barw(0.4) bc(none) blc(green) blw(medium) `box' || ///
        rspike `p10' `p25' `catvar' if `tag',              ///
        blcolor(green) `spike'                          || ///
        rspike `p75' `p90' `catvar' if `tag',              ///
        blcolor(green) `spike'                          || ///
        scatter `out' `catvar' if `touse',                 ///
        yti("`yttl'") xti("`xttl'") legend(off)            ///
        xla(1/`max', valuelabel) `options'
end
-----------------------------------------
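
To see what the percentile and outlier steps produce, they can be repeated by hand outside the program. A minimal sketch, assuming the auto data shipped with Stata; the variable names p10, p90 and out here are ordinary variables chosen for illustration, not the temporary variables the program uses:

```stata
* Sketch only: group-wise 10% and 90% percentiles and the points beyond them.
sysuse auto, clear
egen p10 = pctile(mpg), by(foreign) p(10)
egen p90 = pctile(mpg), by(foreign) p(90)

* out is missing unless the value lies outside the p10 to p90 range.
gen out = mpg if mpg < p10 | mpg > p90

* One row per category: whisker ends and the number of points plotted separately.
tabstat p10 p90, by(foreign) statistics(mean)
tabstat out, by(foreign) statistics(n)
```

Only the observations with nonmissing out would appear as separate markers in the graph.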


Scott Merryman

> Nick [Cox] presented a way of doing this a couple weeks ago using
> -statsby- and -twoway-.  See

Ingo Brooks

> > I would like to produce a box plot like figure. However, instead of
> > the adjacent values that are provided by Stata's -graph box- procedure
> > I would like to depict the 10% and 90% percentile. Is there a way to
> > do this in Stata?

*   For searches and help try:
