Statalist The Stata Listserver



Re: st: RE: RE: Box Plot


From   "Ingo Brooks" <ingo.brooks@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: RE: Box Plot
Date   Fri, 24 Nov 2006 08:59:50 +0100

Dear Nick

Thanks a lot for your suggestions; they were very useful. By using
-egen group- I come close to what I need.

Kind regards,
Ingo


On 11/23/06, Nick Cox <n.j.cox@durham.ac.uk> wrote:
Another work-around when you have two
categorical controls is to map them to
one cross-classification:

egen group = group(catvar1 catvar2), label

and then call

box1090 interestingresponse, over(group)

The spacing may not be optimal, but it's easy.
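
As a minimal sketch of this work-around, assuming the auto data shipped with Stata, with -foreign- and -rep78- standing in for catvar1 and catvar2 and official -graph box- standing in for -box1090- (which has to be installed separately):

```stata
sysuse auto, clear

* map the two categorical variables to one cross-classification;
* -label- builds value labels from the labels/values of both variables
egen group = group(foreign rep78), label

* check which combinations actually occur
tabulate group

* official -graph box- accepts the combined variable directly
graph box mpg, over(group, label(angle(45)))

* with -box1090- installed, the call would instead be something like
* box1090 mpg, over(group) xla(, ang(45))
```

Observations with missing values on either variable get a missing group and so drop out of the plot.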

Nick
n.j.cox@durham.ac.uk

Nick Cox

> Unfortunately, implementing what you want would
> be far more work for me than was involved in writing
> -box1090- in the first place. It is cheap for me
> to call an option -over()-, but not to endow
> it with all the properties that -graph box-'s
> -over()- possesses.
>
> Key to your understanding here is that what I did
> had nothing to do with -graph box-. I used various
> -twoway- plottypes and built up a fake boxplot by
> implementing two -rbar-s, two -rspike-s and
> a -scatter-. If it looks similar, that's good enough.
> But there is no call to -graph box-
> and none of its properties (e.g. its inbuilt options)
> are available to the user-programmer given the route I took,
> or so I understand.
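
The idea of faking a box plot from -twoway- plottypes can be sketched as follows. This is only an illustration under assumptions, not the code of -box1090-: it uses the auto data, reduces the data with -collapse- rather than -egen-, and leaves out the -scatter- of points beyond the 10th and 90th percentiles.

```stata
sysuse auto, clear
drop if missing(rep78)

* one observation per category, holding the five percentiles of mpg
collapse (p10) p10=mpg (p25) p25=mpg (p50) p50=mpg ///
         (p75) p75=mpg (p90) p90=mpg, by(rep78)

* two -rbar-s form the box, so their shared edge marks the median;
* two -rspike-s run out to the 10th and 90th percentiles
twoway rbar p25 p50 rep78, barwidth(0.4) fcolor(none) lcolor(navy) || ///
       rbar p50 p75 rep78, barwidth(0.4) fcolor(none) lcolor(navy) || ///
       rspike p10 p25 rep78, lcolor(navy)                          || ///
       rspike p75 p90 rep78, lcolor(navy)                             ///
       legend(off) ytitle("Mileage (mpg)") xtitle("Repair Record 1978")
```

Because nothing here calls -graph box-, only -twoway- options apply to the result.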
>
> However, you mention three specifics:
>
> 1. Relabelling. You can get whatever text labels
> you want on the x axis just by using -xla()-
> as normal.
>
> 2. A "Total" box as well. Here is a work-around
> for this:
>
> sysuse auto, clear
> preserve
> local np1 = _N + 1
> expand 2
> replace rep78 = 6 in `np1'/l
> label def rep78 6 "Total"
> label val rep78 rep78
> box1090 mpg, over(rep78)
> restore
>
> In words, the whole of the data is just
> added temporarily to the data as if it
> were an extra category.
>
> 3. A second -over()- option. Could be
> done, but doesn't appeal right now.
> Not equivalent, but loosely similar is
> to use -graph combine- to show two -box1090-s
> side-by-side.
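
A rough sketch of the -graph combine- route in point 3, assuming the auto data and using official -graph box- so that it runs without -box1090-; the graph names g_dom and g_for are made up for the illustration:

```stata
sysuse auto, clear

* one plot per level of the second categorical variable (foreign here),
* each held in memory under its own name
graph box mpg if foreign == 0, over(rep78) title("Domestic") name(g_dom, replace)
graph box mpg if foreign == 1, over(rep78) title("Foreign")  name(g_for, replace)

* ycommon puts both panels on the same y scale so the boxes are comparable
graph combine g_dom g_for, ycommon
```

Since -box1090- passes extra options through to -twoway-, the same pattern should carry over with -box1090- in place of -graph box-.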
>
> Another route illustrated by a Michael Blasnik
> posting in an earlier thread is to delve
> deep into the -graph box- code and clone it
> so that 10%, 90% are available as alternative
> adjacent values, but that is not elementary.
> My own view is that this is a reasonable request
> to StataCorp, but so are about a thousand other
> tweaks to the official code.
>
> Your code below is illegal as "02" is not acceptable
> as a variable name (but we get the idea).
>
> Nick
> n.j.cox@durham.ac.uk
>
> Ingo Brooks
>
> > Dear Nick and Scott
> >
> > Thanks a lot for your suggestions and the great program.
> >
> > Nick: Your program delivers exactly the type of the chart I need.
> > However, ... Would it be very complicated to add a second over()
> > option (and ideally also a -total- suboption for the first over()
> > option) to the program? Unfortunately, I've never programmed a
> > graphics command in Stata and really don't know how I could do this.
> >
> > The box plot command of my dreams would look something like this:
> >
> > gen O2 = (_n<30)
> >
> > box1090_dream  mpg, over(foreign, total relabel(0 "d" 1 "f" 3
> > "all")) over(O2)
> >
> >
> > Thanks again.
> >
> > Best,
> > Ingo
> >
> >
> > On 11/23/06, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> > > True, but the recipe there given does not include
> > > separate plotting of data points beyond the 10% and
> > > 90% percentiles, which I imagine is often desired.
> > >
> > > The following has more pretensions to generality.
> > > There are some hooks to use, but this does not
> > > cover the common cases of horizontal alignment
> > > and box plots for several (similar) variables.
> > >
> > > ----------------------------------------- box1090.ado
> > > *! NJC 1.0.0 23 Nov 2006
> > > * examples: box1090 mpg, over(rep78) box(barw(0.3)) ms(oh)
> > > *           box1090 length, over(grade) ms(oh) yla(, ang(h)) xla(, noticks)
> > > program box1090
> > >         version 8
> > >         syntax varname(numeric) [if] [in], over(varname) ///
> > >         [box(str asis) spike(str asis) * ]
> > >         local y "`varlist'"
> > >
> > >         marksample touse
> > >         markout `touse' `over', strok
> > >
> > >         quietly {
> > >                 count if `touse'
> > >                 if r(N) == 0 error 2000
> > >
> > >                 tempvar catvar p10 p90 p25 p75 p50 out tag
> > >
> > >                 foreach p in 10 25 50 75 90 {
> > >                         egen `p`p'' = pctile(`y') if `touse', ///
> > >                         by(`over') p(`p')
> > >                 }
> > >
> > >                 gen `out' = `y' if `touse' & (`y' < `p10' | `y' > `p90')
> > >                 egen `catvar' = group(`over') if `touse', label
> > >                 su `catvar', meanonly
> > >                 local max = r(max)
> > >                 egen `tag' = tag(`catvar') if `touse'
> > >
> > >                 local yttl : var label `y'
> > >                 if `"`yttl'"' == "" local yttl "`y'"
> > >
> > >                 local xttl : var label `over'
> > >                 if `"`xttl'"' == "" local xttl "`over'"
> > >
> > >         }
> > >
> > >         twoway                                             ///
> > >         rbar `p50' `p75' `catvar' if `tag',                ///
> > >         barw(0.4) bc(none) blc(green) blw(medium) `box' || ///
> > >         rbar `p25' `p50' `catvar' if `tag',                ///
> > >         barw(0.4) bc(none) blc(green) blw(medium) `box' || ///
> > >         rspike `p10' `p25' `catvar' if `tag',              ///
> > >         blcolor(green) `spike'                          || ///
> > >         rspike `p75' `p90' `catvar' if `tag',              ///
> > >         blcolor(green) `spike'                          || ///
> > >         scatter `out' `catvar' if `touse',                 ///
> > >         yti("`yttl'") xti("`xttl'") legend(off)            ///
> > >         xla(1/`max', valuelabel) `options'
> > > end
> > > --------------------------------------------------
> > >
> > > Nick
> > > n.j.cox@durham.ac.uk
> > >
> > > Scott Merryman
> > >
> > > > Nick [Cox] presented a way of doing this a couple weeks ago using
> > > > -statsby- and -twoway-.  See
> > > >
> > > > http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0611/Author/article-307.html
> >
> >
> > Ingo Brooks
> >
> > > > I would like to produce a box plot like figure. However, instead of
> > > > the adjacent values that are provided by Stata's -graph box- procedure
> > > > I would like to depict the 10% and 90% percentile. Is there a way to
> > > > do this in Stata?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



