Statalist The Stata Listserver



RE: st: RE: RE: Box Plot


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: RE: Box Plot
Date   Thu, 23 Nov 2006 17:36:31 -0000

Another work-around when you have two 
categorical controls is to map them to 
one cross-classification: 

egen group = group(catvar1 catvar2), label 

and then call 

box1090 interestingresponse, over(group) 

The spacing may not be optimal, but it's easy. 
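
For concreteness, here is a minimal sketch with the auto data. It assumes
-box1090- (the ado-file posted earlier in this thread) has been saved
somewhere on your adopath; the choice of foreign and rep78 as the two
categorical variables is purely for illustration.

```stata
* Sketch only: assumes box1090.ado from earlier in this thread is installed
sysuse auto, clear

* one cross-classified variable; -label- builds its value labels
* from the value labels (or values) of the two originals
egen group = group(foreign rep78), label

* check which combinations actually occur before plotting
tabulate group

* combined labels are long, so angle them on the x axis
box1090 mpg, over(group) xla(, ang(45))
```

Observations missing on either categorical variable get a missing group
and so drop out of the plot.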

Nick 
[email protected] 

Nick Cox
 
> Unfortunately, implementing what you want would 
> be far more work for me than was involved in writing 
> -box1090- in the first place. It is cheap for me 
> to call an option -over()-, but not to endow 
> it with all the properties that -graph box-'s 
> -over()- possesses. 
> 
> Key to your understanding here is that what I did 
> had nothing to do with -graph box-. I used various
> -twoway- plottypes and built up a fake boxplot by 
> implementing two -rbar-s, two -rspike-s and 
> a -scatter-. If it looks similar, that's good enough.
> But there is no call to -graph box-
> and none of its properties (e.g. its inbuilt options) 
> are available to the user-programmer given the route I took, 
> or so I understand. 
> 
> However, you mention three specifics: 
> 
> 1. Relabelling. You can get whatever text labels 
> you want on the x axis just by using -xla()- 
> as normal. 
> 
> 2. A "Total" box as well. Here is a work-around
> for this: 
> 
> sysuse auto, clear
> preserve 
> local np1 = _N + 1 
> expand 2
> replace rep78 = 6 in `np1'/l
> label def rep78 6 "Total"
> label val rep78 rep78
> box1090 mpg, over(rep78)
> restore
> 
> In words, the whole of the data is just 
> added temporarily to the data as if it 
> were an extra category. 
> 
> 3. A second -over()- option. Could be 
> done, but doesn't appeal right now. 
> Not equivalent, but loosely similar is 
> to use -graph combine- to show two -box1090-s
> side-by-side.  
> 
> Another route illustrated by a Michael Blasnik 
> posting in an earlier thread is to delve 
> deep into the -graph box- code and clone it
> so that 10%, 90% are available as alternative 
> adjacent values, but that is not elementary. 
> My own view is that this is a reasonable request 
> to StataCorp, but so are about a thousand other 
> tweaks to the official code. 
> 
> Your code below is illegal as "02" is not acceptable
> as a variable name (but we get the idea). 
> 
> Nick 
> [email protected] 
> 
> Ingo Brooks
>  
> > Dear Nick and Scott
> > 
> > Thanks a lot for your suggestions and the great program.
> > 
> > Nick: Your program delivers exactly the type of chart I need.
> > However, ... Would it be very complicated to add a second over()
> > option (and ideally also a -total- suboption for the first over()
> > option) to the program? Unfortunately, I've never programmed a
> > graphics command in Stata and really don't know how I could do this.
> > 
> > The box plot command of my dreams would look something like this:
> > 
> > gen O2 = (_n<30)
> > 
> > box1090_dream  mpg, over(foreign, total relabel(0 "d" 1 "f" 3 "all")) over(O2)
> > 
> > 
> > Thanks again.
> > 
> > Best,
> > Ingo
> > 
> > 
> > On 11/23/06, Nick Cox <[email protected]> wrote:
> > > True, but the recipe there given does not include
> > > separate plotting of data points beyond the 10% and
> > > 90% percentiles, which I imagine is often desired.
> > >
> > > The following has more pretensions to generality.
> > > There are some hooks to use, but this does not
> > > cover the common cases of horizontal alignment
> > > and box plots for several (similar) variables.
> > >
> > > ----------------------------------------- box1090.ado
> > > *! NJC 1.0.0 23 Nov 2006
> > > * examples: box1090 mpg, over(rep78) box(barw(0.3)) ms(oh)
> > > *           box1090 length, over(grade) ms(oh) yla(, ang(h)) xla(, noticks)
> > > program box1090
> > >         version 8
> > >         syntax varname(numeric) [if] [in], over(varname) ///
> > >         [box(str asis) spike(str asis) * ]
> > >         local y "`varlist'"
> > >
> > >         marksample touse
> > >         markout `touse' `over', strok
> > >
> > >         quietly {
> > >                 count if `touse'
> > >                 if r(N) == 0 error 2000
> > >
> > >                 tempvar catvar p10 p90 p25 p75 p50 out tag
> > >
> > >                 foreach p in 10 25 50 75 90 {
> > >                         egen `p`p'' = pctile(`y') if `touse', ///
> > >                         by(`over') p(`p')
> > >                 }
> > >
> > >                 gen `out' = `y' if `touse' & (`y' < `p10' | `y' > `p90')
> > >                 egen `catvar' = group(`over') if `touse', label
> > >                 su `catvar', meanonly
> > >                 local max = r(max)
> > >                 egen `tag' = tag(`catvar') if `touse'
> > >
> > >                 local yttl : var label `y'
> > >                 if `"`yttl'"' == "" local yttl "`y'"
> > >
> > >                 local xttl : var label `over'
> > >                 if `"`xttl'"' == "" local xttl "`over'"
> > >
> > >         }
> > >
> > >         twoway                                             ///
> > >         rbar `p50' `p75' `catvar' if `tag',                ///
> > >         barw(0.4) bc(none) blc(green) blw(medium) `box' || ///
> > >         rbar `p25' `p50' `catvar' if `tag',                ///
> > >         barw(0.4) bc(none) blc(green) blw(medium) `box' || ///
> > >         rspike `p10' `p25' `catvar' if `tag',              ///
> > >         blcolor(green) `spike'                          || ///
> > >         rspike `p75' `p90' `catvar' if `tag',              ///
> > >         blcolor(green) `spike'                          || ///
> > >         scatter `out' `catvar' if `touse',                 ///
> > >         yti("`yttl'") xti("`xttl'") legend(off)            ///
> > >         xla(1/`max', valuelabel) `options'
> > > end
> > > --------------------------------------------------
> > >
> > > Nick
> > > [email protected]
> > >
> > > Scott Merryman
> > >
> > > > Nick [Cox] presented a way of doing this a couple weeks ago
> > > > using -statsby- and -twoway-.  See
> > > >
> > > > http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0611/Author/article-307.html
> > >
> > >
> > > Ingo Brooks
> > >
> > > > I would like to produce a box plot like figure. However, instead of
> > > > the adjacent values that are provided by Stata's -graph box- procedure
> > > > I would like to depict the 10% and 90% percentile. Is there a way to
> > > > do this in Stata?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


