Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: RE: RE: Putting a rug underneath a boxplot


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: RE: RE: Putting a rug underneath a boxplot
Date   Thu, 24 Mar 2011 09:20:08 +0000

I'd still make the same suggestion. If you are adding several
different rugs below a box plot, there should be space on the axis to
add informative labels.

In my view legends are at best a necessary evil. My slogan is: Lose
the legend if you can! Or, kill the key!

Legends take up valuable real estate (unless inserted in an empty
corner of the plot region) and they usually oblige the reader to go
back and forth to check what is what (except that this is often not done).
There are legends that can be absorbed at once: for example a legend
explaining pink for females and blue for males would be quickly
understood in several countries. But then such a legend could often be
omitted, by putting informative text on the graph -- or if that didn't
appeal or there was no suitable space, by just explaining it in the
text caption added within a word or text processing program. (Another
small trick is colour coding variable labels.)
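
As a minimal sketch of putting informative text on the graph instead of a legend, the following uses official -twoway- on the auto data. The coordinates in -text()- are assumptions picked by eye for an empty corner of this particular plot and would need adjusting for other data.

```stata
sysuse auto, clear

* Two groups distinguished by colour, with no legend
* The text() coordinates (y then x) are illustrative guesses for an empty region
twoway (scatter mpg weight if foreign == 0, mcolor(blue))   ///
       (scatter mpg weight if foreign == 1, mcolor(red)),   ///
       legend(off)                                          ///
       text(35 4000 "Domestic", color(blue) placement(e))   ///
       text(33 4000 "Foreign", color(red) placement(e))
```

The colour of each text label matches its markers, so the reader does not need to look away from the data to decode the groups.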

Anyway, as emphasised, one of the principles underlying -stripplot-'s
design is using the axis labels as text explaining variables and/or
groups. I am happy if -stripplot- can be used as a starting point for
users' own designs, but I wanted to stress this point.

(Some readers may be familiar with software that by default adds a
legend "Series 1" for a plot of a single series (in its terminology),
regardless of the column that series comes from or any explanatory
text in its early rows.)

Nick

On Wed, Mar 23, 2011 at 5:41 PM, Oliver Jones
<[email protected]> wrote:

> I totally agree that in the case when there is just one rug, all one needs
> is the axistitle.
> But if the task is to plot one boxplot (using all data) and then put
> separate rugs for different groups underneath the boxplot, then it is
> necessary to create a legend explaining which rug represents which group.
>
> Because I needed to create 12 different such "box and multi-rug plots",
> the magic "5" had to be set manually for each plot (the scales of the 12
> variables were quite different).
> To avoid setting it manually (i.e. defining the width of the bars/pipes)
> I did something like this:
>
> summarize price, meanonly
> gen price1 = price - ( (r(max) - r(min)) / 2000 )
> gen price2 = price + ( (r(max) - r(min)) / 2000 )
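
The scaling idea quoted above can be combined with the thin -rbar- example further down this thread and looped over several variables. What follows is an untested sketch: the variable list (price mpg weight), the new variable names (*_lo, *_hi) and the graph names (rug_*) are assumptions for illustration, and the divisor 2000 is simply the one quoted above. -stripplot- must be installed from SSC first.

```stata
* stripplot is user-written: ssc install stripplot
sysuse auto, clear

foreach v of varlist price mpg weight {
    summarize `v', meanonly
    * Half-width of each pseudo-pipe: 1/2000 of the variable's range.
    * Saving it in a local protects it if r() is later overwritten.
    local h = (r(max) - r(min)) / 2000
    generate double `v'_lo = `v' - `h'
    generate double `v'_hi = `v' + `h'
    * One box-and-rug plot per variable, kept in memory under its own name
    stripplot `v', over(rep78) box ms(none) name(rug_`v', replace) ///
        addplot(rbar `v'_lo `v'_hi rep78, horizontal barw(0.2) bcolor(gs6))
}
```

Because the half-width is tied to each variable's range, the pipes should look about equally thin on every plot without hand-tuning a constant such as the "5" used for -price-.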

On 23.03.2011 17:54, Nick Cox wrote:

>> Thanks for this. Good point about -symxsize()-. I seem to recall some
>> reason why this didn't seem to be the solution in an earlier thread,
>> but I think you are right: if you want these pseudo-pipes in a legend,
>> then it helps. A bigger point is why do you want a legend at all, as
>> much of the point of a strip plot is that the axis labels carry that
>> information.
>>
>> For those interested in thin rbars as mimicking pipe symbols, here is
>> a simpler example. I will add something like this to the help for
>> -stripplot-. The "5" is empirical as producing a thin bar on the scale
>> of the variable -price-.
>>
>> sysuse auto, clear
>>
>> gen price1 = price - 5
>> gen price2 = price + 5
>>
>> stripplot price, over(rep78) box ms(none) ///
>> addplot(rbar price1 price2 rep78, horizontal barw(0.2) bcolor(gs6))
>>
>> There are small artefacts when the edge of a box coincides with a pipe
>> symbol.
>>
>> Your other troubles seem to arise from using -grc1leg- and/or -graph
>> combine-, but in any case I don't understand what they are.
>>
>> Nick
>>
>> On Wed, Mar 23, 2011 at 3:39 PM, Oliver Jones
>> <[email protected]>  wrote:
>>>
>>> Hi Nick,
>>> thanks a lot for your help. Everything works fine now.
>>>
>>> I just ran into some more trouble while combining 12 of these stripplots
>>> using -grc1leg- from http://www.stata.com/users/vwiggins. There is no
>>> error message when one specifies the xsize() option to control overall
>>> aspect ratio. A very easy work-around is to use -graph display- after
>>> generating the combined graph; see
>>> http://www.stata.com/statalist/archive/2008-12/msg00086.html.
>>>
>>> Also, combining graphs in Stata is quite tricky because the iscale()
>>> option seems to trade off between text size and marker size. Is that
>>> correct?
>>>
>>> By the way, to change the length of a legend entry one can avoid the
>>> Graph Editor and use the symxsize() option within the legend() option.
>>
>>
>> On 23.03.2011 12:16, Nick Cox wrote:
>>
>>>> First, please remember to specify where user-commands you refer to
>>>> come from. In this case -stripplot- is from SSC. (Otherwise anyone who
>>>> does not know that could try repeating your example and would get
>>>> puzzling error messages.)
>>>>
>>>> This question in essence repeats one raised by Jannik Helweg-Larsen in
>>>> the thread starting with
>>>>
>>>> http://www.stata.com/statalist/archive/2011-03/msg00240.html
>>>>
>>>> The problem is that with -ms(none)- the corresponding legend entry is
>>>> blank.
>>>>
>>>> The work-around in Jannik's thread was to use a thin -rbar-. The
>>>> legend entry is much longer than you want, but you can fix it in the
>>>> Graph Editor.
>>>>
>>>> Here is example code for your example problem:
>>>>
>>>> sysuse auto, clear
>>>>
>>>> * Create ancillary variables
>>>>        gen y = 1
>>>>        gen y_foreign = y
>>>>        label variable y_foreign "Foreign"
>>>>        gen y_not = y - 0.05
>>>>        label variable y_not "Not Foreign"
>>>>        gen y_miss = y - 0.1
>>>>        label variable y_miss "NA"
>>>>
>>>>        gen byte index_region = 1 if foreign == 1
>>>>        replace index_region = 2 if index_region == .
>>>>        replace index_region = 3 if foreign == .
>>>>        label define lbl_index_region 1 "Foreign" 2 "Not Foreign" 3 "Missing"
>>>>        label values index_region lbl_index_region
>>>>
>>>>        gen price2 = price + 10
>>>>
>>>> *
>>>>
>>>> local boxoffset = 0.1
>>>> stripplot price, over(y) ///
>>>>        title("My Box- and Rug-Plot") ///
>>>>        legend(order(6 7 8) cols(1) on) ///
>>>>        box(barwidth(0.1)) iqr boffset(`boxoffset') ///
>>>>        ms(none) ///
>>>>        addplot(rbar price price2 y_foreign if index_region == 1, ///
>>>>                        horizontal barw(0.02) bcolor(blue) || ///
>>>>                rbar price price2 y_not if index_region == 2, ///
>>>>                        horizontal barw(0.02) bcolor(red) || ///
>>>>                scatter y_miss price if index_region == 3, ///
>>>>                        ms(X) mcolor(black) ///
>>>>                ) ///
>>>>        yscale(off) ylab(1(0.1)1.15, nogrid)
>>>>
>>>> On Wed, Mar 23, 2011 at 10:56 AM, Oliver Jones
>>>> <[email protected]>    wrote:
>>>>
>>>>> The topic is some months old, but today I had a look at the box and
>>>>> rug plots I created using the stripplot command.
>>>>> To create the rug, I followed the advice and generated a string
>>>>> variable containing the symbol "|".
>>>>>
>>>>> Here is an example of what I do:
>>>>>
>>>>> ************************* begin example *************************
>>>>> graph drop _all
>>>>> sysuse auto, clear
>>>>>
>>>>> * Create ancillary variables
>>>>>        gen y = 1
>>>>>        gen y_foreign = y
>>>>>        label variable y_foreign "Foreign"
>>>>>        gen y_not = y - 0.05
>>>>>        label variable y_not "Not Foreign"
>>>>>        gen y_miss = y - 0.1
>>>>>        label variable y_miss "NA"
>>>>>
>>>>>        gen byte index_region = 1 if foreign == 1
>>>>>        replace index_region = 2 if index_region == .
>>>>>        replace index_region = 3 if foreign == .
>>>>>        label define lbl_index_region 1 "Foreign" 2 "Not Foreign" 3 "Missing"
>>>>>        label values index_region lbl_index_region
>>>>>
>>>>>        gen pipe = "|"
>>>>> *
>>>>>
>>>>> local boxoffset = 0.1
>>>>> stripplot price, over(y) ///
>>>>>        title("My Box- and Rug-Plot") ///
>>>>>        legend(order(6 7 8) cols(1) on) ///
>>>>>        box(barwidth(0.1)) iqr boffset(`boxoffset') ///
>>>>>        ms(none) ///
>>>>>        addplot(scatter y_foreign price if index_region == 1, ///
>>>>>                        ms(none) mla(pipe) mlabcolor(blue) mlabpos(0) || ///
>>>>>                scatter y_not price if index_region == 2, ///
>>>>>                        ms(none) mla(pipe) mlabcolor(red) mlabpos(0) || ///
>>>>>                scatter y_miss price if index_region == 3, ///
>>>>>                        ms(X) mcolor(black) ///
>>>>>                ) ///
>>>>>        yscale(off) ylab(1(0.1)1.15, nogrid)
>>>>>
>>>>> ************************* end example *************************
>>>>>
>>>>>
>>>>> The problem is the legend: I need the red and blue pipe symbols as
>>>>> legend key symbols.
>>>>> Is there a way I can tell Stata to use such symbols?
