
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: statalist-digest V4 #4961: Re: st: Value labels won't show on box plot axis


From   "Allan Reese (Cefas)" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: statalist-digest V4 #4961: Re: st: Value labels won't show on box plot axis
Date   Fri, 2 Aug 2013 11:14:43 +0000

On 1 August 2013 10:46, Walsh, Lee <[email protected]> wrote:
>> I am box plotting a variable grouped by another.  
>> I am telling Stata to use the value labels for the y axis 
>> but it insists on showing the numerical values.  
>>
>> graph box response, over(statementNum) ylabel(1(1)7, valuelabel) 

On 1 Aug 2013 11:00:00 +0100, Nick Cox <[email protected]> agreed
> This didn't work either in my experiments. Here's a replicable example
> and a work-around:

> . sysuse auto
> . graph box foreign, over(rep78) yla(0 1, valuelabel)
 
[and built a list of re-labels in a macro named -call-] 
...
> forval i = 1/7 {
>       local call `call' `i' "`: label (response) `i''"
> }
[which was then used in the -graph- command]
 . graph box response, over(statementNum) yla(`call')
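
[The steps summarised above can be sketched end to end on the auto data from Nick's replicable example. This is a minimal sketch rather than Nick's exact code: the variable -foreign- and the range 0/1 are assumptions chosen so that it runs on a dataset shipped with Stata. Run it from a do-file, since // comments are not accepted interactively.]

```stata
* Sketch: build the ylabel() argument from the value labels of -foreign-
sysuse auto, clear

* start from an empty macro so repeated runs do not accumulate entries
local call

forvalues i = 0/1 {
    * append the numeric value, then its value label in double quotes
    local call `call' `i' "`: label (foreign) `i''"
}

* inspect the list that was built before handing it to -graph-
display `"`call'"'

graph box foreign, over(rep78) ylabel(`call')
```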
-------------------------------------------------------

That's an elegant and general piece of programming, but it ignores the semantic question of why you would use a boxplot on a non-metric variable.  I take it that Lee's responses are on a 7-point opinion scale. There are differing views on whether it is valid to compare means of such scales, given that the numeric codes are arbitrary beyond their ordering.  Extracting the five-number summary and drawing a boxplot seems very doubtful to me. Nick's cars example plot shows the effect of applying boxplots to a discrete variable with few distinct values and many ties.

For a small number of re-labels, it may be quicker and less baffling to write the labels out explicitly.  I've recently had numerous graphs where a particular value represents the "limit of detection".  That is certainly worth drawing attention to in the graph itself rather than just in the caption.  I also prefer natural labels when plotting a logged variable such as bacterial counts, which usually follow a lognormal distribution.  In the command below the axis positions are log10 values, so 1.3 is approximately log10(20), 2 is log10(100), and so on.

graph box logbact, over(groupvar) ylab(1.3 "20=LoD" 2 "100" 3 "1000" ...)  
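
To try this without real data, here is a minimal sketch on simulated counts. The variable names -logbact- and -groupvar- follow the command above, and the limit of detection of 20 is taken from its label. Everything else (the seed, the number of observations, the three groups, the lognormal parameters and the tick at 10000) is an arbitrary assumption for illustration only.

```stata
* Sketch on simulated data: explicit natural-scale labels on a log10 axis
clear
set seed 12345
set obs 200

* three hypothetical groups
generate groupvar = 1 + mod(_n, 3)

* simulated lognormal counts (parameters are arbitrary)
generate bact = exp(rnormal(5, 1.2))

* record anything below the assumed limit of detection (20) at the limit
replace bact = 20 if bact < 20

generate logbact = log10(bact)

* positions are log10 values; the labels show the original counts
graph box logbact, over(groupvar) ///
    ylabel(1.301 "20=LoD" 2 "100" 3 "1000" 4 "10000")
```

Writing the positions and labels by hand keeps the command readable when there are only a handful of ticks; the macro-building loop pays off once the list gets long or has to follow the data.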

Allan
    


