
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.



Re: st: multiple y-axes, spaghetti plots, box plots (statalist-digest V4 #3900)


From   "Allan Reese (Cefas)" <allan.reese@cefas.co.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   Re: st: multiple y-axes, spaghetti plots, box plots (statalist-digest V4 #3900)
Date   Fri, 3 Sep 2010 12:14:06 +0100

>> Scott Merryman wrote:
>> sysuse auto
>> twoway hist mpg, yscale(alt axis(1)) || line weight mpg, sort
>> yaxis(2) yscale(alt axis(2))

> & Clive Nicholas <clivelists@googlemail.com> commented:
> But do you notice that the line graph it produces looks horrible, even
> with -sort-?

It's better pre-sorted:
gsort mpg weight
twoway hist mpg, yscale(alt axis(1)) || line weight mpg, yaxis(2) ///
    yscale(alt axis(2))

I used -gsort- because it allows sorting the weights in ascending or
descending order within each mpg value.  Ascending gives a distinctive
saw-tooth line.
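
To compare the two sort orders side by side, something like the
following sketch should do.  The graph names wt_asc and wt_desc are my
own labels, not anything from the earlier posts; a minus sign before a
variable in -gsort- requests descending order.

```stata
sysuse auto, clear

* weight ascending within each mpg value
gsort mpg weight
twoway line weight mpg, name(wt_asc, replace)

* weight descending within each mpg value
gsort mpg -weight
twoway line weight mpg, name(wt_desc, replace)
```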

I too have used spaghetti plots but just called them multiple-line
plots, based on the trick of sorting into groups and then using c(L),
i.e. connect(L).  For panel data with a small (<30) number of lines, you
could also use -separate- with c(L) to get round someone joining the
panel after the previous one left.  One can only admire people who get a
publication based on coining a name, but it's the bane of statistics to
have a client demand "zugzug's test" because another paper had used it,
only to find zugzug's formula is just a particular case of a well-known
formula or was an arithmetic short-cut from pre-computer days.
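
A minimal sketch of both tricks, assuming the xtline1 example dataset
shipped with Stata (variables person, day, calories) as a stand-in for
real panel data; the stub cal_p is a made-up name.  connect(L) joins
successive points only while the x variable is increasing, so after
sorting by group and time the line breaks wherever a new group starts.

```stata
sysuse xtline1, clear

* one y variable: sort into groups, then connect only while day ascends
sort person day
twoway line calories day, connect(L)

* alternative: one y variable per person, so no line can run from one
* person into the next even if a later person starts at a later date
separate calories, by(person) generate(cal_p)
twoway line cal_p* day, connect(L ..) legend(off)
```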

I agree there should be an easy way to add sample size information to
boxplots.  Whether this is best done as numbers above each plot or by
varying the box width will be a design decision in each case.  In
general, I incline against variable-width boxes because you are using
the two dimensions for orthogonal purposes while introducing a visual
implication to compare areas.
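
One way to get numbers onto the plot, offered only as a sketch: build a
string label that carries the group size and use it as the over()
variable.  The variable names n_cars and rep_n are invented for the
illustration, and cars with missing rep78 are dropped first.

```stata
sysuse auto, clear
keep if !missing(rep78)

* count the cars in each repair-record group
bysort rep78: generate n_cars = _N

* category label of the form "3 (n=...)"
generate str20 rep_n = string(rep78) + " (n=" + string(n_cars) + ")"

graph box mpg, over(rep_n) ytitle("Mileage (mpg)")
```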

Going back to the first example, the superimposed line is too fussy
because groups of cars all have the same weight.  Two alternative
designs to demonstrate "heavier = gas_hungrier" would be to draw a
smooth line (e.g. a spline) overlaid on the histogram, or to generate a
group variable corresponding to the histogram bins and draw box and
whiskers to show the distributions.  Aligning the histogram and the box
and whisker plot as small multiples may require the Graph Editor!
Dropping the legend in favour of axis titles improves readability, and
each axis(n) can be modified.

twoway hist mpg, yscale(alt axis(1)) ylab(,angle(0)) freq ///
    yti(Number of cars for each mpg) start(10) width(5) ///
    || mspline weight mpg, ylab(,angle(0) axis(2)) yaxis(2) ///
    yscale(alt axis(2)) ///
    yti(Weight (lb) vs mpg - smoothed line, axis(2)) legend(off)
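
The box-and-whisker alternative could be sketched as below, assuming
bins that start at 10 with width 5 to match the histogram.  The variable
name mpg_bin is my own, and each box is labelled by the lower limit of
its bin.

```stata
sysuse auto, clear

* lower limit of the width-5 bin containing each car's mpg
generate mpg_bin = 10 + 5*floor((mpg - 10)/5)

graph box weight, over(mpg_bin) ytitle("Weight (lb)") ///
    b1title("mpg bin (lower limit)")
```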



Allan





