st: Scatterplot with weighted markers


From   "Allan Reese (Cefas)" <[email protected]>
To   <[email protected]>
Subject   st: Scatterplot with weighted markers
Date   Tue, 15 Feb 2011 17:11:35 -0000

Some time ago Nick Cox questioned whether this feature is sensible
(http://www.stata.com/statalist/archive/2006-06/msg00291.html).  As he
pointed out, the interpretation of symbol "size" depends on the
individual viewer. (For fans of Father Ted, "This is *small* but that is
*far away*.")

I agree that the perception will be individual and impressionistic, but
it can be used in just that way.  Hence it becomes a design feature and
the person designing the graph can select the scaling in just the same
way you choose the right (best, most-fitting, perfect) word to give the
preferred degree of emphasis to a statement.

The reason for writing is I've been experimenting with weighting symbol
size by functions of the sample size for each point.  I saw such a plot
in a paper and thought it gave a very useful impression of what
confidence you might have in the fitted model (line).
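
As a rough sketch of that kind of plot (not the one from the paper): the
auto data shipped with Stata has no sample-size variable, so price stands
in for n here purely as an assumption for illustration, and the fitted
line is a plain unweighted lfit overlay.
. sysuse auto, clear
. twoway (scatter weight length [w=price], ms(oh)) (lfit weight length)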

This is very easy to experiment with.  Using the classic auto data as an
example (but weighting by price rather than n):
. sysuse auto
. scatter weight length [w=price], ms(oh)
. scatter weight length [w=sqrt(price)], ms(oh)
. scatter weight length [w=log10(price)], ms(oh)
. summ price

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       price |        74    6165.257    2949.496       3291      15906

Logic says sqrt is "proportional".  Economics says log is "utility", and
maybe sqrt(log10(price)) is the visual logic.  On the other hand, the
ratio of largest to smallest price is only about 5:1 (15906:3291), and
simple weighting by price gives the clearest visual feel for "more
expensive".
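
The largest:smallest ratio under each transformation can be checked
directly before plotting.  A minimal sketch, in which wsqrt and wlog are
variable names made up for the example; from the min and max shown above,
the ratios should come out at roughly 2.2:1 for sqrt and 1.2:1 for log10.
. sysuse auto, clear
. generate double wsqrt = sqrt(price)
. generate double wlog = log10(price)
. tabstat price wsqrt wlog, statistics(min max)
. summarize wsqrt
. display r(max)/r(min)
. summarize wlog
. display r(max)/r(min)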

You can expand the ratio by subtracting an offset, which greatly
increases the contrast (with an offset of 3000 the weights run from 291
to 12906, about 44:1), though the smallest values then all but disappear
as points.
. scatter weight length [w=(price-3000)], ms(oh)
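
One way to see what an offset does before plotting is to build the weight
as a variable first.  This is only a sketch, and woff is a name made up
for the example.  The assert line is there on the assumption that an
offset larger than the minimum price would yield negative weights, which
Stata is expected to reject.
. sysuse auto, clear
. generate double woff = price - 3000
. assert woff > 0
. summarize woff
. display r(max)/r(min)
. scatter weight length [w=woff], ms(oh)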

While you can choose an appropriate ratio between smallest and largest
symbols, there is currently no way to scale all the symbols so the
largest do not overlap or to make the smallest more visible.

R Allan Reese
Senior statistician, Cefas
The Nothe, Weymouth DT4 8UB 

Tel: +44 (0)1305 206614 -direct
Fax: +44 (0)1305 206601 

www.cefas.co.uk 



