Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <n.j.cox@durham.ac.uk>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Scatterplot with weighted markers
Date: Tue, 15 Feb 2011 18:42:02 +0000

Thanks to Allan for digging up my old post. I don't want to add to it, except to underline that Stata lets you play not only with this recipe, but with others. -tabplot- from SSC already lets you show vertical or horizontal bars with specified heights (lengths) and you can put them at specified x, y coordinates.

The following example shows the possibilities more directly:

    sysuse auto, clear
    gen mpg1 = mpg - price/7000
    gen mpg2 = mpg + price/7000
    scatter mpg weight, ms(oh) || rbar mpg1 mpg2 weight, ///
        ytitle("`: var label mpg'") bfcolor(none) barw(100) legend(off)

The key points are all very simple:

1. If you show bars, you can control the scaling directly. You know, and can tell your readers, that it is linear (or whatever else you choose).

2. If you use constant widths, the user only has to interpret variations in height and there is no dimensional ambiguity.

3. Naturally, bars can be transparent to allow overlap.

Of course, this method has one big disadvantage too, that it arbitrarily uses only one dimension. But it includes genuinely proportional symbols.

Nick
n.j.cox@durham.ac.uk

Allan Reese

Nick Cox questioned some time back - http://www.stata.com/statalist/archive/2006-06/msg00291.html - whether this feature is sensible. As he pointed out, the interpretation of symbol "size" depends on the individual viewer. (For fans of Father Ted, "This is *small* but that is *far away*.") I agree that the perception will be individual and impressionistic, but it can be used in just that way. Hence it becomes a design feature and the person designing the graph can select the scaling in just the same way you choose the right (best, most-fitting, perfect) word to give the preferred degree of emphasis to a statement.

The reason for writing is I've been experimenting with weighting symbol size by functions of the sample size for each point. I saw such a plot in a paper and thought it gave a very useful impression of what confidence you might have in the fitted model (line).
This is very easy to experiment with. Using the classic data as an example (but price rather than n)

    . use auto
    . scatter weight length [w=price], ms(oh)
    . scatter weight length [w=sqrt(price)], ms(oh)
    . scatter weight length [w=log10(price)], ms(oh)
    . summ price

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
           price |        74    6165.257    2949.496       3291      15906

Logic says sqrt is "proportional". Economics says log is "utility", and maybe sqrt(log10(price)) is the visual logic. On the other hand, the ratio biggest:smallest is 5:1, and simple weighting gives the clearest visual feel for "more expensive". You can expand the ratio by subtracting an offset that greatly increases the contrast, though the smallest values then disappear as points.

    . scatter weight length [w=(price-3000)], ms(oh)

While you can choose an appropriate ratio between smallest and largest symbols, there is currently no way to scale all the symbols so the largest do not overlap or to make the smallest more visible.
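As a rough check on the ratios discussed in the quoted message, the following minimal sketch prints the largest-to-smallest ratio of the weights under each transformation. It assumes the auto data shipped with Stata (loaded here with -sysuse-); the local macro names are illustrative only. These are ratios of the weights themselves. The thread does not say how Stata maps a weight onto drawn symbol size, so nothing here should be read as the ratio of the symbols on screen.

```stata
* Sketch only: ratio of largest to smallest weight for each
* transformation tried in the quoted message.
sysuse auto, clear
quietly summarize price
local pmin = r(min)    // smallest price (3291 in the output quoted above)
local pmax = r(max)    // largest price (15906 in the output quoted above)

display "raw price:     " %6.2f `pmax'/`pmin'
display "sqrt(price):   " %6.2f sqrt(`pmax')/sqrt(`pmin')
display "log10(price):  " %6.2f log10(`pmax')/log10(`pmin')
display "price - 3000:  " %6.2f (`pmax' - 3000)/(`pmin' - 3000)
```

With the minimum and maximum reported by -summ price- above, these work out to roughly 4.8, 2.2, 1.2 and 44, which is consistent with the "5:1" figure for raw price and with the sharp increase in contrast once an offset of 3000 is subtracted.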

**References**: **st: Scatterplot with weighted markers** *From:* "Allan Reese (Cefas)" <allan.reese@cefas.co.uk>
