From: Gabi Huiber <ghuiber@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: visualization?
Date: Wed, 5 Oct 2011 14:58:12 -0400

Nick, thank you for this list. It's a useful refresher.

Regarding 2 and 6: I didn't know that transparency was on a wish list, but I'm glad to hear it is. I once saw a nice demonstration of ggplot2 on r-blogger.com: markers of slightly less than 100% transparency acted like disks of glass. One of them is barely visible; the more of them you stack, the darker the pile. This gives a very nice gradient over a scattershot. It's prettier than the currently recommended workaround of using hollow circles.

Gabi

On Fri, Sep 30, 2011 at 10:23 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Do you mean Vince Viggins? Sounds like a Dickens character. We saw
> Vince Wiggins at the London meeting.
>
> We are starting with these suggestions. I'll add numbers for convenience.
>
> 1. If one of the variables is positively skewed, consider plotting
> that axis on a log scale.
>
> 2. If there are a lot of data points (e.g., n > 1000), adopt a
> different strategy such as using some form of partial transparency, or
> sampling the data;
>
> 3. If one of the variables takes on a limited number of discrete
> categories, consider using a jitter or a sunflower plot;
>
> 4. If there are three or more variables, consider using a scatterplot matrix;
>
> 5. Fitting some form of trend line is often useful;
>
> 6. Adjust the size of the plotting character to the sample size (for
> bigger n, use a smaller plotting character);
>
> Random comments
>
> 1. I take this as standard. I'll add a plea for consideration of any
> reasonable non-linear scale, labelled in the original units!
>
> 2 and 6. Transparency is on some wishlists for Stata. With lots of
> data, you go not only for smaller symbols but more open ones and use
> lighter colors.
>
> 3. I've played with sunflower plots and gone off them. But if you want
> to try them, note that they are undocumented [sic] at -help twoway
> sunflower-. For highly discrete or even categorical variables, I like
> my -tabplot- (SSC).
>
> 4. Agree, although that does not rule out some projection from a
> multivariate analysis being helpful too.
>
> 5. Yes, if "trend" means "smooth". Some special smooths were published in
>
> SJ-10-1 gr0021_1 . . . . . . . . . . Software update for doublesm and diagsm
> (help doublesm, diagsm, polarsm if installed) . . . . . . . N. J. Cox
> Q1/10 SJ 10(1):164
> option to carry out smoothing using restricted cubic splines
> added to doublesm and diagsm
>
> SJ-5-4 gr0021 . . . . . . . Speaking Stata: Smoothing in various directions
> (help doublesm, diagsm, polarsm if installed) . . . . . . . N. J. Cox
> Q4/05 SJ 5(4):574--593
> discusses exploratory tools for determining the structure
> of bivariate data
>
> Some possible additions:
>
> 7. About 1980, there was a sudden fashion for adding convex hulls,
> which faded away quickly. I remember often doing it with a pencil on
> lineprinter output. But Allan Reese has a nice implementation on SSC
> as -cvxhull-. On occasion that helps a lot.
>
> 8. When you have a categorical subdivision, try out both several
> categories superimposed and a -by()- option to give separate plots. A
> third strategy is given in
>
> SJ-10-4 gr0046 . . . . . . . . . . . . . . . Speaking Stata: Graphing subsets
> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
> Q4/10 SJ 10(4):670--681 (no commands)
> explores graphical comparison of results for two or more
> subsets where each subset is plotted in a separate panel,
> with the rest of the data as a backdrop
>
> Nick
>
> On Fri, Sep 30, 2011 at 2:56 PM, Stas Kolenikov <skolenik@gmail.com> wrote:
>
>> There was an interesting question on data visualization on
>> Stats.StackExchange (http://stats.stackexchange.com/q/13148/5739):
>> what are the efficient strategies for tweaking scatterplots depending
>> on the data needs? Too much data make it clogged, too little data such
>> as ordinal make it too chunky, too skewed data makes it sit in one
>> corner, and there are a multitude of other things that need to be
>> adjusted to make the display really informative.
>>
>> I would be especially curious to hear from Nick Cox and Michael
>> Mitchell, I guess, as the greatest contributors to Stata graphics (and
>> of course Vince V, but I don't think I've seen him on the list for a
>> while).
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
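The "disks of glass" effect described above (one nearly transparent marker is barely visible, a pile of them looks dark) can be illustrated with a little arithmetic. The sketch below is not taken from the thread: it assumes the renderer uses ordinary "source over" alpha compositing, under which k identical markers of opacity a stacked on one spot give a combined opacity of 1 - (1 - a)^k. The function name `stacked_opacity` and the opacity value 0.1 are hypothetical choices for illustration only; a particular graphics device may blend differently.

```python
def stacked_opacity(alpha: float, k: int) -> float:
    """Combined opacity of k identical markers drawn on the same spot.

    Assumes standard "source over" alpha compositing (an assumption,
    not something stated in the thread): each marker lets a fraction
    (1 - alpha) of what lies beneath show through.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    # Hypothetical marker opacity of 0.1, i.e. 90% transparent.
    for k in (1, 2, 5, 10, 20, 50):
        print(f"{k:>3} overlapping markers -> opacity {stacked_opacity(0.1, k):.3f}")
```

Under that assumption the darkness of a region rises with the number of overlapping points and levels off near full opacity, which is why partial transparency reads as a density gradient where hollow circles show only outlines.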

**Follow-Ups**: **Re: st: visualization?** *From:* Nick Cox <njcoxstata@gmail.com>
