

From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: -scatter and alpha-blending
Date: Fri, 28 Sep 2012 11:47:54 +0100

Transparency would be great, but usually can only ease a problem, not solve it. Once you have thousands of data points, there will be much overlap or overplotting no matter what you do.

Often it's best to have the data display as a backdrop -- light grey|gray colours work well here -- and concentrate on superimposing some smooth(s) as a way of seeing structure.

In addition to David's suggestions, a capricious small set of personal recommendations would include

1. -twoway fpfit- If one were to judge by mentions on this list, this is a rarely used command, but it's very flexible.

2. -rcspline- (SSC). This is a convenience command built on top of the excellent -mkspline, cubic-.

3. The stuff discussed in this article (but if interested get the code from gr0021_1 in SJ 10(1))

SJ-5-4  gr0021  . . . . . . .  Speaking Stata: Smoothing in various directions
        (help doublesm, diagsm, polarsm if installed) . . . . . . .  N. J. Cox
        Q4/05   SJ 5(4):574--593
        discusses exploratory tools for determining the structure
        of bivariate data

Stata also has -sunflower-. I am one of many people who have played with this idea, but somehow I never want to show the results to anybody else.

More generally, everyone would be happier with 100 data points rather than 10, and so forth, but larger is not necessarily easier to deal with.

The following kind of "needle in a haystack" demonstration can be of use in teaching. It's not at all original. The data are just noise, except that 1% of them follow the y = x diagonal exactly.

    set obs 100000
    gen x = runiform()
    gen y = cond(runiform() < 0.01, x, runiform())

The first and very easy lesson is that the default marker symbol is useless.

    scatter y x
    more

So an easy thing is to change the symbol.

    scatter y x, ms(p)
    more

Sometimes the structure is easier to see with _fewer_ observations, so you can try things like

    scatter y x if runiform() < 0.2, ms(p)

Naturally, once you know what you are looking for, it is easier to find it.
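
To make that last point concrete, here is a minimal sketch (not part of the original demonstration) that plots the noise as a light grey backdrop and overlays the needle in black. It rebuilds the same simulated data so that it runs on its own; the seed, the grey shade gs12 and the legend labels are arbitrary assumptions, and the test y == x only works because the needle points were constructed to lie on the diagonal exactly.

```stata
* Sketch only: grey backdrop for the noise, black for the known needle.
* Assumptions: the seed and the colours are arbitrary choices.
clear
set seed 2012
set obs 100000
gen x = runiform()
gen y = cond(runiform() < 0.01, x, runiform())

* y == x picks out the roughly 1% of points placed on the diagonal
twoway (scatter y x if y != x, ms(p) mcolor(gs12)) ///
       (scatter y x if y == x, ms(p) mcolor(black)), ///
       legend(order(1 "noise" 2 "needle"))
```

With real data the structure is not known in advance, so the second layer would more plausibly be a smooth such as -twoway fpfit- than an exact subset.
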
Depending on the audience and your inclinations you can raise the tone by making the needle into a smiley face, an encouraging message or a gratuitous insult aimed at a visiting sports team; or you can lower it by mixing scientifically interesting patterns and noise and encouraging discussion about how we find structure generally.

Nick

On Thu, Sep 27, 2012 at 11:41 PM, Francesco <cariboupad@gmx.fr> wrote:
> thank you very much!
>
> On 28 September 2012 00:35, David Radwin <dradwin@mprinc.com> wrote:
>> Unfortunately, Stata doesn't have transparent fills for scatterplot
>> markers, but you might be able to fashion a workaround similar to p. 605
>> of:
>>
>> Stata tip 27: Classifying data points on scatter plots
>> N. J. Cox. 2005.
>> Stata Journal Volume 5 Number 4.
>> http://www.stata-journal.com/article.html?article=gr0023
>>
>> The key would be to identify the observations that are completely
>> overlapping and make the markers darker in proportion to their relative
>> frequency.
>>
>> If there are no or few completely overlapping points, an approach that
>> might work is using hollow circles as markers (msymbol(Oh) or
>> msymbol(oh)), because circles minimize overplotting.
>>
>> David
>> --
>> David Radwin
>> Senior Research Associate
>> MPR Associates, Inc.
>> 2150 Shattuck Ave., Suite 800
>> Berkeley, CA 94704
>> Phone: 510-849-4942
>> Fax: 510-849-0794
>>
>> www.mprinc.com
>>
>>
>>> -----Original Message-----
>>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Francesco
>>> Sent: Thursday, September 27, 2012 3:13 PM
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: st: -scatter and alpha-blending
>>>
>>> Dear Statalist,
>>>
>>> I would like to know whether it is possible to obtain in Stata a scatterplot
>>> with so called "alpha blending" :the markers are slightly transparent so
>>> that darker regions in the graph have a higher point density...
>>> Very much like this (In R) :
>>> http://stackoverflow.com/questions/7714677/r-scatterplot-with-too-many-points
>>>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
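
David's suggestion quoted above, making markers darker in proportion to how many observations share a location, might be sketched roughly as follows. This is not the code from the Stata tip he cites. The simulated integer grid, the seed, the variable names freq, first and fclass, and the choice of three grey classes are all assumptions made for illustration.

```stata
* Sketch only: darker markers where more observations coincide.
* Assumptions: simulated data on a 10 x 10 integer grid, three grey classes.
clear
set seed 2012
set obs 5000
gen x = floor(10 * runiform())
gen y = floor(10 * runiform())

* count the observations at each distinct (x, y) location
bysort x y: gen freq = _N
* tag one observation per location so each marker is drawn once
by x y: gen byte first = (_n == 1)

* bin the frequencies into three classes coded 0, 1, 2
egen fclass = cut(freq), group(3)

twoway (scatter y x if first & fclass == 0, mcolor(gs12)) ///
       (scatter y x if first & fclass == 1, mcolor(gs8))  ///
       (scatter y x if first & fclass == 2, mcolor(gs2)), ///
       legend(order(1 "fewer" 2 "middling" 3 "more") rows(1))
```

This only helps when points overlap completely, as David notes. With continuous data one would first have to round or bin x and y, which is a further choice not discussed in the thread.
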

**Follow-Ups**:

- **Re: st: RE: -scatter and alpha-blending** *From:* Nick Cox <njcoxstata@gmail.com>

**References**:

- **st: -scatter and alpha-blending** *From:* Francesco <cariboupad@gmx.fr>
- **st: RE: -scatter and alpha-blending** *From:* "David Radwin" <dradwin@mprinc.com>
- **Re: st: RE: -scatter and alpha-blending** *From:* Francesco <cariboupad@gmx.fr>
