Re: st: RE: -scatter and alpha-blending
Nick Cox <firstname.lastname@example.org>
Fri, 28 Sep 2012 12:02:47 +0100
The demonstration works better if

    gen y = cond(runiform() < 0.01, x, runiform())

is changed to something more like

    gen y = cond(runiform() < 0.05, x, runiform())
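
For anyone who wants to run the revised demonstration in one go, here is a minimal sketch. It only strings together the commands quoted below with the 0.05 threshold. The -clear-, the seed, the grey shade gs12 and the final backdrop graph are my own assumptions for illustration, not part of the original demonstration. The commented-out line further assumes Stata 15 or later, where marker colours accept an opacity suffix; that option is not available in earlier versions.

```stata
* Sketch only: the seed and colour choices are arbitrary assumptions
clear
set seed 2012
set obs 100000
gen x = runiform()
* about 5% of the points lie exactly on y = x; the rest are uniform noise
gen y = cond(runiform() < 0.05, x, runiform())

* tiny point markers instead of the default symbol
scatter y x, ms(p)

* a random 20% subsample, which can make the diagonal easier to pick out
scatter y x if runiform() < 0.2, ms(p)

* the same points in light grey, as a backdrop for superimposed smooths
scatter y x, ms(p) mcolor(gs12)

* Stata 15 or later only (assumption): semi-transparent markers
* scatter y x, mcolor(navy%10)
```

With most of the data being noise, a smooth of the mean of y given x will be close to flat here, so this particular needle is something to be seen in the scatter rather than in a fitted curve.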
On Fri, Sep 28, 2012 at 11:47 AM, Nick Cox <email@example.com> wrote:
> Transparency would be great, but usually can only ease a problem, not
> solve it. Once you have thousands of data points, there will be much
> overlap or overplotting no matter what you do. Often it's best to have
> the data display as a backdrop -- light grey|gray colours work well
> here -- and concentrate on superimposing some smooth(s) as a way of
> seeing structure.
> In addition to David's suggestions, a capricious small set of personal
> recommendations would include
> 1. -twoway fpfit-. If one were to judge by mentions on this list,
> this is a rarely used command, but it's very flexible.
> 2. -rcspline- (SSC). This is a convenience command built on top of the
> excellent -mkspline, cubic-.
> 3. The stuff discussed in this article (but if interested get the code
> from gr0021_1 in SJ 10(1)):
> SJ-5-4 gr0021 . . . . . . . Speaking Stata: Smoothing in various directions
> (help doublesm, diagsm, polarsm if installed) . . . . . . . N. J. Cox
> Q4/05 SJ 5(4):574--593
> discusses exploratory tools for determining the structure
> of bivariate data
> Stata also has -sunflower-. I am one of many people who have played
> with this idea, but somehow I never want to show the results to
> anybody else.
> More generally, everyone would be happier with 100 data points rather
> than 10, and so forth, but larger is not necessarily easier to deal
> with. The following kind of "needle in a haystack" demonstration can
> be of use in teaching. It's not at all original.
> The data are just noise, except that 1% of them follow the y = x
> diagonal exactly.
>
>     set obs 100000
>     gen x = runiform()
>     gen y = cond(runiform() < 0.01, x, runiform())
>
> The first and very easy lesson is that the default marker symbol is useless.
>
>     scatter y x
>
> So an easy thing is to change the symbol.
>
>     scatter y x, ms(p)
>
> Sometimes the structure is easier to see with _fewer_ observations, so
> you can try things like
>
>     scatter y x if runiform() < 0.2, ms(p)
>
> Naturally, once you know what you are looking for, it is easier to find it.
> Depending on the audience and your inclinations you can raise the tone
> by making the needle into a smiley face, an encouraging message or a
> gratuitous insult aimed at a visiting sports team; or you can lower it
> by mixing scientifically interesting patterns and noise and
> encouraging discussion about how we find structure generally.
> On Thu, Sep 27, 2012 at 11:41 PM, Francesco <firstname.lastname@example.org> wrote:
>> thank you very much!
>> On 28 September 2012 00:35, David Radwin <email@example.com> wrote:
>>> Unfortunately, Stata doesn't have transparent fills for scatterplot
>>> markers, but you might be able to fashion a workaround similar to p. 605
>>> Stata tip 27: Classifying data points on scatter plots
>>> N. J. Cox. 2005.
>>> Stata Journal Volume 5 Number 4.
>>> The key would be to identify the observations that are completely
>>> overlapping and make the markers darker in proportion to their relative
>>> frequency.
>>> If there are no or few completely overlapping points, an approach that
>>> might work is using hollow circles as markers (msymbol(Oh) or
>>> msymbol(oh)), because circles minimize overplotting.
>>> David Radwin
>>> Senior Research Associate
>>> MPR Associates, Inc.
>>> 2150 Shattuck Ave., Suite 800
>>> Berkeley, CA 94704
>>> Phone: 510-849-4942
>>> Fax: 510-849-0794
>>>> -----Original Message-----
>>>> From: firstname.lastname@example.org [mailto:owner-
>>>> email@example.com] On Behalf Of Francesco
>>>> Sent: Thursday, September 27, 2012 3:13 PM
>>>> To: firstname.lastname@example.org
>>>> Subject: st: -scatter and alpha-blending
>>>> Dear Statalist,
>>>> I would like to know whether it is possible to obtain in Stata a
>>>> scatter plot with so-called "alpha blending": the markers are slightly
>>>> transparent, so that darker regions in the graph have a higher point
>>>> density...
>>>> Very much like this (In R) :