# AW: st: overlaying two histograms (or distribution curves)

From: "Martin Weiss"
To:
Subject: AW: st: overlaying two histograms (or distribution curves)
Date: Sun, 22 Nov 2009 16:40:58 +0100

```<>

"Why is this graph form ... not enormously better known?"

Probable answer: Manual entry too short (half a page, [R], p. 348), and
little publicity elsewhere:
http://www.stata-journal.com/sjpdf.html?articlenum=gr0003

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Sunday, 22 November 2009 16:29
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: overlaying two histograms (or distribution curves)

For comparison of precisely two distributions -- especially when there
is not a prior prejudice or hypothesis that we are expecting
approximations to some named, equation-specified distribution -- I
regard quantile-quantile plots as near optimal. They allow you to focus
on both similarity and differences and to think directly in terms of
what is being measured.

Why is this graph form which is

(a) information-rich
(b) free of arbitrary assumptions (bin or kernel width, etc.)
(c) easy to explain
(d) easy to compute
(e) well documented

not enormously better known?

See -qqplot-.

Nick
n.j.cox@durham.ac.uk

Ariel Linden, DrPH

Thank you both (Maarten and Austin) for all these choices I had not known
about (violin, byhist, kdens).

Austin, I don't have a known distribution per se. I have two groups (treated
and controls), and the outcome variables could follow any distribution. The
motivation for this is to visually describe how the distribution of an
outcome variable overlaps (or doesn't) between two groups.

Date: Fri, 20 Nov 2009 10:47:06 -0500
From: Austin Nichols <austinnichols@gmail.com>

Maarten--
I think that is the same graph I gave for comparison purposes, but I
don't think it compares well with -byhist- unless one takes a bit more
care on the -kdensity- side--the kernel density estimates should at
least use the same bandwidths, and perhaps the same estimation points
if we really wish to compare them.  Other considerations might apply
if Ariel told us something about the theoretical distribution of the
variable filling the role of "price" (is it discrete? does it have a
finite range?).

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
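
To make Nick's point about quantile-quantile plots concrete, here is a minimal sketch in Python of what such a plot pairs up for two groups (say, treated and controls). It is an illustration only, not the algorithm behind Stata's -qqplot-: the function names `quantile` and `qq_pairs`, the plotting positions (i + 0.5)/n, and the linear interpolation rule are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def quantile(sorted_xs: Sequence[float], p: float) -> float:
    """Empirical quantile by linear interpolation between order statistics.

    Assumes sorted_xs is sorted ascending and non-empty. The position rule
    h = p * n - 0.5 is one common convention, chosen here for illustration.
    """
    n = len(sorted_xs)
    h = p * n - 0.5
    if h <= 0:
        return sorted_xs[0]
    if h >= n - 1:
        return sorted_xs[-1]
    lo = int(h)          # index of the order statistic just below position h
    frac = h - lo        # how far h lies towards the next order statistic
    return sorted_xs[lo] + frac * (sorted_xs[lo + 1] - sorted_xs[lo])


def qq_pairs(a: Sequence[float], b: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (quantile of a, quantile of b) pairs at common probabilities.

    The number of pairs equals the size of the smaller sample, so no bin
    width or kernel bandwidth has to be chosen.
    """
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(a), sorted(b)
    n = min(len(xs), len(ys))
    probs = [(i + 0.5) / n for i in range(n)]
    return [(quantile(xs, p), quantile(ys, p)) for p in probs]


if __name__ == "__main__":
    # Hypothetical outcome values for two groups of unequal size.
    treated = [3.0, 1.0, 2.0]
    controls = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
    for x, y in qq_pairs(treated, controls):
        print(f"{x:8.2f} {y:8.2f}")
```

Plotting the second element of each pair against the first gives the quantile-quantile plot. Points on the line y = x indicate identical distributions, a parallel shift indicates a difference in location, and a change in slope indicates a difference in spread, all read off in the units of the outcome itself.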