A simple example of the procedure comes from the same data.
sysuse auto, clear
separate mpg, by(foreign) veryshortlabel
qqplot mpg1 mpg0
qqplot mpg1 mpg0, ysc(log) xsc(log)
shows that the distributions are related by a multiplicative
shift (rather than an additive one). That is, the line
of paired quantiles diverges from the line of equality
on raw scales, but is approximately parallel to it on log scales:
a constant ratio between quantiles becomes a constant
difference between their logarithms.
In this case, the standard t-test shows an overwhelming
result, but it is still asking the wrong question!
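
A minimal sketch of how the multiplicative question could be put to the same data, assuming that logarithms are the right scale (which the plot suggests but does not prove). The variable name logmpg is illustrative only. On the log scale a difference in means corresponds to a ratio of geometric means, which the last line back-transforms.

```stata
sysuse auto, clear

* additive question: do the means of mpg differ?
ttest mpg, by(foreign)

* multiplicative question: do the means of log mpg differ?
* (logmpg is an illustrative name, not part of the dataset)
generate logmpg = ln(mpg)
ttest logmpg, by(foreign)

* back-transform the difference in mean logs:
* ratio of geometric means, foreign (group 2) relative to domestic (group 1)
display exp(r(mu_2) - r(mu_1))
```
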
Alejandro doesn't say what test is behind the statement
that for his example means differ at a significance level
of 0.01% (meaning, P < 0.0001), but I guess that he is
using similar if not identical machinery.
The difference between an additive shift and a multiplicative
shift is exactly the kind of important structure that is
often evident on a quantile-quantile plot, but somewhat lost
in the bars and baloney of an overlapping histogram. Karl
Pearson is long dead, and rest in peace; let's bury
unhelpful histograms too.
Naturally, there is no guarantee that exactly the
same trick will work with Alejandro's data, on
percent of households, but percents tend not to be
distributed symmetrically, so it wouldn't surprise me.
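
For data like Alejandro's, the same steps might look like the sketch below. Everything specific in it is an assumption: pcthh is a placeholder for the unnamed continuous variable, and round is assumed to take exactly the values 2 and 3 once round 1 is excluded. Log scales also need strictly positive values, so any zeros would require separate thought.

```stata
* hypothetical names: pcthh stands in for the continuous variable;
* round is assumed to take only the values 2 and 3 after excluding 1
separate pcthh if round != 1, by(round) veryshortlabel

* quantile-quantile plot on the raw scale, then on log scales
qqplot pcthh2 pcthh3
qqplot pcthh2 pcthh3, ysc(log) xsc(log)
```
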
Nick
n.j.cox@durham.ac.uk
Nick Cox
> You say "overlap", but any overlapping is a property of
> the data rather than of graphical procedure. You
> can superimpose histograms, for example like this:
>
> sysuse auto, clear
>
> twoway histogram mpg if foreign, ///
> start(10) width(2) bcolor(none) blcolor(red) || ///
> histogram mpg if !foreign , ///
> start(10) width(2) bcolor(none) blcolor(blue) ///
> legend(order(1 "foreign" 2 "domestic") col(1) pos(1) ring(0))
>
> but I find that in general the result is a mess:
>
> 1. If distributions overlap, then necessarily one histogram
> will partly occlude another. This can be reduced by
> for example setting bar colours to invisible, but it cannot
> be eliminated. Perceiving the Gestalt is difficult even
> for foresters accustomed to seeing the trees for the wood
> and the wood for the trees.
>
> 2. There is always the minor -- and sometimes the major --
> worry of arbitrariness of bin width and origin.
>
> The histogram is 19th century technology: you can do
> much better with 1960s technology, namely the quantile-quantile
> plot implemented as -qqplot-.
Alejandro Delafuente
> > I would like to overlap two histograms. Can anyone tell me how
> > to do so? The code that I have produced so far is the
> > following, but it displays two separate histograms with the
> > same scale magnitudes:
> >
> > histogram CONTINUOUS VARIABLE if round!=1, percent lcolor(red) ///
> >     ytitle(Percent of households) xtitle(???) xlabel(0(.3)1, ticks) ///
> >     title(, justification(center)) ///
> >     note(, justification(center) alignment(top)) legend(off) ///
> >     by(round, note(Difference in means test significant at 0.01%, size(vsmall) justification(left)))
>