That said, this kind of histogram is, in my opinion, statistically
questionable. I am fairly sure we wouldn't allow it in the
Stata Journal!
It compounds the arbitrary binning and origin
that are the worst features of histograms by splitting
each distribution into _separate_ bins. The idea that
the reader can perceive both the component distributions
as Gestalts and compare fine structure seems far-fetched to me.
To compare two distributions, consider quantile-quantile plots
(-qqplot-), superimposed density estimates (-kdensity-, -twoway
kdensity-), superimposed quantile functions (-qplot- from SJ),
superimposed distribution functions (-distplot- from SJ),
etc., etc.
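
As a rough sketch of the first two suggestions, assuming simulated data
and the illustrative names x and sex (they echo the question but are not
taken from any real dataset), something like this should work:

```stata
* Sketch only: simulated data; x and sex are illustrative names
clear
set obs 100
set seed 12345
gen sex = _n > 50
gen x = invnorm(uniform()) + sex

* Superimposed kernel density estimates, one per group
twoway (kdensity x if sex == 0) (kdensity x if sex == 1), ///
    legend(order(1 "sex == 0" 2 "sex == 1")) ytitle("Density")

* Quantile-quantile plot: -separate- creates x0 and x1 from x
separate x, by(sex)
qqplot x0 x1
```

Both graphs put the two groups on common axes, so no choice of bin
width or origin is needed.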
Nick
Scott Merryman
> Try this:
>
> Use -twoway__histogram_gen- to generate the height and bin locations
> for each variable, -merge- them together so the bin locations line up,
> and call -graph bar- to produce the graph.
>
> clear
> set obs 50
> set seed 12345
> gen x1 = invnorm(uniform())
> gen x2 = invnorm(uniform())*2
> gen x3 = invnorm(uniform()) + 3
>
> preserve
> local min = .
> foreach var of varlist x* {
> sum `var'
> if floor(`=r(min)') < `min' {
> local min = floor(r(min))
> }
> }
> tempfile foo
> save `foo'.dta
> forv i = 1/3 {
> use `foo'.dta
> twoway__histogram_gen x`i', display gen(h`i' p`i') start(`min') ///
>     width(1)
> tempfile foo`i'
> sort p`i'
> drop if p`i' ==.
> rename p`i' p
> keep p h`i'
> save `foo`i''.dta
> }
>
> use `foo1'.dta
> forv i = 2/3 {
> merge p using `foo`i''.dta
> drop _m
> sort p
> }
> l
> graph bar h*, over(p) bar(1, lcolor(black) fcolor(gs15)) ///
>     bar(2, lcolor(black) fcolor(gs11)) bar(3, lcolor(black) fcolor(gs6)) ///
>     legend(off)
> restore
Justin Gengler
> > I would like to produce a SINGLE histogram (unlike the two
> > split-screen histograms produced using the 'by(...)' argument) that
> > combines two histograms of some variable 'x' (i.e., x given some
> > value of some other variable 'z') -- for example, a single histogram
> > combining two individual histograms for some variable when sex == 0
> > and when sex == 1. Thus what I am looking for is akin to a bar plot
> > with the over(...) argument specified, but of course with the output
> > being a histogram rather than a bar plot.
> >
> > For an illustration of what I am looking for, see this graph:
> > http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=82.