st: RE: Adding normal density to overlayed histograms


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   st: RE: Adding normal density to overlayed histograms
Date   Thu, 21 Oct 2010 13:09:22 +0100

Michael Mitchell and Ulrich Kohler explained what is going on in Stata terms and gave excellent and essentially identical solutions to the problem posed. Here I broaden the discussion. 

A histogram has some advantages and some disadvantages. This list is a personal take and naturally not intended to be definitive or complete: 

+1. It is likely to seem familiar to analyst and audience. 

+2. People can focus on modes, left and right tails. 

-1. One histogram can easily occlude part of the other, unless you do a lot of work. 

-2. More generally, the result can easily look a bit of a mess. 

-3. Histograms depend on choices about bin width and bin starts, even if those choices are automated; such choices can be hard to optimise. 

-4. Linked to that, you can lose detail that might be important. 

-5. If the normal is a reference, the comparison is of a curve with a set of bars, which is not the easiest comparison to get right. (Sometimes the graph is a propaganda graph presented in the spirit of "Look, it's roughly normal", when a more critical look would show important features, such as heavier tails or a mild outlier.)

Now, in terms of alternatives: 

I mention first -histogram, by() normal-, which puts each group in its own panel and so eases some of the problems.
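
As a minimal sketch of that first alternative, the following uses Stata's bundled auto data as a stand-in for the poster's data (an assumption; the variables -mpg- and -foreign- are not from the original question). Each group gets its own panel with a normal density superimposed, so no histogram hides another.

```stata
* Sketch only: the auto data stand in for the poster's data
sysuse auto, clear

* one panel per group, stacked in a single column so the x axes line up
histogram mpg, by(foreign, cols(1)) normal
```

Stacking the panels in one column is just one layout choice; it makes shifts in location between groups easier to see than side-by-side panels would.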

A very different approach is to use quantile-quantile plots. Stata's own -qnorm- is very limited (one variable, one group), but it is easy enough 

(a) to do it yourself or 
(b) to exploit user-written programs. 

On (a), see 

SJ-7-2  gr0027  . .  Stata tip 47: Quantile-quantile plots without programming
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/07   SJ 7(2):275--279                                 (no commands)
        tip on producing various quantile-quantile (Q-Q) plots

The .pdf of that short paper is accessible to all via 

http://www.stata-journal.com/sjpdf.html?articlenum=gr0027

so I'll not repeat the exposition, other than to underline that the first worked example is precisely that raised in this posting, two groups and whether they are normally distributed. 
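
For orientation only, here is a rough sketch of the do-it-yourself idea in (a). It is not the code from the paper, and the plotting position (i - 0.5)/n, the auto data and the new variable names -r_mpg-, -n_mpg- and -z- are all assumptions made for illustration.

```stata
* Sketch only: the auto data stand in for the poster's data
sysuse auto, clear

* rank values within each group, breaking ties arbitrarily,
* and count the nonmissing values in each group
egen r_mpg = rank(mpg), by(foreign) unique
egen n_mpg = count(mpg), by(foreign)

* plotting positions (i - 0.5)/n, mapped to standard normal quantiles
gen z = invnormal((r_mpg - 0.5) / n_mpg)

* one point cloud per group against the same normal quantiles
scatter mpg z if foreign == 0 || scatter mpg z if foreign == 1, ///
    legend(order(1 "Domestic" 2 "Foreign")) ///
    xtitle("Standard normal quantile") ytitle("Mileage (mpg)")
```

If a group is close to normal, its points should fall near a straight line; curvature, heavier tails or a stray point show up as departures from that line.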

On (b), -qplot- offers one-liners such as 

. qplot mpg, over(foreign) trscale(invnormal(@)) 

-search qplot, sj- for publications and download sources. 

Nick 
[email protected] 

Dorothy Bridges

I am overlaying two histograms and would like Stata to add a normal
density curve for each.

hist x, normal addplot(hist x2)

works fine, but

hist x, normal addplot(hist x2, normal)

tells me that normal is not an option.  Any ideas as to why this is happening?
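
The likely reason is that -normal- is an option of the -histogram- command, whereas a plot inside -addplot()- is handled as a -twoway histogram- plot, which has no such option. One possible workaround, sketched below, draws everything with -twoway- and adds each normal curve through -twoway function-. This is not necessarily the solution given elsewhere in the thread, and the auto data with two groups are an assumption standing in for the poster's x and x2.

```stata
* Sketch only: two groups of the auto data stand in for x and x2
sysuse auto, clear

* mean and SD of each group, kept in local macros
summarize mpg if foreign == 0
local m0 = r(mean)
local s0 = r(sd)
summarize mpg if foreign == 1
local m1 = r(mean)
local s1 = r(sd)

* both histograms on the (default) density scale, plus one normal curve each
twoway (histogram mpg if foreign == 0, color(gs12))                 ///
       (histogram mpg if foreign == 1, fcolor(none) lcolor(black))  ///
       (function normalden(x, `m0', `s0'), range(mpg))              ///
       (function normalden(x, `m1', `s1'), range(mpg)),             ///
       legend(order(1 "Domestic" 2 "Foreign"                        ///
                    3 "Normal, domestic" 4 "Normal, foreign"))
```

With two separate variables rather than two groups, the same pattern should apply: summarize each variable in turn and drop the -if- qualifiers. Drawing the second histogram as outlines only is one way to limit the occlusion problem.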

