
st: adding means to histograms


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: adding means to histograms
Date   Tue, 24 May 2005 21:25:58 +0100

Following an earlier reply to Neil Shephard, here 
is a note on a further technique made possible 
by Stata 9 for adding marginal information to graphs, 
for example histograms. 

This example concerns means, but the principles
can clearly be applied to other measures. 

All the code here is given in one chunk at
the end for copying and pasting if desired. 

Firing up the auto data,  

. sysuse auto, clear 

we need one variable that is identically zero 

. gen zero = 0 

and one that is identically some relatively 
small negative number: 

. gen minus = -0.25 

"Relatively small" is judged with reference to 
the range on the vertical axis: with a larger 
maximum frequency than that to come, you would 
need a negative number much larger in magnitude; 
with densities, usually one much smaller in magnitude. 

Now we want the mean. In this first example, the 
mean is naturally just a single value, and there 
are other ways to do it, but in other examples 
it is convenient to put results in a variable, 
despite the redundancy implied. (That is also 
why constants were put in variables above.) 

. egen mean = mean(mpg) 

Now we have all the ingredients: 

. twoway histogram mpg, 
	w(1) freq legend(off) xti("Mileage (mpg)") 
	|| pcarrow minus mean zero mean, 
	barbsize(3) msize(3)

-pcarrow- takes its variables in the order y1 x1 y2 x2. 
So the mean symbol is just an arrow that horizontally 
is at the value of -mean- and vertically extends from 
-minus- to -zero-. The shaft of the arrow, however, is 
hidden by its own rather large arrowhead, so a triangle 
alone is visible. 
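
For a single overall mean, one of the "other ways" mentioned 
above is to skip the constant variables and hand the numbers 
directly to -twoway pcarrowi-. What follows is only a sketch: 
the local macro name -m- is my own choice, and the -0.25 is the 
same eyeballed offset as before. 

```stata
* Sketch: overall mean marked with immediate values, no helper variables
sysuse auto, clear

* leave the mean of mpg in r(mean), then copy it to a local macro
summarize mpg, meanonly
local m = r(mean)

* pcarrowi takes y1 x1 y2 x2 as numbers rather than variables
twoway histogram mpg, w(1) freq legend(off) xti("Mileage (mpg)") ///
|| pcarrowi -0.25 `m' 0 `m', barbsize(3) msize(3)
```

This is tidy for one graph, but immediate values would be drawn 
identically in every panel of a -by()- graph, which is one reason 
for preferring variables when each group has its own mean. 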

You could -- especially in Stata 8 -- go for the 
same effect by overplotting with a scatter plot with -ms(T)-,  
but it can be troublesome to get the triangle in exactly 
the right position and at exactly the right size. 
Or at least that was my experience. 
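
For comparison, a minimal sketch of that scatter approach might 
look like this. The -msize(large)- setting and the reuse of -0.25 
as the vertical position are guesses on my part and would need 
tuning by eye, which is precisely the fiddly part. 

```stata
* Sketch: triangle marker placed by overplotting a scatter plot
sysuse auto, clear
egen mean = mean(mpg)
gen minus = -0.25

* the triangle is centred on (mean, minus); adjust minus and msize() by eye
twoway histogram mpg, w(1) freq legend(off) xti("Mileage (mpg)") ///
|| scatter minus mean, ms(T) msize(large)
```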

The same minor trickery can be used to populate 
a series of histograms with their own mean symbols: 

. egen meanby = mean(mpg), by(rep78) 

. twoway histogram mpg, w(1) freq 
	by(rep78, legend(off) 
	subti("Mileage (mpg)", ring(1) pos(6))) 
	|| pcarrow minus meanby zero meanby, 
	barbsize(2) msize(2)  
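
As an illustration of applying the same principle to another 
measure, here is a sketch that marks group medians instead of 
group means. The variable name -medianby- is just my choice for 
the example; everything else follows the command above. 

```stata
* Sketch: same by() graph, but with group medians rather than means
sysuse auto, clear
gen zero = 0
gen minus = -0.25

* median of mpg within each category of rep78
egen medianby = median(mpg), by(rep78)

twoway histogram mpg, w(1) freq ///
by(rep78, legend(off) subti("Mileage (mpg)", ring(1) pos(6))) ///
|| pcarrow minus medianby zero medianby, ///
barbsize(2) msize(2)
```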

"To the vector belong the spoils." 
(Norton Juster, "The dot and the line") 

Nick 
n.j.cox@durham.ac.uk 

sysuse auto, clear 

egen mean = mean(mpg) 

gen zero = 0 
gen minus = -0.25 

twoway histogram mpg, w(1) freq legend(off) xti("Mileage (mpg)") ///
|| pcarrow minus mean zero mean, barbsize(3) msize(3)

more 

egen meanby = mean(mpg), by(rep78) 

twoway histogram mpg, w(1) freq by(rep78, legend(off)  ///
subti("Mileage (mpg)", ring(1) pos(6))) ///
|| pcarrow minus meanby zero meanby, ///
barbsize(2) msize(2)  
