Nick Cox

statalist@hsphsun2.harvard.edu

st: adding means to histograms

Tue, 24 May 2005 21:25:58 +0100

Following an earlier reply to Neil Shephard, here is a note on a further technique made possible by Stata 9 for adding marginal information to graphs, for example histograms. This example concerns means, but the principles can clearly be applied to other measures. All the code here is given in one chunk at the end for copying and pasting if desired.

Firing up the auto data,

. sysuse auto, clear

we need one variable that is identically zero

. gen zero = 0

and one that is identically some relatively small negative number:

. gen minus = -0.25

"Relatively small" is judged with reference to the range on the vertical axis: with a larger maximum frequency than that to come, you would need a much larger negative number; with densities, usually a much smaller one.

Now we want the mean. In this first example, the mean is naturally just a single value, and there are other ways to do it, but in other examples it is convenient to put results in a variable, despite the redundancy implied. (That is also why constants were put in variables above.)

. egen mean = mean(mpg)

Now we have all the ingredients:

. twoway histogram mpg, w(1) freq legend(off) xti("Mileage (mpg)") || pcarrow minus mean zero mean, barbsize(3) msize(3)

So the mean symbol is just an arrow that horizontally is at the value of -mean- and vertically extends from -minus- to -zero-. The arrow, however, is hidden by its own rather large arrowhead, so a triangle alone is visible.

You could -- especially in Stata 8 -- go for the same effect by overplotting with a scatter plot with -ms(T)-, but it can be troublesome to get the triangle in exactly the right position and at exactly the right size. Or at least that was my experience.

The same minor trickery can be used to populate a series of histograms with their own mean symbols:

. egen meanby = mean(mpg), by(rep78)

. twoway histogram mpg, w(1) freq by(rep78, legend(off) subti("Mileage (mpg)", ring(1) pos(6))) || pcarrow minus meanby zero meanby, barbsize(2) msize(2)

"To the vector belong the spoils."
(Norton Juster, "The dot and the line")

Nick
n.j.cox@durham.ac.uk

sysuse auto, clear
egen mean = mean(mpg)
gen zero = 0
gen minus = -0.25
twoway histogram mpg, w(1) freq legend(off) xti("Mileage (mpg)") ///
|| pcarrow minus mean zero mean, barbsize(3) msize(3)
more
egen meanby = mean(mpg), by(rep78)
twoway histogram mpg, w(1) freq by(rep78, legend(off) ///
subti("Mileage (mpg)", ring(1) pos(6))) ///
|| pcarrow minus meanby zero meanby, ///
barbsize(2) msize(2)
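
The note says that the principles can be applied to other measures. As a minimal sketch of that, assuming the same auto data and the same -zero- and -minus- helper variables, the median could be marked in the same way. The variable name -med- is an assumption made for illustration and is not in the original code; -egen-'s -median()- function simply stands in for -mean()-.

```stata
sysuse auto, clear
gen zero = 0
gen minus = -0.25
* med is a hypothetical variable name; egen's median() replaces mean()
egen med = median(mpg)
twoway histogram mpg, w(1) freq legend(off) xti("Mileage (mpg)") ///
    || pcarrow minus med zero med, barbsize(3) msize(3)
```

As with the mean, adding -by(rep78)- to the -egen- call and to the histogram should give one median marker per panel, although -minus-, -barbsize()- and -msize()- may need adjusting to the vertical scale.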

