# Re: st: RE: Y axis values for hist ,density

From: Marcello Pagano
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Y axis values for hist ,density
Date: Thu, 27 Oct 2005 09:24:57 -0400

```
I would recommend reading the well-written page

http://www.stata.com/support/faqs/graphics/histvary.html

and paying special attention to the equal probability version
(eqprhistogram); it has a lot going for it, including its
dislike for zero-height columns.

m.p.

Jann Ben wrote:

```
Bang! I don't agree. The purpose of a histogram is to make visible the shape of a density. It is therefore natural to report the y-axis in terms of a density.

ben

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Allan Reese (Cefas)
Sent: Thursday, October 27, 2005 3:03 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: Y axis values for hist ,density

The default "hist x" command in Stata gives a Y axis labelled as a density. I'd never given it much attention until I saw the scale go up to 2 on a plot. Hold on, density functions sum to 1 over the variable.

Further investigation and discussion with Statacorp identified that the default tries to make the "area" of the bars add up to 1. If the number of bars changes, so does their width and so does the Y labelling. In my example, the data were discrete, so increasing the number of intervals did not change the plot except to add more zero-height columns and hence make each column narrower.
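
For reference, the scaling described above can be written out as a short worked formula. This is a sketch of the standard density-histogram definition, not something taken from Stata's documentation; the symbols $n_j$ (count in bin $j$), $N$ (total observations), $w$ (common bin width) and $p$ (share of observations at one value) are notation introduced here for illustration.

```latex
h_j = \frac{n_j}{N\,w}, \qquad
\sum_j h_j\, w = \sum_j \frac{n_j}{N} = 1 .
% If a single discrete value holds a share p of the data and sits alone in a bin,
% its bar has height h = p / w, which exceeds 1 whenever w < p.
```

So the bar areas always total 1, while the heights alone depend on the bin width and can be greater than 1.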

hist x, bin(n) therefore caused different Y labelling with varying n.
hist x, xscale(range(0 n)) did not affect the labelling, though the bars got narrower with bigger n.
hist x, frac and hist x, discrete both gave correct labelling, and the sum of column heights was 1.
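
The behaviour listed above can be reproduced outside Stata with a few lines of arithmetic. The following is a minimal sketch in Python, assuming equal-width bins that span the data range (Stata's own binning rules may differ in detail); the function name `histogram_heights` and the locust counts in `data` are hypothetical and used only for illustration.

```python
def histogram_heights(values, bins):
    """Return (width, fractions, densities) for equal-width bins over [min, max].

    Assumes integer-valued data so that bin indices can be computed exactly.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Integer arithmetic avoids floating-point edge effects; the maximum
        # value is placed in the last bin rather than past the end.
        idx = min((v - lo) * bins // (hi - lo), bins - 1)
        counts[idx] += 1
    n = len(values)
    fractions = [c / n for c in counts]        # bar heights on a fraction scale
    densities = [f / width for f in fractions]  # bar heights on a density scale
    return width, fractions, densities


# Hypothetical discrete data: number of locusts counted in each of 20 samples.
data = [0] * 4 + [1] * 8 + [2] * 4 + [3] * 3 + [4] * 1

for bins in (4, 8, 20):
    width, fractions, densities = histogram_heights(data, bins)
    area = sum(d * width for d in densities)
    print(f"bins={bins:2d}  width={width:.2f}  "
          f"max fraction={max(fractions):.2f}  "
          f"max density={max(densities):.2f}  area={area:.2f}")
```

With these made-up data the tallest bar always holds 40% of the observations, so the fraction scale stays at 0.4 while the density scale moves from 0.4 to 0.8 to 2.0 as the bins narrow; the total area stays at 1 throughout.
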
Do other users think this is perverse behaviour, especially as the default? My take is that, when drawing a histogram, the column width is taken as an arbitrary unit, not directly related to the x-scale. The implication is that you need to scale the height only when there are mixed-width columns, but would not label the Y axis in "freq/absolute-width" units. Having "densities" that vary and are in such peculiar units (1/locust in my example!) does not seem helpful.

Shoot me down
Allan

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
