
From |
Marcello Pagano <pagano@hsph.harvard.edu> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: RE: Y axis values for hist ,density |

Date |
Thu, 27 Oct 2005 09:24:57 -0400 |

I would recommend reading the well written page http://www.stata.com/support/faqs/graphics/histvary.html and paying special attention to the equal probability version (eqprhistogram); it has a lot going for it, including its dislike for zero-height columns.

m.p.

Jann Ben wrote:

Bang! I don't agree. The purpose of a histogram is to make visible the shape of a density. It is therefore natural to report the y-axis in terms of a density. ben

-----Original Message-----

From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Allan Reese (Cefas)

Sent: Thursday, October 27, 2005 3:03 PM

To: statalist@hsphsun2.harvard.edu

Subject: st: Y axis values for hist ,density

The default "hist x" command in Stata gives a Y axis labelled as a density. I had never given it much attention until I saw the scale go up to 2 on a plot. Hold on, density functions sum to 1 over the variable.

Further investigation and discussion with StataCorp identified that the default tries to make the "area" of the bars add up to 1. If the number of bars changes, so does their width and so does the Y labelling. In my example, the data were discrete, so increasing the number of intervals did not change the plot except to add more zero-height columns and hence make each column narrower.

hist x, bin(n) therefore caused different Y labelling with varying n

hist x, xscale(range(0 n)) did not affect the labelling, though the bars got narrower with bigger n

hist x, frac and hist x, discrete both gave correct labelling, and the sum of column heights was 1.
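The behaviour described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python, not Stata's own implementation: the function name `histogram_heights` and the data in `x` are hypothetical, and the bins are assumed to be equal-width over a fixed range. It computes each bar's fraction (count / N) and density (fraction / bin width) for the same discrete data at several bin counts.

```python
def histogram_heights(values, lo, hi, bins):
    """Return (width, fractions, densities) for equal-width bins on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Place v in its bin; a value on the top edge goes into the last bin.
        j = min(int((v - lo) / width), bins - 1)
        counts[j] += 1
    n = len(values)
    fractions = [c / n for c in counts]
    # Density divides by the bin width so that the bar areas total 1.
    densities = [f / width for f in fractions]
    return width, fractions, densities


# Hypothetical discrete data (integer counts 0..4), not the original poster's.
x = [0, 0, 1, 1, 1, 2, 2, 3, 4, 4]

for bins in (5, 10, 20):
    width, frac, dens = histogram_heights(x, 0, 5, bins)
    area = sum(d * width for d in dens)
    print(f"bins={bins:2d} width={width:.2f} "
          f"max fraction={max(frac):.2f} max density={max(dens):.2f} "
          f"area={area:.2f}")
```

With these made-up data the tallest bar's fraction stays at 0.30 for every bin count, while its density rises from 0.30 to 0.60 to 1.20 as the bins narrow; the total area stays at 1 throughout. That is how a density axis can run past 1 once the bins are narrower than one unit of x.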

Do other users think this is perverse behaviour, especially as the default? My take is that, when drawing a histogram, the column width is taken as an arbitrary unit, not directly related to the x-scale. The implication is that you need to scale the height only when there are mixed-width columns, but would not label the Y axis in "freq/absolute-width" units. Having "densities" that vary and are in such peculiar units (1/locust in my example!) does not seem helpful.
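The mixed-width case mentioned above can be illustrated the same way. This is a hedged sketch in Python with made-up data and a hypothetical helper `heights`; it is not taken from Stata. The values are spread evenly over 0 to 39, and the last bin is twice as wide as the first two.

```python
def heights(values, edges):
    """Return (left, right, fraction, density) for bins defined by edges."""
    n = len(values)
    rows = []
    for left, right in zip(edges[:-1], edges[1:]):
        # Bins are half-open [left, right); the last bin also includes its right edge.
        is_last = right == edges[-1]
        count = sum(1 for v in values
                    if left <= v < right or (is_last and v == right))
        fraction = count / n
        rows.append((left, right, fraction, fraction / (right - left)))
    return rows


# Hypothetical data: one observation at each integer 0..39.
values = list(range(40))
edges = [0, 10, 20, 40]  # the last bin is twice as wide as the others

for left, right, fraction, density in heights(values, edges):
    print(f"[{left:2d}, {right:2d}] fraction={fraction:.3f} density={density:.3f}")
```

Here the fractions are 0.25, 0.25 and 0.50, so bars drawn at fraction height would suggest a pile-up in the wide bin that is not in the data, whereas the densities are all 0.025. With equal-width bins the two scalings differ only by a constant factor, which is the situation where the choice of Y label is purely a matter of units.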

Shoot me down

Allan


*

* For searches and help try:

* http://www.stata.com/support/faqs/res/findit.html

* http://www.stata.com/support/statalist/faq

* http://www.ats.ucla.edu/stat/stata/


References: st: RE: Y axis values for hist ,density (From: "Jann Ben" <ben.jann@soz.gess.ethz.ch>)
