
From: "Cowell, Alexander J." <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: Re: st: RE: Histograms (was: Multiple (overlaid) Histogram)
Date: Fri, 30 May 2003 15:34:49 -0400

Hi there,

It seems to me there's a fast-growing base of support for histograms with varying widths in Stata. This is just to add my vote in favor of implementing such an option in Stata 9, and in further ado files in the meantime.

Nick Cox's arguments against are:

1. with the way data are these days, few people will need histograms with varying widths, and
2. it's a slippery slope to infinitely large or small supports.

My responses are:

1. There are data sets that allow only varying widths in the support, and not all of them are obscure. Take, for example, the public-use Statistics of Income (SOI) data published by the United States Internal Revenue Service, which give frequencies in unequal intervals of taxable income. Moreover, there are many instances where unequal widths in the support make natural sense. If one were to examine frequency of drug use by age, one might want to examine frequencies by broad age categories that are clinically meaningful but not of equal width. The age ranges may be: young (0-12), adolescent (12-17), young adult (18-24), ..., mature adult but not retired (50-65), etc.

2. Let the user decide where to draw the faux-closed interval for what is logically an infinite continuum. We users are used to making such decisions, and we usually know where to ask if we need help deciding. Often the subject discipline will give guidelines based on precedent. This is, after all, how top-codes are often used in reporting income in many survey data sets.

In fact, my case is a general plea to allow for greater user power in the histogram command.

Thanks
Alex Cowell

Alexander J. Cowell, Ph.D.
Economist
Behavioral Health Economics Program
Research Triangle Institute
3040 Cornwallis Rd
PO Box 12194
Research Triangle Park NC 27709-2194
email: [email protected]
phone: 919 541 8754
fax: 919 541 6683

Nick Cox wrote:

1. Empirical. You will see histograms with unequal widths particularly in older books and papers, and the reason was that data for them came already grouped in such classes. There's an example in Snedecor and Cochran's venerable text. That seems far less common today, when more and more data sets are available in raw, ungrouped form, modulo confidentiality constraints. I don't see people asking for this often on Statalist, and one good reason for this being low down in priority is that it is in practice rarely needed.

2. The "slippery slope question": if unequal widths are supported, then next in line is the question of support for a histogram with a class which extends from a large positive number to infinity and/or a class which extends from a large negative number to minus infinity. Even quite what you _should_ draw then seems to me an open question (pun intended).

Richard Goldstein wrote:

I very strongly disagree with Nick's conclusion here (even taking it as somewhat tongue-in-cheek):

For any graphic command that has an option such as bin (histogram), bwidth (lowess), width (kdensity), I would very much like to see dynamic graphics -- i.e., a slider such that I can change, e.g., the number of bins in real time and see what kind of difference it makes to the graph.

Would anyone else like to see something like this?

Rich Goldstein

Marcello Pagano wrote:

As I have said before, I would very much like, with Allan Reese, an option for equi-probability histograms. I think that this is especially useful when thinking of the histogram as an estimator of an underlying density function of a continuous variable. I do not see a strong argument, other than tradition (laziness?), for being constrained to histograms with equi-spaced bins.

m.p.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
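Editorial illustration, not part of the original thread: the two ideas raised above (grouped data in unequal classes, and equi-probability bins) can be sketched in a few lines. This is a minimal sketch in Python rather than Stata, using only the standard library. The function names `density_heights` and `equal_probability_edges`, the income brackets, and the counts are all hypothetical and made up for illustration; none of them come from the SOI data or from any Stata command. The top bracket is closed at an arbitrary user-chosen top-code, as point 2 of the reply suggests.

```python
from statistics import quantiles


def density_heights(edges, counts):
    """Bar heights for a histogram whose bins may have unequal widths.

    Each height is count / (total * width), so the AREA of a bar (not its
    height) is proportional to the class frequency and the areas sum to 1.
    """
    if len(edges) != len(counts) + 1:
        raise ValueError("need exactly one more edge than counts")
    total = sum(counts)
    heights = []
    for lo, hi, count in zip(edges, edges[1:], counts):
        width = hi - lo
        if width <= 0:
            raise ValueError("edges must be strictly increasing")
        heights.append(count / (total * width))
    return heights


def equal_probability_edges(values, bins):
    """Bin edges that place roughly the same share of observations in each bin."""
    cuts = quantiles(values, n=bins, method="inclusive")
    return [min(values)] + cuts + [max(values)]


# Hypothetical grouped data: income classes in thousands of dollars.
# The last class is open-ended in principle; here the user closes it at 200.
income_edges = [0, 10, 25, 50, 100, 200]
income_counts = [40, 30, 20, 8, 2]

heights = density_heights(income_edges, income_counts)
areas = [h * (hi - lo)
         for h, lo, hi in zip(heights, income_edges, income_edges[1:])]

# Hypothetical raw data for the equi-probability case.
raw_values = list(range(1, 101))
quartile_edges = equal_probability_edges(raw_values, bins=4)
```

Scaling by width is the design choice that matters: plotting raw counts as heights would exaggerate the wide classes, whereas density heights keep bar area proportional to frequency. Where the user places the top-code changes only the height of the last bar, not its area.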

