The Stata listserver

RE: st: RE: Histograms (was: Multiple (overlaid) Histogram)


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: Histograms (was: Multiple (overlaid) Histogram)
Date   Sat, 31 May 2003 18:55:46 +0100

Marcello Pagano
 
> > As I have said before, I would very much like, with Allan Reese, an
> > option for equi-probability histograms.  I think that this is
> > especially useful when thinking of the histogram as an estimator of
> > an underlying density function of a continuous variable. I do not see
> > a strong argument, other than tradition (laziness?), for being
> > constrained to histograms with equi-spaced bins.

Nick Cox 

> An interesting problem.
 
> I see one key question arising.
> 
> If I understand this properly, we just need to find so many quantiles
> x_(i) equally spaced on the probability scale and calculate bar
> heights proportional to 1 / (x_(i+1) - x_(i)). In practice, it is
> not that uncommon to get ties, implying infinite bar heights. We
> can fudge this by interpolating linearly between different quantiles.
> 
> What lies below is a hack. The number of bins is wired in at no
> more than 20. My experiments indicate that you need
> a lot of data to avoid somewhat bizarre details. Essentially
> you are bound to see the kind of instability inherent in crude
> numerical derivatives.
> 
> Don't point out that density estimates are nicer! I know that.
> This is an attempt at what Marcello wants.
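
The calculation described in the quoted text can be sketched as follows.
This is only an illustration, not the omitted code: it assumes the auto
dataset shipped with Stata and a hypothetical choice of k = 4 bins, and
it takes the bar height for bin i to be 1 / (k * (x_(i) - x_(i-1))), so
that every bar has area 1/k.

```stata
sysuse auto, clear

* hypothetical number of bins
local k = 4

* endpoints: minimum and maximum of the variable
summarize price, meanonly
local x0 = r(min)
local x`k' = r(max)

* interior quantiles, equally spaced on the probability scale
_pctile price, nq(`k')
forvalues i = 1/`= `k' - 1' {
    local x`i' = r(r`i')
}

* bar height is 1 / (k * width), so each bar has area 1/k
local area = 0
forvalues i = 1/`k' {
    local im1 = `i' - 1
    local width = `x`i'' - `x`im1''
    local h = 1 / (`k' * `width')
    local area = `area' + `h' * `width'
    display "bin `i': [`x`im1'', `x`i'']  density = " `h'
}
display "total area = " `area'
```

If two neighbouring quantiles coincide, the width is zero and the height
is undefined, which is the tie problem mentioned above.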

< code omitted > 

Having slept on this, 

1. I realised that the method of adjusting for tied 
quantiles was arbitrary, biased, implemented incorrectly 
even in its own terms and in any case not applicable 
if quantiles tied with the maximum. In short, a bad 
idea. After thinking about various alternatives, 
I decided that the most honest thing to do 
was to refuse to draw the graph if quantiles are tied. 

2. I generalised to allow frequency and analytic 
weights. 

3. Most interestingly, Vince Wiggins kindly told me 
of an undocumented option -bartype(spanning)- 
which removes the need for the convoluted 
trickery used to draw bars. Thanks! (It 
appears to have one quirk, which is worked 
around in my code.) 
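
As a rough illustration of point 1, the sketch below looks for ties
among quantiles before anything is drawn. It is a simplified stand-in,
not the check used in the program below (which stores the minimum, the
quantiles and the maximum in a variable and counts duplicates with
-bysort-): here only the interior quantiles are compared, and the
variable rep78 from the shipped auto dataset and k = 10 are assumptions
chosen because rep78 takes only five distinct values, so ties are
expected.

```stata
sysuse auto, clear

* hypothetical choice: deciles of a variable with few distinct values
local k = 10
_pctile rep78, nq(`k')

* compare each interior quantile with its predecessor
local tied = 0
forvalues i = 2/`= `k' - 1' {
    local im1 = `i' - 1
    if r(r`i') == r(r`im1') {
        local tied = 1
    }
}

if `tied' {
    display as txt "tied quantiles: try fewer bins?"
}
else {
    display as txt "no ties among the interior quantiles"
}
```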

If you copied and pasted the code posted yesterday, 
please junk it and use this instead. I will send a regular 
version to SSC shortly. 

Nick 
[email protected] 

program eqprhistogram, sortpreserve 
*! for Marcello, eppur si muove!  
*! NJC 2.0.0 31 May 2003 
	version 8 
	syntax varname(numeric) [if] [in] [aweight fweight] /// 
	[ , bin(numlist int >1 <=20) * ]

	// #bins defaults to 20 
	if "`bin'" == "" local bin = 20 
	local binp1 = `bin' + 1 
	local binp2 = `bin' + 2 
	local binm1 = `bin' - 1 

	// enough data? 
	marksample touse 
	qui count if `touse' 
	if r(N) < `binp2' { 
		di as err "insufficient observations" 
		error 2000
	} 
	
	tempvar quantile qnum density 
	qui {
		// get quantiles
		su `varlist' [`weight' `exp'] if `touse', meanonly 
		generate `quantile' = r(min) in 1 
		replace `quantile' = r(max) in `binp1'  
		_pctile `varlist' [`weight' `exp'] if `touse', nq(`bin') 
		forval i = 1/`binm1' { 
			replace `quantile' = r(r`i')  in `= `i' + 1' 
		} 

		// check for tied quantiles 
		bysort `quantile'  : ///
			gen byte `qnum' = _N * (`quantile' < .) 
		su `qnum', meanonly 
		if r(max) > 1 { 
			noi di as txt ///
			"{p}`varlist' has tied quantiles: " ///
			"try fewer bins? graph inappropriate?{p_end}"
			exit 0 
		} 	
		
		// prepare graph
		gen `density' = ///
			1 / (`bin' * (`quantile'[_n+1] - `quantile'))
		_crcslbl `quantile' `varlist' 
		label var `density' "Density" 

		// work-around spanning quirk
		replace `quantile' = `quantile'[1] in `binp2' 
		replace `density' = 0 in `binp2'

		sort `quantile' `density'
	} 

	twoway bar `density' `quantile', ///
		bartype(spanning) bstyle(histogram) `options'
end
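
A possible way to try the program, offered as a sketch rather than a
tested recipe. It assumes the code above has been run in the current
session or saved on the ado-path as eqprhistogram.ado, and it uses the
auto dataset shipped with Stata; the choice of price, rep78 and five
bins is arbitrary. Because -bartype(spanning)- is undocumented, its
behaviour in releases other than Stata 8 is not guaranteed.

```stata
sysuse auto, clear

* five equal-probability bins for price
eqprhistogram price, bin(5)

* frequency weights are accepted by the syntax, and any other
* twoway options are passed through to the graph
eqprhistogram price [fweight = rep78], bin(5) title("Equal-probability bins")
```

If the program reports tied quantiles, no graph is drawn; asking for
fewer bins is the suggested first remedy.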
		




