Statalist The Stata Listserver



Re: st: changing limits of histogram bins


From   n j cox <n.j.cox@durham.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: changing limits of histogram bins
Date   Mon, 19 Mar 2007 14:31:47 +0000

I believe that the problem here is the old precision problem again, thinly disguised. -search precision- should yield various manual, FAQ and published paper accounts.

Once more from the top, allegro molto vivace:

1. You think decimal, Stata thinks binary. Stata does its
level best to talk your language, but it's still a foreign
language, and misunderstandings will occur.

2. 0(0.1)1 looks nice and simple to you, but Stata has a hard time of
it approximating most of these using binary arithmetic. 0, 0.5 and 1
behave as hoped and expected but none of the others can be held
exactly as a binary fraction. Consider some results which underline
that decimals must be approximated. In most
cases, Stata will be only a smidgen away from where you think it
should be. Usually this matters not, but sometimes, as here, it bites.

forval i = 0/10 {
    di %23.18f `i'/10
}
0.000000000000000000
0.100000000000000010
0.200000000000000010
0.299999999999999990
0.400000000000000020
0.500000000000000000
0.599999999999999980
0.699999999999999960
0.800000000000000040
0.900000000000000020
1.000000000000000000
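
Another way to look at the same values is Stata's hexadecimal display format %21x, which shows the stored binary number without any decimal rounding. The loop below is only a sketch of that idea, printing each tenth next to its hexadecimal form:

```stata
* print each tenth in decimal and in Stata's hexadecimal format
forvalues i = 0/10 {
    display %4.1f `i'/10 "   " %21x `i'/10
}
```

The values 0, 0.5 and 1 should show only zeros in the fraction digits, while the other tenths show a repeating pattern that has been cut off where the binary representation runs out of bits.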

I don't have a strong sense of Richard's exact problem, but
precision problems can be improved, if not solved, by
using -double- variables, not -float-, and avoiding comparisons
which are likely to fail, i.e. by not assuming that you know where
the boundary of a closed interval really is.

Small fudges may work, but it is better to understand what is going
on.
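
To make the point about -double- and fragile comparisons concrete, here is a minimal sketch. The variable names xf and xd are invented for the illustration and have nothing to do with Richard's data.

```stata
clear
set obs 1
generate float  xf = 0.1
generate double xd = 0.1

* the literal 0.1 is evaluated in double precision, so it is not equal
* to the float approximation stored in xf
count if xf == 0.1

* rounding the literal to float precision first makes the comparison succeed
count if xf == float(0.1)

* a double variable holds the same approximation as the literal
count if xd == 0.1
```

The three counts are 0, 1 and 1. A value sitting on a bin boundary is exposed to the same mismatch whenever the variable is a float and the boundary is computed in double precision.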

Nick
n.j.cox@durham.ac.uk

Svend Juul

Richard Hisock wrote:

I am trying to create a frequency histogram where the bins are
cuts of a variable (a proportion of observations, so its range
is [0,1]) of width 0.1.

In Stata v9, using histogram ..., width(0.1) start(0(0.1)1), the bins
appear to be constructed as [0,0.1) [0.1,0.2) ... and finally [0.9,1].

Is it possible to change the bin limits to (0,0.1] (0.1,0.2] so that
the upper limit is included in the bin?

Or should I just reset the bin limits for example
...,width(0.1) start (0(0.10001)1) xlab(0(0.1)1)?

-----------------------------------------------------------------------

My Stata 9.2 does not accept
histogram ... , start(0(0.1)1)
but it does accept
histogram ... , start(0) width(0.1)

I made this experiment:

clear
set obs 100
gen x=int(10*uniform())/10
tab1 x
histogram x , start(0) width(0.1) frequency

and found (as expected) that it was not predictable whether, e.g.,
0.7 was included in the bin below or above 0.7.

One possible solution is:

generate x1 = x+0.00001
histogram x1 , start(0) width(0.1) frequency
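
An alternative to shifting the data, sketched here on the assumption that the values really are multiples of 0.1 (as in the experiment above), is to move to an integer scale, where bin membership no longer depends on binary approximations. The variable name tenths is made up for this example, and later Stata releases call the random-number function runiform() rather than uniform().

```stata
clear
set obs 100
generate double x = int(10*uniform())/10

* work in tenths: round() returns the nearest integer, which absorbs
* any tiny binary error in 10*x
generate byte tenths = round(10*x)

* one bar per distinct integer value, so no boundary is ambiguous
histogram tenths, discrete frequency xlabel(0(1)9)
```

This does not help when the variable takes values strictly between the tenths; in that case the bin for each observation has to be decided explicitly before plotting.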
