The Stata listserver

st: RE: Rounding problem in graph


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Rounding problem in graph
Date   Fri, 12 Sep 2003 12:17:21 +0100

Friedrich Huebler
 
> This is an excerpt from a larger data set:
> 
>      sex   rg5      order       tmax       mmax       fmax
>     Male  72.2          2       74.9       72.2       74.9
>   Female  74.9          1       74.9       72.2       74.9
> 
> I want to draw a simple bar graph that indicates the value for each
> sex. The problem is that I cannot find a way for Stata to draw the
> value for Female (74.9) rounded to one decimal place. Instead, the
> number 74.90000000000001 is drawn.
> 
>   #delimit ;
>   local textpos1 = tmax/100*20;
>   local textpos2 = tmax/100*50;
>   local mvalue = round(mmax,.1);
>   local fvalue = round(fmax,.1);
>   graph bar rg5,
>     over(sex, sort(order) gap(0) axis(off))
>     bar(1, bcolor(0 153 255) blcolor(white))
>     yscale(r(0 `max(tmax)') off) ylabel(minmax, nogrid)
>     outergap(0) plotr(m(zero))
>     graphregion(color(white) margin(zero))
>     text(`textpos1' 25 "Girls", size(20) color(white))
>     text(`textpos1' 75 "Boys", size(20) color(white))
>     text(`textpos2' 25 "`fvalue'%", size(20) color(white))
>     text(`textpos2' 75 "`mvalue'%", size(20) color(white))
>     legend(off);
> 
> When the line
> 
>   local fvalue = round(fmax,.1);
> 
> is changed to
> 
>   local fvalue = round(fmax,1);
> 
> a rounded value is displayed. When 74.8888 is substituted for 74.9 in
> fmax, it is possible to round to two or three decimal places, but not
> to one decimal place. How can I make Stata display exactly 74.9 in
> the graph?

At root this is not a graphics issue at all. 

It is, to at least some of us, an old friend: the precision issue
when you attempt to hold decimals in binary form. 

There is a run-down at [U] 16.10 
"Precision and problems therein"
and an FAQ which at first seems to be on a different issue at 
http://www.stata.com/support/faqs/data/mod.html -- but 
it is pertinent. 

Everyone recalls when reminded that computer software 
is based on binary arithmetic, but naturally most of 
the time we want to do our numerics with decimal input 
and output. A lot of very smart work has been put into
hiding the details in 

	decimal input at high level 
		|
	binary input at low level 
		|
	manipulations at low level 
		| 
	binary output at low level 
		|
	decimal output at high level 

from our eyes (and from our minds), but they can bite (byte?) 
at times. 

Given 72.2, for example, what does Stata make of it? To 
Stata it is more like this hexadecimal number, and 
you can see that Stata is straining to get 
as close as it can: 

. di %21x 72.2 
+1.20ccccccccccdX+006

The story is easier to follow if we use a 
near equivalent decimal format 

. di %21.18f 72.2 
72.200000000000003000

Imagine you're in fact conversing with Stata here: 

"OK, Stata my love, I have a number here, 72.2, can you deal with
that?" 

"Well, sweetie, the best approximation I can give you 
is the binary equivalent of (about) 
72.200000000000003000. Is that going to be 
close enough?" 

(The undocumented 

. set conversation on 

will produce more talkative output in Stata 
9 onwards.) 

Similarly with 74.9: 

. di %21.18f 74.9
74.900000000000006000

Some decimals map easily to binaries, 

like something + 1/2 

. di %21.18f 74.5  
74.500000000000000000

or something + 1/4 

. di %21.18f 74.25
74.250000000000000000

or something + 1/8 

. di %21.18f 74.125
74.125000000000000000

and so forth -- but not all, as both Friedrich's examples 
show. 
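
A rough way to see which decimals are awkward, offered 
only as a sketch: compare each literal with what 
-float()- makes of it. Numbers that are a short sum of 
powers of 2 come through unchanged; 72.2 and 74.9 do not. 
The test is indicative only, as fitting in a float is a 
stricter condition than fitting in a double. 

```stata
* Each line displays 1 if the value is unchanged by rounding to
* float precision, and 0 otherwise.
display 74.5   == float(74.5)
display 74.25  == float(74.25)
display 74.125 == float(74.125)
display 72.2   == float(72.2)
display 74.9   == float(74.9)
```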

Note that -round(,)- here makes no 
difference. -round(,)- maps numbers 
to numbers, and gains nothing here: 

. di %21.18f round(74.9,0.1)
74.900000000000006000

(It is as if you're saying to Stata, 
"Try harder!", but it is already doing the 
best it can do.) 

-round(#,1)- naturally always produces 
a number that can be held exactly as binary, 
subject to the limits on very large numbers. 

This sounds bad in general, but the remedy is 
simple: get Stata to treat the thing to be 
displayed as a string, formatted as you want it. 
Here is one way 

local mvalue : di %2.1f mmax[1] 

and here is another 

local mvalue = string(mmax[1], "%2.1f") 
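
To see the difference side by side, here is a small 
sketch. It uses the literal 74.9 rather than the 
variable, purely so that it runs without any data in 
memory, and %4.1f is my choice of a format wide enough 
for a value like 74.9. The macro names are made up for 
the illustration. 

```stata
* Defined with = : the macro holds the numeric result written out
* with many digits, which is where unwanted trailing digits can
* come from.
local numeric = round(74.9, .1)

* Defined through a format: the macro holds only the characters
* asked for.
local viadi : display %4.1f 74.9
local viastring = string(74.9, "%4.1f")

display "numeric macro:      `numeric'"
display "via display format: `viadi'"
display "via string():       `viastring'"
```

Either of the last two can then be used inside -text()- 
in place of `fvalue'. 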

Incidentally, in your code, you have statements
like this: 

local textpos1 = tmax/100*20 

Beware: this is a source of bugs. -tmax- is 
a variable; in this context Stata is going 
to treat this as 

local textpos1 = tmax[1]/100*20 

which is what you want in this case. However, 
it is Bad Style, or so I suggest. 

This is, I suggest, Nice Style: 

su rg5, meanonly 
local textpos1 = r(max)/100*20
local textpos2 = r(max)/100*50

So you don't need the extravagance 
of holding a constant repeatedly 
in a variable.  
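
Taking that one step further, here is a sketch in which 
the labels also come from -summarize- results, so that 
none of tmax, mmax or fmax is needed. The two-observation 
dataset is typed in from the excerpt in the question; 
the -if- conditions and the %4.1f format are my 
assumptions about what is wanted. 

```stata
clear
input str6 sex float(rg5 order)
"Male"   72.2 2
"Female" 74.9 1
end

* Overall maximum: -summarize, meanonly- leaves it in r(max).
summarize rg5, meanonly
local textpos1 = r(max)/100*20
local textpos2 = r(max)/100*50

* One label per sex, put into the macro as a formatted string.
summarize rg5 if sex == "Male", meanonly
local mvalue : display %4.1f r(max)
summarize rg5 if sex == "Female", meanonly
local fvalue : display %4.1f r(max)

display "Boys `mvalue'%   Girls `fvalue'%"
display "text positions: `textpos1' and `textpos2'"
```

The text positions are left as numbers, as they are used 
as coordinates and never shown. 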

Similarly, your 

`max(tmax)' 

is a guess at Stata syntax, but it 
isn't really legal syntax. By a series of 
little accidents you should get exactly 
what you want, as Stata is going to 
treat this as a reference to 

`max' 

which (from your example) doesn't exist. 
The "(tmax)" then gets ignored as trailing garbage. 

A local macro that doesn't exist is evaluated as an 
empty string, so in context the whole thing then becomes 

r(0 ) 

which is OK here. 

(I went through a similar story on macro names in a reply 
to Deborah Garvey on 29 August.) 

Nick 
[email protected] 
