
RE: st: strange behavior of int() function...not truncating properly

From	Zachary Harrison <[email protected]>
To	[email protected]
Subject	RE: st: strange behavior of int() function...not truncating properly
Date	Tue, 30 Jan 2007 09:58:00 -0800 (PST)

Nick-

Your answer below is great! Thank you very much for
the detailed explanation.

zach
...

This is at root nothing to do with -int()-. The issue is the
expectation that computations to finite precision are always exact.
On the contrary, they almost always entail approximations and
occasionally the results cause surprise. Also, it is salutary to
recall that while you can do arithmetic with pencil and paper and use
base 10 ideas, Stata is using base 2 approximations and quite
different algorithms. The two should come close, but they are not
guaranteed identical.

The format of %9.5f is insufficient to show what is happening.

We, humans who know arithmetic, can see that int(99.14 * 100)/100
should be (is!) 9914 / 100 = 99.14. But Stata does not look at the
formula and use its knowledge. It has no knowledge. It is a machine
and goes for the best binary approximation it can find. Here is a
hexadecimal story

. di %21x int(99.14 * 100)/100
+1.8c8f5c28f5c29X+006

No, it's not transparent to me either, but this is the closest that
users can get to seeing how Stata thinks of this problem. Here is a
decimal representation of that

. di %21.18f int(99.14 * 100)/100
99.140000000000001000

and with the format used this is acceptable as the right answer. But
that isn't what Zach did that he found puzzling. By default
-generate- produces float variables and there aren't enough bits in
those to get what Zach sees as being the right answer.

. set obs 1
obs was 0, now 1

. gen v1 = int(99.14 * 100)/100

. di %21.18f v1[1]
99.139999389648437000

This is only a smidgen under 99.14, but the
difference is enough to be noticeable in Zach's
results.
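
The same float storage also accounts for the 99.13 in Zach's v2 and
for the 99.14 in his v3. As a minimal sketch, assuming the default
float storage type is in effect (the variable name x is mine, not
Zach's), the intermediate steps can be displayed one at a time:

```stata
clear
set obs 1
* x is stored as a float, so it holds the nearest float to 99.14
gen x = 99.14
* the stored value, shown with more digits than %9.5f allows
di %21.18f x[1]
* the product is worked out in double precision and falls just short of 9914
di %21.18f x[1] * 100
* truncating that product therefore gives 9913, hence 99.13
di int(x[1] * 100)
* rounding the product to float precision first lands on 9914,
* which is what happens when it is stored in a float variable (v3)
di %21.18f float(x[1] * 100)
```

That last step is why -replace v3 = int(v3)- reported no changes: by
the time -int()- saw v3 it already held a whole number.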

With a -double-, you can reproduce what I did with
-display-:

. gen double V1 = int(99.14 * 100)/100

. di %21.18f V1[1]
99.140000000000001000
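
Applied to Zach's own sequence, the same idea might look like the
following. This is a sketch only; d1 and d2 are illustrative names of
mine.

```stata
clear
set obs 1
* hold both the input and the truncated result in doubles
gen double d1 = 99.14
gen double d2 = int(d1 * 100) / 100
* should agree with the -display- result for int(99.14 * 100)/100
di %21.18f d2[1]
```

If most of your variables need this precision, -set type double-
makes -generate- create doubles by default.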

Otherwise put, 14/100 = 7/50 is an exact decimal, but its binary
representation repeats for ever and so would require an infinite
number of bits. Any float or double can hold only an approximation.
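
One way to see this for yourself, sketched with -display- and the
same %21x format as above: a fraction whose denominator is a power
of 2 has a short hexadecimal form, while 0.14 has to be rounded off
at the last available bit.

```stata
* 0.5 = 1/2 needs a single bit
di %21x 0.5
* 0.14 = 7/50 has a repeating binary expansion, rounded to fit a double
di %21x 0.14
* the same value at float precision keeps fewer bits still
di %21x float(0.14)
```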

The same issue is discussed at

FAQ . . . . . . . . . . . . . . Results of the mod(x,y) function
2/03    Why does the mod(x,y) function sometimes give puzzling results?
        Why is mod(0.3,0.1) not equal to 0?
        http://www.stata.com/support/faqs/data/mod.html

and in a more recent Mata matters column by William Gould.

Nick
[email protected]

Zachary Harrison

Here is a very simplified example demonstrating how
int() appears to not be properly truncating. What am
I missing here?

. set obs 1
obs was 0, now 1

. gen v1 = 99.1400000

. format v1 %9.5f

. list

     +----------+
     |       v1 |
     |----------|
  1. | 99.14000 |
     +----------+

. gen v2 = int(v1 * 100)/100

. gen v3 = v1 * 100

. replace v3 = int(v3)
(0 real changes made)

. replace v3 = v3 / 100
(1 real change made)

. format v2 %9.5f

. format v3 %9.5f

. list

     +--------------------------------+
     |       v1         v2         v3 |
     |--------------------------------|
  1. | 99.14000   99.13000   99.14000 |
     +--------------------------------+

I realize I can do my truncation in more than 1 step,
as v3 does, but would like to know what is different
here. I also of course realize the effect of
truncating 99.14 to 2 places is no change!

I am using Intercooled Stata 8.2 for Windows.
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/



 