From: wgould@stata.com (William Gould, Stata)
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Floating-point reals, hexadecimal and decimals
Date: Mon, 07 Jul 2003 08:13:09 -0500

Patrick Joly <Joly.Patrick@ic.gc.ca> asked about writing a Perl script to read/write Stata's missing values and then later answered his own question.

Patrick is obviously facile at working with IEEE floating-point numbers, which is how our computers store numbers such as 1.5, 3.14159, etc. Just in case any of you ever have to work directly with the IEEE format, I wanted to tell you about one documented feature and four undocumented features in Stata that can help with understanding.

The documented feature is the %21x format. It displays floating-point numbers in a "readable" way that is a bit-by-bit accurate representation of the number the computer really has stored:

    . display %21x 1.5
    +1.8000000000000X+000

Numbers are displayed in hexadecimal multiplied by a power of 2:

        +1.8000000000000X+000
         ---------------  ---
         base 16 number     \ power of 2

Ergo, the above number is (1 + 8/16) * 2^0 = 1.5 in decimal.

Pi in the %21x format looks like this:

    . display %21x _pi
    +1.921fb54442d18X+001

The %21x format is an accurate representation, but it is nonetheless a translation of the IEEE format. The point of %21x is really numerical analysis. If one is going to analyze the round-off error in some calculated result, it is really best to look at the number in the same base the computer uses. For instance, in %21x, one can readily see the effects of using float precision:

    . display %21x float(_pi)
    +1.921fb60000000X+001

This is rather lost in the base-10 translation, where _pi in %18.0g is 3.141592653589793 and float(_pi) is 3.141592741012573:

    Value of Pi       in %21x                   in %18.0g
    -------------------------------------------------------------
    double            +1.921fb54442d18X+001     3.141592653589793
    float             +1.921fb60000000X+001     3.141592741012573
                               -------
                                      \ float loses 7 hex digits

As another example of the use of %21x: How much inaccuracy is there in the calculation sqrt(2)^2?

    . display %21x 2 _n %21x sqrt(2)^2
    +1.0000000000000X+001
    +1.0000000000001X+001

Answer: 1 bit.
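For readers who want to check the %21x displays above without Stata at hand, here is a minimal sketch in Python. It is not part of Stata: the helper names stata_hex, to_float, and bits are hypothetical, the helper handles only normal, finite, nonzero doubles, and it assumes the exponent after the X is written as three hex digits.

```python
import math
import struct

def stata_hex(x):
    """Format a normal, finite double in the style of the %21x display.

    Hypothetical helper for illustration only; zero, subnormals, inf,
    and nan are rejected rather than guessed at.
    """
    if x == 0 or math.isinf(x) or math.isnan(x) or abs(x) < 2.0 ** -1022:
        raise ValueError("only normal, finite, nonzero doubles are handled")
    sign = "-" if x < 0 else "+"
    m, e = math.frexp(abs(x))          # abs(x) == m * 2**e with 0.5 <= m < 1
    frac = int((2 * m - 1) * 2 ** 52)  # the 52 fraction bits as an integer (exact)
    exp = e - 1                        # exponent for a mantissa in [1, 2)
    exp_sign = "-" if exp < 0 else "+"
    # Assumption: the exponent is written as three hex digits.
    return f"{sign}1.{frac:013x}X{exp_sign}{abs(exp):03x}"

def to_float(x):
    """Round a double to 4-byte float precision and return it as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def bits(x):
    """Return the 64-bit IEEE pattern of x as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

root2 = math.sqrt(2)
squared = root2 * root2                # plain multiply, correctly rounded

print(stata_hex(1.5))                  # +1.8000000000000X+000
print(stata_hex(math.pi))              # +1.921fb54442d18X+001
print(stata_hex(to_float(math.pi)))    # +1.921fb60000000X+001
print(stata_hex(squared))              # +1.0000000000001X+001
print(bits(squared) - bits(2.0))       # 1
```

The last line counts the distance between the two bit patterns directly, which is the same "1 bit" answer read off the final hex digit.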
By the way, %21x can be used as an *INPUT* format as well as an output format in Stata, and you can even use it in expressions:

    . display (1.921X+1)/2
    1.5705566

This is a great way to introduce constants in programs and be sure that you are using the same constant across platforms.

The %21x notation was "invented" here at Stata. At least, I have never seen this compact notation used in any computer science or numerical analysis book.

The four undocumented formats I want to mention are %16H, %16L, %8H, and %8L. These show exactly the bits of the floating-point number in IEEE format. %16H and %16L show the number in 8-byte (double) format; %8H and %8L show the number in 4-byte (float) format. H shows the number written from left-to-right, as is done by Suns and Macintoshes; L shows the number written from right-to-left, as is done by Intel-based computers.

The %16H format is almost readable (if you know what to look for). The %16L format nearly always confuses me because you read right-to-left little chunks that are themselves written left-to-right, and I find the %8H and %8L formats unreadable because of a bit-shift in the IEEE 4-byte format. It does not matter, however, because this is what the computer wants to see:

    . display %16H _pi
    400921fb54442d18

    . display %8H _pi
    40490fdb

    . display %16L _pi
    182d4454fb210940

    . display %8L _pi
    db0f4940

Patrick Joly might have found these last four formats useful were he not so facile with IEEE format. It is convenient that Stata includes both H and L, so one does not have to visit different computers to see the numbers left-to-right or right-to-left.

The %{8|16}{H|L} formats cannot be used as input formats, however.

-- Bill
wgould@stata.com

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
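The byte-level output of %16H, %16L, %8H, and %8L described above can be cross-checked outside Stata. The following is a minimal sketch in Python using the standard struct module; the helper name ieee_bytes and its parameters are hypothetical, and Python itself is an assumption, not something the post uses. It also shows Python's own hex-float input, which uses "0x" and "p" with a decimal exponent instead of the X notation.

```python
import math
import struct

def ieee_bytes(x, size=8, big_endian=True):
    """Return the raw IEEE 754 bytes of x as a hex string.

    size=8 packs a double, size=4 packs a float.
    big_endian=True writes the most significant byte first (the H order);
    big_endian=False writes the least significant byte first (the L order).
    Hypothetical helper for illustration only.
    """
    code = {8: "d", 4: "f"}[size]
    order = ">" if big_endian else "<"
    return struct.pack(order + code, x).hex()

print(ieee_bytes(math.pi, 8, True))    # 400921fb54442d18  (compare %16H)
print(ieee_bytes(math.pi, 4, True))    # 40490fdb          (compare %8H)
print(ieee_bytes(math.pi, 8, False))   # 182d4454fb210940  (compare %16L)
print(ieee_bytes(math.pi, 4, False))   # db0f4940          (compare %8L)

# Hex-float input: same value as the (1.921X+1)/2 expression in the post.
half = float.fromhex("0x1.921p+1") / 2
print(half)                            # 1.570556640625
```

One way to see why the 8-byte form is the more readable one: a double spends 1 sign bit plus 11 exponent bits, exactly three hex digits, so the digits 921fb54442d18 appear intact after the leading 400. A float spends 1 sign bit plus 8 exponent bits, 9 bits in all, so its fraction starts in the middle of a hex digit and the familiar digits are shifted.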

