
From: wgould@stata.com (William Gould, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: tempfile already exists
Date: Thu, 30 Aug 2007 09:23:38 -0500

After my posting about how tempfiles are named, Jeph Herrin <junk@spandrel.net> wrote,

> One oddity: base-34?

and then followed up,

> I know well what base-34 means, but I have never
> seen used before. It seems a strange choice - even
> base-36 makes more sense.

I have no good explanation for our choice of base 34 over, say, base 32 or even base 36.

First, some of you may be wondering why any base other than base 10. In base 34, the top digit is "X". We use six base-34 digits in a tempfile name, so the largest number, XXXXXX, equals 1,544,804,415 in base 10. Thus we are able to record in six characters what would take ten characters in base 10. That was important in the old days of the 8.3 filenaming convention.

Now, had the programmer (that would be me) been thinking clearly, he would have chosen base 32 because one can convert from base 2 to base 32 more quickly than one can convert from base 2 to base 34. Base 2 is important because that is how numbers are actually stored inside the computer.

Anyway, for base 2 to base 32 conversion, there's a trick: take the base-2 number in 5-digit groups, starting from the right, and separately translate each group into a single base-32 digit. Each digit can be translated separately! For instance, the base-10 number 42 in base 2 is 101010. To translate to base 32, start by noting that 101010 in 5-bit groups is 1-01010, where "-" is just a dash, not a minus. 1 base 2 translates to 1 base 32. 01010 base 2 translates to "A" base 32. Thus, the base-32 equivalent is 1A. It's rather fun to prove that, when one base is a power of the other, one can translate between them a digit at a time.

Jeph Herrin <junk@spandrel.net> wrote "even base 36 makes more sense". Jeph is right. Even though, binarywise, there is nothing special about 36, base 36 would have used all the letters, and then the largest number would have been ZZZZZZ, equal to 2,176,782,335 in base 10.
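For anyone who wants to check the arithmetic above, here is a minimal Python sketch. It is not Stata's actual routine: the helper names (to_base, binary_to_base32) are hypothetical, and the digit alphabet 0-9 followed by A-Z is an assumption, though it is consistent with "X" being the top digit in base 34.

```python
import string

# Assumed digit alphabet: 0-9 then A-Z. Base 34 uses the first 34 symbols
# (0-9, A-X); base 36 uses all of them. This is an illustration only.
DIGITS = string.digits + string.ascii_uppercase


def to_base(n, base):
    """Convert a non-negative integer to a digit string in the given base."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, base)  # peel off the least significant digit
        out.append(DIGITS[r])
    return "".join(reversed(out))


def binary_to_base32(bits):
    """Translate a base-2 string to base 32 one 5-bit group at a time."""
    # Left-pad with zeros so the length is a multiple of 5.
    width = -(-len(bits) // 5) * 5
    padded = bits.zfill(width)
    groups = [padded[i:i + 5] for i in range(0, len(padded), 5)]
    # Each 5-bit group maps to exactly one base-32 digit.
    digits = "".join(DIGITS[int(g, 2)] for g in groups)
    return digits.lstrip("0") or "0"


print(int("XXXXXX", 34))           # 1544804415
print(int("ZZZZZZ", 36))           # 2176782335
print(to_base(42, 2))              # 101010
print(binary_to_base32("101010"))  # 1A
print(to_base(42, 32))             # 1A
```

The group-at-a-time function and the general divide-and-remainder function agree on 42, which is the point of the trick: for base 32 no division is needed, only a table lookup per 5-bit group.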
Interestingly, someone at Stata evidently noted the conceptual mistake in the use of base 34 for making tempfile names, because the routine that makes temporary variable names does use base 36: temporary variable names in Stata are of the form two underscores followed by a six-digit base-36 number.

-- Bill
wgould@stata.com

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

