Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

RE: st: RE: Change roman to Arabic numerals

From	"Lachenbruch, Peter" <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	RE: st: RE: Change roman to Arabic numerals
Date	Thu, 23 Dec 2010 10:03:08 -0800

With the hex material this should have occurred on Oct 31

Tony

Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Michael N. Mitchell
Sent: Tuesday, December 21, 2010 3:53 PM
To: [email protected]
Subject: Re: st: RE: Change roman to Arabic numerals

This reminds me of a practical joke we played on a mainframe programmer who *loved* to 
read Hex dumps. We got some greenbar paper and wrote a program that generated something 
that looked exactly like a Hex dump, but instead filled it all in with Roman numbers. It 
looked exactly right, except that instead of "A6", it would say "XIVI". We printed it and 
stood near him puzzling over it. He grabbed it from us and said "Let the expert handle 
this...". He was befuddled and shook his head for a bit and then the light bulb went on 
and he threw it back at us, giving some choice exclamations about our youth and parental 
heritage.

Michael N. Mitchell
Data Management Using Stata      - http://www.stata.com/bookstore/dmus.html
A Visual Guide to Stata Graphics - http://www.stata.com/bookstore/vgsg.html
Stata tidbit of the week         - http://www.MichaelNormanMitchell.com



On 2010-12-21 2.43 PM, Kieran McCaul wrote:
> ...
>
> I suppose this means that while we have Stata for Windows, Stata for Unix, and Stata for the Mac, we may never have Stata for Romans.
>
>
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: Tuesday, 21 December 2010 7:18 PM
> To: '[email protected]'
> Subject: RE: st: RE: Change roman to Arabic numerals
>
> Sergiy is commenting on the code I posted last Friday and not the expanded code I posted on Monday. But his main points remain correct.
>
> My code indulges many numerals that might be considered improper. It doesn't have much error checking. This raises a practical question of whether that matters to any particular user and a question of principle of whether it's worth adding a constraint on improper numerals, e.g. by some kind of regular expression definition.
>
> My interest was piqued by this problem because it seemed an amusing little programming problem.
>
> However, my interest peaked when I ran into the kind of fuzziness that Sergiy also encountered, e.g. would IL be allowed as 49? Once you find that authorities disagree, or do not provide rigorous definitions, there is no longer an absolutely precise problem to be solved.
>
> When made public, my code and help file could serve as stepping-stones for anyone else who wanted to take the problem further.
>
> Nick
> [email protected]
>
> Sergiy Radyakin
>
> Peter writes that his colleague had to deal only with Roman numerals in
> the range I-X. For that situation, a one-line alternative can be suggested:
>
>     generate byte arabic = strpos("=====I====II===III==IV===V====VI===VII==VIII=IX===X====", "="+roman+"=")/5 if !missing(roman)
>
> Below is a demo. Obviously the code is not easily extended to the
> larger ranges, or to include dual notation for 4: "IV" and "IIII".
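
A minimal sketch of why the division by 5 works, using made-up data. The variable name roman follows the line above; the toy values and the local macro key are assumptions for illustration only. Each numeral starts a 5-character slot in the key string, so the padded numeral "=" + roman + "=" is found at position 5*k for the k-th numeral.

```stata
* Toy data: a few numerals in the range I-X plus one value outside it.
clear
input str4 roman
"I"
"IV"
"VIII"
"X"
"XI"
end

* The key holds I to X, each left-aligned in a 5-character slot padded with "=".
local key "=====I====II===III==IV===V====VI===VII==VIII=IX===X===="

* strpos() gives 5*k for the k-th numeral and 0 when the numeral is not in the key.
generate byte arabic = strpos("`key'", "=" + roman + "=") / 5 if !missing(roman)

* Treat "not found" as missing rather than leaving a zero behind.
replace arabic = . if arabic == 0

list roman arabic
```

Here XI lies outside I-X, so strpos() returns 0 for it and the sketch recodes that to missing.
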
>
> Nick's code does a great job of converting numbers both small and
> large; however, it appears to be too robust, converting even
> misspelled Roman numerals such as IM (999) or IL (49). Both notations
> may occur in practice, depending on the source of the data. Wikipedia
> describes them as forms that "would not be generally accepted". I think
> it would be great to modify the code either to report them as
> erroneous (misspelled Roman numerals) or to convert them according to
> the likely intuition of the respondents (999 and 49 respectively), but
> not to incorrect values (1001 and 51 respectively), as the current
> version does.
>
> Numbers like "IIIIIIIIIIIIIIIIIIIII" are converted correctly, but
> perhaps it's better to report an argument error in such a case.
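
One way to get the "intuitive" reading mentioned here (IL as 49, IM as 999) is a right-to-left scan in which a letter smaller than its right-hand neighbour is subtracted. This is only a sketch under that assumption, not the method in Nick's posted code; the function names roman_letter() and roman_scan() are hypothetical, and the scan does no validation beyond rejecting unknown characters.

```stata
mata:

// Hypothetical helper: value of a single Roman letter,
// or missing (.) for any other character.
real scalar roman_letter(string scalar ch)
{
        real scalar k
        real rowvector vals

        vals = (1, 5, 10, 50, 100, 500, 1000)
        if (ch == "") return(.)
        k = strpos("IVXLCDM", ch)
        if (k == 0) return(.)
        return(vals[k])
}

// Right-to-left scan over one upper-case numeral: a letter smaller than
// the letter to its right is subtracted, otherwise it is added.
real scalar roman_scan(string scalar s)
{
        real scalar i, v, prev, total

        total = 0
        prev = 0
        for (i = strlen(s); i >= 1; i--) {
                v = roman_letter(substr(s, i, 1))
                if (v == .) return(.)
                if (v < prev) total = total - v
                else total = total + v
                prev = v
        }
        return(total)
}

roman_scan("IL")
roman_scan("IM")
roman_scan("MCMIV")
roman_scan("XK")

end
```

Under this rule IL gives 49, IM gives 999, MCMIV gives 1904, and XK gives missing because K is not one of the seven letters. Whether IL and IM should be accepted at all remains a matter of definition, as discussed above.
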
>
> Similarly, characters that are not used in Roman numerals are currently
> not reported as errors (e.g. "K" in "XK", although one could encounter
> "K" in exotic medieval Roman numerals...). This wouldn't be a problem if
> the program handled all symbols of the Roman numerals, but e.g. "S"
> (half) is not handled (again, see Wikipedia for reference).
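
If improper numerals should be reported rather than converted, one option is to flag them with a regular expression before any conversion. The sketch below uses Stata's regexm() with one common strict pattern; the pattern is an assumption, not an agreed standard, and it rejects IIII as well as IL, IM and XK. The toy data and the variable name proper are made up for illustration.

```stata
* Toy data: one standard numeral and four questionable ones.
clear
input str8 roman
"MCMIV"
"IL"
"IM"
"IIII"
"XK"
end

* Strict pattern (an assumption): thousands, then hundreds, tens and units,
* each allowing only the usual subtractive forms CM, CD, XC, XL, IX and IV.
* The pattern also matches the empty string, so that case is excluded.
generate byte proper = regexm(roman, "^M*(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$") & roman != ""

list roman proper
```

Observations with proper equal to 0 could then be listed for inspection or excluded before conversion. Anyone who wants to allow IIII, or symbols such as S, would have to loosen the pattern accordingly.
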
>
> [...]
>
> On Fri, Dec 17, 2010 at 2:38 PM, Nick Cox<[email protected]>  wrote:
>> I think anyone tempted to write this would be best advised to extract the subtraction parts of the syntax first, i.e. CM etc.
>>
>> (Also, from what I recall IIII is sometimes allowed as a non-standard variant of IV.)
>>
>> Here is one stab. This is a Mata function that works on a string vector of Roman numerals in upper case.
>>
>> Example first:
>>
>> . mata
>>
>> : stuff = ("IV", "MCMIV")
>>
>> : roman_to_arabic(stuff)
>>           1      2
>>     +---------------+
>>   1 |     4   1904  |
>>     +---------------+
>>
>> : roman_to_arabic(stuff')
>>           1
>>     +--------+
>>   1 |     4  |
>>   2 |  1904  |
>>     +--------+
>>
>> : end
>>
>> Code second:
>>
>> mata :
>>
>> // Convert a string vector of upper-case Roman numerals to numbers.
>> real roman_to_arabic(string vector roman) {
>>
>>         numeric vector ro
>>         string vector work
>>         // ro holds the running totals; work is a copy that is consumed
>>         ro = J(rows(roman), cols(roman), 0)
>>         work = roman
>>
>>         // Subtractive pairs first: count each pair once, then strip it
>>         ro = ro + 900 * (strpos(work, "CM") :> 0)
>>         work = subinstr(work, "CM", "", .)
>>         ro = ro + 400 * (strpos(work, "CD") :> 0)
>>         work = subinstr(work, "CD", "", .)
>>         ro = ro + 90 * (strpos(work, "XC") :> 0)
>>         work = subinstr(work, "XC", "", .)
>>         ro = ro + 40 * (strpos(work, "XL") :> 0)
>>         work = subinstr(work, "XL", "", .)
>>         ro = ro + 9 * (strpos(work, "IX") :> 0)
>>         work = subinstr(work, "IX", "", .)
>>         ro = ro + 4 * (strpos(work, "IV") :> 0)
>>         work = subinstr(work, "IV", "", .)
>>
>>         // Remaining letters are additive: each pass adds the letter's
>>         // value where it is present and strips one occurrence, until
>>         // no element contains that letter any more
>>         while (sum(strpos(work, "M"))) {
>>                 ro = ro + 1000 * (strpos(work, "M") :> 0)
>>                 work = subinstr(work, "M", "", 1)
>>         }
>>
>>         while (sum(strpos(work, "D"))) {
>>                 ro = ro + 500 * (strpos(work, "D") :> 0)
>>                 work = subinstr(work, "D", "", 1)
>>         }
>>
>>         while (sum(strpos(work, "C"))) {
>>                 ro = ro + 100 * (strpos(work, "C") :> 0)
>>                 work = subinstr(work, "C", "", 1)
>>         }
>>
>>         while (sum(strpos(work, "L"))) {
>>                 ro = ro + 50 * (strpos(work, "L") :> 0)
>>                 work = subinstr(work, "L", "", 1)
>>         }
>>
>>         while (sum(strpos(work, "X"))) {
>>                 ro = ro + 10 * (strpos(work, "X") :> 0)
>>                 work = subinstr(work, "X", "", 1)
>>         }
>>
>>         while (sum(strpos(work, "V"))) {
>>                 ro = ro + 5 * (strpos(work, "V") :> 0)
>>                 work = subinstr(work, "V", "", 1)
>>         }
>>
>>         while (sum(strpos(work, "I"))) {
>>                 ro = ro + (strpos(work, "I") :> 0)
>>                 work = subinstr(work, "I", "", 1)
>>         }
>>
>>         return(ro)
>> }
>>
>> end
>
> Lachenbruch, Peter
>
>> A colleague wants to generate Arabic numbers from Roman numerals and I was wondering if anyone has written a routine for this.  She only has I to X so I suggested gen numb=(rom=="I")+2*(rom=="II")+3*(rom=="III")+4*(rom=="IV")  etc.
>> This is OK for this application, but not if we have many numbers.  Of course the ordering gets messed up - I, II, III, IV, IX, V, VI, VII, VIII, X - so encode won't work and gen numb=real(rom) won't do either.
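
For the I to X case, a possible workaround for the ordering problem is to define the value label before calling encode, so that the codes follow the numerals rather than alphabetical order. This is a sketch with made-up data; the label name romlbl is an assumption, and rom and numb follow the names used above.

```stata
* Toy data in arbitrary order.
clear
input str4 rom
"IX"
"II"
"V"
"X"
end

* Define the value label first so that code k is attached to the k-th numeral.
label define romlbl 1 "I" 2 "II" 3 "III" 4 "IV" 5 "V" 6 "VI" 7 "VII" 8 "VIII" 9 "IX" 10 "X"

* With an existing label, encode uses its codes instead of alphabetical order;
* noextend makes encode stop with an error on strings not in the label.
encode rom, generate(numb) label(romlbl) noextend

* Show the underlying numeric codes next to the original strings.
list rom numb, nolabel
```

This only scales as far as one is willing to type out the label, so it is no substitute for a general conversion routine.
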
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
