 Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: RE: RE: Change roman to Arabic numerals

From: Nick Cox
To: 'statalist@hsphsun2.harvard.edu'
Subject: st: RE: RE: Change roman to Arabic numerals
Date: Mon, 20 Dec 2010 17:45:21 +0000

```
I pushed this a bit further. A help file (not included here) spells out the assumptions (and limitations) which can also be inferred from the code. I'll ask Kit Baum to put this up on SSC to complete the loop.

Mata isn't essential for this problem, but it makes the problem more fun.

*! 1.0.0 NJC 20 December 2010
program romantoarabic
    version 9
    syntax varname(string) [if] [in] , Generate(str)

    quietly {
        // mark the observations to use; strok allows a string variable
        marksample touse, strok
        count if `touse'
        if r(N) == 0 error 2000

        confirm new variable `generate'

        // work on a trimmed, upper-case copy of the input
        tempvar work
        gen `work' = upper(trim(itrim(`varlist'))) if `touse'
        gen `generate' = .
        mata : roman_to_arabic("`work'", "`generate'", "`touse'")

        // anything left in the work variable was not parsed: return missing
        count if `work' != "" & `touse'
        replace `generate' = . if `work' != "" & `touse'
    }

    // list any input that left a residue
    if r(N) {
        di _n as txt "Problematic input: "
        list `varlist' if `work' != "" & `touse'
    }
end

mata :

void roman_to_arabic(string scalar varname,
                     string scalar genname,
                     string scalar usename) {
    string colvector work
    real colvector y

    work = st_sdata(., varname, usename)
    y = J(rows(work), 1, 0)

    // subtractive pairs first: each is counted at most once, then removed
    y = y + 900 * (strpos(work, "CM") :> 0)
    work = subinstr(work, "CM", "", .)
    y = y + 400 * (strpos(work, "CD") :> 0)
    work = subinstr(work, "CD", "", .)
    y = y + 90 * (strpos(work, "XC") :> 0)
    work = subinstr(work, "XC", "", .)
    y = y + 40 * (strpos(work, "XL") :> 0)
    work = subinstr(work, "XL", "", .)
    y = y + 9 * (strpos(work, "IX") :> 0)
    work = subinstr(work, "IX", "", .)
    y = y + 4 * (strpos(work, "IV") :> 0)
    work = subinstr(work, "IV", "", .)

    // single symbols: each pass counts and strips one occurrence per element
    while (sum(strpos(work, "M"))) {
        y = y + 1000 * (strpos(work, "M") :> 0)
        work = subinstr(work, "M", "", 1)
    }

    while (sum(strpos(work, "D"))) {
        y = y + 500 * (strpos(work, "D") :> 0)
        work = subinstr(work, "D", "", 1)
    }

    while (sum(strpos(work, "C"))) {
        y = y + 100 * (strpos(work, "C") :> 0)
        work = subinstr(work, "C", "", 1)
    }

    while (sum(strpos(work, "L"))) {
        y = y + 50 * (strpos(work, "L") :> 0)
        work = subinstr(work, "L", "", 1)
    }

    while (sum(strpos(work, "X"))) {
        y = y + 10 * (strpos(work, "X") :> 0)
        work = subinstr(work, "X", "", 1)
    }

    while (sum(strpos(work, "V"))) {
        y = y + 5 * (strpos(work, "V") :> 0)
        work = subinstr(work, "V", "", 1)
    }

    while (sum(strpos(work, "I"))) {
        y = y + (strpos(work, "I") :> 0)
        work = subinstr(work, "I", "", 1)
    }

    // store the values, and write back whatever text could not be parsed
    st_store(., genname, usename, y)
    st_sstore(., varname, usename, work)
}

end

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: 17 December 2010 19:39
To: 'statalist@hsphsun2.harvard.edu'
Subject: st: RE: Change roman to Arabic numerals

I think anyone tempted to write this would be best advised to extract the subtraction parts of the syntax first, i.e. CM etc.

(Also, from what I recall IIII is sometimes allowed as a non-standard variant of IV.)

Here is one stab. This is a Mata function that works on a string vector of Roman numerals in upper case.

Example first:

. mata

: stuff = ("IV", "MCMIV")

: roman_to_arabic(stuff)
          1      2
    +---------------+
  1 |     4   1904  |
    +---------------+

: roman_to_arabic(stuff')
          1
    +--------+
  1 |     4  |
  2 |  1904  |
    +--------+

: end

Code second:

mata :

real roman_to_arabic(string vector roman) {

    numeric vector ro
    string vector work
    ro = J(rows(roman), cols(roman), 0)
    work = roman

    // subtractive pairs first: each is counted at most once, then removed
    ro = ro + 900 * (strpos(work, "CM") :> 0)
    work = subinstr(work, "CM", "", .)
    ro = ro + 400 * (strpos(work, "CD") :> 0)
    work = subinstr(work, "CD", "", .)
    ro = ro + 90 * (strpos(work, "XC") :> 0)
    work = subinstr(work, "XC", "", .)
    ro = ro + 40 * (strpos(work, "XL") :> 0)
    work = subinstr(work, "XL", "", .)
    ro = ro + 9 * (strpos(work, "IX") :> 0)
    work = subinstr(work, "IX", "", .)
    ro = ro + 4 * (strpos(work, "IV") :> 0)
    work = subinstr(work, "IV", "", .)

    // single symbols: each pass counts and strips one occurrence per element
    while (sum(strpos(work, "M"))) {
        ro = ro + 1000 * (strpos(work, "M") :> 0)
        work = subinstr(work, "M", "", 1)
    }

    while (sum(strpos(work, "D"))) {
        ro = ro + 500 * (strpos(work, "D") :> 0)
        work = subinstr(work, "D", "", 1)
    }

    while (sum(strpos(work, "C"))) {
        ro = ro + 100 * (strpos(work, "C") :> 0)
        work = subinstr(work, "C", "", 1)
    }

    while (sum(strpos(work, "L"))) {
        ro = ro + 50 * (strpos(work, "L") :> 0)
        work = subinstr(work, "L", "", 1)
    }

    while (sum(strpos(work, "X"))) {
        ro = ro + 10 * (strpos(work, "X") :> 0)
        work = subinstr(work, "X", "", 1)
    }

    while (sum(strpos(work, "V"))) {
        ro = ro + 5 * (strpos(work, "V") :> 0)
        work = subinstr(work, "V", "", 1)
    }

    while (sum(strpos(work, "I"))) {
        ro = ro + (strpos(work, "I") :> 0)
        work = subinstr(work, "I", "", 1)
    }

    return(ro)
}

end

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Lachenbruch, Peter
Sent: 17 December 2010 18:50
To: 'statalist@hsphsun2.harvard.edu'
Subject: st: Change roman to Arabic numerals

A colleague wants to generate Arabic numbers from Roman numerals and I was wondering if anyone has written a routine for this.  She only has I to X so I suggested gen numb=(rom=="I")+2*(rom=="II")+3*(rom=="III")+4*(rom=="IV") etc.
This is OK for this application, but not if we have many numbers.  Of course the ordering gets messed up - I, II, III, IV, IX, V, VI, VII, VIII, X - so encode won't work and gen numb=real(rom) won't do either.
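
* A minimal sketch for the I-to-X case only, assuming clean upper-case input.
* Defining the value label before calling encode fixes the codes, so each
* numeral maps to its value rather than to its alphabetical rank.
* The names romlbl and numb are illustrative, not from the original question.
clear
input str4 rom
"IX"
"I"
"IV"
"X"
"V"
end
label define romlbl 1 "I" 2 "II" 3 "III" 4 "IV" 5 "V" 6 "VI" 7 "VII" 8 "VIII" 9 "IX" 10 "X"
* noextend makes encode stop with an error if rom holds a value not in romlbl
encode rom, generate(numb) label(romlbl) noextend
sort numb
list rom numb, nolabel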

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
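
The idea running through the thread, taking the subtractive pairs (CM, CD, XC, XL, IX, IV) before the single symbols, can also be written in table-driven form. What follows is a minimal sketch, not the code posted above: `roman_value()` and its internal names are hypothetical, it handles one string scalar at a time, and unlike the posted function it counts a repeated pair every time it occurs. Input is assumed to be upper case with no spaces.

```stata
mata:

// Hypothetical helper: convert one upper-case Roman numeral to a number.
// Returns missing (.) if any character is left over after parsing.
real scalar roman_value(string scalar roman)
{
    string rowvector sym
    real rowvector   val
    string scalar    work
    real scalar      total, i

    // subtractive pairs come first so that, e.g., the C in CM is not read as 100
    sym = ("CM", "CD", "XC", "XL", "IX", "IV", "M", "D", "C", "L", "X", "V", "I")
    val = (900, 400, 90, 40, 9, 4, 1000, 500, 100, 50, 10, 5, 1)

    work  = roman      // copy, so the caller's argument is not modified
    total = 0
    for (i = 1; i <= cols(sym); i++) {
        // count and strip one occurrence at a time
        while (strpos(work, sym[i])) {
            total = total + val[i]
            work  = subinstr(work, sym[i], "", 1)
        }
    }
    return(work == "" ? total : .)
}

roman_value("MCMIV")
roman_value("XCIX")
roman_value("ABC")

end
```

Run from a do-file, the three calls should display 1904, 99 and missing (.), the last because A and B remain after the C has been stripped. Keeping the symbols and values in parallel vectors puts the order of extraction in one place, which is the part of the method that matters.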