Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: puzzling string conversion


From   Dimitri Szerman <[email protected]>
To   statalist <[email protected]>
Subject   st: puzzling string conversion
Date   Thu, 10 Feb 2011 14:57:32 +0000

Hi again,

I got a puzzling result. I have a string variable, mystring, which
contains both numeric and non-numeric characters. I'd like to keep
only the numeric ones and form a numeric variable from them (in fact,
it's going to be an id). I'm using regular expressions, and this is
what I'm doing:

input str30 mystring
"111.aaa.22.2/33-33"
"011.xyz.22.2/33-33"
"101.abc.22.2/33-33"
"222.foo.22.2/33-33"
"111.bla.22.2/33-33"
end

gen id = mystring
while regexm(id, "[^0-9]" ) {
 replace id = regexr(id,"[^0-9]","")
}
destring id, gen(numid)

This works fine. However, if mystring has an observation that
contains very few non-numeric characters compared with the other
observations, it seems to break down:

clear
input str30 mystring
"A"
"011.xyz.22.2/33-33"
"101.abc.22.2/33-33"
"222.foo.22.2/33-33"
"111.bla.22.2/33-33"
end

gen id = mystring
while regexm(id, "[^0-9]" ) {
 replace id = regexr(id,"[^0-9]","")
}
destring id, gen(numid)

Am I missing something? Why doesn't this work? Any suggestions?

Thanks,
Dimitri

