st: regular expression or some simpler data extraction method


From   "Ben Hoen" <bhoen@lbl.gov>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: regular expression or some simpler data extraction method
Date   Wed, 16 Nov 2011 14:34:11 -0500

I have a string variable that takes a number of possible forms, and I am
trying to extract one portion of it. I am having trouble figuring out the
correct regular expression, or, for that matter, whether I can punt and use
another (hopefully simpler) expression.

The string variable is named "phase". Example values:

1 PV, 5 CC, 37 WT
101 WT
2 PV, 9 WT
1 WT
38 WT

All I am concerned with is the number directly in front of "WT".  That
number can be 1, 2 or 3 digits.

The closest iteration was:
gen vi_tnum=regexs(1) if regexm(phase, "[ 0-9A-Z]?[\,]?([0-9]+)[ A-Z]+$")

but that produced a result with its leading digit truncated whenever the
number had more than one digit.

Any thoughts, oh brilliant ones?

Ben Hoen
Principal Research Associate
Lawrence Berkeley National Laboratory
Office: 845-758-1896
Cell: 718-812-7589
bhoen@lbl.gov
http://eetd.lbl.gov/ea/emp/staff/hoen.html
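
A minimal sketch of one possible approach, not necessarily the answer the list settled on. In the pattern quoted above, the optional leading class [ 0-9A-Z]? includes digits, so when the number starts the string (as in "101 WT") that class can take the first digit and leave only "01" for the capture group; this is the likely cause of the truncation. The sketch below drops that prefix and captures only the digits that sit directly before " WT". It assumes exactly one space between the number and "WT", and it rebuilds the five example values with -input- purely for illustration.

```stata
* Rebuild the five example values from the post (illustration only).
clear
input str20 phase
"1 PV, 5 CC, 37 WT"
"101 WT"
"2 PV, 9 WT"
"1 WT"
"38 WT"
end

* Capture the run of digits immediately followed by " WT".
* Assumption: a single space separates the number from "WT".
gen vi_tnum = regexs(1) if regexm(phase, "([0-9]+) WT")

* regexs() returns a string, so convert the result to numeric.
destring vi_tnum, replace

list phase vi_tnum
```

With the example values this should give 37, 101, 9, 1 and 38. If "WT" is always the last entry, as it is in the examples, appending $ to the pattern ("([0-9]+) WT$") would make that requirement explicit.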


