Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: brendan.halpin@ul.ie (Brendan Halpin)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: regular expression or some simpler data extraction method
Date: Wed, 16 Nov 2011 20:18:37 +0000

On Wed, Nov 16 2011, Ben Hoen wrote:

> I have a number of possible string variations from which I am trying to
> extract a portion of, and am having trouble figuring out the correct regular
> expression, or, for that matter, if I can punt and use another (hopefully
> simpler) expression.

Your regex doesn't need to describe the whole string. Assuming " WT" is
the end of each example, the following should work:

    . input str20 phase

                        phase
      1. "1 PV, 5 CC, 37 WT"
      2. "101 WT"
      3. "2 PV, 9 WT"
      4. "1 WT"
      5. "38 WT"
      6. end

    .
    . gen nwt = real(regexs(1)) if regexm(phase,"([0-9]+) WT$")

    .
    . list

         +-------------------------+
         |             phase   nwt |
         |-------------------------|
      1. | 1 PV, 5 CC, 37 WT    37 |
      2. |            101 WT   101 |
      3. |        2 PV, 9 WT     9 |
      4. |              1 WT     1 |
      5. |             38 WT    38 |
         +-------------------------+

Brendan
--
Brendan Halpin,  Department of Sociology,  University of Limerick,  Ireland
Tel: w +353-61-213147 f +353-61-202569 h +353-61-338562; Room F1-009 x 3147
mailto:brendan.halpin@ul.ie    ULSociology on Facebook: http://on.fb.me/fjIK9t
http://teaching.sociology.ul.ie/bhalpin/wordpress         twitter:@ULSociology

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
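
A note on the pattern used in the reply: the trailing "$" anchors the match to the end of the string, which is what the " WT"-at-the-end assumption relies on. If "WT" could also appear in the middle of the string, a minimal sketch of the same approach without the anchor might look like the following. The three example strings are made up for illustration and are not from the original question.

```stata
* Hypothetical data: "WT" is not always the last item in the string
clear
input str20 phase
"3 WT, 2 PV"
"1 PV, 12 WT, 4 CC"
"7 PV"
end

* Same idiom as in the reply, but without the end-of-string anchor "$":
* regexm() finds the first run of digits followed by " WT" anywhere in
* the string, and regexs(1) returns the digits captured by the parentheses.
gen nwt = real(regexs(1)) if regexm(phase, "([0-9]+) WT")

* Rows with no "WT" entry (the third one here) are left missing
list
```

Without the anchor the pattern would also match a longer token that merely begins with "WT", so the anchored version from the reply is the safer choice whenever the end-of-string assumption holds.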