Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Ben Hoen" <bhoen@lbl.gov> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: regular expression or some simpler data extraction method |
Date | Wed, 16 Nov 2011 15:27:25 -0500 |
Thanks again Matthew & Brendan. I realized that I had changed the variable name in the meantime to "phase_description", which was causing the type mismatch error. This syntax worked great!

    gen vi_tnum = regexs(1) if regexm(phase_description, "([0-9]+) WT$")

Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589

-----Original Message-----
From: Ben Hoen [mailto:bhoen@lbl.gov]
Sent: Wednesday, November 16, 2011 3:22 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: regular expression or some simpler data extraction method

Thanks Matthew. That didn't seem to work. I am getting a "type mismatch" error. Based on your first response I also tried:

    gen vi_tnum = regexs(1) if regexm(phase, "[\, ]?([0-9]+) WT$")

and

    gen vi_tnum = regexs(1) if regexm(phase, "[\, ]?([0-9]+)[ WT]$")

and got the same "type mismatch" error, so maybe they are related. I tried these because WT is always at the end of the string, so any comma would necessarily precede the digits and the WT. Maybe that was not clear originally.

Ben

Ben Hoen
Principal Research Associate
Lawrence Berkeley National Laboratory
Office: 845-758-1896
Cell: 718-812-7589
bhoen@lbl.gov
http://eetd.lbl.gov/ea/emp/staff/hoen.html

Re: st: regular expression or some simpler data extraction method
________________________________________
From Matthew White <mwhite@poverty-action.org>
To statalist@hsphsun2.harvard.edu
Subject Re: st: regular expression or some simpler data extraction method
Date Wed, 16 Nov 2011 15:01:27 -0500
________________________________________

Hi Ben,

Scratch that; the "[ ,]?" isn't a good idea. The following should work as long as there aren't codes other than "WT" that start with "WT":

    gen vi_tnum = substr(regexs(0), 1, strpos(regexs(0), " ") - 1) if regexm(phase, "[0-9]+ WT[ ,]?")
    destring vi_tnum, replace

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
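For readers who want to try the working command from this thread, here is a minimal sketch on toy data. The four strings are invented for illustration and are not Ben's actual data; only the variable name phase_description and the regexm()/regexs() call come from the messages above. The sketch assumes the count always sits immediately before a trailing " WT", as Ben describes.

```stata
* Toy data: these strings are hypothetical examples, not the original dataset
clear
input str40 phase_description
"Phase I, 12 WT"
"34 WT"
"Phase II 5 WT"
"no turbine count"
end

* regexm() needs a string argument; regexs(1) returns the first
* parenthesized group of the most recent successful match.
* Observations that do not match are left as empty strings.
gen vi_tnum = regexs(1) if regexm(phase_description, "([0-9]+) WT$")

* regexs() returns a string, so convert the result to numeric;
* empty strings become missing values
destring vi_tnum, replace

list phase_description vi_tnum
```

Two points may help when adapting this. First, regexm() gives a "type mismatch" error whenever its first argument is numeric rather than string, so it is worth confirming with describe that the name in the command refers to the intended string variable. Second, in the earlier attempt the bracketed "[ WT]$" is a character class matching a single space, W, or T at the end of the string, not the literal sequence " WT".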