



Re: st: regular expression or some simpler data extraction method


From   Matthew White <[email protected]>
To   [email protected]
Subject   Re: st: regular expression or some simpler data extraction method
Date   Wed, 16 Nov 2011 15:01:27 -0500

Hi Ben,

Scratch that; the "[ ,]?" isn't a good idea. The following should work
as long as there aren't codes other than "WT" that start with "WT":

gen vi_tnum = substr(regexs(0), 1, strpos(regexs(0), " ") - 1) if regexm(phase, "[0-9]+ WT")
destring vi_tnum, replace
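
For anyone who wants to try this on the example values from the original question, here is a minimal sketch. The dataset is rebuilt by hand from the five strings in Ben's post; the variable vi_tnum2 and the capture-group variant that creates it are illustrative assumptions and were not proposed in this thread.

```stata
* Rebuild the five example values of phase from the original question.
clear
input str20 phase
"1 PV, 5 CC, 37 WT"
"101 WT"
"2 PV, 9 WT"
"1 WT"
"38 WT"
end

* regexm() in the if qualifier finds the first "<digits> WT" in each string;
* regexs(0) then returns that whole match (for example "37 WT").
* substr() keeps everything before the first space, i.e. the digits.
gen vi_tnum = substr(regexs(0), 1, strpos(regexs(0), " ") - 1) if regexm(phase, "[0-9]+ WT")
destring vi_tnum, replace

* Hypothetical alternative: capture only the digits with a subexpression
* and convert them with real(), which avoids substr() and destring.
gen vi_tnum2 = real(regexs(1)) if regexm(phase, "([0-9]+) WT")

list phase vi_tnum vi_tnum2
```

On these five rows both variables should come out as 37, 101, 9, 1 and 38. The earlier attempt quoted below most likely lost a leading digit because its optional "[ 0-9A-Z]?" can consume the first digit before the "([0-9]+)" group starts matching.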

Best,
Matt

On Wed, Nov 16, 2011 at 2:56 PM, Matthew White
<[email protected]> wrote:
> Hi Ben,
>
> How about:
> gen vi_tnum = regexs(0) if regexm(phase, "[0-9]+ WT[ ,]?")
>
> Best,
> Matt
>
> On Wed, Nov 16, 2011 at 2:34 PM, Ben Hoen <[email protected]> wrote:
>> I have a string variable with a number of possible variations from which I
>> am trying to extract a portion, and I am having trouble figuring out the
>> correct regular expression or, for that matter, whether I can punt and use
>> another (hopefully simpler) expression.
>>
>> The string variable is named "phase"
>>
>> 1 PV, 5 CC, 37 WT
>> 101 WT
>> 2 PV, 9 WT
>> 1 WT
>> 38 WT
>>
>> All I am concerned with is the number directly in front of "WT".  That
>> number can be 1, 2 or 3 digits.
>>
>> The closest iteration was:
>> gen vi_tnum=regexs(1) if regexm(phase, "[ 0-9A-Z]?[\,]?([0-9]+)[ A-Z]+$")
>>
>> but that produced a result that had its leading digit truncated for any
>> number with more than one digit.
>>
>> Any thoughts, oh brilliant ones?
>>
>> Ben Hoen
>> Principal Research Associate
>> Lawrence Berkeley National Laboratory
>> Office: 845-758-1896
>> Cell: 718-812-7589
>> [email protected]
>> http://eetd.lbl.gov/ea/emp/staff/hoen.html
>>
>>
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>
>
>
> --
> Matthew White
> Data Coordinator
> Innovations for Poverty Action
> 101 Whitney Avenue, New Haven, CT 06510 USA
> +1 434-305-9861
> www.poverty-action.org
>



-- 
Matthew White
Data Coordinator
Innovations for Poverty Action
101 Whitney Avenue, New Haven, CT 06510 USA
+1 434-305-9861
www.poverty-action.org

