gen add = regexs(1) if regexm(address, "^([^,]+),")
also works. Stata's regex quantifiers are greedy, so "(.+)\," would run to
the last comma; the negated class [^,] stops the match at the first comma.
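
To check which comma a given pattern stops at, a small sketch along these
lines puts a greedy pattern next to a negated character class. The variable
names greedy and upto_first are only illustrative, and the two addresses are
taken from the example data in this thread:

*************
clear
input str60 address
"4905 Lakeway Drive, College Station, Texas 77845 USA"
"6 West Central St, Tempe AZ 80068"
end

* greedy: (.+) takes as much as it can, so the match ends at the last comma
gen greedy = regexs(1) if regexm(address, "(.+),")

* negated class: [^,]+ cannot cross a comma, so the match ends at the first one
gen upto_first = regexs(1) if regexm(address, "^([^,]+),")

list greedy upto_first, noobs
*************

The two variables should differ only for addresses with more than one comma.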
-Steve
On Mon, Jan 11, 2010 at 9:55 AM, joe j <[email protected]> wrote:
> thank you!
>
> On Mon, Jan 11, 2010 at 2:38 PM, Martin Weiss <[email protected]> wrote:
>>
>> If you do insist on using -string- functions (see [D], p. 224):
>>
>>
>> *************
>> clear
>> input str60 address
>> "4905 Lakeway Drive, College Station, Texas 77845 USA"
>> "673 Jasmine Street, Los Angeles, CA 90024"
>> "2376 First street, San Diego, CA 90126"
>> "6 West Central St, Tempe AZ 80068"
>> "1234 Main St. Cambridge, MA 01238-1234"
>> end
>>
>> compress
>>
>> gen str25 first=substr(address, 1, strpos(address, ",")-1)
>> l address first, noo
>> *************
>>
>>
>>
>> HTH
>> Martin
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of joe j
>> Sent: Monday, 11 January 2010 14:23
>> To: [email protected]
>> Subject: Re: st: AW: regular expression matching
>>
>> fantastic! thanks much Martin.
>>
>> On Mon, Jan 11, 2010 at 2:07 PM, Martin Weiss <[email protected]> wrote:
>>>
>>> *************
>>> clear
>>> input str60 address
>>> "4905 Lakeway Drive, College Station, Texas 77845 USA"
>>> "673 Jasmine Street, Los Angeles, CA 90024"
>>> "2376 First street, San Diego, CA 90126"
>>> "6 West Central St, Tempe AZ 80068"
>>> "1234 Main St. Cambridge, MA 01238-1234"
>>> end
>>>
>>> split address, parse(,)
>>> ren address1 first
>>>
>>> l address first, noo
>>> *************
>>>
>>>
>>>
>>> HTH
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of joe j
>>> Sent: Monday, 11 January 2010 14:03
>>> To: [email protected]
>>> Subject: st: regular expression matching
>>>
>>> From a string address variable I want to extract the portion of the
>>> text preceding the 'first' comma.
>>>
>>> Let me illustrate this with the following example:
>>>
>>> clear
>>> input str60 address
>>> "4905 Lakeway Drive, College Station, Texas 77845 USA"
>>> "673 Jasmine Street, Los Angeles, CA 90024"
>>> "2376 First street, San Diego, CA 90126"
>>> "6 West Central St, Tempe AZ 80068"
>>> "1234 Main St. Cambridge, MA 01238-1234"
>>> end
>>>
>>> From the address column, I want to create a column named First:
>>>
>>> 4905 Lakeway Drive
>>> 673 Jasmine Street
>>> 2376 First street
>>> 6 West Central St
>>> 1234 Main St. Cambridge
>>>
>>> I tried the following:
>>> gen first = regexs(1) if (regexm(address, "(.*)[,]"))
>>>
>>> This however extracts everything in address preceding the last comma,
>>> not the first comma.
>>>
>>> Any pointers would be appreciated.
>>> JJ
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/statalist/faq
>>> * http://www.ats.ucla.edu/stat/stata/
>>>
--
Steven Samuels
[email protected]
18 Cantine's Island
Saugerties NY 12477
USA
845-246-0774