
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: Removing a particular expression from string variable


From   Enayetur Raheem <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   st: Removing a particular expression from string variable
Date   Wed, 12 Oct 2011 17:03:42 +0000

Dear listers

I am trying to remove a particular expression/pattern from the start of an address field and keep the remaining portion. I can extract the portion I want to remove, but I have not been able to extract the remaining part.

Consider the following data:

clear
input str60 address
"#12-4905 Lakeway Drive, College Station, Texas 77845 USA"
"#12 - 673 Jasmine Street, Los Angeles, CA 90024"
"2376 First street, San Diego, CA 90126"
"66666 West Central St, Tempe AZ 80068"
"12345 Main St. Cambridge, MA 01238-1234"
"12345 Main St  Sommerville  MA 01239-2345"
"12345 Main St  Watertwon  MA 01239   USA"
end

I need to remove anything that starts with "#" and ends with "-" at the beginning of the field. That is, I need to remove

"#12-" from the first case
"#12 - " from the second case (note the space before and after the dash).
Other cases will remain intact. 

The following code gives me the expressions I want to remove. 
gen apt = regexs(0) if regexm(address, "(^[\#][0-9]+[ ]*[\-][ ]*)")

But I actually want to retain the remaining part. What would be the syntax for that? Any clue will be much appreciated. 
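
One possible approach, offered as a sketch rather than a definitive answer, is to delete the matched prefix with regexr(), which replaces the first match of a pattern with a given string and returns the input unchanged when nothing matches. The variable name rest is hypothetical, and the pattern is a simplified form of the one above, assuming that "#" and "-" need no escaping outside brackets.

```stata
clear
input str60 address
"#12-4905 Lakeway Drive, College Station, Texas 77845 USA"
"#12 - 673 Jasmine Street, Los Angeles, CA 90024"
"2376 First street, San Diego, CA 90126"
end

* Replace the leading "#<digits> - " prefix with an empty string.
* Rows without such a prefix are copied as they are.
* "rest" is an illustrative variable name, not part of the original data.
gen str60 rest = regexr(address, "^#[0-9]+[ ]*-[ ]*", "")

list address rest, clean noobs
```

With this sketch, the first two rows of rest should start at the street number (4905 and 673), and the third row should equal address.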

Thanks in advance.

Enayet


