Re: st: How to get rid of leading and trailing letters and symbols?


From   Steve Nakoneshny <scnakone@ucalgary.ca>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: How to get rid of leading and trailing letters and symbols?
Date   Wed, 26 Oct 2011 08:41:53 -0600

Assuming that the id element you want to extract is always preceded by "id=" and is always a string of seven digits, try this:

gen newid = regexs(1) if regexm(oldid, "id=([0-9][0-9][0-9][0-9][0-9][0-9][0-9])")


It should extract only the numeric component. To populate an indicator variable for the number's relative position in the string, I would suggest looking up -help strpos()- and -help length()-.
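
A minimal sketch of how the extraction and the position indicator might fit together, using the two example links from the original question. The variable name trailing is a hypothetical choice, and the sketch assumes the seven-digit id does not also occur earlier in the link, since strpos() reports the first occurrence only.

```stata
* Sketch only: oldid and newid follow the names used above;
* "trailing" is a hypothetical variable name.
clear
input str40 oldid
"/profile/?id=9596986"
"/profile/?id=9591886&reftype=detail"
end

* Capture the seven digits that follow "id=" (subexpression 1)
gen newid = regexs(1) if regexm(oldid, "id=([0-9][0-9][0-9][0-9][0-9][0-9][0-9])")

* The id is trailing when its last digit is the last character of oldid
gen byte trailing = (strpos(oldid, newid) + length(newid) - 1 == length(oldid)) if newid != ""

list oldid newid trailing
```

With these two observations, trailing should come out as 1 for the first link and 0 for the second. Note that newid is a string variable; -destring- would convert it if a numeric id is needed.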


Steve

On 2011-10-26, at 4:00 AM, Ulrich Kohler wrote:

> Ekaterina,
> 
> you should be able to get that using regular expressions (see help regexp). I don't
> use regular expressions very often in Stata, but in my favourite editor,
> Emacs, the regular expression to find a number of arbitrary length
> would be 
> 
> \([0-9]+\)
> 
> which would store the number in \1. The Stata regular expression should
> work very similarly. 
> 
> Uli
> 
> 
> 
> 
> On Wednesday, 26.10.2011, at 10:37 +0100, Ekaterina Hertog wrote:
>> Dear All,
>> I have a dataset where the id variable is part of a web link. It 
>> can contain letters followed by the id number (e.g. 
>> /profile/?id=9596986) or it can contain the id number in the middle 
>> (e.g. /profile/?id=9591886&reftype=detail). I need to create a variable 
>> which will only contain the number that is part of the id variable. I 
>> also need to be able to distinguish between the cases where the number 
>> is trailing vs. cases where it is in the middle. I looked at the advice 
>> available on removing leading or trailing 0s in Stata 11 
>> (http://www.stata.com/support/faqs/data/leadingzeros.html), but in my 
>> case I cannot actually specify the letters and symbols that lead or 
>> trail so I am stuck. I use Stata 11.
>> Any advice will be very much appreciated,
>> Ekaterina
>> 
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
> 
> 



