

From: Steve Nakoneshny <scnakone@ucalgary.ca>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: How to get rid of leading and trailing letters and symbols?
Date: Wed, 26 Oct 2011 08:41:53 -0600

Assuming that the id element you want to extract is always preceded by "id=" and is always a string of 7 digits, try this:

    gen newid = regexs(1) if regexm(oldid, "id=([0-9][0-9][0-9][0-9][0-9][0-9][0-9])")

It should extract only the numeric component, because the parentheses capture just the seven digits that follow "id=".

In order to populate an indicator variable relating to the relative position in the string, I would suggest looking up -help strpos()- and -help length()-.

Steve

On 2011-10-26, at 4:00 AM, Ulrich Kohler wrote:

> Ekaterina,
>
> you should get that using regular expressions (see help regexp). I don't
> use regular expressions very often in Stata, but in my favourite editor,
> Emacs, the regular expression to find a number of arbitrary length
> would be
>
> \([0-9]+\)
>
> which would store the number in \1. The Stata regular expression should
> work very similarly.
>
> Uli
>
> On Wednesday, 26.10.2011, at 10:37 +0100, Ekaterina Hertog wrote:
>> Dear All,
>> I have got a dataset where the id variable is a part of a web-link. It
>> can contain letters followed by the id number (e.g.
>> /profile/?id=9596986) or it can contain the id number in the middle
>> (e.g. /profile/?id=9591886&reftype=detail). I need to create a variable
>> which will only contain the number that is part of the id variable. I
>> also need to be able to distinguish between the cases where the number
>> is trailing vs. cases where it is in the middle. I looked at the advice
>> available on removing leading or trailing 0s in Stata 11
>> (http://www.stata.com/support/faqs/data/leadingzeros.html), but in my
>> case I cannot actually specify the letters and symbols that lead or
>> trail so I am stuck. I use Stata 11.
>> Any advice will be very much appreciated,
>> Ekaterina

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
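
The strpos() and length() suggestion in the reply above can be sketched roughly as follows. This is only an illustration, not code from the thread: the two observations are the example links from the question, the variable names start and trailing are made up, and it assumes every id is exactly 7 digits and is preceded by "id=".

```stata
* Hypothetical example data built from the two link formats in the question
clear
input str40 oldid
"/profile/?id=9596986"
"/profile/?id=9591886&reftype=detail"
end

* Capture the seven digits that follow "id=" (assumes ids are exactly 7 digits)
gen newid = regexs(1) if regexm(oldid, "id=([0-9][0-9][0-9][0-9][0-9][0-9][0-9])")

* Position of the first digit: strpos() locates "id=", which is 3 characters long
gen start = strpos(oldid, "id=") + 3 if newid != ""

* The id is trailing when its last digit is also the last character of the link
gen byte trailing = (start + length(newid) - 1 == length(oldid)) if newid != ""

list oldid newid trailing
```

Here trailing is 1 when the link ends with the id and 0 when something such as "&reftype=detail" follows it. If some ids could be longer than 7 digits, the pattern would capture only the first seven, so it would need adjusting for such data.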
