
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Regular expressions


From   Joe Canner <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: Regular expressions
Date   Fri, 7 Mar 2014 13:41:05 +0000

Good point about the first digit :)

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
Sent: Friday, March 07, 2014 8:38 AM
To: [email protected]
Subject: Re: st: Regular expressions

Unsurprisingly, this is almost identical to Joe's.

I feel confident that the first digit must be 1 or 2.
Nick
[email protected]


On 7 March 2014 13:35, Nick Cox <[email protected]> wrote:
> clear
> set obs 1
> gen test = "Robin Hood (2000)"
> gen test2 = trim(regexr(test, "(\([1-2][0-9][0-9][0-9]\))", ""))
> list
> Nick
> [email protected]
>
>
> On 7 March 2014 13:28, Marco Savegnago <[email protected]> wrote:
>> Dear all,
>> as regards point 1), this might work:
>>
>> gen movie2 = rtrim(substr(movie, 1, index(movie, "(") - 1))
>>
>> I think it works as long as the title of the movie does not contain
>> any round brackets other than those around the year.
>>
>> What do you think?
>> best,
>> Marco
>>
>> 2014-03-07 12:49 GMT+01:00 Nick Cox <[email protected]>:
>>> Your second problem sounds like a job for -split-. I wouldn't reach for
>>> regular expressions there.
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 7 March 2014 11:40, Estrella Gomez <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> I would like to make two modifications to two string variables using
>>>> regular expressions:
>>>>
>>>> 1) I have a list of movie titles with a year included; for instance:
>>>> "Robin Hood (2010)". I would like to drop the years and the
>>>> parentheses, so the final value should be "Robin Hood". The number of
>>>> words in the title varies a lot across movies.
>>>>
>>>> 2) I have a variable indicating where the movie was produced. In some
>>>> cases there are several countries, for instance "UK, Germany, Canada,
>>>> Switzerland". I would like to generate one variable per country (the 1st
>>>> variable takes the value UK, the 2nd Germany, and so on). Again, the number
>>>> of countries per movie is not fixed; it varies from 1 to 4.
>>>>
>>>> Any suggestions?
>>>>
>>>> Thanks a lot,
>>>> Estrella
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> *   http://www.ats.ucla.edu/stat/stata/
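
For reference, the two suggestions in this thread can be put together in one short sketch. This is only an illustration under stated assumptions, not code posted by anyone on the list: the variable names movie, country, and title and the stub prod_ are hypothetical, and the single observation simply reuses the example values from the original question.

```stata
* Illustrative sketch; variable names and the stub prod_ are assumptions
clear
input str40 movie str40 country
"Robin Hood (2010)" "UK, Germany, Canada, Switzerland"
end

* Problem 1: remove a "(yyyy)" year whose first digit is 1 or 2,
* then trim the space left behind
gen title = trim(regexr(movie, "\([12][0-9][0-9][0-9]\)", ""))

* Problem 2: one new variable per country, parsing on commas
split country, parse(",") generate(prod_)

* The pieces after the first keep their leading space, so trim each one
foreach v of varlist prod_* {
    replace `v' = trim(`v')
}

list title prod_*
```

Because -split- creates only as many variables as the longest list needs, observations with fewer countries are left with empty strings in the later variables.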


© Copyright 1996–2018 StataCorp LLC