



Re: st: regular expressions has too many literals


From   Steve Nakoneshny <scnakone@ucalgary.ca>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: regular expressions has too many literals
Date   Mon, 25 Feb 2013 20:54:22 -0700

Why not something like -encode string, generate(team)- and then -keep if team==1 | ... | team==n-? Same process, shorter expression.
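
A minimal sketch of that suggestion, under stated assumptions: the variable is called string, as in the original post, and each observation holds exactly one team name with no surrounding text. The range passed to inrange() is hypothetical and only shows the shape of the final condition. If the team names are embedded in longer text, -encode- will not substitute for regexm(), and the loop from the original post remains the safer route.

```stata
* Sketch only: assumes -string- holds exactly one team name per observation.
encode string, generate(team)   // numeric codes 1, 2, ... assigned in alphabetical order of the names
label list team                 // shows which code was assigned to which name
keep if inrange(team, 1, 5)     // hypothetical range; substitute the codes you actually want
```

Because -encode- numbers the distinct values alphabetically, adjacent codes can be collapsed into a single inrange() call instead of a long chain of | conditions.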

Sent via carrier pigeon

On 2013-02-25, at 8:43 PM, "Dimitriy V. Masterov" <dvmaster@gmail.com> wrote:

> I would like to do something like this:
> 
> keep if regexm(string,"Buffalo Bills") | regexm(string,"Dallas
> Cowboys") | regexm(string,"Miami Dolphins") | regexm(string,"New York
> Giants") | regexm(string,"New England Patriots") |
> regexm(string,"Philadelphia Eagles") | regexm(string,"New York Jets")
> | regexm(string,"Washington Redskins") | regexm(string,"Baltimore
> Ravens") | regexm(string,"Chicago Bears") |
> regexm(string,"Cincinnati Bengals") | regexm(string,"Detroit Lions") |
> regexm(string,"Cleveland Browns") | regexm(string,"Green Bay Packers")
> | regexm(string,"Pittsburgh Steelers") | regexm(string,"Minnesota
> Vikings") | regexm(string,"Houston Texans") | regexm(string,"Atlanta
> Falcons") | regexm(string,"Indianapolis Colts") |
> regexm(string,"Carolina Panthers") | regexm(string,"Jacksonville
> Jaguars") | regexm(string,"New Orleans Saints") |
> regexm(string,"Tennessee Titans") | regexm(string,"Tampa Bay
> Buccaneers") | regexm(string,"Denver Broncos") |
> regexm(string,"Arizona Cardinals") | regexm(string,"Kansas City
> Chiefs") | regexm(string,"San Francisco 49ers") |
> regexm(string,"Oakland Raiders") | regexm(string,"Seattle Seahawks") |
> regexm(string,"San Diego Chargers") | regexm(string,"St. Louis Rams")
> 
> Just looking at this, you know the expression is too long for Stata to
> evaluate. Is the only way around this to loop over the 32 team names
> like this:
> 
> gen keepers = .
> foreach team in "Buffalo Bills" "Dallas Cowboys" "Miami Dolphins" "New
> York Giants" "New England Patriots" "Philadelphia Eagles" "New York
> Jets" "Washington Redskins" "Baltimore Ravens" "Chicago Bears"
> "Cincinnati Bengals" "Detroit Lions" "Cleveland Browns" "Green Bay
> Packers" "Pittsburgh Steelers" "Minnesota Vikings" "Houston Texans"
> "Atlanta Falcons" "Indianapolis Colts" "Carolina Panthers"
> "Jacksonville Jaguars" "New Orleans Saints" "Tennessee Titans" "Tampa
> Bay Buccaneers" "Denver Broncos" "Arizona Cardinals" "Kansas City
> Chiefs" "San Francisco 49ers" "Oakland Raiders" "Seattle Seahawks"
> "San Diego Chargers" "St. Louis Rams" {
>     replace keepers = 1 if regexm(string,"`team'")
> }
> keep if keepers ==1
> 
> Or is there a more clever way?
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/


