
From: Nick Cox <njcoxstata@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: String function headache.

Date: Mon, 25 Apr 2011 10:47:48 +0100

To expand on this, with problem-solving hints.

Learning software from definitions is like learning mathematics from definitions. If you know the concept already, or are super-smart, you can see immediately what is implied. The rest of us need examples.

In my class learning mathematics in secondary [high] school, there was one guy who always seemed to understand each new mathematical idea immediately. (He became a mountaineer, but that is a different story: http://en.wikipedia.org/wiki/Alan_Rouse ). Almost all the rest of us needed examples. (In fact I now guess that he sometimes played small psychological games with us, as usually he had read ahead on his own.)

I don't think I've ever used -strmatch()- before answering this question. I've always used -strpos()- for finding literal matches or turned to -regex*()-. That just means what it says, but I too had to find out quite how -strmatch()- works.

In my experience, as in Scott's example, the real problem involves a dataset I care about with variables. But when I don't understand, I fire up -display- and play with very simple examples.

I found this. In looking for a literal character, a pattern expression matches itself,

. di strmatch("2", "2")
1

but matching means matching, not inclusion:

. di strmatch("42", "2")
0

You need the pattern to be big enough

. di strmatch("42", "?2")
1

. di strmatch("42", "*2")
1

. di strmatch("42", "*2*")
1

A silly analogy: will a shirt fit you? If it's too small, the answer is just a No. If it fits exactly, or it's bigger than you are, the answer is a Yes, and you then have to decide whether too big is a problem or not. (No for formal wear, possibly OK if you want something really loose.) Similarly with -strmatch()- the pattern can be bigger than you need, but the answer will still be a Yes.

On Mon, Apr 25, 2011 at 9:28 AM, Nick Cox <njcoxstata@gmail.com> wrote:

> If you want to check for occurrence, just use -strpos()- instead.
> I often see people on this list struggling with the regex functions or
> -strmatch()- when a simpler function will do the job. I have offered a
> talk on functions for the London users' meeting and this point is
> already one of the slides.
>
> foreach y in # {
>     forvalues x=1/6 {
>         replace mynumber`x' = strpos(mystring`x', "`y'") > 0
>     }
> }
>
> Otherwise, my understanding is this: a pattern that is just a literal
> character will be matched only by strings that are exactly that
> character; for almost all matching problems, you must specify * and/or
> ?. You seem to be expecting -strmatch()- to behave more like
> -regexm()-, but they have different jobs.
>
> But as said -strpos()- is easier to figure out.
>
> Nick
>
> On Mon, Apr 25, 2011 at 4:45 AM, Scott Talkington <talkings@gmu.edu> wrote:
>> I just can't seem to make this work. What I want to do is search for any
>> occurrence of the "#" character in a string variable and set a flag for that
>> observation. I'm searching 6 different strings labeled something like
>> mystring1 mystring2 etc. and the flags are mynumber1 mynumber2 etc.
>>
>> So my do file:
>>
>> forvalues x=1/6 {
>>     foreach y in # {
>>         replace mynumber`x' = strmatch(mystring`x', "`y'")
>>     }
>> }
>>
>> I just listed one character in the y list above, but in reality I'm not
>> having a problem with normal strings like "APT" but with wildcards and with
>> the number sign character itself.
>>
>> I assumed that placing a "?" character in the search string (s2) would
>> match zero or one characters + the "#" but it seems to be matching all
>> strings with one character that are either a number or a letter. Huh?
>>
>> If I include the wildcard (either the asterisk or the question mark)
>> *anywhere* (either in the "foreach" part of the do file or in the "replace"
>> command) it just doesn't work the way I expect it to.
>> There's a difference
>> between what I get depending on how many quotes I use and where as well,
>> but I'm just not getting anything that does what I want it to. I've even
>> tried using the backslash character to indicate that I don't want the "#" to
>> be read as an operator, but I'm not even sure where to put the backslash or
>> how to arrange the quotation marks. It's driving me nuts. There's some
>> rule here that I'm just not getting.
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
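
The difference between occurrence and whole-string matching discussed in this thread can be tried on a toy dataset. What follows is only a minimal sketch: the example values and the variable names flag_pos, flag_match and flag_exact are hypothetical and do not come from the original posts.

```stata
* Toy data: hypothetical values, not from the original posts
clear
input str12 mystring1
"APT #4"
"APT 4"
"#"
"42"
end

* strpos() returns the position of "#" (0 if absent),
* so comparing with 0 flags any occurrence
generate byte flag_pos = strpos(mystring1, "#") > 0

* strmatch() must match the whole string, hence the * on both sides
generate byte flag_match = strmatch(mystring1, "*#*")

* The bare pattern "#" is matched only by the string that is exactly "#"
generate byte flag_exact = strmatch(mystring1, "#")

* The first two flags should agree on every observation
assert flag_pos == flag_match
list
```

With six variables, either of the first two -generate- lines could sit inside a -forvalues x = 1/6- loop, with mystring`x' and mynumber`x' in place of the toy names.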

