

From: Scott Talkington <talkings@gmu.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: String function headache.
Date: Mon, 25 Apr 2011 09:08:56 -0400

--Scott

On 4/25/2011 5:47 AM, Nick Cox wrote:

To expand on this, with problem-solving hints.

Learning software from definitions is like learning mathematics from definitions. If you know the concept already, or are super-smart, you can see immediately what is implied. The rest of us need examples.

In my class learning mathematics in secondary [high] school, there was one guy who always seemed to understand each new mathematical idea immediately. (He became a mountaineer, but that is a different story: http://en.wikipedia.org/wiki/Alan_Rouse ). Almost all the rest of us needed examples. (In fact I now guess that he sometimes played small psychological games with us, as usually he had read ahead on his own.)

I don't think I've ever used -strmatch()- before answering this question. I've always used -strpos()- for finding literal matches or turned to -regex*()-. That just means what it says, but I too had to find out quite how -strmatch()- works.

In my experience, as in Scott's example, the real problem involves a dataset I care about with variables. But when I don't understand, I fire up -display- and play with very simple examples.

I found this. In looking for a literal character, a pattern expression matches itself,

    . di strmatch("2", "2")
    1

but matching means matching, not inclusion:

    . di strmatch("42", "2")
    0

You need the pattern to be big enough:

    . di strmatch("42", "?2")
    1

    . di strmatch("42", "*2")
    1

    . di strmatch("42", "*2*")
    1

A silly analogy: will a shirt fit you? If it's too small, the answer is just a No. If it fits exactly, or it's bigger than you are, the answer is a Yes, and you then have to decide whether too big is a problem or not. (No for formal wear, possibly OK if you want something really loose.) Similarly with -strmatch()- the pattern can be bigger than you need, but the answer will still be a Yes.

On Mon, Apr 25, 2011 at 9:28 AM, Nick Cox <njcoxstata@gmail.com> wrote:

If you want to check for occurrence, just use -strpos()- instead.
I often see people on this list struggling with the regex functions or -strmatch()- when a simpler function will do the job. I have offered a talk on functions for the London users' meeting and this point is already one of the slides.

    foreach y in # {
        forvalues x = 1/6 {
            replace mynumber`x' = strpos(mystring`x', "`y'") > 0
        }
    }

Otherwise, my understanding is this: a pattern that is just a literal character will be matched only by strings that are exactly that character; for almost all matching problems, you must specify * and/or ?. You seem to be expecting -strmatch()- to behave more like -regexm()-, but they have different jobs. But, as said, -strpos()- is easier to figure out.

Nick

On Mon, Apr 25, 2011 at 4:45 AM, Scott Talkington <talkings@gmu.edu> wrote:

I just can't seem to make this work. What I want to do is search for any occurrence of the "#" character in a string variable and set a flag for that observation. I'm searching 6 different strings labeled something like mystring1 mystring2 etc. and the flags are mynumber1 mynumber2 etc. So my do file:

    forvalues x = 1/6 {
        foreach y in # {
            replace mynumber`x' = strmatch(mystring`x', "`y'")
        }
    }

I just listed one character in the y list above, but in reality I'm not having a problem with normal strings like "APT" but with wildcards and with the number sign character itself. I assumed that placing a "?" character in the search string (s2) would match zero or one characters + the "#", but it seems to be matching all strings with one character that are either a number or a letter. Huh?

If I include the wildcard (either the asterisk or the question mark) *anywhere* (either in the "foreach" part of the do file or in the "replace" command) it just doesn't work the way I expect it to. There's a difference between what I get depending on how many quotes I use and where as well, but I'm just not getting anything that does what I want it to.
I've even tried using the backslash character to indicate that I don't want the "#" to be read as an operator, but I'm not even sure where to put the backslash or how to arrange the quotation marks. It's driving me nuts. There's some rule here that I'm just not getting.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
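For anyone reading the thread later, the three approaches discussed above can be compared side by side. This is only a minimal sketch under stated assumptions: the toy observations are made up for illustration, the variable name mystring1 follows Scott's description, and the flag_* names are hypothetical. It relies on -strmatch()- needing a pattern that covers the whole string, while -strpos()- and -regexm()- look for an occurrence anywhere.

```stata
* Hypothetical toy data; only the name mystring1 comes from the thread
clear
input str12 mystring1
"APT #4"
"#"
"12 Main St"
"4#"
end

* strmatch() must match the whole string, so the literal # is wrapped in *
generate byte flag_match = strmatch(mystring1, "*#*")

* strpos() returns the position of the first occurrence, or 0 if absent
generate byte flag_pos = strpos(mystring1, "#") > 0

* regexm() matches anywhere in the string unless the pattern is anchored
generate byte flag_regex = regexm(mystring1, "#")

* All three flags should agree on every observation
assert flag_match == flag_pos & flag_pos == flag_regex
list
```

If the only question is whether a literal character occurs at all, the -strpos()- line is probably the simplest of the three, which is the point Nick makes above.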


