



Re: st: String function headache.


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: String function headache.
Date   Mon, 25 Apr 2011 09:28:21 +0100

If you want to check for occurrence, just use -strpos()- instead. I
often see people on this list struggling with the regex functions or
-strmatch()- when a simpler function will do the job. I have offered a
talk on functions for the London users' meeting and this point is
already one of the slides.

foreach y in # {
    forvalues x = 1/6 {
        * flag is 1 if the character occurs anywhere in the string, 0 otherwise
        replace mynumber`x' = strpos(mystring`x', "`y'") > 0
    }
}

Otherwise, my understanding is this: a pattern that is just a literal
character will be matched only by strings that are exactly that
character; for almost all matching problems, you must specify * and/or
?. You seem to be expecting -strmatch()- to behave more like
-regexm()-, but they have different jobs.
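
To make the difference concrete, here is a small sketch. The sample
strings ("apt #5", "7", "ab") are made up for illustration and are not
from the original data; the results in the comments follow from the
documented behaviour of -strmatch()- and -strpos()-.

```stata
* a bare "#" pattern matches only the one-character string "#": shows 0
display strmatch("apt #5", "#")

* with * on both sides, any string containing "#" matches: shows 1
display strmatch("apt #5", "*#*")

* -strpos()- returns the position of the first "#", or 0 if absent: shows 5
display strpos("apt #5", "#")

* comparing with 0 turns the position into a 0/1 flag: shows 1
display strpos("apt #5", "#") > 0

* ? stands for exactly one character, so "?" alone matches any
* one-character string (shows 1) but not a longer one (shows 0)
display strmatch("7", "?")
display strmatch("ab", "?")
```

This would also explain why a pattern of "?" picks up every
one-character string, whether a digit or a letter.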

But as said -strpos()- is easier to figure out.

Nick

On Mon, Apr 25, 2011 at 4:45 AM, Scott Talkington <talkings@gmu.edu> wrote:
> I just can't seem to make this work.  What I want to do is search for any
> occurrence of the "#" character in a string variable and set a flag for that
> observation.  I'm searching 6 different strings labeled something like
> mystring1 mystring2 etc. and the flags are mynumber1 mynumber2 etc.
>
> So my do file:
>
> forvalues x=1/6 {
> foreach y in # {
> replace mynumber`x' = strmatch(mystring`x', "`y'")
> }
> }
>
> I just listed one character in the y list above, but in reality I'm not
> having a problem with normal strings like "APT" but with wildcards and with
> the number sign character itself.
>
> I assumed that placing a "?" character in the search string (s2) would
> match zero or one characters + the "#" but it seems to be matching all
> strings with one character that are either a number or a letter.  Huh?
>
> If I include the wildcard (either the asterisk or the question mark)
> *anywhere* (either in the "foreach" part of the do file or in the "replace"
> command) it just doesn't work the way I expect it to.  There's a difference
> between what I get depending on how many quotes I use and where as well,
> but I'm just not getting anything that does what I want it to.  I've even
> tried using the backslash character to indicate that I don't want the "#" to
> be read as an operator, but I'm not even sure where to put the backslash or
> how to arrange the quotation marks.  It's driving me nuts.  There's some
> rule here that I'm just not getting.
