
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: string function


From   Grace Jessie <[email protected]>
To   <[email protected]>
Subject   RE: st: string function
Date   Wed, 24 Aug 2011 13:15:03 +0000

OK. Thanks.
Grace

----------------------------------------
> Date: Wed, 24 Aug 2011 13:21:14 +0100
> Subject: Re: st: string function
> From: [email protected]
> To: [email protected]
>
> To correct a typo:
>
> For the "all" case, the initialisation should be to 1.
>
> gen found = 1
>
> qui foreach letter in s o m e t h i n g {
> replace found = min(found, strpos(strvar, "`letter'") > 0)
> }
>
> That is: the logic is to assume initially that all are present; if any
> one is absent you change your mind. You can also do it this way
>
> gen found = 1
> qui foreach letter in s o m e t h i n g {
> replace found = 0 if strpos(strvar, "`letter'") == 0
> }
>
> Also, in this solution "letter" is just a name that makes sense for letters; any other local macro name would do.
>
> gen found = 1
> qui foreach pkg in Stata SAS SPSS {
> replace found = 0 if strpos(strvar, "`pkg'") == 0
> }
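The loops above can be tried on a toy dataset. This is a minimal sketch only: the three example strings and the variable names all_found and any_found are made up for illustration and do not come from the thread. It runs the "all" and "any" versions side by side, each with its own starting value.

```stata
clear
input str20 strvar
"something"
"nothing"
"xyz"
end

* all: assume every letter is present, switch to 0 if any one is absent
gen byte all_found = 1
* any: assume no letter is present, switch to 1 if any one is found
gen byte any_found = 0

quietly foreach letter in s o m e t h i n g {
    replace all_found = 0 if strpos(strvar, "`letter'") == 0
    replace any_found = 1 if strpos(strvar, "`letter'") > 0
}

list
```

Note that strpos() is case-sensitive, so "S" and "s" count as different letters.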
>
> 2011/8/24 Grace Jessie <[email protected]>:
> > OK, thank you, Nick.
> > Grace
> >
> > ----------------------------------------
> >> Date: Wed, 24 Aug 2011 12:37:23 +0100
> >> Subject: Re: st: string function
> >> From: [email protected]
> >> To: [email protected]
> >>
> >> I just said that they _could_ be written.
> >>
> >> At that time, I had forgotten about -egen- solutions for your first
> >> problem in -egenmore- (SSC). Both of those solutions (by Nick Winter
> >> and myself) overlooked what now seems to me a cleaner solution using
> >> -subinstr()- and -length()-. See also
> >>
> >> <http://statadaily.wordpress.com/2011/01/20/counting-occurrence-of-strings-within-strings/>
> >>
> >> and my Speaking Stata column in SJ 11(1) 2011 for discussion.
> >>
> >> I am not aware of coded -egen- solutions for your other problems.
> >>
> >> I imagine that they would just be wrappers for those -foreach- loops,
> >> with no gain in efficiency or even comprehensibility.
> >>
> >> I'm setting them as an exercise for homework.
> >>
> >> Nick
> >>
> >> 2011/8/24 Grace Jessie <[email protected]>:
> >> > Nick,
> >> > thank you.
> >> > Could you please also tell me the -egen- solution for my questions?
> >> >
> >> > Grace
> >> >
> >> > ----------------------------------------
> >> >> Date: Wed, 24 Aug 2011 11:59:19 +0100
> >> >> Subject: Re: st: string function
> >> >> From: [email protected]
> >> >> To: [email protected]
> >> >>
> >> >> Solutions to all these could be written as -egen- functions or Mata functions.
> >> >>
> >> >> Here I focus on "official Stata only" solutions.
> >> >>
> >> >> First question is discussed in
> >> >>
> >> >> Nicholas J. Cox
> >> >> Stata tip 98: Counting substrings within strings
> >> >> The Stata Journal 11(2): 318-320
> >> >>
> >> >> length("abcdaf") - length(subinstr("abcdaf", "a", "", .))
> >> >>
> >> >> Last two questions
> >> >>
> >> >> any of "a", "b", "c"
> >> >>
> >> >> max(strpos("abcdaf","a"), strpos("abcdaf", "b"), strpos("abcdaf", "c")) > 0
> >> >>
> >> >> all of "a", "b", "c"
> >> >>
> >> >> min(strpos("abcdaf","a"), strpos("abcdaf", "b"), strpos("abcdaf", "c")) > 0
> >> >>
> >> >> If you had a long list of candidates, I would do something like this:
> >> >>
> >> >> gen found = 0
> >> >>
> >> >> qui foreach letter in s o m e t h i n g {
> >> >> replace found = max(found, strpos(strvar, "`letter'") > 0)
> >> >> }
> >> >>
> >> >> where for "max" substitute "min" as needed (with "min", initialise found to 1 rather than 0).
> >> >>
> >> >> The mapping max <-> any, min <-> all is discussed in
> >> >> http://www.stata.com/support/faqs/data/anyall.html
> >> >>
> >> >> Nick
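The one-line expressions above work the same way on a string variable as on a literal. The following is a sketch under assumptions: the toy data and the variable names n_a, n_an, any_abc and all_abc are invented for illustration. The division by the substring length for multi-character substrings is an extension of the same idea and counts non-overlapping matches.

```stata
clear
input str10 strvar
"abcdaf"
"banana"
"xyz"
end

* occurrences of "a": characters lost when every "a" is deleted
gen n_a = length(strvar) - length(subinstr(strvar, "a", "", .))

* multi-character substring: divide the lost characters by its length
gen n_an = (length(strvar) - length(subinstr(strvar, "an", "", .))) / length("an")

* any of "a", "b", "c": the largest position returned is positive
gen byte any_abc = max(strpos(strvar, "a"), strpos(strvar, "b"), strpos(strvar, "c")) > 0

* all of "a", "b", "c": the smallest position returned is positive
gen byte all_abc = min(strpos(strvar, "a"), strpos(strvar, "b"), strpos(strvar, "c")) > 0

list
```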
> >> >>
> >> >> 2011/8/24 Grace Jessie <[email protected]>:
> >> >>
> >> >> > How to count how many times a substring appears in a string?
> >> >> > For example,
> >> >> > function("abcdaf","a")=2
> >> >> >
> >> >> > And, how to check if a string variable has certain substrings?
> >> >> > With regard to this, I want to ask about two functions.
> >> >> > For example,
> >> >> > function("abcdaf","a","b","c")
> >> >> > One thing I want is to return 1 if a or b or c is included in "abcdaf";
> >> >> > the other is to return 1 if a, b and c are all included in "abcdaf".
> >> >> > Could anyone tell me the correct functions for those above?
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

