Re: st: string function


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: string function
Date   Wed, 24 Aug 2011 13:21:14 +0100

To correct a typo in my previous post:

For the "all" case, the initialisation should be to 1, not 0.

gen found = 1

qui foreach letter in s o m e t h i n g {
         replace found = min(found, strpos(strvar, "`letter'") > 0)
}

That is, the logic is to assume initially that all are present; if any
one is absent, you change your mind. You can also do it this way:

gen found = 1
qui foreach letter in s o m e t h i n g {
        replace found = 0 if strpos(strvar, "`letter'") == 0
}

Also, in this solution "letter" is just a name that makes sense for letters.
Any other legal macro name would do, and the items need not be single characters:

gen found = 1
qui foreach pkg in Stata SAS SPSS {
       replace found = 0 if strpos(strvar, "`pkg'")  == 0
}
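
As a worked illustration of the loops above, here is a minimal sketch on a toy dataset. The variable names (strvar, all1, all2, any1) and the data values are made up for the example; only -strpos()-, -min()-, -max()- and -foreach- come from the discussion. It checks that the two "all" forms agree and shows the "any" counterpart, which starts from 0 and uses max().

```stata
* Toy data: strvar and its values are invented for illustration
clear
input str20 strvar
"something"
"nothing"
"moths engine"
"song"
end

* "all", form 1: start at 1, take min() with each presence indicator
gen byte all1 = 1
quietly foreach letter in s o m e t h i n g {
    replace all1 = min(all1, strpos(strvar, "`letter'") > 0)
}

* "all", form 2: start at 1, set to 0 as soon as one letter is absent
gen byte all2 = 1
quietly foreach letter in s o m e t h i n g {
    replace all2 = 0 if strpos(strvar, "`letter'") == 0
}

* "any": start at 0, take max() with each presence indicator
gen byte any1 = 0
quietly foreach letter in s o m e t h i n g {
    replace any1 = max(any1, strpos(strvar, "`letter'") > 0)
}

* The two "all" forms should never disagree
assert all1 == all2
list strvar all1 all2 any1
```

Note that -strpos()- is case-sensitive, so "Something" would fail the test for "s" unless strvar is first passed through -lower()-.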

2011/8/24 Grace Jessie <gracejessie@hotmail.com>:
> OK, thank you, Nick.
> Grace
>
> ----------------------------------------
>> Date: Wed, 24 Aug 2011 12:37:23 +0100
>> Subject: Re: st: string function
>> From: njcoxstata@gmail.com
>> To: statalist@hsphsun2.harvard.edu
>>
>> I just said that they _could_ be written.
>>
>> At that time, I had forgotten about -egen- solutions for your first
>> problem in -egenmore- (SSC). Both of those solutions (by Nick Winter
>> and myself) overlooked what now seems to me a cleaner solution using
>> -subinstr()- and -length()-. See also
>>
>> <http://statadaily.wordpress.com/2011/01/20/counting-occurrence-of-strings-within-strings/>
>>
>> and my Speaking Stata column in SJ 11(1) 2011 for discussion.
>>
>> I am not aware of coded -egen- solutions for your other problems.
>>
>> I imagine that they would just be wrappers for those -foreach- loops,
>> with no gain in efficiency or even comprehensibility.
>>
>> I'm setting them as an exercise for homework.
>>
>> Nick
>>
>> 2011/8/24 Grace Jessie <gracejessie@hotmail.com>:
>> > Nick,
>> > thank you.
>> > Could you please also tell me the -egen- solution for my questions?
>> >
>> > Grace
>> >
>> > ----------------------------------------
>> >> Date: Wed, 24 Aug 2011 11:59:19 +0100
>> >> Subject: Re: st: string function
>> >> From: njcoxstata@gmail.com
>> >> To: statalist@hsphsun2.harvard.edu
>> >>
>> >> Solutions to all these could be written as -egen- functions or Mata functions.
>> >>
>> >> Here I focus on "official Stata only" solutions.
>> >>
>> >> First question is discussed in
>> >>
>> >> Nicholas J. Cox
>> >> Stata tip 98: Counting substrings within strings
>> >> The Stata Journal 11(2): 318-320
>> >>
>> >> length("abcdaf") - length(subinstr("abcdaf", "a", "", .))
>> >>
>> >> Last two questions
>> >>
>> >> any of "a", "b", "c"
>> >>
>> >> max(strpos("abcdaf","a"), strpos("abcdaf", "b"), strpos("abcdaf", "c")) > 0
>> >>
>> >> all of "a", "b", "c"
>> >>
>> >> min(strpos("abcdaf","a"), strpos("abcdaf", "b"), strpos("abcdaf", "c")) > 0
>> >>
>> >> If you had a long list of candidates, I would do something like this:
>> >>
>> >> gen found = 0
>> >>
>> >> qui foreach letter in s o m e t h i n g {
>> >> replace found = max(found, strpos(strvar, "`letter'") > 0)
>> >> }
>> >>
>> >> where for "max" substitute "min" as needed.
>> >>
>> >> The mapping max <-> any, min <-> all is discussed in
>> >> http://www.stata.com/support/faqs/data/anyall.html
>> >>
>> >> Nick
>> >>
>> >> 2011/8/24 Grace Jessie <gracejessie@hotmail.com>:
>> >>
>> >> > How to count how many times a substring appears in a string?
>> >> > For example,
>> >> > function("abcdaf","a")=2
>> >> >
>> >> > And how can I check whether a string variable contains certain substrings?
>> >> > I want to ask about two functions for this.
>> >> > For example,
>> >> > function("abcdaf","a","b","c")
>> >> > One should return 1 if a or b or c is included in "abcdaf";
>> >> > the other should return 1 if a, b and c are all included in "abcdaf".
>> >> > Could anyone tell me the correct functions for those above?
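
The counting expression quoted in this thread, length() minus length() after -subinstr()-, gives the number of characters removed, which equals the number of occurrences only when the substring is a single character. A minimal sketch for a string variable and a substring of any length follows; it divides by the length of the substring. The names strvar, sub and count, and the data values, are made up for illustration, and occurrences are counted without overlap because that is how -subinstr()- replaces them.

```stata
* Toy data: strvar and its values are invented for illustration
clear
input str20 strvar
"abcdaf"
"banana"
"xyz"
end

* Hypothetical substring to count; change as needed
local sub "an"

* Characters removed by subinstr(), divided by the substring's length,
* gives the number of non-overlapping occurrences
gen count = (length(strvar) - length(subinstr(strvar, "`sub'", "", .))) ///
    / length("`sub'")

list strvar count
```

With a single-character substring the division is by 1, so this reduces to the expression as originally posted.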