Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.


Re: st: Comparing strings


From   jo la frite <jo_la_frite@yahoo.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Comparing strings
Date   Sun, 25 Mar 2012 14:20:35 -0700 (PDT)

Thanks, Nick and Eric. As far as I understand, the indexnot() function compares strings regardless of the ordering of the characters in the string. For example, "frog" and "ogfr" are viewed as identical by indexnot().
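
A quick check at the command line is consistent with this reading. The values in the comments assume the documented behaviour of indexnot(s1, s2), namely that it returns the position of the first character of s1 that does not occur anywhere in s2, or 0 if there is no such character; the example strings are just illustrations.

```stata
* indexnot() only asks whether each character of the first string
* appears somewhere in the second; position in the second is ignored.
display indexnot("frog", "ogfr")     // 0: every character of "frog" occurs in "ogfr"
display indexnot("frog", "fragro")   // 0: f, r, o and g all occur in "fragro"
display indexnot("frog", "toad")     // 1: "f" does not occur in "toad"
```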


Is there a way of controlling for the ordering of the characters? For example, comparing "frog" and "fragro" would return 3 (the position of the first character at which "frog" and "fragro" differ).


Thanks!

Jo
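
One possible way to get a position-by-position comparison is sketched below. It is only a sketch under stated assumptions, not an official equivalent of the SAS function: the variable names s1, s2, maxlen, firstdiff and compare are hypothetical (names of the form str# are reserved in Stata, so they cannot be used for variables), the small dataset exists purely for illustration, and the strings are assumed to be plain ASCII.

```stata
* Sketch only: s1 and s2 are hypothetical string variables.
clear
input str10 s1 str10 s2
"frog" "fragro"
"frog" "frog"
"foo"  "frog"
end

* The longest string in the data sets the number of positions to check.
gen long maxlen = max(strlen(s1), strlen(s2))
summarize maxlen, meanonly
local L = r(max)

* Scan from right to left, so the last replacement made is the leftmost
* position at which the two strings differ; 0 means identical strings.
gen long firstdiff = 0
forvalues i = `L'(-1)1 {
    replace firstdiff = `i' if substr(s1, `i', 1) != substr(s2, `i', 1)
}

* Attach the sign: positive if s1 follows s2 in sort order, negative otherwise.
gen long compare = cond(s1 > s2, firstdiff, -firstdiff)
list s1 s2 compare
```

With this example data, compare should be 3 for "frog" versus "fragro", 0 for two identical strings, and -2 for "foo" versus "frog". Because substr() returns an empty string beyond the end of a string, a shorter string that is a prefix of a longer one differs at the first position past its own length.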



----- Original Message -----
From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Cc: 
Sent: Sunday, March 25, 2012 12:48 PM
Subject: Re: st: Comparing strings

Stata naturally does have a concept of alphanumeric order for strings;
otherwise it could not -sort- them. Consider

. di ("frog" < "toad")
1

. di ("frog" < "foo")
0

The first statement is true and the second false. Otherwise put, with
strings < means "precedes" and > means "follows" in alphanumeric
order.

This allows one step further forwards:

gen compare = cond(str1 > str2, indexnot(str1, str2), -indexnot(str1, str2))

If strings are identical, this yields 0. Jo did not make explicit that
this is what SAS does too, but either way it seems logical to me.

Nick

On Sat, Mar 24, 2012 at 10:47 PM, Eric Booth <eric.a.booth@gmail.com> wrote:

> Take a look at the string function (-help string_functions-) indexnot() (e.g., "gen x = indexnot(string1, string2)" )  which will give you the leftmost position where the two strings differ.
> This Stata string function does not assign the positive/negative sign like the sas function you describe, but you can code those yourself by using other string functions to find how they differ in order/sequence/length.

On Mar 24, 2012, at 5:12 PM, jo la frite wrote:

>> Is there a Stata function that corresponds to the SAS function "COMPARE"? It allows one to compare strings. Specifically, in SAS, COMPARE(string-1, string-2) returns a numeric value. The sign of the result is negative if string-1 precedes string-2 in a sort sequence, and positive if string-1 follows string-2 in a sort sequence. The magnitude of the result is equal to the position of the leftmost character at which the strings differ.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP