Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Comparing strings

From: jo la frite
To: "statalist@hsphsun2.harvard.edu"
Subject: Re: st: Comparing strings
Date: Sun, 25 Mar 2012 14:20:35 -0700 (PDT)

```Thanks, Nick and Eric. As far as I understand, the indexnot() function compares strings regardless of the ordering of the characters in the string. For example, "frog" and "ogfr" are viewed as identical by indexnot().

Is there a way of controlling for the ordering of the characters? For example, comparing "frog" and "fragro" would return 3 (the position of the first character at which "frog" differs from "fragro").

Thanks!

Jo

----- Original Message -----
From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Cc:
Sent: Sunday, March 25, 2012 12:48 PM
Subject: Re: st: Comparing strings

Stata naturally does have a concept of alphanumeric order for strings;
otherwise it could not -sort- them. Consider

. di ("frog" < "toad")
1

. di ("frog" < "foo")
0

The first statement is true and the second false. Otherwise put, with
strings < means "precedes" and > means "follows" in alphanumeric
order.

This allows one step further forwards:

gen compare = cond(str1 > str2, indexnot(str1, str2), -indexnot(str1, str2))

If strings are identical, this yields 0. Jo did not make explicit that
this is what SAS does too, but either way it seems logical to me.

Nick

On Sat, Mar 24, 2012 at 10:47 PM, Eric Booth <eric.a.booth@gmail.com> wrote:

> Take a look at the string function (-help string_functions-) indexnot() (e.g., "gen x = indexnot(string1, string2)" )  which will give you the leftmost position where the two strings differ.
> This Stata string function does not assign the positive/negative sign like the SAS function you describe, but you can code that yourself by using other string functions to find how the strings differ in order/sequence/length.

On Mar 24, 2012, at 5:12 PM, jo la frite wrote:

>> Is there a Stata function that corresponds to the SAS function "COMPARE"? It allows one to compare strings. Specifically, in SAS, COMPARE(string-1, string-2) returns a numeric value. The sign of the result is negative if string-1 precedes string-2 in a sort sequence, and positive if string-1 follows string-2 in a sort sequence. The magnitude of the result is equal to the position of the leftmost character at which the strings differ.

*
*   For searches and help try:
*  http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
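On Jo's follow-up question: indexnot(s1, s2) returns the position of the first character of s1 that does not appear anywhere in s2, so it ignores where the characters sit. One possible way to get a position-by-position comparison is to loop over character positions with substr(). The following is only a minimal sketch, not a tested port of the SAS function. It assumes two string variables named str1 and str2 (the names used in Nick's example) with at least one observation in memory. The helper variables firstdiff, len1 and len2 are hypothetical names introduced here. It also assumes that a shorter string should count as differing at the first position past its end, and it does not try to reproduce any SAS modifiers or padding rules.

```stata
* Assumed: string variables str1 and str2 exist (names from Nick's example).
* firstdiff, len1, len2 are hypothetical helper names.

* Position of the leftmost character at which the strings differ (0 if identical)
gen long firstdiff = 0

quietly {
    * Longest string in either variable sets the number of positions to check
    gen long len1 = length(str1)
    gen long len2 = length(str2)
    summarize len1, meanonly
    local maxlen = r(max)
    summarize len2, meanonly
    local maxlen = max(`maxlen', r(max))

    * Record the first position where the single characters differ;
    * substr() returns "" past the end of a string, so a shorter string
    * differs from a longer one at the first position beyond its length
    forvalues i = 1/`maxlen' {
        replace firstdiff = `i' if firstdiff == 0 & ///
            substr(str1, `i', 1) != substr(str2, `i', 1)
    }
    drop len1 len2
}

* Negative if str1 precedes str2 in sort order, positive if it follows
gen long compare = cond(str1 < str2, -firstdiff, firstdiff)
```

With str1 = "frog" and str2 = "fragro", the loop stops at position 3 ("o" versus "a"), and because "frog" follows "fragro" in sort order, compare is 3. Identical strings give 0, as in Nick's version.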