# st: RE: RE: RE: RE: RE: Strings and the greater than/less than operators

 From "sdm1" To Subject st: RE: RE: RE: RE: RE: Strings and the greater than/less than operators Date Thu, 14 May 2009 15:15:00 +0100

```
Sorry to trouble you again on this topic.

I was thinking about sorting strings in the same way as one would sort
numerics so that, for example, a four digit integer is always greater than a
three digit integer.  Clearly I shouldn't think like this when it comes to
strings because "N12 " is less than "N13" and, although 1234 is greater than
124, "1234" is less than "124".  When thinking about the sort order of
strings, would it be sensible to think of them being left aligned (whereas
with integers they're right aligned)?

Thanks.

Steve

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of sdm1
Sent: 13 May 2009 19:24
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RE: RE: RE: Strings and the greater than/less than
operators

Nick/Gary,

Thanks very much for your help.  -asciiplot- provides a particularly useful
display of the sort 'order' of characters.

Cheers!

Steve

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: 13 May 2009 18:14
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RE: RE: Strings and the greater than/less than operators

I wrote -char()-, not -char-. The () signal a function, -char()-. For
example,

. di char(65)
A

. di char(97)
a

Referring to -char()- is more precise than referring to (say) ASCII order,
which doesn't mean the same thing in absolutely all circumstances.

Stata doesn't offer an inverse to -char()-, but -asciiplot- from SSC gives
you a graphical display of the characters on your system. In any case,
typing e.g.

di ("a" > "A")

gives you 1 for true and 0 for false.

Incidentally, these data look like first parts of UK postcodes. Right or
wrong, you might use -trim()- to lose the trailing spaces now in order not
to be bitten again.

Nick
n.j.cox@durham.ac.uk

sdm1 wrote:

Noooooooooooo!

Thanks Nick...and, of course, you're dead right.

The giveaway, I realise now, is the alignment of the values of code under
the heading 'code' in the tabulation.  I think that the last character
aligns vertically with the 'e' of 'code'.

The only bit I don't understand is: "The order is that of -char()-".

It sounds to me as if char is user defined.  This is from the help for
char:

The dataset itself and each variable within the dataset have associated
with them a set of characteristics.
Characteristics are named and referred to as varname[charname], where
varname is the name of a variable or _dta.
The characteristics contain text.  Characteristics are stored with the
dataset in the Stata-format .dta dataset,
so they are recalled whenever the dataset is loaded.

If characteristics for a variable are not defined by the user, what's the
default order?  Is there a list somewhere which will tell me the order in
which Stata sorts characters e.g. alphabetic, numeric, spaces, etc.  Or am
I misinterpreting here?

Once again, thanks for your help.

Nick Cox wrote:

General question: Absolutely. The order is that of -char()-.

Specific question: "N05 " > "N05". You have trailing spaces. They are
characters too.

Nick
n.j.cox@durham.ac.uk

Steve wrote:

Can the greater than (>) and less than (<) operators be applied to strings?

If the answer is 'yes' (as I thought), why is "N05" included in the output
for the following command?

code<"N13")

      code |        31         32 |     Total
-----------+----------------------+----------
      N05  |       103        163 |       266
      N06  |    23,858        132 |    23,990
      N07  |   364,687      2,653 |   367,340
      N08  |     8,079         18 |     8,097
      N09  |    70,953        132 |    71,085
      N10  |    24,606         88 |    24,694
      N11  |   123,635        256 |   123,891
      N12  |   546,148     21,998 |   568,146
-----------+----------------------+----------
     Total | 1,162,069     25,440 | 1,187,509

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
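
The comparisons discussed in the thread can be checked directly in Stata. What follows is a minimal sketch, assuming plain ASCII strings. The literals are the ones quoted in the messages above, and `trim()` and `real()` are standard Stata functions. Each expression displays 1 for true and 0 for false, and each line here should display 1.

```stata
* Strings are compared character by character from the left,
* so "1234" sorts before "124": the third characters differ and "3" < "4".
display ("1234" < "124")

* The same values compared as numbers give the opposite ordering.
display (real("1234") > real("124"))

* A trailing space is a character too. "N05 " and "N05" agree on the
* first three characters, and the longer string sorts after the shorter.
display ("N05 " > "N05")

* Here the comparison is settled at the third character ("2" < "3"),
* so the trailing space never comes into play.
display ("N12 " < "N13")

* Removing leading and trailing spaces makes the codes compare as expected.
display (trim("N05 ") == "N05")
```

Thinking of strings as left aligned, as suggested in the first message, matches this behaviour: the comparison starts at the leftmost character and stops at the first difference.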