Re: st: Unique identifier from a string name

From   Brendan Halpin
Subject   Re: st: Unique identifier from a string name
Date   Thu, 24 Nov 2011 11:48:21 +0000

On Thu, Nov 24 2011, Barry Quinn wrote:

> I would like to create a uniquely identified numerical code from a
> string variable by ideally converting each letter of the string into a
> number (but any other solutions that could create a unique identifier
> from a string are welcomed).
> Is there some way of reversing the function char() to do this ?

-encode- will do it, as long as your dataset is fixed (i.e. there will
be a fixed one-to-one relationship between string and number that
depends on what data is there when the command is issued).
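
To illustrate why such a mapping depends on the data present, here is a
minimal sketch in Python rather than Stata. It assumes, as -encode- does,
that codes 1..n are assigned to the distinct strings in alphabetical
order. The function name encode_like and the sample strings are
hypothetical.

```python
def encode_like(values):
    """Assign 1..n to the sorted distinct strings (a data-dependent mapping)."""
    codes = {s: i for i, s in enumerate(sorted(set(values)), start=1)}
    return [codes[s] for s in values], codes


# With two distinct strings, "beta" is coded 1 and "gamma" is coded 2.
ids, codes = encode_like(["beta", "gamma", "beta"])

# Once "alpha" appears in the data, every existing code shifts by one.
ids2, codes2 = encode_like(["beta", "gamma", "beta", "alpha"])
```

The same string can therefore get a different number in a different
dataset, which is fine for a fixed dataset but not for matching across
files.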

If you want a deterministic relationship between any possible string and
its integer, you'll need another solution. For short strings something
conceptually similar to:

char(substr(string,1,1)) + 256*(
  char(substr(string,2,1)) + 256*(
    char(substr(string,3,1)) + 256*(
      char(substr(string,4,1)) + 256*(
        char(substr(string,5,1)) + 256*( ... )))))

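will work (reading char() here as its inverse, a function returning a
character's byte value, which Stata's own char() is not), but will soon
generate numbers you can't store: each extra character multiplies the
range by 256, and a double holds integers exactly only up to 2^53, so
six characters is the limit. It also assumes each character is a single
byte.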
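
The nested expression above can be sketched as follows, in Python rather
than Stata, purely to show the arithmetic. The function name
string_to_int and the max_bytes parameter are hypothetical, and the
sketch assumes every character fits in one byte (Latin-1).

```python
def string_to_int(s, max_bytes=6):
    """Map a short single-byte string to a unique integer, first char lowest."""
    data = s.encode("latin-1")  # raises UnicodeEncodeError beyond one byte
    if len(data) > max_bytes:
        # 256**6 == 2**48 still fits exactly in a double (limit 2**53);
        # a seventh byte would need 2**56 and lose precision.
        raise ValueError("string too long to store exactly in a double")
    n = 0
    # Work from the last character inward, mirroring the nested 256*( ... ).
    for b in reversed(data):
        n = n * 256 + b
    return n


short_code = string_to_int("AB")  # 65 + 256*66
```

Strings longer than max_bytes are rejected rather than silently
truncated, since truncation would break uniqueness.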

I've often thought that Stata should have an MD5 hash function for
situations like this. 
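
For comparison, a sketch of what such a hash gives you, using Python's
standard hashlib module since the function is not available in Stata
here. The helper names md5_id and md5_int48 are hypothetical. Note that
a hash is deterministic but not strictly unique: collisions are merely
very unlikely for the full 128-bit digest, and more likely once it is
truncated to fit a numeric variable.

```python
import hashlib


def md5_id(s):
    """Return the 32-character hexadecimal MD5 digest of a string."""
    return hashlib.md5(s.encode("utf-8")).hexdigest()


def md5_int48(s):
    """Truncate the digest to 48 bits so it fits exactly in a double.

    Truncation raises the chance of two strings sharing an identifier.
    """
    return int(md5_id(s)[:12], 16)


digest = md5_id("abc")
number = md5_int48("abc")
```

The full digest is best stored as a 32-character string; the 48-bit
version is a compromise for when a numeric identifier is required.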

Brendan Halpin,   Department of Sociology,   University of Limerick,   Ireland
Tel: w +353-61-213147  f +353-61-202569  h +353-61-338562;  Room F1-009 x 3147    ULSociology on Facebook:         twitter:@ULSociology