Re: st: Unique identifier from a string name
From: [email protected] (Brendan Halpin)
To: [email protected]
Date: Thu, 24 Nov 2011 11:48:21 +0000
On Thu, Nov 24 2011, Barry Quinn wrote:
> I would like to create a uniquely identified numerical code from a
> string variable by ideally converting each letter of the string into a
> number (but any other solutions that could create a unique identifier
> from a string are welcome).
> Is there some way of reversing the function char() to do this?
-encode- will do it, as long as your dataset is fixed (i.e. there will
be a fixed one-to-one relationship between string and number that
depends on what data is there when the command is issued).
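
To make the "depends on what data is there" caveat concrete, here is a rough Python sketch of an encode-style mapping. It is not Stata code; encode_like is a made-up name, and it assumes codes are handed out 1, 2, 3, ... over the sorted distinct strings, which is my understanding of what -encode- does by default.

```python
def encode_like(values):
    """Map each distinct string to 1..k in sorted order (assumed encode-style behaviour)."""
    labels = sorted(set(values))
    mapping = {label: code for code, label in enumerate(labels, start=1)}
    return [mapping[v] for v in values], mapping


# The same string gets a different code once the data change.
codes_before, _ = encode_like(["b", "c", "b"])
codes_after, _ = encode_like(["a", "b", "c", "b"])
print(codes_before)  # "b" is coded 1 here
print(codes_after)   # "b" is coded 2 once "a" is present
```

So the codes are one-to-one within a given dataset, but not stable across datasets.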
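If you want a deterministic relationship between any possible string and
its integer, you'll need another solution. For short strings, something
conceptually similar to the following will work, where code() is a
placeholder for the reverse of char() that you are asking about
(character in, byte value out):
code(substr(string,1,1)) + 256*(
code(substr(string,2,1)) + 256*(
code(substr(string,3,1)) + 256*(
code(substr(string,4,1)) + 256*(
code(substr(string,5,1)) + 256*(
...)))))
It will soon generate numbers you can't store, though: a double holds
integers exactly only up to 2^53, so six characters (256^6 = 2^48) is
the safe limit. It also assumes each character is one byte.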
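
The nested base-256 expression above can be tried out in Python to see how quickly it outgrows a double. This is only a sketch of the arithmetic, not Stata code: string_to_int, int_to_string and fits_in_double are made-up names, and the latin-1 encoding is my assumption to force one byte per character.

```python
MAX_EXACT_DOUBLE = 2 ** 53  # largest range in which a double holds every integer exactly


def string_to_int(s: str) -> int:
    """Pack the bytes of s into one integer: c1 + 256*(c2 + 256*(c3 + ...))."""
    data = s.encode("latin-1")  # assumption: one byte per character
    value = 0
    for byte in reversed(data):  # build from the innermost bracket outwards
        value = byte + 256 * value
    return value


def int_to_string(value: int) -> str:
    """Invert string_to_int (strings ending in NUL bytes are not recoverable)."""
    out = bytearray()
    while value > 0:
        out.append(value % 256)
        value //= 256
    return out.decode("latin-1")


def fits_in_double(s: str) -> bool:
    """True if the packed integer could be stored exactly in a double."""
    return string_to_int(s) <= MAX_EXACT_DOUBLE


print(string_to_int("ab"))                    # 97 + 256*98
print(int_to_string(string_to_int("Stata")))  # round trip
print(fits_in_double("zzzzzz"), fits_in_double("zzzzzzz"))
```

Any string of up to six single-byte characters fits; from the seventh character on it depends on the characters, so the mapping is no longer safe in general.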
I've often thought that Stata should have an MD5 hash function for
situations like this.
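
For comparison, this is roughly what an MD5-based identifier looks like outside Stata, using Python's standard hashlib. It is a sketch under assumptions: md5_id is a made-up name, and keeping only the first 12 hex digits (48 bits) is my own choice so that the result fits exactly in a double. The full digest is 128 bits, and truncating it means collisions become possible, if unlikely for a modest number of strings.

```python
import hashlib


def md5_id(s: str, hex_digits: int = 12) -> int:
    """Deterministic integer from the leading hex digits of the MD5 digest of s."""
    digest = hashlib.md5(s.encode("utf-8")).hexdigest()  # 32 hex digits
    return int(digest[:hex_digits], 16)  # 12 hex digits = 48 bits, below 2**53


print(md5_id("abc"))
print(md5_id("abc") == md5_id("abc"))  # same string, same id, whatever else is in the data
```

Unlike the -encode- approach, the result does not depend on which other strings are present, and unlike the base-256 packing, it does not grow with the length of the string.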
Brendan
--
Brendan Halpin, Department of Sociology, University of Limerick, Ireland
Tel: w +353-61-213147 f +353-61-202569 h +353-61-338562; Room F1-009 x 3147
mailto:[email protected] ULSociology on Facebook: http://on.fb.me/fjIK9t
http://teaching.sociology.ul.ie/bhalpin/wordpress twitter:@ULSociology