Statalist



RE: st: string variable


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: string variable
Date   Tue, 13 Nov 2007 10:55:53 -0000

Austin is right that -egen, group()- will assign integers 
1 up. But if -encode- won't play at assigning labels because
there are too many distinct values, then I don't think -labmask- 
(or even -egen, group()- with the -label- option) will help 
either. 
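
A minimal sketch of the unlabelled route, assuming the string variable is
called codind as in the original question (the name g is only illustrative):

```stata
* assumption: codind is the string identifier from the original question
* long rather than the default float, so large group counts stay exact
egen long g = group(codind)

* the largest value of g is the number of distinct identifiers
summarize g, meanonly
display r(max)

* no value labels are attached; codind itself remains the lookup for g
list codind g in 1/5
```

This only shows that the integer coding itself goes through; it does not get
around the limit on how many value labels can be attached.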

I am still puzzled at the original question. On the face of 
it the variable in question is some kind of identifier. It
is difficult to see any sense in which it is better off as 
a numeric variable. If there are thousands of distinct values
it would be no use for any kind of modelling, so far as I can imagine. 
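
As a rough illustration that a string identifier can be used as it stands,
here is a sketch assuming again that the variable is codind; nobs and first
are made-up names:

```stata
* assumption: codind is the string identifier from the original question
* -bysort- accepts string variables, so no numeric copy is needed
bysort codind: gen long nobs = _N      // observations per identifier
by codind: gen byte first = _n == 1    // flag one observation per identifier

* number of distinct identifiers
count if first
```

Grouping, counting and sorting all work on the string directly; a numeric
copy only matters for commands that insist on numeric variables.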

Nick

Austin Nichols

You can make a numeric id with

egen g=group(id)

and then you can try adding labels with

ssc inst labutil
labmask g, val(id)

or perhaps

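* numeric-looking ids keep their own value; the others are numbered after r(max)
gen numid=real(id)
gen strid=id if mi(numid)
egen g=group(strid)
su numid
replace g=cond(mi(numid), max(r(max),0)+g, numid)
ssc inst labutil
labmask g if mi(numid), val(id)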

to cut down on the label creation.

On Nov 12, 2007 4:41 PM,  <[email protected]> wrote:

>   I want to convert a string variable (values like T0274K0VH550101) to
> numeric. I tried the commands destring and encode, but with the
> first I didn't get any result, while with the second I get this error:
>   encode codind, gen(id)
> too many values
> r(134);
>   with destring I have
>   destring codind, generate(id) force
> codind contains non-numeric characters; id generated as byte
> (128147 missing values generated)
>
> Is there another solution?
> Thanks in advance for your help.



