Statalist The Stata Listserver



RE: st: Re: Creating a unique identifier from a string and byte variable


From   "Gauri Khanna" <[email protected]>
To   [email protected]
Subject   RE: st: Re: Creating a unique identifier from a string and byte variable
Date   Tue, 03 Apr 2007 09:22:39 +0000

Dear Sergiy,

Thanks for the prompt reply. I tried both options, and the first one worked. I checked with -duplicates report my_id- and there are no duplicates, so a combination of "caseid" and "bidx" does work. My problem is solved, but I still wanted to show what went wrong with the second command.

The second option:

In your example bidx is somewhere between 0 and 999, so one can:
gen my_id=string(caseid)*1000+bidx
which will create a number identifier
You must check that caseid*1000 still can be stored completely (without loss) in Stata.

gave the following error:

gen my_id=string(caseid)*1000+bidx
type mismatch
r(109);

I don't think it is possible to multiply a number with a string variable. (Variable bidx only has two values, 1 and 2.)
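
A short sketch of why the command fails, assuming caseid holds space-separated digits as the listing in the original question suggests. string() takes a numeric argument and returns a string, so applying it to the string variable caseid produces the type mismatch. real() converts in the other direction, but it returns missing when the string is not a single number. The toy data and the names case_num and my_num_id are illustrative only, and egen's group() is a different technique from the multiplication Sergiy proposed.

```stata
clear
input str15 caseid byte bidx
"2 4 89 7" 1
"2 4 89 7" 2
"2 5 413 8" 1
end

* real() converts a numeric-looking string to a number;
* embedded spaces make the conversion fail, so the result is missing
gen double case_num = real(caseid)
count if missing(case_num)

* A numeric identifier that does not depend on parsing caseid:
* group() numbers each distinct (caseid, bidx) combination 1, 2, 3, ...
egen long my_num_id = group(caseid bidx)

* isid exits with an error if my_num_id is not unique
isid my_num_id
```

The group() identifier carries no information about the original values, so it is worth keeping caseid and bidx in the dataset alongside it.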

Thank you.

Gauri



From: "Sergiy Radyakin" <[email protected]>
Reply-To: [email protected]
To: <[email protected]>
Subject: st: Re: Creating a unique identifier from a string and byte variable
Date: Tue, 3 Apr 2007 11:01:22 +0200

Hi,

you can always create a string identifier from variables of different types.
In your example:
gen my_id=caseid+"#"+string(bidx)
Notice that symbol "#" separates the two sources, which resolves the frequent
problem:
caseid=123 bidx=4 => my_id=1234
caseid=12 bidx=34 => my_id=1234
Use any symbol instead of "#" which is not in your identifiers.
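
The collision described above can be reproduced on a two-row toy dataset. This is only a sketch: the data and the variable name no_sep are made up for illustration.

```stata
clear
input str15 caseid byte bidx
"123" 4
"12" 34
end

* Without a separator both rows become "1234"
gen str20 no_sep = caseid + string(bidx)

* With a separator they stay distinct: "123#4" and "12#34"
gen str20 my_id = caseid + "#" + string(bidx)

* Reports one surplus copy for no_sep
duplicates report no_sep

* isid exits with an error if my_id does not uniquely identify observations
isid my_id
```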

Another technique can be used when one of the ids is of low dimension.
In your example bidx is somewhere between 0 and 999, so one can:
gen my_id=string(caseid)*1000+bidx
which will create a number identifier
You must check that caseid*1000 still can be stored completely (without loss)
in Stata.
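
A sketch of the storage check mentioned above, assuming a hypothetical numeric variable case_num in place of the string caseid. By default -generate- creates a float, which represents integers exactly only up to 16,777,216, so the example asks for a double, which is exact for integers up to 2^53.

```stata
clear
input long case_num int bidx
123 4
12 34
end

* The multiplier 1000 is only safe if bidx stays within 0..999
assert inrange(bidx, 0, 999)

* Request double explicitly; the default float would round large identifiers
gen double my_id = case_num*1000 + bidx
format my_id %15.0f

* The result must be a whole number inside the exactly representable range
assert my_id == floor(my_id) & my_id < 2^53

* isid exits with an error if my_id is not unique
isid my_id
```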

Notice that the identifiers will be "unique" (as you requested) only if a combination
of caseid and bidx is unique.

Hint: use -compress- to reduce the storage types to simpler ones, e.g. long --> byte (if possible).

Regards,
Sergiy Radyakin



----- Original Message ----- From: "Gauri Khanna" <[email protected]>
To: <[email protected]>
Sent: Tuesday, April 03, 2007 10:42 AM
Subject: st: Creating a unique identifier from a string and byte variable



Dear Stata List,

I am using cross-sectional data with around 31,000 observations. I would like to create a unique identifier called "idchild" from two variables: caseid (a string variable) and bidx (a byte). I have described and listed the variables below (observations 29 & 30, 37 & 38, and 960 & 961 have the same caseid but different bidx).

des caseid

              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
caseid          str15  %15s                   case identification

. des bidx

              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
bidx            byte   %8.0g                  birth column number

. list caseid bidx

+------------------------+
| caseid bidx |
|------------------------|
1. | 2 1 66 4 1 |
2. | 2 1 66 7 1 |
3. | 2 1 93 7 1 |
4. | 2 1 111 4 1 |
5. | 2 1 147 2 1 |
|------------------------|
6. | 2 1 174 7 1 |
7. | 2 1 201 2 1 |
8. | 2 1 237 2 1 |
9. | 2 1 255 3 1 |
10. | 2 2 40 2 1 |
|------------------------|
11. | 2 2 65 4 1 |
12. | 2 2 95 2 1 |
13. | 2 2 105 11 1 |
14. | 2 2 120 4 1 |
15. | 2 2 130 4 1 |
|------------------------|
16. | 2 2 145 7 1 |
17. | 2 3 13 2 1 |
18. | 2 3 25 4 1 |
19. | 2 3 55 2 1 |
20. | 2 3 91 6 1 |
|------------------------|
21. | 2 3 97 4 1 |
22. | 2 3 97 8 1 |
23. | 2 3 121 6 1 |
24. | 2 3 139 2 1 |
25. | 2 3 145 3 1 |
|------------------------|
26. | 2 3 157 3 1 |
27. | 2 3 181 3 1 |
28. | 2 4 62 2 1 |
29. | 2 4 89 7 1 |
30. | 2 4 89 7 2 |
|------------------------|
31. | 2 4 116 3 1 |
32. | 2 4 134 2 1 |
33. | 2 4 197 5 1 |
34. | 2 4 251 3 1 |
35. | 2 4 260 3 1 |
|------------------------|
36. | 2 5 277 8 1 |
37. | 2 5 413 8 1 |
38. | 2 5 413 8 2 |
39. | 2 5 429 4 1 |
40. | 2 5 445 4 1 |
|------------------------|
41. | 2 5 461 4 1 |
42. | 2 5 469 2 1 |
43. | 2 5 501 4 1 |
44. | 2 5 509 2 1 |
45. | 2 5 533 2 1 |
|------------------------|
46. | 2 5 549 2 1 |
47. | 2 5 557 2 1 |
48. | 2 6 93 2 1 |
49. | 2 6 159 2 1 |
50. | 2 6 311 2 1 |
|------------------------|
51. | 2 7 7 3 1 |
52. | 2 7 20 3 1 |
53. | 2 7 85 5 1 |
54. | 2 7 98 3 1 |
........

959. | 2116 52 4 1
960. | 2116 56 4 1
|------------------------------------|
961. | 2116 56 4 2
962. | 2116 60 3 1
963. | 2116 84 4 1
964. | 2116 112 4 1
965. | 2117 10 2 1
|------------------------------------|
966. | 2117 26 5 1
967. | 2117 50 5 1
968. | 2117 54 8 1
969. | 2117 58 4 1
970. | 2117 62 3 1
|------------------------------------|
971. | 2117 62 3 2
972. | 2117 86 5 1
973. | 2117 86 6 1
974. | 2117 130 2 1
975. | 2117 134 2 1
--Break--


I tried the following :

egen idchild = concat(caseid, bidx)
invalid syntax
r(198);

I realise that I am trying to concatenate two different *types* of variables and so I then tried the following:

decode bidx, gen(childbidx)
bidx not labeled
r(182);

Then I tried changing the caseid variable:

encode caseid, gen(childcase)

. des childcase

              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
childcase       long   %15.0g      childcase  case identification

. egen idchild = concat(childcase, bidx)
invalid syntax
r(198);

How can I create a unique idchild? I am using Stata 9.2.

Thank you for your help.

Regards,

Gauri


*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*




© Copyright 1996–2024 StataCorp LLC