Re: st: combine 3 numeric vars. to create 1 unique id. var
At 10:14 AM 8/13/2002 +0600, Nyima wrote:
> I need to create a unique id variable by combining three different numeric
> variables viz. "district", "house" and "person". "District" has been
> entered as a four digit number code (1000 to 1500). Each "District" has
> "house"s numbered 1 to 150 (no zero in front) and each "house" has
> "person" with numbers ranging from 1 to 12 (again no zero in front). I
> need the unique id variable to merge this dataset with another data file
> having other information about these particular people.
First, does the other data set have the same three identifiers? If so,
then there is no need to combine them into one variable. Use the three
together as your key:
merge district house person using ...
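
For illustration, here is a minimal sketch of the same three-key merge in the newer merge syntax (Stata 11 and later). The file names people and other are hypothetical placeholders, and the sketch assumes each person appears at most once in each file:

```stata
* Hypothetical file names: people.dta (master) and other.dta (using).
use people, clear

* Stop with an error unless the three variables jointly identify each row.
isid district house person

* One-to-one match on the composite key; no single id variable is needed.
merge 1:1 district house person using other

* _merge shows which observations matched (3) or came from one file only (1, 2).
tab _merge
```

Running isid first is a cheap way to find duplicate keys before the merge rather than after it.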
But if you still want to combine them into one variable, do this:
assert district >=0 & district <=1500
assert house >=0 & house <=150
assert person >=0 & person <=12
gen long newvar = (district * 100000) + (house * 100) + person
Note that the upper bound on person could be as high as 99, and the upper
bound on house could be as high as 999. These limits ensure that the mapping
is one-to-one, because person then fits in the last two digits and house in
the three digits before them. The upper bound on district could also be
higher; it is there to ensure that you don't get numeric overflow in the long.
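
As a worked example of the arithmetic, the sketch below builds the id for three made-up observations. The data values are invented for illustration only. For district 1234, house 56, person 7 the result is 123400000 + 5600 + 7 = 123405607:

```stata
clear
* Three made-up observations, chosen to hit the low, middle, and high ends.
input district house person
1000   1  1
1234  56  7
1500 150 12
end

* Range checks: these guard the one-to-one mapping and the size of the long.
assert district >= 0 & district <= 1500
assert house    >= 0 & house    <= 150
assert person   >= 0 & person   <= 12

gen long newvar = (district * 100000) + (house * 100) + person

* Confirm the new variable really is a unique identifier.
isid newvar

* Hand-computed values: 100000101, 123405607, 150015012.
assert newvar[1] == 100000101
assert newvar[2] == 123405607
assert newvar[3] == 150015012
list
```

On the overflow point: taking the documented maximum of a Stata long as 2,147,483,620, this scheme stays in range for any district up to 21473 even with house at 999 and person at 99.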
(The coefficients are powers of 10; that isn't absolutely necessary, but it
is convenient for human readers of the resulting numbers. Smaller
coefficients can be used for a more compact mapping. But in any case, the
coefficients must be tailored to the ranges of values in person and house.)
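
A sketch of the more compact mapping, assuming the ranges asserted above (person 0 to 12, house 0 to 150). Person then has 13 possible values and house has 151, so the coefficients become 13 and 13 * 151 = 1963. The variable names compact, p, h, and d are hypothetical:

```stata
clear
* Made-up observations, as before.
input district house person
1000   1  1
1234  56  7
1500 150 12
end

* Each coefficient is the product of the range sizes to its right:
* person takes 13 values (0-12), house takes 151 values (0-150).
gen long compact = (district * 13 * 151) + (house * 13) + person
isid compact

* Decode with the same range sizes to confirm nothing was lost.
gen byte p = mod(compact, 13)
gen int  h = mod(int(compact / 13), 151)
gen int  d = int(compact / (13 * 151))
assert p == person & h == house & d == district
```

The ids are smaller this way but no longer readable at a glance, which is the trade-off against powers of 10.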
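Given this, you can also go backwards -- taking your newvar and deriving
the district, house, and person (for instance, in a data set that holds only
newvar, since gen will not overwrite variables that already exist).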
gen byte person = mod(newvar, 100)
gen int house = mod(int(newvar/100), 1000)
gen int district = int(newvar/100000)
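
To check the round trip in a data set that still holds the original three variables, something like the following sketch should work. The recovered values go into new, hypothetical names (person2, house2, district2) because gen refuses to replace existing variables; the data values are made up:

```stata
clear
input district house person
1000   1  1
1234  56  7
1500 150 12
end

gen long newvar = (district * 100000) + (house * 100) + person

* Peel the digits off again: last two, next three, then the rest.
gen byte person2   = mod(newvar, 100)
gen int  house2    = mod(int(newvar / 100), 1000)
gen int  district2 = int(newvar / 100000)

* Every recovered value must equal the value it was built from.
assert person2 == person
assert house2 == house
assert district2 == district
```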
I hope this helps.
Institute for Policy Studies
Johns Hopkins University