Statalist



st: RE: problem creating unique identifier


From   "Martin Weiss" <martin.weiss1@gmx.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: problem creating unique identifier
Date   Sun, 6 Dec 2009 20:53:15 +0100


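Make sure you specify -double- as the type for the new variable. By
default -generate- creates a -float-, which can hold every integer
exactly only up to 16,777,216, so eight-digit values such as 22061009
get rounded to a neighbouring value: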



*******
clear*

input   itemid       cityid1      cityid2
 22             61              6           
 22             61              8           
 22             61              9           
 22             61              7           
 22             61              10          
 22             61              13          
end

compress

gen double newid = 1000000*itemid + 1000*cityid1 + cityid2
list, noobs
*******
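
To see the difference side by side, here is a minimal sketch that builds the identifier both ways on the same six observations. The name -badid- is made up for illustration, and the sketch assumes the default storage type has not been changed with -set type double-. With a float, the values ending in 007, 009 and 013 should collapse onto neighbouring even numbers, as in the listing from the original question; with a double they should stay distinct.

```stata
clear
input itemid cityid1 cityid2
22 61 6
22 61 8
22 61 9
22 61 7
22 61 10
22 61 13
end

* No type given, so the new variable is stored as float (assumed default)
gen badid = 1000000*itemid + 1000*cityid1 + cityid2

* Explicit double: all eight digits are kept
gen double newid = 1000000*itemid + 1000*cityid1 + cityid2

* Show every digit rather than the default display format
format badid newid %12.0f
list, noobs

* -isid- exits with an error if the variable does not uniquely
* identify the observations; -capture noisily- lets the run continue
capture noisily isid badid
isid newid
```

Running -isid- on the identifier in each file before merging is a cheap way to catch this kind of rounding early.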


HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Christopher
Hajzler
Sent: Sonntag, 6. Dezember 2009 20:47
To: statalist@hsphsun2.harvard.edu
Subject: st: problem creating unique identifier

Dear Statalist,

My version of Stata is doing something quite strange (Version 8.0
intercooled).  I have several variables which are contained in
separate data files, and I want to merge them by creating a unique
identifier for each observation (over which I can sort the data - I
realize it is not strictly necessary to do it this way, but I thought
this method would help avoid errors).  Each observation represents a
city pair (just over 100 cities), and an "item", and in each file I've
created the variable using the following command:
gen newid = 0
replace newid = 1000000*itemid + 1000*cityid1 + cityid2

For some reason Stata will only do a partial job - it seems to be
getting the itemid and cityid1 right when producing newid (the first 4
or 5 digits), and "almost" gets cityid2 right, assigning the correct
values every few observations but then incorrectly allocating the same
value to adjacent observations. For example, one file reads:

newid                itemid       cityid1      cityid2      var1
...
22061006           22             61              6           ...
22061008           22             61              8           ...
22061008           22             61              9           ...
22061008           22             61              7           ...
22061010           22             61              10          ...
22061012           22             61              13          ...
...

I have worked with much larger datasets, so I cannot imagine memory
allocation is an issue. Any other ideas?

Best wishes,
Chris

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




