
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: Identifier values change after Merge


From   Anjanette Chan Tack <amc75@uchicago.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: Identifier values change after Merge
Date   Wed, 10 Nov 2010 16:12:27 -0600 (CST)

Hi


I am using Intercooled Stata 9.1 to do a 1:1 merge on an 11-digit identifier that uniquely designates a census tract. As background, I got the census tract data from the GeoLytics Neighborhood Change Database, and these 11-digit numbers are the unique identifiers that come with the data.

The identifier is stored as double. In executing the merge, I ask Stata to keep only the matched observations and drop the unmatched ones. Since the master file's identifiers are a subset of those in the using file, I was hoping this would let me extract that subset of observations and their attendant information easily. To do so, I use this command:


merge 1:1 tractno using "C:\Program Files\Stata9\Filename", assert(match master) keep(match)

In some ways the merge proceeds well. The resulting number of observations is the N I expect. The problem is that after the merge, the values of the identifiers change. Where previously each census tract had a unique 11-digit identifier, these identifiers all appear rounded to the same number in the merged dataset.


Thus I have a BEFORE and AFTER that look like this:

Before:

17031020500
17031020600
17031020700
17031130100
17031090100
17031090200

After:

1.70E+10
1.70E+10
1.70E+10
1.70E+10
1.70E+10
1.70E+10

Where 1.70E+10  = 17030000000 in all cases. 
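
One thing that may be worth checking here is whether the stored values changed at all, or only the way they are displayed. The lines below are a minimal sketch, assuming the identifier is named tractno as in the merge command; the %12.0f format is an illustrative choice, not something taken from the data documentation.

```stata
* Show the storage type and the current display format of the identifier
describe tractno

* Assumption: a fixed format wide enough for 11 digits, with no decimals
format tractno %12.0f

* List the first few identifiers under the new display format
list tractno in 1/6
```

If the full 11 digits reappear after the format change, the stored values are intact and only the display format was abbreviating them.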

I thought that this might be due to the way that Stata stores the information, so I googled "help stata is approximating numeric values". I found an archived response to a problem that seems similar here: http://www.stata.com/statalist/archive/2010-06/msg01017.html

That answer says that the double storage type can hold up to 15 digits accurately. Since my identifier is only 11 digits long, I can't see what the problem might be.
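
The 15-digit figure holds only if the identifier really is double in both the master and the using file. A float can represent integers exactly only up to 16,777,216 (2^24), so 11-digit values held as float would be rounded and neighbouring tracts could collapse to one value. A rough check, again assuming the variable name tractno, to be run on each file separately:

```stata
* Confirm the storage type (double vs. float) of the identifier
describe tractno

* isid exits with an error if tractno does not uniquely identify observations
isid tractno

* Report how many observations share the same identifier value
duplicates report tractno
```

If describe shows double and isid runs without error in both files, precision loss in storage is unlikely to be the cause.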

I am quite unfamiliar with Stata (it's the first time I'm using it in 3 years, and the first time outside a classroom setting for basic training in statistics), so I would be grateful for any suggestions and advice.

Many thanks in advance!

Anjie.
-------------------------------
Anjanette M. Chan Tack
PhD student 
University of Chicago Department of Sociology

