Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

st: Identifier values change after Merge

From	Anjanette Chan Tack <[email protected]>
To	[email protected]
Subject	st: Identifier values change after Merge
Date	Wed, 10 Nov 2010 16:12:27 -0600 (CST)

I am using Intercooled Stata 9.1 to do a 1:1 merge on an 11-digit identifier that uniquely designates a census tract. As background, I got the census tract data from the GeoLytics Neighborhood Change Database, and these 11-digit numbers are the unique identifiers that come with the data.

The identifier is stored as a double. In executing the merge, I ask Stata to keep only the matched observations and drop the unmatched ones. Since the master file's list of identifiers is a subset of the using file's, I was hoping this would let me extract this subset of observations and their attendant information easily. To do so, I use this command:

merge 1:1 tractno using "C:\Program Files\Stata9\Filename", assert(match master) keep(match)

In some ways the merge proceeds well. The resulting number of observations, N, is the N I expect. The problem is that after the merge, the values of the identifiers change. Where previously each census tract had a unique 11-digit identifier, these identifiers all appear rounded to the same number in the new merged dataset.

Thus I have a BEFORE and AFTER that look like this:

Before:

17031020500
17031020600
17031020700
17031130100
17031090100
17031090200

After:

1.70E+10
1.70E+10
1.70E+10
1.70E+10
1.70E+10
1.70E+10

Where 1.70E+10 = 17030000000 in all cases.

I thought that this might be due to the way that Stata is storing the information, so I googled "help stata is approximating numeric values". I found an archived response to a problem that seems similar here: http://www.stata.com/statalist/archive/2010-06/msg01017.html

The help answer says that the double storage type can sustain up to 15 digits. Since my identifier is only 11 digits long, I can't understand what the problem might be.
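
One way to separate a display problem from a storage problem is sketched below. It is only a minimal sketch, assuming the merged dataset is in memory and the identifier is still named tractno as in the command above. It checks the storage type and display format, switches to a format wide enough to show all 11 digits, and compares how a float and a double would hold one of the identifiers listed under "Before".

```stata
* Sketch only: assumes the merged data are in memory and the
* identifier variable is named tractno, as in the merge command.

* Show the storage type (float, double, ...) and the display format.
describe tractno

* Use a fixed format wide enough for 11 digits, then look at the values.
* If the full identifiers reappear, only the display format was at issue.
format tractno %11.0f
list tractno in 1/6

* Compare single and double precision for one identifier from the post.
* float() rounds its argument to single precision; at this magnitude
* adjacent floats are 1024 apart, so the 11 digits cannot all survive.
display %15.0f float(17031020500)
display %15.0f 17031020500

* Check that the identifier still uniquely identifies observations.
isid tractno
```

If describe reports float rather than double in either file, the identifiers may have been rounded before or during the merge; if it reports double, the values are likely intact and only the display format needs changing.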

I am quite unfamiliar with Stata (it's the first time I'm using it in 3 years, and the first time outside a classroom setting for basic training in statistics), so I would be grateful for any suggestions and advice.

Many thanks in advance!

Anjie.
-------------------------------
Anjanette M. Chan Tack
PhD student
University of Chicago Department of Sociology