Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

st: Merge chaniging census tracts


From   Julia Gamas <jgamas@MIT.EDU>
To   statalist@hsphsun2.harvard.edu
Subject   st: Merge chaniging census tracts
Date   Thu, 5 May 2005 12:55:06 -0400

Hi all,
I have to merge two databases from two different years where each observation is
a census tract. The challenge is avoiding double counting after the merge, as a
result of census tracts changing through time. I would like to create a code
that identifies the groups of tracts that split up or merged but am not sure
how to do it.  Other than doing it by hand (I have 4000 tracts), I'm wondering
if there is a somewhat automatic way to do this in Stata.
Currently, if I were to merge the two datasets using the
corresponding equivalence table I would get something like this.
year 1     year 2
Code pop  code  pop  what happened to tract over time:
xx    22  lm     10  census tract from year 1 split into two 
xx    22  pq     5   "
pl    12  pl     5   census tract remained but got bigger because
da     5  pl     5   census tracts from year 1 merged into an existing
gr     1  pl     5   tract
fa     8  fa     16  census tract remained the same
00     0  no     10  census tract did not exist in year 1 but appeared in year
2
00     0  lp     8   "
ll    10  00     0   census tract existed in year 1 but disappeared in year 2
pp    10  00     0   "

If at this point I were to collapse (sum) the population for year 2, for
example, then I would end up triple counting population in census tract pl,
similarly, if I were to collapse (sum) population for year 1.  Population for
pl in year 2 would be counted as 15 when it is only 5.  Population for tract xx
would be counted as 44 when it is only 22.  As an intermediate step I was
thinking of doing the following:
         year 1     year 2
newcode  Code  pop code  pop  
1        xx    22  lm     10  
1        xx    22  pq     5   
2        pl    12  pl     5   
2        da     5  pl     5   
2        gr     1  pl     5   
3        fa     8  fa     16  
4        00     0  no     10  
5        00     0  lp     8   
6        ll    10  00     0    
7        pp    10  00     0   

That way, I could keep track of the population for each tract or "group" of
tracts and end up with
something like this:
         year 1   year 2
newcode  pop      pop  
1        22        15 
2        18         5      
3        8         16   
4        0         10 
5        0          8
6        10         0  
7        10         0
How do I tell stata how to generate newcode (i.e. can I get stata to recognize
groups of census tracts with the same code and assign them a newcode, for both
years)?
Any help would be appreciated.
Sincerely,
Julia A. Gamas
Mexico City Project
MIT
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index