The Stata listserver

st: Merge changing census tracts


From   Julia Gamas <jgamas@MIT.EDU>
To   statalist@hsphsun2.harvard.edu
Subject   st: Merge changing census tracts
Date   Mon, 9 May 2005 14:34:49 -0400

Hi Nick and all,
I found another way to get what I want. I confess I chose it not because I
think it is a better way to do it, but because I'm not too familiar with macros.
egen turned out to be really useful.  Thanks again for your help.  Here it is:


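* g1: one id per year-1 tract within district
egen g1 = group(year1code district)
sort g1
list
* g2: one id per year-2 tract within district
egen g2 = group(year2code district)
list
* One row per year-1 tract: pop1 repeats within g1, so average it; sum pop2
collapse (mean) pop1 (sum) pop2 (max) g2, by(g1 district)
list
* One row per year-2 tract: pop2 now repeats within g2, so average it; sum pop1
collapse (mean) pop2 (sum) pop1 (max) g1, by(g2 district)
egen newcode = group(g1 g2 district)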
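
For comparison, here is a minimal sketch of another way to build newcode
straight from the equivalence table. It is not the code above, just one
possible approach: every row starts with its own label, and the smallest label
is passed along rows that share a year-1 code or a year-2 code until nothing
changes. The data are the example rows quoted below. The names old, m1, m2,
newid, tag1, tag2, p1 and p2 are made up for illustration, and treating "00" as
"no tract in that year" is an assumption about the coding.

```stata
clear
input str2 year1code pop1 str2 year2code pop2
"xx" 22 "lm" 10
"xx" 22 "pq"  5
"pl" 12 "pl"  5
"da"  5 "pl"  5
"gr"  1 "pl"  5
"fa"  8 "fa" 16
"00"  0 "no" 10
"00"  0 "lp"  8
"ll" 10 "00"  0
"pp" 10 "00"  0
end

* Assumption: "00" means no tract existed, so it must not link rows together
replace year1code = "" if year1code == "00"
replace year2code = "" if year2code == "00"

* Start with one label per row, then spread the smallest label
* across rows sharing a year-1 code or a year-2 code
gen long newcode = _n
local changed = 1
while `changed' {
    gen long old = newcode
    bysort year1code: egen long m1 = min(newcode) if year1code != ""
    replace newcode = m1 if year1code != ""
    bysort year2code: egen long m2 = min(newcode) if year2code != ""
    replace newcode = m2 if year2code != ""
    count if newcode != old
    local changed = r(N)
    drop old m1 m2
}

* Renumber the labels 1, 2, 3, ...
egen long newid = group(newcode)

* Count each tract's population once, on a single tagged row
egen byte tag1 = tag(year1code)
egen byte tag2 = tag(year2code)
gen p1 = pop1 * tag1
gen p2 = pop2 * tag2
collapse (sum) p1 p2, by(newid)
list
```

On these ten rows the loop should settle after two passes and leave seven
groups, with the pl/da/gr group showing 18 for year 1 and 5 for year 2. With
real data one would probably add district (or whatever else identifies a tract)
to the bysort and tag() lists.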
 
> Date: Fri,  6 May 2005 11:21:33 -0400
> From: Julia Gamas <jgamas@MIT.EDU>
> Subject: st: Merge changing census tracts
> 
> Dear all,
> the equivalence table in my example is structured as follows:
> 
> year 1 year 2
> Code   code  what happened to tract over time:
> 
> xx     lm    census tract from year 1 split into two 
> xx     pq    "
> pl     pl    census tract remained but got bigger because
> da     pl    census tracts from year 1 merged into an existing
> gr     pl    tract
> fa     fa    census tract remained the same
> 00     no    census tract did not exist in year 1 but appeared in year 2
> 00     lp    "
> ll     00    census tract existed in year 1 but disappeared in year 2
> pp     00    "
> 
> Is this the information you need?
> Thank you for your help.
> 
> Julia
> 
> Date: Thu, 5 May 2005 21:22:22 +0100
> From: "Nick Cox" <n.j.cox@durham.ac.uk>
> Subject: st: RE: Merge changing census tracts
> 
> Evidently there is an equivalence table that tells 
> you that the mapping is 
> 
>    xx         -> lm, pq 
>    pl, da, gr -> pl         
>    fa         -> fa 
>               -> no 
> 
>    etc. 
> 
> independently of the two files. You need to 
> tell us how that is structured, I believe. 
> 
> Nick 
> n.j.cox@durham.ac.uk 
> 
> Julia Gamas
> 
> > I have to merge two databases from two different years where each
> > observation is a census tract. The challenge is avoiding double counting
> > after the merge, as a result of census tracts changing through time. I
> > would like to create a code that identifies the groups of tracts that
> > split up or merged but am not sure how to do it.  Other than doing it by
> > hand (I have 4000 tracts), I'm wondering if there is a somewhat automatic
> > way to do this in Stata.
> > Currently, if I were to merge the two datasets using the corresponding
> > equivalence table I would get something like this:
> > year 1     year 2
> > Code pop  code  pop  what happened to tract over time:
> > xx    22  lm     10  census tract from year 1 split into two 
> > xx    22  pq     5   "
> > pl    12  pl     5   census tract remained but got bigger because
> > da     5  pl     5   census tracts from year 1 merged into an existing
> > gr     1  pl     5   tract
> > fa     8  fa     16  census tract remained the same
> > 00     0  no     10  census tract did not exist in year 1 but appeared in year 2
> > 00     0  lp     8   "
> > ll    10  00     0   census tract existed in year 1 but disappeared in year 2
> > pp    10  00     0   "
> > 
> > If at this point I were to collapse (sum) the population for year 2, for
> > example, then I would end up triple counting population in census tract
> > pl, and similarly if I were to collapse (sum) population for year 1.
> > Population for pl in year 2 would be counted as 15 when it is only 5.
> > Population for tract xx would be counted as 44 when it is only 22.  As an
> > intermediate step I was thinking of doing the following:
> >          year 1     year 2
> > newcode  Code  pop code  pop  
> > 1        xx    22  lm     10  
> > 1        xx    22  pq     5   
> > 2        pl    12  pl     5   
> > 2        da     5  pl     5   
> > 2        gr     1  pl     5   
> > 3        fa     8  fa     16  
> > 4        00     0  no     10  
> > 5        00     0  lp     8   
> > 6        ll    10  00     0    
> > 7        pp    10  00     0   
> > 
> > That way, I could keep track of the population for each tract or "group"
> > of tracts and end up with something like this:
> >          year 1   year 2
> > newcode  pop      pop  
> > 1        22        15 
> > 2        18         5      
> > 3        8         16   
> > 4        0         10 
> > 5        0          8
> > 6        10         0  
> > 7        10         0
> > How do I tell Stata how to generate newcode (i.e. can I get Stata to
> > recognize groups of census tracts with the same code and assign them a
> > newcode, for both years)?
> > Any help would be appreciated.
> 
> - ----------END OF MESSAGE----------
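
The double counting described in the original question can be reproduced on
the three rows for tract pl alone. This is only a sketch: the variable first2
is a made-up name, and zeroing the repeated values is one of several ways to
count a tract once.

```stata
clear
input str2 year1code pop1 str2 year2code pop2
"pl" 12 "pl" 5
"da"  5 "pl" 5
"gr"  1 "pl" 5
end

* Naive sum: pop2 for pl appears on three rows, so it is counted three times
preserve
collapse (sum) pop2, by(year2code)
list
restore

* Keep pop2 on one row per year-2 tract, zero it elsewhere, then sum
egen byte first2 = tag(year2code)
replace pop2 = 0 if !first2
collapse (sum) pop1 pop2, by(year2code)
list
```

The first list shows 15 for pl in year 2; the second shows 18 for year 1 and
5 for year 2, which matches the figures in the question.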
