st: Merge changing census tracts
Hi Nick and all,
I found another way to get what I want, though I confess I chose it not because
I think it is a better way, but because I'm not too familiar with macros.
egen turned out to be really useful. Thanks again for your help. Here it is:
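* one id per year-1 tract and one per year-2 tract, within district
egen g1 = group(year1code district)
egen g2 = group(year2code district)
* splits: one row per year-1 tract; pop1 repeats, so take its mean and sum pop2
collapse (mean) pop1 (sum) pop2 (max) g2, by(g1 district)
* merges: one row per year-2 tract; pop2 repeats, so take its mean and sum pop1
collapse (mean) pop2 (sum) pop1 (max) g1, by(g2 district)
egen newcode = group(g1 g2 district)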
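
The two collapses above link tracts through a single shared code. If a tract
splits and one of its parts also absorbs another tract, the (max) step may not
put all of them in one group. A minimal sketch of one alternative follows: give
every row a label and keep pulling it down to the smallest label among rows that
share a year-1 or year-2 code, until nothing changes. This is not the method
posted above. The names k1, k2, comp, m, first1 and first2 are made up for
illustration, "00" is assumed to mean "no tract in that year", and district is
left out on the assumption that codes are unique (otherwise it could be added
to the keys).

```stata
* Sketch only: data typed in from the example table in the original question.
clear
input str2 year1code pop1 str2 year2code pop2
"xx" 22 "lm" 10
"xx" 22 "pq"  5
"pl" 12 "pl"  5
"da"  5 "pl"  5
"gr"  1 "pl"  5
"fa"  8 "fa" 16
"00"  0 "no" 10
"00"  0 "lp"  8
"ll" 10 "00"  0
"pp" 10 "00"  0
end

* Assumption: "00" is a placeholder, so each such row gets its own key;
* otherwise every "00" row would be linked to every other "00" row.
gen str12 k1 = cond(year1code == "00", "new"  + string(_n), year1code)
gen str12 k2 = cond(year2code == "00", "gone" + string(_n), year2code)

* Start with one label per year-1 key, then repeatedly replace each label by
* the smallest label among rows sharing the same year-2 or year-1 key.
egen long comp = group(k1)
local changed = 1
while `changed' > 0 {
    local changed = 0
    foreach k in k2 k1 {
        egen long m = min(comp), by(`k')
        quietly count if m != comp
        local changed = `changed' + r(N)
        quietly replace comp = m
        drop m
    }
}
egen newcode = group(comp)

* Count each tract's population once per group, then sum within groups.
egen byte first1 = tag(newcode k1)
egen byte first2 = tag(newcode k2)
quietly replace pop1 = pop1 * first1
quietly replace pop2 = pop2 * first2
collapse (sum) pop1 pop2, by(newcode)
list, clean
```

On the ten example rows this should leave seven groups with the same
populations as the target table in the original question, although the
numbering of newcode will differ because group() orders the labels by code.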
> Date: Fri, 6 May 2005 11:21:33 -0400
> From: Julia Gamas <jgamas@MIT.EDU>
> Subject: st: Merge changing census tracts
> Dear all,
> the equivalence table in my example is structured as follows:
> year 1   year 2
> code     code     what happened to tract over time:
> xx       lm       census tract from year 1 split into two
> xx       pq       "
> pl       pl       census tract remained but got bigger because
> da       pl       census tracts from year 1 merged into an existing
> gr       pl       tract
> fa       fa       census tract remained the same
> 00       no       census tract did not exist in year 1 but appeared in year 2
> 00       lp       "
> ll       00       census tract existed in year 1 but disappeared in year 2
> pp       00       "
> Is this the information you need?
> Thank you for your help.
> Date: Thu, 5 May 2005 21:22:22 +0100
> From: "Nick Cox" <email@example.com>
> Subject: st: RE: Merge changing census tracts
> Evidently there is an equivalence table that tells
> you that the mapping is
> xx -> lm, pq
> pl, da, gr -> pl
> fa -> fa
> -> no
> independently of the two files. You need to
> tell us how that is structured, I believe.
> Julia Gamas
> > I have to merge two databases from two different years where each
> > observation is a census tract. The challenge is avoiding double counting
> > after the merge, as a result of census tracts changing through time. I
> > would like to create a code that identifies the groups of tracts that
> > split up or merged but am not sure how to do it. Other than doing it by
> > hand (I have 4000 tracts), I'm wondering if there is a somewhat automatic
> > way to do this in Stata.
> > Currently, if I were to merge the two datasets using the corresponding
> > equivalence table I would get something like this.
> >   year 1      year 2
> >   code  pop   code  pop   what happened to tract over time:
> >   xx     22   lm     10   census tract from year 1 split into two
> >   xx     22   pq      5   "
> >   pl     12   pl      5   census tract remained but got bigger because
> >   da      5   pl      5   census tracts from year 1 merged into an existing
> >   gr      1   pl      5   tract
> >   fa      8   fa     16   census tract remained the same
> >   00      0   no     10   census tract did not exist in year 1 but appeared in year 2
> >   00      0   lp      8   "
> >   ll     10   00      0   census tract existed in year 1 but disappeared in year 2
> >   pp     10   00      0   "
> > If at this point I were to collapse (sum) the population for year 2, for
> > example, I would end up triple counting population in census tract pl.
> > The same kind of double counting would happen if I were to collapse (sum)
> > population for year 1. Population for pl in year 2 would be counted as 15
> > when it is only 5. Population for tract xx would be counted as 44 when it
> > is only 22. As an intermediate step I was thinking of doing the following:
> >            year 1      year 2
> >   newcode  code  pop   code  pop
> >   1        xx     22   lm     10
> >   1        xx     22   pq      5
> >   2        pl     12   pl      5
> >   2        da      5   pl      5
> >   2        gr      1   pl      5
> >   3        fa      8   fa     16
> >   4        00      0   no     10
> >   5        00      0   lp      8
> >   6        ll     10   00      0
> >   7        pp     10   00      0
> > That way, I could keep track of the population for each tract or "group"
> > of tracts and end up with something like this:
> >            year 1   year 2
> >   newcode  pop      pop
> >   1         22       15
> >   2         18        5
> >   3          8       16
> >   4          0       10
> >   5          0        8
> >   6         10        0
> >   7         10        0
> > How do I tell Stata how to generate newcode (i.e. can I get Stata to
> > recognize groups of census tracts with the same code and assign them a
> > newcode, for both years)?
> > Any help would be appreciated.