Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: tricky data merge/joinby problem


From   "Dimitriy V. Masterov" <[email protected]>
To   Statalist <[email protected]>
Subject   st: tricky data merge/joinby problem
Date   Fri, 4 Mar 2011 10:30:38 -0500

I have two files that I would like to merge. The first contains data
on city blocks and block groups (BGs), plus a variable giving each
block's fraction of its BG's population. A simplified version of the
data looks like this:

bid bgid  fracpop
11  1    .5
12  1    .5
21  2    .3
22  2    .2
23  2    .5

For example, BG 1 contains 2 blocks, each of which has half of BG 1's
population (fracpop==.5). The unique identifier in this file is bid.
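
The structure described above can be checked before any merging. The
following is a minimal sketch, not part of the original post: it types
in the sample rows, confirms that bid uniquely identifies observations,
and tests the assumption (suggested by the sample but not stated
outright) that fracpop sums to 1 within each bgid. The variable totfrac
and the 1e-6 tolerance are illustrative choices.

```stata
* Sketch: recreate the simplified block-level data shown above
clear
input bid bgid fracpop
11 1 .5
12 1 .5
21 2 .3
22 2 .2
23 2 .5
end

* bid should uniquely identify observations; isid exits with an error if not
isid bid

* Assumption: the weights within each block group sum to 1.
* totfrac is a hypothetical helper variable; the tolerance absorbs
* float rounding in fracpop.
bysort bgid: egen double totfrac = total(fracpop)
assert abs(totfrac - 1) < 1e-6
drop totfrac
```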

I would like to merge the data above with a panel data file (file2)
that contains block group populations over time. These data look like:

bgid dateyq bgpop
1    2010q1 100
1    2010q2 105
1    2010q3 106
1    2010q4 125

Here bgid and dateyq are the identifiers. The final goal of merging is
to come up with a population for each block by allocating bgpop using
the weights in fracpop. For example, for BG 1, this would yield:

bid bgid dateyq bpop
11  1    2010q1 50
12  1    2010q1 50

Does this require the dreaded m:m merge on bgid as the first step?
That appears to work (although I only checked a few cases). Or is it
better to expand the first file into a panel and then merge on bgid
and dateyq? Or should I use -joinby bgid using file2.dta-? I am not
sure which is the most efficient solution.
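
One of the options raised above, -joinby-, can be sketched as follows.
This is an illustration built on the sample rows only, not a tested
answer for the real files. The tempfile name blocks and the string
variable dateyq_s are hypothetical. -joinby- forms every pairwise
combination of observations within bgid, which is the block-by-quarter
structure wanted here. By contrast, -merge m:m- pairs observations in
order within each group, so it generally does not produce all
combinations.

```stata
* Sketch: block-level file (sample rows from the post)
clear
input bid bgid fracpop
11 1 .5
12 1 .5
21 2 .3
22 2 .2
23 2 .5
end
tempfile blocks
save `blocks'

* Sketch: block-group panel (sample rows from the post)
clear
input bgid str6 dateyq_s bgpop
1 "2010q1" 100
1 "2010q2" 105
1 "2010q3" 106
1 "2010q4" 125
end
* Convert the quarter string to a Stata quarterly date
gen dateyq = quarterly(dateyq_s, "YQ")
format dateyq %tq
drop dateyq_s
isid bgid dateyq

* All block-by-quarter combinations within each block group.
* By default unmatched observations are dropped, so BG 2's blocks
* vanish here because the sample panel has no rows for BG 2.
joinby bgid using `blocks'

* Allocate the block-group population using the block weights
gen bpop = bgpop * fracpop

* 2 blocks x 4 quarters for BG 1
assert _N == 8
assert bpop == 50 if dateyq == tq(2010q1)

sort bid dateyq
list bid bgid dateyq bpop, sepby(bid)
```

On the real data, the unmatched() option of -joinby- could be used to
see which block groups appear in only one of the two files.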

DVM