Re: st: tricky data merge/joinby problem
From: David Kantor <[email protected]>
To: [email protected]
Subject: Re: st: tricky data merge/joinby problem
Date: Fri, 04 Mar 2011 11:28:04 -0500

Dimitry,
I still think that an m:m merge yields meaningless pairings. In your
example, for bgid 2,
bid bgid  fracpop
21  2    .3
22  2    .2
23  2    .5
Assuming that you have, in the second file,
bgid dateyq bgpop
2    2010q1 whatever
2    2010q2 whatever
2    2010q3 whatever
2    2010q4 whatever
The first case (bid 21) would pair with 2010q1; the second (bid 22) 
with 2010q2; the third (bid 23) would be replicated and paired with 
2010q3 and 2010q4.
I'm not sure that this is meaningful.
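
The pairing described above can be checked on toy data. This is a minimal sketch, not the poster's actual files: the bgpop values are made up, and the tempfile names blocks and bgpanel are hypothetical.

```stata
* Toy data only: bgpop values are invented for illustration.
clear
input bid bgid fracpop
21 2 .3
22 2 .2
23 2 .5
end
tempfile blocks
save `blocks'

clear
input bgid str6 q bgpop
2 "2010q1" 100
2 "2010q2" 100
2 "2010q3" 100
2 "2010q4" 100
end
* Convert the string quarter to a Stata quarterly date.
gen dateyq = quarterly(q, "YQ")
format dateyq %tq
drop q
tempfile bgpanel
save `bgpanel'

* m:m pairs observations in order within bgid; it does not form
* every block-by-quarter combination.
use `blocks', clear
merge m:m bgid using `bgpanel'
sort bgid dateyq
list bid bgid dateyq fracpop, sepby(bgid)
```

If -merge m:m- behaves as described, the listing has four rows rather than the twelve block-by-quarter combinations, with bid 23 appearing for both 2010q3 and 2010q4 and bids 21 and 22 appearing for one quarter each.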
But now that I understand your expand-to-a-panel scheme, it does look 
correct. And it makes sense that it would be faster than -joinby-.
Best wishes,
--David
At 11:13 AM 3/4/2011, you wrote:
David,
I wrote m:m merge since each BG usually appears more than once in the
first file (since blocks are the ids) and more than once in the second
(since it's a block group panel). I checked a few cases with the real
data and it seems to have worked. I just wanted to make sure that
there was nothing that I was missing and hoping to find a special case
that does not produce garbage.
By expanding into a panel, I meant stacking file1 on top of itself
four times (4 quarters of 2010) and creating a dateyq variable. The data
would not change over time, but it seemed to make an m:1 merge by date
and bgid easier (at least in my head).
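
The expand-to-a-panel scheme just described might be sketched as follows. This is a sketch under stated assumptions: the toy data and tempfile names are hypothetical, bid is assumed to identify observations uniquely in file1, and dateyq in the block-group panel is assumed to be a Stata quarterly (%tq) date.

```stata
* Toy stand-in for the block-group panel (bgid dateyq bgpop); values invented.
clear
input bgid str6 q bgpop
2 "2010q1" 100
2 "2010q2" 100
2 "2010q3" 100
2 "2010q4" 100
end
gen dateyq = quarterly(q, "YQ")
format dateyq %tq
drop q
tempfile bgpanel
save `bgpanel'

* Toy stand-in for file1 (bid bgid fracpop).
clear
input bid bgid fracpop
21 2 .3
22 2 .2
23 2 .5
end

* Assumption: one row per block before expanding.
isid bid

* Stack the block file four times, one copy per quarter of 2010.
expand 4
bysort bid: gen dateyq = tq(2010q1) + _n - 1
format dateyq %tq

* Each (bgid, dateyq) pair is unique in the panel, so m:1 is well defined.
merge m:1 bgid dateyq using `bgpanel'
tab _merge
```

Every block is then paired with every quarter of its block group, which is the set of combinations that -joinby bgid- would form, but reached through a keyed m:1 merge.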
The reason I wanted to try merge is that it appears to be much faster
than joinby, which has been running for a long time on a pretty fast
server.
DVM