
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: RE: Merge issues - m:m not returning all matches


From   Aaron Legler <[email protected]>
To   [email protected]
Subject   Re: st: RE: Merge issues - m:m not returning all matches
Date   Fri, 20 Jan 2012 10:46:11 -0500

Joinby works great - thanks Nick.

Aaron

On Fri, Jan 20, 2012 at 10:40 AM, Nick Cox <[email protected]> wrote:
> Also, your problem sounds more like one for -joinby-.
>
> Nick
> [email protected]
>
>
> -----Original Message-----
> From: Nick Cox
> Sent: 20 January 2012 15:36
> To: '[email protected]'
> Subject: RE: Merge issues - m:m not returning all matches
>
> On m:m merges: see the thread last week starting with
>
> http://www.stata.com/statalist/archive/2012-01/msg00370.html
>
> However, please ignore my post in that thread: it missed the point, which is well explained by others.
>
> Nick
> [email protected]
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Aaron Legler
> Sent: 20 January 2012 15:25
> To: [email protected]
> Subject: st: Merge issues - m:m not returning all matches
>
> I am having an issue with merge -
>
> I have one dataset with patient_id and censustract, and another file with
> censustract and distance to 16 locations
>
> When I perform the merge I am not getting all the possible matches:
>
> This is the original patient with 2 records
>
>    patiennum         geoid    svc_date
>       12345   25009205500   01 Aug 09
>       12345   25009205500   05 Sep 10
>
> after the merge:  merge m:m geoid using chc.censustract.dist.dta
>
> I should get 32 records (2 patient records x 16 locations) but I'm only
> getting 16:
>
>    patien~m         geoid    svc_date   km_to_~c   hosp        _merge
>       12345   25009205500   01 Aug 09     13.701      2   matched (3)
>       12345   25009205500   05 Sep 10     15.144      1   matched (3)
>       12345   25009205500   05 Sep 10     15.144      5   matched (3)
>       12345   25009205500   05 Sep 10     15.144     13   matched (3)
>       12345   25009205500   05 Sep 10     15.144     14   matched (3)
>       12345   25009205500   05 Sep 10     19.156     12   matched (3)
>       12345   25009205500   05 Sep 10     19.156     16   matched (3)
>       12345   25009205500   05 Sep 10     20.407      3   matched (3)
>       12345   25009205500   05 Sep 10     20.407      4   matched (3)
>       12345   25009205500   05 Sep 10     20.407      6   matched (3)
>       12345   25009205500   05 Sep 10     20.407      8   matched (3)
>       12345   25009205500   05 Sep 10     20.407     11   matched (3)
>       12345   25009205500   05 Sep 10     20.407     15   matched (3)
>       12345   25009205500   05 Sep 10     25.031      9   matched (3)
>       12345   25009205500   05 Sep 10     25.038      7   matched (3)
>       12345   25009205500   05 Sep 10     25.583     10   matched (3)
>
> It seems like the system isn't recognizing the differences in svc_date and
> just running 1 match.
>
> I checked to ensure the geoids are the same:
>
> . tab geoid
>      geoid |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>   2.50e+10 |         16      100.00      100.00
> ------------+-----------------------------------
>      Total |         16      100.00
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
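Editorial note on why -joinby- resolves this: within each group of the key variable, -merge m:m- pairs observations in order (first with first, second with second) and, when the shorter group runs out, repeats its last observation. That is consistent with the listing above, where the first service date is paired with one location and the second service date with the remaining 15. -joinby- instead forms all pairwise combinations within each group. The sketch below is a minimal illustration on made-up data, not the poster's actual files: the variable names patiennum, geoid, svc_date and hosp follow the thread, but the tempfile names and the hosp values 1 to 16 are assumptions, and the distance variable is omitted. geoid is created as a double because an 11-digit identifier is too large for a long and would lose digits in a float.

```stata
* Toy patient file: one patient, two service dates, same census tract.
* All values here are hypothetical and only mirror the shape of the post.
clear
input long patiennum double geoid str9 svc_date
12345 25009205500 "01 Aug 09"
12345 25009205500 "05 Sep 10"
end
format geoid %12.0f          // show all digits instead of 2.50e+10
tempfile patients
save `patients'

* Toy distance file: 16 locations for the same census tract.
clear
set obs 16
generate double geoid = 25009205500
generate byte hosp = _n      // hypothetical location codes 1..16
tempfile dist
save `dist'

* joinby forms every pairwise combination within each geoid.
use `patients', clear
joinby geoid using `dist'
count
assert r(N) == 32            // 2 patient records x 16 locations
```

Setting a display format such as %12.0f before running -tab geoid- also makes it easier to confirm that the identifiers really are identical, since 2.50e+10 hides most of the digits.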


