Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Merge issues - m:m not returning all matches
From: Aaron Legler <[email protected]>
To: [email protected]
Subject: st: Merge issues - m:m not returning all matches
Date: Fri, 20 Jan 2012 10:24:33 -0500
I am having an issue with -merge-.
I have one dataset with patient_id and censustract, and another file with
censustract and the distance to each of 16 locations.
When I perform the merge, I do not get all the possible matches.
This is the original patient, with 2 records:
    patiennum         geoid    svc_date
       12345   25009205500   01 Aug 09
       12345   25009205500   05 Sep 10
After the merge
. merge m:m geoid using chc.censustract.dist.dta
I expect 32 records (2 patient records x 16 locations), but I am only
getting 16:
    patien~m         geoid    svc_date   km_to_~c   hosp        _merge
       12345   25009205500   01 Aug 09     13.701      2   matched (3)
       12345   25009205500   05 Sep 10     15.144      1   matched (3)
       12345   25009205500   05 Sep 10     15.144      5   matched (3)
       12345   25009205500   05 Sep 10     15.144     13   matched (3)
       12345   25009205500   05 Sep 10     15.144     14   matched (3)
       12345   25009205500   05 Sep 10     19.156     12   matched (3)
       12345   25009205500   05 Sep 10     19.156     16   matched (3)
       12345   25009205500   05 Sep 10     20.407      3   matched (3)
       12345   25009205500   05 Sep 10     20.407      4   matched (3)
       12345   25009205500   05 Sep 10     20.407      6   matched (3)
       12345   25009205500   05 Sep 10     20.407      8   matched (3)
       12345   25009205500   05 Sep 10     20.407     11   matched (3)
       12345   25009205500   05 Sep 10     20.407     15   matched (3)
       12345   25009205500   05 Sep 10     25.031      9   matched (3)
       12345   25009205500   05 Sep 10     25.038      7   matched (3)
       12345   25009205500   05 Sep 10     25.583     10   matched (3)
It seems like the system isn't recognizing the differences in svc_date and
just running 1 match.
I checked to ensure the geoids are the same:
. tab geoid
      geoid |      Freq.     Percent        Cum.
------------+-----------------------------------
   2.50e+10 |         16      100.00      100.00
------------+-----------------------------------
      Total |         16      100.00
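For anyone trying to reproduce this, here is a minimal sketch of the same situation on toy data. It assumes it is run from a do-file. The file contents and the variable km are made up for illustration and only stand in for chc.censustract.dist.dta. As documented, -merge m:m- pairs observations in order within each key group and repeats the last observation of the shorter group, which is consistent with the 16 rows listed above. -joinby- is shown only for comparison, since it forms every pairwise combination within the key.

```stata
* Toy patient file: the two records from the listing above
clear
input long patiennum double geoid str9 svc_date
12345 25009205500 "01 Aug 09"
12345 25009205500 "05 Sep 10"
end
tempfile patients
save `patients'

* Hypothetical stand-in for the distance file: 16 locations for one tract
clear
set obs 16
generate double geoid = 25009205500
generate hosp = _n
generate double km = 10 + _n   // made-up distances, not the real data
tempfile dist
save `dist'

* m:m merge: rows are paired in order within geoid, so the result has
* max(2, 16) = 16 observations, not 2 x 16
use `patients', clear
merge m:m geoid using `dist'
count

* joinby: all pairwise combinations within geoid, 2 x 16 = 32 observations
use `patients', clear
joinby geoid using `dist'
count
```

Note that -joinby- drops unmatched observations unless the unmatched() option is specified, so it is not a drop-in replacement for -merge- when the _merge variable matters.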
Any suggestions would be much appreciated. Thanks.
Aaron Legler
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/