



Re: st: Merge issues - m:m not returning all matches


From   Aaron Legler <aaron.legler@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Merge issues - m:m not returning all matches
Date   Fri, 20 Jan 2012 10:49:03 -0500

Scott,

Thanks - Nick pointed me to joinby and it worked.

I always thought merge m:m would also form all pairwise combinations -
my own misinterpretation of the command.

Aaron
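
For anyone finding this thread later, here is a minimal sketch of why the two commands give different counts. The toy data, the variable names visit and site, and the tempfiles are assumptions made up for illustration; they are not from the datasets discussed above. As documented for merge, m:m pairs observations by position within each key group and reuses the last observation of the shorter group, so the result has as many rows as the longer group; joinby forms every pairwise combination within the group.

```stata
* Hypothetical toy data: 2 master rows and 3 using rows sharing one id
clear
input id visit
1 1
1 2
end
tempfile visits
save `visits'

clear
input id site
1 101
1 102
1 103
end
tempfile sites
save `sites'

* merge m:m pairs rows by position within id and reuses the last row
* of the shorter side, so the result has max(2, 3) = 3 rows
use `visits', clear
merge m:m id using `sites'
assert _N == 3
list, sepby(id)

* joinby forms all pairwise combinations within id: 2 x 3 = 6 rows
use `visits', clear
joinby id using `sites'
assert _N == 6
list, sepby(id)
```

Scaled up to the case in this thread (2 patient records, 16 locations), the same logic gives 16 rows from merge m:m and 32 from joinby.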

On Fri, Jan 20, 2012 at 10:41 AM, Scott Merryman
<scott.merryman@gmail.com> wrote:
> For example:
>
> clear*
> set obs 2
> gen id = 2500
> gen patiennum = 10
> gen date = _n
> save  id,replace
> clear
> set obs 16
> gen id = 2500
> gen dist = runiform()
> save tract,replace
>
> use id
> merge m:m id using tract
> count
> use id,clear
> joinby id  using tract
> count
>
> Scott
>
>
>
> On Fri, Jan 20, 2012 at 9:24 AM, Aaron Legler <aaron.legler@gmail.com> wrote:
>> I am having an issue with merge -
>>
>> I have one dataset with patient_id and censustract, and another file with
>> censustract and distance to 16 locations
>>
>> When I perform the merge I am not getting all the possible matches:
>>
>> This is the original patient with 2 records
>>
>>    patiennum         geoid    svc_date
>>       12345   25009205500   01 Aug 09
>>       12345   25009205500   05 Sep 10
>>
>> after the merge:  merge m:m geoid using chc.censustract.dist.dta
>>
>> I should get 32 records (2 patient records x 16 locations) but I'm only
>> getting 16:
>>
>>    patien~m         geoid    svc_date   km_to_~c   hosp        _merge
>>       12345   25009205500   01 Aug 09     13.701      2   matched (3)
>>       12345   25009205500   05 Sep 10     15.144      1   matched (3)
>>       12345   25009205500   05 Sep 10     15.144      5   matched (3)
>>       12345   25009205500   05 Sep 10     15.144     13   matched (3)
>>       12345   25009205500   05 Sep 10     15.144     14   matched (3)
>>       12345   25009205500   05 Sep 10     19.156     12   matched (3)
>>       12345   25009205500   05 Sep 10     19.156     16   matched (3)
>>       12345   25009205500   05 Sep 10     20.407      3   matched (3)
>>       12345   25009205500   05 Sep 10     20.407      4   matched (3)
>>       12345   25009205500   05 Sep 10     20.407      6   matched (3)
>>       12345   25009205500   05 Sep 10     20.407      8   matched (3)
>>       12345   25009205500   05 Sep 10     20.407     11   matched (3)
>>       12345   25009205500   05 Sep 10     20.407     15   matched (3)
>>       12345   25009205500   05 Sep 10     25.031      9   matched (3)
>>       12345   25009205500   05 Sep 10     25.038      7   matched (3)
>>       12345   25009205500   05 Sep 10     25.583     10   matched (3)
>>
>> It seems like the system isn't recognizing the differences in svc_date and
>> just running 1 match.
>>
>> I checked to ensure the geoids are the same:
>>
>> . tab geoid
>>      geoid |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>   2.50e+10 |         16      100.00      100.00
>> ------------+-----------------------------------
>>      Total |         16      100.00
>>
>> Any suggestions would be very appreciated. Thanks.
>>
>> Aaron Legler
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/


