



st: Unique Clients with Multiple IDs that need to be integrated


From   "Emily Putnam-Hornstein" <eputnamhornstein@berkeley.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Unique Clients with Multiple IDs that need to be integrated
Date   Thu, 11 Oct 2012 15:19:52 -0700

I am working with a large substance abuse treatment dataset with roughly
900,000 observations for approximately 200,000 unique clients. 

In working with these data (these records were being linked to a second
dataset), it was discovered that a number of clients had been inadvertently
assigned a new id when returning for a follow-up treatment service. In other
words, the agency thought it was working with a dataset of unique clients,
only to discover that many records were clearly for the same individual.

To complete the data linkages, the decision was made to maintain one id/one
record for each client, but to create up to 10 new variables listing the
alternative ids, so that after the linkages were completed, data for these
other ids could be pulled to construct a longitudinal record of service
encounters for a given individual. The alternative ids were attached to a
client's record as depicted in the data below. So, for example, the first
client (cid1=1) was not found to have any duplicate ids, so altid1=cid1 and
altid2-altid10 are all empty (I have only shown altid1-altid3 for
simplicity). But for the second client (cid1=2) it was discovered that this
client has information under two other ids, "65958" and "62781".

To be able to construct a longitudinal record of service encounters for each
client, I need to recode the records currently appearing as cid1=65958 and
cid1=62781 to cid1=2 (and I need to do this for up to 10 ids for some
clients). Additionally, the correct client id is listed as an alternative id
on the records I want to recode. So for cid1=65958, altid1=2 and
altid2=62781 (I have also shown this below, under "Later Data").

Finally, I mentioned the data linkages earlier because it is important that
I maintain the first (as sorted) cid1, as that is the id that links to the
other data source, although it is not necessarily the id associated with a
client's first admission.
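
One possible reading of the recoding rule described above, sketched in Python
rather than Stata purely for illustration: if every record in a linked group
carries that client's full list of ids in altid1-altid10, and the id to keep
is the smallest id in that list, then the new cid1 is just the row-wise
minimum of the non-missing ids. Both conditions are assumptions; they hold for
the rows in the listings but have not been checked against the full dataset.
The function name canonical_id and the dictionary layout are hypothetical.

```python
def canonical_id(record, n_alt=10):
    """Return the smallest non-missing id among cid1 and altid1..altid<n_alt>.

    Assumptions: the id to keep is the lowest id in the group, and None
    marks a missing value.
    """
    ids = [record.get("cid1")]
    ids += [record.get(f"altid{k}") for k in range(1, n_alt + 1)]
    present = [i for i in ids if i is not None]
    return min(present)


# A few rows copied from the listings (only altid1-altid3 are shown there).
records = [
    {"cid1": 1, "altid1": 1, "altid2": None, "altid3": None},
    {"cid1": 2, "altid1": 2, "altid2": 65958, "altid3": 62781},
    {"cid1": 13, "altid1": 66234, "altid2": 13, "altid3": 172805},
    {"cid1": 65958, "altid1": 2, "altid2": 65958, "altid3": 62781},
]

for rec in records:
    rec["old_cid1"] = rec["cid1"]      # keep the original id for auditing
    rec["cid1"] = canonical_id(rec)    # recode to the id retained for linkage

recoded = [(rec["old_cid1"], rec["cid1"]) for rec in records]
print(recoded)
```

In Stata the same idea might be expressed with egen's rowmin() function over
altid1-altid10, followed by a replace of cid1. If the lowest id is not always
the one that links to the other data source, this rule would need to be
replaced by an explicit lookup of old id to retained id.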

I will not burden the listserv with the completely inelegant code I have
been working on for the last two days (it consists of attempts to reshape,
merge subsets of records, etc., and has gotten me nowhere). Nor will I
attempt to justify decisions that were made in creating this dataset that
was given to me, but I would greatly appreciate any and all guidance! Thank
you, and apologies in advance if I have explained this poorly. Emily

       
        +--------------------------------------------------------------------------------------------+
        | CLIENT   modality_type        adm1        dis1       cid1     altid1     altid2     altid3 |
        |--------------------------------------------------------------------------------------------|
     1. |      1       Admission   12 Dec 06           .          1          1          .          . |
     2. |      1       Discharge   12 Dec 06   31mar2007          1          1          .          . |
        |--------------------------------------------------------------------------------------------|
     3. |      2       Admission   08 Mar 07           .          2          2      65958      62781 |
     4. |      2       Discharge   08 Mar 07   17mar2007          2          2      65958      62781 |
     5. |      2       Admission   14 Jun 07           .          2          2      65958      62781 |
     6. |      2       Discharge   14 Jun 07   30jun2007          2          2      65958      62781 |
        |--------------------------------------------------------------------------------------------|
     7. |      3       Admission   14 Jul 06           .          3     123171          3          . |
     8. |      3       Discharge   14 Jul 06   03aug2006          3     123171          3          . |
     9. |      3       Admission   02 Oct 06           .          3     123171          3          . |
    10. |      3       Discharge   02 Oct 06   22oct2006          3     123171          3          . |
    11. |      3       Admission   30 Apr 07           .          3     123171          3          . |
    12. |      3       Discharge   30 Apr 07   20may2007          3     123171          3          . |
        |--------------------------------------------------------------------------------------------|
    13. |      4       Admission   07 Mar 07           .         11         11   15262202          . |
    14. |      4       Discharge   07 Mar 07   27mar2007         11         11   15262202          . |
        |--------------------------------------------------------------------------------------------|
    15. |      5       Admission   20 Dec 06           .         13      66234         13     172805 |
    16. |      5       Discharge   20 Dec 06   07jan2007         13      66234         13     172805 |
    17. |      5       Admission   01 Mar 07           .         13      66234         13     172805 |
    18. |      5       Discharge   01 Mar 07   12mar2007         13      66234         13     172805 |
        |--------------------------------------------------------------------------------------------|
    19. |      6       Admission   13 Feb 07           .         31   14997478         31     190690 |
    20. |      6       Discharge   13 Feb 07   06mar2007         31   14997478         31     190690 |
    21. |      6       Admission   22 Jun 10           .         31   14997478         31     190690 |
    22. |      6       Discharge   22 Jun 10   23sep2010         31   14997478         31     190690 |
        +--------------------------------------------------------------------------------------------+

(LATER DATA)

       
        +-------------------------------------------------------------------------------------+
        | CLIENT   mo~y_type        adm1        dis1    cid1     altid1     altid2     altid3 |
        |-------------------------------------------------------------------------------------|
 67821. |  14922   Admission   19 Dec 06           .   65958          2      65958      62781 |
 67822. |  14922   Discharge   19 Dec 06   30jan2007   65958          2      65958      62781 |
 67823. |  14922   Admission   08 Feb 07           .   65958          2      65958      62781 |
 67824. |  14922   Discharge   08 Feb 07   12mar2007   65958          2      65958      62781 |
        +-------------------------------------------------------------------------------------+





