Re: st: duplicated studyid when merging files


From   [email protected] (William Gould, Stata)
To   [email protected]
Subject   Re: st: duplicated studyid when merging files
Date   Mon, 05 Dec 2005 08:59:43 -0600

Richard Lenhardt <[email protected]> experienced a surprise when
merging files:

> Sorted both files by studyid. 
> Merged file B into file A.
> Works fine, except that one studyid was duplicated.  _merge variable was 
> "3" for both copies. 

What this means is that there is more than one record with the same 
studyid in file B or in file A.  For instance, if the original duplicate 
was in file B, 

       File A                          File B
       studyid    other vars           studyid   other vars
          2116    4 5 2 9                 2116   90401
                                          2116   90402

    Result:

       studyid     other vars       _merge
          2116     4 5 2 9 90401         3
          2116     4 5 2 9 90402         3

What -merge- did in this case should make sense to you.  File B had two
observations with studyid=2116, so -merge- duplicated the single studyid=2116
observation in File A and then merged.  This can be of great use.  For
instance, one might have a file of persons, and in the person file is recorded
the state in which they live.  One might have another file of state
characteristics.  One could merge the two files by state and then have a file
of persons, with characteristics of states appropriately duplicated.
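As a rough sketch of that person/state merge, assuming hypothetical files
persons.dta and states.dta that both contain a variable named state (none of
these names come from the original question), and using the -merge- syntax
current as of this posting:

```stata
* Hypothetical file and variable names; adjust to your own data.
* The match-merge syntax used here expects both files sorted by the key.
use states, clear
sort state
save states, replace

use persons, clear
sort state

* Each state's single record is repeated for every person living there.
merge state using states

* See how many observations matched (3), or came from only one file (1 or 2).
tabulate _merge
```

In Stata 11 and later, the same match would be written
-merge m:1 state using states-, which states the many-to-one intent
explicitly.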

In most cases, however, the id variable is unique, or supposed to be unique,
and in those cases, I recommend specifying -merge-'s option -unique-.  It
will not fix duplicates, but it will check for them and issue an error
message if it finds any.  If duplicates do turn out to be the cause, then the
question becomes how File B (or File A) ended up with a duplicate observation
when it should not have.
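
One way to track the duplicate down, sketched on made-up data that mirrors
the example above.  The file names fileA and fileB and the variable names
a, b, c, d, and e are assumptions for illustration only, and the -merge-
line uses the syntax current as of this posting:

```stata
* Build two small hypothetical files like the ones in the example.
clear
input studyid a b c d
2116 4 5 2 9
end
sort studyid
save fileA, replace

clear
input studyid e
2116 90401
2116 90402
end
sort studyid
save fileB, replace

* With fileB still in memory, look for repeated ids.
duplicates report studyid
duplicates list studyid

* isid complains if studyid does not uniquely identify the observations;
* capture noisily shows the message without halting the do-file.
capture noisily isid studyid

* Merge with the check turned on.  Because fileB repeats studyid 2116,
* -merge- should stop here with an error rather than duplicate the record.
use fileA, clear
merge studyid using fileB, unique
```

Running the -duplicates- and -isid- lines on each file in turn shows which
one holds the repeated studyid.  In Stata 11 and later, the equivalent
check is built into -merge 1:1 studyid using fileB-.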

-- Bill
[email protected]