st: How to merge datasets when there are missing values in the matching variables


From   shihying yao <berkeley.yao@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: How to merge datasets when there are missing values in the matching variables
Date   Sat, 21 Jan 2012 14:55:54 -0500

Hi there,
I am trying to merge two data files using two unique ID variables, ID1
and ID2. Note that not every subject has both ID1 and ID2 recorded in
both files. Suppose the data files are named "master" and "subset".
The code I used resembles the following:

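* sort the using file on both ID variables and save it
use subset, clear
sort ID1 ID2
save subset, replace

* sort the master file the same way, then match-merge on ID1 and ID2
use master, clear
sort ID1 ID2
merge ID1 ID2 using subset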

The problem occurs for subjects whose ID1 is missing in one of the two
data files (it can be either one). Although these subjects can be
uniquely identified by ID2 in both files, their records are not
matched, so the merged file contains two records for each of them: one
with both ID1 and ID2, and another with ID2 but with ID1 missing.
Changing the sort order (ID1 first or ID2 first) does not help, and I
cannot simply merge on ID2 alone, since some subjects have ID2 in only
one file.
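
To make the issue concrete, here is a minimal sketch with made-up data.
The file name toy_subset, the variables x and y, and all values are
hypothetical and only stand in for my real files. It uses the same
Stata 10 merge syntax as above:

* hypothetical toy data, not the real files
* using file: the second subject has ID1 missing
clear
input ID1 ID2 x
1 10 100
. 20 200
end
sort ID1 ID2
save toy_subset, replace

* master file: the same subject (ID2 = 20) has ID1 = 2
clear
input ID1 ID2 y
1 10 5
2 20 6
end
sort ID1 ID2
merge ID1 ID2 using toy_subset
list ID1 ID2 x y _merge

In this sketch the subject with ID2 = 20 has ID1 = 2 in one file and
ID1 missing in the other, so I would expect it to show up twice after
the merge, once from each file, rather than as a single matched record.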

I am using Stata 10. Any help is appreciated.

Best,
Shihying