st: Appending unique cases based on two variables

From   "Logan-Greene, Patricia" <>
To   "''" <>
Subject   st: Appending unique cases based on two variables
Date   Thu, 9 Aug 2012 11:50:49 -0400


I am doing a fairly complicated merge between two sets of data (from criminal court records) that each contain an ID number and dates (along with many other variables). Here's some background:
1. The two files represent a) an assessment, given at approximately the same time as the beginning of probation, and b) discharge records.
2. Each file contains ID numbers that can be used to match individuals across files. An ID number can appear multiple times in each dataset (multiple entries reflect recidivism).
3. The entries are dated: in a) the date is when the assessment was given, and in b) it is the official start date of probation. Although many ID numbers have multiple entries, each combination of ID and date appears only once in each file.
4. Because the dates don't match exactly, we conducted a fuzzy match that paired assessment entries with discharge information (based on the beginning of probation) when the two dates were within 6 weeks of each other.
5. We now need to add the remaining unique cases from the assessment data (which may represent, for example, an incomplete probation). I know how to append unique cases based on a single identifier, but not based on two. Will append even work if there are duplicates for one of the identifiers?
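
To make the structure concrete, here is a toy sketch of the situation in Stata. All variable names (id, assess_date, prob_start) and the data values are made up for illustration and are not from our real files. It assumes the fuzzy-matched file still carries the original assessment date, so that the ID/date pair can serve as a two-variable key. The merge at the end is only one possible way to express what we are after, not something verified on the real data.

```stata
* Toy assessment file: an id can repeat, but each (id, assess_date) pair is unique.
clear
input id str10 adate
1 "2010-01-05"
1 "2011-06-20"
2 "2010-03-15"
3 "2010-09-01"
end
generate assess_date = date(adate, "YMD")
format assess_date %td
drop adate
* isid exits with an error unless the two variables jointly identify the rows
isid id assess_date
tempfile assessment
save `assessment'

* Toy matched file: assessments already paired with discharge records.
clear
input id str10 adate str10 pdate
1 "2010-01-05" "2010-01-20"
2 "2010-03-15" "2010-04-02"
end
generate assess_date = date(adate, "YMD")
generate prob_start  = date(pdate, "YMD")
format assess_date prob_start %td
drop adate pdate
isid id assess_date

* Add assessment rows whose (id, assess_date) pair is not already present.
* Rows found only in the assessment file are flagged with _merge == 2.
merge 1:1 id assess_date using `assessment'
list, sepby(id)
```

In this toy example the second assessment for id 1 and the only assessment for id 3 have no counterpart in the matched file, so those are the rows that should be added.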

Can anyone help?

