st: RE: Duplicate combinations of variables

From   "Nick Cox" <>
To   <>
Subject   st: RE: Duplicate combinations of variables
Date   Thu, 25 Mar 2010 15:19:06 -0000

Assuming that your identifiers are numeric, as implied here, then 

gen minID = min(CaseID, ControlID)
gen maxID = max(CaseID, ControlID) 
duplicates report minID maxID 
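
As a minimal sketch of why this works, here is a toy dataset (the ID values are made up for illustration, apart from the 123/253 pair taken from the question). Whichever way round a pair is entered, the smaller ID lands in minID and the larger in maxID, so reversed pairs become exact duplicates on those two variables:

```stata
* Toy data: rows 1 and 2 are the same pair entered in opposite order
clear
input CaseID ControlID
123 253
253 123
123 301
410 253
end

* Order-free representation of each pair
* (note that min() and max() ignore missing arguments)
gen minID = min(CaseID, ControlID)
gen maxID = max(CaseID, ControlID)

* Reports how many observations occur once, twice, etc. on (minID, maxID)
duplicates report minID maxID
```

With these four rows, the pair (123, 253) should be reported as occurring twice and the other two pairs once each.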

and this check will do no harm: 

assert CaseID != ControlID 
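
If the report does show duplicates, or the assert fails, you will want to see the offending observations. One possible way, sketched here on the same kind of made-up toy data, is to tag the duplicates and list them; the variable name dup is just an illustrative choice:

```stata
* Toy data: one reversed pair and one self-matched subject
clear
input CaseID ControlID
123 253
253 123
123 301
555 555
end

gen minID = min(CaseID, ControlID)
gen maxID = max(CaseID, ControlID)

* dup = number of other observations sharing the same (minID, maxID)
duplicates tag minID maxID, generate(dup)

* Show pairs that occur more than once, grouped together
sort minID maxID
list CaseID ControlID if dup > 0, sepby(minID maxID)

* Show any subject matched to itself (what the assert guards against)
list CaseID ControlID if CaseID == ControlID
```

Unlike assert, list does not stop a do-file with an error, so this is better suited to inspection than to enforcing the rule.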

For more on the min-max trick here, see e.g. 

SJ-8-4  dm0043  . Tip 71: The problem of split identity, or how to group
        . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q4/08   SJ 8(4):588--591                                 (no
        tip on how to handle dyadic identifiers


Miranda Kim

I am a Stata 11 user, and I wonder if anyone has advice on achieving the

I have two variables, one with ID numbers for cases and one with ID 
numbers for matched controls in a case-control study. There are 3 
controls matched to each case, and a subject may serve as a control for 
more than one case or later become a case. I want to check that in 
my dataset all the pairs of cases and their matched controls are unique, 
i.e. that I don't have, for example:
Case ID   Control ID
123       253
253       123
