



st: RE: Duplicate combinations of variables


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Duplicate combinations of variables
Date   Thu, 25 Mar 2010 15:19:06 -0000

Assuming that your identifiers are numeric, as implied here, then 

gen minID = min(CaseID, ControlID)
gen maxID = max(CaseID, ControlID) 
duplicates report minID maxID 

and this check will do no harm: 

assert CaseID != ControlID 
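
As a minimal sketch of the min-max trick on made-up data (the first two rows are the pair from the question; the IDs 301 and 400 and the variable name dup are invented purely for illustration), something like the following should flag and list the repeated pairs rather than just count them:

```stata
* Toy data: the second row repeats the first pair with the roles reversed
clear
input CaseID ControlID
123 253
253 123
123 301
400 253
end

* Order each pair so that (123, 253) and (253, 123) look identical
gen minID = min(CaseID, ControlID)
gen maxID = max(CaseID, ControlID)

* dup counts the other observations sharing the same unordered pair
* (0 means the pair is unique)
duplicates tag minID maxID, generate(dup)

* Show the offending observations with their original roles intact
list CaseID ControlID if dup > 0
```

Because minID and maxID are extra variables, CaseID and ControlID are left untouched, so you can still see which subject was the case in each repeated pair.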

For more on the min-max trick here, see e.g. 

SJ-8-4  dm0043  . Tip 71: The problem of split identity, or how to group dyads
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        Q4/08   SJ 8(4):588--591                                 (no commands)
        tip on how to handle dyadic identifiers

Nick 
n.j.cox@durham.ac.uk 

Miranda Kim

I am a Stata 11 user, and I wonder if anyone has advice on achieving the
following:
I have two variables, one with ID numbers for cases and one with ID 
numbers for matched controls in a case-control study. There are 3 
controls matched to each case, and a subject may serve as a control for 
more than one case or later become a case. I want to check that in
my dataset all the pairs of cases and their matched controls are unique; 
i.e. that I don't have, for example:
Case ID   Control ID
123       253
253       123


