


st: Generating unique values from unique and duplicate cases

From   Tim Morris
Subject   st: Generating unique values from unique and duplicate cases
Date   Mon, 17 Oct 2011 15:49:42 +0100


I have created syntax to check for and label potential duplicate cases across three variables (two text and one numeric). The syntax is as follows:

sort var1 var2 var3
quietly by var1 var2 var3: gen dup = cond(_N==1,0,_n) if var1!=. | var2!="" | var3!=""

This results in a new variable (dup) that might read as follows across the cases: 0, 0, 0, 1, 2, 0, 0 (1 and 2 mark duplicate cases; the rest are unique). What I want to do is create a new variable (id) that assigns a unique id to each unique case and gives each set of corresponding duplicates the same id. Based on the example above, the results would be along the lines of:

dup	id
0 	1
0 	2
0 	3
1 	4
2 	4
0 	5
0 	6
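
For illustration, the table above could probably be produced along the following lines. This is only a sketch under the assumptions stated in this post (var1 numeric, var2 and var3 text, data sorted on all three); the names id, id2 and first are made up for the example, and none of this has been run against the real data.

```stata
* Sketch only: assumes var1 is numeric and var2, var3 are strings,
* as in the dup syntax above.
sort var1 var2 var3

* Option 1: egen's group() numbers each distinct combination of
* var1 var2 var3 as 1, 2, 3, ... so duplicates share one id.
* The missing option also assigns ids to cases with missing values;
* without it, any case with a missing value gets a missing id.
egen id = group(var1 var2 var3), missing

* Option 2: flag the first case in each group, then take a running sum.
* first, id2 are hypothetical names used only for this example.
by var1 var2 var3: gen byte first = (_n == 1)
gen long id2 = sum(first)
drop first
```

With cases ordered as in the example (dup = 0, 0, 0, 1, 2, 0, 0), either option should give the two duplicate cases the same id and every other case its own. Note that, unlike the dup syntax, neither option skips cases where all three variables are missing, so an if qualifier may be needed.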

I have tried various code, searched online, and asked other Stata users for help, but I cannot find a way to make Stata assign a unique value to each 'group' of duplicates. Thanks in advance for any help.

tim morris

Tim Morris, ALSPAC
0117 331 0022

