The Stata listserver

st: analogue of NODUPKEY


From   Dan Blanchette <dan_blanchette@unc.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: analogue of NODUPKEY
Date   Tue, 30 Dec 2003 10:08:46 -0500 (EST)

The closest thing I know of is the "by" prefix.
If you include the variables with repeating information
in the "by" list, then you are assured that the observations
within each group are duplicates on those variables.

.  by id a b: keep if _n==1

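will keep the first observation when there are multiple
observations with the same combination of id, a and b.
Note that "by" requires the data to be sorted by id, a and b
already (the example data below are).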

Data:

id	a	b	c
100	1	3	4
100	1	3	2
100	1	3	1
102	3	4	8
102	3	4	9
102	3	4	1
102	3	4	2

.  by id a b: keep if _n==1
.  list

returns:

id	a	b	c
100	1	3	4
102	3	4	8
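
For anyone who wants to try this from scratch, here is a minimal sketch
that rebuilds the example data above and applies the same "by" step.
The explicit "sort ..., stable" line is my addition, on the assumption
that your real data may not already be in id/a/b order; "stable" asks
Stata to keep the original order among tied observations, so "first"
still means first in the file.

```stata
* Rebuild the example data shown above
clear
input id a b c
100 1 3 4
100 1 3 2
100 1 3 1
102 3 4 8
102 3 4 9
102 3 4 1
102 3 4 2
end

* Sort on the key variables; "stable" preserves the original
* order of observations within each (id, a, b) group
sort id a b, stable

* Keep only the first observation in each (id, a, b) group
by id a b: keep if _n == 1

list
```

If your release of Stata includes the "duplicates" command, then
"duplicates drop id a b, force" should do much the same job in one line;
check "help duplicates" before relying on it.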

Here's a webpage for a longer explanation:
http://www.cpc.unc.edu/services/computer/presentations/statatutorial/example21.html

dan

carolina population center, unc-ch
dan_blanchette@unc.edu


> Dear Statalisters,
>
> I am looking for a Stata analogue of a SAS procedure for a certain type of
> duplicate removal.  Suppose a dataset has fields A-J. For all subsets of
> records for which fields A-C are identical, I wish to keep only the first
> record and discard the rest, keeping all fields of the retained records.
> What is the simplest way to do this with Stata commands?
>
> Thanks very much in advance.
>
> Howard Burkom



