The Stata listserver

st: Detecting Duplicate Records


From   "Siyam,AA (pgr)" <[email protected]>
To   <[email protected]>
Subject   st: Detecting Duplicate Records
Date   Wed, 14 Aug 2002 19:22:16 +0100

Dear  Stata-users,

I have a household roster data file with about 20 variables measured on household members.  I suspect that persons_id is not unique within a household.  Is there a way to "mass-check" all 20 variables across members of the same household to detect duplicate records?  I thought of the following:

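* put records with the same household and person id next to each other
sort hhid persons_id

* DVk is 1 when Vk equals its value in the previous record, 0 otherwise
for var V1-V20: gen byte DX = X==X[_n-1]

* number of variables (out of 20) that match the previous record
egen DSUM = rsum(DV1-DV20)

* drop a record that matches the previous record in the same household on all 20 variables
drop if DSUM==20 & hhid==hhid[_n-1]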


Does that make sense?
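
For comparison, here is a minimal sketch of another way to approach the same check. It is not a tested solution: it assumes the 20 variables really are named V1-V20, that hhid and persons_id are as above, and that bysort is available in the Stata version in use. The variable names idrep and isdup are made up for illustration. Unlike a comparison of adjacent records, grouping on all the variables at once would also catch duplicates that carry different persons_id values and so do not sort next to each other.

```stata
* Sketch only: V1-V20, hhid and persons_id are assumed names;
* idrep and isdup are hypothetical helper variables.

* 1) Is persons_id repeated within a household?
*    idrep is 1 for every record whose (hhid, persons_id) pair occurs more than once.
bysort hhid persons_id: gen byte idrep = _N > 1
count if idrep

* 2) Which records are exact copies on all 20 variables within a household?
*    Records are grouped on hhid and V1-V20; isdup is 1 for every record
*    after the first in its group.
bysort hhid V1-V20: gen byte isdup = _n > 1
count if isdup

* Look at the flagged records before removing anything.
list hhid persons_id if isdup

* Keep one copy of each duplicated record, then tidy up.
drop if isdup
drop idrep isdup
```

Keeping the two flags separate distinguishes a repeated persons_id attached to genuinely different people from a record that was simply entered twice.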

Many thanks for your thoughts in advance...

Amani


