
st: Detecting Duplicate Records

From	"Siyam,AA (pgr)" <[email protected]>
To	<[email protected]>
Subject	st: Detecting Duplicate Records
Date	Wed, 14 Aug 2002 19:22:16 +0100

Dear Stata-users,

I have a household roster data file which consists of about 20 variables measured on household members.  I suspect that persons_id is not unique within a household.  Is there a way I can "mass-check" all 20 variables between members of the same household to identify duplicate records?  I thought of the following:

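sort hhid persons_id

* DV1-DV20 are 1 where a variable equals its value in the previous record
for var V1-V20: gen DX=X[_n]==X[_n-1]

* count how many of the 20 variables match the previous record
egen DSUM=rsum(DV1-DV20)

* drop records that repeat the previous member of the same household
quietly by hhid: drop if DSUM==20 & _n>1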
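
The approach above only catches duplicates that end up next to each other after sorting on persons_id. A minimal sketch of one alternative, assuming the variables really are named V1 through V20 and that a duplicate means two records in the same household agreeing on all 20 variables whatever their persons_id: sort on hhid and every variable, then flag each record after the first in a group of identical records. The variable name dup is hypothetical.

```stata
* put identical records within a household next to each other
sort hhid V1-V20

* dup = 1 for every record after the first in a group of identical records
quietly by hhid V1-V20: gen byte dup = _n > 1

* inspect the flagged records before deciding to drop them
list hhid persons_id if dup

drop if dup
```

Listing the flagged records first makes it possible to check whether they are true duplicates or distinct persons who happen to share all 20 values.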


Does that make sense?

Many thanks for your thoughts in advance...

Amani




Follow-Ups:
- Re: st: Detecting Duplicate Records
  - From: Sarah Mustillo <[email protected]>
- Re: st: Detecting Duplicate Records
  - From: Devra Golbe <[email protected]>
- st: Re: Detecting Duplicate Records
  - From: "Marcela Perticara" <[email protected]>
