Stata The Stata listserver

Re: st: Detecting Duplicate Records


From   Devra Golbe <devra.golbe@hunter.cuny.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Detecting Duplicate Records
Date   Wed, 14 Aug 2002 14:39:32 -0400

Your program may well work.  An easier solution can be found by typing

-findit dups-
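
A rough sketch of the same check, for readers whose Stata release includes the official -duplicates- and -isid- commands (an assumption; this thread itself only points to the user-written -dups-). The variable names hhid, persons_id and V1-V20 are taken from the question quoted below, and ndup is a hypothetical name for the new flag variable.

```stata
* Sketch only: assumes the official -duplicates- and -isid- commands are
* available and that the data hold hhid, persons_id and V1-V20 as in the question.

* Is persons_id unique within household? -isid- exits with an error if it is not;
* -capture noisily- shows the message without stopping a do-file.
capture noisily isid hhid persons_id

* Count records that agree on the household ID and on all 20 variables.
duplicates report hhid V1-V20

* Flag those records for inspection before deleting anything.
* ndup (hypothetical name) holds the number of other records identical to this one.
duplicates tag hhid V1-V20, generate(ndup)
sort hhid persons_id
list hhid persons_id if ndup > 0

* Once checked, keep one copy of each duplicated record.
* -force- is required because a varlist is specified.
duplicates drop hhid V1-V20, force
```

Unlike a comparison of neighbouring rows, this does not depend on the sort order, so duplicates are caught even when the two copies carry different persons_id values and are not adjacent.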


At 02:22 PM 8/14/02, you wrote:
Dear Stata-users,

I have a household roster data file with about 20 variables measured on household members. I suspect that persons_id is not unique within a household. Is there a way to "mass-check" all 20 variables across members of the same household to detect duplicate records? I thought of the following:

sort hhid persons_id

for var V1-V20: gen DX=X[_n]==X[_n-1]

egen DSUM=rsum(DV1-DV20)

drop if DSUM==20 & hhid==hhid[_n-1]


Does that make sense?

Many thanks for your thoughts in advance...

Amani



*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*


