



st: RE: Dropping Duplicates that Aren't Exactly Duplicates


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Dropping Duplicates that Aren't Exactly Duplicates
Date   Wed, 2 Nov 2011 18:32:07 +0000

In general, you are in charge. You get to define what counts as a duplicate you want to drop. 

Also, you can drop duplicates using any syntax you want that does the job. 

The -duplicates- command is the way of dealing with duplicates with which I am most familiar. I think you want

duplicates drop id violation, force 
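
Note that this keeps the first row for each id-violation pair, so it also removes a repeated violation from a later arrest whose other violations differ. If the aim is to drop a later arrest event only when its whole set of violations matches an earlier one, a minimal sketch follows. It assumes that arrdate is a numeric Stata date (so it sorts chronologically), that violation is a string variable, and that no violation contains the "|" separator; sig is a hypothetical helper variable, not something in the original data.

```stata
* Assumptions: arrdate is a numeric date, violation is a string,
* "|" never occurs in violation; sig is a hypothetical helper name.

* Inspect first: how many rows repeat an id-violation pair?
duplicates report id violation

* Build one signature per arrest event: its violations, sorted and joined.
sort id arrdate violation
by id arrdate: gen sig = violation if _n == 1
by id arrdate: replace sig = sig[_n-1] + "|" + violation if _n > 1
by id arrdate: replace sig = sig[_N]

* Within each person, keep only the earliest event carrying each signature.
bysort id sig (arrdate): drop if arrdate != arrdate[1]
drop sig
```

In the example posted, the 11mar2004 and 13jan2005 events would get the same signature, so all rows of the 13jan2005 event would be dropped and the 27feb2009 event would be kept.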

Nick 
n.j.cox@durham.ac.uk 

Lisa Chavez

I have data in long file format that has three variables:  id, arrdate 
and violation.

Below is an example of a person who has three arrest events (I have 
separated them with lines).

Looking at the first two arrest dates (11mar2004 and 13jan2005) you see 
that each arrest has three violations and they are exactly the same.

I have lots of examples like this one;  in all instances I want to drop 
the last arrest event where this duplication occurs.

In the case below, I would want to drop all rows associated with the 
13jan2005 arrest event.

I'd appreciate any help you can offer.

Thanks!

Lisa

+-----------------------------------------------------------------------------------+
id         arrdate                                                          violation
-------------------------------------------------------------------------------------
A0000518   11mar2004                                 Cocaine-Possess  Possess Cocaine
A0000518   11mar2004   Nonmoving Traffic Viol  Drive While Lic Susp Habitual Offender
A0000518   11mar2004                    Traffic Offense  Dui Alcohol Or Drugs 1St Off
-------------------------------------------------------------------------------------
A0000518   13jan2005                                 Cocaine-Possess  Possess Cocaine
A0000518   13jan2005   Nonmoving Traffic Viol  Drive While Lic Susp Habitual Offender
A0000518   13jan2005                    Traffic Offense  Dui Alcohol Or Drugs 1St Off
-------------------------------------------------------------------------------------
A0000518   27feb2009                                   Hallucinogen-Sell  Schedule Ii
+-----------------------------------------------------------------------------------+


