
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Deleting Duplicates based on criteria


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: Deleting Duplicates based on criteria
Date   Thu, 11 Jul 2013 15:56:45 +0100

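* Rank each observation by its charge: a higher value means a more serious charge.
gen last = .

* Placeholder names for the 13 charge dummies, listed from most to least serious.
* The list must form one command; /// continues the line in a do-file.
tokenize homicide sex robbery assault trafficking burglary larceny ///
    motor sales weapon DUI possession other

* Local j is the position in the list; the nested macro reference yields
* the variable name stored at that position by tokenize.
qui forval j = 1/13 {
          replace last = 14 - `j'  if ``j''  == 1
}

* Within each id, sort by rank and keep only the highest-ranked observation.
bysort id (last) : keep if _n == _N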
Nick
[email protected]
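
A minimal sketch of the same approach applied to the five example observations quoted below, using only the three dummies shown there. The variable name casenum is an assumption made for illustration, and the ranking covers just homicide, robbery and burglary, in that order.

clear
input str13 casenum id robberydummy burglarydummy homicidedummy
"000000038CFMA" 6 1 0 0
"000000038CFMA" 6 1 0 0
"000000038CFMA" 6 0 1 0
"000000045CFMA" 8 1 0 0
"000000045CFMA" 8 0 0 1
end

* Rank: 3 = homicide, 2 = robbery, 1 = burglary (higher is more serious).
gen last = .
tokenize homicidedummy robberydummy burglarydummy
quietly forvalues j = 1/3 {
    replace last = 4 - `j' if ``j'' == 1
}

* Keep the highest-ranked observation within each id.
bysort id (last) : keep if _n == _N

* One observation per id should remain.
assert _N == 2
list casenum id robberydummy burglarydummy homicidedummy last, noobs

This keeps one of the robbery observations for id 6 and the homicide observation for id 8. Note that missing values sort after all numbers in Stata, so an observation with none of the dummies set would sort last within its id and be the one kept; that does not arise here because every observation has a charge.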


On 11 July 2013 15:43, Dirlam, Jonathan C. <[email protected]> wrote:
> Highest charge determined by this order: 1. Homicide, 2. Sex offense, 3. Robbery, 4. Agg Assault, 5. Drug Trafficking, 6. Burglary, 7. Larceny Theft, 8. Motor Vehicle Theft, 9. Drug Sales, 10. Weapon, 11. DUI, 12. Drug Possession, 13. Other
>
> Example of data with 3 of 13 dummies:
> Court case number   id   robberydummy   burglarydummy   homicidedummy
> 000000038CFMA       6    1              0               0
> 000000038CFMA       6    1              0               0
> 000000038CFMA       6    0              1               0
> 000000045CFMA       8    1              0               0
> 000000045CFMA       8    0              0               1
>
> In this example, I want one of the robbery observations for id=6 and the homicide observation for id=8.
> Thanks.
>
> ________________________________________
> From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
> Sent: Thursday, July 11, 2013 10:23 AM
> To: [email protected]
> Subject: Re: st: Deleting Duplicates based on criteria
>
> Yes,  but tell us the rules for determining the highest charge and
> give us a realistic example of a block of observations for some court
> case. (Need not be real, just realistic.)
> Nick
> [email protected]
>
>
> On 11 July 2013 15:18, Dirlam, Jonathan C. <[email protected]> wrote:
>> Dear Statalist,
>> I have duplicate observations where the duplicates are the same court case number. I want to eliminate all the observations for a court case except for the observation that has the highest charge (homicide, robbery, etc.) I have 12 dummy variables that capture charges and used the duplicates command to get unique ids for each court case number. Is there a way to write a program that eliminates or keeps duplicates based on criteria you give it (Example, homicidedummy==1) and stops once all but one observation are eliminated?
>> Thanks.
>>
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
>


