Re: st: Collapsing with strings


From   wgould@stata.com (William Gould, Stata)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Collapsing with strings
Date   Mon, 16 Jan 2006 08:05:45 -0600

Daphna Bassok asked, 

> I have several duplicate observations in my data set.  However, they are 
> not perfect duplicates.  Only the id # is the same. So there might be 
> two observations with id#16 for instance, the first will have values for 
> some variables, and missing values for others. The second also have some 
> values filled and some missing.  There are no cases in which both have 
> values- that is... either the first in the pair has the value OR the 
> second has a value (or neither).

Daphna thought about solving this problem with -collapse- and there have been 
lots of answers and advice.  I want to suggest a different approach:  -merge-.

Let's call Daphna's original dataset master.dta.  What I suggest is this:

    1.  From master.dta, copy the nonduplicates into nondups.dta.

    2.  From master.dta, copy the duplicates into dups.dta.

    3.  From dups.dta, copy the first occurrence to first.dta.

    4.  From dups.dta, copy the second occurrence to second.dta.

    5.  Merge first.dta and second.dta.  That will replace missing 
        values where they exist.

    6.  Take the result from (5), and append it to nondups.dta.

So here's the code:

        . use master, clear 
        . sort id
        . save, replace

        . by id:  keep if _N==1
        . save nondups

        . use master, clear 
        . by id:  keep if _N==2
        . sort id
        . save dups

        . by id: keep if _n==1
        . sort id
        . save first

        . use dups, clear 
        . by id: keep if _n==2
        . sort id
        . save second

        . use first, clear
        . merge id using second, update
        . assert _merge==3 | _merge==4
        . drop _merge 

        . append using nondups
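
For readers on Stata 11 or later, where -merge- expects a match type such
as 1:1, the same six steps might be sketched as a do-file along the lines
below.  This is only a sketch under stated assumptions: the toy data, the
variable names x and y, and the use of tempfiles in place of saved .dta
files are made up for illustration and are not part of the original
question.  It relies on the -update- option of -merge-, which fills missing
values in the data in memory from the using data and records _merge==4
when it does so.

        * Toy stand-in for master.dta; ids and values are hypothetical
        clear
        input id x y
        16 1 .
        16 . 2
        17 3 4
        end
        sort id
        tempfile nondups dups second

        * Steps 1-2: split into nonduplicates and duplicate pairs
        preserve
        by id: keep if _N==1
        save `nondups'
        restore
        by id: keep if _N==2
        save `dups'

        * Steps 3-4: save the second occurrence, keep the first in memory
        by id: keep if _n==2
        save `second'
        use `dups', clear
        by id: keep if _n==1

        * Step 5: -update- fills missing values from the using data
        merge 1:1 id using `second', update
        * _merge==5 would mean both copies had conflicting nonmissing values
        assert inlist(_merge, 3, 4)
        drop _merge

        * Step 6: add back the nonduplicates
        append using `nondups'
        sort id
        list

With this toy data, id 16 should end up as a single observation with x = 1
and y = 2, and id 17 should pass through unchanged.  Like the steps above,
the sketch assumes an id appears at most twice; ids with three or more
copies are dropped by the _N==2 filter and would need separate handling.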

-- Bill
wgould@stata.com