From | Gary Longton <[email protected]> |
To | [email protected] |
Subject | Re: st: Collapsing with strings |
Date | Fri, 13 Jan 2006 11:57:53 -0800 |
Daphna Bassok wrote:
I have several duplicate observations in my data set. However, they are not perfect duplicates. Only the id # is the same. So there might be two observations with id#16 for instance, the first will have values for some variables, and missing values for others. The second also has some values filled and some missing. There are no cases in which both have values; that is, either the first in the pair has the value OR the second has a value (or neither).
For example: suppose I have two observations with id# 16. The first has values for var1 and var2 and not var3. The second ONLY has values for var3. What I would like to do is simply collapse these into a single observation with all the relevant info, meaning one observation with id#16 that has values for all three variables.
I am trying to do this with the collapse command with no success.
My code is:
collapse (min) var1-var3, by(id)
I thought this would create a new observation that has all the data in it.
I am getting a "type mismatch" error.
Is this because some of my variables are string variables?
Nick Cox suggested:
What you can do is -- if your description is correct --
egen nmiss = rowmiss(<insert variable names>)
bysort id (nmiss) : keep if _n == 1
as the sort will sort the observation with more missings to second place.
and Austin Nichols suggested:
It is a rare day when one can make a correction to a typically accurate and elegant Nick Cox solution, so I make this one fearing that I've probably missed something obvious.
foreach v of varlist put all the relevant varnames here {
    bys id (`v'): qui replace `v'=`v'[_n-1] if mi(`v')
}
bys id: drop if _n>1
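As a rough illustration of the fill-then-drop idea on a data set with both numeric and string variables, here is a small self-contained sketch. The toy data and the variable names x and s are made up for the example. It assumes, as in the original description, that each variable has at most one non-missing value per id. One detail it tries to allow for: numeric missing (.) sorts last, but an empty string sorts first, so the sketch fills numerics from the first sorted record and strings from the last.

```stata
* Toy data (hypothetical): two partial records per id,
* one numeric variable (x) and one string variable (s).
clear
input id x str5 s
16 1 ""
16 . "abc"
17 . "def"
17 2 ""
end

* Fill each variable within id from whichever record holds a value.
* Numeric missing sorts last within id; an empty string sorts first.
foreach v of varlist x s {
    capture confirm string variable `v'
    if _rc == 0 {
        * string: the non-empty value, if any, is in the last record
        bysort id (`v'): replace `v' = `v'[_N] if missing(`v')
    }
    else {
        * numeric: the non-missing value, if any, is in the first record
        bysort id (`v'): replace `v' = `v'[1] if missing(`v')
    }
}

* Records within an id now agree on x and s, so keep one per id.
bysort id: keep if _n == 1
list
```

With real data, the variable list after -foreach v of varlist- would be replaced by the variables to be combined; if an id has no value at all for some variable, that variable simply stays missing.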