
st: cleaning a specific data structure


From   "Ban,R (pgt)" <R.Ban@lse.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: cleaning a specific data structure
Date   Fri, 21 Nov 2003 12:40:43 -0000

Dear all,
The data are organized like this (the numbers are made up for this description):
 
id dummy descriptor
13 1 <blank>
13 0 abc
13 1 <blank>
14 0 <blank>
14 0 def
14 0 def

The idea is that the id variable should be unique, but for some reason it is not.
This means that both the dummy and the descriptor should have the same value within
each id group. A complication is that for the dummy, if there is a "1" anywhere in a
group, the whole group should be "1".
 
I want to reduce this to a clean version which looks like this:
 
id dummy descriptor
13 1 abc
14 0 def
 
For the dummy part I dealt with it like this (probably a convoluted method):
bysort id: egen maxdummy = max(dummy)
replace dummy = maxdummy
bysort id: keep if _n == 1
 
But I am a bit stuck on how to deal with the string descriptor. I know one way of
doing it, by splitting the data and then merging it back, but there has to be a
more efficient way.
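
One possible approach, offered only as a sketch: sort the descriptor within each id so that blanks come first, then copy the last value to every row in the group. This assumes that <blank> means an empty string (""), that descriptor is a string variable, and that each id has at most one distinct non-blank descriptor. The input block just recreates the made-up example above so the snippet can be run on its own.

```stata
* Recreate the made-up example data (blanks entered as empty strings)
clear
input id dummy str3 descriptor
13 1 ""
13 0 "abc"
13 1 ""
14 0 ""
14 0 "def"
14 0 "def"
end

* Empty strings sort before any non-empty string, so within each id
* the last observation holds the non-blank descriptor (if there is one)
bysort id (descriptor): replace descriptor = descriptor[_N]

* Dummy: if any row in the group is 1, the whole group becomes 1
bysort id: egen maxdummy = max(dummy)
replace dummy = maxdummy
drop maxdummy

* Keep one row per id
bysort id: keep if _n == 1
list
```

If an id could carry two different non-blank descriptors, this would silently keep the one that sorts last, so it may be worth checking for that case before dropping rows.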
 
I would appreciate any help on this.
Radu Ban
