
RE: st: Deleting all observations for individuals with anomalous data


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Deleting all observations for individuals with anomalous data
Date   Tue, 9 Aug 2005 22:38:59 +0100

Another way of doing this, without any new 
variables: 

bysort ID (Cost) : drop if missing(Cost[_N]) 
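To see why the one-liner above works, here is a minimal sketch that rebuilds the example panel from the original question and applies it. It assumes Cost is a numeric variable and that the "x" entry is stored as Stata's system missing value (.); if Cost were a string variable the sort order would differ and this would not apply as written.

```stata
* Rebuild the example panel from the original question.
* Assumption: the "x" entry is stored as numeric missing (.).
clear
input ID Year Cost
1 1 100
1 2 200
1 3 500
1 4 150
1 5 .
2 1 100
2 2 200
2 3 500
2 4 600
2 5 100
end

* Within each ID, sort by Cost. Numeric missing values sort last,
* so Cost[_N] is missing exactly when that ID has any missing Cost,
* and the condition is then true for every observation of that ID.
bysort ID (Cost) : drop if missing(Cost[_N])

* The command leaves the data sorted by ID and Cost,
* so restore the panel order before listing what remains.
sort ID Year
list, sepby(ID)
```

Only the five observations for ID 2 should remain.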

Nick 
n.j.cox@durham.ac.uk 

Antoine Terracol

> I would try something like :
> 
> generate tag=(cost==.)
> egen toberemoved=sum(tag), by(ID)
> drop if toberemoved>0
> drop tag toberemoved
> 
> 
> You will need to replace the "cost==." in the first line with a more
> general condition that tags your erroneous values (such as
> "cost==. | cost>9999").
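The tagging approach quoted above can be sketched end to end on a small made-up panel. This is only an illustration under stated assumptions: the variable is named Cost as in the example data (Stata names are case-sensitive), the 9999 ceiling is the illustrative threshold from the message, and the names bad and anybad and the data values are hypothetical. It uses egen's max() in place of sum(); either spreads the flag to every observation of an ID.

```stata
* Hypothetical panel: ID 1 has a missing Cost, ID 3 has an implausible one.
clear
input ID Year Cost
1 1 100
1 2 .
2 1 100
2 2 200
3 1 50
3 2 123456
end

* Flag observations that are missing or above the illustrative ceiling.
* (Missing already counts as larger than any number in Stata, but
* spelling out missing() makes the intent explicit.)
generate byte bad = missing(Cost) | Cost > 9999

* anybad is 1 for every observation of an ID with at least one flagged row.
bysort ID : egen byte anybad = max(bad)

* Drop whole individuals, then remove the helper variables.
drop if anybad
drop bad anybad

list, sepby(ID)
```

Only the two observations for ID 2 should remain.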
 
Murray Lowe 

> > I am working with a large dataset and have discovered that some of the
> > data are missing values or have erroneous values. The data is panel data
> > with observations per individual over a 5 year period. For example:
> > 
> > ID	Year	Cost
> > 
> > 1	1	100
> > 1	2	200	
> > 1	3	500
> > 1	4	150
> > 1	5	x
> > 2	1	100	
> > 2	2	200	
> > 2	3	500
> > 2	4	600
> > 2	5	100
> > 
> > The problem is this: If an individual has a missing / erroneous value
> > for a particular year, I want to exclude ALL of their observations from
> > the dataset. In the example patient 1 would be removed from the dataset
> > entirely. How can this be done through an automated-type process?
> > Essentially I need a code / method that looks for the anomalous data;
> > identifies the patient and then removes all of their observations from
> > the dataset.



