Stata The Stata listserver

Re: st: Deleting all observations for individuals with anomalous data


From   Antoine Terracol <Antoine.Terracol@univ-paris1.fr>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Deleting all observations for individuals with anomalous data
Date   Mon, 08 Aug 2005 18:27:55 +0200

On 08/08/2005 17:51, Murray Lowe wrote:
Hi,

I am new to Statalist, hope someone can help me!

I am working with a large dataset and have discovered that some of the
values are missing or erroneous. The data are panel data, with one
observation per individual per year over a 5-year period. For example:

ID    Year    Cost

1     1       100
1     2       200
1     3       500
1     4       150
1     5       x
2     1       100
2     2       200
2     3       500
2     4       600
2     5       100

The problem is this: If an individual has a missing / erroneous value for a
particular year, I want to exclude ALL of their observations from the
dataset. In the example patient 1 would be removed from the dataset
entirely. How can this be done through an automated-type process?
Essentially I need a code / method that looks for the anomalous data;
identifies the patient and then removes all of their observations from the
dataset.

Hope you can help,

Murray Lowe

Hello,

I would try something like:

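* flag anomalous observations (1 = anomalous, 0 = fine)
generate tag = missing(cost)
* count the flagged observations within each individual
egen toberemoved = total(tag), by(ID)
* drop every observation of any individual with at least one flag
drop if toberemoved > 0
drop tag toberemoved


You will need to replace "missing(cost)" in the generate line with a more general condition that tags your erroneous values (such as "missing(cost) | cost>9999"). Note that this assumes cost is stored as a numeric variable.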
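
If the anomalous entries are non-numeric (like the "x" in the example), the variable has probably been read in as a string, and a numeric comparison will fail with a type mismatch. The following is only a sketch under that assumption: it rebuilds the example data with Cost as a string, converts it with destring, and then drops the affected individuals in one step. The names cost and bad are made up for illustration, and the 9999 threshold is just the arbitrary cutoff used above.

```stata
* Rebuild the example panel; Cost is entered as a string because of the "x"
* (an assumption about how the real data arrive)
clear
input ID Year str5 Cost
1 1 "100"
1 2 "200"
1 3 "500"
1 4 "150"
1 5 "x"
2 1 "100"
2 2 "200"
2 3 "500"
2 4 "600"
2 5 "100"
end

* Convert to numeric; force turns non-numeric entries such as "x" into missing
destring Cost, generate(cost) force

* bad = 1 for every observation of an individual with at least one anomaly
bysort ID: egen byte bad = max(missing(cost) | cost > 9999)

* Remove those individuals entirely, then clean up the helper variable
drop if bad
drop bad

list ID Year cost, sepby(ID)
```

On the example data this should leave only the five observations for ID 2. Taking the maximum of a 0/1 indicator within ID does the same job as summing the tag and testing for a total greater than zero; either approach works.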

Best,
Antoine

