The Stata listserver

Re: st: Deleting all observations for individuals with anomalous data

From   Christian Holz
Subject   Re: st: Deleting all observations for individuals with anomalous data
Date   Tue, 09 Aug 2005 23:00:52 +0100

I think, however, that Nick's approach does not work if a value for year 5 is present and another year has a missing value, as Nick's command only checks the last observation of each ID group.
I might be wrong, but in case I am not, it's worth mentioning...
Christian Holz
Department of Sociology
University of Glasgow
Scotland, U.K.

Nick Cox wrote:

Another way of doing this, without any new variables:
bysort ID (Cost) : drop if missing(Cost[_N])
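
On the question of which observation this command checks: the parentheses in `bysort ID (Cost)` sort the observations by Cost within each ID, and Stata sorts numeric missing values after all nonmissing values. So `Cost[_N]` is the largest Cost in the group, not the year 5 value, and it is missing whenever any year is missing. A minimal sketch to check this, assuming a numeric Cost and using made-up data in which ID 1 is missing Cost in year 2 rather than year 5:

```stata
* Toy panel (illustrative values): ID 1 has a missing Cost in year 2
clear
input ID Year Cost
1 1 100
1 2 .
1 3 500
1 4 150
1 5 120
2 1 100
2 2 200
2 3 500
2 4 600
2 5 100
end

* Sort by Cost within ID; a missing value, if any, ends up in position _N
bysort ID (Cost) : drop if missing(Cost[_N])

* The command leaves the data sorted by ID and Cost, so restore panel order
sort ID Year
list, sepby(ID)
```

Only the five observations for ID 2 should remain. Note that this only catches missing values; erroneous but nonmissing values need a separate flag.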
Antoine Terracol wrote:

I would try something like:

generate tag=(Cost==.)
egen toberemoved=total(tag), by(ID)
drop if toberemoved>0
drop tag toberemoved

You will need to replace the "Cost==." in the first line with a more general way to tag your erroneous values (such as "Cost==. | Cost>9999").
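
The generalization suggested above could be sketched as follows. This is only an illustration: it assumes a numeric variable named Cost as in the example data, the 9999 cutoff is just the illustrative threshold from the suggestion, and the variable names bad and anybad are made up here.

```stata
* Flag each anomalous row: missing (including .a-.z) or implausibly large
* Note: Stata treats missing as larger than any number, so Cost > 9999
* is already true for missing values; missing() is kept for readability
generate byte bad = missing(Cost) | Cost > 9999

* Spread the flag to every observation of the same ID
bysort ID : egen byte anybad = max(bad)

* Drop all observations of any ID with at least one flagged row
drop if anybad
drop bad anybad
```

Taking the group maximum of a 0/1 flag answers "does this ID have any bad row?" directly, which is equivalent to checking that the group total is greater than zero.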
Murray Lowe wrote:

I am working with a large dataset and have discovered that some of the data are missing values or have erroneous values. The data is panel data with observations per individual over a 5 year period. For example:

ID Year Cost

1 1 100
1 2 200
1 3 500
1 4 150
1 5 x
2 1 100
2 2 200
2 3 500
2 4 600
2 5 100

The problem is this: If an individual has a missing / erroneous value for a particular year, I want to exclude ALL of their observations from the dataset. In the example patient 1 would be removed from the dataset entirely. How can this be done through an automated-type process? Essentially I need a code / method that looks for the anomalous data; identifies the patient and then removes all of their observations from the dataset.
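
If the anomalous entries are literally non-numeric text, like the x in the example, Cost will have been read in as a string variable, and the numeric checks suggested in the replies will not apply directly. One possible preliminary step, sketched here under that assumption, is to convert Cost to numeric first so that such entries become missing:

```stata
* Assumes Cost is a string variable because of entries such as "x"
* force converts anything that is not a number to missing (.)
destring Cost, replace force

* Then any of the approaches for missing values applies, for example:
bysort ID (Cost) : drop if missing(Cost[_N])
sort ID Year
```

Because `force` discards the original text, it may be worth listing the non-numeric entries before converting, in case some of them are recoverable.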
