
Re: st: Deleteing all observations for individuals with anomalousdata

From	Christian Holz <[email protected]>
To	[email protected]
Subject	Re: st: Deleteing all observations for individuals with anomalousdata
Date	Tue, 09 Aug 2005 23:06:59 +0100

Ermmm. Sorry. Next time I will think a wee bit more before posting. Of course Nick's command sorts the data set by ID and then by Cost within each ID. Since Stata treats numeric missing values as larger than any nonmissing number, a missing value (if any) sorts last in each group, so checking Cost[_N] is enough to catch a missing value in any year.
Sorry :)
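
A minimal sketch of that sorting behaviour, using the example data from the original question. It assumes the "x" is stored as a numeric missing value, and it moves the missing value to year 2 (the remaining values for individual 1 are made up) so that year 5 is present, which is the case raised earlier in the thread:

```stata
clear
input ID Year Cost
1 1 100
1 2 .
1 3 500
1 4 150
1 5 300
2 1 100
2 2 200
2 3 500
2 4 600
2 5 100
end

* Sort by ID, then by Cost within ID; the missing value sorts last
sort ID Cost
list, sepby(ID)

* Nick's command: drop every observation of any ID whose last (largest) Cost is missing
bysort ID (Cost) : drop if missing(Cost[_N])

* Only individual 2 should remain, with all five observations
assert ID == 2
assert _N == 5
```

The list output should show the missing Cost as the last row for ID 1, even though it belongs to year 2.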

---
Christian Holz
Department of Sociology
University of Glasgow
Scotland, U.K.

Christian Holz wrote:

I think, however, that Nick's approach does not work if a value for year 5 is present and another year has a missing value, as Nick's command only checks the last observation of each ID group.
I might be wrong, but in case I am not, it's worth mentioning...
Best
---
Christian Holz
Department of Sociology
University of Glasgow
Scotland, U.K.

Nick Cox wrote:

Another way of doing this, without any new variables:

bysort ID (Cost) : drop if missing(Cost[_N])

Nick
[email protected]

Antoine Terracol wrote:

I would try something like:

generate tag=(Cost==.)
egen toberemoved=total(tag), by(ID)
drop if toberemoved>0
drop tag toberemoved

You will need to replace the "Cost==." in the first line with a more general condition that tags your erroneous values (such as "Cost==. | Cost>9999").
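
As a rough sketch of that more general tagging, the example below flags both missing and out-of-range values and then removes every individual with at least one flagged observation. The data, the variable names bad and anybad, and the 9999 cutoff are illustrative assumptions, not part of the original dataset:

```stata
clear
input ID Year Cost
1 1 100
1 2 .
2 1 100
2 2 200
3 1 100
3 2 99999
end

* Flag anomalous observations: missing Cost or Cost above the assumed cutoff
generate byte bad = missing(Cost) | Cost > 9999

* anybad is 1 for every observation of an ID with at least one flagged value
bysort ID: egen byte anybad = max(bad)

drop if anybad
drop bad anybad

* Only individual 2 has clean data in this toy example
assert ID == 2
assert _N == 2
```

Using max() of a 0/1 flag rather than a sum gives an indicator directly, so the drop condition needs no comparison with zero.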

Murray Lowe wrote:

I am working with a large dataset and have discovered that some of the values are missing or erroneous. The data are panel data with observations per individual over a 5 year period. For example:

ID  Year  Cost
1   1     100
1   2     200
1   3     500
1   4     150
1   5     x
2   1     100
2   2     200
2   3     500
2   4     600
2   5     100

The problem is this: If an individual has a missing / erroneous value for a particular year, I want to exclude ALL of their observations from the dataset. In the example patient 1 would be removed from the dataset entirely. How can this be done through an automated-type process? Essentially I need a code / method that looks for the anomalous data, identifies the patient, and then removes all of their observations from the dataset.
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
