The Stata listserver

Re: st: Dropping the largest and smallest 1% of observations


From   Ronan Conroy <rconroy@rcsi.ie>
To   "statalist hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Dropping the largest and smallest 1% of observations
Date   Thu, 13 Feb 2003 11:42:57 +0000

on 13/02/2003 7:37 am, FUKUGAWA, Nobuya at fukugawa-nobuya@rieti.go.jp
wrote:

> I want to cut off extraordinarily large and small values from variables
> used in regression analysis.
> What is the easiest way to drop the largest and smallest 1% of observations
> from variables in STATA-7?

These values are potentially very informative. Rather than dropping them,
you could try other approaches, such as
- median regression (-qreg-)
- interval regression (-intreg-)
- robust regression (-rreg-)

Very large and very small values can indicate problems with measurement.
-intreg- lets you specify that such values are not known precisely, only
that they lie above or below some threshold.
Robust regression is useful for checking that the substantive conclusions
of your analysis are not being 'driven' by a few influential observations.
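
A rough sketch of how the three alternatives might look in practice. This
is illustrative only: the auto dataset, the variables price, weight and mpg,
and the choice of the 1st and 99th percentiles as censoring thresholds are
all assumptions made for the example, not part of the original question.
-sysuse- is also assumed to be available; older releases of Stata may need
the example data loaded another way.

```stata
* Illustrative data only (assumption): the auto dataset shipped with Stata.
sysuse auto, clear

* Median regression: models the conditional median, so extreme values of
* the response pull on the fit far less than they do under least squares.
qreg price weight mpg

* Robust regression: iteratively downweights observations with large
* residuals. Compare with -regress price weight mpg- to see whether the
* substantive conclusions change.
rreg price weight mpg

* Interval regression: keep the extreme observations, but treat them as
* known only to lie beyond a threshold.
* The 1st and 99th percentiles are arbitrary illustrative cutoffs.
summarize price, detail
local lo = r(p1)
local hi = r(p99)

* -intreg- takes two dependent variables, a lower and an upper bound:
*   lower == upper      point (exactly observed) value
*   lower missing       left-censored: true value is at most upper
*   upper missing       right-censored: true value is at least lower
generate double y_lo = price
generate double y_hi = price
replace y_lo = .    if price <= `lo'
replace y_hi = `lo' if price <= `lo'
replace y_hi = .    if price >= `hi'
replace y_lo = `hi' if price >= `hi'

intreg y_lo y_hi weight mpg
```

No observation is dropped in any of the three fits. The extreme values are
either downweighted (-qreg-, -rreg-) or entered as bounds (-intreg-).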

 

I hate discarding data. These strange values are trying to tell us
something. We ignore them at our peril. I am analysing some microbiology
data at the moment. There is a tradition of discarding any measurements
where there were so many bugs that the plate was unreadable. You can imagine
the havoc that this has played with results!

Ronan M Conroy (rconroy@rcsi.ie)
Lecturer in Biostatistics
Royal College of Surgeons
Dublin 2, Ireland
+353 1 402 2431 (fax 2764)
