Statalist The Stata Listserver



st: RE getting rid of the outliners


From   "Maarten Buis" <M.Buis@fsw.vu.nl>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE getting rid of the outliners
Date   Mon, 1 May 2006 10:17:04 +0200

-findit adjacent value- brings up Nick's module
-adjacent-, which you can install. It only displays
the adjacent values; it does not store them, so you
cannot use them directly to drop outliers. That could
be an oversight on Nick's part, but I would not be
surprised if it was deliberate, to prevent people
from mechanically dropping outliers.
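
If you do want the adjacent values stored, one way is to compute them yourself. The following is a minimal sketch, not the -adjacent- module's own code. It assumes the usual box plot definition: the fences lie 1.5 times the interquartile range beyond the quartiles, and the adjacent values are the most extreme observed values that still fall inside the fences. The local macro names (iqr, ufence, lfence, uadj, ladj) are made up for illustration.

```stata
*----------------begin example-----------------
sysuse auto, clear
quietly sum mpg, detail
* fences: 1.5 times the interquartile range beyond the quartiles
local iqr = r(p75) - r(p25)
local ufence = r(p75) + 1.5 * `iqr'
local lfence = r(p25) - 1.5 * `iqr'
* upper adjacent value: largest observed value not above the upper fence
quietly sum mpg if mpg <= `ufence'
local uadj = r(max)
* lower adjacent value: smallest observed value not below the lower fence
quietly sum mpg if mpg >= `lfence'
local ladj = r(min)
display "lower adjacent value = `ladj', upper adjacent value = `uadj'"
*---------------end example---------------------
```

The two locals can then be used in an -if- qualifier, for example -list make mpg if mpg < `ladj' | (mpg > `uadj' & !missing(mpg))-, to look at the flagged observations before deciding what to do with them.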

Below I show how to create a new variable that
is one when mpg is an outlier and zero when it is
not, and how that variable can be used without
dropping cases. For details have a look at:
http://www.stata.com/support/faqs/data/trueorfalse.html


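*----------------begin example-----------------
sysuse auto, clear
sum mpg, detail
* fences: 1.5 times the interquartile range beyond the quartiles
local u = r(p75) + (3/2) * (r(p75) - r(p25))
local l = r(p25) - (3/2) * (r(p75) - r(p25))
* out is 1 for outliers and 0 otherwise; missing mpg stays missing
* (without the -if-, missing mpg would count as larger than `u')
gen out = mpg<`l' | mpg>`u' if !missing(mpg)
hist mpg          /*histogram including outliers*/
hist mpg if !out  /*histogram excluding outliers*/
*---------------end example---------------------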
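
The same indicator works with any command that accepts an -if- qualifier, so nothing has to be dropped. The sketch below is only an illustration under assumptions: the regression of price on mpg is a hypothetical choice made to show a with-and-without comparison, not an analysis from the original question.

```stata
*----------------begin example-----------------
sysuse auto, clear
quietly sum mpg, detail
local u = r(p75) + (3/2) * (r(p75) - r(p25))
local l = r(p25) - (3/2) * (r(p75) - r(p25))
gen out = mpg<`l' | mpg>`u' if !missing(mpg)
* look at the flagged observations before deciding anything
list make mpg if out == 1
* how many observations are flagged?
tab out
* compare results with and without the flagged observations
regress price mpg
regress price mpg if out == 0
*---------------end example---------------------
```

Comparing the two sets of estimates shows how much the flagged observations matter, and the full data remain available.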

HTH,
Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z214

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

-----Original Message-----
From: vora n [mailto:vora_stata@hotmail.com]
Sent: zondag 30 april 2006 2:47
To: statalist@hsphsun2.harvard.edu
Subject: st: getting rid of the outliners

Is there any Stata command that can drop
the observations that are outliers?

Let's say I draw a box-and-whisker plot

graph box y

and the graph then shows the outliers.
Is there any built-in command that can identify
these outliers and drop them from my data?

Or is there any command that reports the upper
adjacent value and the lower adjacent value
so that I can drop the outliers manually?



