Statalist



Re: st: RE: Outlier: Detection


From   "Austin Nichols" <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Outlier: Detection
Date   Wed, 20 Feb 2008 11:48:18 -0500

Sergiy, is there a reason to limit n to 90, or to use -inspect-
(necessarily limiting n to 99)?  Would this version accomplish the
same goal?

program Grubbs2, rclass sortpreserve
 syntax [varlist] [if] [in] [, Level(int 95)]
 marksample touse
 foreach v of local varlist {
  * n = number of distinct values of `v' in the sample
  tempvar c
  qui bys `v' `touse': g `c'=_N-_n if `touse'
  qui count if `c'==0 & `touse'
  local n=r(N)
  * two-sided critical value at alpha = 1-level/100
  local t2=(invttail(`n'-2,(1-`level'/100)/(2*`n')))^2
  local G_cr=((`n'-1)/sqrt(`n'))*sqrt(`t2'/(`n'-2+`t2'))
  quietly levelsof `v' if `touse', local(levs)
  if `: word count `levs''!=`n' error 198
  loc levsum=0
  loc sqsum=0
  foreach lev of local levs {
   local levsum=`levsum'+`lev'
   local sqsum=`sqsum'+`lev'*`lev'
  }
  local mean=`levsum'/`n'
  * sample standard deviation (n-1 denominator)
  local levsdev=sqrt((`sqsum'-`n'*`mean'*`mean')/(`n'-1))
  local outliers
  foreach lev of local levs {
   local Z=abs(`mean'-`lev')/`levsdev'
   if `Z'>`G_cr' local outliers "`outliers' `lev'"
  }
  di as txt "Outliers in `v': " as res "`outliers'"
 }
 * returns the outliers of the last variable in varlist only
 return local outliers "`outliers'"
end

sysuse auto
Grubbs2 pr-gear, lev(99)
Grubbs2 pr-gear if for==1, lev(99)
Grubbs2 pr-gear if for==0, lev(99)
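
As a rough cross-check, here is a minimal sketch of the textbook two-sided Grubbs statistic for a single variable, computed from -summarize- results. It is not part of Grubbs2 and differs from it in one respect: it uses all nonmissing observations rather than the distinct values. The choice of price and of alpha = 0.01 (matching lev(99)) is only an assumption for illustration.

```stata
sysuse auto, clear
* mean, sd (n-1 denominator), min and max over all nonmissing obs
quietly summarize price
local n = r(N)
* largest absolute deviation from the mean, in sd units
local G = max(r(max) - r(mean), r(mean) - r(min)) / r(sd)
* two-sided test: upper tail probability alpha/(2n), n-2 df
local alpha = 0.01
local t2 = invttail(`n' - 2, `alpha' / (2 * `n'))^2
local G_cr = ((`n' - 1) / sqrt(`n')) * sqrt(`t2' / (`n' - 2 + `t2'))
display as txt "G = " as res `G' as txt "  critical value = " as res `G_cr'
```

If G exceeds the critical value, the most extreme observation would be flagged at the chosen level. Because this version weights repeated values, its result need not agree with Grubbs2 when a variable has ties.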

(Disclaimer: I have not read the Grubbs article, but I share Maarten's
skepticism about the utility of this approach.)


