Statalist



RE: st: RE: Outlier: Detection


From   <badri.prasad@hrsdc-rhdsc.gc.ca>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: Outlier: Detection
Date   Wed, 20 Feb 2008 12:35:47 -0500

Hi Austin,
I ran your program on my data set of 190717 observations and got the following result:

. Grubbs2 lnwage, lev(95)
macro length exceeded
r(1000);

The variable lnwage is of type float. What is the maximum macro length that this program can use? How can I run the program on a data set with 190717 or more observations?

With regards.

Badri Prasad
Policy, Reporting and Data Development
Labour Standards and Workplace Equity
National Labour Operations Directorate
HRSDC
(819) 956 - 8146


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Austin Nichols
Sent: 2008-02-20 11:48 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Outlier: Detection


Sergiy, is there a reason to limit n to 90, or to use -inspect-
(necessarily limiting n to 99)?  Would this version accomplish the
same goal?

program Grubbs2, rclass sortpreserve
 syntax [varlist] [if] [in] [, Level(int 95)]
 marksample touse
 foreach v of local varlist {
  tempvar c
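  * c==0 marks the last observation within each distinct value of the
  * variable, so n is the number of distinct values in the sample
  qui bys `v' `touse': g `c'=_N-_n if `touse'
  qui count if `c'==0 & `touse'
  local n=r(N)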
  * upper-tail probability alpha/(2n), with alpha=1-level/100
  local t2=(invttail(`n'-2,(1-`level'/100)/(2*`n')))^2
  local G_cr=((`n'-1)/sqrt(`n'))*sqrt(`t2'/(`n'-2+`t2'))
  * levelsof stores every distinct value in a single local macro
  quietly levelsof `v' if `touse', local(levs)
  if `: word count `levs''!=`n' error 198
  loc levsum=0
  loc sqsum=0
  foreach lev of local levs {
   local levsum=`levsum'+`lev'
   local sqsum=`sqsum'+`lev'*`lev'
  }
  local mean=`levsum'/`n'
  local levsdev=sqrt(`sqsum'/`n'-`mean'*`mean')
  local outliers
  foreach lev of local levs {
   local Z=abs(`mean'-`lev')/`levsdev'
   if `Z'>`G_cr' local outliers "`outliers' `lev'"
  }
  di as txt "Outliers in `v': " as res "`outliers'"
 }
 return local outliers `outliers'
end
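
Written out, the quantities the program computes from the n distinct values x_1, ..., x_n are roughly as follows. The symbols \bar{x}, s, Z_i and p are notation introduced here for illustration; p stands for the upper-tail probability passed to invttail().

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2}, \qquad
Z_i = \frac{|\bar{x} - x_i|}{s}

G_{cr} = \frac{n-1}{\sqrt{n}}\,\sqrt{\frac{t^2}{n-2+t^2}}, \qquad
t = t_{p,\,n-2}
```

A value x_i is listed as an outlier when Z_i > G_{cr}.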

sysuse auto
Grubbs2 pr-gear, lev(99)
Grubbs2 pr-gear if for==1, lev(99)
Grubbs2 pr-gear if for==0, lev(99)
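
For a variable with a very large number of distinct values, a variant that never stores the values in a local macro may be more practical. The following is only a minimal sketch, not the program above: the name Grubbs3, the generate() option and the default variable name _outlier are all hypothetical. It also makes different choices: it works on all observations rather than on distinct values, it takes the mean and the sample standard deviation (n-1 divisor) from -summarize-, and it assumes a two-sided test with alpha = 1 - level/100.

```stata
* Hypothetical variant: flag outliers in a new 0/1 variable
* instead of collecting values in a local macro.
capture program drop Grubbs3
program Grubbs3, rclass
 syntax varname(numeric) [if] [in] [, Level(int 95) GENerate(name)]
 marksample touse
 if "`generate'"=="" local generate _outlier
 confirm new variable `generate'
 quietly summarize `varlist' if `touse'
 if r(N)<3 | r(sd)==0 {
  di as err "need at least 3 observations and nonzero variance"
  exit 2001
 }
 local n=r(N)
 * Assumption: upper-tail probability alpha/(2n), alpha = 1 - level/100
 local t2=(invttail(`n'-2,(1-`level'/100)/(2*`n')))^2
 local G_cr=((`n'-1)/sqrt(`n'))*sqrt(`t2'/(`n'-2+`t2'))
 * r(mean) and r(sd) are still those left behind by -summarize-
 quietly generate byte `generate'=abs(`varlist'-r(mean))/r(sd)>`G_cr' if `touse'
 quietly count if `generate'==1
 local nout=r(N)
 di as txt "Outliers flagged in `varlist': " as res `nout'
 return scalar N_out=`nout'
 return scalar G_cr=`G_cr'
end

sysuse auto, clear
Grubbs3 price, level(99) generate(price_out)
list make price if price_out==1
```

Like the program above, this flags every value whose standardized distance exceeds the critical value, although the Grubbs test is formally a test of the single most extreme value.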

(Disclaimer: I have not read the Grubbs article, but I share Maarten's
skepticism about the utility of this approach.)
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



