Statalist



st: RE: Proper usage of Macros stored in summarize


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Proper usage of Macros stored in summarize
Date   Tue, 2 Dec 2008 18:27:00 -0000

In your first complicated command, p(75) is evidently a typo for r(p75).


In your second complicated command, p(25) is evidently a typo for
r(p25).

I didn't look further. 

There is no gain here, and much fiddly extra typing, in writing e.g.
`r(p75)' rather than r(p75). 
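
A minimal sketch of the corrected flagging step, assuming the same modified auto data as in the question below. It refers to r(p25) and r(p75) directly rather than through macro quotes. The name weight_outlier is taken from the original post; coding it as 0/1 in a single -generate- is a choice made here for illustration, not something the original code did.

```stata
sysuse auto, clear
set obs 75
replace weight = 8000 in 75

* summarize, detail leaves the quartiles in r(p25) and r(p75)
quietly summarize weight, detail

* Flag values outside [p25 - 2*IQR, p75 + 2*IQR]; missing weights stay missing
generate byte weight_outlier = (weight > r(p75) + 2*(r(p75) - r(p25)) | weight < r(p25) - 2*(r(p75) - r(p25))) if !missing(weight)

list weight if weight_outlier == 1
```

The r() results are overwritten by the next r-class command, so the -generate- needs to follow -summarize- directly, or the quartiles should be copied into scalars first.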

Note that -extremes- from SSC and the -egen- functions -adjl()- and
-adju()- from -egenmore- from SSC already incorporate similar
functionality. 

Nick 
n.j.cox@durham.ac.uk 

Thomas Speidel

I am trying to use macros stored in the summarize command to flag 
outliers/influential observations if they fall outside of this range:
p25 - 2*IQR <= var <= p75 + 2*IQR

suppose I try to do this on the weight var from the auto.dta dataset 
(with a little modification):

sysuse auto, clear
set obs 75
replace weight = 8000 in 75
qui: summ  weight, d
gen weight_outlier=1 if (weight>`p(75)'+2*(`r(p75)'-`r(p25)') & (weight<.))
replace weight_outlier=1 if (weight<`p(25)'-2*(`r(p75)'-`r(p25)'))

If I were to do it by hand:
. di 3*2240-2*3670
-620
. di 3*3670-2*2240
6530
gen weight_outlier2=1 if weight>6530 & weight <.

There is something I am doing wrong in the first approach - read: poor 
macro programming :-) - but can't quite grasp what the problem is.



