Re: st: RE: Percentiles with basic statistics


From   Maarten Buis <maartenlbuis@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Percentiles with basic statistics
Date   Thu, 18 Aug 2011 16:41:06 +0200

On Thu, Aug 18, 2011 at 4:11 PM, Ozgur Ozdemir wrote:
> I have a list of 20 variables in an Excel file and would like to get the mean, standard deviation, etc. of all variables after eliminating the 1% and 99% percentiles. Is there any easy way of doing it rather than doing it one by one? Thanks

I have a number of comments:

1) We can only help you with Stata, not with excel (whatever that may be).

2) Do you want to do this variable by variable, or to remove all
observations where at least one variable satisfies your criterion? If
you want to use these variables together later in an estimation command,
you will have to use the latter option and live with the fact that you
will lose more than 2% of your observations.

3) This is a very bad idea. Outliers are the most informative
observations; blindly throwing them away should (IMHO) be a crime!
See: http://www.stata.com/statalist/archive/2011-08/msg00398.html

Anyhow, below is an example that shows how to do this over all
variables and variable by variable:

*------------------------ begin example ---------------------
sysuse auto, clear

// store a list of all variables excluding foreign and make in
// a local macro `varl'
ds foreign make, not
local varl `"`r(varlist)'"'

// flag observations where at least one variable contains an "outlier"
gen byte touse = 1
foreach var of local varl {
	qui sum `var', detail
	replace touse = 0 if `var' <= r(p1) | ( `var' >= r(p99) & `var' < .)
}

// get the means and standard deviations
sum `varl' if touse == 1

// remove "outliers" variable by variable
foreach var of local varl {
	qui sum `var', detail
	sum `var' if `var' > r(p1) & `var' < r(p99)
}
*---------------- end example -----------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )
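
To make point 2 concrete, here is a minimal sketch that counts how many observations the "at least one variable" rule discards. It assumes the same auto data and the same touse flag as in the example above; the display line is only an illustration and not part of the original example.

```stata
sysuse auto, clear

// list of all variables excluding foreign and make
ds foreign make, not
local varl `r(varlist)'

// flag observations where at least one variable is at or beyond
// its 1st or 99th percentile (missing values are not flagged)
gen byte touse = 1
foreach var of local varl {
	quietly summarize `var', detail
	quietly replace touse = 0 if `var' <= r(p1) | (`var' >= r(p99) & `var' < .)
}

// count the flagged observations and show them as a share of the data
quietly count if touse == 0
display "dropped " r(N) " of " _N " observations (" %4.1f 100*r(N)/_N "%)"
```

With many variables the share dropped will typically be well above 2%, because an observation is discarded as soon as any single variable is extreme.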

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany


http://www.maartenbuis.nl
--------------------------