



st: RE: Winsorize by time and group


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Winsorize by time and group
Date   Tue, 29 Nov 2011 10:51:54 +0000

There are no precise web references to Statalist postings here to comment on.

The allusion is presumably to -winsor- on SSC. 

You are asked to explain _where_ user-written programs you refer to come from. 

1. It makes questions much easier to understand. Questions that are not understood will probably just be deleted. 
2. It gives people, even those who will not answer the question, some information they might find helpful. 
3. It gives some credit or publicity to the original program authors, if only indirectly. 
4. Sometimes, as Maarten stressed earlier today, there are different versions of a program and it is important to be clear which you are using. 
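For point 4, a minimal sketch of how to check where a user-written program comes from and which version is installed. It assumes the program in question is -winsor- from SSC and that you have internet access for the -ssc- commands.

```stata
* Show the SSC package description (author, files, date)
ssc describe winsor

* Install it if it is not already installed
capture which winsor
if _rc ssc install winsor

* Report the path and version line of the installed copy
which winsor
```

Quoting the output of -which- in a posting makes clear exactly which version is being discussed.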

In this case, as you say, -winsor- does not support -by:- and the author evidently has no immediate intention to add that support. 

So, here is an example of appropriate technique:

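sysuse auto, clear
// one integer code (1, 2, ...) per distinct value of rep78
egen group = group(rep78)
gen winsorised = .
// after -su, meanonly-, r(max) is the number of groups
su group, meanonly
forval i = 1/`r(max)' {
	// -capture- lets the loop continue if -winsor- fails for a group
	capture {
		winsor mpg if group == `i', gen(work) h(1)
		replace winsorised = work if group == `i'
		drop work
	}
}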

-egen, group()- will happily work with combinations of two or more variables, so the example can be adapted to your situation. 
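As a sketch of that adaptation for month by group: the variable names t and ret are taken from the question, while grp (the 0/1 group variable), cell, ret_w and the simulated data are made up here purely so the example runs on its own. The choice h(1) is simply carried over from the example above; substitute whatever -winsor- option you actually want.

```stata
* Made-up data for illustration only: 12 months, two groups
clear
set seed 2011
set obs 2400
gen t = 1 + mod(_n, 12)
gen grp = runiform() < 0.5
gen ret = rnormal()

* One integer code per (month, group) combination
egen cell = group(t grp)
gen ret_w = .
su cell, meanonly
forval i = 1/`r(max)' {
	* -capture- skips any cell for which -winsor- fails
	capture {
		winsor ret if cell == `i', gen(work) h(1)
		replace ret_w = work if cell == `i'
		drop work
	}
}
```

With real data, drop the simulation lines and replace grp with the name of your own group variable.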

And here is some explanation of that technique: 

http://www.stata.com/support/faqs/data/foreach.html

Nick 
n.j.cox@durham.ac.uk 

Gokhan Yilmaz

I have a time variable "t" that takes values from 1 to 12 for each
month and a group variable that is {0,1}. I want to winsorize my
return variable "ret" for each group in each month. Since "by" cannot
be combined with "winsor", can you suggest a syntax in this case? In
my Statalist search I found the syntax for winsorizing on one
dimension (every year) but couldn't find a case with two dimensions.

