



Re: st: Elimination of outliers


From   "Achmed Aldai" <Hauptseminar@gmx.de>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Elimination of outliers
Date   Mon, 06 Jun 2011 16:07:23 +0200

Hi

Sorry, I do not really understand why it is a bad idea. I want to eliminate the outliers because I think they bias my results.

How can I transform my predictors, and what do you mean by that?

What is a non-identity link function?

Thank you

Felix
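
As a rough illustration of the "transforming" suggestion in the quoted reply below: one common choice for skewed, strictly positive variables such as at is the natural logarithm. The sketch is in Python rather than Stata, purely for illustration. The values are the at column from the example table in the original question, and the helper name max_z is hypothetical, not something from the thread. It compares how far the most extreme observation sits from the mean, in standard deviations, before and after the transform.

```python
import math
import statistics

# at values copied from the example table in the original question.
at = [1120, 1230, 57, 67, 789, 650, 1230, 21, 234, 256, 267, 4335]

# Natural log compresses large values much more than small ones.
log_at = [math.log(v) for v in at]

def max_z(values):
    """Largest absolute z-score: distance of the most extreme point from the mean, in SDs."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return max(abs(v - mean) / sd for v in values)

raw_extreme = max_z(at)
log_extreme = max_z(log_at)

print(f"largest |z| on the raw scale: {raw_extreme:.2f}")
print(f"largest |z| on the log scale: {log_extreme:.2f}")
```

On these twelve values the most extreme point is less extreme on the log scale, so no observation has to be dropped to tame it. In Stata the analogous step would be something like generate log_at = log(at) before fitting the model; whether a log is the right transform depends on the data.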
-------- Original Message --------
> Date: Mon, 6 Jun 2011 13:39:20 +0100
> From: Nick Cox <njcoxstata@gmail.com>
> To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
> Subject: Re: st: Elimination of outliers

> In general, a very bad idea. Consider transforming your response or  
> predictors or using a non-identity link function in a generalized  
> linear model or some flavour of robust regression as more measured  
> tactics.
> 
> Nick
> 
> On 6 Jun 2011, at 12:46, "Achmed Aldai" <Hauptseminar@gmx.de> wrote:
> 
> > Hi
> >
> > I am currently working on a do-file in which I want to eliminate
> > outliers, that is, the observations with the highest and the lowest
> > values of certain variables, for example at and lt. I have
> > 150000 observations in total, and I want to delete the
> > 25 observations at each of the upper and lower boundaries. It might
> > be better to do this in relative terms, so that I drop not the
> > highest and lowest 25 but the lowest and highest 1% of the
> > corresponding variables.
> >
> > gvkey           at           lt
> > 1001            1120         231
> > 1001            1230         312
> > 1210            57           32
> > 1210            67           25
> > 1354            789          560
> > 1368            650          500
> > 1481            1230         900
> > 2930            21           30
> > 3201            234          213
> > 3201            256          220
> > 3210            267          320
> > 4510            4335         3214
> >
> > I hope this is clear.
> >
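
The two rules described above (a fixed count per tail versus the lowest and highest 1%) can be sketched as follows. This is a minimal illustration in Python, not Stata, and the function names tail_size and flag_tails are hypothetical. It only marks the extreme observations instead of deleting them, so the decision stays reversible, which fits the caution in the reply.

```python
# at values from the example table above.
at = [1120, 1230, 57, 67, 789, 650, 1230, 21, 234, 256, 267, 4335]

def tail_size(n_obs, percent):
    """Observations per tail under a relative rule, rounded down (integer arithmetic)."""
    return n_obs * percent // 100

def flag_tails(values, k):
    """Return the set of positions holding the k smallest and k largest values."""
    if k <= 0:
        return set()
    order = sorted(range(len(values)), key=lambda i: values[i])
    return set(order[:k]) | set(order[-k:])

# Absolute rule on the small sample: flag one observation per tail.
flagged = flag_tails(at, 1)
extremes = sorted(at[i] for i in flagged)
print(extremes)

# Relative rule on the full data set size mentioned in the question.
per_tail_relative = tail_size(150000, 1)
print(per_tail_relative)
```

With 150000 observations the 1% rule marks 1500 observations per tail, sixty times as many as the fixed count of 25, so the two rules are far from interchangeable. The same flagging would have to be repeated separately for each variable, such as lt.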
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP