Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: RE: st: Elimination of outliers

From	Nick Cox <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	RE: RE: st: Elimination of outliers
Date	Mon, 6 Jun 2011 15:48:43 +0100

I said I was not going to do this, but Austin Nichols gave you a gun. 

Nick 
[email protected] 

Achmed Aldai

Hi Nick,

Can you please tell me how to eliminate the top and bottom 2% of each variable? So far my regression is not giving the proper results, and I want to find out whether these values are causing the problem.

Thank you!
-------- Original Message --------
> Date: Mon, 6 Jun 2011 15:17:32 +0100
> From: Nick Cox <[email protected]>
> To: "'[email protected]'" <[email protected]>
> Subject: RE: st: Elimination of outliers

> 1. Transformation means using a transformed scale (e.g. logarithms) for
> one or more of your variables. 
> 
> 2. A non-identity link function in a generalized linear model means what
> it says: the help for -glm- is the place to start and points to other
> documentation. 
> 
> Otherwise, I assert that elimination of outliers is a very bad idea
> _unless_ you know from independent evidence that they arise from serious and
> irremediable problems of measurement, in which case chopping the tails of the
> distribution is _not_ the way to do it. In most fields I know, the outliers
> that stick out are genuine and important (the Amazon in hydrology, USA or
> China wherever it is in economics, and so on, and so on) and leaving them
> out is in my view lousy science and lousy statistics. 
> 
> If you disagree, well, we disagree, but I am not going to tell you how to
> do this in Stata. 
> 
> Nick 
> [email protected] 
> 
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Achmed Aldai
> Sent: 06 June 2011 15:07
> To: [email protected]
> Subject: Re: st: Elimination of outliers
> 
> Hi
> 
> Sorry, I cannot really understand why it is a bad idea. I want to eliminate
> the outliers because I think they cause a bias in my results. 
> 
> How can I transform my predictors and what do you mean by that?
> 
> What is a non-identity link function?
> 
> Thank you
> 
> Felix
> -------- Original Message --------
> > Date: Mon, 6 Jun 2011 13:39:20 +0100
> > From: Nick Cox <[email protected]>
> > To: "[email protected]" <[email protected]>
> > Subject: Re: st: Elimination of outliers
> 
> > In general, a very bad idea. Consider transforming your response or  
> > predictors or using a non-identity link function in a generalized  
> > linear model or some flavour of robust regression as more measured  
> > tactics.
> > 
> > Nick
> > 
> > On 6 Jun 2011, at 12:46, "Achmed Aldai" <[email protected]> wrote:
> > 
> > > Hi
> > >
> > > I am currently working on a do-file in which I want to eliminate
> > > outliers, i.e. observations with the highest and the lowest values of
> > > certain variables, e.g. at and lt here. In total I have
> > > 150000 observations, and I want to delete 25 observations at each
> > > of the upper and lower ends. But it might also be better to do it
> > > in relative terms, meaning that I don't take the highest and
> > > lowest 25 but the lower and upper 1% of the corresponding
> > > variables.
> > >
> > > gvkey           at           lt
> > > 1001            1120         231
> > > 1001            1230         312
> > > 1210            57           32
> > > 1210            67           25
> > > 1354            789          560
> > > 1368            650          500
> > > 1481            1230         900
> > > 2930            21           30
> > > 3201            234          213
> > > 3201            256          220
> > > 3210            267          320
> > > 4510            4335         3214
> > >
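
Point 1 in Nick's reply (using a transformed scale such as logarithms) can be illustrated with the at values from the example data in the original post. What follows is a minimal sketch in Python, not Stata, and it is not something posted in the thread. The function name standardized_max is hypothetical, chosen only for this illustration. It measures how far the largest value sits above the mean, in sample standard deviations, first on the raw scale and then on the log scale.

```python
import math
import statistics

# `at` values copied from the example data in the original post.
at = [1120, 1230, 57, 67, 789, 650, 1230, 21, 234, 256, 267, 4335]


def standardized_max(values):
    """Distance of the largest value from the mean, in sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return (max(values) - mean) / sd


# Same summary on the raw scale and on the natural-log scale.
z_raw = standardized_max(at)
z_log = standardized_max([math.log(v) for v in at])

print(f"raw scale: largest value is {z_raw:.2f} SDs above the mean")
print(f"log scale: largest value is {z_log:.2f} SDs above the mean")
```

On the log scale the largest value (4335) sits noticeably closer to the rest of the data, and no observation has been discarded. In Stata the analogous first step would be something like `generate ln_at = ln(at)`; the variable name ln_at is likewise only an illustration.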

