Stata The Stata listserver

RE: st: RE: Decision on trimming the data


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: Decision on trimming the data
Date   Tue, 22 Jun 2004 17:32:32 +0100

You make yourself very clear, but my 
comment remains the same. 

As so often on Statalist, what you 
should do with your data is not 
reducible to a straight technical 
"Do this, don't do that". 

Nick 
n.j.cox@durham.ac.uk 

Rijo John
 
> Dear Nick,
> 
>  Thanks for your reply. Actually, my problem is as follows. I have
> survey data with many observations. The distributions of the variables
> I am considering are not normal. Standard OLS regression suffers from
> heteroscedasticity even after log transformations, as anyone would
> expect in survey data. So my idea is to run a LAD regression (median
> regression using qreg), which is less affected by outlying values than
> OLS. LAD is also more robust to heteroscedasticity than OLS. While
> doing this, I also want to compare the results from standard OLS and
> LAD, and to check whether the OLS estimates come close to the LAD
> estimates if I trim the outliers. This is what I want to do. I hope I
> make myself clear.
> 
> Regards,
> Rijo John.
> 
> On Tue, 22 Jun 2004, Nick Cox wrote:
> 
> :I guess there's a literature on this somewhere,
> :but it doesn't seem that trimming of tails
> :before regression ever caught on as standard practice
> :(unless there's a subdiscipline that does it all the
> :time, as a living refutation of this guess).
> :
> :The key question to me is what is your underlying
> :problem? Worrying about long tails is often
> :best met by quantile or robust regression or using
> :transformations or non-identity link functions.
> :Far simpler and better supported than tinkering
> :with the tails...
> :
> :Nick
> :n.j.cox@durham.ac.uk
> :
> :Rijo John
> :>
> :>  I have a data set with quite a few outliers. Suppose I am trimming
> :> my dependent variable by 1% each from top and bottom, using the 1st
> :> and 99th percentiles, and I have the regression estimates before and
> :> after trimming. Let us also suppose that some of the variables that
> :> were significant before trimming turned out to be insignificant
> :> after trimming and/or vice versa.
> :>
> :>  Is there a standard way by which one can decide what percentage of
> :> the data should be trimmed? Is a Chow test for the equality of
> :> coefficients enough for this? I mean, trim up to the point where the
> :> changes in coefficients become insignificant? Or is there any other
> :> standard way to do this?
> *****************************************************:
> Rijo.M.John,Research Scholar
> Indira Gandhi Institute of Development Research,
> Film City Road, Goregaon East,
> Mumbai, India-400065.
> contact: (+91)9892412476
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 
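
For readers following the thread, the comparison described above (OLS on all the data, LAD on all the data, and OLS after trimming the dependent variable at its percentiles) can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything posted in the thread: it uses Python's standard library rather than Stata, the data are simulated, and the helper names ols_fit, lad_fit and trim_on_y are hypothetical. The LAD fit is a brute-force search that relies on the fact that some optimal median-regression line passes through two data points, so it suits small samples only; it is a stand-in for, not a description of, what qreg does.

```python
import random
import statistics
from itertools import combinations


def ols_fit(x, y):
    """Least-squares (intercept, slope) for a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def lad_fit(x, y):
    """Median (LAD) regression by brute force over all pairs of points.

    Some line minimizing the sum of absolute residuals passes through two
    observations, so trying every pair finds an optimum. O(n^3): small n only.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, not a regression line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(abs(yi - intercept - slope * xi) for xi, yi in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


def trim_on_y(x, y, pct=1):
    """Keep observations with y between its pct-th and (100 - pct)-th percentiles."""
    cuts = statistics.quantiles(y, n=100, method="inclusive")
    lo, hi = cuts[pct - 1], cuts[99 - pct]
    kept = [(xi, yi) for xi, yi in zip(x, y) if lo <= yi <= hi]
    return [p[0] for p in kept], [p[1] for p in kept]


# Simulated data (an assumption, not the poster's survey): y = 1 + 2x + noise,
# with the six largest-x observations shifted upward by 60 to act as outliers.
random.seed(2004)
n = 120
x = [random.uniform(0, 10) for _ in range(n)]
y = [1 + 2 * xi + random.gauss(0, 1) for xi in x]
for i in sorted(range(n), key=lambda k: x[k])[-6:]:
    y[i] += 60

ols = ols_fit(x, y)
lad = lad_fit(x, y)
print(f"OLS, all data:  intercept={ols[0]:.2f}  slope={ols[1]:.2f}")
print(f"LAD, all data:  intercept={lad[0]:.2f}  slope={lad[1]:.2f}")

# OLS after trimming y at the 1st/99th and at the 5th/95th percentiles.
trimmed = {}
for pct in (1, 5):
    xt, yt = trim_on_y(x, y, pct)
    trimmed[pct] = (len(xt), ols_fit(xt, yt))
    b0, b1 = trimmed[pct][1]
    print(f"OLS, {pct}% trim:   n={len(xt)}  intercept={b0:.2f}  slope={b1:.2f}")
```

Printing the fits side by side shows how far the trimmed OLS estimates move toward the LAD estimates as the trimming percentage changes, which is the comparison asked about. Note that trimming on the dependent variable selects observations on the outcome, which can itself bias OLS, so agreement between trimmed OLS and LAD in a sketch like this does not by itself justify any particular trimming percentage.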



