Stata The Stata listserver

Re: st: RE: Decision on trimming the data


From   Rijo John <rijo@igidr.ac.in>
To   <statalist@hsphsun2.harvard.edu>
Subject   Re: st: RE: Decision on trimming the data
Date   Tue, 22 Jun 2004 21:50:23 +0530 (IST)

Dear Nick,

 Thanks for your reply. My problem is as follows. I have survey data with
a large number of observations. The distributions of the variables I am
considering are not normal. Standard OLS regression suffers from
heteroscedasticity even after log transformations, as anyone would expect
with survey data. So my idea is to run a LAD regression (median regression
using qreg), which is less affected by outlying values than OLS. LAD is
also more robust to heteroscedasticity than OLS. While doing this, I also
want to compare the results from standard OLS and LAD, and to check
whether trimming the outliers brings the OLS estimates close to the LAD
estimates. This is what I want to do. I hope I have made myself clear.
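
A minimal sketch of the comparison described above, offered only as an illustration. The auto dataset shipped with Stata and the variables price, weight and mpg are stand-ins, since the survey data are not shown in this thread; the 1%/99% cutoffs are the ones mentioned in the original question. The vce(robust) option assumes a reasonably recent Stata; older releases use the robust option instead.

```stata
* Sketch only: auto.dta and its variables stand in for the survey data.
sysuse auto, clear
generate double lnprice = ln(price)

* OLS on the full sample, with heteroscedasticity-robust standard errors
regress lnprice weight mpg, vce(robust)
estimates store ols_full

* Median (LAD) regression on the full sample
qreg lnprice weight mpg
estimates store lad_full

* Flag observations between the 1st and 99th percentiles of the
* dependent variable, then rerun OLS on the trimmed sample
_pctile lnprice, percentiles(1 99)
generate byte insample = lnprice >= r(r1) & lnprice <= r(r2)
regress lnprice weight mpg if insample, vce(robust)
estimates store ols_trim

* Coefficients and standard errors side by side
estimates table ols_full lad_full ols_trim, b(%9.4f) se(%9.4f) stats(N)
```

The table only shows whether the trimmed OLS coefficients move toward the LAD ones; it is not a formal test, and the trimming fraction remains an arbitrary choice.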

Regards,
Rijo John.

On Tue, 22 Jun 2004, Nick Cox wrote:

:I guess there's a literature on this somewhere,
:but it doesn't seem that trimming of tails
:before regression ever caught on as standard practice
:(unless there's a subdiscipline that does it all the
:time, as a living refutation of this guess).
:
:The key question to me is what is your underlying
:problem? Worrying about long tails is often
:best met by quantile or robust regression or using
:transformations or non-identity link functions.
:Far simpler and better supported than tinkering
:with the tails...
:
:Nick
:n.j.cox@durham.ac.uk
:
:Rijo John
:>
:>  I have a data set with quite a few outliers. Suppose I trim my
:> dependent variable by 1% at each end, using the 1st and 99th
:> percentiles, and I have the regression estimates before and after
:> trimming. Let us also suppose that some of the variables that were
:> significant before trimming turned out to be insignificant
:> after trimming, and/or vice versa.
:>
:>  Is there a standard way to decide what percentage
:> of the data should be trimmed? Is a Chow test for the equality of
:> coefficients enough for this? I mean, trim up to the point where the
:> changes in coefficients become insignificant? Or is there any other
:> standard way to do this?
*****************************************************:
Rijo.M.John,Research Scholar
Indira Gandhi Institute of Development Research,
Film City Road, Goregaon East,
Mumbai, India-400065.
contact: (+91)9892412476

