RE: st: RE: Decision on trimming the data
You make yourself very clear, but my
comment remains the same.
As so often on Statalist, what you
should do with your data is not
reducible to a straight technical
"Do this, don't do that".
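For what it is worth, the mechanics of the comparison described in the
quoted message (OLS, median regression with qreg, and OLS after 1% trimming
of the response) can be sketched as follows. This is only an illustration
under stated assumptions: auto.dta, which ships with Stata, stands in for
the survey data; lnprice, mpg, and weight are placeholder variables; and
survey weights and design are ignored. The vce(robust) option assumes a
recent Stata; older versions use the robust option instead. It shows how to
line the estimates up side by side, not how much trimming is appropriate.

```stata
* Illustration only: auto.dta is a stand-in for the survey data,
* and the variables below are placeholders.
sysuse auto, clear
generate double lnprice = ln(price)

* OLS with heteroskedasticity-robust standard errors
regress lnprice mpg weight, vce(robust)
estimates store ols

* LAD (median) regression; qreg fits the median by default
qreg lnprice mpg weight
estimates store lad

* OLS after dropping responses below the 1st or above the 99th percentile
_pctile lnprice, percentiles(1 99)
scalar p1  = r(r1)
scalar p99 = r(r2)
regress lnprice mpg weight if inrange(lnprice, p1, p99), vce(robust)
estimates store ols_trim

* Coefficients and standard errors side by side
estimates table ols lad ols_trim, b(%9.4f) se(%9.4f) stats(N)
```

Trimming on the response changes the estimation sample, so the trimmed
and untrimmed OLS fits are not based on the same observations; the N row
in the table makes that visible.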
> Dear Nick,
> Thanks for your reply. Actually, my problem is as follows. I have survey
> data with a great many observations. The distributions of the variables I
> am considering are not normal. Standard OLS regression suffers from
> heteroscedasticity even after log transformations, as anyone would expect
> with survey data. So my idea is to run a LAD regression (median regression
> using qreg), which is less affected by outlying values than OLS. LAD is
> also more robust to heteroscedasticity than OLS. While doing this, I also
> want to compare the results from standard OLS and LAD, and to check
> whether the OLS estimates come close to the LAD estimates once I trim the
> outliers. This is what I want to do. I hope I have made myself clear.
> Rijo John.
> On Tue, 22 Jun 2004, Nick Cox wrote:
> :I guess there's a literature on this somewhere,
> :but it doesn't seem that trimming of tails
> :before regression ever caught on as standard practice
> :(unless there's a subdiscipline that does it all the
> :time, as a living refutation of this guess).
> :The key question to me is what is your underlying
> :problem? Worrying about long tails is often
> :best met by quantile or robust regression or using
> :transformations or non-identity link functions.
> :Far simpler and better supported than tinkering
> :with the tails...
> :Rijo John
> :> I have a data set with quite a few outliers. Suppose I trim my
> :> dependent variable by 1% at each end, using the 1st and 99th
> :> percentiles, and I have the regression estimates before and after
> :> trimming. Suppose also that some of the variables that were
> :> significant before trimming turn out to be insignificant after
> :> trimming, and/or vice versa.
> :> Is there a standard way to decide how much of the data should be
> :> trimmed? Is a Chow test for the equality of coefficients enough for
> :> this? I mean, trim up to the point where the changes in the
> :> coefficients become insignificant? Or is there any other standard
> :> way to do this?
> Rijo.M.John,Research Scholar
> Indira Gandhi Institute of Development Research,
> Film City Road, Goregaon East,
> Mumbai, India-400065.
> contact: (+91)9892412476
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/