The Stata listserver

st: winsorization and normality


From   "gary tian" <g.tian@uws.edu.au>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: winsorization and normality
Date   Wed, 23 Jun 2004 08:57:41 +1000

Further to John's question regarding trimming, I would like to raise a
related question and ask for your help.
I am testing cointegration and causality for daily returns on share indices
(time series of first log differences) using a VAR model. Whatever lag
length I choose for each variable, the residual tests still show
non-normality. I applied a form of winsorization in which all returns
outside the range [mean +/- standard deviations] are replaced with the
boundary values. The non-normality is much reduced but still present. A
second method, which I found more effective, is to use monthly or quarterly
data; the problem is that this loses the original meaning of integration
over a precise number of days. Are these standard ways to treat the
problem, or is there a better way? Thanks.  Gary
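
For readers following the thread, the winsorization described above might be sketched as follows. This is only an illustration in Python, not code from the original post: the function name winsorize_sd, the multiplier k, and the sample returns are all assumptions, since the post does not say how many standard deviations were used.

```python
from statistics import mean, stdev


def winsorize_sd(values, k=3.0):
    """Clamp each value to the range [mean - k*sd, mean + k*sd].

    k is a hypothetical parameter: the original post does not state
    which multiple of the standard deviation was used.
    """
    m = mean(values)
    s = stdev(values)  # sample standard deviation of the raw data
    lo, hi = m - k * s, m + k * s
    # Values inside the range are returned unchanged; values outside
    # are replaced by the nearest boundary.
    return [min(max(v, lo), hi) for v in values]


# Made-up daily returns with one extreme observation at the end.
returns = [0.01, -0.02, 0.005, 0.0, -0.01, 0.015, -0.005, 0.25]
clipped = winsorize_sd(returns, k=2.0)
```

Note that the bounds are computed from the mean and standard deviation of the raw data, which are themselves pulled by the extreme values being clipped.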
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Nick Cox
Sent: Wednesday, 23 June 2004 2:08 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Decision on trimming the data


I guess there's a literature on this somewhere,
but it doesn't seem that trimming of tails
before regression ever caught on as standard practice
(unless there's a subdiscipline that does it all the
time, as a living refutation of this guess).

The key question to me is what is your underlying
problem? Worrying about long tails is often
best met by quantile or robust regression or using
transformations or non-identity link functions.
Far simpler and better supported than tinkering
with the tails...

Nick
n.j.cox@durham.ac.uk

Rijo John
>
>  I have a data set with quite a few outliers. Suppose I trim my
> dependent variable by 1% at each end, using the 1st and 99th
> percentiles, and I have the regression estimates before and after
> trimming. Suppose also that some of the variables that were
> significant before trimming turn out to be insignificant after
> trimming, and/or vice versa.
>
>  Is there a standard way to decide what percentage of the data
> should be trimmed? Is a Chow test for the equality of coefficients
> enough for this? I mean, trim up to the point where the changes in
> coefficients become insignificant? Or is there any other standard
> way to do this?
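
The trimming step described in the quoted question might look roughly like the sketch below. It is an illustration in Python with made-up data; the function name trim_by_percentile, the parameter pct, and the (y, x) row layout are assumptions, not anything from the thread. It only shows how observations outside the 1st and 99th percentiles of the dependent variable would be dropped, and says nothing on how much trimming is appropriate.

```python
from statistics import quantiles


def trim_by_percentile(rows, key, pct=1):
    """Keep rows whose key value lies between the pct-th and
    (100 - pct)-th percentiles, inclusive.

    pct is assumed to be an integer between 1 and 49.
    """
    values = [key(r) for r in rows]
    # quantiles(..., n=100) returns the 99 cut points between percentiles;
    # cuts[0] is the 1st percentile and cuts[98] is the 99th.
    cuts = quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[pct - 1], cuts[100 - pct - 1]
    return [r for r in rows if lo <= key(r) <= hi]


# Hypothetical data: (y, x) pairs, with y as the dependent variable.
rows = [(y, 2 * y + 1) for y in range(1, 201)]
kept = trim_by_percentile(rows, key=lambda r: r[0], pct=1)
```

Fitting the same regression on rows and on kept would then give the "before" and "after" estimates that the question compares.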

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
