Statalist The Stata Listserver



Re: st: Detecting Outliers


From   "Raphael Fraser" <raphael.fraser@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Detecting Outliers
Date   Wed, 3 May 2006 08:19:08 -0500

I might add that the data are also time series, since the dates of
the measurements were recorded.


On 5/3/06, Raphael Fraser <raphael.fraser@gmail.com> wrote:
The data I am referring to are panel data. The purpose of the analysis
is to detect possible errors. I have on average 50 observations on each
of 100 subjects.

On 5/3/06, Robert A Yaffee <bob.yaffee@nyu.edu> wrote:
> There are many types of outliers, depending upon
> whether you have time series or panel data.
>    In time series, there are additive outliers, innovational
> outliers, outlier patches, for example.   Some have worse
> effects than others.   Adjacent outliers may smear or
> mask others.  They may have good or bad leverage.
>    One should have the choice of detecting, modeling, or
> replacing them depending upon their theoretical significance.
>    What kind of analysis is being done here?
>       RY
>
> Robert A. Yaffee, Ph.D.
> Research Professor
> Shirley M. Ehrenkranz
> School of Social Work
> New York University
>
> home address:
> Apt 19-W
> 2100 Linwood Ave.
> Fort Lee, NJ
> 07024-3171
> Phone: 201-242-3824
> Fax: 201-242-3825
> yaffee@nyu.edu
>
> ----- Original Message -----
> From: n j cox <n.j.cox@durham.ac.uk>
> Date: Tuesday, May 2, 2006 9:37 am
> Subject: Re: st: Detecting Outliers
>
> > The short answer is Yes, many of them.
> > A longer answer is more difficult to do well
> > given so little information.
> >
> > We have just had a thread on an overlapping
> > question. Look for "outliners" [sic] in
> > the archives.
> >
> > You don't quite say so, but these sound like
> > panel data. For concreteness, I guess 500
> > patients and 10 observations on each, one
> > for each year. My guesses have some
> > influence on my suggestions.
> >
> > What is an outlier in this context? Presumably
> > a patient who differs from many others; or
> > an observation that differs from the rest
> > of the patient's history. Both could make
> > sense, e.g. in the case of anorexic/bulimic
> > patients, or patients who had a really bad
> > year, say a fight with cancer or being
> > caught up in "Lost".
> >
> > First off, if a patient's height varies more than
> > trivially over 10 years, either there is something
> > going on, say growth for young people or some aging
> > effect, or there is an error in the data.
> >
> > Weight fluctuations would seem rather different
> > and everyone knows reasons for various kinds
> > of weight change even in adulthood. It would
> > seem a bit more difficult to pick up
> > on errors (meaning mistakes).
> >
> > There are lots of things you can do. You
> > could set up a loop to plot the time series
> > for each patient. For 500 patients that would
> > be a little tedious, but it is a direct
> > approach.
> >
> > You could try reductions, e.g.
> >
> > last height - first height
> > last weight - first weight
> > mean height over period
> > mean weight over period
> > some measure of variability of each
> >
> > and look for outliers on pairwise plots
> > of each. A scatterplot matrix often
> > shows errors even in data that have
> > supposedly been cleaned. Often
> > the cleaning is univariate, but a
> > weird data value can show up like
> > a run in fabric.
> >
> > My prejudice is that no testing or
> > measuring approach beats graphics
> > for finding outliers.
> >
> > Nick
> > n.j.cox@durham.ac.uk
> >
> >
> > Raphael Fraser
> >
> > I have 10 years of data (5000 observations) on patients' heights and
> > weights. Is there any ado-file that could assist in locating possible
> > outliers?
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/support/faqs/res/findit.html
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/
> >
>
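
The per-patient reductions Nick lists (last minus first, mean, and some measure of variability) could be sketched roughly as follows. This is an illustration in Python rather than Stata, and nothing in it comes from the thread itself: the record layout (patient id, date, height, weight), the function name patient_summaries, the sample values, and the choice of the population standard deviation as the variability measure are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def patient_summaries(records):
    """Reduce each patient's series to a few summary numbers.

    records: iterable of (patient_id, date, height, weight) tuples.
    Dates must sort chronologically (ISO strings or date objects).
    This layout is hypothetical, chosen only for the illustration.
    """
    by_patient = defaultdict(list)
    for pid, date, height, weight in records:
        by_patient[pid].append((date, height, weight))

    summaries = {}
    for pid, rows in by_patient.items():
        rows.sort(key=lambda r: r[0])  # chronological order, by date only
        heights = [r[1] for r in rows]
        weights = [r[2] for r in rows]
        summaries[pid] = {
            "height_change": heights[-1] - heights[0],  # last - first
            "weight_change": weights[-1] - weights[0],
            "mean_height": mean(heights),
            "mean_weight": mean(weights),
            "sd_height": pstdev(heights),  # one possible variability measure
            "sd_weight": pstdev(weights),
        }
    return summaries


# Made-up records; patient B has a height that looks like a typing error.
records = [
    ("A", "2001-01-01", 170.0, 70.0),
    ("A", "2002-01-01", 170.0, 72.0),
    ("A", "2003-01-01", 171.0, 71.0),
    ("B", "2001-01-01", 160.0, 55.0),
    ("B", "2003-01-01", 106.0, 56.0),
    ("B", "2002-01-01", 160.0, 54.0),
]
summaries = patient_summaries(records)
for pid in sorted(summaries):
    print(pid, summaries[pid])
```

Each patient becomes one row of summaries, which is the form a scatterplot matrix needs; the plotting itself is left out here.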

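
For the second sense of "outlier" raised in the thread, an observation that differs from the rest of the same patient's history, one possible screening aid is a median/MAD rule applied within each patient. The thread does not prescribe this rule; it is swapped in here purely as an illustration, in Python rather than Stata. The function name flag_within_patient, the 3.5 cutoff, the handling of a zero MAD, and the sample values are assumptions. Flagged values are candidates to look at on a plot, not confirmed errors.

```python
from statistics import median


def flag_within_patient(values, cutoff=3.5):
    """Return indices of values far from this patient's own median.

    Uses the modified z-score 0.6745 * |x - median| / MAD with a
    conventional cutoff of 3.5 (an assumed default, adjust as needed).
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # At least half the values equal the median, so the score is
        # undefined. As an assumed fallback, flag anything that differs.
        return [i for i, d in enumerate(abs_dev) if d > 0]
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > cutoff]


# Made-up series for one patient: 106.0 looks like a slip for 160.0.
heights = [160.0, 160.5, 159.5, 160.0, 106.0, 160.5, 160.0]
weights = [55.0, 54.0, 56.0, 57.0, 55.5, 54.5, 56.5]

print(flag_within_patient(heights))
print(flag_within_patient(weights))
```

The zero-MAD fallback is deliberately blunt: for a nearly constant series such as adult height it flags even trivial differences, so a tolerance would be needed in practice.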



© Copyright 1996–2014 StataCorp LP