Statalist The Stata Listserver



Re: st: Detecting Outliers


From   "Raphael Fraser" <raphael.fraser@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Detecting Outliers
Date   Wed, 3 May 2006 09:55:13 -0500

One can safely assume the critical values will be the same within each panel.
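
A minimal sketch of one way to read this, in Python rather than Stata: score each observation against its own subject's median and spread, then apply a single cutoff everywhere. The thread does not name a test, so the modified z-score (median and MAD, with the conventional 3.5 cutoff) is an assumption chosen for illustration, as are the function names and the made-up data.

```python
from statistics import median

def modified_z_scores(values):
    """Modified z-scores: 0.6745 * (x - median) / MAD, for one subject."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: any value off the median is infinitely far.
        return [0.0 if v == med else float("inf") for v in values]
    return [0.6745 * (v - med) / mad for v in values]

def flag_within_panels(panel, cutoff=3.5):
    """Return {subject: [indices of flagged observations]} using one cutoff."""
    flags = {}
    for subject, values in panel.items():
        scores = modified_z_scores(values)
        flags[subject] = [i for i, s in enumerate(scores) if abs(s) > cutoff]
    return flags

# Made-up illustration data: subject -> repeated measurements
panel = {
    "s1": [70.1, 69.8, 70.4, 70.0, 7.0, 70.2],   # 7.0 looks like a dropped digit
    "s2": [55.0, 55.3, 54.9, 55.1, 55.2, 55.0],
}
flags = flag_within_panels(panel)
print(flags)
```

The cutoff is shared, but the centre and scale are computed per subject, so an observation is judged against that subject's own history.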

On 5/3/06, Robert A Yaffee <bob.yaffee@nyu.edu> wrote:
Will the critical values for outliers be the same within each panel
or will they differ from panel to panel (perhaps depending upon
the sample size of each panel)?

Robert A. Yaffee, Ph.D.
Research Professor
Shirley M. Ehrenkranz
School of Social Work
New York University

home address:
Apt 19-W
2100 Linwood Ave.
Fort Lee, NJ
07024-3171
Phone: 201-242-3824
Fax: 201-242-3825
yaffee@nyu.edu

----- Original Message -----
From: Raphael Fraser <raphael.fraser@gmail.com>
Date: Wednesday, May 3, 2006 9:14 am
Subject: Re: st: Detecting Outliers

> The data I am referring to are panel data. The purpose of the analysis
> is to detect possible errors. I have on average 50 observations on each
> of 100 subjects.
>
> On 5/3/06, Robert A Yaffee <bob.yaffee@nyu.edu> wrote:
> > There are many types of outliers, depending upon
> > whether you have time series or panel data.
> >    In time series, there are additive outliers, innovational
> > outliers, outlier patches, for example.   Some have worse
> > effects than others.   Adjacent outliers may smear or
> > mask others.  They may have good or bad leverage.
> >    One should have the choice of detecting, modeling, or
> > replacing them depending upon their theoretical significance.
> >    What kind of analysis is being done here?
> >       RY
> >
> > ----- Original Message -----
> > From: n j cox <n.j.cox@durham.ac.uk>
> > Date: Tuesday, May 2, 2006 9:37 am
> > Subject: Re: st: Detecting Outliers
> >
> > > The short answer is Yes, many of them.
> > > A longer answer is more difficult to do well
> > > given such little information.
> > >
> > > We have just had a thread on an overlapping
> > > question. Look for "outliners" [sic] in
> > > the archives.
> > >
> > > You don't quite say so, but these sound like
> > > panel data. For concreteness, I guess 500
> > > patients and 10 observations on each, one
> > > for each year. My guesses have some
> > > influence on my suggestions.
> > >
> > > What is an outlier in this context? Presumably
> > > a patient who differs from many others; or
> > > an observation that differs from the rest
> > > of the patient's history. Both could make
> > > sense, e.g. in the case of anorexic/bulimic
> > > patients, or patients who had a really bad
> > > year, say a fight with cancer or being
> > > caught up in "Lost".
> > >
> > > First off, if a patient's height varies more than
> > > trivially over 10 years, either there is something
> > > going on, say growth for young people or some aging
> > > effect, or there is an error in the data.
> > >
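
The height check described above can be sketched as follows, in Python rather than Stata. This is only an illustration: the 2 cm tolerance, the function names, and the data are assumptions, not anything given in the thread.

```python
from collections import defaultdict

def height_ranges(records):
    """Return {patient_id: max height - min height} over all visits."""
    by_patient = defaultdict(list)
    for patient_id, height_cm in records:
        by_patient[patient_id].append(height_cm)
    return {pid: max(h) - min(h) for pid, h in by_patient.items()}

def flag_height_changes(records, tolerance_cm=2.0):
    """List patients whose height range exceeds the assumed tolerance."""
    ranges = height_ranges(records)
    return sorted(pid for pid, r in ranges.items() if r > tolerance_cm)

# Made-up illustration data: (patient_id, height_cm)
records = [
    (1, 170.0), (1, 170.5), (1, 169.8),
    (2, 160.0), (2, 106.0), (2, 160.2),   # 106 looks like a transposed 160
    (3, 182.0), (3, 182.4),
]
flagged = flag_height_changes(records)
print(flagged)
```

A flagged patient is a candidate for inspection only: growth or aging could explain the change, so the list is a starting point rather than a verdict.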
> > > Weight fluctuations would seem rather different
> > > and everyone knows reasons for various kinds
> > > of weight change even in adulthood. It would
> > > seem a bit more difficult to pick up
> > > on errors (meaning mistakes).
> > >
> > > There are lots of things you can do. You
> > > could set up a loop to plot the time series
> > > for each patient. For 500 patients that would
> > > be a little tedious, but it is a direct
> > > approach.
> > >
> > > You could try reductions, e.g.
> > >
> > > last height - first height
> > > last weight - first weight
> > > mean height over period
> > > mean weight over period
> > > some measure of variability of each
> > >
> > > and look for outliers on pairwise plots
> > > of each. A scatterplot matrix often
> > > shows errors even in data that have
> > > supposedly been cleaned. Often
> > > the cleaning is univariate, but a
> > > weird data value can show up like
> > > a run in fabric.
> > >
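
The reductions listed above can be sketched like this, in Python rather than Stata. The standard deviation stands in for "some measure of variability", which is an assumption; the names and the data are made up for illustration.

```python
from statistics import mean, stdev

def reduce_series(values):
    """Summarize one patient's series, assumed already sorted by year."""
    return {
        "change": values[-1] - values[0],                 # last - first
        "mean": mean(values),                             # mean over period
        "sd": stdev(values) if len(values) > 1 else 0.0,  # variability
    }

def reduce_panel(panel):
    """Apply the reductions to every patient: {patient_id: summary dict}."""
    return {pid: reduce_series(values) for pid, values in panel.items()}

# Made-up illustration data: patient -> yearly weights in kg
weights = {
    "A": [70.0, 71.0, 72.0],
    "B": [80.0, 80.0, 80.0],
    "C": [65.0, 95.0, 66.0],   # small net change, large variability
}
summary = reduce_panel(weights)
for pid, row in summary.items():
    print(pid, row)
```

Patient C shows why several reductions are useful together: the first-to-last change is small, yet the variability is large, which would stand out on a pairwise plot of the two.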
> > > My prejudice is that no testing or
> > > measuring approach beats graphics
> > > for finding outliers.
> > >
> > > Nick
> > > n.j.cox@durham.ac.uk
> > >
> > >
> > > Raphael Fraser
> > >
> > > I have 10 years' data (5000 observations) on patients' heights and
> > > weights. Is there any ado-file that could assist in locating
> > > possible outliers?
> > > *
> > > *   For searches and help try:
> > > *   http://www.stata.com/support/faqs/res/findit.html
> > > *   http://www.stata.com/support/statalist/faq
> > > *   http://www.ats.ucla.edu/stat/stata/