Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Interpretation of Two-sample t test with equal variances?


From   David Hoaglin <[email protected]>
To   [email protected]
Subject   Re: st: Interpretation of Two-sample t test with equal variances?
Date   Wed, 20 Mar 2013 19:50:32 -0400

Jay,

If the way people teach boxplots is the (main) source of the
difficulty, I would not be inclined to blame the boxplot!

I'm not aware of an assumption that outliers are an issue.  If the
data contain outliers, a boxplot will show them as individual points,
beyond the ends of the "whiskers."  The aim is to show observations
that are "outside" and may need further scrutiny.  People do refer,
incorrectly, to observations that are beyond the "fences" as
"outliers."  In data from a normal distribution, however, much more
than 5% of small to moderate-sized samples contain one or more
"outside" observations.
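
A rough way to check the point about normal samples is a small simulation. The sketch below is an illustration, not something from the original message: it assumes the usual 1.5 x IQR rule for the fences, uses the "inclusive" quartile definition from Python's statistics module (other quartile definitions give somewhat different rates), and the function names has_outside and outside_rate are made up for this example.

```python
import random
import statistics


def has_outside(sample, k=1.5):
    """Return True if any value lies beyond the fences Q1 - k*IQR, Q3 + k*IQR."""
    # Assumption: quartiles computed with the "inclusive" method.
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    step = k * (q3 - q1)
    lower_fence, upper_fence = q1 - step, q3 + step
    return any(x < lower_fence or x > upper_fence for x in sample)


def outside_rate(n, reps=5000, seed=12345):
    """Estimate the share of normal samples of size n with at least one outside value."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if has_outside(sample):
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for n in (10, 20, 50, 100):
        print(f"n = {n:3d}: share of samples with an outside value = {outside_rate(n):.3f}")
```

Running the script prints one estimated proportion per sample size, which can be compared directly with the 5% figure mentioned above.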

I'm not sure what you mean by "the box ends up being too big" if the
data are light-tailed.  I would expect the "whiskers" to be unusually
short.
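
To make the expectation about short whiskers concrete, here is a minimal sketch that compares average whisker length with box length. Everything in it is an assumption for illustration: the uniform distribution stands in for "light-tailed" data, the fences use the 1.5 x IQR rule with "inclusive" quartiles, and whisker_to_box_ratio is a hypothetical helper name.

```python
import random
import statistics


def whisker_to_box_ratio(sample, k=1.5):
    """Average whisker length divided by box length (the interquartile range)."""
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers end at the most extreme observations that are still inside the fences.
    inside = [x for x in sample if lower_fence <= x <= upper_fence]
    lower_whisker = q1 - min(inside)
    upper_whisker = max(inside) - q3
    return (lower_whisker + upper_whisker) / (2 * iqr)


if __name__ == "__main__":
    rng = random.Random(2013)  # fixed seed for reproducibility
    n = 500
    uniform_sample = [rng.uniform(0.0, 1.0) for _ in range(n)]  # light-tailed example
    normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]     # reference case
    print("uniform:", round(whisker_to_box_ratio(uniform_sample), 2))
    print("normal: ", round(whisker_to_box_ratio(normal_sample), 2))
```

With uniform data each whisker covers roughly a quarter of the range while the box covers half, so the ratio should come out well below the value for normal data of the same size.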

A boxplot can do only so much.  The display was not designed to reveal
bimodal or multimodal data.  A dotplot would usually show that
structure easily.

David Hoaglin

On Wed, Mar 20, 2013 at 7:19 PM, JVerkuilen (Gmail)
<[email protected]> wrote:
> On Wed, Mar 20, 2013 at 3:22 PM, David Hoaglin <[email protected]> wrote:
>> Jay,
>>
>> I'm not aware that boxplots make any assumptions.  They show what they
>> are intended to show.  Their "performance" comes from the way people
>> interpret them.  Boxplots of skewed data will tend to have certain
>> characteristics, boxplots of light-tailed data will have other
>> characteristics, and so on.  Some patterns suggest bimodal data.
>
> Oh definitely they show what they were intended to show, and they are
> incredibly useful, but the way we teach them I think leads many folks
> down the garden path. The assumptions I'm thinking of include ones
> such as the largely unstated background assumption that outliers are
> an issue. I've become adept at recognizing when a boxplot is giving me
> a light tailed distribution because the box ends up being too big, but
> if you have multiple modes that will get blown away and they provide
> too much reduction.

