Re: st: Interpretation of Two-sample t test with equal variances?

From   David Hoaglin
Subject   Re: st: Interpretation of Two-sample t test with equal variances?
Date   Wed, 20 Mar 2013 19:50:32 -0400


If the way people teach boxplots is the (main) source of the
difficulty, I would not be inclined to blame the boxplot!

I'm not aware of an assumption that outliers are an issue.  If the
data contain outliers, a boxplot will show them as individual points,
beyond the ends of the "whiskers."  The aim is to show observations
that are "outside" and may need further scrutiny.  People do refer,
incorrectly, to observations that are beyond the "fences" as
"outliers."  In data from a normal distribution, however, much more
than 5% of small to moderate-sized samples contain one or more
"outside" observations.
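
A small simulation can make the point about "outside" observations concrete. The sketch below is in Python, not Stata, and rests on assumptions not stated above: the fences sit 1.5 times the interquartile range beyond the quartiles, the quartiles come from the "inclusive" method of statistics.quantiles (one of several common definitions), and the function names, sample sizes, and seed are purely illustrative. The exact shares depend on the quartile rule chosen.

```python
import random
import statistics


def has_outside_value(sample, k=1.5):
    """Return True if any observation lies beyond the fences.

    Assumption: fences are placed k * IQR beyond the quartiles, with
    quartiles computed by the "inclusive" method of statistics.quantiles.
    """
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return any(x < lower or x > upper for x in sample)


def share_with_outside(n, reps=5000, seed=12345):
    """Estimate the share of normal samples of size n that contain
    at least one observation beyond the fences."""
    rng = random.Random(seed)  # fixed seed (illustrative) for repeatability
    hits = sum(
        has_outside_value([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    )
    return hits / reps


if __name__ == "__main__":
    for n in (10, 20, 50, 100):
        print(n, round(share_with_outside(n), 3))
```

Running the script prints one estimated share per sample size, which can be compared directly with the 5% figure.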

I'm not sure what you mean by "the box ends up being too big" if the
data are light-tailed.  I would expect the "whiskers" to be unusually
short.

A boxplot can do only so much.  The display was not designed to reveal
bimodal or multimodal data.  A dotplot would usually show that
structure easily.
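
To illustrate how a boxplot can hide more than one mode, here is a rough Python sketch with two made-up batches of 17 values. The data, the function names, and the choice of the "inclusive" quartile method are all assumptions for illustration. Both batches have the same five-number summary, so their boxplots would look alike, yet a crude text dotplot separates them at a glance.

```python
from collections import Counter
import statistics


def five_number_summary(data):
    """Minimum, lower quartile, median, upper quartile, maximum.

    Assumption: quartiles use the "inclusive" method of statistics.quantiles.
    """
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (min(data), q1, med, q3, max(data))


def text_dotplot(data):
    """One row per integer value, one 'o' per observation at that value."""
    counts = Counter(data)
    rows = (
        f"{v:>2} " + "o" * counts[v]
        for v in range(min(data), max(data) + 1)
    )
    return "\n".join(row.rstrip() for row in rows)


# Hypothetical batches: one with two clusters, one with a single central cluster.
two_clusters = [0, 1, 2, 2, 2, 2, 2, 3, 5, 7, 8, 8, 8, 8, 8, 9, 10]
one_cluster = [0, 1, 1, 2, 2, 4, 5, 5, 5, 5, 5, 6, 8, 8, 9, 9, 10]

if __name__ == "__main__":
    for name, batch in (("two clusters", two_clusters), ("one cluster", one_cluster)):
        print(name, five_number_summary(batch))
        print(text_dotplot(batch))
        print()
```

The summaries agree because the five values a boxplot is built from say nothing about where the remaining observations pile up between them.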

David Hoaglin

On Wed, Mar 20, 2013 at 7:19 PM, JVerkuilen (Gmail) wrote:
> On Wed, Mar 20, 2013 at 3:22 PM, David Hoaglin wrote:
>> Jay,
>> I'm not aware that boxplots make any assumptions.  They show what they
>> are intended to show.  Their "performance" comes from the way people
>> interpret them.  Boxplots of skewed data will tend to have certain
>> characteristics, boxplots of light-tailed data will have other
>> characteristics, and so on.  Some patterns suggest bimodal data.
> Oh definitely they show what they were intended to show, and they are
> incredibly useful, but the way we teach them I think leads many folks
> down the garden path. The assumptions I'm thinking of include ones
> such as the largely unstated background assumption that outliers are
> an issue. I've become adept at recognizing when a boxplot is giving me
> a light-tailed distribution because the box ends up being too big, but
> if you have multiple modes, they get blown away: the boxplot provides
> too much reduction.