
st: confidence intervals overlap [was: positive interaction - negative covariance]

From   Dirk Enzmann <>
Subject   st: confidence intervals overlap [was: positive interaction - negative covariance]
Date   Tue, 26 Feb 2013 15:55:23 +0100

David points to the problem of judging the significance of differences of independent estimates by comparing their confidence intervals. However, I don't think that "it is usually a mistake to compare their confidence intervals". This warning goes far too far: As long as the estimates are independent, graphs of confidence intervals are a very useful tool to assist "inference by eye".

Schenker & Gentleman (2001, p. 182) write: "To judge whether the difference between two point estimates is statistically significant, data analysts sometimes examine the overlap between the two associated confidence intervals. If there is no overlap, the difference is judged significant, and if there is overlap, the difference is not judged significant."

The problem is that the "no overlap" rule is naive. However, judging significance by examining the overlap of two confidence intervals is (nearly) perfectly valid if you adjust the rule to: "As long as the confidence intervals do not overlap by more than half of the average arm length, the difference is significant." More precisely: if n per sample >= 10, an overlap of half the average arm length corresponds to p of about .05, and just-touching arms correspond to p of about .01. For a detailed discussion see Cumming & Finch (2005). It is not the mere *overlap* of confidence intervals but the *amount* of overlap that the reader of graphs showing confidence intervals of several group means should take into account.
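As an illustration, here is a small Python sketch of my own (invented numbers, assuming equal standard errors and z-based 95% intervals, not part of the rule itself): half-arm overlap lands near p = .05, and just-touching arms land near p = .01.

```python
import math

def proportion_overlap(m1, se1, m2, se2, z=1.96):
    """Overlap of two z-based CIs as a fraction of the average arm length.

    Negative values mean the intervals do not touch (a gap)."""
    lo1, hi1 = m1 - z * se1, m1 + z * se1
    lo2, hi2 = m2 - z * se2, m2 + z * se2
    overlap = min(hi1, hi2) - max(lo1, lo2)
    avg_arm = z * (se1 + se2) / 2
    return overlap / avg_arm

def p_diff(m1, se1, m2, se2):
    """Two-sided p-value for the difference of two independent estimates."""
    zstat = abs(m1 - m2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return 2 * (1 - 0.5 * (1 + math.erf(zstat / math.sqrt(2))))

se = 1.0
# Gap chosen so the CIs overlap by exactly half an arm: 2*1.96 - 0.98 = 2.94
print(round(proportion_overlap(0.0, se, 2.94, se), 3))  # 0.5
print(round(p_diff(0.0, se, 2.94, se), 3))              # approx. 0.038

# Arms just touching: gap of two full arms, 2*1.96 = 3.92
print(round(proportion_overlap(0.0, se, 3.92, se), 3))  # 0.0
print(round(p_diff(0.0, se, 3.92, se), 3))              # approx. 0.006
```

With equal standard errors the half-arm case gives p a bit below .05, consistent with Cumming & Finch's "p about .05".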

Applying this rule of thumb to the example given by Schenker & Gentleman (2001, p. 183) clearly shows that both proportions differ significantly (somewhere between p < .05 and p < .01); in fact, p = .017:

* ==== Start Stata commands: ===========================================

* Example "Comparing Proportions" (Schenker & Gentleman, 2001, p. 183)

clear
input samp y freq
1 0  88
1 1 112
2 0 112
2 1  88
end
logistic y samp [fw=freq]

mat meanci = J(2,4,.)
forvalues i = 1/2 {
  ci y if samp==`i' [fw=freq], b w
  mat meanci[`i',1] = r(mean)
  mat meanci[`i',2] = r(lb)
  mat meanci[`i',3] = r(ub)
  mat meanci[`i',4] = `i'
}

svmat meanci
label variable meanci1 "proportion"
label variable meanci2 "ci_l"
label variable meanci3 "ci_u"
label variable meanci4 "sample"

twoway (scatter meanci1 meanci4 in 1/2) ///
       (rcap meanci2 meanci3 meanci4 in 1/2), ytitle("y") ///
       ylab(, angle(0)) xscale(range(0.5 2.5)) xlabel(1(1)2) ///
       title("Example Schenker & Gentleman (95% Wilson CIs)", ///
       size(medium)) legend(cols(1) position(4))

* ==== End Stata commands. =============================================
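For readers without Stata, a short Python cross-check (my own sketch, not part of the original example) reproduces the numbers: the pooled two-proportion z-test gives p of about .016 (the Wald test from the logistic regression above gives the quoted .017), while the 95% Wilson intervals overlap by only about a quarter of the average arm length, well under half an arm, so the rule of thumb also flags the difference as significant.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def wilson_ci(k, n, z=1.96):
    """95% Wilson score interval for a binomial proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Schenker & Gentleman (2001, p. 183): 112/200 vs 88/200
k1, n1, k2, n2 = 112, 200, 88, 200

# Pooled two-proportion z-test for the difference
pooled = (k1 + k2) / (n1 + n2)
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (k1 / n1 - k2 / n2) / se
print(round(two_sided_p(z), 3))     # 0.016

# The 95% Wilson CIs overlap, but by well under half an arm length
lo1, hi1 = wilson_ci(k1, n1)
lo2, hi2 = wilson_ci(k2, n2)
overlap = min(hi1, hi2) - max(lo1, lo2)
avg_arm = ((hi1 - lo1) + (hi2 - lo2)) / 4
print(round(overlap / avg_arm, 2))  # approx. 0.27
```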

Of course, our eyes tend to tell us lies; this simple rule of thumb is not exact, and as with every rule of thumb there are problems, such as issues of efficiency, multiple testing, etc. But when the "half arm length rule" is used instead of the "naive rule" (as considered by Schenker & Gentleman), inference by eye is too useful to be thrown out with the bathwater.

It is nevertheless necessary to point out the widespread misconception that confidence intervals must not overlap if the difference is statistically significant. Read properly, figures with confidence intervals are useful for inferential purposes.


Cumming, G. & Finch, S. (2005). Inference by eye. Confidence intervals and how to read pictures of data. American Psychologist, 60, 170-180.

Schenker, N. & Gentleman J. F. (2001). On judging the significance of differences by examining the overlap between confidence intervals. The American Statistician, 55, 182-186.


Sat, 23 Feb 2013 15:34:39 -0500, David Hoaglin <> wrote:
Subject: Re: st: positive interaction - negative covariance

The problems that arise from trying to compare confidence intervals
are more general.  They arise in situations where the estimates are
independent.  Thus, the covariance in the sampling distribution of b1
and b3 is not the real issue.

To assess the difference between two estimates, it is usually a
mistake to compare their confidence intervals.  The correct approach
is to form the appropriate confidence interval for the difference and
ask whether that confidence interval includes zero.  I often encounter
people who think that they can determine whether two estimates (e.g.,
the means of two independent samples) are different by checking
whether the two confidence intervals overlap.  They are simply wrong.
The article by Schenker and Gentleman (2001) explains.  (I said
"usually" above to exclude intervals that are constructed specifically
for use in assessing the significance of pairwise comparisons.)

David Hoaglin

Nathaniel Schenker and Jane F. Gentleman, On judging the significance
of differences by examining the overlap between confidence intervals.
The American Statistician 2001; 55(3):182-186.

Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
Rothenbaumchaussee 33
D-20148 Hamburg

phone: +49-(0)40-42838.7498 (office)
       +49-(0)40-42838.4591 (Mrs Billon)
fax:   +49-(0)40-42838.2344