The Stata listserver

Re: st: Binomial confidence intervals

From   Marcello Pagano
Subject   Re: st: Binomial confidence intervals
Date   Wed, 08 Sep 2004 16:22:40 -0400

I disagree. This is too drastic a summary. If this were all there is and you prefer A, then I have a terrific CI for you: It satisfies A, it is robust to everything, including the data, and it is the interval zero to one. It is guaranteed to work every time.

No, there is more to the controversy, including the length of the interval, and, some would add, the center of the interval.

I agree with the authors of the paper when they say that approximation plays a central role in the whole discussion, including why the interval zero to one is silly!


The controversy can be summarized like this:

A. Suppose I am trying to estimate a true probability of success P from a sample of size N. If I draw 1 million samples and compute my interval from each one, I want at least 95% of those intervals to cover the true P. I want this to hold irrespective of the true value of P and the sample size N (i.e., 100% of the time). So, in a different situation with probability P* and sample size N*, the coverage should also be 95% or better. In other words, I want an interval with coverage AT LEAST 95% ALL THE TIME (in some situations, it will have more).

B. In contrast, I may want the 95% coverage to hold "on average," across different situations. In other words, for some combination of P and N, I am willing to accept coverage less than 95%, while for some other combination P* and N*, I will have coverage better than 95%. All I want is for my interval to give at least 95% coverage ON AVERAGE.
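The difference between (A) and (B) can be made concrete by computing coverage exactly rather than by drawing a million samples. The sketch below is only an illustration under stated assumptions: it uses the Wilson score interval as a stand-in for a (B)-type interval (the post does not name it), fixes N = 20, and treats "on average" as a plain average over an evenly spaced grid of P values. The function names are hypothetical.

```python
from math import comb, sqrt

Z = 1.959964  # two-sided 95% normal quantile

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def wilson_interval(k, n, z=Z):
    """Wilson score interval for k successes in n trials (illustrative choice)."""
    phat = k / n
    denom = 1 + z * z / n
    center = (phat + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n))
    return center - half, center + half

def coverage(p, n, interval):
    """Exact coverage at p: total probability of the outcomes k whose interval contains p."""
    total = 0.0
    for k in range(n + 1):
        lo, hi = interval(k, n)
        if lo <= p <= hi:
            total += binom_pmf(k, n, p)
    return total

N = 20                                      # assumed sample size
grid = [i / 1000 for i in range(1, 1000)]   # assumed grid of true P values
cov = [coverage(p, N, wilson_interval) for p in grid]

min_coverage = min(cov)               # what criterion (A) looks at
mean_coverage = sum(cov) / len(cov)   # what criterion (B) looks at

print(f"minimum coverage over the grid: {min_coverage:.4f}")
print(f"average coverage over the grid: {mean_coverage:.4f}")
```

Criterion (A) judges an interval by the minimum, criterion (B) by the average, so the same interval can pass one test and fail the other.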

(A) leads us to conservative intervals such as Clopper-Pearson. They have to cover the truth 95% of the time no matter what the situation. Even a single situation where they fall to 94.9% is not acceptable. However, they need not be as conservative as Clopper-Pearson. The Blyth-Still-Casella interval is much better (i.e., it stays at 95% or better no matter what, but does not become as conservative as Clopper-Pearson) and is very competitive with those advocated in Agresti & Coull and Brown et al.
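To see what "95% or better no matter what" means in practice, here is a minimal sketch of the Clopper-Pearson interval and its exact coverage. It uses only the standard library and finds the limits by bisection on the binomial tail probabilities, which is one way to compute them and not necessarily how any particular package does it. N = 20 and the grid of P values are assumptions, and the helper names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """Probability of at most k successes in n trials."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))

def bisect_root(f, lo=0.0, hi=1.0, iters=100):
    """Root of an increasing function f with f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact interval: each tail probability at the limits equals alpha/2."""
    if k == 0:
        lower = 0.0
    else:
        # P(X >= k | p) increases in p; solve P(X >= k | p) = alpha/2
        lower = bisect_root(lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    if k == n:
        upper = 1.0
    else:
        # P(X <= k | p) decreases in p; solve P(X <= k | p) = alpha/2
        upper = bisect_root(lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper

N = 20                                          # assumed sample size
intervals = [clopper_pearson(k, N) for k in range(N + 1)]

def coverage(p):
    """Exact coverage of the Clopper-Pearson interval at true probability p."""
    return sum(binom_pmf(k, N, p)
               for k, (lo, hi) in enumerate(intervals) if lo <= p <= hi)

grid = [i / 200 for i in range(1, 200)]         # assumed grid of true P values
cov = [coverage(p) for p in grid]
min_coverage = min(cov)
mean_coverage = sum(cov) / len(cov)

print(f"minimum coverage over the grid: {min_coverage:.4f}")
print(f"average coverage over the grid: {mean_coverage:.4f}")
```

The minimum never drops below 95% on the grid, and the gap between the average and 95% is the conservatism that the Blyth-Still-Casella interval is meant to reduce.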

(B) leads us to generally shorter intervals, but in some situations they fall below 95% (sometimes way below that). Brown et al. propose certain "corrections" to remove the most severe dips in coverage probability. Still, none of those intervals is guaranteed to have 95% coverage or better in a particular situation. And the problem is we cannot be sure how much below 95% they can fall. All we know is that they will have 95% coverage on average.
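How far coverage can dip depends on the interval, on P, and on N. As a rough illustration only, the sketch below scans a grid of P values for several sample sizes and reports the worst exact coverage of the textbook normal-approximation (Wald) interval. The Wald interval is my choice for the example, not one of the intervals recommended above; it is used because its dips are easy to see. The grid and the sample sizes are assumptions.

```python
from math import comb, sqrt

Z = 1.959964  # two-sided 95% normal quantile

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def wald_interval(k, n, z=Z):
    """Normal-approximation interval: phat +/- z * sqrt(phat * (1 - phat) / n)."""
    phat = k / n
    half = z * sqrt(phat * (1 - phat) / n)
    return phat - half, phat + half

def coverage(p, n):
    """Exact coverage of the Wald interval at true probability p."""
    total = 0.0
    for k in range(n + 1):
        lo, hi = wald_interval(k, n)
        if lo <= p <= hi:
            total += binom_pmf(k, n, p)
    return total

def worst_case(n, grid):
    """Return (lowest coverage, P where it occurs) over the grid."""
    return min((coverage(p, n), p) for p in grid)

grid = [i / 100 for i in range(1, 100)]      # assumed grid of true P values
worst = {n: worst_case(n, grid) for n in (10, 20, 50, 100)}

for n, (cov, p) in worst.items():
    print(f"N = {n:3d}: worst coverage {cov:.3f} at P = {p:.2f}")
```

A finer grid or a different range of P can move the worst case, which is the practical form of the complaint that we cannot be sure how far below 95% such an interval can fall.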

I go with A. I think that B is a bit ad hoc and a dip below 95% may be acceptable to me but unacceptable to you. But either approach seems quite defensible.


Constantine Daskalakis, ScD
Assistant Professor,
Biostatistics Section, Thomas Jefferson University,
211 S. 9th St. #602, Philadelphia, PA 19107
Tel: 215-955-5695
Fax: 215-503-3804
