# Re: st: Binomial confidence intervals

From: rgutierrez@stata.com (Roberto G. Gutierrez, StataCorp)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Binomial confidence intervals
Date: Tue, 07 Sep 2004 13:17:58 -0500

```
Richard Williams <Richard.A.Williams.5@nd.edu> asks:

> The -ci- command includes several options for computing binomial confidence
> intervals: exact (the default), wilson, agresti and jeffreys.  Just so I am
> clear on these, is "exact" really really really exact?  Am I correct in
> guessing that "exact" can be the most difficult to do by hand, and the
> others are therefore approximations that are somewhat easier to calculate if
> you don't have a computer?  Is there any reason I would not want to use
> "exact", other than perhaps to replicate a calculation done using one of the
> other methods? Thanks for any insights.

> I remember checking this at some point. The "exact" option indeed gives the
> exact binomial confidence interval (which requires an iterative algorithm,
> e.g., bisection, to find the limits). StatXact calls this the
> "Clopper-Pearson" interval.

> I have no idea what these other intervals are and why we need them. I was
> not aware of those options and they are not documented in the manual.

The exact interval used by -ci, binomial- is the Clopper-Pearson interval,
but you must realize that "exact" is a bit of a misnomer.  It is exact in the
sense that it uses the binomial distribution as the basis of the calculation.
However, the binomial distribution is a discrete distribution and as such its
cumulative probabilities will have discrete jumps, and thus you'll be hard
pressed to get (say) exactly 95% coverage.

What Clopper-Pearson does do is guarantee that the coverage is AT LEAST 95%
(or whatever level you specify) and so it is desirable in that sense.  It is
able to accomplish this goal by using the exact binomial distribution in its
calculations.

However, by guaranteeing at least 95% coverage, Clopper-Pearson can be a bit
conservative (wide) for some tastes, since for some n and p the true coverage
can get quite close to 100%.  The other intervals (Jeffreys, Agresti, Wilson)
offered by -ci- attempt to be less conservative: they aim for coverage close
to the stated level without the constraint that it never fall below that
level.  These new intervals were added (by popular demand) after the release
of Stata 8, and so you won't find them in the manual.

The definitive article covering all this, including definitions for Jeffreys,
Agresti, and Wilson, is

Brown, Cai, & DasGupta (2001).  Interval Estimation for a Binomial Proportion.
Statistical Science 16: 101-133.

Great article if you are into this sort of thing.

--Bobby
rgutierrez@stata.com
```
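The coverage argument in the reply can be checked numerically. The sketch below is an illustration in Python using only the standard library; it is not Stata's implementation, and the function names (`clopper_pearson`, `wilson`, `coverage`) are made up for this example. It finds the Clopper-Pearson limits by bisection on the binomial tail probabilities, computes the Wilson interval from its closed form, and then sums binomial probabilities over the outcomes whose interval contains the true p, which gives the actual coverage for a given n and p.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def solve_increasing(g, target, iters=60):
    """Bisection on [0, 1] for g(p) = target, with g increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, level=0.95):
    """'Exact' interval: each tail probability at the limit equals alpha/2."""
    alpha = 1 - level
    # Lower limit solves P(X >= x | p) = alpha/2 (taken as 0 when x == 0).
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit solves P(X <= x | p) = alpha/2 (taken as 1 when x == n).
    upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


def wilson(x, n, level=0.95):
    """Wilson score interval (closed form, no iteration needed)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    phat = x / n
    denom = 1 + z * z / n
    center = (phat + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n))
    return max(0.0, center - half), min(1.0, center + half)


def coverage(interval, n, p, level=0.95):
    """Actual coverage: total probability of outcomes whose interval holds p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n, level)
        if lo <= p <= hi:
            total += binom_pmf(x, n, p)
    return total


if __name__ == "__main__":
    n = 20
    for p in (0.1, 0.3, 0.5):
        print(p, coverage(clopper_pearson, n, p), coverage(wilson, n, p))
```

Because only n + 1 outcomes are possible, the coverage is a sum of a few discrete probabilities and moves in jumps as p changes, which is why no method can hit the stated level exactly for every p. Running the loop over a finer grid of p values shows how far above the nominal level the Clopper-Pearson coverage sits and how the Wilson coverage moves around it.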