
Re: st: Binomial confidence intervals


From   rgutierrez@stata.com (Roberto G. Gutierrez, StataCorp)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Binomial confidence intervals
Date   Tue, 07 Sep 2004 14:46:50 -0500

Earlier, I wrote:

>>The exact interval used by -ci, binomial- is the Clopper-Pearson interval,
>>but you must realize that "exact" is a bit of a misnomer.  It is exact in the
>>sense that it uses the binomial distribution as the basis of the calculation.
>>However, the binomial distribution is a discrete distribution and as such its
>>cumulative probabilities will have discrete jumps, and thus you'll be hard
>>pressed to get (say) exactly 95% coverage.
 
to which Constantine Daskalakis <C_Daskalakis@mail.jci.tju.edu> responds:

> I do not think this is correct. For the CI, it is the parameter space, not
> the sample space, that matters (and the former is continuous). In other
> words, if we have k successes out of N trials, we are looking for limits
> {p_l, p_u}, such that

> Pr [K <= k | p_l] = a/2

> and

> Pr [K >= k | p_u] = a/2

> In general, there exist such limits that correspond to tail probabilities of
> (exactly) a/2. The fact that the sample space is highly discrete (when N is
> small) has nothing to do with it. The only exception is when the observed
> number of successes is either 0 or N; in that case, one limit is on the
> boundary of the parameter space (p_l=0 or p_u=1) and the corresponding tail
> probability on that side is exactly 0, not a/2 (as the manual correctly
> points out).

Constantine is correct that there are exact solutions to the above equations.
However, the problem is that there are only N+1 possible CIs that can be
generated from a binomial experiment of N trials.
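
As a rough illustration of those exact solutions, here is a minimal Python
sketch (standard library only; it is not Stata's implementation, and the names
binom_cdf, root_increasing, and clopper_pearson are made up for this example).
It assumes the usual Clopper-Pearson assignment of the tails, with the lower
limit solving Pr[K >= k | p_l] = a/2 and the upper limit solving
Pr[K <= k | p_u] = a/2, and finds each limit by bisection over the continuous
parameter space.

```python
from math import comb

def binom_cdf(k, n, p):
    """Pr[K <= k] for K ~ binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def root_increasing(f, lo=0.0, hi=1.0, iters=100):
    """Bisection for f(p) = 0 on [lo, hi], assuming f is increasing in p."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Two-sided limits (p_l, p_u) for k successes in n trials."""
    # Lower limit: Pr[K >= k | p_l] = alpha/2; on the boundary 0 when k = 0.
    p_l = 0.0 if k == 0 else root_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper limit: Pr[K <= k | p_u] = alpha/2; on the boundary 1 when k = n.
    p_u = 1.0 if k == n else root_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return p_l, p_u

# The three cases whose -cii- output is shown in this post (N = 9).
for k in (0, 1, 9):
    p_l, p_u = clopper_pearson(k, 9)
    print(f"k={k}: [{p_l:.7f}, {p_u:.7f}]")
```

For k = 0, 1, and 9 with N = 9, the printed limits should agree with the -cii-
output in this post to the displayed precision.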

Consider the case where N=9.  There are ten possible outcomes of the binomial
experiment, namely zero successes, one success, ..., nine successes.  Since
there are only ten possible k's (0,1,...,9), there are only ten possible
confidence intervals, namely

. cii 9 0, exact
                                                         -- Binomial Exact --
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |          9           0           0               0    .3362671*

. cii 9 1, exact
                                                         -- Binomial Exact --
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |          9    .1111111    .1047566        .0028091    .4824965

... 

. cii 9 9, exact
                                                         -- Binomial Exact --
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |          9           1           0        .6637329           1*

Some of these ten intervals cover p, and some don't.  The coverage
probability is then the sum, with respect to the binomial(9,p) distribution,
of the probabilities of those k's whose intervals cover p.  Because k's enter
and leave that sum in discrete jumps as p varies, it is difficult to make the
coverage probability equal exactly 95%.
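
The coverage calculation just described can be sketched in Python as follows.
This is only an illustration under stated assumptions: the function names
(binom_cdf, root_increasing, clopper_pearson, coverage) are hypothetical, the
limits are found by bisection rather than by whatever method -cii- uses
internally, and the values of p tried at the end are arbitrary.

```python
from math import comb

def binom_cdf(k, n, p):
    """Pr[K <= k] for K ~ binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def root_increasing(f, lo=0.0, hi=1.0, iters=100):
    """Bisection for f(p) = 0 on [lo, hi], assuming f is increasing in p."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Two-sided limits (p_l, p_u) for k successes in n trials."""
    p_l = 0.0 if k == 0 else root_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    p_u = 1.0 if k == n else root_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return p_l, p_u

def coverage(p, n=9, alpha=0.05):
    """Sum binomial(n, p) probabilities of the k's whose interval covers p."""
    total = 0.0
    for k in range(n + 1):
        p_l, p_u = clopper_pearson(k, n, alpha)
        if p_l <= p <= p_u:
            total += comb(n, k) * p**k * (1 - p)**(n - k)
    return total

# Arbitrary true values of p, chosen only for illustration.
for p in (0.1, 0.3, 0.5):
    print(f"p={p}: coverage={coverage(p):.4f}")
```

At p = 0.5, for instance, the intervals for k = 2 through 7 cover p while those
for k = 0, 1, 8, and 9 do not, so the coverage works out to 1 - 20/512, or
about 0.961, rather than 0.95.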

--Bobby
rgutierrez@stata.com