# RE: st: RE: RE: Confidence Interval for Proportion

 From "Nick Cox" To Subject RE: st: RE: RE: Confidence Interval for Proportion Date Tue, 11 Mar 2008 18:28:23 -0000

```There is a superb review paper at

Brown, L.D., Cai, T.T., DasGupta, A. 2001. Interval estimation for a
binomial proportion. Statistical Science 16: 101-133.

This should be accessible to many, if not all, Statalist members at

<http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?handle=euclid.ss/1009213286&view=body&content-type=pdf_1>

Nick
n.j.cox@durham.ac.uk

Maarten Buis

Actually, exact confidence intervals are not as exact as the name
suggests, especially in the case of small proportions. These confidence
intervals tend to be conservative; see Agresti (2002, pp. 18-19) and
the simulation below. If the exact method were truly exact in all
regards, then the proportion of 95% confidence intervals containing the
true proportion should be .95. In fact the proportion is higher;
this is what I mean by the interval being conservative.

*--------------- begin example ----------------------------
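set more off
capture program drop sim
program define sim, rclass
    * one replication: 1000 Bernoulli draws with true proportion .99
    drop _all
    set obs 1000
    gen x = uniform()<.99
    * exact (Clopper-Pearson) interval, the default for -ci, binomial-
    ci x, binomial
    * 1 if the interval contains the true proportion, 0 otherwise
    return scalar correct = r(lb)<.99 & r(ub)>.99
end
* repeat 10000 times; the mean of correct estimates the coverage
simulate correct=r(correct), reps(10000): sim
sum correct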
*------------------- end example --------------------------
(For more on how to use examples I sent to the Statalist, see
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )

The paper you seem to be referring to is:
Agresti, A. and B.C. Coull (1998) "Approximate is better than exact for
interval estimation of binomial parameters" The American Statistician,
pp. 119--126.

Alan Agresti (2002) "Categorical Data Analysis", 2nd edition, Wiley.

Hope this helps,
Maarten

--- "Lachenbruch, Peter" <Peter.Lachenbruch@oregonstate.edu> wrote:
> For small proportions, the exact option is useful.  It is the
> standard that the other methods hope to reach.  Coverage is exact.
> Agresti and Coull have a nice paper (I don't remember the
> attribution,  but I think it's American Statistician, somewhere
> around 2000).

Nick Cox

> The "correct" CI for a binomial variable is a matter of dispute.
>
> In your case you are looking for a CI around a point estimate of
> 0.029.
>
> A symmetric CI around such a point estimate is likely to include 0
> and some negative values unless the sample size is very, very large.
>
> Some people just truncate the interval at 0, but a more defensible
> procedure is to work on a transformed scale and back-transform, or do
> something approximately equivalent that yields positive endpoints
> for the CI with about the right coverage. [R] ci has several pointers
> to the literature.
>
> Alternative CIs can be got in this way:
>
> . gen rep78_1 = rep78 == 1
> . ci rep78_1 if rep78 < ., binomial jeffreys
> . ci rep78_1 if rep78 < ., binomial wilson
>
> Nick
> n.j.cox@durham.ac.uk
>
> Martin Weiss
>
> try this in Stata:
>
>
> ************************
> sysuse auto, clear
> proportion rep78
> matrix define A=e(b)
> matrix define B=e(V)
> count if rep78!=.
> *Upper/Lower Bound for proportion of "1"
> di A[1,1]+invnormal(1-0.05/2)*sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> di A[1,1]-invnormal(1-0.05/2)*sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> *Standard Error for "1"
> *Mistake obviously there...
> di sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> ************************
>
>
> Then let me know: why do I not hit the correct CI for the proportion
> of "1" in the repair record? Something's wrong with the standard
> error, I do not know what, though...

```
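Maarten's simulation estimates the coverage of the exact interval by Monte Carlo. The same quantity can also be computed directly: the true proportion p lies strictly inside the exact (Clopper-Pearson) interval for k successes precisely when both tail probabilities, P(X >= k) and P(X <= k), exceed alpha/2 at that p. The sketch below is not part of the thread. It assumes a Stata version that provides the functions binomial(), binomialtail() and binomialp(), and the local macro names are made up for illustration.

```stata
* Sketch: exact coverage of the 95% Clopper-Pearson interval,
* same setting as the simulation in the thread (n = 1000, p = .99)
local n 1000
local p .99
local alpha .05
local cover 0
forvalues k = 0/`n' {
    * p is above the lower limit when P(X >= k | p) > alpha/2
    * p is below the upper limit when P(X <= k | p) > alpha/2
    if binomialtail(`n', `k', `p') > `alpha'/2 & binomial(`n', `k', `p') > `alpha'/2 {
        * add the probability of observing exactly k successes
        local cover = `cover' + binomialp(`n', `k', `p')
    }
}
display "exact coverage at p = `p': " %6.4f `cover'
```

By construction the exact interval cannot cover less often than .95, so the amount by which the displayed value exceeds .95 measures how conservative the interval is at this n and p. Changing the locals n and p shows how that excess varies across settings, with no simulation error involved.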