
RE: st: RE: RE: Confidence Interval for Proportion

From   "Nick Cox" <>
To   <>
Subject   RE: st: RE: RE: Confidence Interval for Proportion
Date   Tue, 11 Mar 2008 18:28:23 -0000

There is a superb review paper at 

Brown, L.D., Cai, T.T., DasGupta, A. 2001. Interval estimation for a
proportion. Statistical Science 16: 101-133.

This should be accessible to many, if not all, Statalist members at 



Maarten Buis

Actually, exact confidence intervals are not as exact as the name
suggests, especially in the case of small proportions. These confidence
intervals tend to be conservative; see Agresti (2002, pp. 18-19) and
the simulation below. If the exact method were truly exact in all
regards, then the proportion of 95% confidence intervals containing the
true proportion should be .95. In actual fact the proportion is higher;
this is what I mean by the interval being conservative.

*--------------- begin example ----------------------------
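set more off
capture program drop sim
program define sim, rclass
	* draw 1000 Bernoulli observations with true proportion .99
	drop _all
	set obs 1000
	gen x = uniform()<.99
	* exact binomial confidence interval
	* (in Stata 14 or later: ci proportions x)
	ci x, binomial
	* 1 if the interval contains the true proportion, 0 otherwise
	return scalar correct = r(lb)<.99 & r(ub)>.99
end
simulate correct=r(correct), reps(10000): sim
* the mean of correct estimates the coverage probability
sum correct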
*------------------- end example --------------------------
(For more on how to use examples I sent to the Statalist, see )
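
The simulation above estimates the coverage with some Monte Carlo noise.
As a cross-check, the coverage can also be computed without simulation by
summing the binomial probabilities of all outcomes whose interval contains
the true proportion. The following is only a sketch, not part of the
original example. It assumes that the "exact" interval is the
Clopper-Pearson interval, which it rebuilds from beta quantiles with
invibeta(), and that binomialp() is available in your Stata release. The
scalar names cp_lb, cp_ub and cover_exact are made up for illustration.

```stata
* exact coverage of the Clopper-Pearson 95% interval at n = 1000, p = .99
local n 1000
local p .99
scalar cover_exact = 0
forvalues k = 0/`n' {
	* Clopper-Pearson limits as beta quantiles; the limits are 0 and 1
	* by definition when k = 0 or k = n
	scalar cp_lb = cond(`k' == 0, 0, invibeta(`k', `n' - `k' + 1, .025))
	scalar cp_ub = cond(`k' == `n', 1, invibeta(`k' + 1, `n' - `k', .975))
	* add P(K = k) when the interval for this k contains the true p
	if scalar(cp_lb) < `p' & scalar(cp_ub) > `p' {
		scalar cover_exact = scalar(cover_exact) + binomialp(`n', `k', `p')
	}
}
di "exact coverage: " scalar(cover_exact)
```

If the assumptions hold, the displayed value should be close to the mean
of correct from the simulation, and changing the locals n and p shows how
the coverage moves with the sample size and the true proportion.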

The reference you seem to refer to is:
Agresti, A. and B.C. Coull (1998) "Approximate is better than exact for
interval estimation of binomial parameters" The American Statistician,
52(2), pp. 119--126. 

Alan Agresti (2002) "Categorical Data Analysis", 2nd edition, Wiley.

Hope this helps,

--- "Lachenbruch, Peter" <> wrote:
> For small proportions, the exact option is useful.  It is the
> standard that the other methods hope to reach.  Coverage is exact.  
> Agresti and Coull have a nice paper (I don't remember the
> attribution,  but I think it's American Statistician, somewhere
> around 2000).
Nick Cox
> The "correct" CI for a binomial variable is a matter of dispute. 
> In your case you are looking for a CI around a point estimate of
> 0.029. 
> A symmetric CI around such a point estimate is likely to include 0 
> and some negative values unless the sample size is very, very large. 
> Some people just truncate the interval at 0, but a more defensible 
> procedure is to work on a transformed scale and back-transform, or do
> something approximately equivalent that yields positive endpoints
> for the CI with about the right coverage. [R] ci has several pointers
> to the literature. 
> Alternative CIs can be got in this way: 
> . gen rep78_1 = rep78 == 1 
> . ci rep78_1 if rep78 < ., binomial jeffreys
> . ci rep78_1 if rep78 < ., binomial wilson
> Nick
> Martin Weiss
> try this in Stata:
> ************************
> sysuse auto, clear
> proportion rep78
> matrix define A=e(b)
> matrix define B=e(V)
> count if rep78!=.
> *Upper/Lower Bound for proportion of "1"
> di A[1,1]+invnormal(1-0.05/2)*sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> di A[1,1]-invnormal(1-0.05/2)*sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> *Standard Error for "1"
> *Mistake obviously there...
> di sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> ************************
> Then let me know: why do I not hit the correct CI for the proportion
> of "1" in the repair record? Something's wrong with the standard error,
> I do not know what, though...
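
One way to look into the question just above is to pull the standard
error that -proportion- itself stores in e(V), which the quoted code
saves in a matrix but never uses, and set it beside two hand
calculations. This is a sketch rather than a confirmed diagnosis. It
assumes that -proportion- treats the proportion as the mean of an
indicator variable, so that the divisor is n-1 rather than n. The names
pb, pV, p_hat and n_obs are made up for illustration; they are chosen so
that they cannot be read as abbreviations of variables in the auto data.

```stata
sysuse auto, clear
proportion rep78
* point estimates and their variance matrix as stored by -proportion-
matrix pb = e(b)
matrix pV = e(V)
scalar p_hat = pb[1,1]
scalar n_obs = e(N)
* standard error reported for category "1"
di "SE from e(V):         " sqrt(pV[1,1])
* hand calculation with divisor n, as in the quoted code
di "sqrt(p(1-p)/n):       " sqrt(scalar(p_hat)*(1 - scalar(p_hat))/scalar(n_obs))
* hand calculation with divisor n-1
di "sqrt(p(1-p)/(n-1)):   " sqrt(scalar(p_hat)*(1 - scalar(p_hat))/(scalar(n_obs) - 1))
```

Whichever hand calculation agrees with the first line shows how the
reported standard error was formed. The limits may still differ from
those based on invnormal(), because the critical value or the scale on
which -proportion- builds its interval can depend on the Stata release.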
