Statalist



RE: st: RE: RE: Confidence Interval for Proportion


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: RE: Confidence Interval for Proportion
Date   Tue, 11 Mar 2008 18:28:23 -0000

There is a superb review paper at 

Brown, L.D., Cai, T.T., DasGupta, A. 2001. Interval estimation for a
binomial proportion. Statistical Science 16: 101-133.

This should be accessible to many, if not all, Statalist members at 

<http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?handle=euclid.ss/1009213286&view=body&content-type=pdf_1>

Nick 
[email protected] 

Maarten Buis

Actually, exact confidence intervals are not as exact as the name
suggests, especially in the case of small proportions. These confidence
intervals tend to be conservative; see Agresti (2002, pp. 18-19) and
the simulation below. If the exact method were truly exact in all
regards, then the proportion of 95% confidence intervals containing the
true proportion should be .95. In actual fact the proportion is higher;
this is what I mean by the interval being conservative.

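*--------------- begin example ----------------------------
set more off
capture program drop sim
program define sim, rclass
	* draw 1000 Bernoulli trials with true proportion .99
	drop _all
	set obs 1000
	gen x = uniform()<.99
	* exact (Clopper-Pearson) interval, the default for -ci, binomial-
	ci x, binomial
	* 1 if the interval contains the true proportion, 0 otherwise
	return scalar correct = r(lb)<.99 & r(ub)>.99
end
* the mean of correct estimates the actual coverage
simulate correct=r(correct), reps(10000): sim
sum correct
*------------------- end example --------------------------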
(For more on how to use examples I sent to the Statalist, see
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )
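
For comparison, the same simulation can be rerun with one of the
alternative intervals mentioned elsewhere in this thread. The sketch
below is only a variation on the example above: the program name simw
is made up for illustration, and it assumes the same -ci- syntax used
in this thread (newer Stata releases may need -ci proportions- instead).
It swaps the exact interval for the Wilson interval, so that the two
estimated coverages can be compared against the nominal .95.

```stata
*--------------- begin example ----------------------------
set more off
capture program drop simw
program define simw, rclass
	* same data-generating process as in the example above
	drop _all
	set obs 1000
	gen x = uniform()<.99
	* Wilson interval instead of the exact interval
	ci x, binomial wilson
	* 1 if the interval contains the true proportion, 0 otherwise
	return scalar correct = r(lb)<.99 & r(ub)>.99
end
simulate correct=r(correct), reps(10000): simw
sum correct
*------------------- end example --------------------------
```

A mean of correct closer to .95 than the one from the exact interval
would be in line with the argument that the exact interval is
conservative.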

The reference you seem to refer to is:
Agresti, A. and B.C. Coull (1998) "Approximate is better than 'exact'
for interval estimation of binomial proportions" The American
Statistician, 52: 119--126.

Alan Agresti (2002) "Categorical Data Analysis", 2nd edition, Wiley.

Hope this helps,
Maarten

--- "Lachenbruch, Peter" <[email protected]> wrote:
> For small proportions, the exact option is useful.  It is the
> standard that the other methods hope to reach.  Coverage is exact.  
> Agresti and Coull have a nice paper (I don't remember the
> attribution,  but I think it's American Statistician, somewhere
> around 2000).
 
Nick Cox
 
> The "correct" CI for a binomial variable is a matter of dispute. 
> 
> In your case you are looking for a CI around a point estimate of
> 0.029. 
> 
> A symmetric CI around such a point estimate is likely to include 0 
> and some negative values unless the sample size is very, very large. 
> 
> Some people just truncate the interval at 0, but a more defensible 
> procedure is to work on a transformed scale and back-transform, or do
> something approximately equivalent that yields positive endpoints
> for the CI with about the right coverage. [R] ci has several pointers
> to the literature. 
> 
> Alternative CIs can be got in this way: 
> 
> . gen rep78_1 = rep78 == 1 
> . ci rep78_1 if rep78 < ., binomial jeffreys
> . ci rep78_1 if rep78 < ., binomial wilson
> 
> Nick
> [email protected] 
> 
> Martin Weiss
> 
> try this in Stata:
> 
> 
> ************************
> sysuse auto, clear
> proportion rep78
> matrix define A=e(b)
> matrix define B=e(V)
> count if rep78!=.
> *Upper/Lower Bound for proportion of "1"
> di A[1,1]+invnormal(1-0.05/2)*sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> di A[1,1]-invnormal(1-0.05/2)*sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> *Standard Error for "1"
> *Mistake obviously there...
> di sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> ************************
> 
> 
> Then let me know: why do I not hit the correct CI for the proportion
> of "1" in the repair record? Something's wrong with the standard
> error, but I do not know what...
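
The point made earlier in the thread about symmetric intervals around a
small proportion can be checked by hand on the auto data. What follows
is a minimal sketch, not anyone's official method: the scalar names
(nobs, phat, zcrit, se, denom, centre, halfw) are made up for
illustration, and the Wilson limits are computed from the usual score
interval formula without a continuity correction. The symmetric (Wald)
lower limit should come out below zero here, while the Wilson limits
stay inside [0,1].

```stata
*--------------- begin example ----------------------------
sysuse auto, clear
* number of nonmissing repair records and share with rep78 == 1
count if rep78 < .
scalar nobs = r(N)
count if rep78 == 1
scalar phat = r(N)/nobs
scalar zcrit = invnormal(1-0.05/2)
* symmetric (Wald) interval: phat +/- zcrit*se
scalar se = sqrt(phat*(1-phat)/nobs)
di "Wald:   " phat - zcrit*se "  " phat + zcrit*se
* Wilson score interval: recentred and rescaled
scalar denom  = 1 + zcrit^2/nobs
scalar centre = (phat + zcrit^2/(2*nobs))/denom
scalar halfw  = zcrit*sqrt(phat*(1-phat)/nobs + zcrit^2/(4*nobs^2))/denom
di "Wilson: " centre - halfw "  " centre + halfw
*------------------- end example --------------------------
```

The scalar names are deliberately longer than p or n: in an expression
Stata resolves a name to a variable (or an unambiguous variable
abbreviation) before a scalar, so a scalar called p would be read as
price in this dataset.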
