
# Re: st: why don't confidence intervals from -proportion- use the same formula as -ci-?

From: "JVerkuilen (Gmail)"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: why don't confidence intervals from -proportion- use the same formula as -ci-?
Date: Fri, 11 Jan 2013 10:13:57 -0500

```
On Fri, Jan 11, 2013 at 6:44 AM, Ronan Conroy <rconroy@rcsi.ie> wrote:

> Or indeed to tell me that they have managed to publish a paper that included confidence intervals such as the
> one above?
>
>
> I myself find this bizarre. Consider the example above. The confidence interval includes a value that is impossible - zero. With two observed successes, the success rate cannot be zero. And it includes probabilities that have no definition: negative probabilities. While I am prepared to accept that physicists have now produced temperatures that are lower than absolute zero, I cannot bring myself to persuade anyone that a confidence interval for a probability can extend beyond the interval 0-1.

This is a common issue with Wald confidence intervals for proportions
or other bounded quantities such as Poisson rates. Notice that the
confidence interval for the proportion of 0s is just as bad at the
other end: its upper limit exceeds 1.

. expand freq
(21 observations created)

. reg outcome

------------------------------------------------------------------------------
     outcome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0869565   .0600739     1.45   0.162     -.037629    .2115421
------------------------------------------------------------------------------

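You'll see that the answer is the same as the one from -proportion-,
so in this case it's using the unbiased (n-1 denominator) estimate of
the sampling variance, i.e. SE = sqrt(p*(1-p)/(n-1)), with a t
critical value on n-1 degrees of freedom. Then there's: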

. prtest outcome == .1

One-sample test of proportion                outcome: Number of obs =       23
------------------------------------------------------------------------------
    Variable |       Mean   Std. Err.                     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     outcome |   .0869565   .0587534                      -.028198     .202111
------------------------------------------------------------------------------
    p = proportion(outcome)                                       z =  -0.2085
Ho: p = 0.1

    Ha: p < 0.1                 Ha: p != 0.1                   Ha: p > 0.1
 Pr(Z < z) = 0.4174         Pr(|Z| > |z|) = 0.8348          Pr(Z > z) = 0.5826

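In this case the SE is what you'd get from using sqrt(pi*(1-pi)/n),
with pi replaced by the observed proportion, and the interval uses a
normal rather than a t critical value.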

-margins- uses the delta method and generates a similarly inadmissible
confidence interval.

This is clearly a poorly thought-out use of the delta method in a
small sample where the asymptotics don't apply. There are many
different ways to build a better interval, none of which appears to
be a clear winner, though Agresti and Coull have gone through the
options.

http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval

Agresti, Alan; Coull, Brent A. (1998). Approximate is better than
'exact' for interval estimation of binomial proportions. The American
Statistician 52: 119–126.

As to whether I've seen papers published like that... probably. There
are some horrible things in journals. Do a meta-analysis sometime if
you need proof!

```
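To make the difference between the two standard errors concrete, here is a minimal Python sketch (not Stata, and not anything from the original post) that redoes the arithmetic for 2 successes in 23 observations. The function names `wald` and `wilson` are made up for illustration, and the t critical value for 22 degrees of freedom is hard-coded from tables rather than computed. The Wilson score interval is included only as one example of the alternatives covered on the Wikipedia page linked above; it is not being put forward as the best choice.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the thread: 2 successes in 23 observations.
successes, n = 2, 23
p = successes / n

# Critical values. z comes from the standard library; the t value for
# 22 df is an assumption taken from tables (no t quantile in the stdlib).
z = NormalDist().inv_cdf(0.975)
t22 = 2.073873

# SE with the n-1 denominator: the value shown in the -reg- output.
se_reg = sqrt(p * (1 - p) / (n - 1))
# SE with the n denominator: the value shown in the -prtest- output.
se_prtest = sqrt(p * (1 - p) / n)


def wald(p, se, crit):
    """Symmetric Wald interval: estimate plus or minus crit * SE."""
    return p - crit * se, p + crit * se


def wilson(successes, n, z):
    """Wilson score interval; its limits always stay inside [0, 1]."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


ci_reg = wald(p, se_reg, t22)        # t-based, as in -reg-
ci_prtest = wald(p, se_prtest, z)    # z-based, as in -prtest-
ci_wilson = wilson(successes, n, z)

print(f"SE (n-1): {se_reg:.7f}   SE (n): {se_prtest:.7f}")
print(f"Wald, t-based: {ci_reg[0]:.6f} to {ci_reg[1]:.6f}")
print(f"Wald, z-based: {ci_prtest[0]:.6f} to {ci_prtest[1]:.6f}")
print(f"Wilson:        {ci_wilson[0]:.6f} to {ci_wilson[1]:.6f}")
```

Both Wald intervals have a negative lower limit, matching the Stata output in the post, while the Wilson limits stay strictly between 0 and 1 for these counts.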