Re: st: why don't confidence intervals from -proportion- use the same formula as -ci-?

From   Marcello Pagano <>
To   <>
Subject   Re: st: why don't confidence intervals from -proportion- use the same formula as -ci-?
Date   Fri, 11 Jan 2013 08:59:44 -0500

Hear! Hear!

Someone should clean this up and whilst at it also the abuse of the word "exact" in the manuals. The "exact" confidence intervals are more than likely exactly wrong. They do not deserve the moniker. The paper listed below should be a must read.


On 1/11/2013 6:44 AM, Ronan Conroy wrote:
I have a real problem with the confidence intervals produced by the -proportion- command.

. input outcome freq

        outcome       freq
   1. 0 21
   2. 1 2
   3. end

Here is the confidence interval whose coverage is most probably closest to the nominal level, according to
- Brown L, Cai T, DasGupta A. Interval estimation for a binomial proportion. Statistical Science. 2001;16(2):101–17.

. ci outcome [fw=freq], bin wil

                                                          ------ Wilson ------
     Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
      outcome |         23    .0869565    .0587534          .02418    .2679598
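
The Wilson limits above can be checked by hand. What follows is a minimal sketch, assuming the standard Wilson score formula with z = invnormal(0.975); the scalar names (n, x, p, z, centre, half) are illustrative only and are not anything -ci- exposes.

```stata
* Sketch: Wilson score interval for 2 successes in 23 trials.
* All scalar names are illustrative assumptions, not Stata internals.
scalar n = 23
scalar x = 2
scalar p = x/n
scalar z = invnormal(0.975)

* Centre and half-width of the Wilson interval
scalar centre = (p + z^2/(2*n)) / (1 + z^2/n)
scalar half   = z*sqrt(p*(1-p)/n + z^2/(4*n^2)) / (1 + z^2/n)

display "Wilson lower = " centre - half
display "Wilson upper = " centre + half
```

If the formula assumed here is the one -ci- uses, the two displayed values should agree with the interval reported above to the printed precision. Note that the centre is pulled away from the observed proportion towards 0.5, which is why the lower limit stays above zero.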

Now here is what -proportion- does.

. proportion outcome [fw=freq]

Proportion estimation               Number of obs    =      23

              | Proportion   Std. Err.     [95% Conf. Interval]
outcome      |
            0 |   .9130435   .0600739      .7884579    1.037629
            1 |   .0869565   .0600739      -.037629    .2115421

end of do-file

According to the manual:

"Methods and formulas
proportion is implemented as an ado-file.
Proportions are means of indicator variables; see [R] mean."
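
The -proportion- output above is numerically consistent with treating the 0/1 indicator as an ordinary mean: a standard error with n-1 in the denominator and a t critical value on n-1 degrees of freedom. The sketch below rests on that assumption, which is inferred from the printed numbers and not taken from the ado-file itself; the scalar names are illustrative.

```stata
* Sketch: mean-of-an-indicator interval for 2 successes in 23 trials.
* Assumes SE = sqrt(p*(1-p)/(n-1)) and a t quantile with n-1 df;
* this is an inference from the output, not the ado-file's code.
scalar n  = 23
scalar p  = 2/n
scalar se = sqrt(p*(1-p)/(n-1))
scalar t  = invttail(n-1, 0.025)

display "Std. Err. = " se
display "lower     = " p - t*se
display "upper     = " p + t*se
```

Nothing in this calculation constrains the limits to lie between 0 and 1, so with a small n and a proportion near the boundary the lower limit can fall below zero, as it does here.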

Is anyone prepared to defend this approach as the only formula implemented by -proportion-? Or indeed to tell me that they have managed to publish a paper that included confidence intervals such as the one above?

I myself find this bizarre. Consider the example above. The confidence interval includes a value that is impossible - zero. With two observed successes, the success rate cannot be zero. And it includes probabilities that have no definition: negative probabilities. While I am prepared to accept that physicists have now produced temperatures that are lower than absolute zero, I cannot bring myself to persuade anyone that a confidence interval for a probability can extend beyond the interval 0-1.

I believe it would be good if Stata's -proportion- command allowed the choice of some more believable methods.

Ronán Conroy
Associate Professor
Division of Population Health Sciences
Royal College of Surgeons in Ireland
Beaux Lane House
Dublin 2
