Ronan Conroy <rconroy@rcsi.ie> :

This has been discussed many times over the years on Statalist, with the usual advice being: don't do that. If you want CIs on proportions, or to test differences in proportions, you probably want to use -svy:tab- (and if you don't have a survey, start with -svyset, srs-). See also http://www.stata.com/statalist/archive/2010-05/msg00569.html for the case when even -svy- commands fail to appropriately constrain proportions to [0,1].

On Fri, Jan 11, 2013 at 6:44 AM, Ronan Conroy <rconroy@rcsi.ie> wrote:
> I have a real problem with the confidence intervals produced by the -proportion- command.
>
> . input outcome freq
>
>        outcome       freq
>   1. 0 21
>   2. 1 2
>   3. end
>
> Here is the confidence interval which is most probably closest to the nominal coverage according to
> - Brown L, Cai T, DasGupta A. Interval estimation for a binomial proportion. Statistical Science. 2001;16(2):101–17.
>
> . ci outcome [fw=freq], bin wil
>
>                                                          ------ Wilson ------
>     Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
> -------------+---------------------------------------------------------------
>      outcome |         23    .0869565    .0587534         .02418    .2679598
>
> Now here is what -proportion- does.
>
> . proportion outcome [fw=freq]
>
> Proportion estimation               Number of obs    =      23
>
> --------------------------------------------------------------
>              | Proportion   Std. Err.     [95% Conf. Interval]
> -------------+------------------------------------------------
> outcome      |
>            0 |   .9130435   .0600739      .7884579    1.037629
>            1 |   .0869565   .0600739      -.037629    .2115421
> --------------------------------------------------------------
>
> .
> end of do-file
>
> According to the manual:
>
> "Methods and formulas
> proportion is implemented as an ado-file.
> Proportions are means of indicator variables; see [R] mean."
>
> Is anyone prepared to defend this approach as the only formula implemented by -proportion-? Or indeed to tell me that they have managed to publish a paper that included confidence intervals such as the one above?
>
> I myself find this bizarre. Consider the example above. The confidence interval includes a value that is impossible: zero. With two observed successes, the success rate cannot be zero. And it includes probabilities that have no definition: negative probabilities. While I am prepared to accept that physicists have now produced temperatures that are lower than absolute zero, I cannot bring myself to persuade anyone that a confidence interval for a probability can extend beyond the interval 0-1.
>
> I believe it would be good if Stata's -proportion- command allowed the choice of some more believable methods.
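For anyone who wants to check the arithmetic outside Stata, here is a minimal Python sketch of the two intervals being compared for 2 successes out of 23. The function names `wald_interval` and `wilson_interval` are made up for this illustration and are not Stata's code. The Wald version treats the proportion as the mean of an indicator variable, with the standard error based on the sample standard deviation (denominator n - 1), which reproduces the .0600739 shown above. It uses a normal quantile, whereas the -proportion- limits above appear consistent with a t critical value on 22 degrees of freedom, so its limits will differ slightly from Stata's while still falling below zero.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, level=0.95):
    """Mean-of-indicator (Wald) interval; limits are not confined to [0, 1].

    Assumption: a normal quantile is used here for simplicity.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    # Standard error of a mean, using the sample variance (n - 1 denominator).
    se = sqrt(p * (1 - p) / (n - 1))
    return p - z * se, p + z * se


def wilson_interval(successes, n, level=0.95):
    """Wilson score interval; limits always stay inside [0, 1]."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Data from the example: 2 successes in 23 observations.
print("Wald:  ", wald_interval(2, 23))
print("Wilson:", wilson_interval(2, 23))
```

The Wilson limits should agree with the -ci- output quoted above to the displayed precision, while the Wald lower limit is negative, which is the complaint in the original post.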

