Statalist



Re: st: RE: comparing proportions


From   Eva Poen <[email protected]>
To   [email protected]
Subject   Re: st: RE: comparing proportions
Date   Fri, 23 Jan 2009 15:15:12 +0000

Thanks to everyone who replied.

I have now received a number of emails, on and off list, from people
who are interested in this routine. For them, here are some references
for the procedure and a short explanation of the test.

The original reference is Boschloo (1970). Mehrotra, Chan and Berger
(2003) compare several tests and recommend Boschloo's exact test as
well as a score statistic. My implementation computes only the
two-sided version of Boschloo's test, so the null hypothesis is that
both samples come from binomial distributions with the same
probability p, against the alternative that the probabilities differ.

Very briefly, Boschloo's test uses the p-value from Fisher's exact
test as its test statistic. It runs Fisher's test on every 2x2 table
that is possible given the two sample sizes and checks whether the
resulting p-value is at least as extreme as (that is, no larger than)
the p-value of Fisher's test on the observed table. For a given value
of the common binomial parameter p, it sums the binomial probabilities
of all tables where that is the case; the supremum of this sum over p
is the p-value for Boschloo's test.
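
The description above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Stata routine discussed in this post: the function names, the grid of interior points i/resolution, the 1e-7 tolerance for ties, and the convention that the two-sided Fisher p-value sums all table probabilities no larger than the observed one are all choices made for the example.

```python
from math import comb

TOL = 1 + 1e-7  # relative tolerance for ties (an assumption of this sketch)

def fisher_two_sided(x1, n1, x2, n2):
    """Two-sided Fisher p-value for x1 successes of n1 versus x2 of n2."""
    m = x1 + x2                      # total successes, fixed by conditioning
    denom = comb(n1 + n2, m)

    def prob(k):                     # hypergeometric probability of k in group 1
        return comb(n1, k) * comb(n2, m - k) / denom

    p_obs = prob(x1)
    lo, hi = max(0, m - n2), min(n1, m)
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * TOL)
    return min(1.0, total)

def boschloo_two_sided(x1, n1, x2, n2, resolution=500):
    """Grid approximation to the two-sided Boschloo p-value."""
    f_obs = fisher_two_sided(x1, n1, x2, n2)
    # All tables whose Fisher p-value is no larger than the observed one.
    region = [(a, b)
              for a in range(n1 + 1)
              for b in range(n2 + 1)
              if fisher_two_sided(a, n1, b, n2) <= f_obs * TOL]
    best = 0.0
    for i in range(1, resolution):   # interior grid points for p
        p = i / resolution
        q = 1.0 - p
        total = sum(comb(n1, a) * comb(n2, b) * p ** (a + b) * q ** (n1 + n2 - a - b)
                    for a, b in region)
        best = max(best, total)      # grid maximum stands in for the supremum
    return best
```

Because the maximum is taken over a finite grid, the result can only underestimate the true supremum, which is one reason the resolution matters.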

There were at least two catches for me when implementing this procedure:

a) The number of combinations that need to be checked can be very
high. If the column totals of the contingency table are N1 and N2,
then the number of combinations to be checked is (N1+1)*(N2+1)-2. Code
written as an .ado file may handle this very inefficiently.

b) The resulting sum is a function of the binomial parameter p (the
probability). The speed and accuracy of the test depend quite
dramatically on the number of values of p (the resolution) at which
the calculations are carried out; however, I haven't really found any
references to this in the literature.
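
To illustrate point b), here is a small Python sketch of how the grid resolution affects the maximum that is found. The set of tables (called `region` here) is hand-picked for two groups of size 3, and the function names and the grid i/resolution are assumptions made for this example only.

```python
from math import comb

def tail_probability(p, region, n1, n2):
    """Probability, under a common success probability p, of landing in region."""
    q = 1.0 - p
    return sum(comb(n1, a) * comb(n2, b) * p ** (a + b) * q ** (n1 + n2 - a - b)
               for a, b in region)

def grid_max(region, n1, n2, resolution):
    """Maximum of tail_probability over the interior grid i/resolution."""
    return max(tail_probability(i / resolution, region, n1, n2)
               for i in range(1, resolution))

# Hypothetical region: the two most lopsided tables for n1 = n2 = 3.
region = [(0, 3), (3, 0)]
for resolution in (3, 4, 10, 500):
    print(resolution, grid_max(region, 3, 3, resolution))
```

For this region the function is 2*p^3*(1-p)^3, which peaks at p = 0.5. A grid with resolution 3 only evaluates p = 1/3 and p = 2/3, so it misses the peak and reports a value that is too small; any grid that contains 0.5 finds it. With larger samples the function can have several local peaks, so a grid that is too coarse may understate the p-value.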

Because of all this, the procedure is relatively slow, certainly
compared to -tabulate..., exact-. If I run the example on page 444 of
Mehrotra, Chan and Berger (2003) with a resolution of 500, my routine
takes 80 seconds to compute the result. Smaller sample sizes are
much faster; an example with N1=22 and N2=52 takes only 3 seconds to
run.
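
For a sense of scale, the count from point a) can be worked out for the smaller example. The Python snippet below is only back-of-the-envelope arithmetic; the function names are made up for the illustration, and the second figure is a rough upper bound that assumes every table is evaluated at every value of p.

```python
def tables_to_check(n1, n2):
    """Number of combinations from point a): (N1+1)*(N2+1)-2."""
    return (n1 + 1) * (n2 + 1) - 2

def max_binomial_terms(n1, n2, resolution):
    """Rough upper bound: one binomial product per table per value of p."""
    return tables_to_check(n1, n2) * resolution

print(tables_to_check(22, 52))          # 23*53 - 2 = 1217
print(max_binomial_terms(22, 52, 500))  # 1217*500 = 608500
```

Each of those 1217 tables also needs its own Fisher's test before any summing over p can start.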

The speed and resolution issues have so far prevented me from
submitting the test to SSC. However, I am happy to revise this opinion
if more experienced contributors think that it is good enough to go on
SSC nonetheless.

Meanwhile, I will send the files out to those people who have requested them.

Eva


References:

Boschloo, R.D. (1970): Raised conditional level of significance for
the 2x2-table when testing the equality of two probabilities.
Statistica Neerlandica 24, 1-35.

Mehrotra, D.V., Chan, I.S.F. and Berger, R.L. (2003): A cautionary
note on exact unconditional inference for a difference between two
independent binomial proportions. Biometrics 59, 441-450.



2009/1/22 Lachenbruch, Peter <[email protected]>:
> A reference to Boschloo would be most welcome
>
> Tony
>
> Peter A. Lachenbruch
> Department of Public Health
> Oregon State University
> Corvallis, OR 97330
> Phone: 541-737-3832
> FAX: 541-737-4001