
Re: st: Stata not recommended for exact 2x2 test

From   Roberto G. Gutierrez, StataCorp
Subject   Re: st: Stata not recommended for exact 2x2 test
Date   Mon, 10 Nov 2008 16:15:41 -0600

David Airey <david.airey@Vanderbilt.Edu> writes regarding an article
discussing software for the analysis of 2x2 tables.

> I was looking at the following citation, and found the author explicitly
> argued against using Stata for a particular 2x2 exact test.  Here is the
> citation:

> Int J Epidemiol. 2008 Aug 18. [Epub ahead of print]
> Analysis of 2 x 2 tables of frequencies: matching test to experimental
> design.
> Ludbrook J.
> Department of Surgery, The University of Melbourne, Parkville, Victoria,
> Australia.

The main issue at hand is Stata's use of Fisher's exact test and confidence
intervals based on that test in the commands for the analysis of 2x2
epidemiological tables, namely commands -cc- for case control studies and -cs-
for cohort studies.

In the above-cited article, Ludbrook recommends against using Stata for these
types of analyses.  His contention is not that Stata's implementation of
Fisher's exact test is incorrect, but instead that Fisher's test itself is not
always appropriate.  In other words, if you do wish to perform Fisher's exact
test, use Stata and do so with confidence.  That stated, we can now discuss
the attributes of Fisher's test with respect to the experimental-design
considerations made in that article.

In Fisher's test of association in a 2x2 table, one conditions on both sets
of marginal totals (rows and columns).  Ludbrook refers to this as "double
conditioning" and argues that this rarely mirrors experimental conditions.  He
is right; it is difficult for us to imagine a trial where, say, we fix both
the total smokers/non-smokers _and_ the total cases/non-cases.  Ludbrook cites
an example of double conditioning in a tea-tasting experiment, the conditions
of which further emphasize that such a design would almost never arise in
biomedical research.
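
To make the double conditioning concrete, here is a minimal Python sketch (not
Stata code, and not a description of Stata's implementation).  With both sets
of margins fixed, the top-left cell follows a hypergeometric distribution, and
the p-value is a sum over the few tables that share those margins.  The table
3 1 / 1 3 is a hypothetical tea-tasting outcome with eight cups, chosen only
for illustration; the function names and the two-sided convention (sum all
tables no more probable than the observed one) are assumptions of the sketch.

```python
from math import comb

def hypergeom_pmf(a, r1, r2, c1):
    """P(top-left cell = a) given row totals r1, r2 and first-column total c1."""
    return comb(r1, a) * comb(r2, c1 - a) / comb(r1 + r2, c1)

def fisher_exact(a, b, c, d):
    """Return (one-sided upper, two-sided) p-values for the table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    # With both margins fixed, the top-left cell can only range over lo..hi.
    lo, hi = max(0, c1 - r2), min(r1, c1)
    probs = {x: hypergeom_pmf(x, r1, r2, c1) for x in range(lo, hi + 1)}
    p_obs = probs[a]
    upper = sum(p for x, p in probs.items() if x >= a)
    # Assumed two-sided rule: tables no more probable than the observed one.
    two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return upper, two_sided

upper, two_sided = fisher_exact(3, 1, 1, 3)
print(f"one-sided p = {upper:.4f}, two-sided p = {two_sided:.4f}")
# prints: one-sided p = 0.2429, two-sided p = 0.4857
```

Only five tables share these margins, so the whole reference distribution can
be written down by hand: 1/70, 16/70, 36/70, 16/70, 1/70.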

This of course then raises the question of why the Fisher test double
conditions at the analysis stage if that does not match the experiment.  One
reason is computational simplicity -- conditioning on both row and column
totals reduces the size of the sample space, making it easier to enumerate.
Another reason is theoretical:  the totals are approximately ancillary (Yates
1984, pp. 447-449), and there are general arguments for conditioning on
ancillary statistics.
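
The computational point can be seen by counting tables.  The Python sketch
below (function names and the example margins 20, 20, 15 are hypothetical,
chosen only for illustration) counts the 2x2 tables that must be enumerated
when both margins are fixed, when only the row totals are fixed, and when only
the grand total is fixed.

```python
from math import comb

def n_tables_both_margins(r1, r2, c1):
    """Tables with row totals (r1, r2) and first-column total c1 all fixed."""
    return min(r1, c1) - max(0, c1 - r2) + 1

def n_tables_row_margins(r1, r2):
    """Tables with only the row totals fixed (one free cell per row)."""
    return (r1 + 1) * (r2 + 1)

def n_tables_total_only(n):
    """Tables with only the grand total n fixed (four cells summing to n)."""
    return comb(n + 3, 3)

r1, r2, c1 = 20, 20, 15  # hypothetical margins
print(n_tables_both_margins(r1, r2, c1))  # prints 16
print(n_tables_row_margins(r1, r2))       # prints 441
print(n_tables_total_only(r1 + r2))       # prints 12341
```

With both margins fixed the sample space grows at most linearly in the sample
size; with one margin fixed it grows quadratically, and with only the total
fixed it grows cubically.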

Does the fact that the conditioning does not match experimental conditions
make the method invalid?  We don't think so.  There exist many examples in
statistics where one gives away information -- conditions, that is -- in order
to gain some sort of advantage.  As just one example, in conditional
fixed-effects logistic regression, -clogit-, you stipulate the total group
successes in order to avoid having to estimate the group-level fixed effects.
Does that match the experimental conditions?  On occasion (for example, a
single-choice model), but certainly not always, and the method remains valid
even when the conditioning does not match the experimental conditions.

Such examples are numerous.  In the context of 2x2 tables, conditioning allows
one to derive Fisher's exact test from any of four models:  the four-Poisson
model, the two-binomial model, the multinomial, and the model for randomized
experiments (Cox 2006, pp. 52-54, 190).  The four models (together with the
hypergeometric) justify Fisher's exact test for any of the three 2x2 study
designs: doubly-conditioned, singly-conditioned (for example, randomized
clinical trials, as pointed out by Joseph Coveney) and unconditioned.
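
For the two-binomial model, the reduction to the hypergeometric can be checked
numerically.  The Python sketch below is only an illustration under stated
assumptions: the two rows are independent binomials with a common success
probability p (the null hypothesis), and the margins 12, 9, 7 and the cell
value 5 are hypothetical.  Conditioning on the total number of successes
should give the same probability whatever the value of p.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def conditional_pmf(a, r1, r2, c1, p):
    """P(X1 = a | X1 + X2 = c1) for independent X1 ~ Bin(r1, p), X2 ~ Bin(r2, p)."""
    lo, hi = max(0, c1 - r2), min(r1, c1)
    joint = binom_pmf(a, r1, p) * binom_pmf(c1 - a, r2, p)
    total = sum(binom_pmf(x, r1, p) * binom_pmf(c1 - x, r2, p)
                for x in range(lo, hi + 1))
    return joint / total

def hypergeom_pmf(a, r1, r2, c1):
    """Hypergeometric probability used by Fisher's exact test."""
    return comb(r1, a) * comb(r2, c1 - a) / comb(r1 + r2, c1)

r1, r2, c1, a = 12, 9, 7, 5  # hypothetical margins and cell count
for p in (0.1, 0.5, 0.9):
    # The two columns agree (up to rounding) for every p.
    print(p, conditional_pmf(a, r1, r2, c1, p), hypergeom_pmf(a, r1, r2, c1))
```

The nuisance parameter p cancels from the conditional probability, which is
why the same hypergeometric reference distribution serves a design in which
only one margin was fixed.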

The conditioning in Fisher's test introduces no bias, but you can lose some
efficiency.  How much efficiency you lose is commensurate with how much
information about row/column association is contained in the four marginal
totals.  If the margins were orthogonal to relative internal cell sizes, there
would be nothing lost, but unfortunately they are not strictly orthogonal.
The literature is somewhat divided on exactly how much information is lost
through double conditioning (Plackett 1977; Barnard 1984), but our feeling is
that given the contention involved, it can't be much.

As computers get faster and new computational methods come about, it is
becoming more feasible for software to supplement Fisher's test with
other exact tests that condition on only one set of marginal totals.  We are
looking into adding such methods, and you can look forward to using them in
Stata at some point in the future.  Of course, when that time comes you will
still have the option of using Fisher's test as well.
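
As an illustration of what a singly-conditioned exact test can look like, the
Python sketch below fixes only the row totals, treats the two rows as
independent binomials with a common success probability p under the null, and
takes the largest tail probability over a grid of p values, in the spirit of
Barnard's unconditional test.  The pooled z statistic, the grid search, and
all function names are assumptions made for this sketch; it says nothing about
which method Stata might adopt.  Because the grid only approximates the
maximum over p, the result can slightly understate it.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def z_stat(x1, n1, x2, n2):
    """Pooled z statistic for a difference in proportions (0 if variance is 0)."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    if var == 0:
        return 0.0
    return (x1 / n1 - x2 / n2) / var**0.5

def unconditional_exact_p(x1, n1, x2, n2, grid=200):
    """Two-sided p-value with row totals n1, n2 fixed, maximized over a grid of p."""
    t_obs = abs(z_stat(x1, n1, x2, n2))
    # Tables at least as extreme as the observed one, by |z|.
    extreme = [(i, j) for i in range(n1 + 1) for j in range(n2 + 1)
               if abs(z_stat(i, n1, j, n2)) >= t_obs - 1e-12]
    best = 0.0
    for g in range(1, grid):
        p = g / grid  # candidate value of the common success probability
        tail = sum(binom_pmf(i, n1, p) * binom_pmf(j, n2, p) for i, j in extreme)
        best = max(best, tail)
    return best

# Hypothetical data: 3 of 4 successes in row one, 1 of 4 in row two.
print(f"{unconditional_exact_p(3, 4, 1, 4):.4f}")
```

The enumeration runs over all (n1+1)(n2+1) tables for every grid point, which
is the extra computational cost relative to the doubly conditioned test.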

We've discussed only the main point of Ludbrook's article:  the use of
Fisher's exact test when the margins are not both fixed by design.  We are also
preparing an FAQ that discusses the article and Stata on a more point-by-point
basis.  This FAQ will be available soon, at which time we will make an
announcement to the list. 

-Bobby    -Wes


Barnard, G. A. 1984. Comment on "Tests of significance for 2x2 contingency
   tables" by F. Yates.  Journal of the Royal Statistical Society Series A 147,

Cox, D. R. 2006. Principles of Statistical Inference.  Cambridge: Cambridge 
   University Press.

Plackett, R. L. 1977. "The marginal totals of a 2x2 table."  Biometrika 64,

Yates, F. 1984. "Tests of significance for 2x2 contingency tables."  Journal
   of the Royal Statistical Society Series A 147, 426-463.
