Note: This FAQ is for users of Stata 6. It is not relevant for more recent versions.

## Stata 6: Why do Stata’s cc and cci commands report different confidence intervals than Epi Info?

Title: Stata 6: Continuity adjustments

Author: William Gould, StataCorp

The short answer is that we do not apply the continuity adjustment, but Epi Info does. The rest of the FAQ details why we believe our answer is to be slightly preferred except when N is very small, in which case neither result is to be trusted.

### A particular problem

A user sent the following 2x2 table to us:

             |   Exposed   Unexposed
    ---------+------------------------
       Cases |        11           3
    Controls |       106         223


The user reported that Stata and Epi Info differed in their reported 95% confidence intervals even though both packages claimed to be using the Cornfield approximation. The reported confidence intervals are

    Epi Info              [1.94, 35.63]
    Stata                 [2.26, 26.20]


The Stata result can be obtained by typing cci 11 3 106 223.

### The reason for the discrepancy

We have independently verified that both the Stata results and the Epi Info results are the results their authors intended; see the Appendix below.

The difference in reported results is not due to programming errors. Rather, the difference hinges on whether one makes a continuity correction to the Cornfield iterative formula.

The Cornfield formula presented in Schlesselman (1982, 177) includes the continuity correction. Our two justifications for not including the continuity correction are

1. The continuity correction is only justified statistically when you have an exact formula (exact at finite N) for the variance. In this case we only have asymptotic formulas for the variance.
2. For skewed distributions such as this, the continuity correction often does more harm than good when N is above a small number.

If you really care about the confidence interval when dealing with small N, you should be using exact methods such as those available in the StatXact software package.

### Comparison with logistic regression

Logistic regression provides another way one can obtain estimates of the odds ratio and the standard error. The estimated odds ratio will be the same as reported by Stata’s cci command (and by Epi Info). The standard error and derived confidence interval will be different from those reported by cci because different formulas are used.

In any case, we obtained the following results:

    Epi Info              [1.94, 35.63]
    Stata                 [2.26, 26.20]
    logistic regression   [2.11, 28.23]


Below we obtain the logistic regression results:

    . list

      1.         1          1         11
      2.         1          0        106
      3.         0          1          3
      4.         0          0        223

    Logit Estimates                                         Number of obs =    343
                                                            LR chi2(1)    =  12.15
                                                            Prob > chi2   = 0.0005
    Log Likelihood = -214.05327                             Pseudo R2     = 0.0276

        dead   Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]

       expos     7.713836   5.105902      3.087   0.002       2.107888    28.22885
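
As a rough cross-check of the logistic regression interval, here is a minimal Python sketch (Python is not part of the original FAQ, and the function name `woolf_interval` is ours). It assumes that, with a single binary covariate, the Wald interval from logistic regression coincides with Woolf's formula, exp(ln OR ± z·sqrt(1/a + 1/b + 1/c + 1/d)). Under that assumption the result should land close to the logistic regression row in the comparison above.

```python
import math

def woolf_interval(a, b, c, d, z=1.959964):
    """Odds ratio and Wald-type (Woolf) interval for a 2x2 table.

    a, b = exposed and unexposed cases; c, d = exposed and unexposed controls.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio: root of the summed reciprocals.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# The table from "A particular problem".
odds_ratio, lower, upper = woolf_interval(11, 3, 106, 223)
print(f"OR = {odds_ratio:.4f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

The interval is symmetric on the log scale, which is why it differs from both Cornfield variants even though the point estimate is identical.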



### Simulation results

As a quick way of determining the reliability of the Cornfield approximation without the continuity correction, we ran a simulation, under the null hypothesis (odds ratio==1), for a table with the same marginals as in the example above. In 1,000 replications, the results were

. summarize

Variable       Obs        Mean   Std. Dev.       Min        Max

accept      1000         .96   .1960572          0          1


That is, the confidence interval that Stata calculated without the continuity correction led to nonrejection of the null hypothesis in 960 of 1,000 replications. That 96% acceptance rate already meets the nominal 95% level, so widening the confidence interval, as the continuity correction would, does not seem called for.

The following Stata do-file will reproduce the simulation results reported above and allow you to run your own:

    ------------------------------------------ BEGIN --- mysim.do --- CUT HERE ---
    version 6.0
    program drop _all

    program define mkdta
            /* one margin of the table: 14 of the 343 observations */
            set obs 343
            gen exposed = _n<=14
    end

    program define asim
            /* random assignment of the other margin (117 of 343) */
            gen u = uniform()
            sort u
            gen case = _n<=117
            cc case exposed
            /* accept = 1 when 1 lies between $S_10 and $S_11, */
            /* the interval bounds left behind by cc           */
            post mm ($S_10<=1 & $S_11>=1)
            drop u case
    end

    program define sim
            drop _all
            mkdta
            postfile mm accept using myres, replace
            local i 1
            qui while `i' <= `1' {
                    asim
                    local i = `i' + 1
            }
            postclose mm
            use myres, clear
    end

    set seed 39483
    sim 1000
    sum
    -------------------------------------------- END --- mysim.do --- CUT HERE ---
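
The Monte Carlo estimate can also be compared with an exact calculation, because with both margins fixed the exposed-case count follows a hypergeometric distribution under the null. The Python sketch below is not the do-file's method, and the name `null_acceptance` is ours. It further assumes that the Cornfield interval covers 1 exactly when |a - A0| minus the correction is at most z·sqrt(V(A0)), where A0 is the cell count fitted under an odds ratio of 1. For the uncorrected rule the result should come out close to the simulated 0.96, and the corrected figure can only be at least as large.

```python
import math

def null_acceptance(cases, exposed, total, z=1.96, correction=0.0):
    """Exact probability, under OR = 1, that the interval covers 1.

    Assumption (ours): the interval covers 1 exactly when
    |a - A0| - correction <= z * sqrt(V(A0)), with A0 the null-fitted cell.
    """
    controls = total - cases
    a0 = cases * exposed / total  # fitted exposed-case count when OR = 1
    var0 = 1 / (1 / a0 + 1 / (exposed - a0) + 1 / (cases - a0)
                + 1 / (controls - exposed + a0))
    prob = 0.0
    for a in range(max(0, exposed - controls), min(cases, exposed) + 1):
        # Hypergeometric probability of observing a exposed cases.
        weight = (math.comb(exposed, a)
                  * math.comb(total - exposed, cases - a)
                  / math.comb(total, cases))
        if abs(a - a0) - correction <= z * math.sqrt(var0):
            prob += weight
    return prob

# Margins of the example table: 14 cases, 117 exposed, 343 in total.
plain = null_acceptance(14, 117, 343)
corrected = null_acceptance(14, 117, 343, correction=0.5)
print(f"without correction: {plain:.3f}, with correction: {corrected:.3f}")
```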


### Appendix: Independent reproduction of reported results

The purpose of this appendix is to establish that Stata is using the Cornfield approximation without the continuity correction and that Epi Info is using the same formula with the continuity correction.

Let us use the following notation:

             |   Exposed   Unexposed |
    ---------+-----------------------+---
       Cases |        a            b | M1
    Controls |        c            d | M0
    ---------+-----------------------+---
             |       N1           N2 |  T


The Cornfield confidence interval is

    ol = al(M0 - N1 + al)/((N1-al)(M1-al))
    ou = au(M0 - N1 + au)/((N1-au)(M1-au))


where al and au are obtained from

    a[i+1] = a +/- z*1/sqrt( 1/a[i] + 1/(N1-a[i]) + 1/(M1-a[i]) + 1/(M0-N1+a[i]) )


At least, that is the formula Stata uses. Epi Info uses

    a[i+1] = a +/- .5 +/- z*1/sqrt( 1/a[i] + 1/(N1-a[i]) + 1/(M1-a[i]) + 1/(M0-N1+a[i]) )


That is, Epi Info includes the continuity correction whereas Stata does not.
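
Both variants can be solved side by side with a short Python sketch. It is not the code of either package: it solves the same equations by bisection instead of the fixed-point iteration, which avoids oscillation, and the function name `cornfield_interval` and its `correction` argument are ours. The symbols `a`, `M1`, `M0`, and `N1` follow the notation table above, and z is taken as 1.96. Under these assumptions the two calls should land close to the Stata and Epi Info intervals quoted earlier.

```python
import math

def cornfield_interval(a, M1, M0, N1, z=1.96, correction=0.0, tol=1e-12):
    """Cornfield limits for the odds ratio, found by bisection.

    a = exposed cases, M1 = cases, M0 = controls, N1 = exposed.
    correction=0.5 gives the continuity-corrected variant.
    """
    a_min, a_max = max(0, N1 - M0), min(M1, N1)

    def sd(x):
        cells = (x, N1 - x, M1 - x, M0 - N1 + x)
        if min(cells) <= 0:
            return 0.0  # the variance vanishes at the edge of the range
        return 1 / math.sqrt(sum(1 / c for c in cells))

    def odds(x):
        denom = (N1 - x) * (M1 - x)
        return math.inf if denom <= 0 else x * (M0 - N1 + x) / denom

    def root(f, lo, hi):
        # Requires f(lo) <= 0 < f(hi); halve the bracket until it is tiny.
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if f(mid) > 0:
                hi = mid
            else:
                lo = mid
        return (lo + hi) / 2

    if a - correction <= a_min:
        lower = 0.0
    else:
        # Lower limit solves a - correction - x = z * sd(x).
        lower = odds(root(lambda x: x - (a - correction) + z * sd(x),
                          a_min, a - correction))
    if a + correction >= a_max:
        upper = math.inf
    else:
        # Upper limit solves x - a - correction = z * sd(x).
        upper = odds(root(lambda x: x - (a + correction) - z * sd(x),
                          a + correction, a_max))
    return lower, upper

plain = cornfield_interval(11, 14, 329, 117)
corrected = cornfield_interval(11, 14, 329, 117, correction=0.5)
print("no correction:   [%.2f, %.2f]" % plain)
print("with correction: [%.2f, %.2f]" % corrected)
```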

The following program will reproduce the Stata results:

    program define upper /* a0 */
            local a = 11
            local b = 106
            local c = 3
            local d = 223

            local M1 = `a' + `b'
            local M0 = `c' + `d'

            local N1 = `a' + `c'
            local N0 = `b' + `d'

            local T = `M1' + `M0'

            local z = 1.96

            local ai = `1'
            while (1) {
                    di `ai' " " `ou'
                    local ai = `a' + `z'*1/sqrt( /*
                            */ 1/`ai' + /*
                            */ 1/(`N1'-`ai') + /*
                            */ 1/(`M1'-`ai') + /*
                            */ 1/(`M0'-`N1'+`ai') /*
                            */ )
                    local ou = `ai'*(`M0'-`N1'+`ai') / /*
                            */ ((`N1'-`ai')*(`M1'-`ai'))
            }
    end


The result of running this program is

    . upper 3
    3
    13.962681 820.50662
    11.37803 9.1775262
    13.819558 167.61792
    11.826157 11.577601
    13.62256 78.771227
    12.184741 14.356851
    13.436792 51.933344
    12.435577 17.061584
    13.288298 40.55852
    [output omitted]
    12.932766 26.192115
    12.932766 26.192115
    12.932766 26.192115
    --Break--
    r(1);


The slight difference from the result reported by Stata is due to our use of the (imprecise) 1.96.

We then modified the program to add ½ to `ai'. This resulted in nonconvergence. However, when we first converged the uncorrected formula and then switched to the continuity-corrected formula, the iteration converged to an upper limit of 35.635, in line with the value reported by Epi Info.