The Stata listserver

st: Re: Confidence interval of difference between two proportions and -csi-

From   Roger Newson
Subject   st: Re: Confidence interval of difference between two proportions and -csi-
Date   Thu, 18 Mar 2004 17:17:43 +0000

At 16:37 18/03/04 +1100, Garry Anderson wrote:

I am enquiring if a more appropriate method could be used please to calculate the 95% CI of the difference between two proportions in the -csi- command?

At the moment it is possible for the upper bound of the confidence interval of the difference between two proportions to be greater than 1.0. I realize that the approximation that is used is not appropriate for small sample sizes, however I think that reporting of results that are impossible should be avoided.

One possibility is to use the -transform()- option of the -somersd- package, which is downloadable (complete with a .pdf manual) from SSC. The difference between two proportions is a special case of Somers' D, and the -somersd- package offers a choice of transformations appropriate for Somers' D, notably the hyperbolic arctangent (or z) transformation and the arcsine transformation. If -diseased- and -exposed- are two binary (0,1) variables indicating disease and exposure, respectively, then Garry might type

somersd exposed diseased, tr(z)

or, alternatively,

somersd exposed diseased, tr(asin)

and get a confidence interval for the difference between the proportion of exposed individuals with the disease and the proportion of unexposed individuals with the disease, using a normalizing and variance-stabilizing transformation.
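
As a minimal sketch of the above, the following do-file fragment rebuilds a 2x2 table as one observation per subject, runs -somersd- on it, and then illustrates by hand why a transformed interval stays inside the range from -1 to 1. The cell counts (5, 1, 0, 4) are taken from the -exactcci- command in this post and are assumed to be in -csi- order; the variable -n- and the scalars -rd-, -serd-, -zcrit- and -sez- are names invented for illustration. The hand calculation uses the ordinary large-sample standard error and the delta method, which is an assumption made for illustration and not necessarily what -somersd- does internally.

```stata
* Cell counts in -csi- order (assumed): exposed cases, unexposed cases,
* exposed noncases, unexposed noncases
csi 5 1 0 4

* The same counts as one observation per subject, so that -somersd- can be used
clear
input exposed diseased n
1 1 5
0 1 1
1 0 0
0 0 4
end
* -expand- would keep a zero-count row as a single observation, so drop it first
drop if n == 0
expand n
drop n
tabulate exposed diseased

* -somersd- is user-written; install it once with: ssc install somersd
somersd exposed diseased, tr(z)

* Hand illustration of the back-transformation only (illustrative assumption:
* large-sample standard error, not the one computed by -somersd-)
scalar rd    = 5/5 - 1/5
scalar serd  = sqrt((5/5)*(0/5)/5 + (1/5)*(4/5)/5)
scalar zcrit = invnormal(0.975)
display "untransformed limits: " (rd - zcrit*serd) " to " (rd + zcrit*serd)

* Delta method: the standard error of atanh(rd) is serd/(1 - rd^2)
scalar sez = serd/(1 - rd^2)
display "limits via atanh:     " tanh(atanh(rd) - zcrit*sez) " to " tanh(atanh(rd) + zcrit*sez)
```

With these counts, the untransformed limits come out at roughly 0.45 and 1.15, whereas the back-transformed limits are roughly 0.12 and 0.97. The scalar names are chosen so that none of them is an abbreviation of -exposed- or -diseased-, because Stata resolves a name in an expression to a variable before a scalar.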

However, it should be stressed that Garry's example contains a zero cell (for exposed noncases), so one of the two proportions is equal to zero or one. With so small a sample, a normalizing or variance-stabilizing transformation might be inappropriate. In such circumstances, it might be better to use the -exactcci- package to define a conservative confidence interval for the odds ratio, which may have an infinite upper limit or a zero lower limit. If Garry uses -findit- to find and install the -exactcci- package and types

exactcci 5 1 0 4, exact

then the so-called "exact" confidence interval is generated. (Note, however, that this confidence interval is conservative, not exact. It is called "exact" because it uses the exact hypergeometric distribution to calculate conservative confidence limits.)

I hope this helps.


Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
King's College London
5th Floor, Capital House
42 Weston Street
London SE1 3QD
United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648
Fax: 020 7848 6620 International +44 20 7848 6620
or 020 7848 6605 International +44 20 7848 6605

Opinions expressed are those of the author, not the institution.
