Tables for epidemiologists
- 2 × 2 and stratified 2 × 2 tables for longitudinal, cohort study,
case–control, and matched case–control data
- Odds ratio, incidence ratio, risk ratio, risk difference, and
attributable fraction
- Confidence intervals for the above
- Chi-squared, Fisher’s exact, and Mantel–Haenszel tests
- Tests for homogeneity
- Choice of weights for stratified tables: Mantel–Haenszel,
standardized, or user specified
- Exact McNemar test for matched case–control data
- Tabulated odds and odds ratios
- Score test for linear trend
Stata has a set of commands for dealing with 2 × 2 tables, including
stratified tables, known collectively as the epitab commands. To calculate
appropriate statistics and suppress inappropriate statistics, these commands
are organized in the same way that epidemiologists conceptualize data.
Stata’s ir command is used with incidence-rate (incidence
density or person-time) data; point estimates and confidence intervals for
the incidence-rate ratio and difference are calculated, along with the
attributable or prevented fractions for the exposed and total populations.
Stata’s cs command is used with cohort study data with equal
follow-up time per subject. Risk is then the proportion of subjects who
become cases. Point estimates and confidence intervals for the risk
difference, risk ratio, and (optionally) the odds ratio are calculated,
along with attributable or prevented fractions for the exposed and total
population.
Stata’s cc command is used with case–control and
cross-sectional data. Point estimates and confidence intervals for the odds
ratio are calculated along with attributable or prevented fractions for the
exposed and total population.
mcc is used with matched case–control data. It calculates
McNemar’s chi-squared, point estimates, and confidence intervals for
the difference, ratio, and relative difference of the proportion with the
factor, along with the odds ratio.
Each of these commands comes in two forms: a normal form and an
“immediate” form. The normal form computes the table counts from the
dataset in memory. The immediate form takes the counts as arguments on the
command line; the immediate commands carry an i suffix (iri, csi, cci, and
mcci).
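As a minimal sketch of the immediate form, the do-file lines below pass a made-up 2 × 2 cohort table to csi and then recompute the point estimates by hand. The counts (30, 20, 70, 80) are hypothetical and come from none of the studies cited here. The argument order assumed is exposed cases, unexposed cases, exposed noncases, unexposed noncases.

```stata
* Hypothetical cohort counts, for illustration only (not from any cited study)
*   a = exposed cases      b = unexposed cases
*   c = exposed noncases   d = unexposed noncases
csi 30 20 70 80, or

* Hand calculation of the same point estimates
scalar risk1 = 30/(30 + 70)
scalar risk0 = 20/(20 + 80)
display "Risk difference = " scalar(risk1) - scalar(risk0)
display "Risk ratio      = " scalar(risk1)/scalar(risk0)
display "Odds ratio      = " (30*80)/(20*70)
```

The or option asks csi to report the odds ratio as well, which cs treats as optional.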
For instance, Boice and Monson (1977; reprinted in Rothman, Greenland, and Lash
2008, 244) reported breast cancer cases and person-years of observation
for women with tuberculosis who were repeatedly exposed to X-ray
fluoroscopy and for those who were not exposed:
                           X-ray fluoroscopy
                          Exposed    Unexposed
  -------------------------------------------
  Breast cancer cases          41           15
  Person-years             28,010       19,017
Using the immediate form of ir, you specify the values in the table
following the command:
. iri 41 15 28010 19017
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |        41          15  |         56
     Person-time |     28010       19017  |      47027
-----------------+------------------------+------------
                 |                        |
  Incidence rate |  .0014638    .0007888  |   .0011908
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Inc. rate diff. |         .000675        |    .0000749    .0012751
 Inc. rate ratio |         1.855759       |    1.005684      3.6093 (exact)
 Attr. frac. ex. |         .4611368       |    .0056519     .722938 (exact)
 Attr. frac. pop |          .337618       |
                 +-------------------------------------------------
                    (midp)   Pr(k>=41) =                    0.0177 (exact)
                    (midp) 2*Pr(k>=41) =                    0.0355 (exact)
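The point estimates in this output follow directly from the four numbers in the table, and a short do-file can check them. This is a sketch: the scalar and variable names (cases1, pt1, exposed, ptime, and so on) are our own, and the formulas are the standard person-time definitions rather than anything quoted from the iri documentation. The last lines pass the same table to the normal form of ir as a two-row aggregate dataset.

```stata
* Counts from the Boice and Monson table above
scalar cases1 = 41       // breast cancer cases, exposed
scalar cases0 = 15       // breast cancer cases, unexposed
scalar pt1    = 28010    // person-years, exposed
scalar pt0    = 19017    // person-years, unexposed

scalar rate1 = cases1/pt1
scalar rate0 = cases0/pt0
scalar rateT = (cases1 + cases0)/(pt1 + pt0)

display "Inc. rate diff. = " scalar(rate1) - scalar(rate0)
display "Inc. rate ratio = " scalar(rate1)/scalar(rate0)
display "Attr. frac. ex. = " (scalar(rate1) - scalar(rate0))/scalar(rate1)
display "Attr. frac. pop = " (scalar(rateT) - scalar(rate0))/scalar(rateT)

* Same table through the normal form: one row per exposure group
clear
input cases exposed ptime
41 1 28010
15 0 19017
end
ir cases exposed ptime
```

The confidence limits and the mid-p values are exact calculations that this sketch does not attempt to reproduce.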
The ir command itself works with individual-level or aggregate data and
also handles stratified data. Rothman, Greenland, and Lash
(2008, 264) report results from Doll and Hill (1966) on age-specific
coronary disease deaths among British male doctors in relation to cigarette
smoking:
                   Smokers                Nonsmokers
  Age       Deaths  Person-years     Deaths  Person-years
  -------------------------------------------------------
  35-44         32        52,407          2        18,790
  45-54        104        43,248         12        10,673
  55-64        206        28,612         28         5,710
  65-74        186        12,663         28         2,585
  75-84        102         5,317         31         1,462
We have entered these data into Stata:
. list, separator(0)
     +-----------------------------------+
     |   age   smokes   deaths    pyears |
     |-----------------------------------|
  1. | 35-44        1       32    52,407 |
  2. | 35-44        0        2    18,790 |
  3. | 45-54        1      104    43,248 |
  4. | 45-54        0       12    10,673 |
  5. | 55-64        1      206    28,612 |
  6. | 55-64        0       28     5,710 |
  7. | 65-74        1      186    12,663 |
  8. | 65-74        0       28     2,585 |
  9. | 75-84        1      102     5,317 |
 10. | 75-84        0       31     1,462 |
     +-----------------------------------+
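One way to get such a dataset into memory is to type it with input. The sketch below is an assumption about how the data were stored, since the listing does not show storage types: age is entered here as a string, and the comma display format on pyears is our guess at what produces the thousands separators in the listing.

```stata
clear
input str5 age byte smokes int deaths long pyears
"35-44" 1  32 52407
"35-44" 0   2 18790
"45-54" 1 104 43248
"45-54" 0  12 10673
"55-64" 1 206 28612
"55-64" 0  28  5710
"65-74" 1 186 12663
"65-74" 0  28  2585
"75-84" 1 102  5317
"75-84" 0  31  1462
end

* Show person-years with thousands separators, as in the listing
format pyears %9.0gc
list, separator(0)
```

Each row is one age-by-smoking cell, so the data are aggregate rather than individual-level.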
We can obtain the Mantel–Haenszel combined estimate of the
incidence-rate ratio, along with 90% confidence intervals, by typing
. ir deaths smokes pyears, by(age) level(90)
             age |       IRR      [90% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
           35-44 |  5.736638       1.704271   33.61646     1.472169 (exact)
           45-54 |  2.138812       1.274552   3.813282     9.624747 (exact)
           55-64 |   1.46824       1.044915   2.110422     23.34176 (exact)
           65-74 |   1.35606       .9626026   1.953505     23.25315 (exact)
           75-84 |  .9047304       .6375194   1.305412     24.31435 (exact)
-----------------+-------------------------------------------------
           Crude |  1.719823       1.437544     2.0688              (exact)
    M-H combined |  1.424682       1.194375   1.699399
-------------------------------------------------------------------
Test of homogeneity (M-H)      chi2(4) =    10.41  Pr>chi2 = 0.0340
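The M-H Weight column and the combined estimate can be checked by hand. The sketch below assumes the usual Mantel–Haenszel formulas for person-time data: each stratum's weight is unexposed deaths × exposed person-years ÷ total person-years, and the combined ratio is the sum of exposed deaths × unexposed person-years ÷ total person-years, divided by the sum of the weights. These formulas are our assumption, not quoted from the page. The data are re-entered in wide form, one row per age group, with hypothetical variable names d1, t1, d0, and t0.

```stata
clear
input str5 age d1 t1 d0 t0
"35-44"  32 52407  2 18790
"45-54" 104 43248 12 10673
"55-64" 206 28612 28  5710
"65-74" 186 12663 28  2585
"75-84" 102  5317 31  1462
end

* Stratum weight and numerator term (assumed M-H person-time formulas)
generate double mh_weight = d0*t1/(t1 + t0)
generate double mh_num    = d1*t0/(t1 + t0)
list age mh_weight, separator(0)

quietly summarize mh_num
scalar sum_num = r(sum)
quietly summarize mh_weight
scalar sum_wgt = r(sum)
display "M-H combined IRR = " scalar(sum_num)/scalar(sum_wgt)
```

The listed weights should agree with the M-H Weight column, which supports the assumed formula. The homogeneity test and the confidence limits are not reproduced here.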
Rothman, Greenland, and Lash (2008, 264) obtain the standardized incidence-rate
ratio and 90% confidence intervals, weighting each age category by the
population of the exposed group, thus producing the standardized mortality
ratio (SMR). This calculation can be reproduced by specifying
by(age) to indicate that the table is stratified, and
istandard to specify that we want the internally standardized rate:
. ir deaths smokes pyears, by(age) level(90) istandard
             age |       IRR      [90% Conf. Interval]       Weight
-----------------+-------------------------------------------------
           35-44 |  5.736638       1.704271   33.61646        52407 (exact)
           45-54 |  2.138812       1.274552   3.813282        43248 (exact)
           55-64 |   1.46824       1.044915   2.110422        28612 (exact)
           65-74 |   1.35606       .9626026   1.953505        12663 (exact)
           75-84 |  .9047304       .6375194   1.305412         5317 (exact)
-----------------+-------------------------------------------------
           Crude |  1.719823       1.437544     2.0688              (exact)
 I. Standardized |  1.417609       1.186541   1.693676
If we want the externally standardized ratio (weights proportional to the
population of the unexposed group), we can substitute estandard for
istandard in the command above.
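Both weighting schemes reduce to a weighted sum of stratum-specific rates, which can be sketched by hand. The code below assumes the standardized ratio is the weighted sum of the smokers' rates divided by the weighted sum of the nonsmokers' rates, with exposed person-years as weights for internal standardization and unexposed person-years for external standardization. The wide layout and the variable names d1, t1, d0, and t0 are hypothetical.

```stata
clear
input str5 age d1 t1 d0 t0
"35-44"  32 52407  2 18790
"45-54" 104 43248 12 10673
"55-64" 206 28612 28  5710
"65-74" 186 12663 28  2585
"75-84" 102  5317 31  1462
end

* Stratum-specific rates for smokers (1) and nonsmokers (0)
generate double rate1 = d1/t1
generate double rate0 = d0/t0

* Internal standardization: weight each stratum by exposed person-years
generate double i_num = t1*rate1
generate double i_den = t1*rate0
* External standardization: weight each stratum by unexposed person-years
generate double e_num = t0*rate1
generate double e_den = t0*rate0

foreach v of varlist i_num i_den e_num e_den {
    quietly summarize `v'
    scalar S_`v' = r(sum)
}
display "Internally standardized IRR = " scalar(S_i_num)/scalar(S_i_den)
display "Externally standardized IRR = " scalar(S_e_num)/scalar(S_e_den)
```

The internally standardized value can be compared with the I. Standardized row of the output. The page shows no output for estandard, so the second figure is left unchecked.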
See New in Stata 12 for more about what was added in Stata Release 12.
References
- Boice, J. D., and R. R. Monson. 1977. Breast cancer in women after repeated fluoroscopic examinations of the chest. Journal of the National Cancer Institute 59: 823–832.
- Doll, R., and A. B. Hill. 1966. Mortality of British doctors in relation to smoking: observations on coronary thrombosis. In Epidemiological Approaches to the Study of Cancer and Other Chronic Diseases, ed. W. Haenszel. National Cancer Institute Monograph 19: 205–268.
- Rothman, K. J., S. Greenland, and T. L. Lash. 2008. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins.