Tables for epidemiologists


2 × 2 and stratified 2 × 2 tables for longitudinal, cohort study, case–control, and matched case–control data
Odds ratio, incidence-rate ratio, risk ratio, risk difference, and attributable fraction
Confidence intervals for the above
Chi-squared, Fisher’s exact, and Mantel–Haenszel tests
Tests for homogeneity

Choice of weights for stratified tables: Mantel–Haenszel, standardized, or user-specified
Exact McNemar test for matched case–control data
Tabulated odds and odds ratios
Score test for linear trend

Stata has a suite of tools for dealing with 2 × 2 tables, including stratified tables, known collectively as the epitab features. To calculate appropriate statistics and suppress inappropriate statistics, these features are organized in the same way that epidemiologists conceptualize data.

Stata’s ir is used with incidence-rate (incidence density or person-time) data; point estimates and confidence intervals for the incidence-rate ratio and difference are calculated, along with the attributable or prevented fractions for the exposed and total populations.

Stata’s cs is used with cohort study data with equal follow-up time per subject. Risk is then the proportion of subjects who become cases. Point estimates and confidence intervals for the risk difference, risk ratio, and (optionally) the odds ratio are calculated, along with attributable or prevented fractions for the exposed and total population.

Stata’s cc is used with case–control and cross-sectional data. Point estimates and confidence intervals for the odds ratio are calculated along with attributable or prevented fractions for the exposed and total population.

Stata’s mcc is used with matched case–control data. It calculates McNemar’s chi-squared test, along with point estimates and confidence intervals for the difference, ratio, and relative difference of the proportion with the factor, and for the odds ratio.

Each of these commands comes in two forms: a normal form and an “immediate” form. In its normal form, a command forms the counts by summing over the dataset in memory. In its immediate form (iri, csi, cci, and mcci), the counts are typed directly on the command line.
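As a rough sketch of the immediate forms, the lines below pass cell counts directly to csi, cci, and mcci. All counts are hypothetical and chosen only for illustration. For csi and cci, the four arguments are, in order, exposed cases, unexposed cases, exposed noncases (or controls), and unexposed noncases (or controls). For mcci, they are the numbers of matched pairs in the four cells of the case-by-control exposure table.

```stata
* Hypothetical counts throughout; not taken from any study.

* Cohort study: 30 exposed cases, 20 unexposed cases,
* 70 exposed noncases, 80 unexposed noncases.
* The or option additionally reports the odds ratio.
csi 30 20 70 80, or

* Case-control study: same layout, with controls in place of noncases.
cci 30 20 70 80

* Matched case-control study, counts of pairs:
* both exposed, case only exposed, control only exposed, neither exposed.
mcci 15 20 5 60
```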

For instance, Boice and Monson (1977; reprinted in Rothman, Greenland, and Lash 2008, 244) reported breast cancer cases and person-years of observation for women with tuberculosis who were repeatedly exposed to X-ray fluoroscopy and for those who were not exposed:

                         X-ray fluoroscopy
                        Exposed   Unexposed

Breast cancer cases          41          15
Person-years             28,010      19,017

Using the immediate form of ir, you specify the values in the table following the command:

. iri 41 15 28010 19017

Incidence-rate comparison

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |        41          15  |         56
     Person-time |     28010       19017  |      47027
-----------------+------------------------+-----------
  Incidence rate |  .0014638    .0007888  |   .0011908

                 |  Point estimate        [95% Conf. Interval]
-----------------+--------------------------------------------
 Inc. rate diff. |        .000675         .0000749    .0012751
 Inc. rate ratio |       1.855759         1.005684      3.6093  (exact)
 Attr. frac. ex. |       .4611368         .0056519     .722938  (exact)
 Attr. frac. pop |        .337618

Mid p-values for tests of incidence-rate difference:
  Adj Pr(Exposed cases <= 41) = 0.9823  (lower one-sided)
  Adj Pr(Exposed cases >= 41) = 0.0177  (upper one-sided)
            Two-sided p-value = 0.0355
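The point estimates in this output can be checked by hand from the four numbers typed on the command line. The sketch below uses display as a calculator; the scalar names are our own, and the formulas are the usual definitions (rate = cases / person-time, attributable fraction among the exposed = (IRR − 1) / IRR, and the population fraction scales that by the share of cases that are exposed).

```stata
* Counts and person-time from the Boice and Monson table above
scalar rate1 = 41/28010          // incidence rate, exposed
scalar rate0 = 15/19017          // incidence rate, unexposed
scalar irr   = rate1/rate0       // incidence-rate ratio
scalar afe   = (irr - 1)/irr     // attributable fraction, exposed

display "Inc. rate diff. = " %9.7f (rate1 - rate0)
display "Inc. rate ratio = " %9.6f irr
display "Attr. frac. ex. = " %9.7f afe
display "Attr. frac. pop = " %9.6f (afe*41/56)   // 41 of 56 cases were exposed
```

The displayed values should agree with the Point estimate column above, up to rounding. The confidence intervals and mid p-values are not reproduced by this arithmetic.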

The ir command itself works with individual-level or aggregate data and also handles stratified data. Rothman, Greenland, and Lash (2008, 264) report results from Doll and Hill (1966) on age-specific coronary disease deaths among British male doctors in relation to cigarette smoking:

                Smokers                  Nonsmokers
Age       Deaths   Person-years    Deaths   Person-years

35-44         32         52,407         2         18,790
45-54        104         43,248        12         10,673
55-64        206         28,612        28          5,710
65-74        186         12,663        28          2,585
75-84        102          5,317        31          1,462

We have entered these data into Stata:

. list, separator(0)

       age     smokes   deaths   pyears

  1.   35-44        1       32   52,407
  2.   45-54        1      104   43,248
  3.   55-64        1      206   28,612
  4.   65-74        1      186   12,663
  5.   75-84        1      102    5,317
  6.   35-44        0        2   18,790
  7.   45-54        0       12   10,673
  8.   55-64        0       28    5,710
  9.   65-74        0       28    2,585
 10.   75-84        0       31    1,462
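One way such a dataset could be typed in is with input. This is only a sketch: the storage types and the display format are assumptions chosen so that list reproduces the listing above (age held as a string, pyears shown with commas); the original dataset may have been stored differently, for example with age as a labeled numeric variable.

```stata
clear
* pyears is stored as long because 52,407 exceeds the range of int
input str5 age byte smokes int deaths long pyears
"35-44" 1  32 52407
"45-54" 1 104 43248
"55-64" 1 206 28612
"65-74" 1 186 12663
"75-84" 1 102  5317
"35-44" 0   2 18790
"45-54" 0  12 10673
"55-64" 0  28  5710
"65-74" 0  28  2585
"75-84" 0  31  1462
end
format pyears %9.0fc     // display person-years with thousands separators
list, separator(0)
```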

We can obtain the Mantel–Haenszel combined estimate of the incidence-rate ratio, along with 90% confidence intervals, by typing

. ir deaths smokes pyears, by(age) level(90)

Stratified incidence-rate analysis

    Age category |        IRR     [90% Conf. Interval]    M-H Weight
-----------------+---------------------------------------------------
           35-44 |   5.736638     1.704271   33.61646       1.472169  (exact)
           45-54 |   2.138812     1.274552   3.813282       9.624747  (exact)
           55-64 |    1.46824     1.044915   2.110422       23.34176  (exact)
           65-74 |    1.35606     .9626026   1.953505       23.25315  (exact)
           75-84 |   .9047304     .6375194   1.305412       24.31435  (exact)
-----------------+---------------------------------------------------
           Crude |   1.719823     1.437544     2.0688                 (exact)
    M-H combined |   1.424682     1.194375   1.699399

Test of homogeneity (M-H): chi2(4) = 10.41  Pr>chi2 = 0.0340
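To see where the combined estimate comes from, the following sketch recomputes it by hand with the usual Mantel–Haenszel formula for person-time data: the sum over strata of (exposed deaths × unexposed person-time / total person-time), divided by the sum of (unexposed deaths × exposed person-time / total person-time). It assumes the dataset listed above is in memory with smokes coded 0/1; the new variable and scalar names are our own. The per-stratum denominator terms should match the M-H Weight column.

```stata
preserve
* One row per age group: deaths1/pyears1 = smokers, deaths0/pyears0 = nonsmokers
reshape wide deaths pyears, i(age) j(smokes)

generate double ptotal = pyears1 + pyears0
generate double mhnum  = deaths1*pyears0/ptotal   // numerator term
generate double mhwgt  = deaths0*pyears1/ptotal   // denominator term (M-H weight)

quietly summarize mhnum
scalar sum_num = r(sum)
quietly summarize mhwgt
scalar sum_wgt = r(sum)

list age mhwgt, separator(0)
display "M-H combined IRR = " %9.6f (sum_num/sum_wgt)
restore
```

This reproduces only the point estimate; the confidence interval and the test of homogeneity are left to ir.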

Rothman, Greenland, and Lash (2008, 264) obtain the standardized incidence-rate ratio and its 90% confidence interval by weighting each age category by the person-time of the exposed group, which yields the standardized mortality ratio (SMR). This calculation can be reproduced by specifying by(age) to indicate that the table is stratified and istandard to request the internally standardized ratio:

. ir deaths smokes pyears, by(age) level(90) istandard

Stratified incidence-rate analysis

    Age category |        IRR     [90% Conf. Interval]        Weight
-----------------+---------------------------------------------------
           35-44 |   5.736638     1.704271   33.61646          52407  (exact)
           45-54 |   2.138812     1.274552   3.813282          43248  (exact)
           55-64 |    1.46824     1.044915   2.110422          28612  (exact)
           65-74 |    1.35606     .9626026   1.953505          12663  (exact)
           75-84 |   .9047304     .6375194   1.305412           5317  (exact)
-----------------+---------------------------------------------------
           Crude |   1.719823     1.437544     2.0688                 (exact)
 I. Standardized |   1.417609     1.186541   1.693676

If we want the externally standardized ratio (weights proportional to the population of the unexposed group), we can substitute estandard for istandard above.
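For completeness, the externally standardized analysis would be requested as sketched below, again assuming the Doll and Hill dataset is in memory. The last two lines illustrate user-specified weights through the standard() option; the variable wgt and its equal-weight values are hypothetical and serve only to show the syntax.

```stata
* Externally standardized ratio: weights from the unexposed group's person-time
ir deaths smokes pyears, by(age) level(90) estandard

* User-specified weights: wgt is a hypothetical variable, constant within
* each age category (here every stratum simply gets the same weight)
generate double wgt = 1
ir deaths smokes pyears, by(age) level(90) standard(wgt)
```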

References

Boice, J. D., and R. R. Monson. 1977. Breast cancer in women after repeated fluoroscopic examinations of the chest. Journal of the National Cancer Institute 59: 823–832.

Doll, R., and A. B. Hill. 1966. Mortality of British doctors in relation to smoking: Observations on coronary thrombosis. In Epidemiological Approaches to the Study of Cancer and Other Chronic Diseases, ed. W. Haenszel. National Cancer Institute Monograph 19: 205–268.

Rothman, K. J., S. Greenland, and T. L. Lash. 2008. Modern Epidemiology. 3rd ed. (Revised). Philadelphia: Lippincott Williams & Wilkins.