Tables for epidemiologists

  • 2 x 2 and stratified 2 x 2 tables for longitudinal, cohort study, case–control, and matched case–control data
  • Odds ratio, incidence ratio, risk ratio, risk difference, and attributable fraction
  • Confidence intervals for the above
  • Chi-squared, Fisher’s exact, and Mantel–Haenszel tests
  • Tests for homogeneity
  • Choice of weights for stratified tables: Mantel–Haenszel, standardized, or user specified
  • Exact McNemar test for matched case–control data
  • Tabulated odds and odds ratios
  • Score test for linear trend

Stata has a set of commands for dealing with 2 x 2 tables, including stratified tables, known collectively as the epitab commands. To calculate appropriate statistics and suppress inappropriate statistics, these commands are organized in the same way that epidemiologists conceptualize data.

Stata’s ir command is used with incidence-rate (incidence density or person-time) data; point estimates and confidence intervals for the incidence-rate ratio and difference are calculated, along with the attributable or prevented fractions for the exposed and total populations.

Stata’s cs command is used with cohort study data with equal follow-up time per subject. Risk is then the proportion of subjects who become cases. Point estimates and confidence intervals for the risk difference, risk ratio, and (optionally) the odds ratio are calculated, along with attributable or prevented fractions for the exposed and total population.

Stata’s cc command is used with case–control and cross-sectional data. Point estimates and confidence intervals for the odds ratio are calculated along with attributable or prevented fractions for the exposed and total population.
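
The cohort and case–control measures described above reduce to simple arithmetic on the four cells of a 2 x 2 table. The following Python sketch is only an illustration of the textbook formulas, not Stata code; the function name and the counts are hypothetical, and confidence intervals are not attempted.

```python
# Textbook point estimates for a 2 x 2 table.
# The counts used below are hypothetical and do not come from any study cited here.

def two_by_two_measures(a, b, c, d):
    """a, b: exposed and unexposed cases; c, d: exposed and unexposed noncases."""
    risk_exposed = a / (a + c)      # proportion of exposed subjects who become cases
    risk_unexposed = b / (b + d)    # proportion of unexposed subjects who become cases
    return {
        "risk_difference": risk_exposed - risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
        "odds_ratio": (a * d) / (b * c),
        # Attributable fraction among the exposed: (RR - 1) / RR
        "attr_frac_exposed": 1 - risk_unexposed / risk_exposed,
    }

measures = two_by_two_measures(a=30, b=10, c=70, d=90)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

The risk-based measures are meaningful only for cohort data; with case–control data, only the odds ratio can be estimated directly from the table.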

mcc is used with matched case–control data. It calculates McNemar’s chi-squared, point estimates, and confidence intervals for the difference, ratio, and relative difference of the proportion with the factor, along with the odds ratio.
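
For matched pairs, the textbook McNemar statistic and the matched odds ratio depend only on the discordant pairs. The Python sketch below illustrates that arithmetic under stated assumptions: the function name and the pair counts are hypothetical, the chi-squared is the uncorrected form, and no confidence intervals or exact tests are computed.

```python
# Textbook McNemar calculation for matched case-control pairs.
# The pair counts are hypothetical and serve only to illustrate the formulas.

def mcnemar_summary(both, case_only, control_only, neither):
    """Counts of pairs by exposure: both exposed, only the case exposed,
    only the control exposed, neither exposed."""
    pairs = both + case_only + control_only + neither
    discordant = case_only + control_only
    chi2 = (case_only - control_only) ** 2 / discordant   # uncorrected McNemar chi-squared
    odds_ratio = case_only / control_only                 # ratio of discordant pairs
    # Difference in the proportion with the factor (cases minus controls)
    difference = (case_only - control_only) / pairs
    return chi2, odds_ratio, difference

chi2, odds_ratio, difference = mcnemar_summary(both=8, case_only=8, control_only=3, neither=8)
print(f"McNemar chi2(1) = {chi2:.4f}")
print(f"odds ratio      = {odds_ratio:.4f}")
print(f"difference      = {difference:.4f}")
```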

Each of these commands comes in two forms: a normal form and an “immediate” form. In the normal form, the command forms counts by summing over the dataset in use. In the immediate form, the data are specified on the command line.

For instance, Boice and Monson (1977; reprinted in Rothman and Greenland 1998, 238) reported breast cancer cases and person-years of observation for women with tuberculosis who were repeatedly exposed to X-ray fluoroscopy and for those who were not exposed:

                            X-ray fluoroscopy
                          Exposed     Unexposed
    -------------------------------------------
    Breast cancer cases        41           15
    Person-years           28,010       19,017

Using the immediate form of ir, you specify the values in the table following the command:

 . iri 41 15 28010 19017

                  |   Exposed   Unexposed  |      Total
 -----------------+------------------------+------------
            Cases |        41          15  |         56
      Person-time |     28010       19017  |      47027
 -----------------+------------------------+------------
                  |                        |
   Incidence Rate |  .0014638    .0007888  |   .0011908
                  |                        |
                  |      Point estimate    |    [95% Conf. Interval]
                  |------------------------+------------------------
  Inc. rate diff. |          .000675       |    .0000749    .0012751 
  Inc. rate ratio |         1.855759       |    1.005684      3.6093 (exact)
  Attr. frac. ex. |         .4611368       |    .0056519     .722938 (exact)
  Attr. frac. pop |          .337618       |
                  +-------------------------------------------------
                      (midp)   Pr(k>=41) =                    0.0177 (exact)
                      (midp) 2*Pr(k>=41) =                    0.0355 (exact)
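
The point estimates in this table can be checked by hand from the four numbers given to iri. The Python sketch below is our own illustration, not Stata code; the variable names are assumptions, and it does not attempt the exact confidence intervals or the mid-p values.

```python
# Reproduce the point estimates from the iri output using the Boice and Monson counts.
cases_exposed, cases_unexposed = 41, 15
time_exposed, time_unexposed = 28010, 19017

rate_exposed = cases_exposed / time_exposed
rate_unexposed = cases_unexposed / time_unexposed

rate_diff = rate_exposed - rate_unexposed
rate_ratio = rate_exposed / rate_unexposed

# Attributable fraction among the exposed: (IRR - 1) / IRR
af_exposed = (rate_ratio - 1) / rate_ratio
# Attributable fraction in the population: scale by the share of cases that were exposed
af_population = af_exposed * cases_exposed / (cases_exposed + cases_unexposed)

print(f"Inc. rate diff.  {rate_diff:.7f}")
print(f"Inc. rate ratio  {rate_ratio:.6f}")
print(f"Attr. frac. ex.  {af_exposed:.7f}")
print(f"Attr. frac. pop  {af_population:.6f}")
```

To the precision shown, these values agree with the point estimates reported by iri.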

The ir command itself works with individual-level or aggregate data and also with stratified data. Rothman and Greenland (1998, 259) report results from Doll and Hill (1966) on age-specific coronary disease deaths among British male doctors in relation to cigarette smoking:

                   Smokers                 Nonsmokers
    Age      Deaths  Person-years     Deaths  Person-years
    -------------------------------------------------------
    35-44       32       52,407           2       18,790
    45-54      104       43,248          12       10,673
    55-64      206       28,612          28        5,710
    65-74      186       12,663          28        2,585
    75-84      102        5,317          31        1,462

We have entered these data into Stata:

 . list, separator(0)

      +-----------------------------------+
      |    age   smokes   deaths   pyears |
      |-----------------------------------|
   1. |  35-44        1       32   52,407 |
   2. |  35-44        0        2   18,790 |
   3. |  45-54        1      104   43,248 |
   4. |  45-54        0       12   10,673 |
   5. |  55-64        1      206   28,612 |
   6. |  55-64        0       28    5,710 |
   7. |  65-74        1      186   12,663 |
   8. |  65-74        0       28    2,585 |
   9. |  75-84        1      102    5,317 |
  10. |  75-84        0       31    1,462 |
      +-----------------------------------+

We can obtain the Mantel–Haenszel combined estimate of the incidence-rate ratio, along with 90% confidence intervals, by typing

 . ir deaths smokes pyears, by(age) level(90)

              age |      IRR       [90% Conf. Interval]   M-H Weight
 -----------------+-------------------------------------------------
            35-44 |   5.736638      1.704271   33.61646     1.472169 (exact)
            45-54 |   2.138812      1.274552   3.813282     9.624747 (exact)
            55-64 |    1.46824      1.044915   2.110422     23.34176 (exact)
            65-74 |    1.35606      .9626026   1.953505     23.25315 (exact)
            75-84 |   .9047304      .6375194   1.305412     24.31435 (exact)
 -----------------+-------------------------------------------------
            Crude |   1.719823      1.437544     2.0688              (exact)
     M-H combined |   1.424682      1.194375   1.699399
 -------------------------------------------------------------------
  Test of homogeneity (M-H)    chi2(4) =     10.41  Pr>chi2 = 0.0340
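
The stratum-specific ratios, the M-H weights, and the combined estimate can be reproduced from the listed data with the standard Mantel–Haenszel formula for person-time data. The Python sketch below is an illustration with our own variable names, not Stata code; it omits the confidence intervals and the test of homogeneity.

```python
# Mantel-Haenszel incidence-rate ratio for the Doll and Hill data.
# Each row: (age, deaths among smokers, smoker person-years,
#            deaths among nonsmokers, nonsmoker person-years)
strata = [
    ("35-44",  32, 52407,  2, 18790),
    ("45-54", 104, 43248, 12, 10673),
    ("55-64", 206, 28612, 28,  5710),
    ("65-74", 186, 12663, 28,  2585),
    ("75-84", 102,  5317, 31,  1462),
]

numerator = 0.0
denominator = 0.0
weights = []
for age, a, t1, b, t0 in strata:
    total_time = t1 + t0
    irr = (a / t1) / (b / t0)          # stratum-specific rate ratio
    weight = b * t1 / total_time       # M-H weight for this stratum
    numerator += a * t0 / total_time
    denominator += weight
    weights.append(weight)
    print(f"{age}  IRR = {irr:.6f}  M-H weight = {weight:.6f}")

mh_irr = numerator / denominator

# Crude ratio ignores the stratification entirely.
total_a = sum(s[1] for s in strata)
total_t1 = sum(s[2] for s in strata)
total_b = sum(s[3] for s in strata)
total_t0 = sum(s[4] for s in strata)
crude_irr = (total_a / total_t1) / (total_b / total_t0)

print(f"Crude         {crude_irr:.6f}")
print(f"M-H combined  {mh_irr:.6f}")
```

The combined estimate is a weighted average of the stratum-specific ratios, so the oldest age groups, which carry the largest weights, pull it well below the ratio for the youngest group.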

Rothman and Greenland (1998, 264) obtain the standardized incidence-rate ratio and 90% confidence intervals, weighting each age category by the population of the exposed group, thus producing the standardized mortality ratio (SMR). This calculation can be reproduced by specifying by(age) to indicate that the table is stratified and istandard to request the internally standardized rate ratio:

 . ir deaths smokes pyears, by(age) level(90) istandard

              age |      IRR       [90% Conf. Interval]       Weight
 -----------------+-------------------------------------------------
            35-44 |   5.736638      1.704271   33.61646        52407 (exact)
            45-54 |   2.138812      1.274552   3.813282        43248 (exact)
            55-64 |    1.46824      1.044915   2.110422        28612 (exact)
            65-74 |    1.35606      .9626026   1.953505        12663 (exact)
            75-84 |   .9047304      .6375194   1.305412         5317 (exact)
 -----------------+-------------------------------------------------
            Crude |   1.719823      1.437544     2.0688              (exact)
  I. Standardized |   1.417609      1.186541   1.693676

If we want the externally standardized ratio (weights proportional to the population of the unexposed group), we can substitute estandard for istandard in the command above.
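
The two standardizations differ only in the weights applied to the stratum-specific rates. The Python sketch below illustrates the usual standardized rate ratio, the weighted exposed rate divided by the weighted unexposed rate, using the same data; the function and variable names are our own assumptions, and confidence intervals are not computed.

```python
# Standardized incidence-rate ratios for the Doll and Hill data.
# Each row: (deaths among smokers, smoker person-years,
#            deaths among nonsmokers, nonsmoker person-years)
strata = [
    ( 32, 52407,  2, 18790),
    (104, 43248, 12, 10673),
    (206, 28612, 28,  5710),
    (186, 12663, 28,  2585),
    (102,  5317, 31,  1462),
]

def standardized_irr(rows, weights):
    """Ratio of weighted exposed rates to weighted unexposed rates."""
    exposed = sum(w * a / t1 for w, (a, t1, b, t0) in zip(weights, rows))
    unexposed = sum(w * b / t0 for w, (a, t1, b, t0) in zip(weights, rows))
    return exposed / unexposed

# Internal standardization: weight by person-time of the exposed group.
internal_weights = [t1 for a, t1, b, t0 in strata]
# External standardization: weight by person-time of the unexposed group.
external_weights = [t0 for a, t1, b, t0 in strata]

irr_internal = standardized_irr(strata, internal_weights)
irr_external = standardized_irr(strata, external_weights)

print(f"I. Standardized  {irr_internal:.6f}")
print(f"E. Standardized  {irr_external:.6f}")
```

With internal weights, the numerator collapses to the observed number of deaths among smokers and the denominator to the number expected at nonsmoker rates, which is why the result is an SMR.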

See New in Stata 10 for more about what was added in Stata Release 10.

References

Boice, J. D., and R. R. Monson. 1977.
Breast cancer in women after repeated fluoroscopic examinations of the chest. Journal of the National Cancer Institute 59: 823–832.
Doll, R., and A. B. Hill. 1966.
Mortality of British doctors in relation to smoking: observations on coronary thrombosis. In Epidemiological Approaches to the Study of Cancer and Other Chronic Diseases, ed. W. Haenszel. National Cancer Institute Monograph 19: 205–268.
Rothman, K. J., and S. Greenland. 1998.
Modern Epidemiology. 2nd ed. Philadelphia: Lippincott–Raven.

© Copyright 1996–2008 StataCorp LP