*Preface*

*Author*

**1 Do Storks Bring Babies?**

1.1 Karl Pearson and Spurious Correlation

1.2 Jerzy Neyman, Storks, and Babies

1.3 Is Poisson Regression the Solution to the Stork Problem?

1.4 Further Reading

**2 Risks and Rates**

2.1 What Is a Rate?

2.2 Closed and Open Populations

2.3 Measures of Time

2.4 Numerators for Rates: Counts

2.5 Numerators that May Be Mistaken for Counts

2.6 Prevalence Proportions

2.7 Denominators for Rates: Count Denominators for Incidence Proportions (Risks)

2.8 Denominators for Rates: Person-Time for Incidence Rates

2.9 Rate Numerators and Denominators for Recurrent Events

2.10 Rate Denominators Other than Person-Time

2.11 Different Incidence Rates Tell Different Stories

2.12 Potential Advantages of Incidence Rates Compared with Incidence Proportions (Risks)

2.13 Potential Advantages of Incidence Proportions (Risks) Compared with Incidence Rates

2.14 Limitations of Risks and Rates

2.15 Radioactive Decay: An Example of Exponential Decline

2.16 The Relevance of Exponential Decay to Human Populations

2.17 Relationships between Rates, Risks, and Hazards

2.18 Further Reading

**3 Rate Ratios and Differences**

3.1 Estimated Associations and Causal Effects

3.2 Sources of Bias in Estimates of Causal Effect

3.3 Estimation versus Prediction

3.4 Ratios and Differences for Risks and Rates

3.5 Relationships between Measures of Association in a Closed Population

3.6 The Hypothetical TEXCO Study

3.7 Breaking the Rules: Army Data for Companies A and B

3.8 Relationships between Odds Ratios, Risk Ratios, and Rate Ratios in Case-Control Studies

3.9 Symmetry of Measures of Association

3.10 Convergence Problems for Estimating Associations

3.11 Some History Regarding the Choice between Ratios and Differences

3.12 Other Influences on the Choice between Use of Ratios or Differences

3.13 The Data May Sometimes Be Used to Choose between a Ratio or a Difference

**4 The Poisson Distribution**

4.1 Alpha Particle Radiation

4.2 The Poisson Distribution

4.3 Prussian Soldiers Kicked to Death by Horses

4.4 Variances, Standard Deviations, and Standard Errors for Counts and Rates

4.5 An Example: Mortality from Alzheimer's Disease

4.6 Large Sample P-values for Counts, Rates, and Their Differences using the Wald Statistic

4.7 Comparisons of Rates as Differences versus Ratios

4.8 Large Sample P-values for Counts, Rates, and Their Differences using the Score Statistic

4.9 Large Sample Confidence Intervals for Counts, Rates, and Their Differences

4.10 Large Sample P-values for Counts, Rates, and Their Ratios

4.11 Large Sample Confidence Intervals for Ratios of Counts and Rates

4.12 A Constant Rate Based on More Person-Time Is More Precise

4.13 Exact Methods

4.14 What Is a Poisson Process?

4.15 Simulated Examples

4.16 What If the Data Are Not from a Poisson Process? Part 1, Overdispersion

4.17 What If the Data Are Not from a Poisson Process? Part 2, Underdispersion

4.18 Must Anything Be Rare?

4.19 Bicyclist Deaths in 2010 and 2011

**5 Criticism of Incidence Rates**

5.1 Florence Nightingale, William Farr, and Hospital Mortality Rates. Debate in 1864

5.2 Florence Nightingale, William Farr, and Hospital Mortality Rates. Debate in 1996–1997

5.3 Criticism of Rates in the British Medical Journal in 1995

5.4 Criticism of Incidence Rates in 2009

**6 Stratified Analysis: Standardized Rates**

6.1 Why Standardize?

6.2 External Weights from a Standard Population: Direct Standardization

6.3 Comparing Directly Standardized Rates

6.4 Choice of the Standard Influences the Comparison of Standardized Rates

6.5 Standardized Comparisons versus Adjusted Comparisons from Variance-Minimizing Methods

6.6 Stratified Analyses

6.7 Variations on Directly Standardized Rates

6.8 Internal Weights from a Population: Indirect Standardization

6.9 The Standardized Mortality Ratio (SMR)

6.10 Advantages of SMRs Compared with SRRs (Ratios of Directly Standardized Rates)

6.11 Disadvantages of SMRs Compared with SRRs (Ratios of Directly Standardized Rates)

6.12 The Terminology of Direct and Indirect Standardization

6.13 P-values for Directly Standardized Rates

6.14 Confidence Intervals for Directly Standardized Rates

6.15 P-values and CIs for SRRs (Ratios of Directly Standardized Rates)

6.16 Large Sample P-values and CIs for SMRs

6.17 Small Sample P-values and CIs for SMRs

6.18 Standardized Rates Should Not Be Used as Regression Outcomes

6.19 Standardization Is Not Always the Best Choice

**7 Stratified Analysis: Inverse-Variance and Mantel-Haenszel Methods**

7.1 Inverse-Variance Methods

7.2 Inverse-Variance Analysis of Rate Ratios

7.3 Inverse-Variance Analysis of Rate Differences

7.4 Choosing between Rate Ratios and Differences

7.5 Mantel-Haenszel Methods

7.6 Mantel-Haenszel Analysis of Rate Ratios

7.7 Mantel-Haenszel Analysis of Rate Differences

7.8 P-values for Stratified Rate Ratios or Differences

7.9 Analysis of Sparse Data

7.10 Maximum-Likelihood Stratified Methods

7.11 Stratified Methods versus Regression

**8 Collapsibility and Confounding**

8.1 What Is Collapsibility?

8.2 The British X-Trial: Introducing Variation in Risk

8.3 Rate Ratios and Differences Are Noncollapsible because Exposure Influences Person-Time

8.4 Which Estimate of the Rate Ratio Should We Prefer?

8.5 Behavior of Risk Ratios and Differences

8.6 Hazard Ratios and Odds Ratios

8.7 Comparing Risks with Other Outcome Measures

8.8 The Italian X-Trial: 3-Levels of Risk under No Exposure

8.9 The American X-Cohort Study: 3-Levels of Risk in a Cohort Study

8.10 The Swedish X-Cohort Study: A Collapsible Risk Ratio in Confounded Data

8.11 A Summary of Findings

8.12 A Different View of Collapsibility

8.13 Practical Implications: Avoid Common Outcomes

8.14 Practical Implications: Use Risks or Survival Functions

8.15 Practical Implications: Case-Control Studies

8.16 Practical Implications: Uniform Risk

8.17 Practical Implications: Use All Events

**9 Poisson Regression for Rate Ratios**

9.1 The Poisson Regression Model for Rate Ratios

9.2 A Short Comparison with Ordinary Linear Regression

9.3 A Poisson Model without Variables

9.4 A Poisson Regression Model with One Explanatory Variable

9.5 The Iteration Log

9.6 The Header Information above the Table of Estimates

9.7 Using a Generalized Linear Model to Estimate Rate Ratios

9.8 A Regression Example: Studying Rates over Time

9.9 An Alternative Parameterization for Poisson Models: A Regression Trick

9.10 Further Comments about Person-Time

9.11 A Short Summary

**10 Poisson Regression for Rate Differences**

10.1 A Regression Model for Rate Differences

10.2 Florida and Alaska Cancer Mortality: Regression Models that Fail

10.3 Florida and Alaska Cancer Mortality: Regression Models that Succeed

10.4 A Generalized Linear Model with a Power Link

10.5 A Caution

**11 Linear Regression**

11.1 Limitations of Ordinary Least Squares Linear Regression

11.2 Florida and Alaska Cancer Mortality Rates

11.3 Weighted Least Squares Linear Regression

11.4 Importance Weights for Weighted Least Squares Linear Regression

11.5 Comparison of Poisson, Weighted Least Squares, and Ordinary Least Squares Regression

11.6 Exposure to a Carcinogen: Ordinary Linear Regression Ignores the Precision of Each Rate

11.7 Differences in Homicide Rates: Simple Averages versus Population-Weighted Averages

11.8 The Place of Ordinary Least Squares Linear Regression for the Analysis of Incidence Rates

11.9 Variance Weighted Least Squares Regression

11.10 Cautions regarding Inverse-Variance Weights

11.11 Why Use Variance Weighted Least Squares?

11.12 A Short Comparison of Weighted Poisson Regression, Variance Weighted Least Squares, and Weighted Linear Regression

11.13 Problems When Age-Standardized Rates Are Used as Outcomes

11.14 Ratios and Spurious Correlation

11.15 Linear Regression with ln(Rate) as the Outcome

11.16 Predicting Negative Rates

11.17 Summary

**12 Model Fit**

12.1 Tabular and Graphic Displays

12.2 Goodness of Fit Tests: Deviance and Pearson Statistics

12.3 A Conditional Moment Chi-Squared Test of Fit

12.4 Limitations of Goodness-of-Fit Statistics

12.5 Measures of Dispersion

12.6 Robust Variance Estimator as a Test of Fit

12.7 Comparing Models using the Deviance

12.8 Comparing Models using Akaike and Bayesian Information Criterion

12.9 Example 1: Using Stata's Generalized Linear Model Command to Decide between a Rate Ratio or a Rate Difference Model for the Randomized Controlled Trial of Exercise and Falls

12.10 Example 2: A Rate Ratio or a Rate Difference Model for Hypothetical Data Regarding the Association between Fall Rates and Age

12.11 A Test of the Model Link

12.12 Residuals, Influence Analysis, and Other Measures

12.13 Adding Model Terms to Improve Fit

12.14 A Caution

12.15 Further Reading

**13 Adjusting Standard Errors and Confidence Intervals**

13.1 Estimating the Variance without Regression

13.2 Poisson Regression

13.3 Rescaling the Variance using the Pearson Dispersion Statistic

13.4 Robust Variance

13.5 Generalized Estimating Equations

13.6 Using the Robust Variance to Study Length of Hospital Stay

13.7 Computer Intensive Methods

13.8 The Bootstrap Idea

13.9 The Bootstrap Normal Method

13.10 The Bootstrap Percentile Method

13.11 The Bootstrap Bias-Corrected Percentile Method

13.12 The Bootstrap Bias-Corrected and Accelerated Method

13.13 The Bootstrap-T Method

13.14 Which Bootstrap CI Is Best?

13.15 Permutation and Randomization

13.16 Randomization to Nearly Equal Groups

13.17 Better Randomization Using the Randomized Block Design of the Original Study

13.18 A Summary

**14 Storks and Babies, Revisited**

14.1 Neyman's Approach to His Data

14.2 Using Methods for Incidence Rates

14.3 A Model That Uses the Stork/Women Ratio

**15 Flexible Treatment of Continuous Variables**

15.1 The Problem

15.2 Quadratic Splines

15.3 Fractional Polynomials

15.4 Flexible Adjustment for Time

15.5 Which Method Is Best?

**16 Variation in Size of an Association**

16.1 An Example: Shoes and Falls

16.2 Problem 1: Using Subgroup P-values for Interpretation

16.3 Problem 2: Failure to Include Main Effect Terms When Interaction Terms Are Used

16.4 Problem 3: Incorrectly Concluding that There Is No Variation in Association

16.5 Problem 4: Interaction May Be Present on a Ratio Scale but Not on a Difference Scale, and Vice Versa

16.6 Problem 5: Failure to Report All Subgroup Estimates in an Evenhanded Manner

**17 Negative Binomial Regression**

17.1 Negative Binomial Regression Is a Random Effects or Mixed Model

17.2 An Example: Accidents among Workers in a Munitions Factory

17.3 Introducing Equal Person-Time in the Homicide Data

17.4 Letting Person-Time Vary in the Homicide Data

17.5 Estimating a Rate Ratio for the Homicide Data

17.6 Another Example using Hypothetical Data for Five Regions

17.7 Unobserved Heterogeneity

17.8 Observing Heterogeneity in the Shoe Data

17.9 Underdispersion

17.10 A Rate Difference Negative Binomial Regression Model

17.11 Conclusion

**18 Clustered Data**

18.1 Data from 24 Fictitious Nursing Homes

18.2 Results from 10,000 Data Simulations for the Nursing Homes

18.3 A Single Random Set of Data for the Nursing Homes

18.4 Variance Adjustment Methods

18.5 Generalized Estimating Equations (GEE)

18.6 Mixed Model Methods

18.7 What Do Mixed Models Estimate?

18.8 Mixed Model Estimates for the Nursing Home Intervention

18.9 Simulation Results for Some Mixed Models

18.10 Mixed Models Weight Observations Differently than Poisson Regression

18.11 Which Should We Prefer for Clustered Data, Variance-Adjusted or Mixed Models?

18.12 Additional Model Commands for Clustered Data

18.13 Further Reading

**19 Longitudinal Data**

19.1 Just Use Rates

19.2 Using Rates to Evaluate Governmental Policies

19.3 Study Designs for Governmental Policies

19.4 A Fictitious Water Treatment and U.S. Mortality 1999–2013

19.5 Poisson Regression

19.6 Population-Averaged Estimates (GEE)

19.7 Conditional Poisson Regression, a Fixed-Effects Approach

19.8 Negative Binomial Regression

19.9 Which Method Is Best?

19.10 Water Treatment in Only 10 States

19.11 Conditional Poisson Regression for the 10-State Water Treatment Data

19.12 A Published Study

**20 Matched Data**

20.1 Matching in Case-Control Studies

20.2 Matching in Randomized Controlled Trials

20.3 Matching in Cohort Studies

20.4 Matching to Control Confounding in Some Randomized Trials and Cohort Studies

20.5 A Benefit of Matching; Only Matched Sets with at Least One Outcome Are Needed

20.6 Study Designs that Match a Person to Themselves

20.7 A Matched Analysis Can Account for Matching Ratios that Are Not Constant

20.8 Choosing between Risks and Rates for the Crash Data in Tables 20.1 and 20.2

20.9 Stratified Methods for Estimating Risk Ratios for Matched Data

20.10 Odds Ratios, Risk Ratios, Cell A, and Matched Data

20.11 Regression Analysis of Matched Data for the Odds Ratio

20.12 Regression Analysis of Matched Data for the Risk Ratio

20.13 Matched Analysis of Rates with One Outcome Event

20.14 Matched Analysis of Rates for Recurrent Events

20.15 The Randomized Trial of Exercise and Falls; Additional Analyses

20.16 Final Words

**21 Marginal Methods**

21.1 What Are Margins?

21.2 Converting Logistic Regression Results into Risk Ratios or Risk Differences: Marginal Standardization

21.3 Estimating a Rate Difference from a Rate Ratio Model

21.4 Death by Age and Sex: A Short Example

21.5 Skunk Bite Data: A Long Example

21.6 Obtaining the Rate Difference: Crude Rates

21.7 Using the Robust Variance

21.8 Adjusting for Age

21.9 Full Adjustment for Age and Sex

21.10 Marginal Commands for Interactions

21.11 Marginal Methods for a Continuous Variable

21.12 Using a Rate Difference Model to Estimate a Rate Ratio: Use the ln Scale

**22 Bayesian Methods**

22.1 Cancer Mortality Rate in Alaska

22.2 The Rate Ratio for Falling in a Trial of Exercise

**23 Exact Poisson Regression**

23.1 A Simple Example

23.2 A Perfectly Predicted Outcome

23.3 Memory Problems

23.4 A Caveat

**24 Instrumental Variables**

24.1 The Problem: What Does a Randomized Controlled Trial Estimate?

24.2 Analysis by Treatment Received May Yield Biased Estimates of Treatment Effect

24.3 Using an Instrumental Variable

24.4 Two-Stage Linear Regression for Instrumental Variables

24.5 Generalized Method of Moments

24.6 Generalized Method of Moments for Rates

24.7 What Does an Instrumental Variable Analysis Estimate?

24.8 There Is No Free Lunch

24.9 Final Comments

**25 Hazards**

25.1 Data for a Hypothetical Treatment with Exponential Survival Times

25.2 Poisson Regression and Exponential Proportional Hazards Regression

25.3 Poisson and Cox Proportional Hazards Regression

25.4 Hypothetical Data for a Rate that Changes over Time

25.5 A Piecewise Poisson Model

25.6 A More Flexible Poisson Model: Quadratic Splines

25.7 Another Flexible Poisson Model: Restricted Cubic Splines

25.8 Flexibility with Fractional Polynomials

25.9 When Should a Poisson Model Be Used? Randomized Trial of a Terrible Treatment

25.10 A Real Randomized Trial, the PLCO Screening Trial

25.11 What If Events Are Common?

25.12 Cox Model or a Flexible Parametric Model?

25.13 Collapsibility and Survival Functions

25.14 Relaxing the Assumption of Proportional Hazards in the Cox Model

25.15 Relaxing the Assumption of Proportional Hazards for the Poisson Model

25.16 Relaxing Proportional Hazards for the Royston-Parmar Model

25.17 The Life Expectancy Difference or Ratio

25.18 Recurrent or Multiple Events

25.19 A Short Summary

*Bibliography*

*Index*