Regression Models as a Tool in Medical Research

Werner Vach
Publisher: Chapman & Hall/CRC
Copyright: 2013
ISBN-13: 978-1-4665-1748-6
Pages: 473; hardcover
Price: $79.50
Supplements: Datasets and solutions to exercises

Comment from the Stata technical group

Regression Models as a Tool in Medical Research, by Werner Vach, is a practical guide to regression analysis for medical researchers. It describes the important aspects of regression models for continuous, binary, survival, and count outcomes—all commonly encountered in medical research. The regression models covered include linear regression, logistic regression, Cox regression, and Poisson regression. The book also discusses methods to handle different types of data structures such as matched case–control data and longitudinal data. The “hands-on” examples reinforce the concepts described in each chapter, and the “in-a-nutshell” summaries after each chapter provide a quick refresher of the topics covered.

The book has five parts. The first part covers the basic concepts of the linear, logistic, and Cox regressions commonly used to analyze medical data. The second part discusses more advanced topics such as modeling of nonlinear effects and analysis of longitudinal and clustered data, as well as sample-size and power considerations when designing a study. The third part concentrates on prediction, and the fourth part briefly covers some alternatives to regression modeling. Finally, the fifth part provides mathematical details behind the main regression concepts.

The numerical examples and graphs are produced with Stata; all datasets used in the examples and solutions to all exercises are available at

Table of contents

About the Author
I The Basics
1 Why Use Regression Models?
1.1 Why Use Simple Regression Models?
1.2 Why Use Multiple Regression Models?
1.3 Some Basic Notation
2 An Introductory Example
2.1 A Single Line Model
2.2 Fitting a Single Line Model
2.3 Taking Uncertainty into Account
2.4 A Two-Line Model
2.5 How to Perform These Steps with Stata
2.6 Exercise 5-HIAA and Serotonin
2.7 Exercise Haemoglobin
2.8 Exercise Scaling of Variables
3 The Classical Multiple Regression Model
4 Adjusted Effects
4.1 Adjusting for Confounding
4.2 Adjusting for Imbalances
4.3 Exercise Physical Activity in Schoolchildren
5 Inference for the Classical Multiple Regression Model
5.1 The Traditional and the Modern Way of Inference
5.2 How to Perform the Modern Way of Inference with Stata
5.3 How Valid and Good are Least Squares Estimates?
5.4 A Note on the Use and Interpretation of p-Values in Regression Analyses
6 Logistic Regression
6.1 The Definition of the Logistic Regression Model
6.2 Analysing a Dose Response Experiment by Logistic Regression
6.3 How to Fit a Dose Response Model with Stata
6.4 Estimating Odds Ratios and Adjusted Odds Ratios Using Logistic Regression
6.5 How to Compute (Adjusted) Odds Ratios Using Logistic Regression in Stata
6.6 Exercise Allergy in Children
6.7 More on Logit Scale and Odds Scale
7 Inference for the Logistic Regression Model
7.1 The Maximum Likelihood Principle
7.2 Properties of the ML Estimates for Logistic Regression
7.3 Inference for a Single Regression Parameter
7.4 How to Perform Wald Tests and Likelihood Ratio Tests in Stata
8 Categorical Covariates
8.1 Incorporating Categorical Covariates in a Regression Model
8.2 Some Technicalities in Using Categorical Covariates
8.3 Testing the Effect of a Categorical Covariate
8.4 The Handling of Categorical Covariates in Stata
8.5 Presenting Results of a Regression Analysis Involving Categorical Covariates in a Table
8.6 Exercise Physical Occupation and Back Pain
8.7 Exercise Odds Ratios and Categorical Covariates
9 Handling Ordered Categories: A First Lesson in Regression Modelling Strategies
10 The Cox Proportional Hazards Model
10.1 Modelling the Risk of Dying
10.2 Modelling the Risk of Dying in Continuous Time
10.3 Using the Cox Proportional Hazards Model to Quantify the Difference in Survival Between Groups
10.4 How to Fit a Cox Proportional Hazards Model with Stata
10.5 Exercise Prognostic Factors in Breast Cancer Patients—Part 1
11 Common Pitfalls in Using Regression Models
11.1 Association versus Causation
11.2 Difference between Subjects versus Difference within Subjects
11.3 Real-World Models versus Statistical Models
11.4 Relevance versus Significance
11.5 Exercise Prognostic Factors in Breast Cancer Patients—Part 2
II Advanced Topics and Techniques
12 Some Useful Technicalities
12.1 Illustrating Models by Using Model-Based Predictions
12.2 How to Work with Predictions in Stata
12.3 Residuals and the Standard Deviation of the Error Term
12.4 Working with Residuals and the RMSE in Stata
12.5 Linear and Nonlinear Functions of Regression Parameters
12.6 Transformations of Regression Parameters
12.7 Centering of Covariate Values
12.8 Exercise Paternal Smoking versus Maternal Smoking
13 Comparing Regression Coefficients
13.1 Comparing Regression Coefficients among Continuous Covariates
13.2 Comparing Regression Coefficients among Binary Covariates
13.3 Measuring the Impact of Changing Covariate Values
13.4 Translating Regression Coefficients
13.5 How to Compare Regression Coefficients in Stata
13.6 Exercise Health in Young People
14 Power and Sample Size
14.1 The Power of a Regression Analysis
14.2 Determinants of Power in Regression Models with a Single Covariate
14.3 Determinants of Power in Regression Models with Several Covariates
14.4 Power and Sample Size Calculations When a Sample from the Covariate Distribution Is Given
14.5 Power and Sample Size Calculations Given a Sample from the Covariate Distribution with Stata
14.6 The Choice of the Values of the Regression Parameters in a Simulation Study
14.7 Simulating a Covariate Distribution
14.8 Simulating a Covariate Distribution with Stata
14.9 Choosing the Parameters to Simulate a Covariate Distribution
14.10 Necessary Sample Sizes to Justify Asymptotic Methods
14.11 Exercise Power Considerations for a Study on Neck Pain
14.12 Exercise Choosing between Two Outcomes
15 Selection of the Sample
15.1 Selection in Dependence on the Covariates
15.2 Selection in Dependence on the Outcome
15.3 Sampling in Dependence on Covariate Values
16 Selection of Covariates
16.1 Fitting Regression Models with Correlated Covariates
16.2 The “Adjustment versus Power” Dilemma
16.3 The “Adjustment Makes Effects Small” Dilemma
16.4 Adjusting for Mediators
16.5 Adjusting for Confounding — A Useful Academic Game
16.6 Adjusting for Correlated Confounders
16.7 Including Predictive Covariates
16.8 Automatic Variable Selection
16.9 How to Choose Relevant Sets of Covariates
16.10 Preparing the Selection of Covariates: Analysing the Association Among Covariates
16.11 Preparing the Selection of Covariates: Univariate Analyses?
16.12 Exercise Vocabulary Size in Young Children—Part 1
16.13 Preprocessing of the Covariate Space
16.14 How to Preprocess the Covariate Space with Stata
16.15 Exercise Vocabulary Size in Young Children—Part 2
16.16 What Is a Confounder?
17 Modelling Nonlinear Effects
17.1 Quadratic Regression
17.2 Polynomial Regression
17.3 Splines
17.4 Fractional Polynomials
17.5 Gain in Power by Modelling Nonlinear Effects?
17.6 Demonstrating the Effect of a Covariate
17.7 Demonstrating a Nonlinear Effect
17.8 Describing the Shape of a Nonlinear Effect
17.9 Detecting Nonlinearity by Analysis of Residuals
17.10 Judging of Nonlinearity May Require Adjustment
17.11 How to Model Nonlinear Effects in Stata
17.12 The Impact of Ignoring Nonlinearity
17.13 Modelling the Nonlinear Effect of Confounders
17.14 Nonlinear Models
17.15 Exercise Serum Markers for AMI
18 Transformation of Covariates
18.1 Transformations to Obtain a Linear Relationship
18.2 Transformation of Skewed Covariates
18.3 To Categorise or Not to Categorise
19 Effect Modification and Interactions
19.1 Modelling Effect Modification
19.2 Adjusted Effect Modifications
19.3 Interactions
19.4 Modelling Effect Modifications in Several Covariates
19.5 The Effect of a Covariate in the Presence of Interactions
19.6 Interactions as Deviations from Additivity
19.7 Scales and Interactions
19.8 Ceiling Effects and Interactions
19.9 Hunting for Interactions
19.10 How to Analyse Effect Modification and Interactions with Stata
19.11 Exercise Treatment Interactions in a Randomised Clinical Trial for the Treatment of Malignant Glioma
20 Applying Regression Models to Clustered Data
20.1 Why Clustered Data Can Invalidate Inference
20.2 Robust Standard Errors
20.3 Improving the Efficiency
20.4 Within- and Between-Cluster Effects
20.5 Some Unusual but Useful Usages of Robust Standard Errors in Clustered Data
20.6 How to Take Clustering into Account in Stata
21 Applying Regression Models to Longitudinal Data
21.1 Analysing Time Trends in the Outcome
21.2 Analysing Time Trends in the Effect of Covariates
21.3 Analysing the Effect of Covariates
21.4 Analysing Individual Variation in Time Trends
21.5 Analysing Summary Measures
21.6 Analysing the Effect of Change
21.7 How to Perform Regression Modelling of Longitudinal Data in Stata
21.8 Exercise Increase of Body Fat in Adolescents
22 The Impact of Measurement Error
22.1 The Impact of Systematic and Random Measurement Error
22.2 The Impact of Misclassification
22.3 The Impact of Measurement Error in Confounders
22.4 The Impact of Differential Misclassification and Measurement Error
22.5 Studying the Measurement Error
22.6 Exercise Measurement Error and Interactions
23 The Impact of Incomplete Covariate Data
23.1 Missing Value Mechanisms
23.2 Properties of a Complete Case Analysis
23.3 Bias Due to Using ad hoc Methods
23.4 Advanced Techniques to Handle Incomplete Covariate Data
23.5 Handling of Partially Defined Covariates
III Risk Scores and Predictors
24 Risk Scores
24.1 What Is a Risk Score?
24.2 Judging the Usefulness of a Risk Score
24.3 The Precision of Risk Score Values
24.4 The Overall Precision of a Risk Score
24.5 Using Stata’s predict Command to Compute Risk Scores
24.6 Categorisation of Risk Scores
24.7 Exercise Computing Risk Scores for Breast Cancer Patients
25 Construction of Predictors
25.1 From Risk Scores to Predictors
25.2 Predictions and Prediction Intervals for a Continuous Outcome
25.3 Predictions for a Binary Outcome
25.4 Construction of Predictions for Time-to-Event Data
25.5 How to Construct Predictions with Stata
25.6 The Overall Precision of a Predictor
26 Evaluating the Predictive Performance
26.1 The Predictive Performance of an Existing Predictor
26.2 How to Assess the Predictive Performance of an Existing Predictor in Stata
26.3 Estimating the Predictive Performance of a New Predictor
26.4 How to Assess the Predictive Performance via Cross-Validation in Stata
26.5 Exercise Assessing the Predictive Performance of a Prognostic Score in Breast Cancer Patients
27 Outlook: Construction of Parsimonious Predictors
IV Miscellaneous
28 Alternatives to Regression Modelling
28.1 Stratification
28.2 Measures of Association: Correlation Coefficients
28.3 Measures of Association: The Odds Ratio
28.4 Propensity Scores
28.5 Classification and Regression Trees
29 Specific Regression Models
29.1 Probit Regression for Binary Outcomes
29.2 Generalised Linear Models
29.3 Regression Models for Count Data
29.4 Regression Models for Ordinal Outcome Data
29.5 Quantile Regression and Robust Regression
29.6 ANOVA and Regression
30 Specific Usages of Regression Models
30.1 Logistic Regression for the Analysis of Case-Control Studies
30.2 Logistic Regression for the Analysis of Matched Case-Control Studies
30.3 Adjusting for Baseline Values in Randomised Clinical Trials
30.4 Assessing Predictive Factors
30.5 Incorporating Time-Varying Covariates in a Cox Model
30.6 Time-Dependent Effects in a Cox Model
30.7 Using the Cox Model in the Presence of Competing Risks
30.8 Using the Cox Model to Analyse Multi-State Models
31 What Is a Good Model?
31.1 Does the Model Fit the Data?
31.2 How Good Are Predictions?
31.3 Explained Variation
31.4 Goodness of Fit
31.5 Model Stability
31.6 The Usefulness of a Model
32 Final Remarks on the Role of Prespecified Models and Model Development
V Mathematical Details
A Mathematics Behind the Classical Linear Regression Model
A.1 Computing Regression Parameters in Simple Linear Regression
A.2 Computing Regression Parameters in the Classical Multiple Regression Model
A.3 Estimation of the Standard Error
A.4 Construction of Confidence Intervals and p-Values
B Mathematics Behind the Logistic Regression Model
B.1 The Least Squares Principle as a Maximum Likelihood Principle
B.2 Maximising the Likelihood of a Logistic Regression Model
B.3 Estimating the Standard Error of the ML Estimates
B.4 Testing Composite Hypotheses
C The Modern Way of Inference
C.1 Robust Estimation of Standard Errors
C.2 Robust Estimation of Standard Errors in the Presence of Clustering
D Mathematics for Risk Scores and Predictors
D.1 Computing Individual Survival Probabilities after Fitting a Cox Model
D.2 Standard Errors for Risk Scores
D.3 The Delta Rule