Multivariable Model-building: A Pragmatic Approach to Regression Analysis Based on Fractional Polynomials for Modelling Continuous Variables

$98.75 each


Review of this book from the Stata Journal


Comment from the Stata technical group

Selecting the appropriate model from among a large class of candidate models is a difficult process: one must balance the (sometimes contradictory) goals of model interpretability, parsimony, good prediction properties, robustness to minor variations in the data, and applicability to other data. This text presents a well-rounded, practical approach to model selection. Most of it is devoted to general variable selection, through stepwise procedures or otherwise, and to the selection of functional forms for continuous variables. On functional forms, the authors pay particular attention to fractional polynomials and splines, drawing on their extensive research in these areas. Those looking for a tutorial on the use of fractional polynomials will find this text very useful. The methods prescribed can be applied widely, yet the examples are primarily from the health sciences, and the models typically used are logistic regression, Cox regression, and generalized linear models.

Table of contents

Preface
1 Introduction
1.1 Real-Life Problems as Motivation for Model Building
1.1.1 Many Candidate Models
1.1.2 Functional Form for Continuous Predictors
1.1.3 Example 1: Continuous Response
1.1.4 Example 2: Multivariable Model for Survival Data
1.2 Issues in Modelling Continuous Predictors
1.2.1 Effects of Assumptions
1.2.2 Global versus Local Influence Models
1.2.3 Disadvantages of Fractional Polynomial Modelling
1.2.4 Controlling Model Complexity
1.3 Types of Regression Model Considered
1.3.1 Normal-Errors Regression
1.3.2 Logistic Regression
1.3.3 Cox Regression
1.3.4 Generalized Linear Models
1.3.5 Linear and Additive Predictors
1.4 Role of Residuals
1.4.1 Uses of Residuals
1.4.2 Graphical Analysis of Residuals
1.5 Role of Subject-Matter Knowledge in Model Development
1.6 Scope of Model Building in our Book
1.7 Modelling Preferences
1.7.1 General Issues
1.7.2 Criteria for a Good Model
1.7.3 Personal Preferences
1.8 General Notation
2 Selection of Variables
2.1 Introduction
2.2 Background
2.3 Preliminaries for a Multivariable Analysis
2.4 Aims of Multivariable Models
2.5 Prediction: Summary Statistics and Comparisons
2.6 Procedures for Selecting Variables
2.6.1 Strength of Predictors
2.6.2 Stepwise Procedures
2.6.3 All-Subsets Model Selection Using Information Criteria
2.6.4 Further Considerations
2.7 Comparison of Selection Strategies in Examples
2.7.1 Myeloma Study
2.7.2 Educational Body-Fat Data
2.7.3 Glioma Study
2.8 Selection and Shrinkage
2.8.1 Selection Bias
2.8.2 Simulation Study
2.8.3 Shrinkage to Correct for Selection Bias
2.8.4 Postestimation Shrinkage
2.8.5 Reducing Selection Bias
2.8.6 Example
2.9 Discussion
2.9.1 Model Building in Small Datasets
2.9.2 Full, Prespecified or Selected Model?
2.9.3 Comparison of Selection Procedures
2.9.4 Complexity, Stability and Interpretability
2.9.5 Conclusions and Outlook
3 Handling Categorical and Continuous Predictors
3.1 Introduction
3.2 Types of Predictor
3.2.1 Binary
3.2.2 Nominal
3.2.3 Ordinal, Counting, Continuous
3.2.4 Derived
3.3 Handling Ordinal Predictors
3.3.1 Coding Schemes
3.3.2 Effect of Coding Schemes on Variable Selection
3.4 Handling Counting and Continuous Predictors: Categorization
3.4.1 ‘Optimal’ Cutpoints: A Dangerous Analysis
3.4.2 Other Ways of Choosing a Cutpoint
3.5 Example: Issues in Model Building with Categorized Variables
3.5.1 One Ordinal Variable
3.5.2 Several Ordinal Variables
3.6 Handling Counting and Continuous Predictors: Functional Form
3.6.1 Beyond Linearity
3.6.2 Does Nonlinearity Matter?
3.6.3 Simple versus Complex Functions
3.6.4 Interpretability and Transportability
3.7 Empirical Curve Fitting
3.7.1 General Approaches to Smoothing
3.7.2 Critique of Local and Global Influence Models
3.8 Discussion
3.8.1 Sparse Categories
3.8.2 Choice of Coding Scheme
3.8.3 Categorizing Continuous Variables
3.8.4 Handling Continuous Variables
4 Fractional Polynomials for One Variable
4.1 Introduction
4.2 Background
4.2.1 Genesis
4.2.2 Types of Model
4.2.3 Relation to Box–Tidwell and Exponential Functions
4.3 Definition and Notation
4.3.1 Fractional Polynomials
4.3.2 First Derivative
4.4 Characteristics
4.4.1 FP1 and FP2 Functions
4.4.2 Maximum or Minimum of a FP2 Function
4.5 Examples of Curve Shapes with FP1 and FP2 Functions
4.6 Choice of Powers
4.7 Choice of Origin
4.8 Model Fitting and Estimation
4.9 Inference
4.9.1 Hypothesis Testing
4.9.2 Interval Estimation
4.10 Function Selection Procedure
4.10.1 Choice of Default Function
4.10.2 Closed Test Procedure for Function Selection
4.10.3 Example
4.10.4 Sequential Procedure
4.10.5 Type I Error and Power of the Function Selection Procedure
4.11 Scaling and Centering
4.11.1 Computational Aspects
4.11.2 Examples
4.12 FP Powers as Approximations to Continuous Powers
4.12.1 Box–Tidwell and Fractional Polynomial Models
4.12.2 Example
4.13 Presentation of Fractional Polynomial Functions
4.13.1 Graphical
4.13.2 Tabular
4.14 Worked Example
4.14.1 Details of all Fractional Polynomial Models
4.14.2 Function Selection
4.14.3 Details of the Fitted Model
4.14.4 Standard Error of a Fitted Value
4.14.5 Fitted Odds Ratio and its Confidence Interval
4.15 Modelling Covariates with a Spike at Zero
4.16 Power of Fractional Polynomial Analysis
4.16.1 Underlying Function Linear
4.16.2 Underlying Function FP1 or FP2
4.16.3 Comment
4.17 Discussion
5 Some Issues with Univariate Fractional Polynomial Models
5.1 Introduction
5.2 Susceptibility to Influential Covariate Observations
5.3 A Diagnostic Plot for Influential Points in FP Models
5.3.1 Example 1: Educational Body-Fat Data
5.3.2 Example 2: Primary Biliary Cirrhosis Data
5.4 Dependence on Choice of Origin
5.5 Improving Robustness by Preliminary Transformation
5.5.1 Example 1: Educational Body-Fat Data
5.5.2 Example 2: PBC Data
5.5.3 Practical Use of the Pretransformation g_{δ}(x)
5.6 Improving Fit by Preliminary Transformation
5.6.1 Lack of Fit of Fractional Polynomial Models
5.6.2 Negative Exponential Pretransformation
5.7 Higher Order Fractional Polynomials
5.7.1 Example 1: Nerve Conduction Data
5.7.2 Example 2: Triceps Skinfold Thickness
5.8 When Fractional Polynomial Models are Unsuitable
5.8.1 Not all Curves are Fractional Polynomials
5.8.2 Example: Kidney Cancer
5.9 Discussion
6 MFP: Multivariable Model-building with Fractional Polynomials
6.1 Introduction
6.2 Motivation
6.3 The MFP Algorithm
6.3.1 Remarks
6.3.2 Example
6.4 Presenting the Model
6.4.1 Parameter Estimates
6.4.2 Function Plots
6.4.3 Effect Estimates
6.5 Model Criticism
6.5.1 Function Plots
6.5.2 Graphical Analysis of Residuals
6.5.3 Assessing Fit by Adding More Complex Functions
6.5.4 Consistency with Subject-Matter Knowledge
6.6 Further Topics
6.6.1 Interval Estimation
6.6.2 Importance of the Nominal Significance Level
6.6.3 The Full MFP Model
6.6.4 A Single Predictor of Interest
6.6.5 Contribution of Individual Variables to the Model Fit
6.6.6 Predictive Value of Additional Variables
6.7 Further Examples
6.7.1 Example 1: Oral Cancer
6.7.2 Example 2: Diabetes
6.7.3 Example 3: Whitehall I
6.8 Simple Versus Complex Fractional Polynomial Models
6.8.1 Complexity and Modelling Aims
6.8.2 Example: GBSG Breast Cancer Data
6.9 Discussion
6.9.1 Philosophy of MFP
6.9.2 Function Complexity, Sample Size and Subject-Matter Knowledge
6.9.3 Improving Robustness by Preliminary Covariate Transformation
6.9.4 Conclusion and Future
7 Interactions
7.1 Introduction
7.2 Background
7.3 General Considerations
7.3.1 Effect of Type of Predictor
7.3.2 Power
7.3.3 Randomized Trials and Observational Studies
7.3.4 Predefined Hypothesis or Hypothesis Generation
7.3.5 Interactions Caused by Mismodelling Main Effects
7.3.6 The ‘Treatment–Effect’ Plot
7.3.7 Graphical Checks, Sensitivity and Stability Analyses
7.3.8 Cautious Interpretation is Essential
7.4 The MFPI Procedure
7.4.1 Model Simplifications
7.4.2 Check of the Results and Sensitivity Analysis
7.5 Example 1: Advanced Prostate Cancer
7.5.1 The Fitted Model
7.5.2 Check of the Interactions
7.5.3 Final Model
7.5.4 Further Comments and Interpretation
7.5.5 FP Model Simplification
7.6 Example 2: GBSG Breast Cancer Study
7.6.1 Oestrogen Receptor Positivity as a Predictive Factor
7.6.2 A Predefined Hypothesis: Tamoxifen–Oestrogen Receptor Interaction
7.7 Categorization
7.7.1 Interaction with Categorized Variables
7.7.2 Example: GBSG Study
7.8 STEPP
7.9 Example 3: Comparison of STEPP with MFPI
7.9.1 Interaction in the Kidney Cancer Data
7.9.2 Stability Investigation
7.10 Comment on Type I Error of MFPI
7.11 Continuous-by-Continuous Interactions
7.11.1 Mismodelling May Induce Interaction
7.11.2 MFPIgen: An FP Procedure to Investigate Interactions
7.11.3 Examples of MFPIgen
7.11.4 Graphical Presentation of Continuous-by-Continuous Interactions
7.11.5 Summary
7.12 Multi-Category Variables
7.13 Discussion
8 Model Stability
8.1 Introduction
8.2 Background
8.3 Using the Bootstrap to Explore Model Stability
8.3.1 Selection of Variables Within a Bootstrap Sample
8.3.2 The Bootstrap Inclusion Frequency and the Importance of a Variable
8.4 Example 1: Glioma Data
8.5 Example 2: Educational Body-Fat Data
8.5.1 Effect of Influential Observations on Model Selection
8.6 Example 3: Breast Cancer Diagnosis
8.7 Model Stability for Functions
8.7.1 Summarizing Variation between Curves
8.7.2 Measures of Curve Instability
8.8 Example 4: GBSG Breast Cancer Data
8.8.1 Interdependencies among Selected Variables and Functions in Subsets
8.8.2 Plots of Functions
8.8.3 Instability Measures
8.8.4 Stability of Functions Depending on Other Variables Included
8.9 Discussion
8.9.1 Relationship between Inclusion Fractions
8.9.2 Stability of Functions
9 Some Comparisons of MFP with Splines
9.1 Introduction
9.2 Background
9.3 MVRS: A Procedure for Model Building with Regression Splines
9.3.1 Restricted Cubic Spline Functions
9.3.2 Function Selection Procedure for Restricted Cubic Splines
9.3.3 The MVRS Algorithm
9.4 MVSS: A Procedure for Model Building with Cubic Smoothing Splines
9.4.1 Cubic Smoothing Splines
9.4.2 Function Selection Procedure for Cubic Smoothing Splines
9.4.3 The MVSS Algorithm
9.5 Example 1: Boston Housing Data
9.5.1 Effect of Reducing the Sample Size
9.5.2 Comparing Predictors
9.6 Example 2: GBSG Breast Cancer Study
9.7 Example 3: Pima Indians
9.8 Example 4: PBC
9.9 Discussion
9.9.1 Splines in General
9.9.2 Complexity of Functions
9.9.3 Optimal Fit or Transferability?
9.9.4 Reporting of Selected Models
9.9.5 Conclusion
10 How to Work with MFP
10.1 Introduction
10.2 The Dataset
10.3 Univariate Analyses
10.4 MFP Analysis
10.5 Model Criticism
10.5.1 Function Plots
10.5.2 Residuals and Lack of Fit
10.5.3 Robustness Transformation and Subject-Matter Knowledge
10.5.4 Diagnostic Plot for Influential Observations
10.5.5 Refined Model
10.5.6 Interactions
10.6 Stability Analysis
10.7 Final Model
10.8 Issues to be Aware of
10.8.1 Selecting the Main-Effects Model
10.8.2 Further Comments on Stability
10.8.3 Searching for Interactions
10.9 Discussion
11 Special Topics Involving Fractional Polynomials
11.1 Time-Varying Hazard Ratios in the Cox Model
11.1.1 The Fractional Polynomial Time Procedure
11.1.2 The MFP Time Procedure
11.1.3 Prognostic Model with Time-Varying Effects for Patients with Breast Cancer
11.1.4 Categorization of Survival Time
11.1.5 Discussion
11.2 Age-specific Reference Intervals
11.2.1 Example: Fetal Growth
11.2.2 Using FP Functions as Smoothers
11.2.3 More Sophisticated Distributional Assumptions
11.2.4 Discussion
11.3 Other Topics
11.3.1 Quantitative Risk Assessment in Developmental Toxicity Studies
11.3.2 Model Uncertainty for Functions
11.3.3 Relative Survival
11.3.4 Approximating Smooth Functions
11.3.5 Miscellaneous Applications
12 Epilogue
12.1 Introduction
12.2 Towards Recommendations for Practice
12.2.1 Variable Selection Procedure
12.2.2 Functional Form for Continuous Covariates
12.2.3 Extreme Values or Influential Points
12.2.4 Sensitivity Analysis
12.2.5 Check for Model Stability
12.2.6 Complexity of a Predictor
12.2.7 Check for Interactions
12.3 Omitted Topics and Future Directions
12.3.1 Measurement Error in Covariates
12.3.2 Meta-analysis
12.3.3 Multilevel (Hierarchical) Models
12.3.4 Missing Covariate Data
12.3.5 Other Types of Model
12.4 Conclusion
Appendix A: Data and Software Resources
A.1 Summaries of Datasets
A.2 Datasets used more than once
A.2.1 Research Body Fat
A.2.2 GBSG Breast Cancer
A.2.3 Educational Body Fat
A.2.4 Glioma
A.2.5 Prostate Cancer
A.2.6 Whitehall I
A.2.7 PBC
A.2.8 Oral Cancer
A.2.9 Kidney Cancer
A.3 Software
Appendix B: Glossary of Abbreviations
References
Index

© Copyright StataCorp LP