Generalized Linear Models, Second Edition
Authors: P. McCullagh and J. A. Nelder
Publisher: Chapman & Hall/CRC
Copyright: 1989
ISBN-13: 9780412317606
Pages: 511; hardcover
Price: $97.75



Comment from the Stata technical group
This book covers the methodology of generalized linear models, which has
evolved dramatically over the last 20 years as a way to generalize the
methods of classical linear regression to more complex situations, including
analysis-of-variance models, logit and probit models, log-linear models,
models with multinomial responses for counts, and models for survival data.
Although the original least-squares estimation for linear regression is
based on Gaussian errors, the most important properties of least-squares
estimates depend only on the assumed mean-to-variance relationship and on
the statistical independence of the observations. This fact is exploited in
developing the more general algorithm of iteratively reweighted
least squares to handle the more complex models.
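
As a rough illustration of the iteratively reweighted least-squares idea mentioned above, here is a minimal sketch for one special case, a Poisson model with log link. The function name irls_poisson, the starting values, the stopping rule, and the toy data are assumptions made for illustration; they are not taken from the book.

```python
import numpy as np

def irls_poisson(X, y, max_iter=50, tol=1e-10):
    """Sketch: fit a Poisson log-link model by iteratively reweighted least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = y + 0.5                  # assumed starting values, kept strictly positive
    eta = np.log(mu)              # linear predictor under the log link
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        z = eta + (y - mu) / mu   # working (adjusted) response
        w = mu                    # working weights for Poisson with log link
        XtW = X.T * w             # X'W, scaling each observation by its weight
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)  # weighted least-squares step
        eta = X @ beta_new
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Hypothetical toy data: counts that grow roughly exponentially with x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 4.0, 9.0, 15.0])
X = np.column_stack([np.ones_like(x), x])
beta_hat = irls_poisson(X, y)
print(beta_hat)
```

Each pass solves an ordinary weighted least-squares problem; only the weights and the working response change between passes, which is how the mean-to-variance relationship enters the fit.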
Considered by many to be the most thorough treatment of the topic, this text
is organized to be accessible to the practicing research scientist with only
the most basic knowledge of statistical theory.
Table of contents
Preface to the first edition
Preface
1 Introduction
1.1 Background
1.1.1 The problem of looking at data
1.1.2 Theory as pattern
1.1.3 Model fitting
1.1.4 What is a good model?
1.2 The origins of generalized linear models
1.2.1 Terminology
1.2.2 Classical linear models
1.2.3 R. A. Fisher and the design of experiments
1.2.4 Dilution assay
1.2.5 Probit analysis
1.2.6 Logit models for proportions
1.2.7 Log-linear models for counts
1.2.8 Inverse polynomials
1.2.9 Survival data
1.3 Scope of the rest of the book
1.4 Bibliographic notes
1.5 Further results and exercises 1
2 An outline of generalized linear models
2.1 Processes in model fitting
2.1.1 Model selection
2.1.2 Estimation
2.1.3 Prediction
2.2 The components of a generalized linear model
2.2.1 The generalization
2.2.2 Likelihood functions
2.2.3 Link functions
2.2.4 Sufficient statistics and canonical links
2.3 Measuring the goodness of fit
2.3.1 The discrepancy of a fit
2.3.2 The analysis of deviance
2.4 Residuals
2.4.1 Pearson residual
2.4.2 Anscombe residual
2.4.3 Deviance residual
2.5 An algorithm for fitting generalized linear models
2.5.1 Justification of the fitting procedure
2.6 Bibliographic notes
2.7 Further results and exercises 2
3 Models for continuous data with constant variance
3.1 Introduction
3.2 Error structure
3.3 Systematic component (linear predictor)
3.3.1 Continuous covariates
3.3.2 Qualitative covariates
3.3.3 Dummy variates
3.3.4 Mixed terms
3.4 Model formulae for linear predictors
3.4.1 Individual terms
3.4.2 The dot operator
3.4.3 The + operator
3.4.4 The crossing (*) and nesting (/) operators
3.4.5 Operators for the removal of terms
3.4.6 Exponential operator
3.5 Aliasing
3.5.1 Intrinsic aliasing with factors
3.5.2 Aliasing in a two-way cross-classification
3.5.3 Extrinsic aliasing
3.5.4 Functional relations among covariates
3.6 Estimation
3.6.1 The maximum-likelihood equations
3.6.2 Geometrical interpretation
3.6.3 Information
3.6.4 A model with two covariates
3.6.5 The information surface
3.6.6 Stability
3.7 Tables as data
3.7.1 Empty cells
3.7.2 Fused cells
3.8 Algorithms for least squares
3.8.1 Methods based on the information matrix
3.8.2 Direct decomposition methods
3.8.3 Extension to generalized linear models
3.9 Selection of covariates
3.10 Bibliographic notes
3.11 Further results and exercises 3
4 Binary data
4.1 Introduction
4.1.1 Binary responses
4.1.2 Covariate classes
4.1.3 Contingency tables
4.2 Binomial distribution
4.2.1 Genesis
4.2.2 Moments and cumulants
4.2.3 Normal limit
4.2.4 Poisson limit
4.2.5 Transformations
4.3 Models for binary responses
4.3.1 Link functions
4.3.2 Parameter interpretation
4.3.3 Retrospective sampling
4.4 Likelihood functions for binary data
4.4.1 Log likelihood for binomial data
4.4.2 Parameter estimation
4.4.3 Deviance function
4.4.4 Bias and precision of estimates
4.4.5 Sparseness
4.4.6 Extrapolation
4.5 Overdispersion
4.5.1 Genesis
4.5.2 Parameter estimation
4.6 Example
4.6.1 Habitat preferences of lizards
4.7 Bibliographic notes
4.8 Further results and exercises 4
5 Models for polytomous data
5.1 Introduction
5.2 Measurement scales
5.2.1 General points
5.2.2 Models for ordinal scales
5.2.3 Models for interval scales
5.2.4 Models for nominal scales
5.2.5 Nested or hierarchical response scales
5.3 The multinomial distribution
5.3.1 Genesis
5.3.2 Moments and cumulants
5.3.3 Generalized inverse matrices
5.3.4 Quadratic forms
5.3.5 Marginal and conditional distributions
5.4 Likelihood functions
5.4.1 Log likelihood for multinomial responses
5.4.2 Parameter estimation
5.4.3 Deviance function
5.5 Overdispersion
5.6 Examples
5.6.1 A cheese-tasting experiment
5.6.2 Pneumoconiosis among coalminers
5.7 Bibliographic notes
5.8 Further results and exercises 5
6 Log-linear models
6.1 Introduction
6.2 Likelihood functions
6.2.1 Poisson distribution
6.2.2 The Poisson log-likelihood function
6.2.3 Overdispersion
6.2.4 Asymptotic theory
6.3 Examples
6.3.1 A biological assay of tuberculins
6.3.2 A study of wave damage to cargo ships
6.4 Log-linear models and multinomial response models
6.4.1 Comparison of two or more Poisson means
6.4.2 Multinomial response models
6.4.3 Summary
6.5 Multiple responses
6.5.1 Introduction
6.5.2 Independence and conditional independence
6.5.3 Canonical correlation models
6.5.4 Multivariate regression models
6.5.5 Multivariate model formulae
6.5.6 Log-linear regression models
6.5.7 Likelihood equations
6.6 Example
6.6.1 Respiratory ailments of coalminers
6.6.2 Parameter interpretation
6.7 Bibliographic notes
6.8 Further results and exercises 6
7 Conditional likelihoods*
7.1 Introduction
7.2 Marginal and conditional likelihoods
7.2.1 Marginal likelihood
7.2.2 Conditional likelihood
7.2.3 Exponential-family models
7.2.4 Profile likelihood
7.3 Hypergeometric distributions
7.3.1 Central hypergeometric distribution
7.3.2 Noncentral hypergeometric distribution
7.3.3 Multivariate hypergeometric distribution
7.3.4 Multivariate noncentral distribution
7.4 Some applications involving binary data
7.4.1 Comparison of two binomial probabilities
7.4.2 Combination of information from 2×2 tables
7.4.3 Ille-et-Vilaine study of oesophageal cancer
7.5 Some applications involving polytomous data
7.5.1 Matched pairs: nominal response
7.5.2 Ordinal responses
7.5.3 Example
7.6 Bibliographic notes
7.7 Further results and exercises 7
8 Models with constant coefficient of variation
8.1 Introduction
8.2 The gamma distribution
8.3 Models with gamma-distributed observations
8.3.1 The variance function
8.3.2 The deviance
8.3.3 The canonical link
8.3.4 Multiplicative models: log link
8.3.5 Linear models: identity link
8.3.6 Estimation of the dispersion parameter
8.4 Examples
8.4.1 Car insurance claims
8.4.2 Clotting times of blood
8.4.3 Modelling rainfall data using two generalized linear models
8.4.4 Developmental rate of Drosophila melanogaster
8.5 Bibliographic notes
8.6 Further results and exercises 8
9 Quasi-likelihood functions
9.1 Introduction
9.2 Independent observations
9.2.1 Covariance functions
9.2.2 Construction of the quasi-likelihood function
9.2.3 Parameter estimation
9.2.4 Example: incidence of leaf-blotch on barley
9.3 Dependent observations
9.3.1 Quasi-likelihood estimating equations
9.3.2 Quasi-likelihood function
9.3.3 Example: estimation of probabilities from marginal frequencies
9.4 Optimal estimating functions
9.4.1 Introduction
9.4.2 Combination of estimating functions
9.4.3 Example: estimation for megalithic stone rings
9.5 Optimality criteria
9.6 Extended quasi-likelihood
9.7 Bibliographic notes
9.8 Further results and exercises 9
10 Joint modelling of mean and dispersion
10.1 Introduction
10.2 Model specification
10.3 Interaction between mean and dispersion effects
10.4 Extended quasi-likelihood as a criterion
10.5 Adjustments of the estimating equations
10.5.1 Adjustment for kurtosis
10.5.2 Adjustment for degrees of freedom
10.5.3 Summary of estimating equations for the dispersion model
10.6 Joint optimum estimating equations
10.7 Example: the production of leaf-springs for trucks
10.8 Bibliographic notes
10.9 Further results and exercises 10
11 Models with additional nonlinear parameters
11.1 Introduction
11.2 Parameters in the variance function
11.3 Parameters in the link function
11.3.1 One link parameter
11.3.2 More than one link parameter
11.3.3 Transformation of data vs transformation of fitted values
11.4 Nonlinear parameters in the covariates
11.5 Examples
11.5.1 The effects of fertilizers on coastal Bermuda grass
11.5.2 Assay of an insecticide with a synergist
11.5.3 Mixtures of drugs
11.6 Bibliographic notes
11.7 Further results and exercises 11
12 Model checking
12.1 Introduction
12.2 Techniques in model checking
12.3 Score tests for extra parameters
12.4 Smoothing as an aid to informal checks
12.5 The raw materials of model checking
12.6 Checks for systematic departure from model
12.6.1 Informal checks using residuals
12.6.2 Checking the variance function
12.6.3 Checking the link function
12.6.4 Checking the scales of covariates
12.6.5 Checks for compound discrepancies
12.7 Checks for isolated departures from the model
12.7.1 Measure of leverage
12.7.2 Measure of consistency
12.7.3 Measure of influence
12.7.4 Informal assessment of extreme values
12.7.5 Extreme points and checks for systematic discrepancies
12.8 Examples
12.8.1 Carrot damage in an insecticide experiment
12.8.2 Minitab tree data
12.8.3 Insurance claims (continued)
12.9 A strategy for model checking?
12.10 Bibliographic notes
12.11 Further results and exercises 12
13 Models for survival data
13.1 Introduction
13.1.1 Survival functions and hazard functions
13.2 Proportional-hazards models
13.3 Estimation with a specified survival distribution
13.3.1 The exponential distribution
13.3.2 The Weibull distribution
13.3.3 The extreme-value distribution
13.4 Example: remission times for leukaemia
13.5 Cox's proportional-hazards model
13.5.1 Partial likelihood
13.5.2 The treatment of ties
13.5.3 Numerical methods
13.6 Bibliographic notes
13.7 Further results and exercises 13
14 Components of dispersion
14.1 Introduction
14.2 Linear models
14.3 Nonlinear models
14.4 Parameter estimation
14.5 Example: A salamander mating experiment
14.5.1 Introduction
14.5.2 Experimental procedure
14.5.3 A linear logistic model with random effects
14.5.4 Estimation of the dispersion parameters
14.6 Bibliographic notes
14.7 Further results and exercises 14
15 Further topics
15.1 Introduction
15.2 Bias adjustment
15.2.1 Models with canonical link
15.2.2 Noncanonical models
15.2.3 Example: Lizard data (continued)
15.3 Computation of Bartlett adjustments
15.3.1 General theory
15.3.2 Computation of the adjustment
15.3.3 Example: exponential regression model
15.4 Generalized additive models
15.4.1 Algorithms for fitting
15.4.2 Smoothing methods
15.4.3 Conclusions
15.5 Bibliographic notes
15.6 Further results and exercises 15
Appendices
A Elementary likelihood theory
B Edgeworth series
C Likelihood-ratio statistics
References
Index of data sets
Author index
Subject index