Stata Bookstore: Generalized Linear Models, Second Edition

Generalized Linear Models, Second Edition

Authors:	P. McCullagh and J. A. Nelder
Publisher:	Chapman & Hall/CRC
Copyright:	1989
ISBN-13:	978-0-412-31760-6
Pages:	511; hardcover

Comment from the Stata technical group

This book covers the methodology of generalized linear models, which has evolved dramatically over the last 20 years as a way to extend the methods of classical linear regression to more complex situations, including analysis-of-variance models, logit and probit models, log-linear models, models with multinomial responses for counts, and models for survival data. Although least-squares estimation for linear regression was originally based on Gaussian errors, the most important properties of least-squares estimates depend only on the assumed mean-to-variance relationship and on the statistical independence of the observations. This fact is exploited in developing the more general algorithm of iteratively reweighted least squares, which handles the more complex models.
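
To make the idea of iteratively reweighted least squares concrete, here is a minimal sketch in Python with NumPy. It assumes one specific case, a Poisson response with the log link. The function name `irls_poisson` and its parameters are hypothetical, chosen for illustration; this is not the book's code or Stata's implementation. Each pass solves an ordinary weighted least-squares problem in which the weights and the working response are recomputed from the current fitted means, which is where the assumed mean-to-variance relationship enters.

```python
import numpy as np


def irls_poisson(X, y, n_iter=25, tol=1e-8):
    """Fit a Poisson log-linear model by iteratively reweighted least squares.

    Illustrative sketch only: assumes the log link and Var(y) = mu.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Start from strictly positive means so that the log link is defined.
    mu = y + 0.5
    eta = np.log(mu)
    beta = np.zeros(X.shape[1])

    for _ in range(n_iter):
        # Working response: eta + (y - mu) * d(eta)/d(mu), with d(eta)/d(mu) = 1/mu.
        z = eta + (y - mu) / mu
        # Working weights: 1 / (V(mu) * (d(eta)/d(mu))**2), which reduces to mu here.
        w = mu

        # Weighted least-squares step: solve (X' W X) beta = X' W z.
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)

        # Update the linear predictor and the fitted means.
        eta = X @ beta_new
        mu = np.exp(eta)

        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break

    return beta
```

For an intercept-only design (a single column of ones), the fitted coefficient should agree with the logarithm of the sample mean of `y`, which gives a quick sanity check of the sketch.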

Considered by many to be the most thorough treatment of the topic, this text is organized to be accessible to the practicing research scientist with only the most basic knowledge of statistical theory.

Preface to the first edition

Preface

1 Introduction

1.1 Background

1.1.1 The problem of looking at data
1.1.2 Theory as pattern
1.1.3 Model fitting
1.1.4 What is a good model?

1.2 The origins of generalized linear models

1.2.1 Terminology
1.2.2 Classical linear models
1.2.3 R. A. Fisher and the design of experiments
1.2.4 Dilution assay
1.2.5 Probit analysis
1.2.6 Logit models for proportions
1.2.7 Log-linear models for counts
1.2.8 Inverse polynomials
1.2.9 Survival data

1.3 Scope of the rest of the book
1.4 Bibliographic notes
1.5 Further results and exercises 1

2 An outline of generalized linear models

2.1 Processes in model fitting

2.1.1 Model selection
2.1.2 Estimation
2.1.3 Prediction

2.2 The components of a generalized linear model

2.2.1 The generalization
2.2.2 Likelihood functions
2.2.3 Link functions
2.2.4 Sufficient statistics and canonical links

2.3 Measuring the goodness of fit

2.3.1 The discrepancy of a fit
2.3.2 The analysis of deviance

2.4 Residuals

2.4.1 Pearson residual
2.4.2 Anscombe residual
2.4.3 Deviance residual

2.5 An algorithm for fitting generalized linear models

2.5.1 Justification of the fitting procedure

2.6 Bibliographic notes
2.7 Further results and exercises 2

3 Models for continuous data with constant variance

3.1 Introduction
3.2 Error structure
3.3 Systematic component (linear predictor)

3.3.1 Continuous covariates
3.3.2 Qualitative covariates
3.3.3 Dummy variates
3.3.4 Mixed terms

3.4 Model formulae for linear predictors

3.4.1 Individual terms
3.4.2 The dot operator
3.4.3 The + operator
3.4.4 The crossing (*) and nesting (/) operators
3.4.5 Operators for the removal of terms
3.4.6 Exponential operator

3.5 Aliasing

3.5.1 Intrinsic aliasing with factors
3.5.2 Aliasing in a two-way cross-classification
3.5.3 Extrinsic aliasing
3.5.4 Functional relations among covariates

3.6 Estimation

3.6.1 The maximum-likelihood equations
3.6.2 Geometrical interpretation
3.6.3 Information
3.6.4 A model with two covariates
3.6.5 The information surface
3.6.6 Stability

3.7 Tables as data

3.7.1 Empty cells
3.7.2 Fused cells

3.8 Algorithms for least squares

3.8.1 Methods based on the information matrix
3.8.2 Direct decomposition methods
3.8.3 Extension to generalized linear models

3.9 Selection of covariates
3.10 Bibliographic notes
3.11 Further results and exercises 3

4 Binary data

4.1 Introduction

4.1.1 Binary responses
4.1.2 Covariate classes
4.1.3 Contingency tables

4.2 Binomial distribution

4.2.1 Genesis
4.2.2 Moments and cumulants
4.2.3 Normal limit
4.2.4 Poisson limit
4.2.5 Transformations

4.3 Models for binary responses

4.3.1 Link functions
4.3.2 Parameter interpretation
4.3.3 Retrospective sampling

4.4 Likelihood functions for binary data

4.4.1 Log likelihood for binomial data
4.4.2 Parameter estimation
4.4.3 Deviance function
4.4.4 Bias and precision of estimates
4.4.5 Sparseness
4.4.6 Extrapolation

4.5 Over-dispersion

4.5.1 Genesis
4.5.2 Parameter estimation

4.6 Example

4.6.1 Habitat preferences of lizards

4.7 Bibliographic notes
4.8 Further results and exercises 4

5 Models for polytomous data

5.1 Introduction
5.2 Measurement scales

5.2.1 General points
5.2.2 Models for ordinal scales
5.2.3 Models for interval scales
5.2.4 Models for nominal scales
5.2.5 Nested or hierarchical response scales

5.3 The multinomial distribution

5.3.1 Genesis
5.3.2 Moments and cumulants
5.3.3 Generalized inverse matrices
5.3.4 Quadratic forms
5.3.5 Marginal and conditional distributions

5.4 Likelihood functions

5.4.1 Log likelihood for multinomial responses
5.4.2 Parameter estimation
5.4.3 Deviance function

5.5 Over-dispersion
5.6 Examples

5.6.1 A cheese-tasting experiment
5.6.2 Pneumoconiosis among coalminers

5.7 Bibliographic notes
5.8 Further results and exercises 5

6 Log-linear models

6.1 Introduction
6.2 Likelihood functions

6.2.1 Poisson distribution
6.2.2 The Poisson log-likelihood function
6.2.3 Over-dispersion
6.2.4 Asymptotic theory

6.3 Examples

6.3.1 A biological assay of tuberculins
6.3.2 A study of wave damage to cargo ships

6.4 Log-linear models and multinomial response models

6.4.1 Comparison of two or more Poisson means
6.4.2 Multinomial response models
6.4.3 Summary

6.5 Multiple responses

6.5.1 Introduction
6.5.2 Independence and conditional independence
6.5.3 Canonical correlation models
6.5.4 Multivariate regression models
6.5.5 Multivariate model formulae
6.5.6 Log-linear regression models
6.5.7 Likelihood equations

6.6 Example

6.6.1 Respiratory ailments of coalminers
6.6.2 Parameter interpretation

6.7 Bibliographic notes
6.8 Further results and exercises 6

7 Conditional likelihoods*

7.1 Introduction
7.2 Marginal and conditional likelihoods

7.2.1 Marginal likelihood
7.2.2 Conditional likelihood
7.2.3 Exponential-family models
7.2.4 Profile likelihood

7.3 Hypergeometric distributions

7.3.1 Central hypergeometric distribution
7.3.2 Non-central hypergeometric distribution
7.3.3 Multivariate hypergeometric distribution
7.3.4 Multivariate non-central distribution

7.4 Some applications involving binary data

7.4.1 Comparison of two binomial probabilities
7.4.2 Combination of information from 2×2 tables
7.4.3 Ille-et-Vilaine study of oesophageal cancer

7.5 Some applications involving polytomous data

7.5.1 Matched pairs: nominal response
7.5.2 Ordinal responses
7.5.3 Example

7.6 Bibliographic notes
7.7 Further results and exercises 7

8 Models with constant coefficient of variation

8.1 Introduction
8.2 The gamma distribution
8.3 Models with gamma-distributed observations

8.3.1 The variance function
8.3.2 The deviance
8.3.3 The canonical link
8.3.4 Multiplicative models: log link
8.3.5 Linear models: identity link
8.3.6 Estimation of the dispersion parameter

8.4 Examples

8.4.1 Car insurance claims
8.4.2 Clotting times of blood
8.4.3 Modelling rainfall data using two generalized linear models
8.4.4 Developmental rate of Drosophila melanogaster

8.5 Bibliographic notes
8.6 Further results and exercises 8

9 Quasi-likelihood functions

9.1 Introduction
9.2 Independent observations

9.2.1 Covariance functions
9.2.2 Construction of the quasi-likelihood function
9.2.3 Parameter estimation
9.2.4 Example: incidence of leaf-blotch on barley

9.3 Dependent observations

9.3.1 Quasi-likelihood estimating equations
9.3.2 Quasi-likelihood function
9.3.3 Example: estimation of probabilities from marginal frequencies

9.4 Optimal estimating functions

9.4.1 Introduction
9.4.2 Combination of estimating functions
9.4.3 Example: estimation for megalithic stone rings

9.5 Optimality criteria
9.6 Extended quasi-likelihood
9.7 Bibliographic notes
9.8 Further results and exercises 9

10 Joint modelling of mean and dispersion

10.1 Introduction
10.2 Model specification
10.3 Interaction between mean and dispersion effects
10.4 Extended quasi-likelihood as a criterion
10.5 Adjustments of the estimating equations

10.5.1 Adjustment for kurtosis
10.5.2 Adjustment for degrees of freedom
10.5.3 Summary of estimating equations for the dispersion model

10.6 Joint optimum estimating equations
10.7 Example: the production of leaf-springs for trucks
10.8 Bibliographic notes
10.9 Further results and exercises 10

11 Models with additional non-linear parameters

11.1 Introduction
11.2 Parameters in the variance function
11.3 Parameters in the link function

11.3.1 One link parameter
11.3.2 More than one link parameter
11.3.3 Transformation of data vs transformation of fitted values

11.4 Non-linear parameters in the covariates
11.5 Examples

11.5.1 The effects of fertilizers on coastal Bermuda grass
11.5.2 Assay of an insecticide with a synergist
11.5.3 Mixtures of drugs

11.6 Bibliographic notes
11.7 Further results and exercises 11

12 Model checking

12.1 Introduction
12.2 Techniques in model checking
12.3 Score tests for extra parameters
12.4 Smoothing as an aid to informal checks
12.5 The raw materials of model checking
12.6 Checks for systematic departure from model

12.6.1 Informal checks using residuals
12.6.2 Checking the variance function
12.6.3 Checking the link function
12.6.4 Checking the scales of covariates
12.6.5 Checks for compound discrepancies

12.7 Checks for isolated departures from the model

12.7.1 Measure of leverage
12.7.2 Measure of consistency
12.7.3 Measure of influence
12.7.4 Informal assessment of extreme values
12.7.5 Extreme points and checks for systematic discrepancies

12.8 Examples

12.8.1 Carrot damage in an insecticide experiment
12.8.2 Minitab tree data
12.8.3 Insurance claims (continued)

12.9 A strategy for model checking?
12.10 Bibliographic notes
12.11 Further results and exercises 12

13 Models for survival data

13.1 Introduction

13.1.1 Survival functions and hazard functions

13.2 Proportional-hazards models
13.3 Estimation with a specified survival distribution

13.3.1 The exponential distribution
13.3.2 The Weibull distribution
13.3.3 The extreme-value distribution

13.4 Example: remission times for leukaemia
13.5 Cox's proportional-hazards model

13.5.1 Partial likelihood
13.5.2 The treatment of ties
13.5.3 Numerical methods

13.6 Bibliographic notes
13.7 Further results and exercises 13

14 Components of dispersion

14.1 Introduction
14.2 Linear models
14.3 Non-linear models
14.4 Parameter estimation
14.5 Example: A salamander mating experiment

14.5.1 Introduction
14.5.2 Experimental procedure
14.5.3 A linear logistic model with random effects
14.5.4 Estimation of the dispersion parameters

14.6 Bibliographic notes
14.7 Further results and exercises 14

15 Further topics

15.1 Introduction
15.2 Bias adjustment

15.2.1 Models with canonical link
15.2.2 Non-canonical models
15.2.3 Example: Lizard data (continued)

15.3 Computation of Bartlett adjustments

15.3.1 General theory
15.3.2 Computation of the adjustment
15.3.3 Example: exponential regression model

15.4 Generalized additive models

15.4.1 Algorithms for fitting
15.4.2 Smoothing methods
15.4.3 Conclusions

15.5 Bibliographic notes
15.6 Further results and exercises 15

Appendices

A Elementary likelihood theory
B Edgeworth series
C Likelihood-ratio statistics

References

Index of data sets

Author index

Subject index
