List of tables

List of figures

Preface

1 Introduction

1.1 Origins and motivation

1.2 Notational conventions

1.3 Applied or theoretical?

1.4 Road map

1.5 Installing the support materials

I Foundations of Generalized Linear Models

2 GLMs

2.1 Components

2.2 Assumptions

2.3 Exponential family

2.4 Example: Using an offset in a GLM

2.5 Summary

3 GLM estimation algorithms

3.1 Newton–Raphson (using the observed Hessian)

3.2 Starting values for Newton–Raphson

3.3 IRLS (using the expected Hessian)

3.4 Starting values for IRLS

3.5 Goodness of fit

3.6 Estimated variance matrices

3.6.1 Hessian

3.6.2 Outer product of the gradient

3.6.3 Sandwich

3.6.4 Modified sandwich

3.6.5 Unbiased sandwich

3.6.6 Modified unbiased sandwich

3.6.7 Weighted sandwich: Newey–West

3.6.8 Jackknife

3.6.8.1 Usual jackknife

3.6.8.2 One-step jackknife

3.6.8.3 Weighted jackknife

3.6.8.4 Variable jackknife

3.6.9 Bootstrap

3.6.9.1 Usual bootstrap

3.6.9.2 Grouped bootstrap

3.7 Estimation algorithms

3.8 Summary

4 Analysis of fit

4.1 Deviance

4.2 Diagnostics

4.2.1 Cook’s distance

4.2.2 Overdispersion

4.3 Assessing the link function

4.4 Residual analysis

4.4.1 Response residuals

4.4.2 Working residuals

4.4.3 Pearson residuals

4.4.4 Partial residuals

4.4.5 Anscombe residuals

4.4.6 Deviance residuals

4.4.7 Adjusted deviance residuals

4.4.8 Likelihood residuals

4.4.9 Score residuals

4.5 Checks for systematic departure from the model

4.6 Model statistics

4.6.1 Criterion measures

4.6.1.1 AIC

4.6.1.2 BIC

4.6.2 The interpretation of R^{2} in linear regression

4.6.2.1 Percentage variance explained

4.6.2.2 The ratio of variances

4.6.2.3 A transformation of the likelihood ratio

4.6.2.4 A transformation of the F test

4.6.2.5 Squared correlation

4.6.3 Generalizations of linear regression R^{2} interpretations

4.6.3.1 Efron’s pseudo-R^{2}

4.6.3.2 McFadden’s likelihood-ratio index

4.6.3.3 Ben-Akiva and Lerman adjusted likelihood-ratio index

4.6.3.4 McKelvey and Zavoina ratio of variances

4.6.3.5 Transformation of likelihood ratio

4.6.3.6 Cragg and Uhler normed measure

4.6.4 More R^{2} measures

4.6.4.1 The count R^{2}

4.6.4.2 The adjusted count R^{2}

4.6.4.3 Veall and Zimmermann R^{2}

4.6.4.4 Cameron–Windmeijer R^{2}

4.7 Marginal effects

4.7.1 Marginal effects for GLMs

4.7.2 Discrete change for GLMs

5 Data synthesis

5.1 Generating correlated data

5.2 Generating data from a specified population

5.2.1 Generating data for linear regression

5.2.2 Generating data for logistic regression

5.2.3 Generating data for probit regression

5.2.4 Generating data for cloglog regression

5.2.5 Generating data for Gaussian variance and log link

5.2.6 Generating underdispersed count data

5.3 Simulation

5.3.1 Heteroskedasticity in linear regression

5.3.2 Power analysis

5.3.3 Comparing fit of Poisson and negative binomial

5.3.4 Effect of omitted covariate on R^{2}_{Efron} in Poisson regression

II Continuous Response Models

6 The Gaussian family

6.1 Derivation of the GLM Gaussian family

6.2 Derivation in terms of the mean

6.3 IRLS GLM algorithm (nonbinomial)

6.4 ML estimation

6.5 GLM log-normal models

6.6 Expected versus observed information matrix

6.7 Other Gaussian links

6.8 Example: Relation to OLS

6.9 Example: Beta-carotene

7 The gamma family

7.1 Derivation of the gamma model

7.2 Example: Reciprocal link

7.3 ML estimation

7.4 Log-gamma models

7.5 Identity-gamma models

7.6 Using the gamma model for survival analysis

8 The inverse Gaussian family

8.1 Derivation of the inverse Gaussian model

8.2 The inverse Gaussian algorithm

8.3 Maximum likelihood algorithm

8.4 Example: The canonical inverse Gaussian

8.5 Noncanonical links

9 The power family and link

9.1 Power links

9.2 Example: Power link

9.3 The power family

III Binomial Response Models

10 The binomial–logit family

10.1 Derivation of the binomial model

10.2 Derivation of the Bernoulli model

10.3 The binomial regression algorithm

10.4 Example: Logistic regression

10.4.1 Model producing logistic coefficients: The heart data

10.4.2 Model producing logistic odds ratios

10.5 GOF statistics

10.6 Proportional data

10.7 Interpretation of parameter estimates

11 The general binomial family

11.1 Noncanonical binomial models

11.2 Noncanonical binomial links (binary form)

11.3 The probit model

11.4 The clog-log and log-log models

11.5 Other links

11.6 Interpretation of coefficients

11.6.1 Identity link

11.6.2 Logit link

11.6.3 Log link

11.6.4 Log complement link

11.6.5 Summary

11.7 Generalized binomial regression

12 The problem of overdispersion

12.1 Overdispersion

12.2 Scaling of standard errors

12.3 Williams’ procedure

12.4 Robust standard errors

IV Count Response Models

13 The Poisson family

13.1 Count response regression models

13.2 Derivation of the Poisson algorithm

13.3 Poisson regression: Examples

13.4 Example: Testing overdispersion in the Poisson model

13.5 Using the Poisson model for survival analysis

13.6 Using offsets to compare models

13.7 Interpretation of coefficients

14 The negative binomial family

14.1 Constant overdispersion

14.2 Variable overdispersion

14.2.1 Derivation in terms of a Poisson–gamma mixture

14.2.2 Derivation in terms of the negative binomial probability function

14.2.3 The canonical link negative binomial parameterization

14.3 The log-negative binomial parameterization

14.4 Negative binomial examples

14.5 The geometric family

14.6 Interpretation of coefficients

15 Other count data models

15.1 Count response regression models

15.2 Zero-truncated models

15.3 Zero-inflated models

15.4 Hurdle models

15.5 Negative binomial(P) models

15.6 Heterogeneous negative binomial models

15.7 Generalized Poisson regression models

15.8 Poisson inverse Gaussian models

15.9 Censored count response models

15.10 Finite mixture models

V Multinomial Response Models

16 The ordered-response family

16.1 Interpretation of coefficients: Single binary predictor

16.2 Ordered outcomes for general link

16.3 Ordered outcomes for specific links

16.3.1 Ordered logit

16.3.2 Ordered probit

16.3.3 Ordered clog-log

16.3.4 Ordered log-log

16.3.5 Ordered cauchit

16.4 Generalized ordered outcome models

16.5 Example: Synthetic data

16.6 Example: Automobile data

16.7 Partial proportional-odds models

16.8 Continuation-ratio models

17 Unordered-response family

17.1 The multinomial logit model

17.1.1 Interpretation of coefficients: Single binary predictor

17.1.2 Example: Relation to logistic regression

17.1.3 Example: Relation to conditional logistic regression

17.1.4 Example: Extensions with conditional logistic regression

17.1.5 The independence of irrelevant alternatives

17.1.6 Example: Assessing the IIA

17.1.7 Interpreting coefficients

17.1.8 Example: Medical admissions—introduction

17.1.9 Example: Medical admissions—summary

17.2 The multinomial probit model

17.2.1 Example: A comparison of the models

17.2.2 Example: Comparing probit and multinomial probit

17.2.3 Example: Concluding remarks

VI Extensions to the GLM

18 Extending the likelihood

18.1 The quasilikelihood

18.2 Example: Wedderburn’s leaf blotch data

18.3 Generalized additive models

19 Clustered data

19.1 Generalization from individual to clustered data

19.2 Pooled estimators

19.3 Fixed effects

19.3.1 Unconditional fixed-effects estimators

19.3.2 Conditional fixed-effects estimators

19.4 Random effects

19.4.1 Maximum likelihood estimation

19.4.2 Gibbs sampling

19.5 GEEs

19.6 Other models

VII Stata Software

20 Programs for Stata

20.1 The glm command

20.1.1 Syntax

20.1.2 Description

20.1.3 Options

20.2 The predict command after glm

20.2.1 Syntax

20.2.2 Options

20.3 User-written programs

20.3.1 Global macros available for user-written programs

20.3.2 User-written variance functions

20.3.3 User-written programs for link functions

20.3.4 User-written programs for Newey–West weights

20.4 Remarks

20.4.1 Equivalent commands

20.4.2 Special comments on family(Gaussian) models

20.4.3 Special comments on family(binomial) models

20.4.4 Special comments on family(nbinomial) models

20.4.5 Special comment on family(gamma) link(log) models

A Tables

References

Author index

Subject index