Regression Models for Categorical Dependent Variables Using Stata, Third Edition: Review

Regression Models for Categorical Dependent Variables Using Stata (3rd ed.).

J. Scott Long and Jeremy Freese. College Station, TX: Stata Press, 2014, xxiii + 589pp., $89.95 (P), ISBN: 978-1-59-718111-2.

Regression Models for Categorical Dependent Variables Using Stata is a friendly and accessible text designed to show a reader starting with no knowledge of Stata how to analyze and interpret regression models with categorical dependent variables, as promised by the title. The book makes extensive use of a companion add-on package, SPost13, a set of Stata commands that take fitted regression models and return postprocessed summaries. These greatly enhance the interpretability of nonlinear regressions. This is the third edition, an extensive rewrite of the second, with the Stata package also completely reworked to take advantage of functionality released in Stata 11 and later. (The book targets Stata 13.) While the text focuses on cross-sectional data, the tools extend fairly easily to other data domains.

Over the course of the book, the authors cover the big four types of categorical dependent variables: binary, ordinal, nominal, and count. The book first gives an overall orientation to Stata and model estimation. (Chapter 2 is a really nice standalone introduction to Stata.) It then gives each of these data types extensive and focused coverage. Binary data, coming first, takes center stage in two full chapters, the first on estimation and the second on interpretation. These are followed by a chapter on each of the other three data types. The binary data chapters lay out a general framework for interpretation that the authors subsequently rely on. Therefore, even if one were interested only in, say, ordinal data, I would recommend reading these chapters first. A commendable aspect of the book is its focus on model interpretation: the Stata package makes this easy, and the book tells readers how to carry out this final and critical step in their data analysis without undue suffering and without cutting corners. Arguably the most difficult aspect of the shift from ordinary linear models to more general regression models is interpreting the fitted models, particularly with regard to the usual nonlinear link functions. One way forward is to focus on aggregating individual-level predictions; this book showcases this approach and should be applauded for it.
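
To make the idea of aggregating individual-level predictions concrete, here is a minimal sketch in Python rather than Stata. It is an illustration only and does not reproduce any SPost13 command or any example from the book. The coefficients, the variable names (treated, age), and the five data rows are all hypothetical. The sketch computes each observation's predicted probability from a logit model with a binary covariate set to 0 and then to 1 for everyone, and averages the two sets of predictions; the difference is an average discrete change in the predicted probability.

```python
import math


def predict_prob(coefs, row):
    """Predicted probability from a logit model: inverse logit of the linear index."""
    index = coefs["_cons"] + sum(coefs[name] * value for name, value in row.items())
    return 1.0 / (1.0 + math.exp(-index))


def average_prediction(coefs, rows, **fixed):
    """Average predicted probability over all observations.

    Covariates named in `fixed` are set to the given value for every
    observation; all other covariates stay at their observed values.
    """
    probs = [predict_prob(coefs, {**row, **fixed}) for row in rows]
    return sum(probs) / len(probs)


# Hypothetical fitted coefficients (assumed for illustration, not estimated here).
coefs = {"_cons": -1.0, "treated": 0.8, "age": 0.02}

# Hypothetical observed data.
rows = [
    {"treated": 0, "age": 25},
    {"treated": 1, "age": 40},
    {"treated": 0, "age": 55},
    {"treated": 1, "age": 30},
    {"treated": 0, "age": 62},
]

p0 = average_prediction(coefs, rows, treated=0)  # everyone untreated
p1 = average_prediction(coefs, rows, treated=1)  # everyone treated
avg_change = p1 - p0  # average discrete change on the probability scale

print(f"Average Pr(y=1 | treated=0): {p0:.3f}")
print(f"Average Pr(y=1 | treated=1): {p1:.3f}")
print(f"Average discrete change:     {avg_change:.3f}")
```

Because the inverse logit is nonlinear, the change in probability differs from one observation to the next even though the coefficient on treated is a single number; averaging over the observed data summarizes those differing changes on the probability scale.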

Luke W. Miratrix
Harvard University

Excerpt from "Reviews of Books and Teaching Materials." 2016. The American Statistician, 70:1, 120–126, DOI: 10.1080/00031305.2016.1140432.