Regression Models for Categorical Dependent Variables Using Stata, Second Edition

J. Scott Long and Jeremy Freese
Publisher: Stata Press
Copyright: 2006
ISBN-13: 978-1-59718-011-5
Pages: 527; paperback
Price: $58.00

Comment from the Stata technical group

Regression Models for Categorical Dependent Variables Using Stata, Second Edition, by J. Scott Long and Jeremy Freese, shows how to use Stata to fit and interpret regression models for categorical data. Nearly 50% longer than the previous edition, the second edition covers new topics for fitting and interpreting models included in Stata 9, such as multinomial probit models, the stereotype logistic model, and zero-truncated count models. Many of the interpretation techniques have been updated to include interval and point estimates.

Although regression models for categorical dependent variables are common, few texts explain how to interpret such models; Regression Models for Categorical Dependent Variables Using Stata, Second Edition fills this void. To accompany the book, Long and Freese provide a suite of commands for hypothesis testing and model diagnostics.

The second edition begins with an excellent introduction to Stata and follows with general treatments of estimation, testing, fit, and interpretation in this class of models. Long and Freese detail binary, ordinal, nominal, and count outcomes in separate chapters. The final chapter explains how to fit and interpret models with special characteristics, such as interactions, nonlinear terms, and ordinal and nominal independent variables. One appendix explains the syntax of the author-written commands, and a second appendix details the book's datasets.

Long and Freese use many concrete examples in their second edition. All the examples, datasets, and author-written commands are available on the authors’ website, so readers can easily replicate the examples when using Stata. This book is ideal for students or applied researchers who want to learn how to fit and interpret models for categorical data.

Table of contents

Preface (pdf)
Part I   General Information
1 Introduction
1.1 What is this book about?
1.2 Which models are considered?
1.3 Whom is this book for?
1.4 How is the book organized?
1.5 What software do you need?
1.5.1 Updating Stata 9
1.5.2 Installing SPost
Installing SPost using search
Installing SPost using net install
1.5.3 What if commands do not work?
1.5.4 Uninstalling SPost
1.5.5 Using spex to load data and run examples
1.5.6 More files available on the web site
1.6 Where can I learn more about the models?
2 Introduction to Stata
2.1 The Stata interface
Changing the scrollback buffer size
Changing the display of variable names in the Variables window
2.2 Abbreviations
2.3 How to get help
2.3.1 Online help
2.3.2 Manuals
2.3.3 Other resources
2.4 The working directory
2.5 Stata file types
2.6 Saving output to log files
2.6.1 Closing a log file
2.6.2 Viewing a log file
2.6.3 Converting from SMCL to plain text or PostScript
2.7 Using and saving datasets
2.7.1 Data in Stata format
2.7.2 Data in other formats
2.7.3 Entering data by hand
2.8 Size limitations on datasets*
2.9 Do-files
2.9.1 Adding comments
2.9.2 Long lines
2.9.3 Stopping a do-file while it is running
2.9.4 Creating do-files
Using Stata's Do-file Editor
Using other editors to create do-files
2.9.5 Recommended structure for do-files
2.10 Using Stata for serious data analysis
2.11 Syntax of Stata commands
2.11.1 Commands
2.11.2 Variable lists
2.11.3 if and in qualifiers
Examples of if qualifier
2.11.4 Options
2.12 Managing data
2.12.1 Looking at your data
2.12.2 Getting information about variables
2.12.3 Missing values
2.12.4 Selecting observations
2.12.5 Selecting variables
2.13 Creating new variables
2.13.1 generate command
2.13.2 replace command
2.13.3 recode command
2.13.4 Common transformations for RHS variables
Breaking a categorical variable into a set of binary variables
More examples of creating binary variables
Nonlinear transformations
Interaction terms
2.14 Labeling variables and values
2.14.1 Variable labels
2.14.2 Value labels
2.14.3 notes command
2.15 Global and local macros
2.16 Graphics
2.16.1 graph command
2.16.2 Displaying previously drawn graphs
2.16.3 Printing graphs
2.16.4 Combining graphs
2.17 A brief tutorial
A batch version
3 Estimation, testing, fit, and interpretation
3.1 Estimation
3.1.1 Stata’s output for ML estimation
3.1.2 ML and sample size
3.1.3 Problems in obtaining ML estimates
3.1.4 Syntax of estimation commands
Variable lists
Specifying the estimation sample
3.1.5 Reading the output
Estimates and standard errors
Confidence intervals
3.1.6 Storing estimation results
3.1.7 Reformatting output with estimates table
3.1.8 Reformatting output with estout
3.1.9 Alternative output with listcoef
Options for types of coefficients
Options for mlogit, mprobit, and slogit
Other options
Standardized coefficients
Factor and percent change
3.2 Postestimation analysis
3.3 Testing
3.3.1 Wald tests
The accumulate option
3.3.2 LR tests
Avoiding invalid LR tests
3.4 estat command
3.5 Measures of fit
Syntax of fitstat
Models and measures
Example of fitstat
Methods and formulas for fitstat
3.6 Interpretation
3.6.1 Approaches to interpretation
3.6.2 Predictions using predict
3.6.3 Overview of prvalue, prchange, prtab, and prgen
Specifying the levels of variables
Options controlling output
3.6.4 Syntax for prvalue
Options for confidence intervals
Options used for bootstrapped confidence intervals
3.6.5 Syntax for prchange
3.6.6 Syntax for prtab
3.6.7 Syntax for prgen
Options for confidence intervals and marginals
Variables generated
3.6.8 Computing marginal effects using mfx
3.7 Confidence intervals for prediction
3.8 Next steps
Part II   Models for Specific Kinds of Outcomes
4 Models for binary outcomes
4.1 The statistical model
4.1.1 A latent-variable model
4.1.2 A nonlinear probability model
4.2 Estimation using logit and probit
Variable lists
Specifying the estimation sample
4.2.1 Observations predicted perfectly
4.3 Hypothesis testing with test and lrtest
4.3.1 Testing individual coefficients
One- and two-tailed tests
Testing single coefficients using test
Testing single coefficients using lrtest
4.3.2 Testing multiple coefficients
Testing multiple coefficients using test
Testing multiple coefficients using lrtest
4.3.3 Comparing LR and Wald tests
4.4 Residuals and influence using predict
4.4.1 Residuals
4.4.2 Influential cases
4.4.3 Least likely observations
Options for controlling the list of values
4.5 Measuring fit
4.5.1 Scalar measures of fit using fitstat
4.5.2 Hosmer–Lemeshow statistic
4.6 Interpretation using predicted values
4.6.1 Predicted probabilities with predict
4.6.2 Individual predicted probabilities with prvalue
4.6.3 Tables of predicted probabilities with prtab
4.6.4 Graphing predicted probabilities with prgen
4.6.5 Plotting confidence intervals
4.6.6 Changes in predicted probabilities
Marginal change
Discrete change
4.7 Interpretation using odds ratios with listcoef
Multiplicative coefficients
Effect of the base probability
Percent change in the odds
4.8 Other commands for binary outcomes
5 Models for ordinal outcomes
5.1 The statistical model
5.1.1 A latent-variable model
5.1.2 A nonlinear probability model
5.2 Estimation using ologit and oprobit
Variable lists
Specifying the estimation sample
5.2.1 Example of attitudes toward working mothers
5.2.2 Predicting perfectly
5.3 Hypothesis testing with test and lrtest
5.3.1 Testing individual coefficients
5.3.2 Testing multiple coefficients
5.4 Scalar measures of fit using fitstat
5.5 Converting to a different parameterization*
5.6 The parallel regression assumption
5.7 Residuals and outliers using predict
5.8 Interpretation
5.8.1 Marginal change in y*
5.8.2 Predicted probabilities
5.8.3 Predicted probabilities with predict
5.8.4 Individual predicted probabilities with prvalue
5.8.5 Tables of predicted probabilities with prtab
5.8.6 Graphing predicted probabilities with prgen
5.8.7 Changes in predicted probabilities
Marginal change with prchange
Marginal change with mfx
Discrete change with prchange
Confidence intervals for discrete changes
Computing discrete change for a 10-year increase in age
5.8.8 Odds ratios using listcoef
5.9 Less common models for ordinal outcomes
5.9.1 The stereotype model
5.9.2 The generalized ordered logit model
5.9.3 The continuation ratio model
6 Models for nominal outcomes with case-specific data
6.1 The multinomial logit model
6.1.1 Formal statement of the model
6.2 Estimation using mlogit
Variable lists
Specifying the estimation sample
6.2.1 Example of occupational attainment
6.2.2 Using different base categories
6.2.3 Predicting perfectly
6.3 Hypothesis testing of coefficients
6.3.1 mlogtest for tests of the MNLM
6.3.2 Testing the effects of the independent variables
A likelihood-ratio test
A Wald test
Testing multiple independent variables
6.3.3 Tests for combining alternatives
A Wald test for combining alternatives
Using test [category]*
An LR test for combining alternatives
Using constraint with lrtest*
6.4 Independence of irrelevant alternatives
Hausman test of IIA
Small–Hsiao test of IIA
6.5 Measures of fit
6.6 Interpretation
6.6.1 Predicted probabilities
6.6.2 Predicted probabilities with predict
Using predict to compare mlogit and ologit
6.6.3 Predicted probabilities and discrete change with prvalue
6.6.4 Tables of predicted probabilities with prtab
6.6.5 Graphing predicted probabilities with prgen
Plotting probabilities for one outcome and two groups
Graphing probabilities for all outcomes for one group
6.6.6 Changes in predicted probabilities
Computing marginal and discrete change with prchange
Marginal change with mfx
6.6.7 Plotting discrete changes with prchange and mlogview
6.6.8 Odds ratios using listcoef and mlogview
Listing odds ratios with listcoef
Plotting odds ratios
6.6.9 Using mlogplot*
6.6.10 Plotting estimates from matrices with mlogplot*
Options for using matrices with mlogplot
Global macros and matrices used by mlogplot
6.7 Multinomial probit model with IIA
6.8 Stereotype logistic regression
6.8.1 Formal statement of the one-dimensional SLM
6.8.2 Fitting the SLM with slogit
6.8.3 Interpretation using predicted probabilities
6.8.4 Interpretation using odds ratios
6.8.5 Distinguishability and the φ parameters
6.8.6 Ordinality in the one-dimensional SLM
Higher-dimension SLM
7 Models for nominal outcomes with alternative-specific data
7.1 Alternative-specific data organization
7.1.1 Syntax for case2alt
7.2 The conditional logit model
7.2.1 Fitting the conditional logit model
Example of the clogit model
7.2.2 Interpreting odds ratios from clogit
7.2.3 Interpreting probabilities from clogit
Using predict
Using asprvalue
7.2.4 Fitting the multinomial logit model using clogit
Setting up the data with case2alt
Fitting multinomial logit with clogit
7.2.5 Using clogit with case- and alternative-specific variables
Example of a mixed model
Interpretation of odds ratios using listcoef
Interpretation of predicted probabilities using asprvalue
Allow the effects of alternative-specific variables to vary over the alternatives
7.3 Alternative-specific multinomial probit
7.3.1 The model
7.3.2 Informal explanation of estimation by simulation
7.3.3 Alternative-based data with uncorrelated errors
7.3.4 Alternative-based data with correlated errors
7.4 The structural covariance matrix
7.4.1 Interpretation using probabilities
Using predict
Using asprvalue
7.4.2 Identification, discrete change, and marginal effects
7.4.3 Testing for IIA
7.4.4 Adding case-specific data
7.5 Rank-ordered logistic regression
7.5.1 Fitting the rank-ordered logit model
Example of the rank-ordered logit model
7.5.2 Interpreting results from rologit
Interpretation using odds ratios
Interpretation using predicted probabilities
7.6 Conclusions
8 Models for count outcomes
8.1 The Poisson distribution
8.1.1 Fitting the Poisson distribution with the poisson command
8.1.2 Computing predicted probabilities with prcounts
Variables generated
8.1.3 Comparing observed and predicted counts with prcounts
8.2 The Poisson regression model
8.2.1 Estimating the PRM with poisson
Variable lists
Specifying the estimation sample
8.2.2 Example of fitting the PRM
8.2.3 Interpretation using the rate, μ
Factor change in E(y|x)
Percent change in E(y|x)
Example of factor and percent change
Marginal change in E(y|x)
Example of marginal change using prchange
Example of marginal change using mfx
Discrete change in E(y|x)
Example of discrete change using prchange
Example of discrete change with confidence intervals
8.2.4 Interpretation using predicted probabilities
Example of predicted probabilities using prvalue
Example of predicted probabilities using prgen
Example of predicted probabilities using prcounts
8.2.5 Exposure time*
8.3 The negative binomial regression model
8.3.1 Fitting the NBRM with nbreg
NB1 and NB2 variance functions
8.3.2 Example of fitting the NBRM
Comparing the PRM and NBRM using estimates table
8.3.3 Testing for overdispersion
8.3.4 Interpretation using the rate μ
8.3.5 Interpretation using predicted probabilities
8.4 Models for truncated counts
8.4.1 Fitting zero-truncated models
8.4.2 Example of fitting zero-truncated models
8.4.3 Interpretation of parameters
8.4.4 Interpretation using predicted probabilities and rates
8.4.5 Computing predicted rates and probabilities in the estimation sample
8.5 The hurdle regression model*
8.5.1 In-sample predictions for the hurdle model
8.5.2 Predictions for user-specified values
8.6 Zero-inflated count models
8.6.1 Fitting zero-inflated models with zinb and zip
Variable lists
8.6.2 Example of fitting the ZIP and ZINB models
8.6.3 Interpretation of coefficients
8.6.4 Interpretation of predicted probabilities
Predicted probabilities with prvalue
Confidence intervals with prvalue
Predicted probabilities with prgen
8.7 Comparisons among count models
8.7.1 Comparing mean probabilities
8.7.2 Tests to compare count models
LR tests of α
Vuong test of nonnested models
8.8 Using countfit to compare count models
9 More topics
9.1 Ordinal and nominal independent variables
9.1.1 Coding a categorical independent variable as a set of dummy variables
9.1.2 Estimation and interpretation with categorical independent variables
9.1.3 Tests with categorical independent variables
Testing the effect of membership in one category versus the reference category
Testing the effect of membership in two nonreference categories
Testing that a categorical independent variable has no effect
Testing whether treating an ordinal variable as interval loses information
9.1.4 Discrete change for categorical independent variables
Computing discrete change with prchange
Computing discrete change with prvalue
9.2 Interactions
9.2.1 Computing sex differences in predictions with interactions
9.2.2 Computing sex differences in discrete change with interactions
9.3 Nonlinear nonlinear models
9.3.1 Adding nonlinearities to linear predictors
9.3.2 Discrete change in nonlinear models
9.4 Using praccum and forvalues to plot predictions
9.4.1 Example using age and age-squared
9.4.2 Using forvalues with praccum
9.4.3 Using praccum for graphing a transformed variable
9.4.4 Using praccum to graph interactions
9.4.5 Using forvalues with prvalue to create tables
9.4.6 A more advanced example*
9.4.7 Using forvalues to create tables with other commands
9.5 Extending SPost to other estimation commands
9.6 Using Stata more efficiently
9.6.2 Changing screen fonts and window preferences
9.6.3 Using ado-files for changing directories
9.6.4 me.hlp file
9.7 Conclusions
A Syntax for SPost Commands
A.1 asprvalue
A.2 brant
Saved results
A.3 case2alt
A.4 countfit
Options for specifying the model
Options to select the models to fit
Options to label and save results
Options to control what is printed
A.5 fitstat
Saved results
A.6 leastlikely
Syntax Description
Options for listing
A.7 listcoef
Options for nominal outcomes
Saved results
A.8 misschk
A.9 mlogplot
A.10 mlogtest
Saved results
A.11 mlogview
Dialog box controls
A.12 Overview of prchange, prgen, prtab, and prvalue
A.13 praccum
Variables generated
A.14 prchange
A.15 prcounts
Variables generated
A.16 prgen
Options for confidence intervals and marginals
Variables generated
A.17 prtab
A.18 prvalue
Options for confidence intervals
Options used for bootstrapped confidence intervals
Saved results
A.19 spex
B Description of datasets
B.1 binlfp2
B.2 couart2
B.3 gsskidvalue2
B.4 nomocc2
B.5 ordwarm2
B.6 science2
B.7 travel2
B.8 wlsrnk