List of figures
List of tables
List of boxed tips
Support materials for the book
Glossary of acronyms
Glossary of mathematical and statistical symbols
1 Getting started
1.1 Conventions
1.2 Introduction
1.3 The Stata screen
1.4 Using an existing dataset
1.5 An example of a short Stata session
1.6 Video aids to learning Stata
1.7 Summary
1.8 Exercises
2 Entering data
2.1 Creating a dataset
2.2 An example questionnaire
2.3 Developing a coding system
2.4 Entering data using the Data Editor
2.4.1 Value labels
2.5 The Variables Manager
2.6 The Data Editor (Browse) view
2.7 Saving your dataset
2.8 Checking the data
2.9 Summary
2.10 Exercises
3 Preparing data for analysis
3.1 Introduction
3.2 Planning your work
3.3 Creating value labels
3.4 Reverse-code variables
3.5 Creating and modifying variables
3.6 Creating scales
3.7 Saving some of your data
3.8 Summary
3.9 Exercises
4 Working with commands, do-files, and results
4.1 Introduction
4.2 How Stata commands are constructed
4.3 Creating a do-file
4.4 Copying your results to a word processor
4.5 Logging your command file
4.6 Summary
4.7 Exercises
5 Descriptive statistics and graphs for one variable
5.1 Descriptive statistics and graphs
5.2 Where is the center of a distribution?
5.3 How dispersed is the distribution?
5.4 Statistics and graphs—unordered categories
5.5 Statistics and graphs—ordered categories and variables
5.6 Statistics and graphs—quantitative variables
5.7 Summary
5.8 Exercises
6 Statistics and graphs for two categorical variables
6.1 Relationship between categorical variables
6.2 Cross-tabulation
6.3 Chi-squared test
6.3.1 Degrees of freedom
6.3.2 Probability tables
6.4 Percentages and measures of association
6.5 Odds ratios when dependent variable has two categories
6.6 Ordered categorical variables
6.7 Interactive tables
6.8 Tables—linking categorical and quantitative variables
6.9 Power analysis when using a chi-squared test of significance
6.10 Summary
6.11 Exercises
7 Tests for one or two means
7.1 Introduction to tests for one or two means
7.2 Randomization
7.3 Random sampling
7.4 Hypotheses
7.5 One-sample test of a proportion
7.6 Two-sample test of a proportion
7.7 One-sample test of means
7.8 Two-sample test of group means
7.8.1 Testing for unequal variances
7.9 Repeated-measures t test
7.10 Power analysis
7.11 Nonparametric alternatives
7.11.1 Mann–Whitney two-sample rank-sum test
7.11.2 Nonparametric alternative: Median test
7.12 Video tutorial related to this chapter
7.13 Summary
7.14 Exercises
8 Bivariate correlation and regression
8.1 Introduction to bivariate correlation and regression
8.2 Scattergrams
8.3 Plotting the regression line
8.4 An alternative to producing a scattergram, binscatter
8.5 Correlation
8.6 Regression
8.7 Spearman’s rho: Rank-order correlation for ordinal data
8.8 Power analysis with correlation
8.9 Summary
8.10 Exercises
9 Analysis of variance
9.1 The logic of one-way analysis of variance
9.2 ANOVA example
9.3 ANOVA example with nonexperimental data
9.4 Power analysis for one-way ANOVA
9.5 A nonparametric alternative to ANOVA
9.6 Analysis of covariance
9.7 Two-way ANOVA
9.8 Repeated-measures design
9.9 Intraclass correlation—measuring agreement
9.10 Power analysis with ANOVA
9.10.1 Power analysis for one-way ANOVA
9.10.2 Power analysis for two-way ANOVA
9.10.3 Power analysis for repeated-measures ANOVA
9.10.4 Summary of power analysis for ANOVA
9.11 Summary
9.12 Exercises
10 Multiple regression
10.1 Introduction to multiple regression
10.2 What is multiple regression?
10.3 The basic multiple regression command
10.4 Increment in R-squared: Semipartial correlations
10.5 Is the dependent variable normally distributed?
10.6 Are the residuals normally distributed?
10.7 Regression diagnostic statistics
10.7.1 Outliers and influential cases
10.7.2 Influential observations: DFbeta
10.7.3 Combinations of variables may cause problems
10.8 Weighted data
10.9 Categorical predictors and hierarchical regression
10.10 A shortcut for working with a categorical variable
10.11 Fundamentals of interaction
10.12 Nonlinear relations
10.12.1 Fitting a quadratic model
10.12.2 Centering when using a quadratic term
10.12.3 Do we need to add a quadratic component?
10.13 Power analysis in multiple regression
10.14 Summary
10.15 Exercises
11 Logistic regression
11.1 Introduction to logistic regression
11.2 An example
11.3 What is an odds ratio and a logit?
11.3.1 The odds ratio
11.3.2 The logit transformation
11.4 Data used in the rest of the chapter
11.5 Logistic regression
11.6 Hypothesis testing
11.6.1 Testing individual coefficients
11.6.2 Testing sets of coefficients
11.7 Margins: More on interpreting results from logistic regression
11.8 Nested logistic regressions
11.9 Power analysis when doing logistic regression
11.10 Next steps for using logistic regression and its extensions
11.11 Summary
11.12 Exercises
12 Measurement, reliability, and validity
12.1 Overview of reliability and validity
12.2 Constructing a scale
12.2.1 Generating a mean score for each person
12.3 Reliability
12.3.1 Stability and test–retest reliability
12.3.2 Equivalence
12.3.3 Split-half and alpha reliability—internal consistency
12.3.4 Kuder–Richardson reliability for dichotomous items
12.3.5 Rater agreement—kappa (κ)
12.4 Validity
12.4.1 Expert judgment
12.4.2 Criterion-related validity
12.4.3 Construct validity
12.5 Factor analysis
12.6 PCF analysis
12.6.1 Orthogonal rotation: Varimax
12.6.2 Oblique rotation: Promax
12.7 But we wanted one scale, not four scales
12.7.1 Scoring our variable
12.8 Summary
12.9 Exercises
13 Structural equation and generalized structural equation modeling
13.1 Linear regression using sem
13.1.1 Using the sem command directly
13.1.2 SEM and working with missing values
13.1.3 Exploring missing values and auxiliary variables
13.1.4 Getting auxiliary variables into your SEM command
13.2 A quick way to draw a regression model
13.3 The gsem command for logistic regression
13.3.1 Fitting the model using the logit command
13.3.2 Fitting the model using the gsem command
13.4 Path analysis and mediation
13.5 Conclusions and what is next for the sem command
13.6 Exercises
14 Working with missing values—multiple imputation
14.1 Working with missing values—multiple imputation
14.2 What variables do we include when doing imputations?
14.3 The nature of the problem
14.4 Multiple imputation and its assumptions about the mechanism for missingness
14.5 Multiple imputation
14.6 A detailed example
14.6.1 Preliminary analysis
14.6.2 Setup and multiple-imputation stage
14.6.3 The analysis stage
14.6.4 For those who want an R² and standardized βs
14.6.5 When impossible values are imputed
14.7 Summary
14.8 Exercises
15 An introduction to multilevel analysis
15.1 Questions and data for groups of individuals
15.2 Questions and data for a longitudinal multilevel application
15.3 Fixed-effects regression models
15.4 Random-effects regression models
15.5 An applied example
15.5.1 Research questions
15.5.2 Reshaping data to do multilevel analysis
15.6 A quick visualization of our data
15.7 Random-intercept model
15.7.1 Random intercept—linear model
15.7.2 Random-intercept model—quadratic term
15.7.3 Treating time as a categorical variable
15.8 Random-coefficients model
15.9 Including a time-invariant covariate
15.10 Summary
15.11 Exercises
16 Item response theory (IRT)
16.1 How are IRT measures of variables different from summated scales?
16.2 Overview of three IRT models for dichotomous items
16.2.1 The one-parameter logistic (1PL) model
16.2.2 The two-parameter logistic (2PL) model
16.2.3 The three-parameter logistic (3PL) model
16.3 Fitting the 1PL model using Stata
16.3.1 The estimation
16.3.2 How important is each of the items?
16.3.3 An overall evaluation of our scale
16.3.4 Estimating the latent score
16.4 Fitting a 2PL IRT model
16.4.1 Fitting the 2PL model
16.5 The graded response model—IRT for Likert-type items
16.5.1 The data
16.5.2 Fitting our graded response model
16.5.3 Estimating a person’s score
16.6 Reliability of the fitted IRT model
16.7 Using the Stata menu system
16.8 Extensions of IRT
16.9 Exercises
A What’s next?
A.1 Introduction to the appendix
A.2 Resources
A.2.1 Web resources
A.2.2 Books about Stata
A.2.3 Short courses
A.2.4 Acquiring data
A.2.5 Learning from the postestimation methods
A.3 Summary
References