*Preface*

New to the Second Edition

Guiding Principles Underlying Our Approach

Overview of Content Coverage and Intended Audience

*Acknowledgments*

1 INTRODUCTION

The Role of Statistical Software in Data Analysis

Statistics: Descriptive and Inferential

Variables and Constants

The Measurement of Variables

Nominal Level

Ordinal Level

Interval Level

Ratio Level

Choosing a Scale of Measurement

Discrete and Continuous Variables

Setting a Context with Real Data

Exercises

2 EXAMINING UNIVARIATE DISTRIBUTIONS

Counting the Occurrence of Data Values

When Variables are Measured at the Nominal Level

Frequency and Percent Distribution Tables

Bar Charts

Pie Charts

When Variables are Measured at the Ordinal, Interval, or Ratio Level

Frequency and Percent Distribution Tables

Stem-and-Leaf Displays

Histograms

Line Graphs

Describing the Shape of a Distribution

Accumulating Data

Cumulative Percent Distributions

Ogive Curves

Percentile Ranks

Percentiles

Five-Number Summaries and Boxplots

Modifying the Appearance of Graphs

Summary of Graphical Selection

Summary of Stata Commands

Exercises

3 MEASURES OF LOCATION, SPREAD, AND SKEWNESS

Characterizing the Location of a Distribution

The Mode

The Median

The Arithmetic Mean

Interpreting the Mean of a Dichotomous Variable

The Weighted Mean

Comparing the Mode, Median, and Mean

Characterizing the Spread of a Distribution

The Range and Interquartile Range

The Variance

The Standard Deviation

Characterizing the Skewness of a Distribution

Selecting Measures of Location and Spread

Applying What We Have Learned

Summary of Stata Commands

Helpful Hints When Using Stata

Online Resources

The Stata Command

Stata Tips

Exercises

4 RE-EXPRESSING VARIABLES

Linear and Nonlinear Transformations

Linear Transformations: Addition, Subtraction, Multiplication, and Division

The Effect on the Shape of a Distribution

The Effect on Summary Statistics of a Distribution

Common Linear Transformations

Standard Scores

*z*-Scores

Using *z*-Scores to Detect Outliers

Using *z*-Scores to Compare Scores in Different Distributions

Relating *z*-Scores to Percentile Ranks

Nonlinear Transformations: Square Roots and Logarithms

Nonlinear Transformations: Ranking Variables

Other Transformations: Recoding and Combining Variables

Recoding Variables

Combining Variables

Data Management Fundamentals: The Do-File

Summary of Stata Commands

Exercises

5 EXPLORING RELATIONSHIPS BETWEEN TWO VARIABLES

When Both Variables are at Least Interval-Leveled

Scatterplots

The Pearson Product–Moment Correlation Coefficient

Interpreting the Pearson Correlation Coefficient

Judging the Strength of the Linear Relationship

The Correlation Scale Itself Is Ordinal

Correlation Does Not Imply Causation

The Effect of Linear Transformations

Restriction of Range

The Shape of the Underlying Distributions

The Reliability of the Data

When at Least One Variable Is Ordinal and the Other Is at Least Ordinal: The
Spearman Rank Correlation Coefficient

When at Least One Variable Is Dichotomous: Other Special Cases of the Pearson
Correlation Coefficient

The Point Biserial Correlation Coefficient: The Case of One at Least
Interval and One Dichotomous Variable

The Phi Coefficient: The Case of Two Dichotomous Variables

Other Visual Displays of Bivariate Relationships

Selection of Appropriate Statistic or Graph to Summarize a Relationship

Summary of Stata Commands

Exercises

6 SIMPLE LINEAR REGRESSION

The “Best-Fitting” Linear Equation

The Accuracy of Prediction Using the Linear Regression Model

The Standardized Regression Equation

*R* As a Measure of the Overall Fit of the Linear Regression Model

Simple Linear Regression When the Independent Variable Is Dichotomous

Using *r* and *R* As Measures of Effect Size

Emphasizing the Importance of the Scatterplot

Summary of Stata Commands

Exercises

7 PROBABILITY FUNDAMENTALS

The Discrete Case

The Complement Rule of Probability

The Additive Rules of Probability

First Additive Rule of Probability

Second Additive Rule of Probability

The Multiplicative Rule of Probability

The Relationship between Independence and Mutual Exclusivity

Conditional Probability

The Law of Total Probability

Bayes' Theorem

The Law of Large Numbers

Exercises

8 THEORETICAL PROBABILITY MODELS

The Binomial Probability Model and Distribution

The Applicability of the Binomial Probability Model

The Normal Probability Model and Distribution

Using the Normal Distribution to Approximate the Binomial Distribution

Summary of Stata Commands

Exercises

9 THE ROLE OF SAMPLING IN INFERENTIAL STATISTICS

Samples and Populations

Random Samples

Obtaining a Simple Random Sample

Sampling with and without Replacement

Sampling Distributions

Describing the Sampling Distribution of Means Empirically

Describing the Sampling Distribution of Means Theoretically

Central Limit Theorem

Estimators and Bias

Summary of Stata Commands

Exercises

10 INFERENCES INVOLVING THE MEAN OF A SINGLE POPULATION
WHEN σ IS KNOWN

Estimating the Population Mean, μ, When the Population Standard Deviation,
σ, Is Known

Interval Estimation

Relating the Length of a Confidence Interval, the Level of Confidence, and the
Sample Size

Hypothesis Testing

The Relationship between Hypothesis Testing and Interval Estimation

Effect Size

Type II Error and the Concept of Power

Increasing the Level of Significance, α

Increasing the Effect Size, δ

Decreasing the Standard Error of the Mean, σ_{𝓍̅}

Closing Remarks

Summary of Stata Commands

Exercises

11 INFERENCES INVOLVING THE MEAN WHEN σ IS NOT
KNOWN: ONE- AND TWO-SAMPLE DESIGNS

Single Sample Designs When the Parameter of Interest Is the Mean and σ
Is Not Known

The *t*-Distribution

Degrees of Freedom for the One-Sample *t*-Test

Violating the Assumption of a Normally Distributed Parent Population in the
One-Sample *t*-Test

Confidence Intervals for the One-Sample *t*-Test

Hypothesis Tests: The One-Sample *t*-Test

Effect Size for the One-Sample *t*-Test

Two-Sample Designs When the Parameter of Interest Is μ, and σ Is
Not Known

Independent (or Unrelated) and Dependent (or Related) Samples

Independent Samples *t*-Test and Confidence Interval

The Assumptions of the Independent Samples *t*-Test

Effect Size for the Independent Samples *t*-Test

Paired Samples *t*-Test and Confidence Interval

The Assumptions of the Paired Samples *t*-Test

Effect Size for the Paired Samples *t*-Test

The Bootstrap

Conducting Power Analyses for *t*-Tests on Means

Summary

Summary of Stata Commands

Exercises

12 RESEARCH DESIGN: INTRODUCTION AND OVERVIEW

Questions and their Link to Descriptive, Relational, and Causal Research
Studies

The Need for a Good Measure of Our Construct: Weight

The Descriptive Study

From Descriptive to Relational Studies

From Relational to Causal Studies

The Gold Standard of Causal Studies: The True Experiment and Random Assignment

Comparing Two Kidney Stone Treatments Using a Non-Randomized Controlled Study

Including Blocking in a Research Design

Underscoring the Importance of Having a True Control Group Using Randomization

Analytic Methods for Bolstering Claims of Causality from Observational Data

Quasi-Experimental Designs

Threats to the Internal Validity of a Quasi-Experimental Design

Threats to the External Validity of a Quasi-Experimental Design

Threats to the Validity of a Study: Some Clarifications and Caveats

Threats to the Validity of a Study: Some Examples

Exercises

13 ONE-WAY ANALYSIS OF VARIANCE

The Disadvantage of Multiple *t*-Tests

The One-Way Analysis of Variance

A Graphical Illustration of the Role of Variance in Tests on Means

ANOVA As an Extension of the Independent Samples *t*-Test

Developing an Index of Separation for the Analysis of Variance

Carrying Out the ANOVA Computation

The Between Group Variance (MS_{B})

The Within Group Variance (MS_{W})

The Assumptions of the One-Way ANOVA

Testing the Equality of Population Means: The *F*-Ratio

How to Read the Tables and Use Stata Functions for the *F*-Distribution

ANOVA Summary Table

Measuring the Effect Size

Post-Hoc Multiple Comparison Tests

The Bonferroni Adjustment: Testing Planned Comparisons

The Bonferroni Tests on Multiple Measures

Conducting Power Analyses for One-Way ANOVA

Summary of Stata Commands

Exercises

14 TWO-WAY ANALYSIS OF VARIANCE

The Two-Factor Design

The Concept of Interaction

The Hypotheses That are Tested by a Two-Way Analysis of Variance

Assumptions of the Two-Way Analysis of Variance

Balanced versus Unbalanced Factorial Designs

Partitioning the Total Sum of Squares

Using the *F*-Ratio to Test the Effects in Two-Way ANOVA

Carrying Out the Two-Way ANOVA Computation by Hand

Decomposing Score Deviations about the Grand Mean

Modeling Each Score as a Sum of Component Parts

Explaining the Interaction As a Joint (or Multiplicative) Effect

Measuring Effect Size

Fixed versus Random Factors

Post-hoc Multiple Comparison Tests

Simple Effects and Pairwise Comparisons

Summary of Steps to Be Taken in a Two-Way ANOVA Procedure

Conducting Power Analyses for Two-Way ANOVA

Summary of Stata Commands

Exercises

15 CORRELATION AND SIMPLE REGRESSION AS INFERENTIAL
TECHNIQUES

The Bivariate Normal Distribution

Testing whether the Population Pearson Product-Moment Correlation Equals Zero

Using a Confidence Interval to Estimate the Size of the Population Correlation
Coefficient, ρ

Revisiting Simple Linear Regression for Prediction

Estimating the Population Standard Error of Prediction, σ_{Y|X}

Testing the *b*-Weight for Statistical Significance

Explaining Simple Regression Using an Analysis of Variance Framework

Measuring the Fit of the Overall Regression Equation: Using *R* and *R*^{2}

Relating *R*^{2} to σ^{2}_{Y|X}

Testing *R*^{2} for Statistical Significance

Estimating the True Population *R*^{2}: The Adjusted *R*^{2}

Exploring the Goodness of Fit of the Regression Equation: Using Regression
Diagnostics

Residual Plots: Evaluating the Assumptions Underlying Regression

Detecting Influential Observations: Discrepancy and Leverage

Using Stata to Obtain Leverage

Using Stata to Obtain Discrepancy

Using Stata to Obtain Influence

Using Diagnostics to Evaluate the Ice Cream Sales Example

Using the Prediction Model to Predict Ice Cream Sales

Simple Regression When the Predictor Is Dichotomous

Conducting Power Analyses for Correlation and Simple Regression

Summary of Stata Commands

Exercises

16 AN INTRODUCTION TO MULTIPLE REGRESSION

The Basic Equation with Two Predictors

Equations for *b*, β, and *R*_{Y.12} When the Predictors Are Not Correlated

Equations for *b*, β, and *R*_{Y.12} When the Predictors Are Correlated

Summarizing and Expanding on Some Important Principles of Multiple Regression

Testing the *b*-Weights for Statistical Significance

Assessing the Relative Importance of the Independent Variables in the Equation

Measuring the Drop in *R*^{2} Directly: An Alternative to the Squared Semipartial Correlation

Evaluating the Statistical Significance of the Change in *R*^{2}

The *b*-Weight As a Partial Slope in Multiple Regression

Multiple Regression When One of the Two Independent Variables Is Dichotomous

Controlling Variables Statistically: A Closer Look

A Hypothetical Example

Conducting Power Analyses for Multiple Regression

Summary of Stata Commands

Exercises

17 TWO-WAY INTERACTIONS IN MULTIPLE REGRESSION

Testing the Statistical Significance of an Interaction Using Stata

Comparing the *Y*-Hat Values from the Additive and Interaction Models

Centering First-Order Effects if the Equation Has an Interaction

Probing the Nature of a Two-Way Interaction

Interaction When One of the Independent Variables Is Dichotomous and the Other Is Continuous

Methods Useful for Model Selection

Conducting a Power Analysis to Detect an Interaction

Summary of Stata Commands

Exercises

18 NONPARAMETRIC METHODS

Parametric versus Nonparametric Methods

Nonparametric Methods When the Dependent Variable Is at the Nominal Level

The Chi-Square Distribution (χ^{2})

The Chi-Square Goodness-of-Fit Test

The Chi-Square Test of Independence

Assumptions of the Chi-Square Test of Independence

Fisher’s Exact Test

Calculating Fisher’s Exact Test by Hand Using the Hypergeometric Distribution

Nonparametric Methods When the Dependent Variable Is Ordinal-Leveled

Wilcoxon Sign Test

The Mann–Whitney *U* Test or Wilcoxon's Rank-Sum Test

The Kruskal–Wallis Analysis of Variance

Summary of Stata Commands

Exercises

19 COMMUNICATING YOUR STATA RESULTS VIA EXCEL

Setting the Working Directory

Reproducing a Table of Univariate Summary Statistics in Excel

Using estpost and esttab

Using putexcel

Reproducing a Correlation Matrix As a Table in Excel

Using estpost and esttab

Using putexcel

Reproducing Regression Output As a Table in Excel

Using outreg2 to obtain a table of model statistics in Excel

Using eststo and esttab to obtain a table of model statistics in Excel

Using putexcel to reproduce a table of regression coefficients in Excel

Reproducing a Graph in Excel (Using putexcel)

Conclusion

Summary of Stata Commands

Exercises

*Appendix A Data Set Descriptions*

*Appendix B Stata Do-Files and Data Sets in Stata Format*

*Appendix C Statistical Tables*

*Appendix D Solutions*

*References*

*Index*