*Preface*

*Acknowledgments*

1 INTRODUCTION

The Role of the Computer in Data Analysis

Statistics: Descriptive and Inferential

Variables and Constants

The Measurement of Variables

Discrete and Continuous Variables

Setting a Context with Real Data

Exercises

2 EXAMINING UNIVARIATE DISTRIBUTIONS

Counting the Occurrence of Data Values

When Variables Are Measured at the Nominal Level

Frequency and Percent Distribution Tables

Bar Charts

Pie Charts

When Variables Are Measured at the Ordinal, Interval, or Ratio Level

Frequency and Percent Distribution Tables

Stem-and-Leaf Displays

Histograms

Line Graphs

Describing the Shape of a Distribution

Accumulating Data

Cumulative Percent Distributions

Ogive Curves

Percentile Ranks

Percentiles

Five-Number Summaries and Boxplots

Modifying the Appearance of Graphs

Summary of Graphical Selection

Summary of Stata Commands in Chapter 2

Exercises

3 MEASURES OF LOCATION, SPREAD, AND SKEWNESS

Characterizing the Location of a Distribution

The Mode

The Median

The Arithmetic Mean

Interpreting the Mean of a Dichotomous Variable

The Weighted Mean

Comparing the Mode, Median, and Mean

Characterizing the Spread of a Distribution

The Range and Interquartile Range

The Variance

The Standard Deviation

Characterizing the Skewness of a Distribution

Selecting Measures of Location and Spread

Applying What We Have Learned

Summary of Stata Commands in Chapter 3

Helpful Hints When Using Stata

Online Resources

The Stata Command

Stata TIPS

Exercises

4 REEXPRESSING VARIABLES

Linear and Nonlinear Transformations

Linear Transformations: Addition, Subtraction, Multiplication, and Division

The Effect on the Shape of a Distribution

The Effect on Summary Statistics of a Distribution

Common Linear Transformations

Standard Scores

*z*-Scores

Using *z*-Scores to Detect Outliers

Using *z*-Scores to Compare Scores in Different Distributions

Relating *z*-Scores to Percentile Ranks

Nonlinear Transformations: Square Roots and Logarithms

Nonlinear Transformations: Ranking Variables

Other Transformations: Recoding and Combining Variables

Recoding Variables

Combining Variables

Data Management Fundamentals – the Do-File

Summary of Stata Commands in Chapter 4

Exercises

5 EXPLORING RELATIONSHIPS BETWEEN TWO VARIABLES

When Both Variables Are at Least Interval-Leveled

Scatterplots

The Pearson Product Moment Correlation Coefficient

Interpreting the Pearson Correlation Coefficient

• Judging the Strength of the Linear Relationship • The
Correlation Scale Itself Is Ordinal • Correlation Does Not Imply
Causation • The Effect of Linear Transformations •
Restriction of Range • The Shape of the Underlying Distributions
• The Reliability of the Data

When at Least One Variable Is Ordinal and the Other Is at Least Ordinal: The
Spearman Rank Correlation Coefficient

When at Least One Variable Is Dichotomous: Other Special Cases of the Pearson
Correlation Coefficient

The Point Biserial Correlation Coefficient: The Case of One at Least
Interval and One Dichotomous Variable

The Phi Coefficient: The Case of Two Dichotomous Variables

Other Visual Displays of Bivariate Relationships

Selection of Appropriate Statistic/Graph to Summarize a Relationship

Summary of Stata Commands in Chapter 5

Exercises

6 SIMPLE LINEAR REGRESSION

The “Best-Fitting” Linear Equation

The Accuracy of Prediction Using the Linear Regression Model

The Standardized Regression Equation

*R* as a Measure of the Overall Fit of the Linear Regression Model

Simple Linear Regression When the Independent Variable Is Dichotomous

Using *r* and *R* as Measures of Effect Size

Emphasizing the Importance of the Scatterplot

Summary of Stata Commands in Chapter 6

Exercises

7 PROBABILITY FUNDAMENTALS

The Discrete Case

The Complement Rule of Probability

The Additive Rules of Probability

First Additive Rule of Probability

Second Additive Rule of Probability

The Multiplicative Rule of Probability

The Relationship between Independence and Mutual Exclusivity

Conditional Probability

The Law of Large Numbers

Exercises

8 THEORETICAL PROBABILITY MODELS

The Binomial Probability Model and Distribution

The Applicability of the Binomial Probability Model

The Normal Probability Model and Distribution

Using the Normal Distribution to Approximate the Binomial Distribution

Summary of Chapter 8 Stata Commands

Exercises

9 THE ROLE OF SAMPLING IN INFERENTIAL STATISTICS

Samples and Populations

Random Samples

Obtaining a Simple Random Sample

Sampling with and without Replacement

Sampling Distributions

Describing the Sampling Distribution of Means Empirically

Describing the Sampling Distribution of Means Theoretically: The Central
Limit Theorem

Central Limit Theorem (CLT)

Estimators and Bias

Summary of Chapter 9 Stata Commands

Exercises

10 INFERENCES INVOLVING THE MEAN OF A SINGLE POPULATION
WHEN σ IS KNOWN

Estimating the Population Mean, μ, When the Population Standard Deviation,
σ, Is Known

Interval Estimation

Relating the Length of a Confidence Interval, the Level of Confidence, and the
Sample Size

Hypothesis Testing

The Relationship between Hypothesis Testing and Interval Estimation

Effect Size

Type II Error and the Concept of Power

Increasing the Level of Significance, α

Increasing the Effect Size, δ

Decreasing the Standard Error of the Mean, σ_{x̄}

Closing Remarks

Summary of Chapter 10 Stata Commands

Exercises

11 INFERENCES INVOLVING THE MEAN WHEN σ IS NOT
KNOWN: ONE- AND TWO-SAMPLE DESIGNS

Single Sample Designs When the Parameter of Interest Is the Mean and σ
Is Not Known

The *t* Distribution

Degrees of Freedom for the One Sample *t*-Test

Violating the Assumption of a Normally Distributed Parent Population in the
One Sample *t*-Test

Confidence Intervals for the One Sample *t*-Test

Hypothesis Tests: The One Sample *t*-Test

Effect Size for the One Sample *t*-Test

Two Sample Designs When the Parameter of Interest Is μ, and σ Is
Not Known

Independent (or Unrelated) and Dependent (or Related) Samples

Independent Samples *t*-Test and Confidence Interval

The Assumptions of the Independent Samples *t*-Test

Effect Size for the Independent Samples *t*-Test

Paired Samples *t*-Test and Confidence Interval

The Assumptions of the Paired Samples *t*-Test

Effect Size for the Paired Samples *t*-Test

The Bootstrap

Summary

Summary of Chapter 11 Stata Commands

Exercises

12 RESEARCH DESIGN: INTRODUCTION AND OVERVIEW

Questions and Their Link to Descriptive, Relational, and Causal Research
Studies

The Need for a Good Measure of Our Construct, Weight

The Descriptive Study

From Descriptive to Relational Studies

From Relational to Causal Studies

The Gold Standard of Causal Studies: The True Experiment and Random
Assignment

Comparing Two Kidney Stone Treatments Using a Non-randomized Controlled Study

Including Blocking in a Research Design

Underscoring the Importance of Having a True Control Group Using
Randomization

Analytic Methods for Bolstering Claims of Causality from Observational Data
(Optional Reading)

Quasi-Experimental Designs

Threats to the Internal Validity of a Quasi-Experimental Design

Threats to the External Validity of a Quasi-Experimental Design

Threats to the Validity of a Study: Some Clarifications and Caveats

Threats to the Validity of a Study: Some Examples

Exercises

13 ONE-WAY ANALYSIS OF VARIANCE

The Disadvantage of Multiple *t*-Tests

The One-Way Analysis of Variance

A Graphical Illustration of the Role of Variance in Tests on Means

ANOVA as an Extension of the Independent Samples *t*-Test

Developing an Index of Separation for the Analysis of Variance

Carrying Out the ANOVA Computation

The Between Group Variance (MS_{B})

The Within Group Variance (MS_{W})

The Assumptions of the One-Way ANOVA

Testing the Equality of Population Means: The *F*-Ratio

How to Read the Tables and Use Stata Functions for the *F*-Distribution

ANOVA Summary Table

Measuring the Effect Size

Post-Hoc Multiple Comparison Tests

The Bonferroni Adjustment: Testing Planned Comparisons

The Bonferroni Tests on Multiple Measures

Summary of Stata Commands in Chapter 13

Exercises

14 TWO-WAY ANALYSIS OF VARIANCE

The Two-Factor Design

The Concept of Interaction

The Hypotheses That Are Tested by a Two-Way Analysis of Variance

Assumptions of the Two-Way Analysis of Variance

Balanced versus Unbalanced Factorial Designs

Partitioning the Total Sum of Squares

Using the *F*-Ratio to Test the Effects in Two-Way ANOVA

Carrying Out the Two-Way ANOVA Computation by Hand

Decomposing Score Deviations about the Grand Mean

Modeling Each Score as a Sum of Component Parts

Explaining the Interaction as a Joint (or Multiplicative) Effect

Measuring Effect Size

Fixed versus Random Factors

Post-hoc Multiple Comparison Tests

Summary of Steps to be Taken in a Two-Way ANOVA Procedure

Summary of Stata Commands in Chapter 14

Exercises

15 CORRELATION AND SIMPLE REGRESSION AS INFERENTIAL
TECHNIQUES

The Bivariate Normal Distribution

Testing Whether the Population Pearson Product Moment Correlation Equals Zero

Using a Confidence Interval to Estimate the Size of the Population Correlation
Coefficient, ρ

Revisiting Simple Linear Regression for Prediction

Estimating the Population Standard Error of Prediction, σ_{Y|X}

Testing the *b*-Weight for Statistical Significance

Explaining Simple Regression Using an Analysis of Variance Framework

Measuring the Fit of the Overall Regression Equation: Using *R* and
*R*^{2}

Relating *R*^{2} to σ^{2}_{Y|X}

Testing *R*^{2} for Statistical Significance

Estimating the True Population *R*^{2}: The Adjusted *R*^{2}

Exploring the Goodness of Fit of the Regression Equation: Using Regression
Diagnostics

Residual Plots: Evaluating the Assumptions Underlying Regression

Detecting Influential Observations: Discrepancy and Leverage

Using Stata to Obtain Leverage

Using Stata to Obtain Discrepancy

Using Stata to Obtain Influence

Using Diagnostics to Evaluate the Ice Cream Sales Example

Using the Prediction Model to Predict Ice Cream Sales

Simple Regression When the Predictor is Dichotomous

Summary of Stata Commands in Chapter 15

Exercises

16 AN INTRODUCTION TO MULTIPLE REGRESSION

The Basic Equation with Two Predictors

Equations for *b*, β, and *R*_{Y.12} When the
Predictors Are Not Correlated

Equations for *b*, β, and *R*_{Y.12} When the
Predictors Are Correlated

Summarizing and Expanding on Some Important Principles of Multiple Regression

Testing the *b*-Weights for Statistical Significance

Assessing the Relative Importance of the Independent Variables in the Equation

Measuring the Drop in *R*^{2} Directly: An Alternative to the
Squared Semipartial Correlation

Evaluating the Statistical Significance of the Change in *R*^{2}

The *b*-Weight as a Partial Slope in Multiple Regression

Multiple Regression When One of the Two Independent Variables is Dichotomous

The Concept of Interaction between Two Variables That Are at Least
Interval-Leveled

Testing the Statistical Significance of an Interaction Using Stata

Centering First-Order Effects to Achieve Meaningful Interpretations of
*b*-Weights

Understanding the Nature of a Statistically Significant Two-Way Interaction

Interaction When One of the Independent Variables Is Dichotomous and the
Other Is Continuous

Summary of Stata Commands in Chapter 16

Exercises

17 NONPARAMETRIC METHODS

Parametric versus Nonparametric Methods

Nonparametric Methods When the Dependent Variable Is at the Nominal Level

The Chi-Square Distribution (χ^{2})

The Chi-Square Goodness-of-Fit Test

The Chi-Square Test of Independence

Assumptions of the Chi-Square Test of Independence

Fisher’s Exact Test

Calculating the Fisher Exact Test by Hand Using the Hypergeometric
Distribution

Nonparametric Methods When the Dependent Variable Is Ordinal-Leveled

Wilcoxon Sign Test

The Mann-Whitney *U* Test

The Kruskal-Wallis Analysis of Variance

Summary of Stata Commands in Chapter 17

Exercises

*Appendix A Data Set Descriptions*

*Appendix B Stata .do Files and Data Sets in Stata Format*

*Appendix C Statistical Tables*

*Appendix D References*

*Appendix E Solutions*

*Index*