Stata Bookstore: Statistics Using Stata: An Integrative Approach, Third Edition

Statistics Using Stata: An Integrative Approach, Third Edition

eBook not available for this title

Authors:	Sharon Lawner Weinberg, Sarah Knapp Abramowitz, and Daphna Harel
Publisher:	Cambridge University Press
Copyright:	2024
ISBN-13:	978-1-009-39100-9
Pages:	780; paperback

Comment from the Stata technical group

Statistics Using Stata: An Integrative Approach, Third Edition, by Sharon Lawner Weinberg, Sarah Knapp Abramowitz, and Daphna Harel, is an excellent introduction to applied statistics and its implementation in Stata. The authors cover essential topics from exploratory data analysis to multiple regression, interweaving statistical concepts and their application in Stata. Their repeated use of real data throughout the book clearly connects the statistical concepts to real-world applications. Designed for teaching graduate and undergraduate students from the behavioral, social, and health sciences, this text is accompanied by additional online resources such as PowerPoint slides and Stata do-files. Each chapter concludes with exercises and a review of the Stata code used in the examples, allowing readers to test their knowledge and refer back to Stata commands.

The authors guide the reader from basic statistical concepts to more advanced material, tying concepts together to emphasize the overarching ideas. They begin with descriptive statistics, discussing the different variable types and the corresponding graphs and statistics used to examine their distributions and relationships with other variables. Then, they discuss the law of large numbers, theoretical probability distributions, and sampling, preparing the reader to dive into inferential statistics. The authors then present ANOVA, simple and multiple regression, and nonparametric methods. They carefully explain what the values represent in the context of the data and how the methods relate to one another, allowing readers to grasp the meaning behind the analyses.

Weinberg, Abramowitz, and Harel are just as careful when teaching the reader how to implement statistical methods in Stata. First, they introduce the reader to Stata’s interface and the general syntax of Stata’s commands. Then, they explain the importance of do-files for reproducing one’s work and encourage the reader to work alongside the text with the do-files provided on the companion website. Readers can then use these do-files as a starting point when performing analyses on their own data.

The authors have updated the third edition based on Stata 17. An entirely new chapter is devoted to creating, customizing, and exporting tables with the table and collect suite of commands. Additionally, a new chapter is devoted to accessing public-use data. The authors demonstrate how to access, clean, and analyze a publicly available dataset.

Table of contents

Preface

New to the Third Edition
Guiding Principles Underlying Our Approach
Overview of Content Coverage and Intended Audience

Acknowledgments

1 INTRODUCTION

The Role of the Computer in Data Analysis
Statistics: Descriptive and Inferential
Variables and Constants
The Measurement of Variables

Nominal Level
Ordinal Level
Interval Level
Ratio Level
Choosing a Scale of Measurement

Discrete and Continuous Variables
Setting a Context with Real Data
Exercises

2 EXAMINING UNIVARIATE DISTRIBUTIONS

Counting the Occurrence of Data Values
When Variables are Measured at the Nominal Level

Frequency and Percent Distribution Tables
Bar Charts
Pie Charts

When Variables are Measured at the Ordinal, Interval, or Ratio Level

Frequency and Percent Distribution Tables
Stem-and-Leaf Displays
Histograms
Line Graphs

Describing the Shape of a Distribution
Accumulating Data

Cumulative Percent Distributions
Ogive Curves
Percentile Ranks
Percentiles
Five-Number Summaries and Boxplots
Modifying the Appearance of Graphs

Summary of Graphical Selection
Summary of Stata Commands
Exercises

3 MEASURES OF LOCATION, SPREAD, AND SKEWNESS

Characterizing the Location of a Distribution

The Mode
The Median
The Arithmetic Mean
Interpreting the Mean of a Dichotomous Variable
The Weighted Mean
Comparing the Mode, Median, and Mean

Characterizing the Spread of a Distribution

The Range and Interquartile Range
The Variance
The Standard Deviation

Characterizing the Skewness of a Distribution
Selecting Measures of Location and Spread
Applying What We Have Learned
Summary of Stata Commands

Helpful Hints When Using Stata

Exercises

4 RE-EXPRESSING VARIABLES

Linear and Nonlinear Transformations
Linear Transformations: Addition, Subtraction, Multiplication, and Division

The Effect on the Shape of a Distribution
The Effect on Summary Statistics of a Distribution
Common Linear Transformations
Standard Scores
z-Scores

Nonlinear Transformations: Square Roots and Logarithms
Nonlinear Transformations: Ranking Variables
Other Transformations: Recoding and Combining Variables

Recoding Variables
Combining Variables

Data Management Fundamentals: The Do-File
Summary of Stata Commands
Exercises

5 EXPLORING RELATIONSHIPS BETWEEN TWO VARIABLES

When Both Variables are at Least Interval-Leveled

Scatterplots
The Pearson Product Moment Correlation Coefficient
Interpreting the Pearson Correlation Coefficient

When at Least One Variable Is Ordinal and the Other Is at Least Ordinal: The Spearman Rank Correlation Coefficient
When at Least One Variable Is Dichotomous: Other Special Cases of the Pearson Correlation Coefficient

The Point Biserial Correlation Coefficient: The Case of One at Least Interval and One Dichotomous Variable
The Phi Coefficient: The Case of Two Dichotomous Variables

Other Visual Displays of Bivariate Relationships
Selection of Appropriate Statistic or Graph to Summarize a Relationship
Summary of Stata Commands
Exercises

6 SIMPLE LINEAR REGRESSION

The “Best-Fitting” Linear Equation
The Accuracy of Prediction Using the Linear Regression Model
The Standardized Regression Equation
R as a Measure of the Overall Fit of the Linear Regression Model
Simple Linear Regression When the Independent Variable Is Dichotomous
Using r and R as Measures of Effect Size
Emphasizing the Importance of the Scatterplot
Summary of Stata Commands
Exercises

7 PROBABILITY FUNDAMENTALS

The Discrete Case
The Complement Rule of Probability
The Additive Rules of Probability

First Additive Rule of Probability
Second Additive Rule of Probability

The Multiplicative Rule of Probability
The Relationship between Independence and Mutual Exclusivity
Conditional Probability
The Law of Total Probability
Bayes' Theorem
The Law of Large Numbers
Exercises

8 THEORETICAL PROBABILITY MODELS

The Binomial Probability Model and Distribution

The Applicability of the Binomial Probability Model

The Normal Probability Model and Distribution
Using the Normal Distribution to Approximate the Binomial Distribution
Summary of Stata Commands
Exercises

9 THE ROLE OF SAMPLING IN INFERENTIAL STATISTICS

Samples and Populations
Random Samples

Obtaining a Simple Random Sample

Sampling with and without Replacement
Sampling Distributions
Describing the Sampling Distribution of Means Empirically
Describing the Sampling Distribution of Means Theoretically
The Central Limit Theorem
Estimators and Bias
Summary of Stata Commands
Exercises

10 INFERENCES INVOLVING THE MEAN OF A SINGLE POPULATION WHEN σ IS KNOWN

Estimating the Population Mean, μ, When the Population Standard Deviation, σ, Is Known
Interval Estimation
Relating the Length of a Confidence Interval, the Level of Confidence, and the Sample Size
Hypothesis Testing
The Relationship between Hypothesis Testing and Interval Estimation
Effect Size
Type II Error and the Concept of Power

Increasing the Level of Significance, α
Increasing the Effect Size, δ
Decreasing the Standard Error of the Mean, σ_x̄

Closing Remarks
Summary of Stata Commands
Exercises

11 INFERENCES INVOLVING THE MEAN WHEN σ IS NOT KNOWN: ONE- AND TWO-SAMPLE DESIGNS

One-Sample Designs When the Parameter of Interest Is the Mean and σ Is Not Known

The t-Distribution
Degrees of Freedom for the One-Sample t-Test
Violating the Assumption of a Normally Distributed Parent Population in the One-Sample t-Test
Confidence Intervals for the One-Sample t-Test
Hypothesis Tests: The One-Sample t-Test
Effect Size for the One-Sample t-Test

Two-Sample Designs When the Parameter of Interest Is μ, and σ Is Not Known

Independent (or Unrelated) and Dependent (or Related) Samples
Independent Samples t-Test and Confidence Interval
The Assumptions of the Independent Samples t-Test
Effect Size for the Independent Samples t-Test
Paired Samples t-Test and Confidence Interval
The Assumptions of the Paired Samples t-Test
Effect Size for the Paired Samples t-Test

The Bootstrap
Conducting Power Analyses for t-Tests on Means
Summary
Summary of Stata Commands
Exercises

12 RESEARCH DESIGN: INTRODUCTION AND OVERVIEW

Questions and their Link to Descriptive, Relational, and Causal Research Studies

The Need for a Good Measure of Our Construct: Weight
The Descriptive Study
From Descriptive to Relational Studies
From Relational to Causal Studies

The Gold Standard of Causal Studies: The True Experiment and Random Assignment
Comparing Two Kidney Stone Treatments Using a Non-Randomized Controlled Study
Including Blocking in a Research Design
Underscoring the Importance of Having a True Control Group Using Randomization
Analytic Methods for Bolstering Claims of Causality from Observational Data (Optional Reading)
Quasi-Experimental Designs

Threats to the Internal Validity of a Quasi-experimental Design
Threats to the External Validity of a Quasi-experimental Design

Threats to the Validity of a Study: Some Clarifications and Caveats
Threats to the Validity of a Study: Some Examples
Exercises

13 ONE-WAY ANALYSIS OF VARIANCE

The Disadvantage of Multiple t-Tests
The One-Way Analysis of Variance

A Graphical Illustration of the Role of Variance in Tests on Means
ANOVA as an Extension of the Independent Samples t-Test
Developing an Index of Separation for the Analysis of Variance
Carrying out the ANOVA Computation
The Assumptions of the One-Way ANOVA

Testing the Equality of Population Means: The F-Ratio

How to Read the Tables and Use Stata Functions for the F-Distribution

ANOVA Summary Table
Measuring the Effect Size
Post Hoc Multiple Comparison Tests
The Bonferroni Adjustment: Testing Planned Comparisons

The Bonferroni Tests on Multiple Measures

Conducting Power Analyses for One-Way ANOVA
Summary of Stata Commands
Exercises

14 TWO-WAY ANALYSIS OF VARIANCE

The Two-Factor Design
The Concept of Interaction
The Hypotheses That are Tested by a Two-Way Analysis of Variance

Assumptions of the Two-Way Analysis of Variance
Balanced versus Unbalanced Factorial Designs
Partitioning the Total Sum of Squares

Using the F-Ratio to Test the Effects in Two-Way ANOVA
Carrying Out the Two-Way ANOVA Computation by Hand

Decomposing Score Deviations about the Grand Mean
Modeling Each Score as a Sum of Component Parts
Explaining the Interaction As a Joint (or Multiplicative) Effect
Measuring Effect Size

Fixed versus Random Factors
Post Hoc Multiple Comparison Tests

Simple Effects and Pairwise Comparisons

Summary of Steps to Be Taken in a Two-Way ANOVA Procedure
Conducting Power Analyses for Two-Way ANOVA
Summary of Stata Commands
Exercises

15 CORRELATION AND SIMPLE REGRESSION AS INFERENTIAL TECHNIQUES

The Bivariate Normal Distribution
Testing Whether the Population Pearson Product Moment Correlation Equals Zero
Using a Confidence Interval to Estimate the Size of the Population Correlation Coefficient, ρ
Revisiting Simple Linear Regression for Prediction

Estimating the Population Standard Error of Prediction, σ_Y|X
Testing the b Weight for Statistical Significance
Explaining Simple Regression Using an Analysis of Variance Framework
Measuring the Fit of the Overall Regression Equation: Using R and R²
Relating R² to σ²_Y|X
Testing R² for Statistical Significance
Estimating the True Population R²: The Adjusted R²

Exploring the Goodness of Fit of the Regression Equation: Using Regression Diagnostics

Residual Plots: Evaluating the Assumptions Underlying Regression
Detecting Influential Observations: Discrepancy and Leverage
Using Stata to Obtain Leverage
Using Stata to Obtain Discrepancy
Using Stata to Obtain Influence
Using Diagnostics to Evaluate the Ice Cream Sales Example
Using the Prediction Model to Predict Ice Cream Sales

Simple Regression When the Predictor Is Dichotomous
Conducting Power Analyses for Correlation and Simple Regression
Summary of Stata Commands
Exercises

16 AN INTRODUCTION TO MULTIPLE REGRESSION

The Basic Equation with Two Predictors

Equations for b, β, and R_Y.12 When the Predictors Are Not Correlated
Equations for b, β, and R_Y.12 When the Predictors Are Correlated

Summarizing and Expanding on Some Important Principles of Multiple Regression

Testing the b Weights for Statistical Significance
Assessing the Relative Importance of the Independent Variables in the Equation
Measuring the Drop in R² Directly: An Alternative to the Squared Semipartial Correlation
Evaluating the Statistical Significance of the Change in R²
The b Weight As a Partial Slope in Multiple Regression

Multiple Regression When One of the Two Independent Variables Is Dichotomous
Controlling Variables Statistically: A Closer Look

A Hypothetical Example

Conducting Power Analyses for Multiple Regression
Summary of Stata Commands
Exercises

17 TWO-WAY INTERACTIONS IN MULTIPLE REGRESSION

Testing the Statistical Significance of an Interaction Using Stata
Comparing the Y-Hat Values from the Additive and Interaction Models
Centering First-Order Effects if the Equation Has an Interaction
Probing the Nature of a Two-Way Interaction
Interaction When One of the Independent Variables Is Dichotomous and the Other Is Continuous
Methods Useful for Model Selection
Conducting a Power Analysis to Detect an Interaction
Summary of Stata Commands
Exercises

18 NON-PARAMETRIC METHODS

Parametric versus Non-parametric Methods
Non-parametric Methods When the Dependent Variable Is at the Nominal Level
The Chi-Square Distribution (χ²)

The Chi-Square Goodness-of-Fit Test
The Chi-Square Test of Independence
Assumptions of the Chi-Square Test of Independence

Fisher’s Exact Test

Calculating the Fisher Exact Test by Hand Using the Hypergeometric Distribution

Non-parametric Methods When the Dependent Variable Is Ordinal-Leveled

Wilcoxon Sign Test
The Mann–Whitney U-Test or Wilcoxon's Rank Sum Test
The Kruskal–Wallis Analysis of Variance

Summary of Stata Commands
Exercises

19 CUSTOMIZING AND EXPORTING TABLES TO MICROSOFT WORD AND EXCEL USING THE NEW TABLE COMMAND

Introduction
Setting the Working Directory as a First Step
Customizing a One-Way Table and Exporting it to Microsoft Word and Excel
Customizing a Two-Way Table and Exporting it to Microsoft Word and Excel
Customizing a Table of Univariate Summary Statistics and Exporting It to Microsoft Word and Excel
Customizing a Correlation Table and Exporting it to Microsoft Word and Excel

Correlation Table without Significance Levels and p-Values
Correlation Table with Significance Levels and p-Values

Customizing and Exporting Tables of Regression Results

A Single Regression Equation Table
A Comparative Regression Equation Table

Conclusion
Summary of Stata Commands
Exercises

20 ACCESSING DATA FROM PUBLIC-USE SOURCES

Data, Data Everywhere
What Makes for a Good Research Question
Desirable Features of Public-Use Data Sets
Accessing Publicly Available Data Sets
Accessing, Understanding, and Analyzing Data: An Illustrative Example Using a National Household Education Services (NHES) Program Data Set
Positive Features of the NHES Data Set
Analyzing Our NHES Data Using Stata

Preparing Our Data for Analysis
Using Descriptive Statistics to Describe and Explore Our Data
Using Regression to Answer Our Two Research Questions
A Nuanced Interpretation of Results Based on the Significant Interaction Effect

Exercises

Appendix A Data Set Descriptions

Appendix B Stata Do-Files and Data Sets in Stata Format

Appendix C Statistical Tables

Appendix D Solutions

References

Index
