Statistics Using Stata: An Integrative Approach, Third Edition 

Comment from the Stata technical group

Statistics Using Stata: An Integrative Approach, Third Edition, by Sharon Lawner Weinberg, Sarah Knapp Abramowitz, and Daphna Harel, is an excellent introduction to applied statistics and its implementation in Stata. The authors cover essential topics from exploratory data analysis to multiple regression, interweaving statistical concepts and their application in Stata. Their repeated use of real data throughout the book clearly connects the statistical concepts to real-world applications.

Designed for teaching graduate and undergraduate students from the behavioral, social, and health sciences, this text is accompanied by additional online resources such as PowerPoint slides and Stata do-files. Each chapter concludes with exercises and a review of the Stata code used in the examples, allowing readers to test their knowledge and refer back to Stata commands.

The authors guide the reader from basic statistical concepts to more advanced material, tying concepts together to emphasize the overarching ideas. They begin with descriptive statistics, discussing the different variable types and the corresponding graphs and statistics used to examine their distributions and their relationships with other variables. Then, they discuss the law of large numbers, theoretical probability distributions, and sampling, preparing the reader to dive into inferential statistics. The authors then present ANOVA, simple and multiple regression, and nonparametric methods. They carefully explain what the values represent in the context of the data and how the methods relate to one another, allowing readers to grasp the meaning behind the analyses.

Weinberg, Abramowitz, and Harel are just as careful when teaching the reader how to implement statistical methods in Stata. First, they introduce the reader to Stata's interface and the general syntax of Stata's commands. Then, they explain the importance of do-files for reproducing one's work and encourage the reader to work alongside the text with the do-files provided at the companion website. Readers can then use these do-files as a starting point when performing analyses on their own data.

The authors have updated the third edition based on Stata 17. An entirely new chapter is devoted to creating, customizing, and exporting tables with the table and collect suite of commands. Additionally, a new chapter is devoted to accessing public-use data. The authors demonstrate how to access, clean, and analyze a publicly available dataset.

Table of contents

Preface
New to the Third Edition
Guiding Principles Underlying Our Approach
Overview of Content Coverage and Intended Audience
Acknowledgments

1 INTRODUCTION
The Role of the Computer in Data Analysis
Statistics: Descriptive and Inferential
Variables and Constants
The Measurement of Variables
    Nominal Level
    Ordinal Level
    Interval Level
    Ratio Level
    Choosing a Scale of Measurement
Discrete and Continuous Variables
Setting a Context with Real Data
Exercises

2 EXAMINING UNIVARIATE DISTRIBUTIONS
Counting the Occurrence of Data Values
    When Variables are Measured at the Nominal Level
        Frequency and Percent Distribution Tables
        Bar Charts
        Pie Charts
    When Variables are Measured at the Ordinal, Interval, or Ratio Level
        Frequency and Percent Distribution Tables
        Stem-and-Leaf Displays
        Histograms
        Line Graphs
Describing the Shape of a Distribution
Accumulating Data
    Cumulative Percent Distributions
    Ogive Curves
    Percentile Ranks
    Percentiles
    Five-Number Summaries and Boxplots
    Modifying the Appearance of Graphs
Summary of Graphical Selection
Summary of Stata Commands
Exercises

3 MEASURES OF LOCATION, SPREAD, AND SKEWNESS
Characterizing the Location of a Distribution
    The Mode
    The Median
    The Arithmetic Mean
    Interpreting the Mean of a Dichotomous Variable
    The Weighted Mean
    Comparing the Mode, Median, and Mean
Characterizing the Spread of a Distribution
    The Range and Interquartile Range
    The Variance
    The Standard Deviation
Characterizing the Skewness of a Distribution
Selecting Measures of Location and Spread
Applying What We Have Learned
Summary of Stata Commands
    Helpful Hints When Using Stata
Exercises

4 RE-EXPRESSING VARIABLES
Linear and Nonlinear Transformations
Linear Transformations: Addition, Subtraction, Multiplication, and Division
    The Effect on the Shape of a Distribution
    The Effect on Summary Statistics of a Distribution
    Common Linear Transformations
    Standard Scores
    z-Scores
Nonlinear Transformations: Square Roots and Logarithms
Nonlinear Transformations: Ranking Variables
Other Transformations: Recoding and Combining Variables
    Recoding Variables
    Combining Variables
Data Management Fundamentals: The Do-File
Summary of Stata Commands
Exercises

5 EXPLORING RELATIONSHIPS BETWEEN TWO VARIABLES
When Both Variables are at Least Interval-Leveled
    Scatterplots
    The Pearson Product Moment Correlation Coefficient
    Interpreting the Pearson Correlation Coefficient
When at Least One Variable Is Ordinal and the Other Is at Least Ordinal: The Spearman Rank Correlation Coefficient
When at Least One Variable Is Dichotomous: Other Special Cases of the Pearson Correlation Coefficient
    The Point Biserial Correlation Coefficient: The Case of One at Least Interval and One Dichotomous Variable
    The Phi Coefficient: The Case of Two Dichotomous Variables
Other Visual Displays of Bivariate Relationships
Selection of Appropriate Statistic or Graph to Summarize a Relationship
Summary of Stata Commands
Exercises

6 SIMPLE LINEAR REGRESSION
The “Best-Fitting” Linear Equation
The Accuracy of Prediction Using the Linear Regression Model
The Standardized Regression Equation
R as a Measure of the Overall Fit of the Linear Regression Model
Simple Linear Regression When the Independent Variable Is Dichotomous
Using r and R as Measures of Effect Size
Emphasizing the Importance of the Scatterplot
Summary of Stata Commands
Exercises

7 PROBABILITY FUNDAMENTALS
The Discrete Case
The Complement Rule of Probability
The Additive Rules of Probability
    First Additive Rule of Probability
    Second Additive Rule of Probability
The Multiplicative Rule of Probability
The Relationship between Independence and Mutual Exclusivity
Conditional Probability
The Law of Total Probability
Bayes' Theorem
The Law of Large Numbers
Exercises

8 THEORETICAL PROBABILITY MODELS
The Binomial Probability Model and Distribution
    The Applicability of the Binomial Probability Model
The Normal Probability Model and Distribution
Using the Normal Distribution to Approximate the Binomial Distribution
Summary of Stata Commands
Exercises

9 THE ROLE OF SAMPLING IN INFERENTIAL STATISTICS
Samples and Populations
Random Samples
    Obtaining a Simple Random Sample
Sampling with and without Replacement
Sampling Distributions
Describing the Sampling Distribution of Means Empirically
Describing the Sampling Distribution of Means Theoretically
The Central Limit Theorem
Estimators and Bias
Summary of Stata Commands
Exercises

10 INFERENCES INVOLVING THE MEAN OF A SINGLE POPULATION WHEN σ IS KNOWN
Estimating the Population Mean, μ, When the Population Standard Deviation, σ, Is Known
Interval Estimation
Relating the Length of a Confidence Interval, the Level of Confidence, and the Sample Size
Hypothesis Testing
The Relationship between Hypothesis Testing and Interval Estimation
Effect Size
Type II Error and the Concept of Power
    Increasing the Level of Significance, α
    Increasing the Effect Size, δ
    Decreasing the Standard Error of the Mean, σ_{x̄}
Closing Remarks
Summary of Stata Commands
Exercises

11 INFERENCES INVOLVING THE MEAN WHEN σ IS NOT KNOWN: ONE- AND TWO-SAMPLE DESIGNS
One-Sample Designs When the Parameter of Interest Is the Mean and σ Is Not Known
    The t Distribution
    Degrees of Freedom for the One-Sample t-Test
    Violating the Assumption of a Normally Distributed Parent Population in the One-Sample t-Test
    Confidence Intervals for the One-Sample t-Test
    Hypothesis Tests: The One-Sample t-Test
    Effect Size for the One-Sample t-Test
Two-Sample Designs When the Parameter of Interest Is μ, and σ Is Not Known
    Independent (or Unrelated) and Dependent (or Related) Samples
    Independent Samples t-Test and Confidence Interval
    The Assumptions of the Independent Samples t-Test
    Effect Size for the Independent Samples t-Test
    Paired Samples t-Test and Confidence Interval
    The Assumptions of the Paired Samples t-Test
    Effect Size for the Paired Samples t-Test
The Bootstrap
Conducting Power Analyses for t-Tests on Means
Summary
Summary of Stata Commands
Exercises

12 RESEARCH DESIGN: INTRODUCTION AND OVERVIEW
Questions and their Link to Descriptive, Relational, and Causal Research Studies
    The Need for a Good Measure of Our Construct: Weight
    The Descriptive Study
    From Descriptive to Relational Studies
    From Relational to Causal Studies
The Gold Standard of Causal Studies: The True Experiment and Random Assignment
Comparing Two Kidney Stone Treatments Using a Non-Randomized Controlled Study
Including Blocking in a Research Design
Underscoring the Importance of Having a True Control Group Using Randomization
Analytic Methods for Bolstering Claims of Causality from Observational Data (Optional Reading)
Quasi-Experimental Designs
    Threats to the Internal Validity of a Quasi-experimental Design
    Threats to the External Validity of a Quasi-experimental Design
Threats to the Validity of a Study: Some Clarifications and Caveats
Threats to the Validity of a Study: Some Examples
Exercises

13 ONE-WAY ANALYSIS OF VARIANCE
The Disadvantage of Multiple t-Tests
The One-Way Analysis of Variance
    A Graphical Illustration of the Role of Variance in Tests on Means
    ANOVA as an Extension of the Independent Samples t-Test
    Developing an Index of Separation for the Analysis of Variance
    Carrying out the ANOVA Computation
    The Assumptions of the One-Way ANOVA
Testing the Equality of Population Means: The F-Ratio
    How to Read the Tables and Use Stata Functions for the F-Distribution
ANOVA Summary Table
Measuring the Effect Size
Post Hoc Multiple Comparison Tests
The Bonferroni Adjustment: Testing Planned Comparisons
    The Bonferroni Tests on Multiple Measures
Conducting Power Analyses for One-Way ANOVA
Summary of Stata Commands
Exercises

14 TWO-WAY ANALYSIS OF VARIANCE
The Two-Factor Design
The Concept of Interaction
The Hypotheses That are Tested by a Two-Way Analysis of Variance
    Assumptions of the Two-Way Analysis of Variance
    Balanced versus Unbalanced Factorial Designs
    Partitioning the Total Sum of Squares
Using the F-Ratio to Test the Effects in Two-Way ANOVA
Carrying Out the Two-Way ANOVA Computation by Hand
    Decomposing Score Deviations about the Grand Mean
    Modeling Each Score as a Sum of Component Parts
    Explaining the Interaction As a Joint (or Multiplicative) Effect
    Measuring Effect Size
Fixed versus Random Factors
Post Hoc Multiple Comparison Tests
    Simple Effects and Pairwise Comparisons
Summary of Steps to Be Taken in a Two-Way ANOVA Procedure
Conducting Power Analyses for Two-Way ANOVA
Summary of Stata Commands
Exercises

15 CORRELATION AND SIMPLE REGRESSION AS INFERENTIAL TECHNIQUES
The Bivariate Normal Distribution
Testing Whether the Population Pearson Product Moment Correlation Equals Zero
Using a Confidence Interval to Estimate the Size of the Population Correlation Coefficient, ρ
Revisiting Simple Linear Regression for Prediction
    Estimating the Population Standard Error of Prediction, σ_{YX}
    Testing the b Weight for Statistical Significance
    Explaining Simple Regression Using an Analysis of Variance Framework
    Measuring the Fit of the Overall Regression Equation: Using R and R^{2}
    Relating R^{2} to σ^{2}_{YX}
    Testing R^{2} for Statistical Significance
    Estimating the True Population R^{2}: The Adjusted R^{2}
Exploring the Goodness of Fit of the Regression Equation: Using Regression Diagnostics
    Residual Plots: Evaluating the Assumptions Underlying Regression
    Detecting Influential Observations: Discrepancy and Leverage
    Using Stata to Obtain Leverage
    Using Stata to Obtain Discrepancy
    Using Stata to Obtain Influence
    Using Diagnostics to Evaluate the Ice Cream Sales Example
    Using the Prediction Model to Predict Ice Cream Sales
Simple Regression When the Predictor Is Dichotomous
Conducting Power Analyses for Correlation and Simple Regression
Summary of Stata Commands
Exercises

16 AN INTRODUCTION TO MULTIPLE REGRESSION
The Basic Equation with Two Predictors
    Equations for b, β, and R_{Y.12} When the Predictors Are Not Correlated
    Equations for b, β, and R_{Y.12} When the Predictors Are Correlated
Summarizing and Expanding on Some Important Principles of Multiple Regression
    Testing the b Weights for Statistical Significance
    Assessing the Relative Importance of the Independent Variables in the Equation
    Measuring the Drop in R^{2} Directly: An Alternative to the Squared Semipartial Correlation
    Evaluating the Statistical Significance of the Change in R^{2}
    The b Weight As a Partial Slope in Multiple Regression
Multiple Regression When One of the Two Independent Variables Is Dichotomous
Controlling Variables Statistically: A Closer Look
    A Hypothetical Example
Conducting Power Analyses for Multiple Regression
Summary of Stata Commands
Exercises

17 TWO-WAY INTERACTIONS IN MULTIPLE REGRESSION
Testing the Statistical Significance of an Interaction Using Stata
Comparing the Y-Hat Values from the Additive and Interaction Models
Centering First-Order Effects if the Equation Has an Interaction
Probing the Nature of a Two-Way Interaction
Interaction When One of the Independent Variables Is Dichotomous and the Other Is Continuous
Methods Useful for Model Selection
Conducting a Power Analysis to Detect an Interaction
Summary of Stata Commands
Exercises

18 NONPARAMETRIC METHODS
Parametric versus Nonparametric Methods
Nonparametric Methods When the Dependent Variable Is at the Nominal Level
The Chi-Square Distribution (χ^{2})
    The Chi-Square Goodness-of-Fit Test
    The Chi-Square Test of Independence
    Assumptions of the Chi-Square Test of Independence
Fisher's Exact Test
    Calculating the Fisher Exact Test by Hand Using the Hypergeometric Distribution
Nonparametric Methods When the Dependent Variable Is Ordinal-Leveled
    Wilcoxon Sign Test
    The Mann–Whitney U-Test or Wilcoxon's Rank Sum Test
    The Kruskal–Wallis Analysis of Variance
Summary of Stata Commands
Exercises

19 CUSTOMIZING AND EXPORTING TABLES TO MICROSOFT WORD AND EXCEL USING THE NEW TABLE COMMAND
Introduction
Setting the Working Directory as a First Step
Customizing a One-Way Table and Exporting it to Microsoft Word and Excel
Customizing a Two-Way Table and Exporting it to Microsoft Word and Excel
Customizing a Table of Univariate Summary Statistics and Exporting It to Microsoft Word and Excel
Customizing a Correlation Table and Exporting it to Microsoft Word and Excel
    Correlation Table without Significance Levels and p-Values
    Correlation Table with Significance Levels and p-Values
Customizing and Exporting Tables of Regression Results
    A Single Regression Equation Table
    A Comparative Regression Equation Table
Conclusion
Summary of Stata Commands
Exercises

20 ACCESSING DATA FROM PUBLIC-USE SOURCES
Data, Data Everywhere
What Makes for a Good Research Question
Desirable Features of Public-Use Data Sets
Accessing Publicly Available Data Sets
Accessing, Understanding, and Analyzing Data: An Illustrative Example Using a National Household Education Services (NHES) Program Data Set
Positive Features of the NHES Data Set
Analyzing Our NHES Data Using Stata
    Preparing Our Data for Analysis
    Using Descriptive Statistics to Describe and Explore Our Data
    Using Regression to Answer Our Two Research Questions
    A Nuanced Interpretation of Results Based on the Significant Interaction Effect
Exercises

Appendix A Data Set Descriptions
Appendix B Stata .Do-files and Data Sets in Stata Format
Appendix C Statistical Tables
Appendix D Solutions
References
Index

