What’s Different about This Book

Working with Data in the “Active Learning Exercises”

Acknowledgments

Notation

Part I INTRODUCTION AND STATISTICS REVIEW

Chapter 1 INTRODUCTION

1.1 Preliminaries

1.2 Example: Is Growth Good for the Poor?

1.3 What’s to Come

ALE 1a: An Econometrics “Time Capsule”

ALE 1b: Investigating the Slope Graphically Using a Scatterplot

ALE 1c: Examining Some Disturbing Variations on Dollar & Kraay’s
Model

ALE 1d: The Pitfalls of Making Scatterplots with Trended Time-Series
Data

Chapter 2 A REVIEW OF PROBABILITY THEORY

2.1 Introduction

2.2 Random Variables

2.3 Discrete Random Variables

2.4 Continuous Random Variables

2.5 Some Initial Results on Expectations

2.6 Some Results on Variances

2.7 A Pair of Random Variables

2.8 The Linearity Property of Expectations

2.9 Statistical Independence

2.10 Normally Distributed Random Variables

2.11 Three Special Properties of Normally Distributed Variables

2.12 Distribution of a Linear Combination of Normally Distributed
Random Variables

2.13 Conclusion

Exercises

ALE 2a: The Normal Distribution

ALE 2b: Central Limit Theorem Simulators on the Web

Appendix 2.1: The Conditional Mean of a Random Variable

Appendix 2.2: Proof of the Linearity Property for the Expectation
of a Weighted Sum of Two Discretely Distributed Random Variables

Chapter 3 ESTIMATING THE MEAN OF A NORMALLY DISTRIBUTED RANDOM VARIABLE

3.1 Introduction

3.2 Estimating μ by Curve Fitting

3.3 The Sampling Distribution of Y-bar

3.4 Consistency — A First Pass

3.5 Unbiasedness and the Optimal Estimator

3.6 The Squared Error Loss Function and the Optimal Estimator

3.7 The Feasible Optimality Properties: Efficiency and BLUness

3.8 Summary

3.9 Conclusions and Lead-in to Next Chapter

Exercises

ALE 3a: Investigating the Consistency of the Sample Mean and Sample
Variance Using Computer-Generated Data

ALE 3b: Estimating Means and Variances Regarding the Standard &
Poor's SP500 Stock Index

Chapter 4 STATISTICAL INFERENCE ON THE MEAN OF A NORMALLY DISTRIBUTED RANDOM VARIABLE

4.1 Introduction

4.2 Standardizing the Distribution of Y-bar

4.3 Confidence Intervals for μ When σ^{2} Is Known

4.4 Hypothesis Testing When σ^{2} Is Known

4.5 Using S^{2} to Estimate σ^{2} (and Introducing the
Chi-Squared Distribution)

4.6 Inference Results on μ When σ^{2} Is Unknown (and Introducing the
Student's t Distribution)

4.7 Application: State-Level U.S. Unemployment Rates

4.8 Introduction to Diagnostic Checking: Testing the Constancy of μ
across the Sample

4.9 Introduction to Diagnostic Checking: Testing the Constancy of σ^{2}
across the Sample

4.10 Some General Comments on Diagnostic Checking

4.11 Closing Comments

Exercises

ALE 4a: Investigating the Sensitivity of Hypothesis Test
*p*-Values to Departures from the NIID (μ, σ^{2}) Assumption
Using Computer-Generated Data

ALE 4b: Individual Income Data from the Panel Study of Income
Dynamics (PSID) — Does Birth-Month Matter?

Part II REGRESSION ANALYSIS

Chapter 5 THE BIVARIATE REGRESSION MODEL: INTRODUCTION, ASSUMPTIONS, AND
PARAMETER ESTIMATES

5.1 Introduction

5.2 The Transition from Mean Estimation to Regression: Analyzing the
Variation of Per Capita Real Output across Countries

5.3 The Bivariate Regression Model — Its Form and the “Fixed in
Repeated Samples” Causality Assumption

5.4 The Assumptions on the Model Error Term, *U*_{i}

5.5 Least Squares Estimation of α and β

5.6 Interpreting the Least Squares Estimates of α and β

5.7 Bivariate Regression with a Dummy Variable: Quantifying the
Impact of College Graduation on Weekly Earnings

Exercises

ALE 5a: Exploring the Penn World Table Data

ALE 5b: Verifying α-hat^{*}_{ols} and β-hat^{*}_{ols} over a Very Small Data Set

ALE 5c: Extracting and Downloading CPS Data from the Census Bureau
Web Site

ALE 5d: Verifying that β-hat^{*}_{ols} on a Dummy Variable Equals the Difference
in the Sample Means

Appendix 5.1: β-hat^{*}_{ols} When *x*_{i} Is a Dummy Variable

Chapter 6 THE BIVARIATE LINEAR REGRESSION MODEL: SAMPLING DISTRIBUTIONS
AND ESTIMATOR PROPERTIES

6.1 Introduction

6.2 Estimates and Estimators

6.3 β-hat as a Linear Estimator and the Least Squares Weights

6.4 The Sampling Distribution of β-hat

6.5 Properties of β-hat: Consistency

6.6 Properties of β-hat: Best Linear Unbiasedness

6.7 Summary

Exercises

ALE 6a: Outliers and Other Perhaps Overly Influential Observations:
Investigating the Sensitivity of β-hat to an Outlier Using
Computer-Generated Data

ALE 6b: Investigating the Consistency of β-hat Using Computer-Generated
Data

Chapter 7 THE BIVARIATE LINEAR REGRESSION MODEL: INFERENCE ON β

7.1 Introduction

7.2 A Statistic for β with a Known Distribution

7.3 A 95% Confidence Interval for β with σ^{2} Given

7.4 Estimates versus Estimators and the Role of the Model Assumptions

7.5 Testing a Hypothesis about β with σ^{2} Given

7.6 Estimating σ^{2}

7.7 Properties of S^{2}

7.8 A Statistic for β Not Involving σ^{2}

7.9 A 95% Confidence Interval for β with σ^{2} Unknown

7.10 Testing a Hypothesis about β with σ^{2} Unknown

7.11 Application: The Impact of College Graduation on Weekly Earnings (Inference Results)

7.12 Application: Is Growth Good for the Poor?

7.13 Summary

Exercises

ALE 7a: Investigating the Sensitivity of Slope Coefficient Inference
to Departures from the *U*_{i}~NIID(0, σ^{2}) Assumption
Using Computer-Generated Data

ALE 7b: Distorted Inference in Time-Series Regressions with Serially
Correlated Model Errors: An Investigation Using Computer-Generated
Data

Appendix 7.1: Proof That S^{2} Is Independent of β-hat

Chapter 8 THE BIVARIATE REGRESSION MODEL: R^{2} AND PREDICTION

8.1 Introduction

8.2 Quantifying How Well the Model Fits the Data

8.3 Prediction as a Tool for Model Validation

8.4 Predicting *Y*_{N+1} given *x*_{N+1}

Exercises

ALE 8a: On the Folly of Trying Too Hard: A Simple Example of
“Data Mining”

Chapter 9 THE MULTIPLE REGRESSION MODEL

9.1 Introduction

9.2 The Multiple Regression Model

9.3 Why the Multiple Regression Model Is Necessary and Important

9.4 Multiple Regression Parameter Estimates via Least Squares Fitting

9.5 Properties and Sampling Distribution of β-hat_{ols,1}...β-hat_{ols,k}

9.6 Overelaborate Multiple Regression Models

9.7 Underelaborate Multiple Regression Models

9.8 Application: The Curious Relationship between Marriage and Death

9.9 Multicollinearity

9.10 Application: The Impact of College Graduation and Gender on
Weekly Earnings

9.11 Application: Vote Fraud in Philadelphia Senatorial Elections

Exercises

ALE 9a: A Statistical Examination of the Florida Voting in the
November 2000 Presidential Election — Did Mistaken Votes for Pat Buchanan Swing the Election from Gore to Bush?

ALE 9b: Observing and Interpreting the Symptoms of Multicollinearity

ALE 9c: The Market Value of a Bathroom in Georgia

Appendix 9.1: Prediction Using the Multiple Regression Model

Chapter 10 DIAGNOSTICALLY CHECKING AND RESPECIFYING THE MULTIPLE REGRESSION
MODEL: DEALING WITH POTENTIAL OUTLIERS AND HETEROSCEDASTICITY IN THE
CROSS-SECTIONAL DATA CASE

10.1 Introduction

10.2 The Fitting Errors as Large-Sample Estimates of the Model
Errors, *U*_{1}...*U*_{N}

10.3 Reasons for Checking the Normality of the Model Errors,
*U*_{1}...*U*_{N}

10.4 Heteroscedasticity and Its Consequences

10.5 Testing for Heteroscedasticity

10.6 Correcting for Heteroscedasticity of Known Form

10.7 Correcting for Heteroscedasticity of Unknown Form

10.8 Application: Is Growth Good for the Poor? Diagnostically
Checking the Dollar/Kraay (2002) Model

Exercises

ALE 10a: The Fitting Errors as Approximations for the Model Errors

ALE 10b: Does Output Per Person Depend on Human Capital? (A Test of
the Augmented Solow Model of Growth)

ALE 10c: Is Trade Good or Bad for the Environment? (First Pass)

Chapter 11 STOCHASTIC REGRESSORS AND ENDOGENEITY

11.1 Introduction

11.2 Unbiasedness of the OLS Slope Estimator with a Stochastic
Regressor Independent of the Model Error

11.3 A Brief Introduction to Asymptotic Theory

11.4 Asymptotic Results for the OLS Slope Estimator with a Stochastic
Regressor

11.5 Endogenous Regressors: Omitted Variables

11.6 Endogenous Regressors: Measurement Error

11.7 Endogenous Regressors: Joint Determination — Introduction to
Simultaneous Equation Macroeconomic and Microeconomic Models

11.8 How Large a Sample Is “Large Enough”? The Simulation Alternative

11.9 An Example: Bootstrapping the Angrist–Krueger (1991) Model

Exercises

ALE 11a: Central Limit Theorem Convergence for β-hat^{OLS} in the Bivariate
Regression Model

ALE 11b: Bootstrap Analysis of the Convergence of the Asymptotic
Sampling Distributions for Multiple Regression Model Parameter
Estimators

Appendix 11.1: The Algebra of Probability Limits

Appendix 11.2: Derivation of the Asymptotic Sampling Distribution
of the OLS Slope Estimator

Chapter 12 INSTRUMENTAL VARIABLES ESTIMATION

12.1 Introduction — Why It Is Challenging to Test for Endogeneity

12.2 Correlation versus Causation — Two Ways to Untie the Knot

12.3 The Instrumental Variables Slope Estimator (and Proof of Its
Consistency) in the Bivariate Regression Model

12.4 Inference Using the Instrumental Variables Slope Estimator

12.5 The Two-Stage Least Squares Estimator for the Overidentified
Case

12.6 Application: The Relationship between Education and Wages
(Angrist and Krueger, 1991)

Exercises

ALE 12a: The Role of Institutions (“Rule of Law”) in Economic Growth

ALE 12b: Is Trade Good or Bad for the Environment? (Completion)

ALE 12c: The Impact of Military Service on the Smoking Behavior
of Veterans

ALE 12d: The Effect of Measurement-Error Contamination on OLS
Regression Estimates and the Durbin/Bartlett IV Estimators

Appendix 12.1: Derivation of the Asymptotic Sampling Distribution of
the Instrumental Variables Slope Estimator

Appendix 12.2: Proof That the 2SLS Composite Instrument Is
Asymptotically Uncorrelated with the Model Error Term

Chapter 13 DIAGNOSTICALLY CHECKING AND RESPECIFYING THE MULTIPLE REGRESSION
MODEL: THE TIME-SERIES DATA CASE (PART A)

13.1 An Introduction to Time-Series Data, with a “Road Map” for This
Chapter

13.2 The Bivariate Time-Series Regression Model with Fixed
Regressors but Serially Correlated Model Errors,
*U*_{1}...*U*_{T}

13.3 Disastrous Parameter Inference with Correlated Model Errors:
Two Cautionary Examples Based on U.S. Consumption Expenditures
Data

13.4 The AR(1) Model for Serial Dependence in a Time-Series

13.5 The Consistency of φ-hat_{1}^{OLS} as an Estimator of φ_{1} in the AR(1) Model
and Its Asymptotic Distribution

13.6 Application of the AR(1) Model to the Errors of the
(Detrended) U.S. Consumption Function — and a Straightforward
Test for Serially Correlated Regression Errors

13.7 Dynamic Model Respecification: An Effective Response to
Serially Correlated Regression Model Errors, with an Application
to the (Detrended) U.S. Consumption Function

Exercises

Appendix 13.1: Derivation of the Asymptotic Sampling Distribution
of φ-hat_{1}^{OLS} in the AR(1) Model

Chapter 14 DIAGNOSTICALLY CHECKING AND RESPECIFYING THE MULTIPLE REGRESSION
MODEL: THE TIME-SERIES DATA CASE (PART B)

14.1 Introduction: Generalizing the Results to Multiple Time-Series

14.2 The Dynamic Multiple Regression Model

14.3 I(1) or “Random Walk” Time-Series

14.4 Capstone Example Part 1: Modeling Monthly U.S. Consumption
Expenditures in Growth Rates

14.5 Capstone Example Part 2: Modeling Monthly U.S. Consumption
Expenditures in Growth Rates and Levels (Cointegrated Model)

14.6 Capstone Example Part 3: Modeling the Level of Monthly U.S.
Consumption Expenditures

14.7 Which Is Better: To Model in Levels or to Model in Changes?

Exercises

ALE 14a: Analyzing the Food Price Sub-Index of the Monthly U.S.
Consumer Price Index

ALE 14b: Estimating Taylor Rules for How the U.S. Fed Sets Interest
Rates

Part III ADDITIONAL TOPICS IN REGRESSION ANALYSIS

Chapter 15 REGRESSION MODELING WITH PANEL DATA (PART A)

15.1 Introduction: A Source of Large (but Likely Heterogeneous)
Data Sets

15.2 Revisiting the Chapter 5 Illustrative Example Using Data from
the Penn World Table

15.3 A Multivariate Empirical Example

15.4 The Fixed Effects and the Between Effects Models

15.5 The Random Effects Model

15.6 Diagnostic Checking of an Estimated Panel Data Model

Exercises

Appendix 15.1: Stata Code for the Generalized Hausman Test

Chapter 16 REGRESSION MODELING WITH PANEL DATA (PART B)

16.1 Relaxing Strict Exogeneity: Dynamics and Lagged Dependent
Variables

16.2 Relaxing Strict Exogeneity: The First-Differences Model

16.3 Summary

Exercises

ALE 16a: Assessing the Impact of 4-H Participation on the
Standardized Test Scores of Florida Schoolchildren

ALE 16b: Using Panel Data Methods to Reanalyze Data from a Public
Goods Experiment

Chapter 17 A CONCISE INTRODUCTION TO TIME-SERIES ANALYSIS AND FORECASTING
(PART A)

17.1 Introduction: The Difference between Time-Series Analysis
and Time-Series Econometrics

17.2 Optimal Forecasts: The Primacy of the Conditional-Mean Forecast
and When It Is Better to Use a Biased Forecast

17.3 The Crucial Assumption (Stationarity) and the Fundamental
Tools: The Time-Plot and the Sample Correlogram

17.4 A Polynomial in the Lag Operator and Its Inverse: The Key to
Understanding and Manipulating Linear Time-Series Models

17.5 Identification/Estimation/Checking/Forecasting of an
Invertible MA(*q*) Model

17.6 Identification/Estimation/Checking/Forecasting of a Stationary
AR(*p*) Model

17.7 ARMA(*p,q*) Models and a Summary of the Box–Jenkins
Modeling Algorithm

Exercises

ALE 17a: Conditional Forecasting Using a Large-Scale
Macroeconometric Model

ALE 17b: Modeling U.S. GNP

Chapter 18 A CONCISE INTRODUCTION TO TIME-SERIES ANALYSIS AND FORECASTING
(PART B)

18.1 Integrated — ARIMA(*p,d,q*) — Models and “Trend like”
Behavior

18.2 A Univariate Application: Modeling the Monthly U.S. Treasury
Bill Rate

18.3 Seasonal Time-Series Data and ARMA Deseasonalization of the
U.S. Total Nonfarm Payroll Time-Series

18.4 Multivariate Time-Series Models

18.5 Post-Sample Model Forecast Evaluation and Testing for
Granger-Causation

18.6 Modeling Nonlinear Serial Dependence in a Time-Series

18.7 Additional Topics in Forecasting

Exercises

ALE 18a: Modeling the South Korean Won–U.S. Dollar Exchange Rate

ALE 18b: Modeling the Daily Returns to Ford Motor Company Stock

Chapter 19 PARAMETER ESTIMATION BEYOND CURVE-FITTING: MLE (WITH AN APPLICATION
TO BINARY-CHOICE MODELS) AND GMM (WITH AN APPLICATION TO IV
REGRESSION)

19.1 Introduction

19.2 Maximum Likelihood Estimation of a Simple Bivariate Regression
Model

19.3 Maximum Likelihood Estimation of Binary-Choice Regression
Models

19.4 Generalized Method of Moments (GMM) Estimation

Exercises

ALE 19a: Probit Modeling of the Determinants of Labor Force
Participation

Appendix 19.1: GMM Estimation of β in the Bivariate Regression
Model (Optimal Penalty-Weights and Sampling Distribution)

Chapter 20 CONCLUDING COMMENTS

20.1 The Goals of This Book

20.2 Diagnostic Checking and Model Respecification

20.3 The Four “Big Mistakes”

Mathematics Review

Index