2.1 An Economic Model

2.2 An Econometric Model

2.2.1 Data Generating Process

2.2.2 The Random Error and Strict Exogeneity

2.2.3 The Regression Function

2.2.4 Random Error Variation

2.2.5 Variation in *x*

2.2.6 Error Normality

2.2.7 Generalizing the Exogeneity Assumption

2.2.8 Error Correlation

2.2.9 Summarizing the Assumptions

2.3 Estimating the Regression Parameters

2.3.1 The Least Squares Principle

2.3.2 Other Economic Models

2.4 Assessing the Least Squares Estimators

2.4.1 The Estimator *b*_{2}

2.4.2 The Expected Values of *b*_{1} and *b*_{2}

2.4.3 Sampling Variation

2.4.4 The Variances and Covariance of *b*_{1} and *b*_{2}

2.5 The Gauss–Markov Theorem

2.6 The Probability Distributions of the Least Squares Estimators

2.7 Estimating the Variance of the Error Term

2.7.1 Estimating the Variances and Covariance of the Least Squares Estimators

2.7.2 Interpreting the Standard Errors

2.8 Estimating Nonlinear Relationships

2.8.1 Quadratic Functions

2.8.2 Using a Quadratic Model

2.8.3 A Log-Linear Function

2.8.4 Using a Log-Linear Model

2.8.5 Choosing a Functional Form

2.9 Regression with Indicator Variables

2.10 The Independent Variable

2.10.1 Random and Independent *x*

2.10.2 Random and Strictly Exogenous *x*

2.10.3 Random Sampling

2.11 Exercises

2.11.1 Problems

2.11.2 Computer Exercises

Appendix 2A Derivation of the Least Squares Estimates

Appendix 2B Deviation from the Mean Form of *b*_{2}

Appendix 2C *b*_{2} Is a Linear Estimator

Appendix 2D Derivation of Theoretical Expression for *b*_{2}

Appendix 2E Deriving the Conditional Variance of *b*_{2}
Appendix 2F Proof of the Gauss–Markov Theorem

Appendix 2G Proofs of Results Introduced in Section 2.10

2G.1 The Implications of Strict Exogeneity

2G.2 The Random and Independent *x* Case

2G.3 The Random and Strictly Exogenous *x* Case

2G.4 Random Sampling

Appendix 2H Monte Carlo Simulation

2H.1 The Regression Function

2H.2 The Random Error

2H.3 Theoretically True Values

2H.4 Creating a Sample of Data

2H.5 Monte Carlo Objectives

2H.6 Monte Carlo Results

2H.7 Random-*x* Monte Carlo Results

5.1 Introduction

5.1.1 The Economic Model

5.1.2 The Econometric Model

5.1.3 The General Model

5.1.4 Assumptions of the Multiple Regression Model

5.2 Estimating the Parameters of the Multiple Regression Model

5.2.1 Least Squares Estimation Procedure

5.2.2 Estimating the Error Variance σ^{2}

5.2.3 Measuring Goodness-of-Fit

5.2.4 Frisch–Waugh–Lovell (FWL) Theorem

5.3 Finite Sample Properties of the Least Squares Estimator

5.3.1 The Variances and Covariances of the Least Squares Estimators

5.3.2 The Distribution of the Least Squares Estimators

5.4 Interval Estimation

5.4.1 Interval Estimation for a Single Coefficient

5.4.2 Interval Estimation for a Linear Combination of Coefficients

5.5 Hypothesis Testing

5.5.1 Testing the Significance of a Single Coefficient

5.5.2 One-Tail Hypothesis Testing for a Single Coefficient

5.5.3 Hypothesis Testing for a Linear Combination of Coefficients

5.6 Nonlinear Relationships

5.7 Large Sample Properties of the Least Squares Estimator

5.7.1 Consistency

5.7.2 Asymptotic Normality

5.7.3 Relaxing Assumptions

5.7.4 Inference for a Nonlinear Function of Coefficients

5.8 Exercises

5.8.1 Problems

5.8.2 Computer Exercises

Appendix 5A Derivation of Least Squares Estimators

Appendix 5B The Delta Method

5B.1 Nonlinear Function of a Single Parameter

5B.2 Nonlinear Function of Two Parameters

Appendix 5C Monte Carlo Simulation

5C.1 Least Squares Estimation with Chi-Square Errors

5C.2 Monte Carlo Simulation of the Delta Method

Appendix 5D Bootstrapping

5D.1 Resampling

5D.2 Bootstrap Bias Estimate

5D.3 Bootstrap Standard Error

5D.4 Bootstrap Percentile Interval Estimate

5D.5 Asymptotic Refinement

7.1 Indicator Variables

7.1.1 Intercept Indicator Variables

7.1.2 Slope-Indicator Variables

7.2 Applying Indicator Variables

7.2.1 Interactions Between Qualitative Factors

7.2.2 Qualitative Factors with Several Categories

7.2.3 Testing the Equivalence of Two Regressions

7.2.4 Controlling for Time

7.3 Log-Linear Models

7.3.1 A Rough Calculation

7.3.2 An Exact Calculation

7.4 The Linear Probability Model

7.5 Treatment Effects

7.5.1 The Difference Estimator

7.5.2 Analysis of the Difference Estimator

7.5.3 The Differences-in-Differences Estimator

7.6 Treatment Effects and Causal Modeling

7.6.1 The Nature of Causal Effects

7.6.2 Treatment Effect Models

7.6.3 Decomposing the Treatment Effect

7.6.4 Introducing Control Variables

7.6.5 The Overlap Assumption

7.6.6 Regression Discontinuity Designs

7.7 Exercises

7.7.1 Problems

7.7.2 Computer Exercises

Appendix 7A Details of Log-Linear Model Interpretation

Appendix 7B Derivation of the Differences-in-Differences Estimator

Appendix 7C The Overlap Assumption: Details

8 Heteroskedasticity

8.1 The Nature of Heteroskedasticity

8.2 Heteroskedasticity in the Multiple Regression Model

8.2.1 The Heteroskedastic Regression Model

8.2.2 Heteroskedasticity Consequences for the OLS Estimator

8.3 Heteroskedasticity Robust Variance Estimator

8.4 Generalized Least Squares: Known Form of Variance

8.4.1 Transforming the Model: Proportional Heteroskedasticity

8.4.2 Weighted Least Squares: Proportional Heteroskedasticity

8.5 Generalized Least Squares: Unknown Form of Variance

8.5.1 Estimating the Multiplicative Model

8.6 Detecting Heteroskedasticity

8.6.1 Residual Plots

8.6.2 The Goldfeld–Quandt Test

8.6.3 A General Test for Conditional Heteroskedasticity

8.6.4 The White Test

8.6.5 Model Specification and Heteroskedasticity

8.7 Heteroskedasticity in the Linear Probability Model

8.8 Exercises

8.8.1 Problems

8.8.2 Computer Exercises

Appendix 8A Properties of the Least Squares Estimator

Appendix 8B Lagrange Multiplier Tests for Heteroskedasticity

Appendix 8C Properties of the Least Squares Residuals

8C.1 Details of Multiplicative Heteroskedasticity Model

Appendix 8D Alternative Robust Sandwich Estimators

Appendix 8E Monte Carlo Evidence: OLS, GLS, and FGLS

9 Regression with Time-Series Data: Stationary Variables

9.1 Introduction

9.1.1 Modeling Dynamic Relationships

9.1.2 Autocorrelations

9.2 Stationarity and Weak Dependence

9.3 Forecasting

9.3.1 Forecast Intervals and Standard Errors

9.3.2 Assumptions for Forecasting

9.3.3 Selecting Lag Lengths

9.3.4 Testing for Granger Causality

9.4 Testing for Serially Correlated Errors

9.4.1 Checking the Correlogram of the Least Squares Residuals

9.4.2 Lagrange Multiplier Test

9.4.3 Durbin–Watson Test

9.5 Time-Series Regressions for Policy Analysis

9.5.1 Finite Distributed Lags

9.5.2 HAC Standard Errors

9.5.3 Estimation with AR(1) Errors

9.5.4 Infinite Distributed Lags

9.6 Exercises

9.6.1 Problems

9.6.2 Computer Exercises

Appendix 9A The Durbin–Watson Test

9A.1 The Durbin–Watson Bounds Test

Appendix 9B Properties of an AR(1) Error

10 Endogenous Regressors and Moment-Based Estimation

10.1 Least Squares Estimation with Endogenous Regressors

10.1.1 Large Sample Properties of the OLS Estimator

10.1.2 Why Least Squares Estimation Fails

10.1.3 Proving the Inconsistency of OLS

10.2 Cases in Which *x* and *e* Are Contemporaneously Correlated

10.2.1 Measurement Error

10.2.2 Simultaneous Equations Bias

10.2.3 Lagged-Dependent Variable Models with Serial Correlation

10.2.4 Omitted Variables

10.3 Estimators Based on the Method of Moments

10.3.1 Method of Moments Estimation of a Population Mean and Variance

10.3.2 Method of Moments Estimation in the Simple Regression Model

10.3.3 Instrumental Variables Estimation in the Simple Regression Model

10.3.4 The Importance of Using Strong Instruments

10.3.5 Proving the Consistency of the IV Estimator

10.3.6 IV Estimation Using Two-Stage Least Squares (2SLS)

10.3.7 Using Surplus Moment Conditions

10.3.8 Instrumental Variables Estimation in the Multiple Regression Model

10.3.9 Assessing Instrument Strength Using the First-Stage Model

10.3.10 Instrumental Variables Estimation in a General Model

10.3.11 Additional Issues When Using IV Estimation

10.4 Specification Tests

10.4.1 The Hausman Test for Endogeneity

10.4.2 The Logic of the Hausman Test

10.4.3 Testing Instrument Validity

10.5 Exercises

10.5.1 Problems

10.5.2 Computer Exercises

Appendix 10A Testing for Weak Instruments

10A.1 A Test for Weak Identification

10A.2 Testing for Weak Identification: Conclusions

Appendix 10B Monte Carlo Simulation

10B.1 Illustrations Using Simulated Data

10B.2 The Sampling Properties of IV/2SLS

11 Simultaneous Equations Models

11.1 A Supply and Demand Model

11.2 The Reduced-Form Equations

11.3 The Failure of Least Squares Estimation

11.3.1 Proving the Failure of OLS

11.4 The Identification Problem

11.5 Two-Stage Least Squares Estimation

11.5.1 The General Two-Stage Least Squares Estimation Procedure

11.5.2 The Properties of the Two-Stage Least Squares Estimator

11.6 Exercises

11.6.1 Problems

11.6.2 Computer Exercises

Appendix 11A 2SLS Alternatives

11A.1 The *k*-Class of Estimators

11A.2 The LIML Estimator

11A.3 Monte Carlo Simulation Results

12 Regression with Time-Series Data: Nonstationary Variables

12.1 Stationary and Nonstationary Variables

12.1.1 Trend Stationary Variables

12.1.2 The First-Order Autoregressive Model

12.1.3 Random Walk Models

12.2 Consequences of Stochastic Trends

12.3 Unit Root Tests for Stationarity

12.3.1 Unit Roots

12.3.2 Dickey–Fuller Tests

12.3.3 Dickey–Fuller Test with Intercept and No Trend

12.3.4 Dickey–Fuller Test with Intercept and Trend

12.3.5 Dickey–Fuller Test with No Intercept and No Trend

12.3.6 Order of Integration

12.3.7 Other Unit Root Tests

12.4 Cointegration

12.4.1 The Error Correction Model

12.5 Regression When There Is No Cointegration

12.6 Summary

12.7 Exercises

12.7.1 Problems

12.7.2 Computer Exercises

13 Vector Error Correction and Vector Autoregressive Models

13.1 VEC and VAR Models

13.2 Estimating a Vector Error Correction Model

13.3 Estimating a VAR Model

13.4 Impulse Responses and Variance Decompositions

13.4.1 Impulse Response Functions

13.4.2 Forecast Error Variance Decompositions

13.5 Exercises

13.5.1 Problems

13.5.2 Computer Exercises

Appendix 13A The Identification Problem

14 Time-Varying Volatility and ARCH Models

14.1 The ARCH Model

14.2 Time-Varying Volatility

14.3 Testing, Estimating, and Forecasting

14.4 Extensions

14.4.1 The GARCH Model—Generalized ARCH

14.4.2 Allowing for an Asymmetric Effect

14.4.3 GARCH-in-Mean and Time-Varying Risk Premium

14.4.4 Other Developments

14.5 Exercises

14.5.1 Problems

14.5.2 Computer Exercises

15 Panel Data Models

15.1 The Panel Data Regression Function

15.1.1 Further Discussion of Unobserved Heterogeneity

15.1.2 The Panel Data Regression Exogeneity Assumption

15.1.3 Using OLS to Estimate the Panel Data Regression

15.2 The Fixed Effects Estimator

15.2.1 The Difference Estimator: *T* = 2

15.2.2 The Within Estimator: *T* = 2

15.2.3 The Within Estimator: *T* > 2

15.2.4 The Least Squares Dummy Variable Model

15.3 Panel Data Regression Error Assumptions

15.3.1 OLS Estimation with Cluster-Robust Standard Errors

15.3.2 Fixed Effects Estimation with Cluster-Robust Standard Errors

15.4 The Random Effects Estimator

15.4.1 Testing for Random Effects

15.4.2 A Hausman Test for Endogeneity in the Random Effects Model

15.4.3 A Regression-Based Hausman Test

15.4.4 The Hausman–Taylor Estimator

15.4.5 Summarizing Panel Data Assumptions

15.4.6 Summarizing and Extending Panel Data Model Estimation

15.5 Exercises

15.5.1 Problems

15.5.2 Computer Exercises

Appendix 15A Cluster-Robust Standard Errors: Some Details

Appendix 15B Estimation of Error Components

16 Qualitative and Limited Dependent Variable Models

16.1 Introducing Models with Binary Dependent Variables

16.1.1 The Linear Probability Model

16.2 Modeling Binary Choices

16.2.1 The Probit Model for Binary Choice

16.2.2 Interpreting the Probit Model

16.2.3 Maximum Likelihood Estimation of the Probit Model

16.2.4 The Logit Model for Binary Choices

16.2.5 Wald Hypothesis Tests

16.2.6 Likelihood Ratio Hypothesis Tests

16.2.7 Robust Inference in Probit and Logit Models

16.2.8 Binary Choice Models with a Continuous Endogenous Variable

16.2.9 Binary Choice Models with a Binary Endogenous Variable

16.2.10 Binary Endogenous Explanatory Variables

16.2.11 Binary Choice Models and Panel Data

16.3 Multinomial Logit

16.3.1 Multinomial Logit Choice Probabilities

16.3.2 Maximum Likelihood Estimation

16.3.3 Multinomial Logit Postestimation Analysis

16.4 Conditional Logit

16.4.1 Conditional Logit Choice Probabilities

16.4.2 Conditional Logit Postestimation Analysis

16.5 Ordered Choice Models

16.5.1 Ordinal Probit Choice Probabilities

16.5.2 Ordered Probit Estimation and Interpretation

16.6 Models for Count Data

16.6.1 Maximum Likelihood Estimation of the Poisson Regression Model

16.6.2 Interpreting the Poisson Regression Model

16.7 Limited Dependent Variables

16.7.1 Maximum Likelihood Estimation of the Simple Linear Regression Model

16.7.2 Truncated Regression

16.7.3 Censored Samples and Regression

16.7.4 Tobit Model Interpretation

16.7.5 Sample Selection

16.8 Exercises

16.8.1 Problems

16.8.2 Computer Exercises

Appendix 16A Probit Marginal Effects: Details

16A.1 Standard Error of Marginal Effect at a Given Point

16A.2 Standard Error of Average Marginal Effect

Appendix 16B Random Utility Models

16B.1 Binary Choice Model

16B.2 Probit or Logit?

Appendix 16C Using Latent Variables

16C.1 Tobit (Tobit Type I)

16C.2 Heckit (Tobit Type II)

Appendix 16D A Tobit Monte Carlo Experiment

Appendix A Mathematical Tools

A.1 Some Basics

A.1.1 Numbers

A.1.2 Exponents

A.1.3 Scientific Notation

A.1.4 Logarithms and the Number *e*

A.1.5 Decimals and Percentages

A.1.6 Logarithms and Percentages

A.2 Linear Relationships

A.2.1 Slopes and Derivatives

A.2.2 Elasticity

A.3 Nonlinear Relationships

A.3.1 Rules for Derivatives

A.3.2 Elasticity of a Nonlinear Relationship

A.3.3 Second Derivatives

A.3.4 Maxima and Minima

A.3.5 Partial Derivatives

A.3.6 Maxima and Minima of Bivariate Functions

A.4 Integrals

A.4.1 Computing the Area Under a Curve

A.5 Exercises

Appendix B Probability Concepts

B.1 Discrete Random Variables

B.1.1 Expected Value of a Discrete Random Variable

B.1.2 Variance of a Discrete Random Variable

B.1.3 Joint, Marginal, and Conditional Distributions

B.1.4 Expectations Involving Several Random Variables

B.1.5 Covariance and Correlation

B.1.6 Conditional Expectations

B.1.7 Iterated Expectations

B.1.8 Variance Decomposition

B.1.9 Covariance Decomposition

B.2 Working with Continuous Random Variables

B.2.1 Probability Calculations

B.2.2 Properties of Continuous Random Variables

B.2.3 Joint, Marginal, and Conditional Probability Distributions

B.2.4 Using Iterated Expectations with Continuous Random Variables

B.2.5 Distributions of Functions of Random Variables

B.2.6 Truncated Random Variables

B.3 Some Important Probability Distributions

B.3.1 The Bernoulli Distribution

B.3.2 The Binomial Distribution

B.3.3 The Poisson Distribution

B.3.4 The Uniform Distribution

B.3.5 The Normal Distribution

B.3.6 The Chi-Square Distribution

B.3.7 The *t*-distribution

B.3.8 The *F*-distribution

B.3.9 The Log-Normal Distribution

B.4 Random Numbers

B.4.1 Uniform Random Numbers

B.5 Exercises

Appendix C Review of Statistical Inference

C.1 A Sample of Data

C.2 An Econometric Model

C.3 Estimating the Mean of a Population

C.3.1 The Expected Value of *Ȳ*

C.3.2 The Variance of *Ȳ*

C.3.3 The Sampling Distribution of *Ȳ*

C.3.4 The Central Limit Theorem

C.3.5 Best Linear Unbiased Estimation

C.4 Estimating the Population Variance and Other Moments

C.4.1 Estimating the Population Variance

C.4.2 Estimating Higher Moments

C.5 Interval Estimation

C.5.1 Interval Estimation: σ^{2} Known

C.5.2 Interval Estimation: σ^{2} Unknown

C.6 Hypothesis Tests About a Population Mean

C.6.1 Components of Hypothesis Tests

C.6.2 One-Tail Tests with Alternative “Greater Than” (>)

C.6.3 One-Tail Tests with Alternative “Less Than” (<)

C.6.4 Two-Tail Tests with Alternative “Not Equal To” (≠)

C.6.5 The *p*-Value

C.6.6 A Comment on Stating Null and Alternative Hypotheses

C.6.7 Type I and Type II Errors

C.6.8 A Relationship Between Hypothesis Testing and Confidence Intervals

C.7 Some Other Useful Tests

C.7.1 Testing the Population Variance

C.7.2 Testing the Equality of Two Population Means

C.7.3 Testing the Ratio of Two Population Variances

C.7.4 Testing the Normality of a Population

C.8 Introduction to Maximum Likelihood Estimation

C.8.1 Inference with Maximum Likelihood Estimators

C.8.2 The Variance of the Maximum Likelihood Estimator

C.8.3 The Distribution of the Sample Proportion

C.8.4 Asymptotic Test Procedures

C.9 Algebraic Supplements

C.9.1 Derivation of Least Squares Estimator

C.9.2 Best Linear Unbiased Estimation

C.10 Kernel Density Estimator

C.11 Exercises

C.11.1 Problems

C.11.2 Computer Exercises

Appendix D Statistical Tables

Table D.1 Cumulative Probabilities for the Standard Normal Distribution Φ(*z*) = *P(Z ≤ z)*

Table D.2 Percentiles of the *t*-distribution

Table D.3 Percentiles of the Chi-Square Distribution

Table D.4 95th Percentile for the *F*-distribution

Table D.5 99th Percentile for the *F*-distribution

Table D.6 Standard Normal pdf Values φ(*z*)

Index