List of Tables

List of Figures

1 “The first time”

1.1 Starting Stata

1.2 Setting up your screen

1.3 Your first analysis

1.3.1 Inputting commands

1.3.2 Files and the working memory

1.3.3 Loading data

1.3.4 Variables and observations

1.3.5 Looking at data

1.3.6 Interrupting a command and repeating a command

1.3.7 The variable list

1.3.8 The in qualifier

1.3.9 Summary statistics

1.3.10 The if qualifier

1.3.11 Define missing values

1.3.12 The by prefix

1.3.13 Command options

1.3.14 Frequency tables

1.3.15 Variable labels and value labels

1.3.16 Graphs

1.3.17 Getting help

1.3.18 Recoding of variables

1.3.19 Linear regression

1.4 Do-files

1.5 Exiting Stata

1.6 Exercises

2 Working with do-files

2.1 From interactive work to working with a do-file

2.1.1 Alternative 1

2.1.2 Alternative 2

2.2 Designing do-files

2.2.1 Comments

2.2.2 Line breaks

2.2.3 Some crucial commands

2.3 Organizing your work

2.4 Exercises

3 The grammar of Stata

3.1 The elements of Stata commands

3.1.1 Stata commands

3.1.2 The variable list

List of variables: Required or optional

Abbreviation rules

Special listings

3.1.3 Options

3.1.4 The in qualifier

3.1.5 The if qualifier

3.1.6 Expressions

Operators

Functions

3.1.7 Lists of numbers

3.1.8 Using filenames

3.2 Repeating similar commands

3.2.1 The by prefix

3.2.2 The foreach loop

The types of foreach lists

Several commands within a foreach loop

3.2.3 The forvalues loop

3.3 Weights

Frequency weights

Analytic weights

Probability weights

3.4 Exercises

4 General comments on the statistical commands

4.1 Exercises

5 Creating and changing variables

5.1 The commands generate and replace

5.1.1 Variable names

5.1.2 Some examples

5.1.3 Changing codes with by, _n, and _N

5.1.4 Subscripts

5.2 Specialized recoding commands

5.2.1 The recode command

5.2.2 The egen command

5.3 More tools for recoding data

5.3.1 String functions

5.3.2 Date and time functions

Dates

Time

5.4 Commands for dealing with missing values

5.5 Labels

5.6 Storage types, or the ghost in the machine

5.7 Exercises

6 Creating and changing graphs

6.1 A primer on graph syntax

6.2 Graph types

6.2.1 Examples

6.2.2 Specialized graphs

6.3 Graph elements

6.3.1 Appearance of data

Choice of marker

Marker colors

Marker size

Lines

6.3.2 Graph and plot regions

Graph size

Plot region

Scaling the axes

6.3.3 Information inside the plot region

Reference lines

Labeling inside the plot region

6.3.4 Information outside the plot region

Labeling the axes

Tick lines

Axis titles

The legend

Graph titles

6.4 Multiple graphs

6.4.1 Overlaying many twoway graphs

6.4.2 Option by()

6.4.3 Combining graphs

6.5 Saving and printing graphs

6.6 Exercises

7 Describing and comparing distributions

7.1 Categories: Few or many?

7.2 Variables with few categories

7.2.1 Tables

Frequency tables

More than one frequency table

Comparing distributions

Summary statistics

More than one contingency table

7.2.2 Graphs

Histograms

Bar charts

Pie charts

Dot charts

7.3 Variables with many categories

7.3.1 Frequencies of grouped data

Some remarks on grouping data

Special techniques for grouping data

7.3.2 Describing data using statistics

Important summary statistics

The summarize command

The tabstat command

Comparing distributions using statistics

7.3.3 Graphs

Box plots

Histograms

Kernel density estimation

Quantile plot

Comparing distributions with Q–Q plots

7.4 Exercises

8 Introduction to linear regression

8.1 Simple linear regression

8.1.1 The basic principle

8.1.2 Linear regression using Stata

The table of coefficients

Standard errors

The table of ANOVA results

The model fit table

8.2 Multiple regression

8.2.1 Multiple regression using Stata

8.2.2 More computations

Adjusted R^{2}

Standardized regression coefficients

8.2.3 What does "under control" mean?

8.3 Regression diagnostics

8.3.1 Violation of E(ε_{i}) = 0

Linearity

Influential cases

Omitted variables

Multicollinearity

8.3.2 Violation of Var(ε_{i}) = σ^{2}

8.3.3 Violation of Cov(ε_{i}, ε_{j}) = 0, i ≠ j

8.4 Model extensions

8.4.1 Categorical independent variables

8.4.2 Interaction terms

8.4.3 Regression models using transformed variables

Nonlinear relations

Eliminating heteroskedasticity

8.5 More on standard errors

8.5.1 Bootstrap techniques

8.5.2 Confidence intervals in cluster samples

8.6 Advanced techniques

8.6.1 Median regression

8.6.2 Regression models for panel data

From wide to long format

Fixed-effects models

8.6.3 Error-components models

8.7 Exercises

9 Regression models for categorical dependent variables

9.1 The linear probability model

9.2 Basic concepts

9.2.1 Odds, log odds, and odds ratios

9.2.2 Excursion: The maximum likelihood principle

9.3 Logistic regression with Stata

9.3.1 The coefficient table

Sign interpretation

Interpretation with odds ratios

Probability interpretation

9.3.2 The iteration block

9.3.3 The model fit block

Classification tables

Pearson chi-squared

9.4 Logistic regression diagnostics

9.4.1 Linearity

9.4.2 Influential cases

9.5 Likelihood-ratio test

9.6 Refined models

9.6.1 Nonlinear relationships

9.6.2 Categorical independent variables

9.6.3 Interaction effects

9.7 Advanced techniques

9.7.1 Probit models

9.7.2 Multinomial logistic regression

9.7.3 Models for ordinal data

9.8 Exercises

10 Reading and writing data

10.1 The goal: The data matrix

10.2 Importing machine-readable data

10.2.1 Reading system files from other packages

10.2.2 Reading ASCII text files

Reading data in spreadsheet format

Reading data in free format

Reading data in fixed format

10.3 Inputting data

10.3.1 Input data using the Data Editor

10.3.2 The input command

10.4 Combining data

10.4.1 The GSOEP database

10.4.2 The merge command

The merge procedure

Keeping track of observations

Merging more than two files

Merging data on different levels

10.4.3 The append command

10.5 Saving and exporting data

10.6 Handling large datasets

10.6.1 Rules for handling the working memory

10.6.2 Using oversized datasets

10.7 Exercises

11 Do-files for advanced users and user-written programs

11.1 Two examples of usage

11.2 Four programming tools

11.2.1 Local macros

Calculating with local macros

Combining local macros

Changing local macros

11.2.2 Do-files

11.2.3 Programs

The problem of redefinition

The problem of naming

The problem of error checking

11.2.4 Programs in do-files and ado-files

11.3 User-written Stata commands

11.3.1 Parsing variable lists

11.3.2 Parsing options

11.3.3 Parsing if and in qualifiers

11.3.4 Generating an unknown number of variables

11.3.5 Default values

11.3.6 Extended macro functions

11.3.7 Avoiding changes in the dataset

11.3.8 Help files

11.4 Exercises

12 Around Stata

12.1 Resources and information

12.2 Taking care of Stata

12.3 Additional procedures

12.3.1 SJ and STB ado-files

12.3.2 SSC ado-files

12.3.3 Other ado-files

12.4 Exercises

References