Stata Bookstore: Data Analysis Using Stata

Data Analysis Using Stata

Authors:	Ulrich Kohler and Frauke Kreuter
Publisher:	Stata Press
Copyright:	2005
ISBN-13:	978-1-59718-007-8
Pages:	378; paperback
Errata (www.stata-press.com)
Download the datasets used in this book (from www.stata-press.com)
Download the datasets used in the German edition of this book (from www.stata-press.com)
Review of book from the Stata Journal

Third edition now available

Comment from the Stata technical group

Data Analysis Using Stata provides a comprehensive introduction to Stata that will be useful to those who are just learning statistics and Stata, as well as to users of other statistical packages making the switch to Stata. Throughout the book, the authors make extensive use of examples using data from the German Socioeconomic Panel, a large survey of households containing demographic, income, employment, and other key information.

The book begins with an introduction to the Stata interface and then proceeds with a discussion of Stata syntax and simple programming tools, such as foreach loops. The core of the book includes chapters on producing tables and graphs, performing linear regression, and using logistic regression. All key concepts are illustrated with multiple examples.

The remainder of the book includes chapters on reading text files, writing programs and ado-files, and using Internet resources, such as the search command and the SSC archive.

Overall, Kohler and Kreuter's book will serve as a valuable introduction to Stata, both for those who are new to statistics and statistical computing and for those who are new to Stata but familiar with other programs. The book also makes a handy reference guide for current Stata users.

Table of contents

Preface (pdf)

0 About the book

0.1 Structure
0.2 Using this book: Materials and hints
0.3 Teaching with this manual

1 "The first time"

1.1 Starting Stata
1.2 Setting up your screen
1.3 Your first analysis
1.4 Do-files
1.5 Exiting Stata

2 Working with do-files

2.1 From interactive work to working with a do-file

2.1.1 Alternative 1
2.1.2 Alternative 2

2.2 Designing do-files

2.2.1 Comments
2.2.2 Line breaks
2.2.3 Some crucial commands

2.3 Organizing your work
2.4 Summary

3 The grammar of Stata

3.1 The elements of Stata commands

3.1.1 Stata commands
3.1.2 The variable list

List of variables: required or optional
Abbreviation rules
Special listings

3.1.3 Options
3.1.4 The in qualifier
3.1.5 The if qualifier
3.1.6 Expressions

Operators
Functions

3.1.7 Lists of numbers
3.1.8 Using filenames

3.2 Repeating similar commands

3.2.1 The by prefix
3.2.2 The foreach loop
3.2.3 The forvalues loop

3.3 Weights

4 Some general comments on the statistical commands

5 Creating and changing variables

5.1 The commands generate and replace

5.1.1 Variable names
5.1.2 Some examples
5.1.3 Changing codes with by, _n, and _N
5.1.4 Subscripts

5.2 Specialized recoding commands

5.2.1 The recode command
5.2.2 The egen command

5.3 Additional tools for recoding data

5.3.1 String functions
5.3.2 Date functions

5.4 Commands for dealing with missing values
5.5 Labels
5.6 Storage types, or, the ghost in the machine

6 Creating and changing graphs

6.1 A primer on graph syntax
6.2 Graph types

6.2.1 Examples
6.2.2 Specialized graphs

6.3 Graph elements

6.3.1 Appearance of data

Choice of marker
Marker colors
Marker size
Lines

6.3.2 Graph and plot regions

Graph size
Plot region
Scaling the axes

6.3.3 Information inside the plot region

Reference lines
Labeling inside the plot region

6.3.4 Information outside the plot region

Labeling the axes
Tick lines
Axis titles
The legend
Graph titles

6.4 Multiple graphs

6.4.1 Overlaying numerous twoway graphs
6.4.2 Option by()
6.4.3 Combining graphs

6.5 Saving and printing graphs

7 Describing and comparing distributions

7.1 Categories: Few or many?
7.2 Variables with few categories

7.2.1 Tables

Frequency tables
More than one frequency table
Comparing distributions
Summary statistics

7.2.2 Graphs

Histograms
Bar charts
Dot chart

7.3 Variables with many categories

7.3.1 Frequencies of grouped data

Some remarks on grouping data
Special techniques for grouping data

7.3.2 Describing data using statistics

Important summary statistics
The summarize command
The tabstat command
Comparing distributions using statistics

7.3.3 Graphs

Box plots
Histograms
Kernel density estimation
Quantile plot

7.3.4 Summary

7.4 Summary

8 Introduction to linear regression

8.1 Simple linear regression

8.1.1 The basic principle
8.1.2 Linear regression using Stata

The table of coefficients
Standard errors
The table of ANOVA results
The model fit table

8.2 Multiple regression

8.2.1 Multiple regression using Stata
8.2.2 Additional computations
8.2.3 What does "under control" mean?

8.3 Regression diagnostics

8.3.1 Violation of E(ε_i) = 0

Linearity
Influential cases
Omitted variables

8.3.2 Violation of Var(ε_i) = σ²
8.3.3 Violation of Cov(ε_i, ε_j) = 0, i ≠ j

8.4 Model extensions

8.4.1 Categorical independent variables
8.4.2 Interaction terms
8.4.3 Regression models using transformed variables

Nonlinear relations
Eliminating heteroskedasticity

8.5 More on standard errors

8.5.1 Bootstrap techniques
8.5.2 Confidence intervals in cluster samples

8.6 Advanced techniques

8.6.1 Median regression
8.6.2 Regression models for panel data

From wide to long format
Fixed-effects models

8.6.3 Error-component models

8.7 Summary

9 Regression models for categorical dependent variables

9.1 The linear probability model
9.2 Basic concepts

9.2.1 Odds, log odds, and odds ratios
9.2.2 Excursion: The maximum likelihood principle

9.3 Logistic regression with Stata

9.3.1 The coefficients block

Sign interpretation
Interpretation with odds ratios
Probability interpretation

9.3.2 The iteration block
9.3.3 The model fit block

Classification tables
Pearson chi-squared

9.4 Logistic regression diagnostics

9.4.1 Linearity
9.4.2 Influential cases

9.5 Likelihood-ratio test
9.6 Refined models
9.7 Advanced techniques

9.7.1 Probit models
9.7.2 Multinomial logistic regression
9.7.3 Models for ordinal data

9.8 Summary

10 Reading and writing data

10.1 The goal: The data matrix
10.2 Importing machine-readable data

10.2.1 Reading system files from other packages
10.2.2 Reading ASCII text files

Reading data in spreadsheet format
Reading data in free format
Reading data in fixed format

10.3 Inputting data

10.3.1 Input data using the editor
10.3.2 The input command

10.4 Combining data

10.4.1 The GSOEP database
10.4.2 The merge command

The merge procedure
Keeping track of observations
Merging more than two files
Merging data on different levels

10.4.3 The append command

10.5 Saving and exporting data
10.6 Handling big datasets

10.6.1 Rules for handling the working memory
10.6.2 Using oversized datasets

10.7 Summary

11 Do-files for advanced users and user-written programs

11.1 Two examples of usage
11.2 Four programming tools

11.2.1 Local macros
11.2.2 Do-files
11.2.3 Programs
11.2.4 Programs in do-files and ado-files

11.3 User-written Stata commands

11.3.1 Parsing variable lists
11.3.2 Parsing options
11.3.3 Parsing if and in qualifiers
11.3.4 Generating an unknown number of variables
11.3.5 Default values
11.3.6 Extended macro functions
11.3.7 Avoiding changes in the dataset
11.3.8 Help files

11.4 Summary

12 Around Stata

12.1 Resources and information
12.2 Taking care of Stata
12.3 Additional procedures

12.3.1 SJ and STB ado-files
12.3.2 SSC ado-files
12.3.3 Other ado-files

12.4 Summary

References

Author index (pdf)

Subject index (pdf)
