Stata's commands are intuitive and easy to learn. Even better, everything you learn about performing a task can be applied to other tasks.

Need to limit your analysis to females? Add **if female==1** to any
command.

Need standard errors that are robust to violations of many common assumptions? Add
**vce(robust)** to almost any estimation command.

Need to account for sampling weights, clusters, and stratification? Add
**svy:** to the beginning of the command.
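As a minimal sketch of the **svy:** prefix, the example below simulates a small dataset so that it can run on its own. The variable names (psu, stratum, wgt) and the simulated design are assumptions made for illustration and are not taken from any real survey. The design is declared once with **svyset**, and the prefix then applies it to the estimation command.

```stata
// Simulated data for illustration only: 10 strata, 4 PSUs per stratum
clear
set seed 2024
set obs 1000
generate psu     = ceil(_n/25)        // 40 sampling units of 25 people each
generate stratum = ceil(psu/4)        // 4 sampling units in each stratum
generate wgt     = 1 + 4*runiform()   // hypothetical sampling weights
generate female  = runiform() < 0.5
generate age     = 20 + floor(61*runiform())
generate bmi     = 22 + 0.05*age + 0.5*female + rnormal(0, 4)

// Declare the survey design once
svyset psu [pweight=wgt], strata(stratum)

// Prefix the estimation command with svy: to account for the design
svy: regress bmi age i.female
```

With real survey data, the weight, cluster, and strata variables would come from the survey documentation rather than being generated.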

The consistency goes even deeper. What you learn about data management commands often applies to estimation commands, and vice versa.
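To illustrate, here is a short sketch using the auto dataset that ships with Stata. This dataset is an assumption chosen so that the example is self-contained, and its variables (price, mpg, foreign) differ from the health variables used elsewhere on this page. The same **if** qualifier and **by** prefix work with both data management and estimation commands.

```stata
sysuse auto, clear

// Data management: the if qualifier restricts the observations used
summarize price if foreign==1
generate byte cheap = price < 5000 if foreign==1

// Estimation: the same qualifier restricts the estimation sample
regress price mpg if foreign==1

// The by prefix repeats a command within groups, whether the command
// summarizes data or fits a model
bysort foreign: summarize price
bysort foreign: regress price mpg
```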

There is a full suite of postestimation commands to perform hypothesis tests, form linear and nonlinear combinations, make predictions, form contrasts, and even perform marginal analysis with interaction plots. These commands work the same way after virtually every estimator.
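A few of these postestimation commands can be sketched as follows. The auto dataset that ships with Stata is used here only as an assumption to keep the example self-contained, and the particular tests and combinations are chosen for illustration rather than for substantive interest.

```stata
sysuse auto, clear

// Fit a linear regression with a categorical predictor
regress price mpg weight i.rep78

// Hypothesis test: are the coefficients on mpg and weight jointly zero?
test mpg weight

// Linear combination: difference between two repair-record categories
lincom 5.rep78 - 4.rep78

// Nonlinear combination: ratio of two coefficients
nlcom _b[mpg] / _b[weight]

// Contrasts: joint test of the rep78 categories
contrast rep78

// Predictions: fitted values for each observation
predict price_hat
```

Because these commands read the stored estimation results, the same lines can typically follow a different estimator with little or no change.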

Let's start with linear regression. We fit a variety of models and explore results using the postestimation commands for testing, prediction, and marginal analysis.

    // Regression of body mass index (BMI) on age and region indicators
    regress bmi age i.region

    // Fit the model for females only
    regress bmi age i.region if female==1

    // Obtain robust standard errors
    regress bmi age i.region, vce(robust)

    // Include a female indicator and its interaction with age
    regress bmi age i.region i.female c.age#i.female

    // Perform a joint test of significance for the region indicators
    testparm i.region

    // Compute the predicted BMI for each person
    predict bmi_hat

    // Obtain the average prediction (potential outcome), treating
    // all individuals as if they live in region 1
    margins 1.region

    // Obtain average predictions for all regions
    margins region

    // Obtain average predictions by sex across a range of ages
    margins female, at(age=(20 40 60 80))

    // Plot this interaction
    marginsplot

(See the graph)

What if we instead have a binary outcome variable, an indicator of whether
an individual has high blood pressure? We could fit a logistic
regression model. We replace **regress**
in the commands above with **logistic**,
and we use **highbp** instead of **bmi** as the dependent variable.
Otherwise, the model specification, options, and postestimation commands are
almost identical.

    // Logistic regression of high blood pressure on age and region indicators
    logistic highbp age i.region

    // Fit the model for females only
    logistic highbp age i.region if female==1

    // Obtain robust standard errors
    logistic highbp age i.region, vce(robust)

    // Include a female indicator and its interaction with age
    logistic highbp age i.region i.female c.age#i.female

    // Perform a joint test of significance for the region indicators
    testparm i.region

    // Compute the predicted probability of high blood pressure
    // for each person
    predict prob_hbp

    // Obtain the average predicted probability (potential outcome),
    // treating all individuals as if they live in region 1
    margins 1.region

    // Obtain average predicted probability for all regions
    margins region

    // Obtain average predicted probabilities by sex across a range of ages
    margins female, at(age=(20 40 60 80))

    // Plot this interaction
    marginsplot

(See the graph)

If we have a count outcome such as the number of individuals in the household, we
might want to fit a Poisson model. We use the
**poisson** command and
**housesize** as the dependent variable, but again,
the rest of the command syntax is the same.

    // Poisson regression of household size on age and region indicators
    poisson housesize age i.region

    // Fit the model for females only
    poisson housesize age i.region if female==1

    // Obtain robust standard errors
    poisson housesize age i.region, vce(robust)

    // Include a rural location indicator and its interaction with age
    poisson housesize age i.region i.rural c.age#i.rural

    // Perform a joint test of significance for the region indicators
    testparm i.region

    // Compute the predicted number of individuals in each household
    predict size

    // Obtain the average predicted household size (potential outcome),
    // treating all individuals as if they live in region 1
    margins 1.region

    // Obtain average predicted household size for all regions
    margins region

    // Obtain average predicted household size by rural across
    // a range of ages
    margins rural, at(age=(20 40 60 80))

    // Plot this interaction
    marginsplot

(See the graph)

We could fit many other models. Models for ordered and unordered categorical outcomes. Multilevel models. Models for time-series, panel, or survival data. Models accounting for endogeneity and sample selection. Regardless of the model, we can use the same command structure, the same options, and the same postestimation commands that we used above.
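As one sketch of this, an ordered logistic model can follow the same pattern. The data below are simulated so that the example runs on its own. The outcome name health, its four categories, and the coefficients used to generate it are all hypothetical.

```stata
// Simulated data for illustration only
clear
set seed 2024
set obs 1000
generate female = runiform() < 0.5
generate age    = 20 + floor(61*runiform())
generate region = 1 + floor(4*runiform())
generate u      = runiform()
generate latent = 0.03*age + 0.4*female + ln(u/(1 - u))   // logistic error
generate health = 1 + (latent > 1) + (latent > 2) + (latent > 3)

// Same specification style and the same vce(robust) option
ologit health age i.region i.female, vce(robust)

// Same joint test for the region indicators
testparm i.region

// Same margins syntax; predict(outcome(4)) selects the probability
// of the highest category
margins female, at(age=(20 40 60 80)) predict(outcome(4))
```

The only model-specific piece is the predict() option on **margins**, which picks the outcome category whose probability is averaged.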

See the commands for fitting and interpreting linear regression models. Or watch the webinar.

See the commands for fitting and interpreting binary, count, and other outcomes. Or watch the webinar.