## Stata is easy to grow with

### Consistent command syntax

Stata's commands are intuitive and easy to learn. Even better, everything you learn about performing a task can be applied to other tasks.

Need to limit your analysis to females? Add if female==1 to any command.

Need standard errors that are robust to violations of many common assumptions? Add vce(robust) to almost any estimation command.

Need to account for sampling weights, clusters, and stratification? Add svy: to the beginning of the command.
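As a minimal sketch of these three modifiers applied to one model, the lines below assume the NHANES II example dataset that ships with the Stata manuals (`webuse nhanes2`). That dataset is an assumption of this sketch, not something the page names, and it is assumed to carry its survey design settings already.

```stata
* Assumption: the NHANES II example dataset from the Stata manuals
webuse nhanes2, clear

* Display the survey design stored with the dataset (assumed already svyset)
svyset

* Restrict the estimation sample to females with an if qualifier
regress bmi age i.region if female==1

* Request robust standard errors with the vce(robust) option
regress bmi age i.region, vce(robust)

* Account for weights, clusters, and strata with the svy: prefix
svy: regress bmi age i.region

* With survey data, subpop() is the usual way to analyze a subgroup
svy, subpop(female): regress bmi age i.region
```

The model specification stays the same on every line; only the qualifier, option, or prefix changes.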

The consistency goes even deeper. What you learn about data management commands often applies to estimation commands, and vice versa.

There is a full suite of postestimation commands to perform hypothesis tests, form linear and nonlinear combinations, make predictions, form contrasts, and even perform marginal analysis with interaction plots. These commands work the same way after virtually every estimator.
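A brief sketch of a few of these postestimation commands follows. It again assumes the NHANES II example dataset (`webuse nhanes2`), which is an illustrative choice rather than something stated on this page, and it uses `lincom`, `nlcom`, and `contrast` for combinations and contrasts.

```stata
* Assumption: the NHANES II example dataset from the Stata manuals
webuse nhanes2, clear

* Linear regression with an age-by-sex interaction
regress bmi age i.region i.female c.age#i.female

* Linear combination: the slope on age for females
lincom age + 1.female#c.age

* Nonlinear combination: ratio of two region coefficients
nlcom _b[2.region] / _b[3.region]

* Contrasts: each region compared with the base region
contrast r.region

* The same commands after a different estimator
logistic highbp age i.region i.female c.age#i.female
lincom 2.region - 3.region
contrast r.region
```

After `logistic`, `lincom` reports its result as an odds ratio by default, so the output scale differs even though the syntax does not.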

### See how it works

Let's start with linear regression. We fit a variety of models and explore results using the postestimation commands for testing, prediction, and marginal analysis.

```stata
// Regression of body mass index (BMI) on age and region indicators
regress bmi age i.region

// Fit the model for females only
regress bmi age i.region if female==1

// Obtain robust standard errors
regress bmi age i.region, vce(robust)

// Include a female indicator and its interaction with age
regress bmi age i.region i.female c.age#i.female

// Perform a joint test of significance for the region indicators
testparm i.region

// Compute the predicted BMI for each person
predict bmi_hat

// Obtain the average prediction (potential outcome), treating
// all individuals as if they live in region 1
margins 1.region

// Obtain average predictions for all regions
margins region

// Obtain average predictions by sex across a range of ages
margins female, at(age=(20 40 60 80))

// Plot this interaction
marginsplot
```

What if we instead have a binary outcome variable, an indicator of whether an individual has high blood pressure? We could fit a logistic regression model. We replace regress in the commands above with logistic, and we use highbp instead of bmi as the dependent variable. Otherwise, the model specification, options, and postestimation commands are almost identical.

```stata
// Logistic regression of high blood pressure on age and region indicators
logistic highbp age i.region

// Fit the model for females only
logistic highbp age i.region if female==1

// Obtain robust standard errors
logistic highbp age i.region, vce(robust)

// Include a female indicator and its interaction with age
logistic highbp age i.region i.female c.age#i.female

// Perform a joint test of significance for the region indicators
testparm i.region

// Compute the predicted probability of high blood pressure
// for each person
predict prob_hbp

// Obtain the average predicted probability (potential outcome),
// treating all individuals as if they live in region 1
margins 1.region

// Obtain average predicted probability for all regions
margins region

// Obtain average predicted probabilities by sex across a range of ages
margins female, at(age=(20 40 60 80))

// Plot this interaction
marginsplot
```

If we have a count outcome such as the number of individuals in the household, we might want to fit a Poisson model. We use the poisson command and housesize as the dependent variable, but again, the rest of the command syntax is the same.

```stata
// Poisson regression of household size on age and region indicators
poisson housesize age i.region

// Fit the model for females only
poisson housesize age i.region if female==1

// Obtain robust standard errors
poisson housesize age i.region, vce(robust)

// Include a rural location indicator and its interaction with age
poisson housesize age i.region i.rural c.age#i.rural

// Perform a joint test of significance for the region indicators
testparm i.region

// Compute the predicted number of individuals in each household
predict size

// Obtain the average predicted household size (potential outcome),
// treating all individuals as if they live in region 1
margins 1.region

// Obtain average predicted household size for all regions
margins region

// Obtain average predicted household size by rural across
// a range of ages
margins rural, at(age=(20 40 60 80))

// Plot this interaction
marginsplot
```

We could fit many other models: models for ordered and unordered categorical outcomes, multilevel models, models for time-series, panel, or survival data, and models accounting for endogeneity and sample selection. Regardless of the model, we can use the same command structure, the same options, and the same postestimation commands that we used above.
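To illustrate with one of these other model types, here is a hedged sketch of an ordered logistic regression. It assumes the `fullauto` example dataset from the Stata manuals and its variables `rep77`, `foreign`, `length`, and `mpg`, none of which appear elsewhere on this page.

```stata
* Assumption: the fullauto example dataset from the Stata manuals
webuse fullauto, clear

* Ordered logistic regression of the 1977 repair record (five ordered levels)
ologit rep77 i.foreign length mpg

* Obtain robust standard errors
ologit rep77 i.foreign length mpg, vce(robust)

* Perform a joint test of significance for length and mileage
testparm length mpg

* Average predicted probability of the highest repair rating, by origin
margins foreign, predict(outcome(5))

* Plot these predictions
marginsplot
```

The only model-specific piece is the `predict(outcome(5))` option, which picks the outcome level whose probability `margins` should report.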

### Want to see more?

See the commands for fitting and interpreting linear regression models. Or watch the webinar.

See the commands for fitting and interpreting binary, count, and other outcomes. Or watch the webinar.