Stata's margins and marginsplot commands are powerful tools for visualizing the results of regression models. We will use linear regression below, but the same principles and syntax work with nearly all of Stata's regression commands, including probit, logistic, poisson, and others. You will want to review Stata's factor-variable notation if you have not used it before.
Let's begin by opening the nhanes2l dataset. Then let's describe and summarize the variables bpsystol and diabetes.
. webuse nhanes2l
(Second National Health and Nutrition Examination Survey)

. describe bpsystol diabetes

Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
bpsystol        int     %9.0g                 Systolic blood pressure
diabetes        byte    %12.0g     diabetes   Diabetes status

. summarize bpsystol diabetes

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    bpsystol |     10,351    130.8817    23.33265         65        300
    diabetes |     10,349    .0482172    .2142353          0          1
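We are going to fit a series of linear regression models for the outcome variable bpsystol, which measures systolic blood pressure (SBP) and ranges from 65 to 300 mmHg. The predictor variable diabetes records diabetes status and is coded 0 (not diabetic) or 1 (diabetic). Two observations are missing a value for diabetes, so the model below uses 10,349 of the 10,351 observations.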
Let's fit a linear regression model using the continuous outcome variable bpsystol and the binary predictor variable diabetes. Note that I have used factor-variable notation to tell Stata that diabetes is a categorical predictor.
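. regress bpsystol i.diabetes

      Source |       SS           df       MS      Number of obs   =    10,349
-------------+----------------------------------   F(1, 10347)     =    244.99
       Model |  130296.034         1  130296.034   Prob > F        =    0.0000
    Residual |  5502984.01    10,347  531.843434   R-squared       =    0.0231
-------------+----------------------------------   Adj R-squared   =    0.0230
       Total |  5633280.05    10,348   544.38346   Root MSE        =    23.062

------------------------------------------------------------------------------
    bpsystol | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    diabetes |
   Diabetic  |   16.56328   1.058212    15.65   0.000     14.48898    18.63758
       _cons |    130.088   .2323666   559.84   0.000     129.6325    130.5435
------------------------------------------------------------------------------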
The regression output tells us that the expected SBP for a person without diabetes is 130.088 mmHg and that the expected SBP for a person with diabetes is 16.56328 mmHg higher.
We can calculate the expected SBP in each group using the estimated coefficients in the regression output. We substitute a value of 0 in the equation for the nondiabetic group and a value of 1 in the equation for the diabetic group.
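Written out, the equation behind this substitution looks roughly like the following. The symbols \hat\beta_0 and \hat\beta_1 are notation introduced here for illustration; they stand for the _cons and Diabetic coefficients in the regression output.

```latex
\begin{aligned}
E(\text{SBP} \mid \text{diabetes}) &= \hat\beta_0 + \hat\beta_1 \times \text{diabetes} \\
E(\text{SBP} \mid \text{diabetes} = 0) &= 130.088 + 16.56328 \times 0 = 130.088 \\
E(\text{SBP} \mid \text{diabetes} = 1) &= 130.088 + 16.56328 \times 1 = 146.65128
\end{aligned}
```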
. display "E(SBP | no diabetes) = " 130.088 + 16.56328 * 0
E(SBP | no diabetes) = 130.088

. display "E(SBP | diabetes) = " 130.088 + 16.56328 * 1
E(SBP | diabetes) = 146.65128
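Rather than retyping rounded coefficients, the same arithmetic can be done with the coefficients Stata stores after estimation. This is a minimal sketch, assuming the regress command above is the most recent estimation command; it is written in do-file style, without the dot prompt.

```stata
* Sketch: refit the model so the stored coefficients are available
webuse nhanes2l, clear
regress bpsystol i.diabetes

* _b[] refers to the stored coefficients by name
display "E(SBP | no diabetes) = " _b[_cons]
display "E(SBP | diabetes)    = " _b[_cons] + _b[1.diabetes]

* lincom reports the same sum along with a standard error
* and a confidence interval
lincom _cons + 1.diabetes
```

Because _b[] holds the coefficients at full precision, the results may differ from the hand calculation in the last displayed digits.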
Stata's margins command will estimate the expected SBP in both groups. Note that the “i.” prefix is required in the regress command but not in the margins command.
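. margins diabetes

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    diabetes |
Not diabetic |    130.088   .2323666   559.84   0.000     129.6325    130.5435
   Diabetic  |   146.6513   1.032385   142.05   0.000     144.6276     148.675
------------------------------------------------------------------------------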
The output also reports a standard error, t statistic, p-value, and 95% confidence interval for each estimate. The t statistic tests the null hypothesis that the expected SBP is zero.
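As a quick check on how those t statistics arise, each one is the margin divided by its standard error. The notation below is introduced here for illustration; the numbers are taken from the margins output above.

```latex
t = \frac{\widehat{\text{margin}}}{\text{std. err.}}, \qquad
t_{\text{Not diabetic}} = \frac{130.088}{0.2323666} \approx 559.84, \qquad
t_{\text{Diabetic}} = \frac{146.6513}{1.032385} \approx 142.05
```

Because nobody's SBP is anywhere near zero, these tests are rarely of practical interest; the confidence intervals are usually the more useful part of the output.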
We can plot the marginal predictions and their 95% confidence intervals by typing marginsplot.
. marginsplot

Variables that uniquely identify margins: diabetes
By default, marginsplot creates a profile plot using lines. We can use the recast(bar) option if we prefer a bar chart, or “dynamite plunger plot”.
. marginsplot, recast(bar)

Variables that uniquely identify margins: diabetes
We can add the horizontal option to create a horizontal bar chart.
. marginsplot, recast(bar) horizontal

Variables that uniquely identify margins: diabetes
Let's add some additional options to make our graph look nicer. We can use the plotopts(barwidth()) option to add some space between the bars. And we can use the title(), subtitle(), xtitle(), and ytitle() options to add various titles to our graph.
. marginsplot, recast(bar) horizontal plotopts(barwidth(0.8)) title("Expected systolic blood pressure (mmHg)") subtitle("By diabetes status") xtitle("Expected systolic blood pressure (mmHg)") ytitle("")

Variables that uniquely identify margins: diabetes
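A command this long is easier to read and rerun from a do-file. The sketch below repeats the steps above in do-file form, using /// to continue the command across lines, and then saves the graph. The file name sbp_by_diabetes.png is a hypothetical choice for illustration.

```stata
* Sketch of a do-file version of the example above
webuse nhanes2l, clear
regress bpsystol i.diabetes
margins diabetes

* /// continues a command onto the next line in a do-file
marginsplot, recast(bar) horizontal                       ///
    plotopts(barwidth(0.8))                               ///
    title("Expected systolic blood pressure (mmHg)")      ///
    subtitle("By diabetes status")                        ///
    xtitle("Expected systolic blood pressure (mmHg)")     ///
    ytitle("")

* Save the current graph; the file name is hypothetical
graph export "sbp_by_diabetes.png", replace
```

Note that /// works only in do-files and the Do-file Editor, not when typing in the Command window.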
You can read more about factor-variable notation, margins, and marginsplot in the Stata documentation. You can also watch a demonstration of these commands by clicking on the links to the YouTube videos below.
Read more in the Stata Base Reference Manual; see [R] margins, [R] marginsplot, and [R] regress. And in the Stata User’s Guide, see [U] 11.4.3 Factor variables.