Stata's margins and marginsplot commands are powerful tools for visualizing the results of regression models. We will use linear regression below, but the same principles and syntax work with nearly all of Stata's regression commands, including probit, logistic, poisson, and others. You will want to review Stata's factor-variable notation if you have not used it before.
Let's begin by opening the nhanes2l dataset. Then let's describe and summarize the variables bpsystol and age.
. webuse nhanes2l
(Second National Health and Nutrition Examination Survey)

. describe bpsystol age

Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
bpsystol        int     %9.0g                 Systolic blood pressure
age             byte    %9.0g                 Age (years)

. summarize bpsystol age

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    bpsystol |     10,351    130.8817    23.33265         65        300
         age |     10,351    47.57965    17.21483         20         74
We are going to fit a series of linear regression models for the outcome variable bpsystol, which measures systolic blood pressure (SBP) and ranges from 65 to 300 mmHg. The predictor variable age measures age in years and ranges from 20 to 74.
Let's fit a linear regression model using the continuous outcome variable bpsystol and the continuous predictor variable age.
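. regress bpsystol age

      Source |       SS           df       MS      Number of obs   =    10,351
-------------+----------------------------------   F(1, 10349)     =   3116.79
       Model |  1304200.02         1  1304200.02   Prob > F        =    0.0000
    Residual |  4330470.01    10,349  418.443328   R-squared       =    0.2315
-------------+----------------------------------   Adj R-squared   =    0.2314
       Total |  5634670.03    10,350  544.412563   Root MSE        =    20.456

------------------------------------------------------------------------------
    bpsystol | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .6520775   .0116801    55.83   0.000     .6291823    .6749727
       _cons |   99.85603   .5909867   168.96   0.000     98.69758    101.0145
------------------------------------------------------------------------------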
The regression output tells us that the intercept, labeled “_cons”, is 99.85603 and the slope coefficient for age is 0.6520775. We can use these estimated coefficients to calculate the expected SBP for a 20-year-old.
. display "E(SBP | age = 20) = " 99.85603 + 0.6520775 * 20
E(SBP | age = 20) = 112.89758
We can also use the estimated coefficients to calculate the expected SBP for a 40-year-old and a 60-year-old.
. display "E(SBP | age = 40) = " 99.85603 + 0.6520775 * 40
E(SBP | age = 40) = 125.93913

. display "E(SBP | age = 60) = " 99.85603 + 0.6520775 * 60
E(SBP | age = 60) = 138.98068
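Typing the coefficients by hand works, but it is easy to mistype a digit. As a minimal sketch, the same calculation can use the coefficients that regress leaves in memory, which are available as _b[_cons] and _b[age]. The lincom command evaluates the same linear combination and also reports a standard error and confidence interval. The lines below are written as a do-file and assume the dataset can be downloaded with webuse.

```stata
* Sketch: reproduce the hand calculation from stored estimation results
webuse nhanes2l, clear
regress bpsystol age

* _b[name] refers to the coefficient on "name" from the last model
display "E(SBP | age = 20) = " _b[_cons] + _b[age] * 20
display "E(SBP | age = 40) = " _b[_cons] + _b[age] * 40
display "E(SBP | age = 60) = " _b[_cons] + _b[age] * 60

* lincom evaluates the same combination and adds a standard error and CI
lincom _cons + 20 * age
```

The displayed values should agree with the hand calculations above up to rounding, because _b[] holds the coefficients at full precision rather than the rounded values shown in the regression table.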
Stata's margins command will estimate the expected SBP for all three values of age along with the standard error, t statistic, p-value, and 95% confidence interval. Because age enters this model as a continuous predictor, no factor-variable prefix such as “i.” is needed in either the regress command or the margins command.
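. margins, at(age=(20 40 60))

Adjusted predictions                                    Number of obs = 10,351
Model VCE: OLS

Expression: Linear prediction, predict()
1._at: age = 20
2._at: age = 40
3._at: age = 60

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   112.8976   .3797296   297.31   0.000     112.1532    113.6419
          2  |   125.9391   .2196887   573.26   0.000     125.5085    126.3698
          3  |   138.9807   .2479332   560.56   0.000     138.4947    139.4667
------------------------------------------------------------------------------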
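To see where the standard errors come from, here is a rough sketch that rebuilds the standard error of the age-20 prediction from the coefficient covariance matrix e(V). For a linear prediction b0 + b1*age, the variance is Var(b0) + age^2 * Var(b1) + 2 * age * Cov(b0, b1). The matrix name V is an arbitrary choice, and the sketch assumes e(V) is ordered with age first and _cons second, which is how regress stores it for this model.

```stata
* Sketch: rebuild the standard error at age = 20 from e(V)
webuse nhanes2l, clear
regress bpsystol age

* Copy the coefficient covariance matrix; rows and columns are age, _cons
matrix V = e(V)
matrix list V

* Var(_cons) + 20^2 * Var(age) + 2 * 20 * Cov(age, _cons)
display "SE of E(SBP | age = 20) = " sqrt(V[2,2] + 20^2 * V[1,1] + 2 * 20 * V[1,2])
```

The result should be close to the .3797296 that margins reports for age = 20.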
We can estimate predictions from age 20 to 60 in 5-year increments using the following syntax:
. margins, at(age=(20(5)60))

Adjusted predictions                                    Number of obs = 10,351
Model VCE: OLS

Expression: Linear prediction, predict()
1._at: age = 20
2._at: age = 25
3._at: age = 30
4._at: age = 35
5._at: age = 40
6._at: age = 45
7._at: age = 50
8._at: age = 55
9._at: age = 60

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   112.8976   .3797296   297.31   0.000     112.1532    113.6419
          2  |    116.158   .3316322   350.26   0.000     115.5079     116.808
          3  |   119.4184   .2873786   415.54   0.000      118.855    119.9817
          4  |   122.6787   .2490265   492.63   0.000     122.1906    123.1669
          5  |   125.9391   .2196887   573.26   0.000     125.5085    126.3698
          6  |   129.1995   .2033058   635.49   0.000      128.801     129.598
          7  |   132.4599   .2030384   652.39   0.000     132.0619    132.8579
          8  |   135.7203   .2189455   619.88   0.000     135.2911    136.1495
          9  |   138.9807   .2479332   560.56   0.000     138.4947    139.4667
------------------------------------------------------------------------------
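Each successive prediction in the table rises by the same amount, because the model is linear in age. As a small sketch, the following do-file lines compute that 5-year increment from the stored slope and ask margins for the slope itself using the dydx() option.

```stata
* Sketch: the step between adjacent predictions is 5 times the age slope
webuse nhanes2l, clear
regress bpsystol age

* Expected change in SBP for a 5-year increase in age
display "5-year change in E(SBP) = " 5 * _b[age]

* In a linear model with no interactions, the marginal effect of age
* is simply the age coefficient
margins, dydx(age)
```

The 5-year change should match the gap between adjacent rows of the table, for example 116.158 - 112.8976.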
The output also reports a standard error, t statistic, p-value, and 95% confidence interval for each estimate. The t statistic tests the null hypothesis that the expected SBP is zero.
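As a rough check on how these columns fit together, the sketch below recomputes the t statistic and the 95% confidence interval for the age-20 prediction from the reported margin and standard error. It assumes the interval is based on a t distribution with the 10,349 residual degrees of freedom shown in the regression table. Because a null hypothesis of zero SBP is seldom of interest, the last line tests the prediction against 120 mmHg instead; that value is an arbitrary choice made only for illustration.

```stata
* Sketch: rebuild the t statistic and 95% CI for the age-20 prediction
webuse nhanes2l, clear
regress bpsystol age

* t statistic = margin / standard error
display "t = " 112.8976 / .3797296

* Critical value from a t distribution with 10,349 degrees of freedom
local tcrit = invttail(10349, 0.025)
display "lower bound = " 112.8976 - `tcrit' * .3797296
display "upper bound = " 112.8976 + `tcrit' * .3797296

* Test a different null: is E(SBP | age = 20) equal to 120? (120 is illustrative)
test _cons + 20 * age = 120
```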
Then we can plot the marginal predictions by typing marginsplot.
. marginsplot

Variables that uniquely identify margins: age
Let's use the ytitle() and title() options to add titles to our graph.
. marginsplot, ytitle("Expected systolic blood pressure (mmHg)") title("Expected systolic blood pressure by age")

Variables that uniquely identify margins: age
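With nine closely spaced predictions, a line with a shaded confidence band can be easier to read than markers with error bars. This is one possible sketch, using marginsplot's recast() and recastci() options to change the plot types. The titles are the same as above, and the /// line continuation works in a do-file but not at the interactive command line.

```stata
* Sketch: draw the predictions as a line with a shaded confidence band
webuse nhanes2l, clear
regress bpsystol age
margins, at(age=(20(5)60))

* recast(line) draws the predictions as a line;
* recastci(rarea) draws the confidence intervals as a shaded area
marginsplot, recast(line) recastci(rarea)              ///
    ytitle("Expected systolic blood pressure (mmHg)")  ///
    title("Expected systolic blood pressure by age")
```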
You can read more about factor-variable notation, margins, and marginsplot in the Stata documentation. You can also watch a demonstration of these commands by clicking on the links to the YouTube videos below.
Read more in the Stata Base Reference Manual; see [R] margins, [R] marginsplot, and [R] regress. In the Stata User's Guide, see [U] 11.4.3 Factor variables.