Stata News, Vol 38 No 1

## In the spotlight: Tables of estimation results in Stata 17

Making tables of estimation results is now easier than ever before with the new etable command in Stata 17. You can create tables for single or multiple models, include marginal predictions, customize your tables using the collect commands, and export your tables to most document formats. There are far too many features to demonstrate in one Stata News article. But I want to show you some basics to get you started, and you can learn more by reading the Stata manual.

## Tables for one regression model

Let's begin by typing webuse nhanes2l to open a dataset based on the Second National Health and Nutrition Examination Survey (NHANES II).

. webuse nhanes2l
(Second National Health and Nutrition Examination Survey)


We will use the variables diabetes, agegrp, bmi, and highbp.

. describe diabetes agegrp bmi highbp

Variable      Storage   Display    Value
    name         type    format    label      Variable label

diabetes        byte    %12.0g     diabetes   Diabetes status
agegrp          byte    %8.0g      agegrp     Age group
bmi             float   %9.0g                 Body mass index (BMI)
highbp          byte    %8.0g               * High blood pressure



We start by fitting a logistic regression model for the binary outcome diabetes, including the predictors agegrp, bmi, and highbp. The i. prefix is factor-variable notation that tells Stata agegrp and highbp are categorical variables.

. logistic diabetes i.agegrp bmi i.highbp

Logistic regression                                 Number of obs = 10,349
                                                    LR chi2(7)    = 416.01
                                                    Prob > chi2   = 0.0000
Log likelihood = -1791.7561                         Pseudo R2     = 0.1040

diabetes   Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]

agegrp
30–39        1.730448   .5895795     1.61   0.108     .8874554    3.374199
40–49        4.259599   1.297735     4.76   0.000     2.344448    7.739213
50–59        6.888277   1.993273     6.67   0.000     3.906582    12.14575
60–69        10.88779   2.952805     8.80   0.000     6.398693    18.52629
70+          15.25109   4.308098     9.65   0.000     8.767088    26.53057

bmi           1.07177   .0091357     8.13   0.000     1.054014    1.089826
1.highbp     1.251171   .1290527     2.17   0.030     1.022161    1.531491
_cons        .0011192   .0003767   -20.19   0.000     .0005787    .0021646

Note: _cons estimates baseline odds.
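
Factor-variable notation also controls which category serves as the reference. The sketch below is not part of the original analysis: it uses the ib(last). operator so that the oldest age group becomes the base level. The fit is unchanged; only the comparisons are reparameterized. The last line refits the original model so that the commands that follow see the same estimation results.

```stata
* Standalone sketch: make the last level of agegrp (70+) the base category
webuse nhanes2l, clear
logistic diabetes ib(last).agegrp bmi i.highbp

* Refit the model used in the rest of this article
quietly logistic diabetes i.agegrp bmi i.highbp
```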

Now we can type etable to create a basic table of the estimation results. By default, etable displays the estimated odds ratios, their standard errors in parentheses below the odds ratios, and the number of observations.

. etable

diabetes

Age group
30–39                   1.730
(0.590)
40–49                   4.260
(1.298)
50–59                   6.888
(1.993)
60–69                  10.888
(2.953)
70+                    15.251
(4.308)
Body mass index (BMI)     1.072
(0.009)
High blood pressure
1                       1.251
(0.129)
Intercept                 0.001
(0.000)
Number of observations    10349



We can use the keep() option to specify which variables to include in the table. The example below retains only the predictor variables and omits the intercept. The replay option replays the previous etable results, including any previously specified options.

. etable, replay keep(agegrp bmi highbp)

diabetes

Age group
30–39                   1.730
(0.590)
40–49                   4.260
(1.298)
50–59                   6.888
(1.993)
60–69                  10.888
(2.953)
70+                    15.251
(4.308)
Body mass index (BMI)     1.072
(0.009)
High blood pressure
1                       1.251
(0.129)
Number of observations    10349
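
etable also has a drop() option, the complement of keep(). As a minimal sketch, assuming the same model, dropping only _cons should leave the same rows as the keep() call above without listing every predictor. The sketch builds a fresh table rather than using replay.

```stata
* Standalone sketch: refit the model so the example runs on its own
webuse nhanes2l, clear
quietly logistic diabetes i.agegrp bmi i.highbp

* Omit only the intercept instead of naming each predictor to keep
etable, drop(_cons)
```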



We can use the cstat() option to specify which statistics we wish to display in our table. The example below uses cstat(_r_b) to display the reported coefficients, which are the odds ratios for logistic regression models. The command also includes cstat(_r_ci), which displays the 95% confidence interval for the estimated odds ratios.

. etable, replay cstat(_r_b) cstat(_r_ci)

diabetes

Age group
30–39                            1.730
[0.887     3.374]
40–49                            4.260
[2.344     7.739]
50–59                            6.888
[3.907    12.146]
60–69                           10.888
[6.399    18.526]
70+                             15.251
[8.767    26.531]
Body mass index (BMI)              1.072
[1.054     1.090]
High blood pressure
1                                1.251
[1.022     1.531]
Number of observations             10349



The table below displays the option names and descriptions of the statistics that may be used with cstat().
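
As a hedged illustration of a few of those statistics, the sketch below requests _r_b, _r_se, and _r_p (the reported coefficient, its standard error, and its p-value). The %5.3f format for the p-value is an arbitrary choice made for this example.

```stata
* Standalone sketch: refit the model so the example runs on its own
webuse nhanes2l, clear
quietly logistic diabetes i.agegrp bmi i.highbp

* Each cstat() adds one statistic; they are stacked in the order given
etable, keep(agegrp bmi highbp)          ///
        cstat(_r_b)                      ///
        cstat(_r_se)                     ///
        cstat(_r_p, nformat(%5.3f))
```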

We can use the nformat() option within cstat() to specify the display format of the statistic. The example below displays the odds ratios and confidence intervals with two digits to the right of the decimal. The cidelimiter(,) option places a comma between the lower and upper bounds of the confidence interval.

. etable, replay cstat(_r_b, nformat(%4.2f))
>         cstat(_r_ci, cidelimiter(,) nformat(%6.2f))

diabetes

Age group
30–39                         1.73
[0.89,  3.37]
40–49                         4.26
[2.34,  7.74]
50–59                         6.89
[3.91, 12.15]
60–69                        10.89
[6.40, 18.53]
70+                          15.25
[8.77, 26.53]
Body mass index (BMI)           1.07
[1.05,  1.09]
High blood pressure
1                             1.25
[1.02,  1.53]
Number of observations         10349
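
The formats passed to nformat() are ordinary Stata numeric display formats: %w.df asks for a total width of w characters with d digits after the decimal point. One quick way to preview a format before using it in a table is display, sketched here with values copied from the logistic output above.

```stata
* %4.2f: width 4, two digits after the decimal point
display %4.2f 1.730448

* %6.2f: width 6, so shorter numbers are padded on the left
display %6.2f 12.14575

* string() applies the same format and returns text
display "[" string(.8874554, "%6.2f") "]"
```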



We can use the showstars option to place stars next to the odds ratios with p-values below 0.05 and 0.01. We can use the showstarsnote option to place a note at the bottom of the table to remind us that one star indicates a p-value less than 0.05 and two stars indicate a p-value less than 0.01.

. etable, replay showstars showstarsnote

diabetes

Age group
30–39                         1.73
[0.89,  3.37]
40–49                         4.26 **
[2.34,  7.74]
50–59                         6.89 **
[3.91, 12.15]
60–69                        10.89 **
[6.40, 18.53]
70+                          15.25 **
[8.77, 26.53]
Body mass index (BMI)           1.07 **
[1.05,  1.09]
High blood pressure
1                             1.25 *
[1.02,  1.53]
Number of observations         10349

** p<.01, * p<.05
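
The stars are driven by the p-values from the regression output. The sketch below reconstructs the z statistic, p-value, and 95% confidence interval for 1.highbp from the odds ratio and standard error reported earlier. It assumes the usual Wald calculations on the log-odds scale, with the odds-ratio standard error equal to the odds ratio times the coefficient's standard error. The local macro names are ours.

```stata
* Values copied from the logistic output for 1.highbp
local or    = 1.251171
local se_or = .1290527

* Move to the log-odds scale
local b    = ln(`or')
local se_b = `se_or'/`or'

* Wald z statistic and two-sided p-value
local z = `b'/`se_b'
local p = 2*normal(-abs(`z'))

* Confidence limits are computed on the log-odds scale, then exponentiated
local lb = exp(`b' - invnormal(.975)*`se_b')
local ub = exp(`b' + invnormal(.975)*`se_b')

display "z = " %5.2f `z' "   p = " %5.3f `p'
display "95% CI = [" %5.3f `lb' ", " %5.3f `ub' "]"
```

The results should agree with the z, P>|z|, and interval shown for 1.highbp in the logistic output, which is why that row earns one star and not two.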

We can use the stars() option to create custom star definitions and use attach() to place the stars next to a particular statistic.

. etable, replay
>         showstars showstarsnote
>         stars(.05 "*" .01 "**" .001 "***", attach(_r_b))

diabetes

Age group
30–39                         1.73
[0.89,  3.37]
40–49                         4.26 ***
[2.34,  7.74]
50–59                         6.89 ***
[3.91, 12.15]
60–69                        10.89 ***
[6.40, 18.53]
70+                          15.25 ***
[8.77, 26.53]
Body mass index (BMI)           1.07 ***
[1.05,  1.09]
High blood pressure
1                             1.25 *
[1.02,  1.53]
Number of observations         10349

*** p<.001, ** p<.01, * p<.05

We can use the mstat() option to add model-level statistics to our table. The example below uses mstat(N) to include the sample size, mstat(aic) to add Akaike's information criterion (AIC), and mstat(bic) to add Schwarz's Bayesian information criterion (BIC) to the table.

. etable, replay mstat(N) mstat(aic) mstat(bic)

diabetes

Age group
30–39                         1.73
[0.89,  3.37]
40–49                         4.26 ***
[2.34,  7.74]
50–59                         6.89 ***
[3.91, 12.15]
60–69                        10.89 ***
[6.40, 18.53]
70+                          15.25 ***
[8.77, 26.53]
Body mass index (BMI)           1.07 ***
[1.05,  1.09]
High blood pressure
1                             1.25 *
[1.02,  1.53]
Number of observations         10349
AIC                          3599.51
BIC                          3657.47

*** p<.001, ** p<.01, * p<.05
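
For reference, both criteria can be reproduced from results that logistic stores. The sketch assumes the standard definitions, AIC = -2 ln L + 2k and BIC = -2 ln L + k ln N, with k taken from e(rank) and N from e(N).

```stata
* Standalone sketch: refit the model so the example runs on its own
webuse nhanes2l, clear
quietly logistic diabetes i.agegrp bmi i.highbp

* k = number of estimated parameters (seven coefficients plus the intercept)
local k   = e(rank)
local aic = -2*e(ll) + 2*`k'
local bic = -2*e(ll) + `k'*ln(e(N))

display "AIC = " %9.2f `aic'
display "BIC = " %9.2f `bic'

* estat ic reports the same quantities
estat ic
```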

The table below displays the identifiers and descriptions for the statistics (results) that may be used with mstat().

Some model-level statistics are appropriate only after a particular model. Those statistics are stored as scalars in the estimation results. You can view a list of these results by typing ereturn list immediately after you run an estimation command. And you can include those statistics by using mstat().

The logistic command stores the scalar e(r2_p) in its estimation results, and we use mstat(r2_p) to include it in the example below. We also use nformat() to customize the display format and label() to create a custom label for the pseudo-R-squared.

. etable, replay mstat(r2_p, nformat(%5.4f) label("Pseudo R2"))

diabetes

Age group
30–39                        1.73
[0.89,  3.37]
40–49                        4.26 ***
[2.34,  7.74]
50–59                        6.89 ***
[3.91, 12.15]
60–69                       10.89 ***
[6.40, 18.53]
70+                         15.25 ***
[8.77, 26.53]
Body mass index (BMI)          1.07 ***
[1.05,  1.09]
High blood pressure
1                            1.25 *
[1.02,  1.53]
Pseudo R2                    0.1040

*** p<.001, ** p<.01, * p<.05
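
To see which scalars are available to mstat(), and to check where e(r2_p) comes from, a short sketch follows. It assumes McFadden's definition of the pseudo-R-squared: one minus the ratio of the fitted log likelihood to the constant-only log likelihood.

```stata
* Standalone sketch: refit the model so the example runs on its own
webuse nhanes2l, clear
quietly logistic diabetes i.agegrp bmi i.highbp

* List everything the command stored, including e(r2_p), e(ll), and e(ll_0)
ereturn list

* Rebuild the pseudo-R-squared from the two log likelihoods and compare
display %6.4f (1 - e(ll)/e(ll_0))
display %6.4f e(r2_p)
```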

We can add titles to our tables with the title() option, and we can customize the font for the title with the titlestyles() option. Style changes affect only the title when the table is exported and will not affect the Stata output.

. etable, replay
>        title("Table 5: Logistic Regression Model For Diabetes")
>        titlestyles(font(Calibri, size(14) bold))

Table 5: Logistic Regression Model For Diabetes

diabetes

Age group
30–39                         1.73
[0.89,  3.37]
40–49                         4.26 ***
[2.34,  7.74]
50–59                         6.89 ***
[3.91, 12.15]
60–69                        10.89 ***
[6.40, 18.53]
70+                          15.25 ***
[8.77, 26.53]
Body mass index (BMI)           1.07 ***
[1.05,  1.09]
High blood pressure
1                             1.25 *
[1.02,  1.53]
Number of observations         10349
AIC                          3599.51
BIC                          3657.47
Pseudo R2                     0.1040

*** p<.001, ** p<.01, * p<.05

We can also use the note() option to add notes to our table, and we can customize the font with the notestyles() option. Style changes affect only the notes when the table is exported and will not affect the Stata output.

. etable, replay
>        note("Data Source: NHANES, 1981")
>        notestyles(font(Calibri, size(10) italic))

Table 5: Logistic Regression Model For Diabetes

diabetes

Age group
30–39                         1.73
[0.89,  3.37]
40–49                         4.26 ***
[2.34,  7.74]
50–59                         6.89 ***
[3.91, 12.15]
60–69                        10.89 ***
[6.40, 18.53]
70+                          15.25 ***
[8.77, 26.53]
Body mass index (BMI)           1.07 ***
[1.05,  1.09]
High blood pressure
1                             1.25 *
[1.02,  1.53]
Number of observations         10349
AIC                          3599.51
BIC                          3657.47
Pseudo R2                     0.1040

*** p<.001, ** p<.01, * p<.05
Data Source: NHANES, 1981

We can use the column() option to customize the column labels. The default is the name of the dependent variable, which can be specified as column(depvar). We can also use column(dvlabel) to label the column with the dependent variable label, column(command) to use the command, column(title) to use the command title stored in e(title), column(estimates) to use the names specified with estimates store, and column(index) to use sequential integers.

. etable, replay column(dvlabel)

Table 5: Logistic Regression Model For Diabetes

Diabetes status

Age group
30–39                         1.73
[0.89,  3.37]
40–49                         4.26 ***
[2.34,  7.74]
50–59                         6.89 ***
[3.91, 12.15]
60–69                        10.89 ***
[6.40, 18.53]
70+                          15.25 ***
[8.77, 26.53]
Body mass index (BMI)           1.07 ***
[1.05,  1.09]
High blood pressure
1                             1.25 *
[1.02,  1.53]
Number of observations         10349
AIC                          3599.51
BIC                          3657.47
Pseudo R2                     0.1040

*** p<.001, ** p<.01, * p<.05
Data Source: NHANES, 1981

When we are finished, we can export our table to one of many file formats by using the export() option. The example below exports our table to a Microsoft Word document named Diabetes.docx.

. etable, replay
>        export(Diabetes.docx, replace)

Table 5: Logistic Regression Model For Diabetes

Diabetes status

Age group
30–39                         1.73
[0.89,  3.37]
40–49                         4.26 ***
[2.34,  7.74]
50–59                         6.89 ***
[3.91, 12.15]
60–69                        10.89 ***
[6.40, 18.53]
70+                          15.25 ***
[8.77, 26.53]
Body mass index (BMI)           1.07 ***
[1.05,  1.09]
High blood pressure
1                             1.25 *
[1.02,  1.53]
Number of observations         10349
AIC                          3599.51
BIC                          3657.47
Pseudo R2                     0.1040

*** p<.001, ** p<.01, * p<.05
Data Source: NHANES, 1981
(collection ETable exported to file Diabetes.docx)

You can save your tables in any of the formats shown in the table below using the file suffixes listed.
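
Because the format is determined by the file suffix, the same table can be written to several formats in a row. The sketch below uses collect export on the current ETable collection. The Diabetes_sketch file names are hypothetical, and only three commonly used suffixes are shown.

```stata
* Standalone sketch: refit the model and rebuild a basic table
webuse nhanes2l, clear
quietly logistic diabetes i.agegrp bmi i.highbp
quietly etable, keep(agegrp bmi highbp)

* The suffix selects the output format
collect export Diabetes_sketch.html, replace
collect export Diabetes_sketch.xlsx, replace
collect export Diabetes_sketch.tex,  replace
```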

## Tables for more than one regression model

Now that we understand how to use etable for one regression model, let's see how to use etable to create tables for more than one regression model. There are several ways you can do this, and I am going to show you one way using estimates store.

We can store the results of a regression model in memory by typing estimates store name immediately after running the regression command. We can then refer to the stored results in other commands by using name. In the example below, we fit four logistic regression models and store the results of each model using estimates store. We precede each logistic command with quietly to suppress the output.

. quietly logistic diabetes i.agegrp bmi i.highbp

. estimates store full

.
. quietly logistic diabetes i.agegrp

. estimates store age

.
. quietly logistic diabetes bmi

. estimates store bmi

.
. quietly logistic diabetes i.highbp

. estimates store highbp

We can then include the names of our results in the estimates() option to choose the models to include in our table.

. etable, estimates(full age bmi highbp)
>         keep(agegrp bmi highbp)

diabetes diabetes diabetes diabetes

Age group
30–39                   1.730    2.017
(0.590)  (0.685)
40–49                   4.260    5.251
(1.298)  (1.590)
50–59                   6.888    9.076
(1.993)  (2.596)
60–69                  10.888   13.948
(2.953)  (3.735)
70+                    15.251   19.494
(4.308)  (5.418)
Body mass index (BMI)     1.072             1.089
(0.009)           (0.008)
High blood pressure
1                       1.251                      2.577
(0.129)                    (0.247)
Number of observations    10349    10349    10349    10349
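
One of the other ways is to build the table incrementally with the append option, which adds the current estimation results as a new column. The following is a sketch of that approach for two of the four models. It is an alternative to the estimates store workflow above, and running it replaces the four-column table.

```stata
* Standalone sketch of the append workflow
webuse nhanes2l, clear

* First column: age group only
quietly logistic diabetes i.agegrp
quietly etable

* Second column: the full model, appended to the existing table
quietly logistic diabetes i.agegrp bmi i.highbp
etable, append keep(agegrp bmi highbp)
```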



In the example below, we use the column(index) option to number each of the columns and note() to add a note to list the covariates in each model.

. etable, replay column(index)
>         note(Model 1: agegrp bmi highbp)
>         note(Model 2: agegrp)
>         note(Model 3: bmi)
>         note(Model 4: highbp)
>         notestyles(font(Calibri, size(10) italic))

1       2       3       4

Age group
30–39                  1.730   2.017
(0.590) (0.685)
40–49                  4.260   5.251
(1.298) (1.590)
50–59                  6.888   9.076
(1.993) (2.596)
60–69                 10.888  13.948
(2.953) (3.735)
70+                   15.251  19.494
(4.308) (5.418)
Body mass index (BMI)    1.072           1.089
(0.009)         (0.008)
High blood pressure
1                      1.251                   2.577
(0.129)                 (0.247)
Number of observations   10349   10349   10349   10349

Model 1: agegrp bmi highbp
Model 2: agegrp
Model 3: bmi
Model 4: highbp

Let's use all of these options together to create a final table and export the results to a Microsoft Word document named Diabetes.docx.

. etable, estimates(full age bmi highbp)
>         keep(agegrp bmi highbp)
>         column(index)
>         cstat(_r_b, nformat(%4.2f))
>         cstat(_r_ci, cidelimiter(,) nformat(%6.2f))
>         showstars showstarsnote
>         stars(.05 "*" .01 "**" .001 "***", attach(_r_b))
>         mstat(N,   nformat(%8.0fc) label("Observations"))
>         mstat(aic, nformat(%5.0f))
>         mstat(bic, nformat(%5.0f))
>         mstat(r2_p, nformat(%5.4f) label("Pseudo R2"))
>         title(Table 5: Logistic Regression Model For Diabetes)
>         titlestyles(font(Calibri, size(14) bold))
>         note(Data Source: NHANES, 1981)
>         note(Model 1: agegrp bmi highbp)
>         note(Model 2: agegrp)
>         note(Model 3: bmi)
>         note(Model 4: highbp)
>         notestyles(font(Calibri, size(10) italic))
>         export(Diabetes.docx, replace)

Table 5: Logistic Regression Model For Diabetes

1                  2                 3                 4

Age group
30–39                        1.73               2.02 *
[0.89,  3.37]      [1.04,  3.92]
40–49                        4.26 ***           5.25 ***
[2.34,  7.74]      [2.90,  9.51]
50–59                        6.89 ***           9.08 ***
[3.91, 12.15]      [5.18, 15.90]
60–69                       10.89 ***          13.95 ***
[6.40, 18.53]      [8.25, 23.57]
70+                         15.25 ***          19.49 ***
[8.77, 26.53]     [11.31, 33.61]
Body mass index (BMI)          1.07 ***                             1.09 ***
[1.05,  1.09]                        [1.07,  1.10]
High blood pressure
1                            1.25 *                                                 2.58 ***
[1.02,  1.53]                                          [2.14,  3.11]
Observations                 10,349             10,349            10,349            10,349
AIC                            3600               3675              3892              3900
BIC                            3657               3718              3906              3915
Pseudo R2                    0.1040             0.0841            0.0279            0.0258

*** p<.001, ** p<.01, * p<.05
Data Source: NHANES, 1981
Model 1: agegrp bmi highbp
Model 2: agegrp
Model 3: bmi
Model 4: highbp
(collection ETable exported to file Diabetes.docx)

## Customizing etables with collect

I'm 99% happy with our final table, but I don't like the "1" label for "High blood pressure". We could use label define and label values to add a custom label to "1". But we can also customize the label without changing the original data. etable uses collect to construct tables, and we can view the underlying table structure by typing collect layout.

. collect layout

Collection: ETable
Rows: coleq#colname[agegrp bmi highbp]#result[_r_b _r_ci] result[N aic bic pseudo_r2]
Columns: cmdset#stars
Table 1: 20 x 8

(Table output omitted)
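
Before relabeling anything, it can help to look at what the collection contains. The sketch below rebuilds a simple ETable collection so that it runs on its own, then lists its dimensions and the levels and labels of colname, the dimension that holds the coefficient names.

```stata
* Standalone sketch: rebuild a basic ETable collection
webuse nhanes2l, clear
quietly logistic diabetes i.agegrp bmi i.highbp
quietly etable

* List the dimensions of the current collection
collect dims

* Show the levels of the colname dimension
collect levelsof colname

* Show the labels currently attached to those levels
collect label list colname, all
```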

This means that we can also use collect to customize tables built with etable. In the example below, we use collect label levels to relabel highbp in the colname dimension, which changes the row heading from "High blood pressure" to "Hypertension". We also label the levels of the dimension highbp.

. collect label levels colname highbp "Hypertension", modify

. collect label levels highbp 0 "No" 1 "Yes"

. collect preview

Table 5: Logistic Regression Model For Diabetes

1                  2                 3                 4

Age group
30–39                        1.73               2.02 *
[0.89,  3.37]      [1.04,  3.92]
40–49                        4.26 ***           5.25 ***
[2.34,  7.74]      [2.90,  9.51]
50–59                        6.89 ***           9.08 ***
[3.91, 12.15]      [5.18, 15.90]
60–69                       10.89 ***          13.95 ***
[6.40, 18.53]      [8.25, 23.57]
70+                         15.25 ***          19.49 ***
[8.77, 26.53]     [11.31, 33.61]
Body mass index (BMI)          1.07 ***                             1.09 ***
[1.05,  1.09]                        [1.07,  1.10]
Hypertension
Yes                          1.25 *                                                 2.58 ***
[1.02,  1.53]                                          [2.14,  3.11]
Observations                 10,349             10,349            10,349            10,349
AIC                            3600               3675              3892              3900
BIC                            3657               3718              3906              3915
Pseudo R2                    0.1040             0.0841            0.0279            0.0258

*** p<.001, ** p<.01, * p<.05
Data Source: NHANES, 1981
Model 1: agegrp bmi highbp
Model 2: agegrp
Model 3: bmi
Model 4: highbp

Now we can export the final table to a Microsoft Word document named Diabetes.docx by using collect style putdocx and collect export.

. collect style putdocx, layout(autofitcontents)

. collect export Diabetes.docx, as(docx) replace
(collection ETable exported to file Diabetes.docx)

## Tables of marginal predicted probabilities using etable

You can also use etable to create tables of marginal predictions, in this case, expected probabilities of having diabetes for each group. Simply fit your model, use margins to estimate the marginal predictions, and add the margins option to etable. Here's a quick example to get you started.

. quietly logistic diabetes i.agegrp bmi i.highbp

. quietly margins agegrp highbp

. etable, margins

diabetes

Age group
20–29                   0.008
(0.002)
30–39                   0.013
(0.003)
40–49                   0.032
(0.005)
50–59                   0.051
(0.006)
60–69                   0.078
(0.005)
70+                     0.105
(0.010)
High blood pressure
0                       0.043
(0.003)
1                       0.052
(0.003)
Number of observations    10349
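
Each entry in this table is an average adjusted prediction. As a sketch of what margins computes for the first age group, assuming agegrp is coded 1 for ages 20–29, we can set every observation to that group, predict, and average over the estimation sample. The variable names agegrp_copy and phat are ours.

```stata
* Standalone sketch: refit the model so the example runs on its own
webuse nhanes2l, clear
quietly logistic diabetes i.agegrp bmi i.highbp

* Reference result: the first row is the margin for the first age group
margins agegrp

* By hand: keep a copy, then treat everyone as if in the first age group
clonevar agegrp_copy = agegrp
quietly replace agegrp = 1
predict double phat if e(sample), pr
summarize phat

* Put the original values back and clean up
quietly replace agegrp = agegrp_copy
drop agegrp_copy phat
```

The mean of phat should match the first margin, which is the 0.008 shown for ages 20–29 in the table above.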



You can use the same options to customize your tables as before. You can even combine the results from your model with the marginal predictions. The example below shows how to include the odds ratios and the marginal predicted probabilities in the same table.

. quietly logistic diabetes i.agegrp bmi i.highbp

. quietly etable

. quietly margins agegrp highbp

. etable, append margins keep(agegrp highbp)
>          column(index)
>          cstat(_r_b,  nformat(%4.2f))
>          cstat(_r_ci, nformat(%5.2f) cidelimiter(,))
>          notes(Column 1: Odds ratios [95% CI])
>          notes(Column 2: Marginal predicted probabilities [95% CI])

1            2

Age group
20–29                                     0.01
[0.00, 0.01]
30–39                        1.73         0.01
[0.89, 3.37] [0.01, 0.02]
40–49                        4.26         0.03
[2.34, 7.74] [0.02, 0.04]
50–59                        6.89         0.05
[3.91,12.15] [0.04, 0.06]
60–69                       10.89         0.08
[6.40,18.53] [0.07, 0.09]
70+                         15.25         0.10
[8.77,26.53] [0.09, 0.12]
High blood pressure
0                                         0.04
[0.04, 0.05]
1                            1.25         0.05
[1.02, 1.53] [0.05, 0.06]
Number of observations        10349        10349

Column 1: Odds ratios [95% CI]
Column 2: Marginal predicted probabilities [95% CI]

## Conclusion

These are just some of the fun things you can do with the new etable command in Stata 17. We've barely scratched the surface, but I hope that I've inspired you to try it out. You can learn more about etable at stata.com and in the Stata manual.

— by Chuck Huber
Director of Statistical Outreach