## In the spotlight: Tables of estimation results in Stata 17

Making tables of estimation results is now easier than ever before with the new etable command in Stata 17. You can create tables for single or multiple models, include marginal predictions, customize your tables using the collect commands, and export your tables to most document formats. There are far too many features to demonstrate in one Stata News article. But I want to show you some basics to get you started, and you can learn more by reading the Stata manual.

## Tables for one regression model

Let's begin by typing **webuse nhanes2l** to open a dataset based on the second National Health and Nutrition Examination Survey data (NHANES II).

.webuse nhanes2l(Second National Health and Nutrition Examination Survey)

We will use the variables **diabetes**, **agegrp**, **bmi**, and **highbp**.

.describe diabetes agegrp bmi highbp

Variable Storage Display Value | ||

name type format label Variable label | ||

diabetes byte %12.0g diabetes Diabetes status | ||

agegrp byte %8.0g agegrp Age group | ||

bmi float %9.0g Body mass index (BMI) | ||

highbp byte %8.0g * High blood pressure |

We start by fitting a logistic regression model for the binary outcome **diabetes**, including the predictors **agegrp**, **bmi**, and **highbp**. The **i.** prefix is factor-variable notation that tells Stata **agegrp** and **highbp** are categorical variables.

.logistic diabetes i.agegrp bmi i.highbpLogistic regression Number of obs = 10,349 LR chi2(7) = 416.01 Prob > chi2 = 0.0000 Log likelihood = -1791.7561 Pseudo R2 = 0.1040

diabetes | Odds ratio Std. err. z P>|z| [95% conf. interval] | |

agegrp | ||

30–39 | 1.730448 .5895795 1.61 0.108 .8874554 3.374199 | |

40–49 | 4.259599 1.297735 4.76 0.000 2.344448 7.739213 | |

50–59 | 6.888277 1.993273 6.67 0.000 3.906582 12.14575 | |

60–69 | 10.88779 2.952805 8.80 0.000 6.398693 18.52629 | |

70+ | 15.25109 4.308098 9.65 0.000 8.767088 26.53057 | |

bmi | 1.07177 .0091357 8.13 0.000 1.054014 1.089826 | |

1.highbp | 1.251171 .1290527 2.17 0.030 1.022161 1.531491 | |

_cons | .0011192 .0003767 -20.19 0.000 .0005787 .0021646 | |

Now we can type etable to create a basic table of the estimation results. By default, **etable** displays the estimated odds ratios, their standard errors in parentheses below the odds ratios, and the number of observations.

.etable

diabetes |

Age group 30–39 1.730 (0.590) 40–49 4.260 (1.298) 50–59 6.888 (1.993) 60–69 10.888 (2.953) 70+ 15.251 (4.308) Body mass index (BMI) 1.072 (0.009) High blood pressure 1 1.251 (0.129) Intercept 0.001 (0.000) Number of observations 10349 |

We can use the **keep()** option to specify which variables to include in the table. The example below retains only the predictor variables and omits the intercept. The **replay** option replays the previous **etable** results, including any previously specified options.

.etable, replay keep(agegrp bmi highbp)

diabetes |

Age group 30–39 1.730 (0.590) 40–49 4.260 (1.298) 50–59 6.888 (1.993) 60–69 10.888 (2.953) 70+ 15.251 (4.308) Body mass index (BMI) 1.072 (0.009) High blood pressure 1 1.251 (0.129) Number of observations 10349 |

We can use the **cstat()** option to specify which statistics we wish to display in our table. The example below uses **cstat(_r_b)** to display the reported coefficients, which are the odds ratios for logistic regression models. The command also includes **cstat(_r_ci)**, which displays the 95% confidence interval for the estimated odds ratios.

.etable, replay cstat(_r_b) cstat(_r_ci)

diabetes |

Age group 30–39 1.730 [0.887 3.374] 40–49 4.260 [2.344 7.739] 50–59 6.888 [3.907 12.146] 60–69 10.888 [6.399 18.526] 70+ 15.251 [8.767 26.531] Body mass index (BMI) 1.072 [1.054 1.090] High blood pressure 1 1.251 [1.022 1.531] Number of observations 10349 |

The table below displays the option names and descriptions of the statistics that may be used with **cstat()**.

We can use the **nformat()** option within **cstat()** to specify the display format of the statistic. The example below displays the odds ratios and confidence intervals with two digits to the right of the decimal. The **cidelimiter(,)** option places a comma between the lower and upper bounds of the confidence interval.

.etable, replay cstat(_r_b, nformat(%4.2f)) cstat(_r_ci, cidelimiter(,) nformat(%6.2f))

diabetes |

Age group 30–39 1.73 [0.89, 3.37] 40–49 4.26 [2.34, 7.74] 50–59 6.89 [3.91, 12.15] 60–69 10.89 [6.40, 18.53] 70+ 15.25 [8.77, 26.53] Body mass index (BMI) 1.07 [1.05, 1.09] High blood pressure 1 1.25 [1.02, 1.53] Number of observations 10349 |

We can use the **showstars** option to place stars next to the odds ratios with *p*-values below 0.05 and 0.01. We can use the **showstarsnote** option to place a note at the bottom of the table to remind us that one star indicates a *p*-value less than 0.05 and two stars indicate a *p*-value less than 0.01.

.etable, replay showstars showstarsnote

diabetes |

Age group 30–39 1.73 [0.89, 3.37] 40–49 4.26 ** [2.34, 7.74] 50–59 6.89 ** [3.91, 12.15] 60–69 10.89 ** [6.40, 18.53] 70+ 15.25 ** [8.77, 26.53] Body mass index (BMI) 1.07 ** [1.05, 1.09] High blood pressure 1 1.25 * [1.02, 1.53] Number of observations 10349 |

We can use the **stars()** option to create custom star definitions and use **attach()** to place the stars next to a particular statistic.

.etable, replay > showstars showstarsnote > stars(.05 "*" .01 "**" .001 "***", attach(_r_b))

diabetes |

Age group 30–39 1.73 [0.89, 3.37] 40–49 4.26 *** [2.34, 7.74] 50–59 6.89 *** [3.91, 12.15] 60–69 10.89 *** [6.40, 18.53] 70+ 15.25 *** [8.77, 26.53] Body mass index (BMI) 1.07 *** [1.05, 1.09] High blood pressure 1 1.25 * [1.02, 1.53] Number of observations 10349 |

We can use the **mstat()** option to add model-level statistics to our table. The example below uses **mstat(N)** to include the sample size, **mstat(aic)** to add Akaike's information criterion (AIC), and **mstat(bic)** to add Schwarz's Bayesian information criterion (BIC) to the table.

.etable, replay mstat(N) mstat(aic) mstat(bic)

diabetes |

Age group 30–39 1.73 [0.89, 3.37] 40–49 4.26 *** [2.34, 7.74] 50–59 6.89 *** [3.91, 12.15] 60–69 10.89 *** [6.40, 18.53] 70+ 15.25 *** [8.77, 26.53] Body mass index (BMI) 1.07 *** [1.05, 1.09] High blood pressure 1 1.25 * [1.02, 1.53] Number of observations 10349 AIC 3599.51 BIC 3657.47 |

The table below displays the identifiers and descriptions for the statistics (results) that may be used with **mstat()**.

Some model-level statistics are appropriate only after a particular model. Those statistics are stored as scalars in the estimation results. You can view a list of these results by typing ereturn list immediately after you run an estimation command. And you can include those statistics by using **mstat()**.

The scalar **e(r2_p)** was left in memory by the **logistic** command, and we use **mstat(r2_p)** to include it in the example below. We also use **nformat()** to customize the display format and **label()** to create a custom label for the pseudo *r*-squared.

.etable, replay mstat(r2_p, nformat(%5.4f) label("Pseudo R2"))

diabetes |

Age group 30–39 1.73 [0.89, 3.37] 40–49 4.26 *** [2.34, 7.74] 50–59 6.89 *** [3.91, 12.15] 60–69 10.89 *** [6.40, 18.53] 70+ 15.25 *** [8.77, 26.53] Body mass index (BMI) 1.07 *** [1.05, 1.09] High blood pressure 1 1.25 * [1.02, 1.53] Pseudo R2 0.1040 |

We can add titles to our tables with the **title()** option, and we can customize the font for the title with the **titlestyles()** option. Style changes affect only the title when the table is exported and will not affect the Stata output.

.etable, replay>title("Table 5: Logistic Regression Model For Diabetes")>titlestyles(font(Calibri, size(14) bold))Table 5: Logistic Regression Model For Diabetes

diabetes |

Age group 30–39 1.73 [0.89, 3.37] 40–49 4.26 *** [2.34, 7.74] 50–59 6.89 *** [3.91, 12.15] 60–69 10.89 *** [6.40, 18.53] 70+ 15.25 *** [8.77, 26.53] Body mass index (BMI) 1.07 *** [1.05, 1.09] High blood pressure 1 1.25 * [1.02, 1.53] Number of observations 10349 AIC 3599.51 BIC 3657.47 Pseudo R2 0.1040 |

We can also use the **notes** option to add notes to our table, and we can customize the font with the **notestyles()** option. Style changes affect only the notes when the table is exported and will not affect the Stata output.

.etable, replay>note("Data Source: NHANES, 1981")>notestyles(font(Calibri, size(10) italic))Table 5: Logistic Regression Model For Diabetes

diabetes |

Age group 30–39 1.73 [0.89, 3.37] 40–49 4.26 *** [2.34, 7.74] 50–59 6.89 *** [3.91, 12.15] 60–69 10.89 *** [6.40, 18.53] 70+ 15.25 *** [8.77, 26.53] Body mass index (BMI) 1.07 *** [1.05, 1.09] High blood pressure 1 1.25 * [1.02, 1.53] Number of observations 10349 AIC 3599.51 BIC 3657.47 Pseudo R2 0.1040 |

We can use the **column()** option to customize the column labels. The default is the name of the dependent variable, which can be specified as **column(depvar)**. We can also use **column(dvlabel)** to label the column with the dependent variable label, **column(command)** to use the command, **column(title)** to use the command title stored in **e(title)**, **column(estimates)** to use the names specified with **estimates store**, and **column(index)** to use sequential integers.

.etable, replay column(dvlabel)Table 5: Logistic Regression Model For Diabetes

Diabetes status |

Age group 30–39 1.73 [0.89, 3.37] 40–49 4.26 *** [2.34, 7.74] 50–59 6.89 *** [3.91, 12.15] 60–69 10.89 *** [6.40, 18.53] 70+ 15.25 *** [8.77, 26.53] Body mass index (BMI) 1.07 *** [1.05, 1.09] High blood pressure 1 1.25 * [1.02, 1.53] Number of observations 10349 AIC 3599.51 BIC 3657.47 Pseudo R2 0.1040 |

When we are finished, we can export our table to one of many file formats by using the **export()** option. The example below exports our table to a Microsoft Word document named **Diabetes.docx**.

.etable, replay>export(Diabetes.docx, replace)Table 5: Logistic Regression Model For Diabetes

Diabetes status |

You can save your tables in any of the formats shown in the table below using the file suffixes listed.

## Tables for more than one regression model

Now that we understand how to use **etable** for one regression model, let's see how to use **etable** to create tables for more than one regression model. There are several ways you can do this, and I am going to show you one way using estimates store.

We can store the results of a regression model in memory by typing estimates store **name** immediately after running the regression command. We can then refer to the stored results in other commands by using *name*. In the example below, we fit four logistic regression models and store the results of each model using **estimates store**. We precede each **logistic** command with quietly to suppress the output.

.quietly logistic diabetes i.agegrp bmi i.highbp.estimates store full. .quietly logistic diabetes i.agegrp.estimates store age. .quietly logistic diabetes bmi.estimates store bmi. .quietly logistic diabetes i.highbp.estimates store highbp

We can then include the names of our results in the **estimates()** option to choose the models to include in our table.

.etable, estimates(full age bmi highbp)>keep(agegrp bmi highbp)

diabetes diabetes diabetes diabetes |

Age group 30–39 1.730 2.017 (0.590) (0.685) 40–49 4.260 5.251 (1.298) (1.590) 50–59 6.888 9.076 (1.993) (2.596) 60–69 10.888 13.948 (2.953) (3.735) 70+ 15.251 19.494 (4.308) (5.418) Body mass index (BMI) 1.072 1.089 (0.009) (0.008) High blood pressure 1 1.251 2.577 (0.129) (0.247) Number of observations 10349 10349 10349 10349 |

In the example below, we use the **column(index)** option to number each of the columns and **note()** to add a note to list the covariates in each model.

.etable, replay column(index)>note(Model 1: agegrp bmi highbp)>note(Model 2: agegrp)>note(Model 3: bmi)>note(Model 4: highbp)>notestyles(font(Calibri, size(10) italic))

1 2 3 4 |

Age group 30–39 1.730 2.017 (0.590) (0.685) 40–49 4.260 5.251 (1.298) (1.590) 50–59 6.888 9.076 (1.993) (2.596) 60–69 10.888 13.948 (2.953) (3.735) 70+ 15.251 19.494 (4.308) (5.418) Body mass index (BMI) 1.072 1.089 (0.009) (0.008) High blood pressure 1 1.251 2.577 (0.129) (0.247) Number of observations 10349 10349 10349 10349 |

Let's use all of these options together to create a final table and export the results to a Microsoft Word document named **Diabetes.docx**.

.etable, estimates(full age bmi highbp)>keep(agegrp bmi highbp)>column(index)>stat(_r_b, nformat(%4.2f))>cstat(_r_ci, cidelimiter(,) nformat(%6.2f))>showstars showstarsnote>stars(.05 "*" .01 "**" .001 "***", attach(_r_b))>mstat(N, nformat(%8.0fc) label("Observations"))>mstat(aic, nformat(%5.0f))>mstat(bic, nformat(%5.0f))>mstat(r2_p, nformat(%5.4f) label("Pseudo R2"))>title(Table 5: Logistic Regression Model For Diabetes)>titlestyles(font(Calibri, size(14) bold))>note(Data Source: NHANES, 1981)>note(Model 1: agegrp bmi highbp)>note(Model 2: agegrp)>note(Model 3: bmi)>note(Model 4: highbp)>notestyles(font(Calibri, size(10) italic))>export(Diabetes.docx, replace)Table 5: Logistic Regression Model For Diabetes

1 2 3 4 |

Age group 30–39 1.73 2.02 * [0.89, 3.37] [1.04, 3.92] 40–49 4.26 *** 5.25 *** [2.34, 7.74] [2.90, 9.51] 50–59 6.89 *** 9.08 *** [3.91, 12.15] [5.18, 15.90] 60–69 10.89 *** 13.95 *** [6.40, 18.53] [8.25, 23.57] 70+ 15.25 *** 19.49 *** [8.77, 26.53] [11.31, 33.61] Body mass index (BMI) 1.07 *** 1.09 *** [1.05, 1.09] [1.07, 1.10] High blood pressure 1 1.25 * 2.58 *** [1.02, 1.53] [2.14, 3.11] Observations 10,349 10,349 10,349 10,349 AIC 3600 3675 3892 3900 BIC 3657 3718 3906 3915 Pseudo R2 0.1040 0.0841 0.0279 0.0258 |

## Customizing etables with collect

I'm 99% happy with our final table, but I don't like the "1" label for "High blood pressure". We could use label define and label values to add a custom label to "1". But we can also customize the label without changing the original data. etable uses collect to construct tables, and we can view the underlying table structure by typing collect layout.

.collect layoutCollection: ETable Rows: coleq#colname[agegrp bmi highbp]#result[_r_b _r_ci] result[N aic bic pseudo_r2] Columns: cmdset#stars Table 1: 20 x 8 (Table output omitted)

This means that we can also use **collect** to customize tables built with **etable**. In the example below, we use collect label levels to change the column name from "High blood pressure" to "Hypertension". We also label the levels of the dimension **highbp**.

.collect label levels colname highbp "Hypertension", modify.collect label levels highbp 0 "No" 1 "Yes".collect previewTable 5: Logistic Regression Model For Diabetes

1 2 3 4 |

Age group 30–39 1.73 2.02 * [0.89, 3.37] [1.04, 3.92] 40–49 4.26 *** 5.25 *** [2.34, 7.74] [2.90, 9.51] 50–59 6.89 *** 9.08 *** [3.91, 12.15] [5.18, 15.90] 60–69 10.89 *** 13.95 *** [6.40, 18.53] [8.25, 23.57] 70+ 15.25 *** 19.49 *** [8.77, 26.53] [11.31, 33.61] Body mass index (BMI) 1.07 *** 1.09 *** [1.05, 1.09] [1.07, 1.10] Hypertension Yes 1.25 * 2.58 *** [1.02, 1.53] [2.14, 3.11] Observations 10,349 10,349 10,349 10,349 AIC 3600 3675 3892 3900 BIC 3657 3718 3906 3915 Pseudo R2 0.1040 0.0841 0.0279 0.0258 |

Now we can export the final table to a Microsoft Word document named **Diabetes.docx** by using collect style putdocx and collect export.

.collect style putdocx, layout(autofitcontents).collect export Diabetes.docx, as(docx) replace(collection ETable exported to file Diabetes.docx)

## Tables of marginal predicted probabilities using etable

You can also use **etable** to create tables of marginal predictions, in this case, expected probabilities of having diabetes for each group. Simply fit your model, use margins to estimate the marginal predictions, and add the **margins** option to **etable**. Here's a quick example to get you started.

.quietly logistic diabetes i.agegrp bmi i.highbp.quietly margins agegrp highbp.etable, margins

diabetes |

Age group 20–29 0.008 (0.002) 30–39 0.013 (0.003) 40–49 0.032 (0.005) 50–59 0.051 (0.006) 60–69 0.078 (0.005) 70+ 0.105 (0.010) High blood pressure 0 0.043 (0.003) 1 0.052 (0.003) Number of observations 10349 |

You can use the same options to customize your tables as before. You can even combine the results from your model with the marginal predictions. The example below shows how to include the odds ratios and the marginal predicted probabilities in the same table.

.quietly logistic diabetes i.agegrp bmi i.highbp.quietly etable.quietly margins agegrp highbp.etable, append margins keep(agegrp highbp)<column(index)<cstat(_r_b, nformat(%4.2f))<cstat(_r_ci, nformat(%5.2f) cidelimiter(,))<notes(Column 1: Odds ratios [95% CI])<notes(Column 2: Marginal predicted probabilities [95% CI])

1 2 |

Age group 20–29 0.01 [0.00, 0.01] 30–39 1.73 0.01 [0.89, 3.37] [0.01, 0.02] 40–49 4.26 0.03 [2.34, 7.74] [0.02, 0.04] 50–59 6.89 0.05 [3.91,12.15] [0.04, 0.06] 60–69 10.89 0.08 [6.40,18.53] [0.07, 0.09] 70+ 15.25 0.10 [8.77,26.53] [0.09, 0.12] High blood pressure 0 0.04 [0.04, 0.05] 1 1.25 0.05 [1.02, 1.53] [0.05, 0.06] Number of observations 10349 10349 |

## Conclusion

These are just some of the fun things you can do with the new **etable** command in Stata 17. We've barely scratched the surface, but I hope that I've inspired you to try it out. You can learn more about **etable** at stata.com and in the Stata manual.

— by Chuck Huber

Director of Statistical Outreach