Home  /  Stata News  /  Vol 36 No 3  /  In the spotlight: Customizable tables in Stata 17
The Stata News

«Back to main page

In the spotlight: Customizable tables in Stata 17

You asked for it—you got it! We expanded the functionality of the table command. We also developed an entirely new system that allows you to collect results from any Stata command, create custom table layouts and styles, save and use those layouts and styles, and export your tables to most popular document formats. We even added a new manual to show you how to use this powerful and flexible system.

Today, I'm going to show you how to use this new system to create a table of regression results and export it to Microsoft Word. Regression table in Wrod

Below, I'll show you how to create, customize, and export this table with just a few commands. I will not explain all the commands and concepts in detail here, but I will give you more detail in my blog series. You can find the first posts at

Customizable tables in Stata 17, part 1: The new table command

Customizable tables in Stata 17, part 2: The new collect command

Table of regression results

Let's begin by typing webuse nhanes2l to open a new Stata 17 dataset based on the National Health and Nutrition Examination Survey data (NHANES).

. webuse nhanes2l
(Second National Health and Nutrition Examination Survey)

. describe age sex bmi highbp diabetes

Variable Storage Display Value
name type format label Variable label
age byte %9.0g Age (years) sex byte %9.0g sex Sex bmi float %9.0g Body mass index (BMI) highbp byte %8.0g * High blood pressure diabetes byte %12.0g diabetes Diabetes status

Almost all Stata commands leave results in memory temporarily after you run them. The new customizable table system allows you to collect these results and use them to create tables. The example below creates a new collection named mytable and collects the results of a logistic regression model for high blood pressure (highbp).

. collect create mytable

. collect: logistic highbp c.age##i.sex i.diabetes bmi

Logistic regression                                    Number of obs =  10,349
                                                       LR chi2(5)    = 2507.35
                                                       Prob > chi2   =  0.0000
Log likelihood = -5795.9927                            Pseudo R2     =  0.1778


highbp Odds ratio Std. err. z P>|z| [95% conf. interval]
age 1.033714 .0019249 17.81 0.000 1.029948 1.037494
sex
Female .152188 .0232806 -12.31 0.000 .1127638 .2053956
sex#c.age
Female 1.027994 .00294 9.65 0.000 1.022248 1.033773
diabetes
Diabetic 1.197894 .1269626 1.70 0.088 .973198 1.47447
bmi 1.146828 .0059073 26.60 0.000 1.135308 1.158464
_cons .0054089 .0008847 -31.91 0.000 .0039255 .007453
Note: _cons estimates baseline odds.

We can then type collect dims to see what we collected.

. collect dims

Collection dimensions
Collection: mytable
Dimension No. levels
Layout, style, header, label cmdset 1 coleq 1 colname 13 program_class 1 result 43 result_type 3 rowname 1 Header, label diabetes sex Style only border_block 4 cell_type 4

The output tells us that we created a collection named mytable. Our collection includes dimensions and each dimension has levels. Collections, dimensions, and levels are important concepts for building tables. The concept is similar to a dataset (collection) that contains variables (dimensions) that have categories (levels).

Our collection contains a dimension named result, which has 43 levels. We can type collect levelsof result to view the levels of the dimension result. The levels include things named N, _r_b_, and chi2. Levels that begin with "_r" refer to the reported statistics in the regression output, such as the coefficients or transformed coefficients (_r_b), their standard errors (_r_se), and their confidence intervals (_r_ci).

. collect levelsof result

Collection: mytable
 Dimension: result
    Levels: N N_cdf N_cds _r_b _r_ci _r_df _r_lb _r_p _r_se _r_ub _r_z chi2
            chi2type cmd cmdline converged depvar df_m estat_cmd ic k k_dv
            k_eq k_eq_model ll ll_0 marginsnotok marginsok ml_method mns opt p
            predict properties r2_p rank rc rules technique title user vce
            which

Let's type collect levelsof colname to view the levels of the dimension colname.

. collect levelsof colname

Collection: mytable
 Dimension: colname
    Levels: age 1.sex 2.sex 1.sex#age 2.sex#age 0.diabetes 1.diabetes bmi c1
            c2 c3 c4 _cons

The levels of the dimension colname contain the independent variables in our model, including any interactions. Levels can have labels, and we can view them by typing collect label list colname. I will show you how to customize these labels later.

. collect label list colname

  Collection: mytable
   Dimension: colname
       Label: Covariate names and column names
Level labels:
       _cons  Intercept
         age  Age (years)
         bmi  Body mass index (BMI)
    diabetes  Diabetes status
         sex  Sex

Now that we know how the regression results are stored in the collection, we can use collect layout to create a table using the dimensions in our collection. The basic syntax is

collect layout (RowDimensions) (ColumnDimensions)

We can specify the levels of the dimensions by using square brackets:

collect layout (RowDimensions[Levels]) (ColumnDimensions[Levels])

The example below uses collect layout to create a table using the row dimension colname and the column dimension result. Note that I have only included the levels _r_b (the transformed coefficients—the odds ratios), _r_se (the standard errors), _r_z (the z statistics), _r_p (the p-values), and _r_ci (the confidence interval) for the column dimension result.

. collect layout (colname) (result[_r_b _r_se _r_z _r_p _r_ci])

Collection: mytable
      Rows: colname
   Columns: result[_r_b _r_se _r_z _r_p _r_ci]
   Table 1: 9 x 5

Coefficient Std. error z p-value 95% CI
Age (years) 1.033714 .0019249 17.81 0.000 1.029948 1.037494
Male 1 0
Female .152188 .0232806 -12.31 0.000 .1127638 .2053956
Male # Age (years) 1 0
Female # Age (years) 1.027994 .00294 9.65 0.000 1.022248 1.033773
Not diabetic 1 0
Diabetic 1.197894 .1269626 1.70 0.088 .973198 1.47447
Body mass index (BMI) 1.146828 .0059073 26.60 0.000 1.135308 1.158464
Intercept .0054089 .0008847 -31.91 0.000 .0039255 .007453

Now we can customize the style of our table. Let's begin by using collect style cell to format the numbers in our table. We can refer to the numbers we wish to customize by using their dimension and levels. For example, we can refer to the coefficients and standard errors by typing result[_r_b _r_se] or to the p-values by typing result[_r_p]. Then we can type collect preview to view our changes.

. collect style cell result[_r_b _r_se], nformat(%6.2f)

. collect style cell result[_r_p], nformat(%5.4f)

. collect preview                   

Coefficient Std. error z p-value 95% CI
Age (years) 1.03 0.00 17.81 0.0000 1.029948 1.037494
Male 1.00 0.00
Female 0.15 0.02 -12.31 0.0000 .1127638 .2053956
Male # Age (years) 1.00 0.00
Female # Age (years) 1.03 0.00 9.65 0.0000 1.022248 1.033773
Not diabetic 1.00 0.00
Diabetic 1.20 0.13 1.70 0.0884 .973198 1.47447
Body mass index (BMI) 1.15 0.01 26.60 0.0000 1.135308 1.158464
Intercept 0.01 0.00 -31.91 0.0000 .0039255 .007453

We can also customize the confidence intervals by using collect style cell. The option nformat(%5.2f) formats the numbers, sformat("[%s]") places square brackets around the numbers, and cidelimiter(,) places a comma between the lower and upper bounds.

. collect style cell result[_r_ci],     
                    nformat(%5.2f)      
                    sformat("[%s]")     
                    cidelimiter(,)

. collect preview                   

Coefficient Std. error z p-value 95% CI
Age (years) 1.03 0.00 17.81 0.0000 [1.03, 1.04]
Male 1.00 0.00
Female 0.15 0.02 -12.31 0.0000 [0.11, 0.21]
Male # Age (years) 1.00 0.00
Female # Age (years) 1.03 0.00 9.65 0.0000 [1.02, 1.03]
Not diabetic 1.00 0.00
Diabetic 1.20 0.13 1.70 0.0884 [0.97, 1.47]
Body mass index (BMI) 1.15 0.01 26.60 0.0000 [1.14, 1.16]
Intercept 0.01 0.00 -31.91 0.0000 [0.00, 0.01]

Next we can use collect label levels to change the column labels from "Coefficient" to "OR" and from "Std. error" to "SE". We can use collect style cell to center-align the column headers, and we can use collect style column to add extra space between the columns.

. collect label levels result _r_b "OR", modify  

. collect label levels result _r_se "SE", modify

. collect style cell cell_type[column-header], halign(center)

. collect style column, extraspace(1)                   

. collect preview 

OR SE z p-value 95% CI
Age (years) 1.03 0.00 17.81 0.0000 [1.03, 1.04]
Male 1.00 0.00
Female 0.15 0.02 -12.31 0.0000 [0.11, 0.21]
Male # Age (years) 1.00 0.00
Female # Age (years) 1.03 0.00 9.65 0.0000 [1.02, 1.03]
Not diabetic 1.00 0.00
Diabetic 1.20 0.13 1.70 0.0884 [0.97, 1.47]
Body mass index (BMI) 1.15 0.01 26.60 0.0000 [1.14, 1.16]
Intercept 0.01 0.00 -31.91 0.0000 [0.00, 0.01]

Next we can customize the row labels. We can remove the base levels from the factor variables by typing collect style showbase off. We can use collect style row to stack the dimension levels below the dimension name (stack), remove the "=" between the dimensions and their levels (nobinder), and change the interaction delimiter from "#" to "x" (delimiter(" x ")).

. collect style showbase off

. collect style row stack, nobinder delimiter(" x ")

. collect preview

OR SE z p-value 95% CI
Age (years) 1.03 0.00 17.81 0.0000 [1.03, 1.04]
Sex
Female 0.15 0.02 -12.31 0.0000 [0.11, 0.21]
Sex x Age (years)
Female 1.03 0.00 9.65 0.0000 [1.02, 1.03]
Diabetes status
Diabetic 1.20 0.13 1.70 0.0884 [0.97, 1.47]
Body mass index (BMI) 1.15 0.01 26.60 0.0000 [1.14, 1.16]
Intercept 0.01 0.00 -31.91 0.0000 [0.00, 0.01]

The vertical line between the row labels and odds ratios is a border along the right side of the first column. We can use collect style cell to remove this vertical line.

. collect style cell border_block, border(right, pattern(nil))

. collect preview

OR SE z p-value 95% CI
Age (years) 1.03 0.00 17.81 0.0000 [1.03, 1.04] Sex Female 0.15 0.02 -12.31 0.0000 [0.11, 0.21] Sex x Age (years) Female 1.03 0.00 9.65 0.0000 [1.02, 1.03] Diabetes status Diabetic 1.20 0.13 1.70 0.0884 [0.97, 1.47] Body mass index (BMI) 1.15 0.01 26.60 0.0000 [1.14, 1.16] Intercept 0.01 0.00 -31.91 0.0000 [0.00, 0.01]

Exporting tables to documents

Now we're ready to export our table to a Microsoft Word document. We can use collect style putdocx to customize the style of our table in the Word document. By default, Word will stretch our table across the width of the document. We can prevent this with the layout(autofitcontents) option, and we can use title() to add a title to our table. Then we can export our table to Word by using collect export.

. collect style putdocx, layout(autofitcontents)    
                        title("Table 1: Logistic Model For Hypertension")

. collect export MyLogitTable2.docx, as(docx) replace
(collection mytable exported to file MyLogitTable1.docx)
Regression table in Wrod

You can use putdocx collect to export your table if you are using putdocx to create a larger Word document.

You can also export tables to Excel, HTML, LaTeX, Markdown, PDF, Stata SMCL, and plain text.

Save and use your styles and labels

Once we are happy with the look of our table, we can save the style and labels and reuse them for other tables. We can use collect label save and collect style save to save our labels and styles, respectively.

. collect label save MyLogitLabels, replace
(labels from mytable saved to file MyLogitLabels.stjson)

. collect style save MyLogitStyle, replace         
(style from mytable saved to file MyLogitStyle.stjson)

Then we can open another dataset and apply our styles and labels to another model.

. webuse lbw, clear
(Hosmer & Lemeshow data)

. collect style use MyLogitStyle, replace layout
(dimension colname not found)
(dimension result not found)

Collection: mytable
      Rows: colname
   Columns: result[_r_b _r_se _r_z _r_p _r_ci]

. collect label use MyLogitLabels, replace

. quietly collect: logistic low age i.smoke i.race ht ui

. collect preview

OR SE z p-value 95% CI
Age (years) 0.97 0.03 -0.90 0.3705 [0.91, 1.04] Smoked during pregnancy Smoker 2.93 1.12 2.81 0.0049 [1.39, 6.20] Race Black 2.71 1.36 1.99 0.0468 [1.01, 7.26] Other 2.83 1.18 2.48 0.0130 [1.24, 6.41] Has history of hypertension 3.87 2.43 2.16 0.0309 [1.13,13.25] Presence, uterine irritability 2.66 1.17 2.22 0.0267 [1.12, 6.32] Intercept 0.26 0.24 -1.48 0.1376 [0.04, 1.54]

We can see that row and column layout, numeric formats, alignment, treatment of factor variables, and labels are automatically applied from the saved style and label files.

Conclusion

Customizable tables in Stata 17 will help you streamline your workflow and speed up document production. You can save custom styles for different journals or reporting agencies and reformat large documents quickly and easily. And here we have barely scratched the surface of the power and flexibility of customizable tables in Stata 17.

The new Customizable Tables and Collected Results Reference Manual includes examples that show you how to create a "classic table 1", create a table of t-test results, compare the results of several regression models, and much more. We have also added videos to the Stata YouTube Channel that show you how to use the table dialog box and the Tables Builder. And we will be adding posts about customizable tables to the Stata Blog soon.

— Chuck Huber
Director of Statistical Outreach