Create descriptive tables with dtable

Stata's dtable command is designed to create tables of descriptive statistics. These can be simple tables for casual use or sophisticated tables for publication.

Let's begin by opening the nhanes2l dataset. Then let's describe the variables highbp, age, sex, and hlthstat.

. webuse nhanes2l
(Second National Health and Nutrition Examination Survey)

. describe highbp age sex hlthstat

Variable      Storage   Display    Value
    name         type    format    label      Variable label
highbp          byte    %8.0g               * High blood pressure
age             byte    %9.0g                 Age (years)
sex             byte    %9.0g      sex        Sex
hlthstat        byte    %20.0g     hlth       Health status

Basic tables for continuous variables

Let's begin with a simple table of descriptive statistics for age.

. dtable age

Summary
N 10,351
Age (years) 47.580 (17.215)

By default, dtable displays the number of observations followed by the mean and standard deviation of the continuous variable age. We can use the by() option to display the same statistics for each category of highbp as well as the total sample.

. dtable age, by(highbp)

High blood pressure
0 1 Total
N 5,975 (57.7%) 4,376 (42.3%) 10,351 (100.0%)
Age (years) 42.165 (16.772) 54.973 (14.909) 47.580 (17.215)
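
As a quick cross-check, the same counts, means, and standard deviations can be reproduced outside dtable. The sketch below uses tabstat, which is not part of the original walkthrough, and assumes a fresh copy of nhanes2l.

```stata
* Reload the data so the example is self-contained
webuse nhanes2l, clear

* N, mean, and SD of age within each level of highbp, plus a Total row
tabstat age, by(highbp) statistics(n mean sd)
```

The values should agree with the dtable output above, up to rounding.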

The categories are not labeled, so let's use label define to create a label named YesNo and use label values to attach the labels to highbp. Then we can create our table again.

. label define YesNo 0 "No" 1 "Yes"
. label values highbp YesNo
. dtable age, by(highbp)

High blood pressure
No Yes Total
N 5,975 (57.7%) 4,376 (42.3%) 10,351 (100.0%)
Age (years) 42.165 (16.772) 54.973 (14.909) 47.580 (17.215)

We can use the continuous() option to specify the statistics to be reported for the continuous variable age. The example below simply re-creates the default output.

. dtable age, by(highbp)                       ///
     continuous(age, statistics(mean sd))

High blood pressure
No Yes Total
N 5,975 (57.7%) 4,376 (42.3%) 10,351 (100.0%)
Age (years) 42.165 (16.772) 54.973 (14.909) 47.580 (17.215)
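
To see continuous() change the output rather than re-create the default, we could request different statistics. This is a minimal sketch; it assumes that median, min, and max are accepted statistic names for continuous variables in your version of dtable (type help dtable to confirm).

```stata
* Reload the data so the example is self-contained
webuse nhanes2l, clear

* Report the median, minimum, and maximum of age instead of mean and SD
dtable age, by(highbp)                                ///
     continuous(age, statistics(median min max))
```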

We can use the nformat() option to set the numeric format of the mean and standard deviation, and we can use the sformat() option to wrap the standard deviation in a string template. Here the template "(%s)" reproduces the parentheses that dtable already adds by default, but we could use other characters if we wanted to. You can type help format to learn more about the syntax for formatting numbers.

. dtable age, by(highbp)                       ///
     continuous(age, statistics(mean sd))      ///
     nformat(%9.1f mean sd)                    ///
     sformat("(%s)" sd)

High blood pressure
No Yes Total
N 5,975 (57.7%) 4,376 (42.3%) 10,351 (100.0%)
Age (years) 42.2 (16.8) 55.0 (14.9) 47.6 (17.2)
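
Because the example above only reproduces the default parentheses, a variation makes the effect of sformat() easier to see. The sketch below swaps in square brackets; the bracket template is purely illustrative.

```stata
* Reload the data so the example is self-contained
webuse nhanes2l, clear

* Same table as above, but wrap the standard deviation in square brackets
dtable age, by(highbp)                         ///
     continuous(age, statistics(mean sd))      ///
     nformat(%9.1f mean sd)                    ///
     sformat("[%s]" sd)
```

The standard deviations should now appear in brackets instead of parentheses, with the means unchanged.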

We can remove the Total column from our table by adding nototals inside the by() option. And we can test whether the mean age is equal across groups by adding tests inside the by() option. The example below groups by diabetes rather than highbp.

. dtable age, by(diabetes, nototals tests)     ///
     continuous(age, statistics(mean sd))      ///
     nformat(%9.1f mean sd)                    ///
     sformat("(%s)" sd)

note: using test regress across levels of diabetes for age.

Diabetes status
Not diabetic Diabetic Test
N 9,850 (95.2%) 499 (4.8%)
Age (years) 46.9 (17.2) 60.7 (11.5) <0.001
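
The note in the output says that the test is based on regress. One way to see roughly what that means is to fit the regression ourselves and test the group indicator. This is a sketch of an equivalent test, not necessarily the exact computation dtable performs internally.

```stata
* Reload the data so the example is self-contained
webuse nhanes2l, clear

* Regress age on the diabetes indicator
regress age i.diabetes

* Joint test that all diabetes coefficients are zero (equal means across groups)
testparm i.diabetes
```

The p-value reported by testparm should be consistent with the <0.001 shown in the Test column.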

Basic tables for categorical variables

We can create similar tables for categorical variables by adding the i. prefix to the variable, such as sex.

. dtable i.sex

Summary
N 10,351
Sex
Male 4,915 (47.5%)
Female 5,436 (52.5%)

We can use the by(highbp) option to create a cross-tabulation of sex and highbp.

. dtable i.sex, by(highbp)

High blood pressure
No Yes Total
N 5,975 (57.7%) 4,376 (42.3%) 10,351 (100.0%)
Sex
Male 2,611 (43.7%) 2,304 (52.7%) 4,915 (47.5%)
Female 3,364 (56.3%) 2,072 (47.3%) 5,436 (52.5%)

We can explicitly specify the statistics that we would like to report by using the factor() option. And we can customize the numeric display of the statistics by using the nformat() option.

. dtable i.sex, by(highbp)                           ///
     factor(, statistics(fvfrequency fvpercent))     ///
     nformat(%16.0fc fvfrequency)                    ///
     nformat(%9.1f fvpercent)

High blood pressure
No Yes Total
N 5,975 (57.7%) 4,376 (42.3%) 10,351 (100.0%)
Sex
Male 2,611 (43.7%) 2,304 (52.7%) 4,915 (47.5%)
Female 3,364 (56.3%) 2,072 (47.3%) 5,436 (52.5%)

We can again remove the Total column from our table by adding nototals to the by(highbp) option. And we can test the association of sex and highbp by adding tests to the by(highbp) option.

. dtable i.sex, by(highbp, nototals tests)           ///
     factor(, statistics(fvfrequency fvpercent))     ///
     nformat(%16.0fc fvfrequency)                    ///
     nformat(%9.1f fvpercent)

High blood pressure
No Yes Test
N 5,975 (57.7%) 4,376 (42.3%)
Sex
Male 2,611 (43.7%) 2,304 (52.7%) <0.001
Female 3,364 (56.3%) 2,072 (47.3%)
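
The test of association can be cross-checked with tabulate. The sketch below assumes that dtable's default test for categorical variables is Pearson's chi-squared test; check help dtable if your output reports a different test.

```stata
* Reload the data so the example is self-contained
webuse nhanes2l, clear

* Two-way table of sex by highbp with column percentages and Pearson's chi-squared
tabulate sex highbp, chi2 column
```

The column percentages should match the fvpercent values in the table above, and the chi-squared p-value should be consistent with the <0.001 in the Test column.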

Tables for continuous and categorical variables

We can easily create tables that include both continuous and categorical variables together. The example below simply combines the options from the examples above.

. dtable age i.sex, by(highbp)                       ///
     continuous(, statistics(mean sd))               ///
     nformat(%9.1f mean sd)                          ///
     sformat("(%s)" sd)                              ///
     factor(, statistics(fvfrequency fvpercent))     ///
     nformat(%16.0fc fvfrequency)                    ///
     nformat(%9.1f fvpercent)

High blood pressure
No Yes Total
N 5,975 (57.7%) 4,376 (42.3%) 10,351 (100.0%)
Age (years) 42.2 (16.8) 55.0 (14.9) 47.6 (17.2)
Sex
Male 2,611 (43.7%) 2,304 (52.7%) 4,915 (47.5%)
Female 3,364 (56.3%) 2,072 (47.3%) 5,436 (52.5%)

We can easily add more variables. Remember to omit the i. prefix for continuous variables and include the i. prefix for categorical variables.

. dtable age i.sex i.race i.hlthstat i.diabetes      ///
     weight height bmi bpsystol bpdiast              ///
     , by(highbp)                                    ///
     continuous(, statistics(mean sd))               ///
     nformat(%9.1f mean sd)                          ///
     sformat("(%s)" sd)                              ///
     factor(, statistics(fvfrequency fvpercent))     ///
     nformat(%16.0fc fvfrequency)                    ///
     nformat(%9.1f fvpercent)

High blood pressure
No Yes Total
N 5,975 (57.7%) 4,376 (42.3%) 10,351 (100.0%)
Age (years) 42.2 (16.8) 55.0 (14.9) 47.6 (17.2)
Sex
Male 2,611 (43.7%) 2,304 (52.7%) 4,915 (47.5%)
Female 3,364 (56.3%) 2,072 (47.3%) 5,436 (52.5%)
Race
White 5,317 (89.0%) 3,748 (85.6%) 9,065 (87.6%)
Black 545 (9.1%) 541 (12.4%) 1,086 (10.5%)
Other 113 (1.9%) 87 (2.0%) 200 (1.9%)
Health status
Excellent 1,649 (27.7%) 758 (17.3%) 2,407 (23.3%)
Very good 1,666 (27.9%) 925 (21.2%) 2,591 (25.1%)
Good 1,572 (26.4%) 1,366 (31.2%) 2,938 (28.4%)
Fair 766 (12.8%) 904 (20.7%) 1,670 (16.2%)
Poor 310 (5.2%) 419 (9.6%) 729 (7.1%)
Diabetes status
Not diabetic 5,795 (97.0%) 4,055 (92.7%) 9,850 (95.2%)
Diabetic 178 (3.0%) 321 (7.3%) 499 (4.8%)
Weight (kg) 68.3 (13.6) 76.9 (16.2) 71.9 (15.4)
Height (cm) 167.7 (9.5) 167.6 (9.9) 167.7 (9.7)
Body mass index (BMI) 24.2 (4.1) 27.4 (5.3) 25.5 (4.9)
Systolic blood pressure 116.5 (11.8) 150.5 (20.7) 130.9 (23.3)
Diastolic blood pressure 74.2 (8.1) 92.0 (11.0) 81.7 (12.9)

Titles and notes

We can also use the title() option to add a title and the note() option to add a note.

. dtable age i.sex i.race i.hlthstat i.diabetes                    ///
     weight height bmi bpsystol bpdiast                            ///
     , by(highbp)                                                  ///
     continuous(, statistics(mean sd))                             ///
     nformat(%9.1f mean sd)                                        ///
     sformat("(%s)" sd)                                            ///
     factor(, statistics(fvfrequency fvpercent))                   ///
     nformat(%16.0fc fvfrequency)                                  ///
     nformat(%9.1f fvpercent)                                      ///
     title(Table 1: Descriptive statistics by hypertension status) ///
     note("Note: Mean (SD)")

Table 1: Descriptive statistics by hypertension status
High blood pressure
No Yes Total
N 5,975 (57.7%) 4,376 (42.3%) 10,351 (100.0%)
Age (years) 42.2 (16.8) 55.0 (14.9) 47.6 (17.2)
Sex
Male 2,611 (43.7%) 2,304 (52.7%) 4,915 (47.5%)
Female 3,364 (56.3%) 2,072 (47.3%) 5,436 (52.5%)
Race
White 5,317 (89.0%) 3,748 (85.6%) 9,065 (87.6%)
Black 545 (9.1%) 541 (12.4%) 1,086 (10.5%)
Other 113 (1.9%) 87 (2.0%) 200 (1.9%)
Health status
Excellent 1,649 (27.7%) 758 (17.3%) 2,407 (23.3%)
Very good 1,666 (27.9%) 925 (21.2%) 2,591 (25.1%)
Good 1,572 (26.4%) 1,366 (31.2%) 2,938 (28.4%)
Fair 766 (12.8%) 904 (20.7%) 1,670 (16.2%)
Poor 310 (5.2%) 419 (9.6%) 729 (7.1%)
Diabetes status
Not diabetic 5,795 (97.0%) 4,055 (92.7%) 9,850 (95.2%)
Diabetic 178 (3.0%) 321 (7.3%) 499 (4.8%)
Weight (kg) 68.3 (13.6) 76.9 (16.2) 71.9 (15.4)
Height (cm) 167.7 (9.5) 167.6 (9.9) 167.7 (9.7)
Body mass index (BMI) 24.2 (4.1) 27.4 (5.3) 25.5 (4.9)
Systolic blood pressure 116.5 (11.8) 150.5 (20.7) 130.9 (23.3)
Diastolic blood pressure 74.2 (8.1) 92.0 (11.0) 81.7 (12.9)
Note: Mean (SD)

And here is an example of a classic table that compares laboratory test results between groups and reports a test for each variable.

. dtable corpuscl trnsfern albumin vitaminc zinc copper porphyrn lead ///
     , by(highbp, nototals tests)                                     ///
     continuous(, statistics(mean sd))                                ///
     nformat(%9.1f mean sd)                                           ///
     sformat("(%s)" sd)                                               ///
     title(Table 2: Lab tests by hypertension status)                 ///
     note("Note: Mean (SD)")

note: using test regress across levels of highbp for corpuscl, trnsfern, albumin, vitaminc, zinc,
      copper, porphyrn, and lead.

Table 2: Lab tests by hypertension status
High blood pressure
No Yes Test
N 5,975 (57.7%) 4,376 (42.3%)
Mean corpuscular volume (fL) 90.1 (5.4) 89.8 (5.7) 0.001
Transferrin saturation (%) 28.3 (10.4) 26.7 (9.4) <0.001
Serum albumin (g/dL) 4.7 (0.3) 4.7 (0.3) <0.001
Serum vitamin C (mg/dL) 1.0 (0.6) 1.0 (0.6) 0.007
Serum zinc (mcg/dL) 87.1 (14.7) 85.7 (14.2) <0.001
Serum copper (mcg/dL) 125.1 (35.2) 126.3 (28.4) 0.067
Erythrocyte porphyrin (mcg/dl) 53.0 (25.3) 54.5 (26.3) 0.004
Lead (mcg/dL) 13.9 (6.0) 14.9 (6.4) <0.001
Note: Mean (SD)

Exporting tables

We can also use the export() option to export our tables to many types of documents, including Microsoft Word, Microsoft Excel, HTML, PDF, LaTeX, SMCL, text, and Markdown. The example below exports Table 1 to a Microsoft Word document named table1.docx.

. dtable age i.sex i.race i.hlthstat i.diabetes                    ///
     weight height bmi bpsystol bpdiast                            ///
     , by(highbp)                                                  ///
     continuous(, statistics(mean sd))                             ///
     nformat(%9.1f mean sd)                                        ///
     sformat("(%s)" sd)                                            ///
     factor(, statistics(fvfrequency fvpercent))                   ///
     nformat(%16.0fc fvfrequency)                                  ///
     nformat(%9.1f fvpercent)                                      ///
     title(Table 1: Descriptive statistics by hypertension status) ///
     note("Note: Mean (SD)")                                       ///
     export(table1.docx, replace)

Table 1: Descriptive statistics by hypertension status
High blood pressure
No Yes Total
N 5,975 (57.7%) 4,376 (42.3%) 10,351 (100.0%)
Age (years) 42.2 (16.8) 55.0 (14.9) 47.6 (17.2)
Sex
Male 2,611 (43.7%) 2,304 (52.7%) 4,915 (47.5%)
Female 3,364 (56.3%) 2,072 (47.3%) 5,436 (52.5%)
Race
White 5,317 (89.0%) 3,748 (85.6%) 9,065 (87.6%)
Black 545 (9.1%) 541 (12.4%) 1,086 (10.5%)
Other 113 (1.9%) 87 (2.0%) 200 (1.9%)
Health status
Excellent 1,649 (27.7%) 758 (17.3%) 2,407 (23.3%)
Very good 1,666 (27.9%) 925 (21.2%) 2,591 (25.1%)
Good 1,572 (26.4%) 1,366 (31.2%) 2,938 (28.4%)
Fair 766 (12.8%) 904 (20.7%) 1,670 (16.2%)
Poor 310 (5.2%) 419 (9.6%) 729 (7.1%)
Diabetes status
Not diabetic 5,795 (97.0%) 4,055 (92.7%) 9,850 (95.2%)
Diabetic 178 (3.0%) 321 (7.3%) 499 (4.8%)
Weight (kg) 68.3 (13.6) 76.9 (16.2) 71.9 (15.4)
Height (cm) 167.7 (9.5) 167.6 (9.9) 167.7 (9.7)
Body mass index (BMI) 24.2 (4.1) 27.4 (5.3) 25.5 (4.9)
Systolic blood pressure 116.5 (11.8) 150.5 (20.7) 130.9 (23.3)
Diastolic blood pressure 74.2 (8.1) 92.0 (11.0) 81.7 (12.9)
Note: Mean (SD)
(collection DTable exported to file table1.docx)

The resulting table in the Microsoft Word document looks good, but it could look better. Note that some of the variable labels wrap to a second row. We can use collect style putdocx to automatically fit the contents of the table in the document. Then we can use collect export to export the table to the Microsoft Word document named table1.docx.

. collect style putdocx, layout(autofitcontents)
. collect export table1.docx, as(docx) replace
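
The same collection can be written to the other formats mentioned earlier. In the sketch below, a small table is rebuilt so the example stands on its own, and the output filenames are illustrative. It relies on collect export choosing the format from the file extension.

```stata
* Reload the data and build a small table; the results are stored in the
* collection named DTable, which becomes the current collection
webuse nhanes2l, clear
dtable age i.sex, by(highbp)

* Export the current collection to HTML, Excel, and LaTeX
collect export table1.html, replace
collect export table1.xlsx, replace
collect export table1.tex, replace
```

Style settings such as collect style putdocx apply only to Word output, so each format may need its own adjustments.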

We can export Table 2 the same way.

. dtable corpuscl trnsfern albumin vitaminc zinc copper porphyrn lead ///
     , by(highbp, nototals tests)                                     ///
     continuous(, statistics(mean sd))                                ///
     nformat(%9.1f mean sd)                                           ///
     sformat("(%s)" sd)                                               ///
     title(Table 2: Lab tests by hypertension status)                 ///
     note("Note: Mean (SD)")                                          ///
     export(table2.docx, replace)

note: using test regress across levels of highbp for corpuscl, trnsfern, albumin, vitaminc, zinc,
      copper, porphyrn, and lead.

Table 2: Lab tests by hypertension status
High blood pressure
No Yes Test
N 5,975 (57.7%) 4,376 (42.3%)
Mean corpuscular volume (fL) 90.1 (5.4) 89.8 (5.7) 0.001
Transferrin saturation (%) 28.3 (10.4) 26.7 (9.4) <0.001
Serum albumin (g/dL) 4.7 (0.3) 4.7 (0.3) <0.001
Serum vitamin C (mg/dL) 1.0 (0.6) 1.0 (0.6) 0.007
Serum zinc (mcg/dL) 87.1 (14.7) 85.7 (14.2) <0.001
Serum copper (mcg/dL) 125.1 (35.2) 126.3 (28.4) 0.067
Erythrocyte porphyrin (mcg/dl) 53.0 (25.3) 54.5 (26.3) 0.004
Lead (mcg/dL) 13.9 (6.0) 14.9 (6.4) <0.001
Note: Mean (SD)
(collection DTable exported to file table2.docx)

. collect style putdocx, layout(autofitcontents)
. collect export table2.docx, as(docx) replace
(collection DTable exported to file table2.docx)

You can read more about dtable in its manual entry (type help dtable in Stata), and a video demonstration of the command is available on YouTube.