Home  /  Resources & support  /  FAQs  /  Using named expressions with collect get to add results into a table

How do I include risk ratio (RR) and odds ratio (OR) in a table after fitting a logistic regression model?

Title   Using named expressions with collect get to add results into a table
Authors Pei-Chun Lai, StataCorp
Jeff Pitblado, StataCorp

The risk ratio (RR) and odds ratio (OR) are two common measures of effect size that we can obtain after fitting a logistic regression model. While we can get OR values in the results from the logistic command (and logit with the or option), RR values are neither estimated nor stored in a way that allows us to easily get them into a table. How could we estimate these two measures and include them in a table? In this FAQ, we focus primarily on using “named expressions” with the collect get command to get RR and OR values and on including them in a table using the collect suite of commands.

We will use the nhanes2l dataset and model described in Logistic regression 1: Introduction.

Our variables of interest are highbp and diabetes.

. clear all

. webuse nhanes2l
(Second National Health and Nutrition Examination Survey)

. describe highbp diabetes

Variable Storage Display Value
name type format label Variable label
highbp byte %8.0g * High blood pressure
diabetes byte %12.0g diabetes Diabetes status

First, let us add labels to the levels of highbp:

. label define highbp 0 "Normotensive" 1 "Hypertensive"

. label values highbp highbp

Then we examine the relationship between the variables highbp and diabetes using the tabulate command:

. tabulate highbp diabetes

High blood Diabetes status
pressure Not diabe Diabetic Total
Normotensive 5,795 178 5,973
Hypertensive 4,055 321 4,376
Total 9,850 499 10,349

The odds of hypertension for people with diabetes is

(321/499)/{1 − (321/499)} = 321/178 = 1.80337

The odds of hypertension for people without diabetes is

(4055/9850)/{1 − (4055/9850)} = 4055/5795 = 0.69974

The resulting OR is

1.80337/0.69974 = 2.57720

The risk of hypertension for people with diabetes is

321/499 = 0.64329

For people without diabetes, the risk is

4055/9850 = 0.41168

Then the RR for people with diabetes is

0.64329/0.41168 = 1.56260

Now let us fit a logistic regression model using highbp as the binary outcome variable and i.diabetes as a binary predictor variable:

. logit highbp i.diabetes

Iteration 0:  Log likelihood = -7049.6663  
Iteration 1:  Log likelihood = -6998.1058  
Iteration 2:  Log likelihood = -6998.1054  

Logistic regression                                     Number of obs = 10,349
                                                        LR chi2(1)    = 103.12
                                                        Prob > chi2   = 0.0000
Log likelihood = -6998.1054                             Pseudo R2     = 0.0073

highbp Coefficient Std. err. z P>|z| [95% conf. interval]
diabetes
Diabetic .9467015 .0956682 9.90 0.000 .7591952 1.134208
_cons -.3570448 .0204737 -17.44 0.000 -.3971725 -.3169171

After we run the logit command, the OR and RR values are not stored as scalars in e() or r(). So how can we compute these ratios and include them in a table? One solution is to use the margins command to get the predicted probabilities of highbp = 1 for people with diabetes and people without diabetes:

. margins diabetes

Adjusted predictions                                    Number of obs = 10,349
Model VCE: OIM

Expression: Pr(highbp), predict()

Delta-method
Margin std. err. z P>|z| [95% conf. interval]
diabetes
Not diabetic .4116751 .0049587 83.02 0.000 .4019563 .421394
Diabetic .6432864 .0214443 30.00 0.000 .6012563 .6853164
. matlist r(b)
0. 1.
diabetes diabetes
r1 .4116751 .6432864

The values of the predicted probabilities are stored in the r(b) matrix, and the formulas for our measures of interest become

RR = r(b)[1,2]/ r(b)[1,1]

OR = r(b)[1,2]/(1 − r(b)[1,2]) / r(b)[1,1]/(1 − r(b)[1,1])

Here we use the collect get command to compute these values and put them into a collection:

. collect get RR = (r(b)[1,2]/r(b)[1,1]) OR = ((r(b)[1,2]/(1 − r(b)[1,2]))/(r(b)[1,1]/(1 − r(b)[1,1])))

In the above, we apply the named expression syntax to define the RR and OR. They are specified as name = (exp). The name becomes the level of the result dimension used to tag the collected value. Therefore, the RR’s value is tagged with result[RR], and the OR’s value is tagged with result[OR]. While expressions with spaces must be bound in parentheses, we recommend binding all expressions in parentheses.

Here we use collect levelsof results to show the names of the values (the levels of the result dimension) we have collected.

. collect levelsof result

Collection: default
 Dimension: result
    Levels: OR RR

Our call to collect get also updated the automatic levels of the result dimension. Automatic levels are the levels of a dimension that are automatically displayed in the table when the dimension is specified without explicit levels in the collect layout command. To view these, the collect query autolevels result command displays the automatic levels for the result dimension:

. collect query autolevels result

Automatic dimension levels
Collection: default
 Dimension: result
    Levels: RR OR

For more information on the automatic levels, see FAQ: What are the autolevels of a dimension in a table (collection)?

Let's arrange our collected items into a table.

. collect layout (cmdset) (result)

Collection: default
      Rows: cmdset
   Columns: result
   Table 1: 1 x 2

RR OR
1 1.562607 2.577195

collect's default numeric format is %9.0g. We use the nformat() option with the collect style cell command to modify this format to display four digits after the decimal:

. collect style cell result, nformat(%7.4f)

Then we specify the labels for result[RR] and result[OR] as follows:

. collect label levels result RR "Risk Ratio" OR "Odds Ratio"

We use collect preview to preview the table of RR and OR:

. collect preview

Risk Ratio Odds Ratio
1 1.5626 2.5772

Finally, we apply the collect export command to export this table to a Word file:

. collect export RR_OR.docx, replace
(collection default exported to file RR_OR.docx)

The full code is provided below:

clear all
webuse nhanes2l
describe highbp diabetes
label define highbp 0 "Normotensive" 1 "Hypertensive"
label values highbp highbp
tabulate highbp diabetes

logit highbp i.diabetes

margins diabetes
matlist r(b)

collect get RR = (r(b)[1,2]/r(b)[1,1]) OR = ((r(b)[1,2]/(1- r(b)[1,2]))/(r(b)[1,1]/(1- r(b)[1,1]))) 
collect levelsof result
collect query autolevels result
collect layout (cmdset) (result)

collect style cell result, nformat(%7.4f)
collect label levels result RR "Risk Ratio" OR "Odds Ratio"
collect preview
collect export RR_OR.docx, replace