Title: A terminology problem: odds ratio versus odds

Authors: William Gould, StataCorp; James Hardin, StataCorp

Unfortunately, statistical terminology is not used uniformly across
fields. One example of this is **odds** and **odds
ratio**. Economists especially refer to what others call the odds as the
odds ratio. Below, we will be careful to define our terms.

Let there be a binary outcome **y**; we will say **y**=0 or
**y**=1, and let us assume that

Pr(y==1) = F(Xb)

where **X** and **b** are vectors and F() is some cumulative distribution function.

If F() is the standard normal distribution function, we have the probit estimator.

If F() is the logistic distribution function, we have the logit (logistic) estimator.

The cumulative distribution function of the logistic distribution is

F(Xb) = exp(Xb) / [1 + exp(Xb)]

Thus,

Pr(y==1) = exp(Xb) / [1 + exp(Xb)]

Let us write **p** for Pr(**y**==1):

p = exp(Xb) / [1 + exp(Xb)]
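The two choices of F() named above can be sketched in Python. This is only an illustration: the function names and the value of `xb` are hypothetical, and the probit F() is assumed to be the standard normal cumulative distribution function.

```python
import math

def normal_cdf(z):
    """Standard normal CDF: the F() assumed here for probit."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    """Logistic CDF: F(z) = exp(z) / [1 + exp(z)], the F() for logit."""
    return math.exp(z) / (1.0 + math.exp(z))

xb = 0.5  # hypothetical value of the linear index Xb

p_probit = normal_cdf(xb)   # Pr(y==1) under probit
p_logit = logistic_cdf(xb)  # Pr(y==1) under logit

print("probit p:", p_probit)
print("logit  p:", p_logit)
```

Both functions map any value of Xb into a probability strictly between 0 and 1; the rest of this discussion concerns only the logistic case.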

The **odds** **p**/(1−**p**) is therefore

$$
\frac{p}{1-p}
= \frac{\exp(Xb) / [1 + \exp(Xb)]}{1 - \exp(Xb) / [1 + \exp(Xb)]}
= \frac{\exp(Xb) / [1 + \exp(Xb)]}{1 / [1 + \exp(Xb)]}
= \exp(Xb)
$$

Many authors present this formula as

log( p/[1-p] ) = Xb

which also means

p / (1-p) = exp(Xb)
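As a quick numerical check of the two equivalent forms above, the following Python sketch computes **p** from the logistic formula and confirms that the odds equal exp(Xb) and the log odds equal Xb. The helper names and the values of Xb are arbitrary choices made for illustration.

```python
import math

def logistic_p(xb):
    """p = exp(Xb) / [1 + exp(Xb)]"""
    return math.exp(xb) / (1.0 + math.exp(xb))

def odds(p):
    """The odds p / (1 - p)."""
    return p / (1.0 - p)

# Arbitrary illustrative values of the linear index Xb
for xb in (-2.0, 0.0, 0.5, 3.0):
    p = logistic_p(xb)
    # p / (1-p) = exp(Xb)
    assert math.isclose(odds(p), math.exp(xb), rel_tol=1e-9)
    # log( p/[1-p] ) = Xb
    assert math.isclose(math.log(odds(p)), xb, rel_tol=1e-9, abs_tol=1e-12)
    print(f"Xb = {xb:5.2f}   p = {p:.4f}   odds = {odds(p):.4f}")
```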

The language here is sometimes confusing because some authors call this the
**odds ratio**. Englishwise, they are correct: it is the **odds** and
the **odds** are based on a ratio calculation. It is **not**,
however, the **odds ratio** that is talked about when results are
reported.

The **odds ratio** when results are reported refers to the ratio of two
**odds** or, if you prefer, the **ratio of two odds ratios**.

That is, let us write

o(Xb) = exp(Xb)

The **odds ratio** is

$$
\frac{o(\text{evaluated at one place})}{o(\text{evaluated at another})}
$$

In particular, we want to consider the ratio of the **odds** for a
one-unit change in one of the components of **X**. Let us now write

Xb = b0 + b1*x1 + b2*x2 + ... + bk*xk

Let us arbitrarily consider what is called the **odds ratio** for
**x1**:

$$
\frac{o(b_0 + b_1 (x_1 + 1) + b_2 x_2 + \cdots + b_k x_k)}{o(b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k)}
= \frac{o(b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k + b_1)}{o(b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k)}
$$

Now, remember, o() = exp(), so

$$
\begin{aligned}
{} &= \frac{\exp(b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k + b_1)}{\exp(b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k)} \\
&= \frac{\exp(b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k) \, \exp(b_1)}{\exp(b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k)} \\
&= \exp(b_1)
\end{aligned}
$$

This is the standard result. The ratio of the **odds** for a one-unit
increase in **xi** is exp(bi).

This ratio is constant: it does not change according to the value of the
other **X**s because they cancel out in the calculation.
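The cancellation can be checked numerically. The sketch below assumes a model with two regressors and made-up coefficients (`b0`, `b1`, `b2` are hypothetical values, not estimates from any dataset) and evaluates the ratio of the odds for a one-unit increase in x1 at several combinations of x1 and x2.

```python
import math

# Hypothetical coefficients, chosen only for illustration
b0, b1, b2 = -1.0, 0.7, 0.3

def odds_at(x1, x2):
    """o(Xb) = exp(b0 + b1*x1 + b2*x2)"""
    return math.exp(b0 + b1 * x1 + b2 * x2)

def odds_ratio_x1(x1, x2):
    """Ratio of the odds for a one-unit increase in x1, holding x2 fixed."""
    return odds_at(x1 + 1.0, x2) / odds_at(x1, x2)

# Evaluate at several arbitrary points
ratios = [odds_ratio_x1(x1, x2)
          for x1 in (0.0, 2.5)
          for x2 in (-1.0, 4.0)]

# Every ratio equals exp(b1), whatever the values of x1 and x2
for r in ratios:
    assert math.isclose(r, math.exp(b1), rel_tol=1e-9)

print("exp(b1):", math.exp(b1))
print("ratios: ", ratios)
```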

Be careful about language:

- This is called the **odds ratio**; it is called that because it is the ratio of two **odds**.
- Some people call the **odds** the **odds ratio** because the **odds** itself is a ratio. That is fine English, but this can quickly lead to confusion. If you did that, you would have to call this calculation the **odds ratio ratio** or the **ratio of the odds ratios**.

It is the language, and not the math, that leads to the confusion. When we
say that in a logistic model, the **odds ratio** is constant, we mean

$$
\frac{o(\text{evaluated at one point})}{o(\text{evaluated somewhere else})} \quad \text{is constant.}
$$

We do **not** mean that

o(evaluated at one point) is constant.

(that is, we do not mean the **odds** are constant).
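To make the distinction concrete, here is a small Python sketch with a single regressor and hypothetical coefficients (`b0` and `b1` are invented for illustration). The odds change from one point to the next, while the ratio of the odds at points one unit apart stays the same.

```python
import math

# Hypothetical coefficients, chosen only for illustration
b0, b1 = -1.0, 0.7

def odds_at(x1):
    """o(Xb) = exp(b0 + b1*x1)"""
    return math.exp(b0 + b1 * x1)

points = [0.0, 1.0, 2.0, 3.0]
odds_values = [odds_at(x) for x in points]

# Ratio of the odds at each point to the odds one unit earlier
ratios = [odds_values[i + 1] / odds_values[i]
          for i in range(len(points) - 1)]

for x, o in zip(points, odds_values):
    print(f"x1 = {x:.0f}   odds = {o:.4f}")  # the odds differ at every point
print("ratios:", [round(r, 4) for r in ratios])  # each ratio is exp(b1)
```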