This question was originally posed on Statalist and answered by StataCorp’s William Gould. Also see “How can I convert Stata's parameterization of ordered probit and logistic models to one in which a constant is estimated?” for another answer to a part of this question.

Title | Relationship between ordered probit and probit

Author | William Gould, StataCorp

Date | September 2001; minor revisions July 2009

The answer is either “yes, and in fact, there already is an intercept in the model” or “no, given how it is parameterized, there is no role for an intercept”.

Let us think about a three-outcome ordered probit model. In that model,

    Pr(outcome==1) = Pr(        X*b + u <= /cut1)
    Pr(outcome==2) = Pr(/cut1 < X*b + u <= /cut2)          (1)
    Pr(outcome==3) = Pr(/cut2 < X*b + u)

Now, let’s add an intercept; replace X*b with X*b + a, producing

    Pr(outcome==1) = Pr(        X*b + a + u <= /cut1)
    Pr(outcome==2) = Pr(/cut1 < X*b + a + u <= /cut2)      (2)
    Pr(outcome==3) = Pr(/cut2 < X*b + a + u)

Doing some algebra, I can equivalently write

    Pr(outcome==1) = Pr(          X*b + u <= /cut1−a)
    Pr(outcome==2) = Pr(/cut1−a < X*b + u <= /cut2−a)      (2')
    Pr(outcome==3) = Pr(/cut2−a < X*b + u)

Let’s pretend that in (1), our estimates were /cut1 = −2 and /cut2 = −1. Tell me a value of the intercept a, and I will tell you the new values for /cut1 and /cut2 that will make (2) exactly equivalent to (1): by (2'), they are simply the old values plus a. The fact is that /cut1, /cut2, and a are collinear, and there is no room for intercept a to play a role.
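The collinearity can be checked numerically. The following is a minimal Python sketch, not Stata output: it assumes u is standard normal, and the value of X*b, the cutpoints, and the intercept a are made-up illustrative numbers. It computes the three outcome probabilities once without an intercept and once with an intercept a and cutpoints shifted by a.

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def oprobit_probs(xb, cuts, a=0.0):
    """Outcome probabilities for an ordered probit with index xb + a.

    cuts must be increasing; u is assumed to be standard normal.
    """
    # Pr(xb + a + u <= c) = Phi(c - xb - a) at each cutpoint
    cum = [norm_cdf(c - xb - a) for c in cuts] + [1.0]
    probs, prev = [], 0.0
    for p in cum:
        probs.append(p - prev)  # probability of falling between cutpoints
        prev = p
    return probs

xb = 0.3             # hypothetical value of X*b for one observation
cuts = [-2.0, -1.0]  # illustrative increasing cutpoints, as in model (1)
a = 0.7              # any intercept at all

base = oprobit_probs(xb, cuts)                           # model (1)
shifted = oprobit_probs(xb, [c + a for c in cuts], a=a)  # model (2)

max_gap = max(abs(p - q) for p, q in zip(base, shifted))
print(base)
print(shifted)
print(max_gap)
```

Whatever value is chosen for a, the two sets of probabilities agree up to floating-point rounding, so the data cannot distinguish an intercept from a shift in the cutpoints.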

In fact, /cut1 and /cut2 are very much like intercepts. First, let’s consider a two-outcome model:

    Pr(outcome==1) = Pr(        X*b + u <= /cut1)
    Pr(outcome==2) = Pr(/cut1 < X*b + u)

I can rewrite the second equation as

    Pr(outcome==2) = Pr(0 < X*b + (−/cut1) + u)
                   = Pr(    X*b + (−/cut1) + u > 0)

and I can rewrite the first equation as 1 − Pr(outcome==2). Doing that, I have the standard probit model with −/cut1 being equal to the intercept. Check the result for yourself. Using the automobile data, type

. probit foreign mpg weight

and

. oprobit foreign mpg weight

The coefficients will all be the same, and the /cut1 will be the negative of the intercept.
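The same correspondence can be seen without fitting anything. Here is a small Python sketch of the two-outcome case, assuming u is standard normal; the cutpoint and the values of X*b are hypothetical, and the function names are made up for illustration. It compares Pr(outcome==2) from the ordered-probit form with the success probability of a binary probit whose intercept is −/cut1.

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def oprobit2_pr_outcome2(xb, cut1):
    """Two-outcome ordered probit: Pr(outcome==2) = Pr(cut1 < xb + u)."""
    return 1.0 - norm_cdf(cut1 - xb)

def probit_pr_success(xb, cons):
    """Binary probit: Pr(success) = Pr(xb + cons + u > 0) = Phi(xb + cons)."""
    return norm_cdf(xb + cons)

cut1 = 1.25  # hypothetical cutpoint
xb_values = (-2.0, -0.5, 0.0, 0.8, 3.0)  # hypothetical values of X*b

# Setting the probit intercept to -cut1 reproduces the ordered-probit probability.
gaps = [abs(oprobit2_pr_outcome2(xb, cut1) - probit_pr_success(xb, -cut1))
        for xb in xb_values]
worst_gap = max(gaps)
print(worst_gap)
```

The gap is zero up to rounding at every value of X*b, which is why the fitted /cut1 is the negative of the fitted probit intercept.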

The above result, as a matter of fact, generalizes to when there are more than two outcomes. In our three-outcome model, we have

    Pr(outcome==1) = Pr(        X*b + u <= /cut1)
    Pr(outcome==2) = Pr(/cut1 < X*b + u <= /cut2)          (1)
    Pr(outcome==3) = Pr(/cut2 < X*b + u)

and again I will start at the bottom, rewriting the third equation and then the second:

    Pr(outcome==3) = Pr(/cut2 < X*b + u)
                   = Pr(    0 < X*b + (−/cut2) + u)
                   = Pr(        X*b + (−/cut2) + u > 0)    (1.3)

For the second equation, I want to write not Pr(outcome==2) but Pr(outcome>=2):

    Pr(outcome>=2) = Pr(/cut1 < X*b + u)
                   = Pr(    0 < X*b + (−/cut1) + u)
                   = Pr(        X*b + (−/cut1) + u > 0)    (1.2)

Now, if you look carefully at (1.3) and (1.2), you will recognize that they are ordinary, binary-outcome probit equations.

Equation (1.3) amounts to running a binary-outcome probit with success being outcome==3 and failure being outcome<3. In this equation, −/cut2 corresponds to the intercept.

Equation (1.2) amounts to running a binary probit with success being outcome>=2 and failure being outcome<2. In this equation, −/cut1 corresponds to the intercept.

Ordered probit amounts to estimating (1.3) and (1.2) simultaneously, with the constraint that b in (1.3) equals b in (1.2). Ergo, ordered probit amounts to estimating the standard binary probit models

Pr(outcome==3) = Pr( X*b + (−/cut2) + u > 0) (1.3)

and

Pr(outcome>=2) = Pr( X*b + (−/cut1) + u > 0) (1.2)

with the constraint that the coefficients, but not the INTERCEPTS, are equal.
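As a rough numerical illustration of (1.2) and (1.3), the Python sketch below treats the three-outcome model as two binary probits that share X*b but have intercepts −/cut1 and −/cut2, and then recovers the three outcome probabilities of (1) from them. It assumes u is standard normal; the cutpoints and the value of X*b are hypothetical numbers, not estimates from any dataset.

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def probit_pr_success(xb, cons):
    """Binary probit: Pr(xb + cons + u > 0) = Phi(xb + cons)."""
    return norm_cdf(xb + cons)

xb, cut1, cut2 = 0.4, -0.5, 1.0  # hypothetical values, with cut1 < cut2

# Two binary probits with a common xb and different intercepts
pr_ge2 = probit_pr_success(xb, -cut1)  # equation (1.2): Pr(outcome>=2)
pr_eq3 = probit_pr_success(xb, -cut2)  # equation (1.3): Pr(outcome==3)

# Outcome probabilities implied by the two probits
recovered = (1.0 - pr_ge2, pr_ge2 - pr_eq3, pr_eq3)

# Outcome probabilities written directly from (1)
direct = (
    norm_cdf(cut1 - xb),
    norm_cdf(cut2 - xb) - norm_cdf(cut1 - xb),
    1.0 - norm_cdf(cut2 - xb),
)

max_gap = max(abs(p - q) for p, q in zip(recovered, direct))
print(recovered)
print(direct)
print(max_gap)
```

Because the two probits share the same X*b and /cut1 < /cut2, Pr(outcome>=2) is never smaller than Pr(outcome==3), so the recovered middle probability is never negative.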