How can I get confidence intervals for predicted probabilities after probit?

Title   Confidence intervals for predicted probabilities after probit
Author William Sribney, StataCorp

What predict does after probit

After you fit a model with probit, you can obtain predicted probabilities with predict.

. sysuse auto

. probit foreign weight mpg

. predict phat

The variable phat contains the predicted probabilities for each observation.

. predict ihat, xb

creates the variable ihat containing the linear predictor (x*beta) for each observation.

. predict error, stdp

creates the variable error containing the standard error of the linear predictor for each observation.
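The predicted probability and the linear predictor are linked through the standard normal cumulative distribution function, which Stata provides as normal(). The following do-file sketch checks that relationship on the same data. The names phat_check and diff are illustrative and not part of the original FAQ, and the commands are written without the dot prompt so that they can be pasted into a do-file.

```stata
* Refit the model so that the sketch is self-contained
sysuse auto, clear
probit foreign weight mpg
predict phat
predict ihat, xb

* The probit probability is the standard normal CDF of the linear predictor
generate phat_check = normal(ihat)

* Any differences should be no larger than float rounding error
generate diff = abs(phat - phat_check)
summarize diff
```

This relationship is what allows an interval for the linear predictor to be carried over to the probability scale.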

Confidence intervals

Because predict reports the standard error of the linear predictor rather than of the predicted probability, first compute a confidence interval for the linear predictor, and then transform its endpoints to the probability scale with normal(), the standard normal cumulative distribution function.

Here is how to compute 95% confidence intervals:

. generate lb = ihat - invnormal(0.975)*error
. generate ub = ihat + invnormal(0.975)*error
. generate plb = normal(lb)
. generate pub = normal(ub)

The variables plb and pub contain, respectively, the lower and upper confidence bounds for the predicted probabilities.
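The same steps work for other confidence levels. The sketch below is one way to parameterize the level, assuming a 90% interval as an example. The local macros level and z and the variables plb90 and pub90 are illustrative names, not part of the original FAQ.

```stata
* Refit the model so that the sketch is self-contained
sysuse auto, clear
probit foreign weight mpg
predict phat
predict ihat, xb
predict error, stdp

* Two-sided critical value for the chosen level (90% -> invnormal(0.95))
local level 90
local z = invnormal(1 - (100 - `level')/200)

* Build the interval on the linear-predictor scale, then transform the endpoints
generate plb90 = normal(ihat - `z'*error)
generate pub90 = normal(ihat + `z'*error)

* Inspect the first few observations
list make phat plb90 pub90 in 1/5
```

Because normal() is increasing and maps into the interval from 0 to 1, the transformed bounds always bracket the predicted probability and never leave the unit interval.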

Standard error of the predicted probabilities

You can approximate the standard error of the predicted probabilities with the following formula:

. generate pr_err = error*exp(-0.5*ihat^2)/sqrt(2*_pi)
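For reference, the expression in the command above can be written out as follows. Here the symbols are notation introduced for this note: x\hat\beta corresponds to ihat, se(x\hat\beta) to error, and se(\hat p) to pr_err, while \Phi and \phi denote the standard normal distribution and density functions.

```latex
\begin{align*}
\hat{p} &= \Phi(x\hat\beta), \qquad
\frac{d\,\Phi(z)}{dz} = \phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2} \\
\operatorname{se}(\hat{p}) &\approx \phi(x\hat\beta)\,\operatorname{se}(x\hat\beta)
  = \frac{e^{-(x\hat\beta)^{2}/2}}{\sqrt{2\pi}}\,\operatorname{se}(x\hat\beta)
\end{align*}
```

The density factor is what Stata's normalden() function returns, so error*normalden(ihat) should give the same values as the expression above.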

This formula is a first-order Taylor-series (delta-method) approximation to the standard error. It should NOT be used to generate confidence intervals. The normal approximation holds much better on the index (linear predictor) scale than on the probability scale, and a symmetric interval built on the probability scale can extend below 0 or above 1. It is therefore much better to compute the confidence interval for the index and then transform its endpoints to the probability scale, as we did above, than to build an interval from the approximate standard errors of the predicted probabilities.
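One way to see the problem is to build the discouraged interval anyway and count how many observations receive bounds outside the range from 0 to 1. This is only an illustrative sketch. The variables naive_lb and naive_ub are hypothetical names, and the count depends on the data and model.

```stata
* Refit the model so that the sketch is self-contained
sysuse auto, clear
probit foreign weight mpg
predict phat
predict ihat, xb
predict error, stdp

* Approximate standard error of the predicted probability (formula above)
generate pr_err = error*exp(-0.5*ihat^2)/sqrt(2*_pi)

* Discouraged approach: symmetric interval on the probability scale
generate naive_lb = phat - invnormal(0.975)*pr_err
generate naive_ub = phat + invnormal(0.975)*pr_err

* Count observations whose naive bounds are not valid probabilities
count if naive_lb < 0 | (naive_ub > 1 & !missing(naive_ub))
```

Intervals obtained by transforming the endpoints of the linear-predictor interval cannot have this defect, because normal() always returns a value between 0 and 1.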

The analogous problem for logistic regression is discussed in the FAQ "How do I obtain confidence intervals for the predicted probabilities after logistic regression?"