How do I obtain confidence intervals for the predicted probabilities after logistic regression?

Title   Prediction confidence intervals after logistic regression
Author Mark Inlow, StataCorp

After fitting a model with logistic, you can obtain the predicted probabilities of a positive outcome with predict:

. webuse lbw, clear

. logistic low age lwt i.race smoke, coef

. predict phat

The variable phat contains the predicted probabilities.

The linear predictors \( X\beta \) can be obtained by

. predict xb, xb

According to the logistic regression model, the relationship between the predicted probabilities and the linear predictors is

\[ P(Y=1) = \frac{\exp(X\beta)}{1+\exp(X\beta)} \]
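
As a quick check of this relationship, applying Stata's invlogit() function to xb should reproduce phat up to rounding, because both variables are stored as float by default. This is a minimal sketch, assuming the commands above have been run in the same session; the variable names phat_check and pdiff are introduced here for illustration only.

. * phat_check and pdiff are hypothetical names, not part of the original example

. generate double phat_check = invlogit(xb)

. * relative difference between the two versions of the predicted probability

. generate double pdiff = reldif(phat, phat_check)

. summarize pdiff

The maximum reported by summarize should be very small, on the order of float precision.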

Because predict can report the standard error of the linear predictor (option stdp), you can compute confidence intervals for the predicted probabilities in two steps: first compute a confidence interval for the linear predictor, and then transform its endpoints to the probability scale.
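
Written out, the resulting 95% interval for the predicted probability can be sketched as follows. The symbols \( \widehat{\mathrm{se}} \) (the standard error of the linear predictor) and \( z_{0.975} \) (the 0.975 quantile of the standard normal distribution, about 1.96) are notation assumed here for illustration, and \( X\beta \) stands for the fitted linear predictor. Transforming the endpoints is valid because the inverse logit is a strictly increasing function.

\[ \left[ \frac{\exp(X\beta - z_{0.975}\,\widehat{\mathrm{se}})}{1+\exp(X\beta - z_{0.975}\,\widehat{\mathrm{se}})},\; \frac{\exp(X\beta + z_{0.975}\,\widehat{\mathrm{se}})}{1+\exp(X\beta + z_{0.975}\,\widehat{\mathrm{se}})} \right] \]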

Here is how to compute 95% confidence intervals:

. predict error, stdp

. generate lb = xb - invnormal(0.975)*error

. generate ub = xb + invnormal(0.975)*error

. generate plb = invlogit(lb)

. generate pub = invlogit(ub)
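
To inspect the result, you can list the point predictions next to their limits and confirm that each interval brackets its prediction. This is a minimal sketch, assuming phat, plb, and pub were created by the commands above; the choice of the first five observations is arbitrary.

. * show the first five observations (arbitrary choice, for illustration)

. list phat plb pub in 1/5

. * stops with an error if any prediction falls outside its interval

. assert plb <= phat & phat <= pub

For a confidence level other than 95%, replace 0.975 in the invnormal() calls with 1 - (1 - level)/2, for example 0.95 for a 90% interval.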

Computing confidence intervals for the linear predictor and then transforming them to probabilities is preferable to estimating the standard error of the predicted probabilities and building the intervals directly from that standard error. The sampling distribution of the linear predictor is closer to normal than that of the predicted probability. Another advantage is that the transformed limits always lie within the range [0,1].
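
For comparison, the direct approach can be sketched with predictnl, which computes delta-method standard errors and the corresponding limits on the probability scale. This illustration is not part of the original example: it assumes the logistic results are still the active estimates and that plb and pub exist from the commands above, and the names p_dm, dm_lb, and dm_ub are hypothetical.

. * delta-method interval computed directly on the probability scale

. predictnl double p_dm = predict(pr), ci(dm_lb dm_ub)

. * number of delta-method limits that fall outside [0,1]

. count if (dm_lb < 0 | dm_ub > 1) & !missing(dm_lb, dm_ub)

. * the same count for the transformed limits

. count if (plb < 0 | pub > 1) & !missing(plb, pub)

The second count is zero by construction, because invlogit() cannot return a value outside [0,1]. Whether the first count is nonzero depends on the data and the model.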