Title: Confidence intervals for predicted probabilities after probit
Author: William Sribney, StataCorp
. sysuse auto
. probit foreign weight mpg
. predict phat
The variable phat contains the predicted probabilities for each observation.
. predict ihat, xb
creates the variable ihat containing the linear predictor (x*beta) for each observation.
. predict error, stdp
creates the variable error containing the standard error of the linear predictor for each observation.
Since predict gives the standard error of the linear predictor, to compute confidence intervals for the predicted probabilities, you must first compute confidence intervals for the linear predictor, and then transform them to probability space.
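In symbols, the two-step approach described above can be sketched as follows. The notation is introduced here for illustration and is not Stata output: eta-hat is the linear predictor, se(eta-hat) its standard error, Phi the standard normal cumulative distribution function, and z the normal quantile for the chosen level (about 1.96 for a 95% interval).

```latex
\[
\hat\eta_i \pm z_{1-\alpha/2}\,\widehat{\mathrm{se}}(\hat\eta_i)
\quad\Longrightarrow\quad
\Bigl[\,\Phi\bigl(\hat\eta_i - z_{1-\alpha/2}\,\widehat{\mathrm{se}}(\hat\eta_i)\bigr),\;
       \Phi\bigl(\hat\eta_i + z_{1-\alpha/2}\,\widehat{\mathrm{se}}(\hat\eta_i)\bigr)\Bigr]
\]
```

Because Phi is strictly increasing, transforming the endpoints preserves the coverage of the index-scale interval, and the resulting bounds always lie between 0 and 1.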
Here is how to compute 95% confidence intervals:
. predict ihat, xb
. predict error, stdp
. generate lb = ihat - invnormal(0.975)*error
. generate ub = ihat + invnormal(0.975)*error
. generate plb = normal(lb)
. generate pub = normal(ub)
The variables plb and pub contain, respectively, the lower and upper confidence bounds for the predicted probabilities.
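The same calculation can be adapted to other confidence levels. The following is a minimal sketch for a 90% interval, meant to be run as a do-file. The names xb_hat, se_xb, zcrit, plb90, pub90, and the local macro level are illustrative choices, not part of the original example.

```stata
* Sketch: 90% confidence bounds for predicted probabilities after probit.
* All variable, scalar, and macro names below are illustrative.
sysuse auto, clear
probit foreign weight mpg

predict double xb_hat, xb      // linear predictor
predict double se_xb, stdp     // standard error of the linear predictor

local level 90
scalar zcrit = invnormal(1 - (1 - `level'/100)/2)   // two-sided critical value

* Build the interval on the index scale, then transform it to probabilities
generate double plb90 = normal(xb_hat - scalar(zcrit)*se_xb)
generate double pub90 = normal(xb_hat + scalar(zcrit)*se_xb)

* Bounds obtained this way cannot leave the unit interval
assert plb90 >= 0 & pub90 <= 1
```

Changing the value of the local macro level is the only edit needed to obtain a different interval width. The variable names would then be worth renaming to match.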
You can compute the standard error of the predicted probabilities by the following formula:
. generate pr_err = error*exp(-0.5*ihat^2)/sqrt(2*_pi)
This is a Taylor-series approximation for the standard error. It should NOT be used to generate confidence intervals. Normality holds much better on the index scale than on the probability scale. Thus it is much better to compute the confidence interval for the index and then transform the endpoints to probability space (as we did above) than it is to use the approximate standard errors of the predicted probabilities to compute confidence intervals.
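The formula for pr_err can be read as a first-order Taylor expansion (the delta method) of the predicted probability around the estimated linear predictor. The derivation below is a sketch in notation introduced here for illustration: p-hat is the predicted probability, eta-hat the linear predictor (ihat), Phi the standard normal cumulative distribution function, and phi its density.

```latex
\[
\hat p_i = \Phi(\hat\eta_i), \qquad
\frac{d\Phi(\eta)}{d\eta} = \phi(\eta) = \frac{1}{\sqrt{2\pi}}\,e^{-\eta^2/2},
\]
\[
\widehat{\mathrm{se}}(\hat p_i) \approx \phi(\hat\eta_i)\,\widehat{\mathrm{se}}(\hat\eta_i)
= \frac{e^{-\hat\eta_i^2/2}}{\sqrt{2\pi}}\,\widehat{\mathrm{se}}(\hat\eta_i).
\]
```

The factor exp(-0.5*ihat^2)/sqrt(2*_pi) in the generate command is the standard normal density evaluated at ihat, which Stata also provides as normalden(ihat). A symmetric interval built from this standard error is not constrained to lie between 0 and 1, which is one practical symptom of the poorer normal approximation on the probability scale.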
The analogous problem for logistic regression is discussed in the FAQ "How do I obtain confidence intervals for the predicted probabilities after logistic regression?"