Title | Confidence intervals for predicted probabilities after probit
Author | William Sribney, StataCorp
Date | January 1999; minor revisions July 2007
After the probit command,
. predict phat
creates the variable phat containing the predicted probabilities for each observation.
. predict ihat, xb
creates the variable ihat containing the linear predictor (x*beta) for each observation.
. predict error, stdp
creates the variable error containing the standard error of the linear predictor for each observation.
Because predict with the stdp option gives the standard error of the linear predictor, not of the predicted probability, computing confidence intervals for the predicted probabilities takes two steps: first compute confidence intervals for the linear predictor, and then transform their endpoints to probability space.
Here is how to compute 95% confidence intervals:
. predict ihat, xb
. predict error, stdp
. generate lb = ihat - invnormal(0.975)*error
. generate ub = ihat + invnormal(0.975)*error
. generate plb = normal(lb)
. generate pub = normal(ub)
Note: invnormal(0.975) is approximately 1.96, so you could plug 1.96 into the above formulas instead of invnormal(0.975).
The variables plb and pub contain, respectively, the lower and upper confidence bounds for the predicted probabilities.
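The steps above can be tried end to end as in the following minimal sketch. The dataset (Stata's shipped auto data) and the model (foreign on mpg and weight) are assumptions chosen only for illustration and are not part of the original example; substitute your own data and probit specification.

```stata
* Assumed example data and model; replace with your own.
sysuse auto, clear
probit foreign mpg weight

* Point predictions on the probability and index scales.
predict phat
predict ihat, xb
predict error, stdp

* 95% confidence interval on the index scale ...
generate lb = ihat - invnormal(0.975)*error
generate ub = ihat + invnormal(0.975)*error

* ... transformed to the probability scale.
generate plb = normal(lb)
generate pub = normal(ub)

* Sanity checks: the bounds should bracket phat and stay inside [0, 1].
assert plb <= phat & phat <= pub
assert plb >= 0 & pub <= 1

list mpg weight phat plb pub in 1/5
```

Because normal() is monotone increasing and maps into the unit interval, the transformed bounds cannot leave [0, 1]. They are generally not symmetric around phat.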
You can compute an approximate standard error of the predicted probabilities with the following formula:
. generate pr_err = error*exp(-0.5*ihat^2)/sqrt(2*_pi)
This is a first-order Taylor-series (delta-method) approximation for the standard error. It should NOT be used to generate confidence intervals. Normality holds much better on the index scale than on the probability scale, and a symmetric interval built from the approximate standard error can extend below 0 or above 1. Thus, it is much better to compute the confidence interval for the index and then transform the endpoints to probability space (as we did above) than it is to use the approximate standard errors of the predicted probabilities to compute confidence intervals.
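For readers who want to see where the formula for pr_err comes from, here is a short sketch of the derivation. The notation is an assumption introduced for this note: \(x\hat\beta\) stands for the linear predictor (ihat), \(\widehat{\mathrm{se}}(x\hat\beta)\) for its standard error (error), \(\Phi\) for the standard normal cumulative distribution function (normal() in Stata), and \(\phi\) for its density.

```latex
% Predicted probability and the derivative of the normal CDF
\hat{p} = \Phi(x\hat\beta),
\qquad
\frac{d\,\Phi(z)}{dz} = \phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}

% First-order Taylor expansion of \Phi around x\hat\beta gives the
% approximate standard error (this is pr_err):
\widehat{\mathrm{se}}(\hat{p}) \;\approx\; \phi(x\hat\beta)\,\widehat{\mathrm{se}}(x\hat\beta)
  \;=\; \widehat{\mathrm{se}}(x\hat\beta)\,\frac{e^{-(x\hat\beta)^{2}/2}}{\sqrt{2\pi}}

% Recommended 95% interval: build it on the index scale, then transform
\Bigl[\,\Phi\bigl(x\hat\beta - z_{0.975}\,\widehat{\mathrm{se}}(x\hat\beta)\bigr),\;
       \Phi\bigl(x\hat\beta + z_{0.975}\,\widehat{\mathrm{se}}(x\hat\beta)\bigr)\Bigr],
\qquad z_{0.975} \approx 1.96
```

The approximation replaces the curved function \(\Phi\) with its tangent line at \(x\hat\beta\), which is why it tends to be least reliable when the predicted probability is near 0 or 1.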