Title: Standard errors, confidence intervals, and significance tests for ORs, HRs, IRRs, and RRRs
Authors: William Sribney, StataCorp; Vince Wiggins, StataCorp
How does Stata get the standard errors of the odds ratios reported by logistic, and why do the reported confidence intervals not agree with a 95% confidence bound on the reported odds ratio using these standard errors? Likewise, why does the reported significance test of the odds ratio not agree with either a test of the odds ratio against 0 or a test against 1 using the reported standard error?
The odds ratios (ORs), hazard ratios (HRs), incidence-rate ratios (IRRs), and relative-risk ratios (RRRs) are all just univariate transformations of the estimated betas for the logistic, survival, and multinomial logistic models. Using the odds ratio as an example, for any coefficient b we have
$$\mathrm{OR}_b = \exp(b)$$
When ORs (or HRs, or IRRs, or RRRs) are reported, Stata uses the delta rule to derive an estimate of the standard error of $\mathrm{OR}_b$. For the simple expression of $\mathrm{OR}_b$, the standard error by the delta rule is just
$$\mathrm{se}(\mathrm{OR}_b) = \exp(b)\,\mathrm{se}(b)$$
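The delta-rule formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Stata's code; the function name `ratio_and_delta_se` and the values of `b` and `se_b` are hypothetical.

```python
import math

def ratio_and_delta_se(b, se_b):
    """Return exp(b) and its delta-rule standard error exp(b) * se_b."""
    ratio = math.exp(b)
    return ratio, ratio * se_b

# Hypothetical coefficient and standard error, chosen only for illustration.
b, se_b = 0.5, 0.2
or_b, se_or = ratio_and_delta_se(b, se_b)
print(f"OR = {or_b:.4f}, se(OR) = {se_or:.4f}")
```

The same function applies unchanged to HRs, IRRs, and RRRs, since each is exp() of a coefficient.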
The confidence intervals reported by Stata for the odds ratios are the exp() transformed endpoints of the confidence intervals in the natural parameter space—the betas. They are
$$\mathrm{CI}(\mathrm{OR}_b) = [\exp(b_L),\ \exp(b_U)]$$
where $b_L$ is the lower limit of the confidence interval for $b$ and $b_U$ is the upper limit of the confidence interval for $b$.
Some people prefer confidence intervals computed from the odds-ratio estimates and the delta rule SEs. Asymptotically, these two are equivalent, but they will differ for real data.
In practice, the confidence intervals obtained by transforming the endpoints have some intuitively desirable properties; e.g., they do not produce negative odds ratios. In general, we also expect the estimates to be more normally distributed in the natural space of the problem (the beta space); see the long answer below.
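To make the contrast concrete, the following Python sketch computes both intervals for one coefficient. The function names and the input values are hypothetical; the coefficient is deliberately given a large standard error so that the symmetric interval built from the delta-rule SE dips below zero while the transformed-endpoint interval stays positive.

```python
import math
from statistics import NormalDist

def ci_transformed_endpoints(b, se_b, level=0.95):
    """exp() of the endpoints of the CI for b (the form the text says Stata reports)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(b - z * se_b), math.exp(b + z * se_b)

def ci_delta_symmetric(b, se_b, level=0.95):
    """Symmetric CI around exp(b) using the delta-rule standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    ratio = math.exp(b)
    se_ratio = ratio * se_b
    return ratio - z * se_ratio, ratio + z * se_ratio

# Hypothetical, imprecisely estimated coefficient.
b, se_b = 0.4, 0.6
lo_t, hi_t = ci_transformed_endpoints(b, se_b)
lo_d, hi_d = ci_delta_symmetric(b, se_b)
print("transformed endpoints:", (lo_t, hi_t))
print("delta-rule symmetric: ", (lo_d, hi_d))  # lower limit is negative here
```

Whenever z * se(b) exceeds 1, the symmetric interval necessarily has a negative lower limit, which is impossible for a ratio.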
The proper test of significance for ORs, HRs, IRRs, and RRRs is whether the ratio is 1, not whether the ratio is 0. A test against 0 is a test that the coefficient in the fitted model is negative infinity, which has little meaning. Stata reports the test of whether the ratio (OR, HR, IRR, RRR) differs from 1, e.g., $H_0\colon \mathrm{OR}_b = 1$.
As with the confidence interval, there are two asymptotically equivalent ways to form this test: (1) test whether the parameter b differs from 0 in the natural space of the model ($H_0\colon b = 0$), or (2) test whether the transformed parameter differs from 1 in the OR space ($H_0\colon \exp(b) = 1$). The latter test would use $\mathrm{se}(\mathrm{OR}_b)$ from the delta rule. When reporting ORs, HRs, IRRs, or RRRs, Stata reports the statistic and significance level from the test in the natural estimation space, $H_0\colon b = 0$.
Asymptotic theory gives no clue as to which test should be preferred, but we would expect the estimates to be more normally distributed in the natural estimation space—see the discussion below.
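The two test statistics can be compared directly. The Python sketch below is illustrative only, with hypothetical inputs and function names; it shows that the z statistic in the ratio space simplifies to (1 - exp(-b))/se(b), so it generally differs from b/se(b) in finite samples.

```python
import math
from statistics import NormalDist

def two_sided_p(z):
    """Two-sided p-value from a standard normal z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def z_natural(b, se_b):
    """Test of H0: b = 0 in the natural estimation space."""
    return b / se_b

def z_ratio(b, se_b):
    """Test of H0: exp(b) = 1 using the delta-rule SE of exp(b)."""
    ratio = math.exp(b)
    return (ratio - 1) / (ratio * se_b)

# Hypothetical coefficient and standard error.
b, se_b = 0.5, 0.2
z_nat, z_rat = z_natural(b, se_b), z_ratio(b, se_b)
print(f"natural space: z = {z_nat:.3f}, p = {two_sided_p(z_nat):.4f}")
print(f"ratio space:   z = {z_rat:.3f}, p = {two_sided_p(z_rat):.4f}")
```

With these inputs the ratio-space statistic is noticeably smaller than the natural-space one, which is why a test built by hand from the reported OR and its SE need not match the reported significance level.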
To continue with the point that confidence intervals can be computed in two ways for transformed estimates (ORs, RRRs, IRRs, HRs, ...), a user asked
Wouldn’t it be more appropriate to use the standard errors for the RR when calculating CI?
Asymptotically, both methods are equally valid, but it is better to start with the CI in the metric in which the estimates are closer to normal and then transform its endpoints. Because the estimate b is likely to be more nearly normal than exp(b), which is likely to be skewed, it is better to transform the endpoints of the CI for b to produce a CI for exp(b).
First, when you transform a standard error of an ML estimate using the delta method, you get the same standard error that you would have obtained had you performed the maximization with the transformed parameter directly.
Consider a general transformation B = f(b) of b. Using the delta method,
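$$\mathrm{Var}(B) = f'(b)^2\,\mathrm{Var}(b) = f'(b)^2 \left(-\frac{d^2 \ln L}{db^2}\right)^{-1}$$
where $\ln L$ is the log likelihood and the negative second derivative is the observed information. Had we done the maximization in B,
$$\frac{d \ln L}{dB} = \frac{d \ln L}{db}\,\frac{db}{dB}$$
$$\frac{d^2 \ln L}{dB^2} = \frac{d^2 \ln L}{db^2}\left(\frac{db}{dB}\right)^2 + \frac{d \ln L}{db}\,\frac{d^2 b}{dB^2}$$
Since $d \ln L/db = 0$ at the maximum,
$$\frac{d^2 \ln L}{dB^2} = \frac{d^2 \ln L}{db^2}\left(\frac{db}{dB}\right)^2 = \frac{d^2 \ln L}{db^2}\,f'(b)^{-2}$$
and therefore
$$\mathrm{Var}(B) = \left(-\frac{d^2 \ln L}{dB^2}\right)^{-1} = \left(-\frac{d^2 \ln L}{db^2}\right)^{-1} f'(b)^2$$
which is the delta-method variance.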
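The equality can be checked numerically on a small example. The Python sketch below assumes a hypothetical intercept-only logistic model with k successes in n trials, so that b is the log odds and B = exp(b) is the odds; the data, the function names, and the finite-difference step are all illustrative choices, not anything taken from Stata.

```python
import math

n, k = 50, 20  # hypothetical data: 20 successes in 50 trials

def loglik_b(b):
    """Log likelihood parameterized by the log odds b."""
    return k * b - n * math.log1p(math.exp(b))

def loglik_B(B):
    """The same log likelihood parameterized by the odds B = exp(b)."""
    return k * math.log(B) - n * math.log1p(B)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

b_hat = math.log(k / (n - k))  # MLE of the log odds
B_hat = math.exp(b_hat)        # MLE of the odds

var_b = -1.0 / second_derivative(loglik_b, b_hat)
var_B_delta = B_hat**2 * var_b  # f'(b)^2 * Var(b) with f = exp
var_B_direct = -1.0 / second_derivative(loglik_B, B_hat)

print(var_B_delta, var_B_direct)  # agree up to finite-difference error
```

For this model the two variances can also be worked out by hand: Var(b) = 1/12, so the delta method gives (2/3)^2 / 12 = 1/27, and the second derivative in B gives 1/(45 - 18) = 1/27.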
So this fact might give someone more reason to say we should use the standard error of B to produce its CI.
However, we could also use this as evidence to say we could use ANY transformation to produce a confidence interval. That is, we could look at further transformations g(B) of B. According to asymptotic theory,
$$\left[\,g(B) - z\,\mathrm{se}(g(B)),\ g(B) + z\,\mathrm{se}(g(B))\,\right] \tag{1}$$
gives a valid CI for g(B) (where z is the normal quantile and se(g(B)) is the standard error computed using the delta method).
This CI with endpoints transformed back to the B metric gives a CI
$$\left[\,g^{-1}\!\big(g(B) - z\,\mathrm{se}(g(B))\big),\ g^{-1}\!\big(g(B) + z\,\mathrm{se}(g(B))\big)\,\right]$$
This back-transformed interval is equally valid because, for a monotonic g, it has the same coverage probability as (1).
So, ideally, we should search for the best transformation g(B) of any quantity B such that g(B) is roughly normal so that the CI given above gives the best coverage probability.
The estimate B = exp(b) is likely to have a skewed distribution, so it is certainly not likely to be as normal as the distribution of the coefficient estimate b. It’s better to use $g = \exp^{-1}$, that is, the natural logarithm, to produce the CI for B = exp(b).
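The general recipe (transform, build a symmetric interval with the delta-method SE, transform back) can be written as one small function. The Python sketch below is an illustration under assumed names (`ci_via_transform`, `g_prime`, `g_inv`) and hypothetical inputs. Choosing g as the natural logarithm reproduces the transformed-endpoint interval, because se(ln B) = se(B)/B = se(b).

```python
import math
from statistics import NormalDist

def ci_via_transform(B, se_B, g, g_prime, g_inv, level=0.95):
    """CI for B: symmetric interval for g(B) mapped back through g_inv."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se_g = abs(g_prime(B)) * se_B  # delta-method SE of g(B)
    center = g(B)
    lo, hi = g_inv(center - z * se_g), g_inv(center + z * se_g)
    return min(lo, hi), max(lo, hi)  # keeps order if g is decreasing

# Hypothetical coefficient and standard error.
b, se_b = 0.5, 0.2
B, se_B = math.exp(b), math.exp(b) * se_b

# g = ln: recovers [exp(b - z*se(b)), exp(b + z*se(b))].
log_ci = ci_via_transform(B, se_B, math.log, lambda x: 1 / x, math.exp)
# g = identity: the symmetric interval in the ratio metric.
raw_ci = ci_via_transform(B, se_B, lambda x: x, lambda x: 1.0, lambda x: x)
print(log_ci, raw_ci)
```

Any other monotonic g with a known derivative and inverse can be passed in the same way.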
Both CIs are equally valid according to asymptotic theory. But, in practice, the CI produced from the more normal estimate (i.e., b rather than exp(b)) will likely yield slightly better CIs for coverage probability.
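A rough sense of the size of the difference comes from an idealized calculation. The Python sketch below assumes, purely for illustration, that the estimate of b is exactly normal with a known standard error; neither holds in real samples. Under that assumption the transformed-endpoint interval has exactly the nominal coverage, and the coverage of the symmetric delta-rule interval for exp(b) can be computed in closed form. The function name and the standard errors tried are hypothetical.

```python
import math
from statistics import NormalDist

def coverage_delta_ci(se_b, level=0.95):
    """Coverage of exp(b_hat) * (1 -/+ z*se_b) for exp(b), assuming
    b_hat ~ N(b, se_b^2) exactly and se_b known (idealized)."""
    nd = NormalDist()
    z = nd.inv_cdf(0.5 + level / 2)
    w = z * se_b
    # Upper limit covers exp(b) iff (b_hat - b)/se_b >= -ln(1 + w)/se_b.
    lower_cut = -math.log1p(w) / se_b
    # Lower limit covers exp(b) iff (b_hat - b)/se_b <= -ln(1 - w)/se_b;
    # when w >= 1 the lower limit is <= 0 and always covers.
    upper_prob = 1.0 if w >= 1 else nd.cdf(-math.log(1 - w) / se_b)
    return upper_prob - nd.cdf(lower_cut)

for se_b in (0.05, 0.1, 0.3):
    print(f"se(b) = {se_b}: delta-rule CI coverage = {coverage_delta_ci(se_b):.4f}")
```

In this idealized setting the delta-rule interval falls short of 95% for the standard errors tried, and the shortfall shrinks as se(b) gets smaller, in line with the asymptotic equivalence of the two intervals.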