
## How are the standard errors and confidence intervals computed for hazard ratios (HRs) by stcox and streg?

Title: Standard errors, confidence intervals, and significance tests for ORs, HRs, IRRs, and RRRs

Authors: William Sribney, StataCorp; Vince Wiggins, StataCorp

Question:

How does Stata get the standard errors of the odds ratios reported by logistic and why do the reported confidence intervals not agree with a 95% confidence bound on the reported odds ratio using these standard errors? Likewise, why does the reported significance test of the odds ratio not agree with either a test of the odds ratio against 0 or a test against 1 using the reported standard error?

### Standard Errors

The odds ratios (ORs), hazard ratios (HRs), incidence-rate ratios (IRRs), and relative-risk ratios (RRRs) are all just univariate transformations of the estimated betas for the logistic, survival, and multinomial logistic models. Using the odds ratio as an example, for any coefficient b we have

        ORb = exp(b)


When ORs (or HRs, or IRRs, or RRRs) are reported, Stata uses the delta rule to derive an estimate of the standard error of ORb. For the simple expression of ORb, the standard error by the delta rule is just

        se(ORb) = exp(b)*se(b)


The confidence intervals reported by Stata for the odds ratios are the exp() transformed endpoints of the confidence intervals in the natural parameter space—the betas. They are

        CI(ORb) = [exp(bL), exp(bU)]

where:
bL = lower limit of confidence interval for b
bU = upper limit of confidence interval for b


Here is an example with logistic. We show how to reproduce by hand the standard errors and confidence intervals that Stata reports for odds ratios.

. webuse lbw, clear
(Hosmer & Lemeshow data)

. logistic low age lwt i.race smoke, coef

Logistic regression                             Number of obs     =        189
LR chi2(5)        =      20.08
Prob > chi2       =     0.0012
Log likelihood = -107.29639                     Pseudo R2         =     0.0856

low        Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

age    -.0225071   .0341688    -0.66   0.510    -.0894766    .0444625
lwt    -.0125017   .0063843    -1.96   0.050    -.0250146    .0000113

race
black      1.23121   .5171062     2.38   0.017     .2177006    2.244719
other     .9435946   .4162001     2.27   0.023     .1278573    1.759332

smoke     1.054433   .3799787     2.77   0.006     .3096879    1.799177
_cons     .3301267   1.107607     0.30   0.766    -1.840743    2.500997

.
. * manually compute the standard error of the odds ratio for age
. display exp(_b[age])*_se[age]
.03340831

.
. * compute the confidence interval of the odds ratio for age
. display "lower limit: " exp(_b[age]-invnormal(0.975)*_se[age])
lower limit: .91440966

. display "upper limit: " exp(_b[age]+invnormal(0.975)*_se[age])
upper limit: 1.0454657

.
. * verify
. logistic low age lwt i.race smoke

Logistic regression                             Number of obs     =        189
LR chi2(5)        =      20.08
Prob > chi2       =     0.0012
Log likelihood = -107.29639                     Pseudo R2         =     0.0856

low   Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]

age     .9777443   .0334083    -0.66   0.510     .9144097    1.045466
lwt     .9875761    .006305    -1.96   0.050     .9752956    1.000011

race
black     3.425372   1.771281     2.38   0.017     1.243215    9.437768
other       2.5692   1.069301     2.27   0.023     1.136391    5.808555

smoke     2.870346    1.09067     2.77   0.006        1.363    6.044672
_cons     1.391144   1.540841     0.30   0.766     .1586994    12.19464

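The same recipe carries over to hazard ratios. The following is a minimal sketch, not taken from the original FAQ: it assumes the `drugtr` example dataset is available through `webuse` (with variables `studytime`, `died`, `drug`, and `age`), and the choice of dataset and covariates is ours for illustration only.

```stata
* Sketch: reproduce the SE and CI of a hazard ratio by hand after stcox
webuse drugtr, clear
stset studytime, failure(died)

* Fit the Cox model and show coefficients (the natural parameter space)
stcox drug age, nohr

* Delta-rule standard error of the hazard ratio for age
display "HR:          " exp(_b[age])
display "SE (delta):  " exp(_b[age])*_se[age]

* Confidence interval: exponentiate the endpoints of the CI for b
display "lower limit: " exp(_b[age]-invnormal(0.975)*_se[age])
display "upper limit: " exp(_b[age]+invnormal(0.975)*_se[age])

* Verify against the hazard-ratio table
stcox drug age
```

The displayed values should match the Haz. Ratio, Std. Err., and confidence-interval columns for `age` in the final table. The same steps should apply after `streg` for distributions that report hazard ratios.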

Some people prefer confidence intervals computed from the odds-ratio estimates and the delta rule SEs. Asymptotically, the two approaches are equivalent, but they will differ in any finite sample of real data.

In practice, the confidence intervals obtained by transforming the endpoints have some intuitively desirable properties; e.g., they do not produce negative odds ratios. In general, we also expect the estimates to be more normally distributed in the natural space of the problem (the beta space); see the long answer below.
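To see how far the two intervals can differ, here is a short sketch using the `smoke` coefficient from the `lbw` model fit earlier. From the rounded values in the odds-ratio table, the delta-rule interval is about 2.87 ± 1.96 × 1.09, or roughly 0.73 to 5.01, compared with the reported 1.363 to 6.045. The scalar names `or_smoke` and `se_or` are ours; the `///` line continuations assume the code is run from a do-file.

```stata
webuse lbw, clear
logistic low age lwt i.race smoke

* Interval Stata reports: exponentiate the endpoints of the CI for b
display "transformed endpoints: " exp(_b[smoke]-invnormal(0.975)*_se[smoke]) ///
    " to " exp(_b[smoke]+invnormal(0.975)*_se[smoke])

* Alternative: OR +/- z * (delta-rule SE), computed by hand
scalar or_smoke = exp(_b[smoke])
scalar se_or    = or_smoke*_se[smoke]
display "delta-rule interval:   " (or_smoke-invnormal(0.975)*se_or) ///
    " to " (or_smoke+invnormal(0.975)*se_or)

* nlcom applies the delta method and reports the same symmetric interval
nlcom (or_smoke: exp(_b[smoke]))
```

The delta-rule interval is symmetric around the odds ratio, so with a large enough standard error its lower limit can fall below 0; the transformed-endpoint interval cannot.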

### Test of significance

The proper test of significance for ORs, HRs, IRRs, and RRRs is whether the ratio is 1, not whether the ratio is 0. A test against 0 is a test that the coefficient in the fitted model is negative infinity and has little meaning. Stata reports the test of whether the ratio (OR, HR, IRR, RRR) differs from 1, e.g., H0: ORb = 1.

As with the confidence interval, there are two asymptotically equivalent ways to form this test: (1) test whether the parameter b differs from 0 in the natural space of the model (H0: b = 0), or (2) test whether the transformed parameter differs from 1 in the OR space (H0: exp(b) = 1). The latter test would use the SE(ORb) from the delta rule. When reporting ORs, HRs, IRRs, or RRRs, Stata reports the statistic and significance level from the test in the natural estimation space, H0: b = 0.
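The two tests can be compared directly. The sketch below is ours rather than part of the original FAQ and again uses `smoke` from the `lbw` model. From the rounded table values, the OR-space statistic is about (2.870346 − 1)/1.09067 ≈ 1.71, whereas the reported z in the natural space is 2.77.

```stata
webuse lbw, clear
logistic low age lwt i.race smoke

* (1) Wald test in the natural space, H0: b = 0
*     (the chi2 statistic is the square of the reported z)
test smoke

* (2) Wald test in the OR space, H0: exp(b) = 1, using the delta-rule SE
testnl exp(_b[smoke]) = 1

* The z statistic behind (2), computed by hand
display (exp(_b[smoke])-1)/(exp(_b[smoke])*_se[smoke])
```

In this example the two asymptotically equivalent tests lead to noticeably different statistics, which is why the choice of metric matters in finite samples.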

Asymptotic theory gives no clue as to which test should be preferred, but we would expect the estimates to be more normally distributed in the natural estimation space—see the discussion below.

To continue with the point that confidence intervals can be computed in two ways for transformed estimates (ORs, RRRs, IRRs, HRs, ...), a user asked

Wouldn’t it be more appropriate to use the standard errors for the RR when calculating CI?

Asymptotically, both methods are equally valid, but it is better to start with the CI in the metric in which the estimates are closer to normal and then transform its endpoints. Since the estimate b is likely to be more normal than exp(b) (since exp(b) is likely to be skewed), it is better to transform the endpoints of the CI for b to produce a CI for exp(b).

First, when you transform a standard error of an ML estimate using the delta method, you get the same standard error that you would have obtained had you performed the maximization with the transformed parameter directly.

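Consider a general transformation B = f(b) of b. Using the delta method,

$$
\operatorname{Var}(B) = f'(b)^2 \, \operatorname{Var}(b) = f'(b)^2 \left( -\frac{d^2 \ln L}{db^2} \right)^{-1}
$$

where lnL is the log likelihood. Had we done the maximization in B,

$$
\frac{d \ln L}{dB} = \frac{d \ln L}{db} \, \frac{db}{dB}
$$

$$
\frac{d^2 \ln L}{dB^2} = \frac{d^2 \ln L}{db^2} \left( \frac{db}{dB} \right)^2 + \frac{d \ln L}{db} \, \frac{d^2 b}{dB^2}
$$

Since d lnL/db = 0 at the maximum and db/dB = 1/f'(b),

$$
\frac{d^2 \ln L}{dB^2} = \frac{d^2 \ln L}{db^2} \left( \frac{db}{dB} \right)^2 = \frac{d^2 \ln L}{db^2} \, f'(b)^{-2}
$$

Hence,

$$
\operatorname{Var}(B) = \left( -\frac{d^2 \ln L}{dB^2} \right)^{-1} = \left( -\frac{d^2 \ln L}{db^2} \right)^{-1} f'(b)^2
$$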
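As a quick check, the general result can be specialized to f = exp, the transformation behind ORs, HRs, IRRs, and RRRs. This worked step is our addition and uses only the notation already defined above.

```latex
B = \exp(b), \qquad f'(b) = \exp(b)
\;\Longrightarrow\;
\operatorname{Var}(B) = \exp(b)^2 \, \operatorname{Var}(b),
\qquad
\operatorname{se}(B) = \exp(b) \, \operatorname{se}(b)
```

This is the delta-rule standard error given under Standard Errors. For age in the logistic example, exp(−.0225071) × .0341688 ≈ .0334083, the value in the odds-ratio table.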


So this fact might give someone more reason to say we should use the standard error of B to produce its CI.

However, we could also use this as evidence to say we could use ANY transformation to produce a confidence interval. That is, we could look at further transformations g(B) of B. According to asymptotic theory,

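$$
\left[\, g(B) - z \, \operatorname{se}(g(B)), \;\; g(B) + z \, \operatorname{se}(g(B)) \,\right] \tag{1}
$$

gives a valid CI for g(B) (where z is the normal quantile and se(g(B)) is the standard error computed using the delta method).

This CI with endpoints transformed back to the B metric gives a CI

$$
\left[\, g^{-1}\bigl( g(B) - z \, \operatorname{se}(g(B)) \bigr), \;\; g^{-1}\bigl( g(B) + z \, \operatorname{se}(g(B)) \bigr) \,\right]
$$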


The above CI must be equally valid because, for a monotone transformation g, it has the same coverage probability as (1).

So, ideally, we should search for the best transformation g(B) of any quantity B such that g(B) is roughly normal so that the CI given above gives the best coverage probability.

The estimate B = exp(b) is likely to have a skewed distribution, so it is unlikely to be as close to normal as the coefficient estimate b. It’s better to use g = ln, the inverse of exp, to produce the CI for B = exp(b).
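To make the connection explicit, here is the back-transformed interval worked out for the logarithm, the inverse of exp. The step is our addition and follows directly from the delta method and the definitions above.

```latex
g(B) = \ln B = b, \qquad
\operatorname{se}(g(B)) = \left| g'(B) \right| \operatorname{se}(B)
  = \frac{1}{\exp(b)} \, \exp(b) \, \operatorname{se}(b) = \operatorname{se}(b)

g^{-1}\bigl( g(B) \pm z \, \operatorname{se}(g(B)) \bigr)
  = \exp\bigl( b \pm z \, \operatorname{se}(b) \bigr)
```

The resulting interval, [exp(b − z se(b)), exp(b + z se(b))], is exactly the transformed-endpoint CI that Stata reports for ORs, HRs, IRRs, and RRRs.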

Both CIs are equally valid according to asymptotic theory. But, in practice, the CI produced from the more normal estimate (i.e., b rather than exp(b)) will likely have slightly better coverage probability.