FAQ: Standard errors, confidence intervals, and significance tests


How are the standard errors and confidence intervals computed for relative-risk ratios (RRRs) by mlogit?

How are the standard errors and confidence intervals computed for odds ratios (ORs) by logistic?

How are the standard errors and confidence intervals computed for incidence-rate ratios (IRRs) by poisson and nbreg?

How are the standard errors and confidence intervals computed for hazard ratios (HRs) by stcox and streg?

Title		Standard errors, confidence intervals, and significance tests for ORs, HRs, IRRs, and RRRs
Authors		William Sribney, StataCorp
		Vince Wiggins, StataCorp

Question:

How does Stata get the standard errors of the odds ratios reported by logistic and why do the reported confidence intervals not agree with a 95% confidence bound on the reported odds ratio using these standard errors? Likewise, why does the reported significance test of the odds ratio not agree with either a test of the odds ratio against 0 or a test against 1 using the reported standard error?

Standard Errors

The odds ratios (ORs), hazard ratios (HRs), incidence-rate ratios (IRRs), and relative-risk ratios (RRRs) are all just univariate transformations of the estimated betas for the logistic, survival, count (Poisson and negative binomial), and multinomial logistic models. Using the odds ratio as an example, for any coefficient \(b\) we have

\(OR_b = exp(b)\)

When ORs (or HRs, or IRRs, or RRRs) are reported, Stata uses the delta rule to derive an estimate of the standard error of \(OR_b\). For the simple expression of \(OR_b\), the standard error by the delta rule is just

\(se(OR_b) = exp(b)*se(b)\)
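
This expression follows from a first-order Taylor expansion. As a sketch of the standard argument, write \(f\) for any smooth transformation and \(b_0\) for the true value of the coefficient (notation introduced here for illustration only):

\(f(b) \approx f(b_0) + f'(b_0)*(b - b_0)\)

\(Var(f(b)) \approx f'(b_0)^2 * Var(b)\)

Evaluating the derivative at the estimate with \(f = exp\), so that \(f'(b) = exp(b)\), and taking square roots gives \(se(OR_b) = exp(b)*se(b)\).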

Confidence intervals—short answer

The confidence intervals reported by Stata for the odds ratios are the \(exp()\) transformed endpoints of the confidence intervals in the natural parameter space—the betas. They are

\(CI(OR_b) = [exp(b_L), exp(b_U)]\)

where:

\(b_L\) = lower limit of confidence interval for \( b\)
\(b_U\) = upper limit of confidence interval for \( b\)

Here is an example with logistic. We show how to reproduce by hand the standard errors and confidence intervals that Stata reports for odds ratios:

. webuse lbw, clear
 (Hosmer & Lemeshow data)

. logistic low age lwt i.race smoke, coef

Logistic regression                                     Number of obs =    189
                                                        LR chi2(5)    =  20.08
                                                        Prob > chi2   = 0.0012
Log likelihood = -107.29639                             Pseudo R2     = 0.0856



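------------------------------------------------------------------------------
         low | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |  -.0225071   .0341688    -0.66   0.510    -.0894766    .0444625
         lwt |  -.0125017   .0063843    -1.96   0.050    -.0250146    .0000113
             |
        race |
      Black  |    1.23121   .5171062     2.38   0.017     .2177006    2.244719
      Other  |   .9435946   .4162001     2.27   0.023     .1278573    1.759332
             |
       smoke |   1.054433   .3799787     2.77   0.006     .3096879    1.799177
       _cons |   .3301267   1.107607     0.30   0.766    -1.840743    2.500997
------------------------------------------------------------------------------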


. 
. * manually compute the standard error of the odds ratio for age
. display exp(_b[age])*_se[age]
.03340831

. 
. * compute the confidence interval of the odds ratio for age
. display "lower limit: " exp(_b[age]-invnormal(0.975)*_se[age])
lower limit: .91440966

. display "upper limit: " exp(_b[age]+invnormal(0.975)*_se[age])
upper limit: 1.0454657

. 
. * verify
. logistic low age lwt i.race smoke

Logistic regression                                     Number of obs =    189
                                                        LR chi2(5)    =  20.08
                                                        Prob > chi2   = 0.0012
Log likelihood = -107.29639                             Pseudo R2     = 0.0856



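------------------------------------------------------------------------------
         low | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .9777443   .0334083    -0.66   0.510     .9144097    1.045466
         lwt |   .9875761    .006305    -1.96   0.050     .9752956    1.000011
             |
        race |
      Black  |   3.425372   1.771281     2.38   0.017     1.243215    9.437768
      Other  |     2.5692   1.069301     2.27   0.023     1.136391    5.808555
             |
       smoke |   2.870346    1.09067     2.77   0.006        1.363    6.044672
       _cons |   1.391144   1.540841     0.30   0.766     .1586994    12.19464
------------------------------------------------------------------------------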

Note: _cons estimates baseline odds.

Some people prefer confidence intervals computed from the odds-ratio estimates and the delta rule SEs, that is, symmetric intervals of the form \(OR_b \pm z*se(OR_b)\). The two kinds of interval are asymptotically equivalent, but they will differ in finite samples.
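
To see how much the two intervals differ for the model above, here is a minimal sketch that computes both for age. The scalar names (zcrit, or_age, se_or_age) are chosen here for illustration and are not part of Stata's output; nlcom is included only as a cross-check of the delta-rule interval.

```stata
* Sketch: compare the two 95% CIs for the odds ratio of age
webuse lbw, clear
quietly logistic low age lwt i.race smoke

scalar zcrit     = invnormal(0.975)
scalar or_age    = exp(_b[age])
scalar se_or_age = or_age*_se[age]      // delta-rule SE of the odds ratio

* CI reported by logistic: exp() of the endpoints of the CI for b
display "endpoint-transformed: [" exp(_b[age] - zcrit*_se[age]) ", " exp(_b[age] + zcrit*_se[age]) "]"

* Alternative: symmetric CI around the OR using the delta-rule SE
display "delta-rule:           [" or_age - zcrit*se_or_age ", " or_age + zcrit*se_or_age "]"

* nlcom applies the delta method and reports the symmetric interval
nlcom (or_age: exp(_b[age]))
```

For age, the delta-rule interval is roughly [.9123, 1.0432], compared with the reported [.9144, 1.0455]. The difference is small here because the standard error of the coefficient is small; it grows for less precisely estimated coefficients.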

In practice, the confidence intervals obtained by transforming the endpoints have some intuitively desirable properties; for example, they do not produce negative odds ratios. In general, we also expect the estimates to be more normally distributed in the natural space of the problem (the beta space); see the long answer below.

Test of significance

The proper test of significance for ORs, HRs, IRRs, and RRRs is whether the ratio is 1, not whether the ratio is 0. A test against 0 is a test that the coefficient for the parameter in the fitted model is negative infinity, which has little meaning. Stata reports the test of whether the ratio (OR, HR, IRR, RRR) differs from 1, for example, \(H0: OR_b = 1\).

As with the confidence interval, there are two asymptotically equivalent ways to form this test: (1) test whether the parameter \(b\) differs from 0 in the natural space of the model \((H0: b = 0)\), or (2) test whether the transformed parameter differs from 1 in the OR space \((H0: exp(b) = 1)\). The latter test would use the \(se(OR_b)\) from the delta rule. When reporting ORs, HRs, IRRs, or RRRs, Stata reports the statistic and significance level from the test in the natural estimation space, \(H0: b = 0\).
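
Both tests can be computed by hand after fitting the model. The sketch below does this for smoke in the example model; the scalar names z_b and z_or are illustrative. test and testnl are shown as the built-in Wald counterparts; they report chi-squared statistics, which for a single restriction are the squares of the z statistics.

```stata
* Sketch: the two asymptotically equivalent tests for smoke
webuse lbw, clear
quietly logistic low age lwt i.race smoke

* (1) Natural space, H0: b = 0 (the test Stata reports)
scalar z_b = _b[smoke]/_se[smoke]
display "z (b space)  = " z_b "   p = " 2*normal(-abs(z_b))

* (2) OR space, H0: exp(b) = 1, using the delta-rule SE
scalar z_or = (exp(_b[smoke]) - 1)/(exp(_b[smoke])*_se[smoke])
display "z (OR space) = " z_or "   p = " 2*normal(-abs(z_or))

* Built-in Wald versions of (1) and (2)
test smoke
testnl exp(_b[smoke]) = 1
```

Here the test in the natural space gives z = 2.77 (p = 0.006), as reported by logistic, whereas the test in the OR space gives z of about 1.71 (p of roughly 0.09), so the two can disagree noticeably in a sample of this size.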

Asymptotic theory gives no clue as to which test should be preferred, but we would expect the estimates to be more normally distributed in the natural estimation space—see the discussion below.

Confidence intervals (CIs)—long answer

To continue with the point that confidence intervals can be computed in two ways for transformed estimates (ORs, RRRs, IRRs, HRs, ...), a user asked:

Wouldn’t it be more appropriate to use the standard errors for the RR when calculating the CI?

Asymptotically, both methods are equally valid, but it is better to start with the CI in the metric in which the estimates are closer to normal and then transform its endpoints. Since the estimate \(b\) is likely to be more normal than \(exp(b)\) (since \(exp(b)\) is likely to be skewed), it is better to transform the endpoints of the CI for \(b\) to produce a CI for \(exp(b)\).

First, when you transform a standard error of an ML estimate using the delta method, you get the same standard error that you would have obtained had you performed the maximization with the transformed parameter directly.

Consider a general transformation \(B = f(b)\) of \(b\). Using the delta method,

\(Var(B) = f'(b)^2 * Var(b) = f'(b)^2 * (-d^2\;\;lnL/db^2)^{-1}\)

where lnL is the log likelihood. Had we done the maximization in \(B\),

\(d\;\;lnL/dB = d\;\;lnL/db * db/dB\)

\(d^2\;\;lnL/dB^2 = d^2\;\;lnL/db^2 * (db/dB)^2 + d\;\;lnL/db * d^2b/dB^2\)

since \(d\;\;lnL/db = 0\) at the maximum,

\(d^2\;\;lnL/dB^2 = d^2\;\;lnL/db^2 * (db/dB)^2 = d^2\;\;lnL/db^2 * f'(b)^{-2}\)

hence,

\(Var(B) = (-d^2\;\;lnL/dB^2)^{-1} = (-d^2\;\;lnL/db^2)^{-1} * f'(b)^2\)
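
As a check of this general result for the odds ratio (a worked instance of the derivation above), take \(f = exp\), so that \(B = exp(b)\), \(b = ln(B)\), and \(db/dB = 1/B = exp(-b)\). Then

\(d^2\;\;lnL/dB^2 = d^2\;\;lnL/db^2 * exp(-2b)\)

\(Var(B) = (-d^2\;\;lnL/db^2)^{-1} * exp(2b) = exp(b)^2 * Var(b)\)

so \(se(B) = exp(b)*se(b)\), which is the delta-rule standard error given earlier for \(OR_b\).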

So this fact might give someone more reason to say we should use the standard error of \(B\) to produce its CI.

However, we could also use this as evidence to say we could use ANY transformation to produce a confidence interval. That is, we could look at further transformations \(g(B)\) of \(B\). According to asymptotic theory,

\([g(B) - z*se(g(B)), g(B) + z*se(g(B))]\)     (1)

gives a valid CI for \(g(B)\) (where \(z\) is the normal quantile and \(se(g(B))\) is the standard error computed using the delta method).

This CI with endpoints transformed back to the \(B\) metric gives a CI

\([g^{-1}(g(B) - z*se(g(B))), g^{-1}(g(B) + z*se(g(B)))]\)

This back-transformed CI must be equally valid: because \(g\) is monotonic, it yields the same coverage probability as (1).

So, ideally, we should search for the best transformation \(g(B)\) of any quantity \(B\) such that \(g(B)\) is roughly normal so that the CI given above gives the best coverage probability.

The estimate \(B = exp(b)\) is likely to have a skewed distribution, so it is unlikely to be as close to normal as the coefficient estimate \(b\). It’s better to use \(g = exp^{-1}\), the natural logarithm, to produce the CI for \(B = exp(b)\).
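
To make this choice concrete, here is the calculation with \(g = ln\) (a worked sketch that uses the delta-rule standard error \(se(B) = exp(b)*se(b)\)). Since \(g(B) = ln(exp(b)) = b\) and \(g'(B) = 1/B\),

\(se(g(B)) = se(B)/B = exp(b)*se(b)/exp(b) = se(b)\)

so interval (1) is \([b - z*se(b), b + z*se(b)]\), and transforming its endpoints back with \(g^{-1} = exp\) gives

\([exp(b - z*se(b)), exp(b + z*se(b))]\)

which is the interval Stata reports for the odds ratio.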

Both CIs are equally valid according to asymptotic theory. But, in practice, the CI produced from the more normal estimate (for example, \(b\) rather than \(exp(b)\)) will likely yield slightly better coverage probability.
