Stata | FAQ: Missing standard errors reported by stcox


Note: This FAQ is relevant for users of Stata 10 or earlier. It is not relevant for newer versions.

Why does stcox sometimes produce missing standard errors?

Title		Missing standard errors reported by stcox
Author		Mario Cleves, StataCorp

There are two major reasons for missing standard errors in a Cox proportional hazards regression. The first is failure to converge. Although this is rare, if the message “nonconcave function encountered” or “unproductive step attempted” appears in the last step of the iteration log, then the estimation procedure did not converge to the MLE, and the results cannot be trusted.

Missing standard errors in a Cox proportional hazards regression, however, are more often due to one of four types of collinearity:

1) Covariate is collinear with the dead/censor variable.

This results in a hazard ratio of infinity (reported as a very large number) and a missing standard error if the collinearity is positive, or a hazard ratio of zero (reported as a very small number, with a large negative coefficient) and a missing standard error if the collinearity is negative.

 . webuse cancer
 (Patient Survival in Drug Trial)

 . stset studytime, f(died)

      failure event:  died != 0 & died < .
 obs. time interval:  (0, studytime]
  exit on or before:  failure
 
 ------------------------------------------------------------------------------
        48  total obs.
         0  exclusions
 ------------------------------------------------------------------------------
        48  obs. remaining, representing
        31  failures in single record/single failure data
       744  total analysis time at risk, at risk from t =         0
                              earliest observed entry t =         0
                                   last observed exit t =        39
 
 . generate copy=_d

 . stcox age drug copy, exactp nolog

          failure _d:  died
    analysis time _t:  studytime

 Cox regression -- exact partial likelihood

 No. of subjects =           48                     Number of obs   =        48
 No. of failures =           31
 Time at risk    =          744
                                                    LR chi2(3)      =     59.38
 Log likelihood  =   -62.481243                     Prob > chi2     =    0.0000
 
 ------------------------------------------------------------------------------
           _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
          age |   1.090842    .042937     2.21   0.027     1.009851    1.178328
         drug |   .2851362   .1067876    -3.35   0.001     .1368565    .5940725
         copy |   5.28e+15          .        .       .            .           .
 ------------------------------------------------------------------------------
 
 . generate negcopy=-_d
 
 . stcox age drug negcopy, exactp nolog
 
          failure _d:  died
    analysis time _t:  studytime
 
 Cox regression -- exact partial likelihood
 
 No. of subjects =           48                     Number of obs   =        48
 No. of failures =           31
 Time at risk    =          744
                                                    LR chi2(3)      =     59.38
 Log likelihood  =   -62.481243                     Prob > chi2     =    0.0000
 
 ------------------------------------------------------------------------------
           _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
          age |   1.090842    .042937     2.21   0.027     1.009851    1.178328
         drug |   .2851362   .1067876    -3.35   0.001     .1368565    .5940725
      negcopy |   1.89e-16          .        .       .            .           .
 ------------------------------------------------------------------------------
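
A quick way to look for this form of collinearity is to compare the suspect covariate with the failure indicator _d directly. The following is a minimal sketch, assuming the cancer data are still stset as above and that copy and negcopy exist as generated above. A correlation at or very near 1 or -1, or a cross-tabulation with every observation on one diagonal, points to the problem.

 . * correlation of each suspect covariate with the failure indicator
 . correlate copy negcopy _d

 . * cross-tabulation of one suspect covariate against the failure indicator
 . tabulate copy _d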

2) Covariate is collinear with the time variable.

This results in a hazard ratio close to one (coefficient close to zero) and a missing standard error.

 . clear

 . set obs 1000
 obs was 0, now 1000

 . generate t=_n

 . stset t

      failure event:  (assumed to fail at time=t)
 obs. time interval:  (0, t]
  exit on or before:  failure
 
 ------------------------------------------------------------------------------
      1000  total obs.
         0  exclusions
 ------------------------------------------------------------------------------
      1000  obs. remaining, representing
      1000  failures in single record/single failure data
    500500  total analysis time at risk, at risk from t =         0
                              earliest observed entry t =         0
                                   last observed exit t =      1000
 
 . generate copy=_t
 
 . stcox copy
 
          failure _d:  1 (meaning all fail)
    analysis time _t:  t
 
 Iteration 0:   log likelihood = -5912.1282
 Iteration 1:   log likelihood = -4537.5754
 Iteration 2:   log likelihood = -3821.8484
 Iteration 3:   log likelihood = -3430.1547
 Iteration 4:   log likelihood = -3427.9073
 Iteration 5:   log likelihood = -3344.6335
 Refining estimates:
 Iteration 0:   log likelihood = -3312.0701
 Iteration 1:   log likelihood = -2920.7381
 Iteration 2:   log likelihood = -2709.5843
 Iteration 3:   log likelihood = -2701.5327
 
 Cox regression -- no ties
 
 No. of subjects =         1000                     Number of obs   =      1000
 No. of failures =         1000
 Time at risk    =       500500
                                                    LR chi2(1)      =   6421.19
 Log likelihood  =   -2701.5327                     Prob > chi2     =    0.0000
 
 ------------------------------------------------------------------------------
           _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
         copy |   .9343625          .        .       .            .           .
 ------------------------------------------------------------------------------

3) Covariate is collinear with the entry-time variable.

This results in a hazard ratio close to one (coefficient close to zero) and a missing standard error.

 . clear

 . set obs 1000
 obs was 0, now 1000
 
 . generate t0=_n-5
 
 . generate t=_n
 
 . stset t, enter(t0)
 
      failure event:  (assumed to fail at time=t)
 obs. time interval:  (0, t]
  enter on or after:  time t0
  exit on or before:  failure
 
 ------------------------------------------------------------------------------
      1000  total obs.
         0  exclusions
 ------------------------------------------------------------------------------
      1000  obs. remaining, representing
      1000  failures in single record/single failure data
      4990  total analysis time at risk, at risk from t =         0
                              earliest observed entry t =         0
                                   last observed exit t =      1000
 
 . generate copy=_t0
 
 . stcox copy
 
          failure _d:  1 (meaning all fail)
    analysis time _t:  t
   enter on or after:  time t0
 
 Iteration 0:   log likelihood = -1606.1782
 Iteration 1:   log likelihood = -1545.3983
 Iteration 2:   log likelihood = -1540.1655
 Refining estimates:
 Iteration 0:   log likelihood = -1541.2987
 Iteration 1:   log likelihood = -1484.0017
 Iteration 2:   log likelihood = -1473.3656
 Iteration 3:   log likelihood = -1470.0384
 Iteration 4:   log likelihood = -1469.6364
 Iteration 5:   log likelihood =  -1463.425
 
 Cox regression -- no ties
 
 No. of subjects =         1000                     Number of obs   =      1000
 No. of failures =         1000
 Time at risk    =         4990
                                                    LR chi2(1)      =    285.51
 Log likelihood  =    -1463.425                     Prob > chi2     =    0.0000
 
 ------------------------------------------------------------------------------
           _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
         copy |   .9293137          .        .       .            .           .
 ------------------------------------------------------------------------------
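
This form and the previous one can be checked in the same way, by correlating the suspect covariate with the st variables _t0 and _t. The following is a minimal sketch, assuming the data are stset and copy exists as generated above. A correlation at or very near 1 or -1 with either variable flags the collinearity.

 . * correlation of the suspect covariate with entry time and analysis time
 . correlate copy _t0 _t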

4) Covariate does not vary within death event risk sets.

This is a complicated form of collinearity wherein a covariate varies overall, but for each death event, it does not vary within the associated risk set.

This results in a hazard ratio of one (coefficient is zero) and a missing standard error.

 . clear

 . input id t0 t dead x  

             id         t0          t       dead          x
   1.      1      0    1          1       6.18  
   2.      2      0.5  1          1       6.18  
   3.      3      1    6          1       5.55  
   4.      4      3    7          0       5.55  
   5. end
 
 . stset t, failure(dead) enter(t0)

      failure event:  dead != 0 & dead < .
 obs. time interval:  (0, t]
  enter on or after:  time t0
  exit on or before:  failure

 ------------------------------------------------------------------------------
         4  total obs.
         0  exclusions
 ------------------------------------------------------------------------------
         4  obs. remaining, representing
         3  failures in single record/single failure data
      10.5  total analysis time at risk, at risk from t =         0
                              earliest observed entry t =         0
                                   last observed exit t =         7
 
 . list
 
      +-------------------------------------------------+
      | id   t0   t   dead      x   _st   _d   _t   _t0 |
      |-------------------------------------------------|
   1. |  1    0   1      1   6.18     1    1    1     0 |
   2. |  2   .5   1      1   6.18     1    1    1    .5 |
   3. |  3    1   6      1   5.55     1    1    6     1 |
   4. |  4    3   7      0   5.55     1    0    7     3 |
      +-------------------------------------------------+
 
 . stcox x
 
          failure _d:  dead
    analysis time _t:  t
   enter on or after:  time t0
 
 Iteration 0:   log likelihood = -2.0794415
 Refining estimates:
 Iteration 0:   log likelihood = -2.0794415
 
 Cox regression -- Breslow method for ties
  
 No. of subjects =            4                     Number of obs   =         4
 No. of failures =            3
 Time at risk    =         10.5
                                                    LR chi2(1)      =      0.00
 Log likelihood  =   -2.0794415                     Prob > chi2     =    1.0000
 
 ------------------------------------------------------------------------------
           _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
            x |          1          .        .       .            .           .
 ------------------------------------------------------------------------------
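
The reported log likelihood can be reproduced by hand. Because x is constant within each risk set, it cancels from every term of the partial likelihood, so the log likelihood is the same for any value of the coefficient. With the Breslow method, each of the two tied failures at t = 1 contributes ln(1/2) (two subjects at risk), and the failure at t = 6 contributes ln(1/2) (again two subjects at risk), for a total of 3*ln(1/2) = -2.0794415. The following is a minimal sketch of that check, assuming stcox x has just been run as above and that your version of stcox stores e(ll) and e(ll_0).

 . * hand-computed partial log likelihood: three failures, each with 2 at risk
 . display 3*ln(1/2)

 . * log likelihood of the fitted model and of the model without covariates
 . display e(ll)
 . display e(ll_0)

If the two stored values agree, the covariate adds nothing to the fit, which is consistent with the LR chi2(1) of 0.00 shown above.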

Coefficients for the variables that have (any form of) collinearity cannot be estimated. Leaving them in or deleting them from the model results in the same likelihood value and does not alter the results for the noncollinear variables.

Although the first three forms of collinearity can be easily assessed, the fourth requires that the appropriate risk sets be formed. The program st_rpool, written by Bill Gould, facilitates this task and can be downloaded from Stata’s website.

To obtain st_rpool, type in Stata:

 . net from http://www.stata.com 
 . net cd users/wgould          
 . net describe st_rpool
 . net install st_rpool

or

from the Help menu, select SJ and community-contributed programs,
click on Other locations,
click on users,
click on wgould,
click on st_rpool,
and finally click on “click here to install”.

Let’s use st_rpool to look at the values of the covariate x in the risk sets:

 . clear

 . input id t0 t dead x  

             id         t0          t       dead          x
   1.      1      0    1          1       6.18  
   2.      2      0.5  1          1       6.18  
   3.      3      1    6          1       5.55  
   4.      4      3    7          0       5.55  
   5. end

 . stset t, failure(dead) enter(t0)

      failure event:  dead != 0 & dead < .
 obs. time interval:  (0, t]
  enter on or after:  time t0
  exit on or before:  failure

 ------------------------------------------------------------------------------
         4  total obs.
         0  exclusions
 ------------------------------------------------------------------------------
         4  obs. remaining, representing
         3  failures in single record/single failure data
      10.5  total analysis time at risk, at risk from t =         0
                              earliest observed entry t =         0
                                   last observed exit t =         7
 
 . list
 
      +-------------------------------------------------+
      | id   t0   t   dead      x   _st   _d   _t   _t0 |
      |-------------------------------------------------|
   1. |  1    0   1      1   6.18     1    1    1     0 |
   2. |  2   .5   1      1   6.18     1    1    1    .5 |
   3. |  3    1   6      1   5.55     1    1    6     1 |
   4. |  4    3   7      0   5.55     1    0    7     3 |
      +-------------------------------------------------+

 . st_rpool set

 . sort set id

 . list, sepby(set)

      +--------------------------------------+
      | id   t0      x   _d   _t   _t0   set |
      |--------------------------------------|
   1. |  1    0   6.18    1    1     0     1 |
   2. |  2   .5   6.18    1    1    .5     1 |
      |--------------------------------------|
   3. |  3    1   5.55    1    6     1     2 |
   4. |  4    3   5.55    0    7     3     2 |
      +--------------------------------------+
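
In the listing, x equals 6.18 for both subjects in risk set 1 and 5.55 for both subjects in risk set 2, so x never varies within a risk set even though it varies overall. With more risk sets than can be inspected by eye, the same check can be automated. The following is a minimal sketch, assuming the variable set created by st_rpool above. The name x_varies is hypothetical and chosen only for illustration.

 . * within each risk set, sort by x and compare its smallest and largest values
 . bysort set (x): generate byte x_varies = x[1] != x[_N]

 . * x_varies is 0 in every risk set when x shows this form of collinearity
 . tabulate set x_varies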
