
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: QUERY: unrealistic results with stcox (Multiple-record-per-subject survival data)


From   Alessandro Marcon <[email protected]>
To   [email protected]
Subject   st: QUERY: unrealistic results with stcox (Multiple-record-per-subject survival data)
Date   Wed, 17 Jul 2013 11:31:36 +0200



Dear All,

I have repeated cross-sectional (panel) data in which "id_pz" is the
patient's unique id, "year" ranges from 2009 to 2012, and the event of
interest is "decesso_", which stands for death.
The times of entering and exiting the study are "t_enter" and "t_exit",
respectively.

       +--------------------------------------------+
       | id_pz   year   decesso_   t_enter   t_exit |
       |--------------------------------------------|
 1249. |   388   2009          0     17898    18262 |
 1250. |   388   2010          0     18263    18627 |
 1251. |   388   2011          0     18628    18992 |
 1252. |   388   2012          1     18993    19152 |
       |--------------------------------------------|
 1253. |   389   2009          0     17898    18262 |
 1254. |   389   2010          1     18263    18546 |
       |--------------------------------------------|
 1255. |   390   2012          0     18993    19358 |
       |--------------------------------------------|
 1256. |   391   2009          0     17898    18262 |
 1257. |   391   2010          0     18263    18627 |
 1258. |   391   2011          0     18628    18992 |
 1259. |   391   2012          0     18993    19358 |
       |--------------------------------------------|
 1260. |   392   2009          0     17898    18262 |
 1261. |   392   2010          0     18263    18627 |
 1262. |   392   2011          0     18628    18992 |
 1263. |   392   2012          0     18993    19358 |
       |--------------------------------------------|

Patients can be observed in one or more of the 4 years of observation
(2009-2012), like this:

. xtdescribe

   id_pz:  1, 2, ..., 10998                                  n =      10998
    year:  2009, 2010, ..., 2012                             T =          4
           Delta(year) = 1 unit
           Span(year)  = 4 periods
           (id_pz*year uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                         1       1       2         4         4       4       4

     Freq.  Percent    Cum. |  Pattern
 ---------------------------+---------
     6809     61.91   61.91 |  1111
     1017      9.25   71.16 |  ...1
      810      7.36   78.52 |  ..11
      520      4.73   83.25 |  .111
      432      3.93   87.18 |  111.
      428      3.89   91.07 |  1...
      361      3.28   94.35 |  11..
      296      2.69   97.04 |  ..1.
      183      1.66   98.71 |  .1..
      142      1.29  100.00 | (other patterns)
 ---------------------------+---------
    10998    100.00         |  XXXX


I want to analyse survival of this dynamic cohort. Since I have
"Multiple-record-per-subject survival data", I stset my data like this:

stset t_exit, id(id_pz) failure(decesso_==1) origin(time t_enter) scale(365)
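
A quick way to see how this declaration maps each record onto analysis time is to list the variables that stset creates (_t0, _t, _d and _st). The sketch below is only an illustration: it re-enters by hand the fifteen records of patients 388 to 392 from the listing above instead of using the full dataset, and the choice of variables to list is an assumption about what is worth inspecting.

```stata
* Illustrative sketch: rebuild the 15 listed records (patients 388-392) by hand
clear
input id_pz year decesso_ t_enter t_exit
388 2009 0 17898 18262
388 2010 0 18263 18627
388 2011 0 18628 18992
388 2012 1 18993 19152
389 2009 0 17898 18262
389 2010 1 18263 18546
390 2012 0 18993 19358
391 2009 0 17898 18262
391 2010 0 18263 18627
391 2011 0 18628 18992
391 2012 0 18993 19358
392 2009 0 17898 18262
392 2010 0 18263 18627
392 2011 0 18628 18992
392 2012 0 18993 19358
end

* Show entry and exit as calendar dates (stored as days since 01jan1960)
format t_enter t_exit %td

* Same declaration as in the post
stset t_exit, id(id_pz) failure(decesso_==1) origin(time t_enter) scale(365)

* Inspect the interval (_t0, _t], the failure flag _d and the inclusion flag _st
list id_pz year t_enter t_exit _t0 _t _d _st, sepby(id_pz)
```

If _t0 and _t restart near 0 in every record, the analysis time is time within the calendar year; if they accumulate across a patient's records, it is time since that patient's first entry.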

This is what I get when computing annual rates and fitting a Cox model:

. strate year, per(1000)

         failure _d:  decesso_ == 1
   analysis time _t:  (t_exit-origin)/365
             origin:  time t_enter
                 id:  id_pz

Estimated rates (per 1000) and lower/upper bounds of 95% confidence intervals
(34673 records included in the analysis)

  +------------------------------------------------+
  | year     D        Y     Rate    Lower    Upper |
  |------------------------------------------------|
  | 2009   219   7.9698   27.479   24.070   31.370 |
  | 2010   240   8.2815   28.980   25.536   32.889 |
  | 2011   281   8.8487   31.756   28.252   35.695 |
  | 2012   275   9.1627   30.013   26.667   33.778 |
  +------------------------------------------------+
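
As a check on how to read this table: with per(1000), Y is shown in thousands of person-years, so each rate is simply D/Y. The lines below redo that arithmetic from the displayed (rounded) figures, so small differences in the last decimal are possible. They are a sanity check, not part of the original analysis.

```stata
* Rates per 1000 person-years recomputed as D / Y (Y in thousands of person-years)
display "2009: " %7.3f 219/7.9698
display "2010: " %7.3f 240/8.2815
display "2011: " %7.3f 281/8.8487
display "2012: " %7.3f 275/9.1627

* Total person-years implied by the four rows of the table
display "Total Y: " %9.1f 1000*(7.9698 + 8.2815 + 8.8487 + 9.1627)
```

The total comes to about 34,262.7 person-years, in line with the "Time at risk" that stcox reports for the same stset declaration.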


. stcox i.year

         failure _d:  decesso_ == 1
   analysis time _t:  (t_exit-origin)/365
             origin:  time t_enter
                 id:  id_pz

Iteration 0:   log likelihood =  -9195.495
Iteration 1:   log likelihood = -9194.8578
Iteration 2:   log likelihood = -9194.8575
Iteration 3:   log likelihood = -9194.8575
Refining estimates:
Iteration 0:   log likelihood = -9194.8575

Cox regression -- Breslow method for ties

No. of subjects =        10998                     Number of obs   =     34673
No. of failures =         1015
Time at risk    =  34262.67945
                                                   LR chi2(3)      =      1.28
Log likelihood  =   -9194.8575                     Prob > chi2     =    0.7350

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        year |
       2010  |   1.160537   .1907743     0.91   0.365     .8408812    1.601708
       2011  |   .9716829    .153747    -0.18   0.856     .7125922    1.324976
       2012  |   1.028494   .1607485     0.18   0.857     .7571171    1.397141
------------------------------------------------------------------------------



HOWEVER, when I repeat this analysis including only patients who entered
the study in 2009 (baseline==1), this is what I get:


. stset t_exit if baseline==1, id(id_pz) failure(decesso_==1) origin(time t_enter) scale(365)

                id:  id_pz
     failure event:  decesso_ == 1
obs. time interval:  (t_exit[_n-1], t_exit]
 exit on or before:  failure
    t for analysis:  (time-origin)/365
            origin:  time t_enter
            if exp:  baseline==1

------------------------------------------------------------------------------
    34673  total obs.
     4848  ignored at outset because of -if <exp>-
------------------------------------------------------------------------------
    29825  obs. remaining, representing
     8086  subjects
      879  failures in single failure-per-subject data
 29477.81  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =         4

. strate year , per(1000)

         failure _d:  decesso_ == 1
   analysis time _t:  (t_exit-origin)/365
             origin:  time t_enter
                 id:  id_pz

Estimated rates (per 1000) and lower/upper bounds of 95% confidence intervals
(29825 records included in the analysis)

  +------------------------------------------------+
  | year     D        Y     Rate    Lower    Upper |
  |------------------------------------------------|
  | 2009   219   7.9698   27.479   24.070   31.370 |
  | 2010   212   7.5069   28.241   24.684   32.310 |
  | 2011   242   7.1824   33.694   29.705   38.218 |
  | 2012   206   6.8188   30.211   26.355   34.631 |
  +------------------------------------------------+


. stcox i.year

         failure _d:  decesso_ == 1
   analysis time _t:  (t_exit-origin)/365
             origin:  time t_enter
                 id:  id_pz

Iteration 0:   log likelihood = -7824.2973
Iteration 1:   log likelihood = -7822.9333
Iteration 2:   log likelihood = -7822.3438
Iteration 3:   log likelihood = -7822.0976
Iteration 4:   log likelihood =      -7822
Iteration 5:   log likelihood = -7821.9629
Iteration 6:   log likelihood =  -7821.949
[omitted]
Iteration 26:  log likelihood = -7821.7751  (backed up)
Refining estimates:
Iteration 0:   log likelihood = -7821.9409
Iteration 1:   log likelihood = -7821.9409
Iteration 2:   log likelihood = -7821.9408
[omitted]
Iteration 19:  log likelihood = -7821.3159  (backed up)

Cox regression -- Breslow method for ties

No. of subjects =         8086                     Number of obs   =     29825
No. of failures =          879
Time at risk    =   29477.8137
                                                   LR chi2(2)      =      5.96
Log likelihood  =   -7821.3159                     Prob > chi2     =    0.0507

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        year |
       2010  |   1.81e+24   1.01e+30     0.00   1.000            0           .
       2011  |   1.26e+13   4.58e+18     0.00   1.000            0           .
       2012  |   73.05721          .        .       .            .           .
------------------------------------------------------------------------------


The number of events and the person-time in 2009 seem to be large enough,
and the rates are still reasonable, but the COX MODEL SEEMS TO FAIL: the
iterations back up and the hazard ratios are implausibly large.
Any help would be very welcome!

Best regards,
Alessandro Marcon


--

Alessandro Marcon, PhD
Unit of Epidemiology & Medical Statistics
Department of Public Health and Community Medicine
University of Verona
Strada Le Grazie 8, 37134 Verona, Italy
tel. +39 045 8027668 fax +39 045 8027154






