st: Coding time-varying variables in Cox regression

From: Allan Garland
To: statalist@hsphsun2.harvard.edu
Subject: st: Coding time-varying variables in Cox regression
Date: Mon, 09 Jun 2008 13:08:45 -0500

```
I have a question about coding a Cox survival model with time-varying covariates (i.e. they vary with time but are not a function of time itself). Here is the structure of my data, which concern medical treatment:

For each patient, time after study entry (t=0) is divided into 5 epochs: 0-1 month, 1-6, 6-12, 12-24, and >24 months. For each of the first 4 of those 5 epochs there is a separate variable representing the amount of a treatment given (a different treatment in each of the 4 epochs). These treatment variables are continuous, not categorical; let us call them TREAT1, TREAT2, TREAT3, and TREAT4.

I understand how to use -stsplit- to divide each patient into up to 5 records, each representing a different epoch. Here is some sample data:

Before -stsplit-, the data look like this for 3 patients:

id   time   died   TREAT1   TREAT2   TREAT3   TREAT4
----------------------------------------------------
 1     42      0        5       17       23        8
 2     56      1        9        3       22       16
 3     12      1        7       11        6        .

Now consider patient 3, who after -stsplit- will have 3 records:

id   _t0   _t   _d   TREAT1   TREAT2   TREAT3   TREAT4
------------------------------------------------------
 3     0    1    0        7       11        6        .
 3     1    6    0        7       11        6        .
 3     6   12    1        7       11        6        .

For the first of those 3 records (covering 0-1 months after study entry), the TREAT1 variable is meaningful, but none of TREAT2-TREAT4 has any meaning yet. Similarly, in his 2nd record (for months 1-6) he receives TREAT2, and TREAT1 does not really apply to the 2nd epoch. In addition, within that 2nd record, neither TREAT3 nor TREAT4 has any meaning.

So my 3 questions, using the 2nd record for this patient as an example, are:

1) How do I code TREAT1 for this 2nd record? It does not really apply to the 1-6 month time interval. Should it be given the same value (=7) as in the 1st record for this patient, or should it be set to zero or to missing for this record?

2) How do I code TREAT3 for this 2nd record? It does not yet have any meaning, since the 2nd record covers events that occurred before the 3rd epoch (when he receives TREAT3). Should it be set to zero or to missing? If missing, I am concerned that Stata will drop this record, since TREAT3 is included as a covariate in the -stcox- statement.

3) How do I code TREAT4 for this record (and for all records for this patient)? Since he died in the 12th month, he never got to the point of receiving TREAT4. So it makes most sense to leave it as missing, but then I am concerned that Stata will drop this record, since TREAT4 is included as a covariate in the -stcox- statement.

My best guess is that in the second record for patient 3 I should have TREAT2 as is (=11) but set TREAT1, TREAT3 and TREAT4 equal to 0.  But I am not sure.

I am told that in SAS you can explicitly tell it which covariates apply to which time intervals, but I do not see a way to do that in Stata.

All help appreciated.

Allan

```
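
To make the set-up in the post concrete, here is a minimal Stata sketch that rebuilds the three sample patients, splits them with -stsplit- at 1, 6, 12 and 24 months, and then applies the poster's own "best guess" coding (each treatment keeps its value only in its own epoch and is 0 elsewhere). The names `epoch` and `treat1_tv` to `treat4_tv` are hypothetical and do not appear in the original message. The sketch only illustrates that guess; whether zero is the right value outside a treatment's epoch is exactly the question the post leaves open.

```stata
* Rebuild the three sample patients from the post
clear
input id time died TREAT1 TREAT2 TREAT3 TREAT4
1 42 0 5 17 23 8
2 56 1 9 3 22 16
3 12 1 7 11 6 .
end

* Declare survival data; id() is required before -stsplit-
stset time, failure(died) id(id)

* Split each patient at the epoch boundaries; the new variable
* "epoch" (hypothetical name) holds each interval's lower cutpoint
stsplit epoch, at(1 6 12 24)

* Poster's "best guess": a treatment counts only in its own epoch.
* cond() returns 0 outside the epoch even when TREATk is missing,
* so no record is lost to a missing covariate.
generate treat1_tv = cond(epoch == 0,  TREAT1, 0)
generate treat2_tv = cond(epoch == 1,  TREAT2, 0)
generate treat3_tv = cond(epoch == 6,  TREAT3, 0)
generate treat4_tv = cond(epoch == 12, TREAT4, 0)

list id _t0 _t _d epoch treat1_tv-treat4_tv, sepby(id)
```

In this sketch, patient 3's second record (months 1-6) ends up with `treat2_tv` equal to 11 and the other three variables equal to 0, which matches the guess described in the post. On the full dataset the four new variables would then be passed to -stcox- in place of TREAT1-TREAT4; the three-patient example is far too small to fit a model.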