st: Re: Suggestion on Cox and left truncation

From   Antoine Terracol <[email protected]>
To   [email protected], [email protected]
Subject   st: Re: Suggestion on Cox and left truncation
Date   Thu, 14 Jan 2010 13:32:27 +0100

Copy again to Statalist

I'm afraid there is not much you can do, since your design does not record censored durations. Consequently, only the shortest spells are observed (I abstract from the left truncation problem). Running -stcox- on your data will give biased estimates, with the size of the bias depending on the length of a typical wait relative to the length of the observation window (i.e. on the number of individuals excluded from the dataset by your design).

In the following example, I draw from a Weibull distribution with a coefficient of 1 for the explanatory variable. The beginning date is uniform between zero and a scalar named "span" (span=2 in the example below). The end date is begin+duration. If the end date is greater than "span", I censor the duration accordingly. I then estimate a Cox model with and without the censored observations.

As you can see, the second estimation gives biased estimates, while the first does not (remember the true parameter is 1). You can modify the value of "span" to see how the bias varies as the observation window gets bigger while the data generating process is kept constant.


* code to draw from a Weibull
* with shape parameter alpha
* and scale parameter lambda

cap prog drop draweib
program define draweib
syntax newvarlist  [if] [in] , LAmbda(string) ALpha(string) [double]
tokenize "`varlist'"
while "`1'"!="" {
   * temporary variable holding lambda^(1/alpha)
   tempvar vlambda
   gen `vlambda' = exp(ln(`lambda')/`alpha')
   * inverse-CDF draw: T = (ln(1/U))^(1/alpha) / lambda^(1/alpha)
   g `double' `1' =((log(1/runiform()))^(1/`alpha'))/`vlambda' `if' `in'
   mac shift
}
end

clear
set obs 10000
g x=runiform()
draweib dur, alpha(1) lambda(exp(x))

scalar span=2
g beg=runiform()*span
g end=beg+dur
g fail=end<=span
replace end=span if end>span
replace dur=end-beg

stset dur, f(fail)
stcox x, nohr

keep if fail
stcox x, nohr
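
To see how the bias varies with the window length, the steps above can be wrapped in a loop. This is only a sketch: the span values and the seed are arbitrary choices of mine, and the durations are drawn directly as exponentials (a Weibull with alpha=1) instead of calling -draweib-, so that the block runs on its own.

```stata
* Sketch: repeat the simulation for several window lengths.
* The span values and the seed are arbitrary (assumptions, not from the thread).
set seed 12345
foreach s in 1 2 5 10 {
    clear
    quietly set obs 10000
    g x = runiform()
    * exponential durations with hazard exp(x), i.e. true coefficient 1
    g dur = -ln(runiform())/exp(x)
    g beg = runiform()*`s'
    g end = beg + dur
    g fail = end <= `s'
    * censor spells that run past the end of the window
    quietly replace dur = `s' - beg if !fail
    quietly stset dur, f(fail)
    quietly stcox x, nohr
    local b_all = _b[x]
    * keep completed spells only, as in the design discussed here
    quietly keep if fail
    quietly stcox x, nohr
    display "span = `s': all obs b = " %6.3f `b_all' ", completed spells only b = " %6.3f _b[x]
}
```

Each line of output compares the coefficient estimated with the censored spells included to the one estimated on completed spells only; the true value is 1 in both cases.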


[email protected] wrote:
Quoting Antoine Terracol <[email protected]>:

Dear Antoine,
thanks a lot for your reply.

Our study records only the surgeries performed in the observation window, i.e. between 2006 and 2008. This is because hospital statistics first record the date of surgery and then add the date of registration. We do not have a registration date without the date of surgery. So, by study design, all individuals have already been through surgery by the end of the observation window.

Accordingly, the following types of individuals are included in our sample: 0,1,3 while

 (2) diagnosed<2006, surgery>2008 => NOT observed
 (4) diagnosed>2006, surgery>2008 => NOT observed

So, what do you think?

Many thanks.


Dear Giuliana,

I'm copying this reply to the Statalist for anyone to comment on it.

Let me rephrase your setup to see if I got it right.

All observed exits take place between 2006 and 2008. Some individuals
are diagnosed after 2006, some before. I assume that some individuals do
not exit (i.e. have not yet been through surgery at the end of the
observation window). The following types of individuals can be defined:

(0) diagnosed<2006, surgery<2006 => not observed
(1) diagnosed<2006, surgery in [2006,2008] => observed, left-truncated
with exit
(2) diagnosed<2006, surgery>2008 => observed, left-truncated and
right-censored
(3) diagnosed>2006, surgery in [2006,2008] => observed, no left-truncation,
with exit
(4) diagnosed>2006, surgery>2008 => observed, no left-truncation but
right-censored

If your design allows types (1) to (4) to be included in your dataset,
then your -stset- looks ok, although I think there is no need for the
-time0()- option.
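
As a rough sketch of that case, using the variable names from the original question on a made-up four-row dataset (one row per type (1) to (4)): I assume here that date_of_surgery holds the end of the window for patients not yet operated and that surgery is a 0/1 indicator, neither of which is stated in the thread.

```stata
* Toy data, invented for illustration only: one row per type (1)-(4)
clear
input str9 reg str9 surg byte surgery
01mar2005 15feb2006 1
01jun2005 31dec2008 0
01mar2006 10sep2006 1
01jun2008 31dec2008 0
end
gen date_of_registration = date(reg, "DMY")
gen date_of_surgery = date(surg, "DMY")
format date_of_registration date_of_surgery %td

* origin() sets time zero at registration; enter() delays entry into the
* risk set until 1 Jan 2006 for those registered earlier (left truncation)
stset date_of_surgery, origin(time date_of_registration) ///
    enter(time mdy(1,1,2006)) failure(surgery)
list date_of_registration date_of_surgery _t0 _t _d
```

In the listing, _t0 is positive for the patients registered before 2006 and zero for the others, which is the delayed entry that -enter()- is meant to produce.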

If your design is such that type-(2) individuals cannot be included in
your dataset (for example because you record only the registrations or
surgeries performed in the observation window), then individuals
diagnosed before 2006 will be observed because their spells are long
enough to end after 2006, but short enough to end before 2008. In this
case your sample will be biased, and I see no easy way to correct the
likelihood within the -st- suite. In this case I would drop the
individuals diagnosed<2006, and -stset- the data without the -enter()-
option.
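
A minimal sketch of that suggestion, again with the variable names from the original question and a made-up three-row dataset in which every patient has been operated (surgery is assumed to be a 0/1 indicator). It only removes the left-truncated spells; it does nothing about censored durations that are never recorded.

```stata
* Toy data, invented for illustration only: completed spells only
clear
input str9 reg str9 surg byte surgery
01mar2005 15feb2006 1
01mar2006 10sep2006 1
01feb2007 20jan2008 1
end
gen date_of_registration = date(reg, "DMY")
gen date_of_surgery = date(surg, "DMY")
format date_of_registration date_of_surgery %td

* drop individuals registered before the observation window opens
drop if date_of_registration < mdy(1,1,2006)

* no enter() needed: everyone left is observed from registration onwards
stset date_of_surgery, origin(time date_of_registration) failure(surgery)
list date_of_registration date_of_surgery _t0 _t _d
```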


[email protected] wrote:
Dear Dr Terracol,
I would like to ask something about my work after having seen some of your comments on the Statalist forum.

I am studying the effect of education on waiting times for
elective surgery using hospital individual-level data and applying Cox
estimation. Dates of surgery are observed between 2006 and 2008. I have the
following key variables:

date of registration (onset of the risk)
date of surgery.

So timeatrisk=date of surgery - date of registration (i.e. waiting time)

However, some individuals became at risk before 2006 (the start of
our observation window), i.e. the date of registration is before 2006. This
is because our study design is retrospective. How can I treat such
individuals when I -stset- the data to perform Cox regression? Is this a case of
left truncation?

I thought of the following:
stset date_of_surgery, origin(date_of_registration) enter(time
mdy(1,1,2006)) failure(surgery) time0(date_of_registration)

Thank you very much for your help.
Kind Regards.

Giuliana De Luca

This message was sent using IMP, the Internet Messaging Program.
