st: Competing Hazards with Multiple-Record-per-Subject Data (2)

From   Stefan Göke
Subject   st: Competing Hazards with Multiple-Record-per-Subject Data (2)
Date   Sat, 23 Jul 2011 12:13:29 +0200

Dear statalisters,

In earlier posts, the implementation of the data replication approach in
Lunn/McNeil (1995), "Applying Cox Regression to Competing Risks", was
discussed. Under the subject line above, Alex Gelber (see below) asked how to
implement this, particularly when using multiple-record-per-subject data,
unfortunately with no reply.

I used stsplit to split my data by year in order to model a linear time trend
and age covariates, and thus I am facing the same problem: stset does not work
when the data are replicated following Lunn/McNeil (1995), since at every
instant two records are attached to each value of the id variable. The
question is how to solve this stset

Given that five years have passed since Alex posted, please allow me to
repost this question; maybe even Alex has the answer and is still

Many thanks for your consideration


Stefan Göke
Ph.D. Candidate
Department of Management
University Paderborn


In earlier Statalist posts, May Boggess of StataCorp explained how to
estimate a competing hazard model using Lunn and McNeil's methods A and B.
There seems to be a problem, however, with implementing Lunn and McNeil's
Method B when using multiple-record-per-subject data.

The problem is that to implement Lunn and McNeil's Method B, you need to
replicate each subject's records so that each appears twice (in the case of two competing
hazards). In the case of multiple-record-per-subject data, when using the
stset command to stset the data, you need to specify the ID variable using
id(idname), where "idname" is the name of the ID variable. But then Stata
gives you an error message, because each ID has two records attached to it
at every instant, and the stset command with multiple-record-per-subject
data only works correctly when there is only one record for each id-time
combination.

For example, in my case, the time variable is "month," the failure variable
is "status," and the id variable is "id," and here is what happens when I
try to stset my data:

. stset month, failure(status) id(id)

id: id
failure event: status != 0 & status < .
obs. time interval: (month[_n-1], month]
exit on or before: failure

2000 total obs.
2000 multiple records at same instant PROBABLE ERROR
0 obs. remaining, representing
0 subjects
0 failures in single failure-per-subject data
0 total analysis time at risk, at risk from t = 0
earliest observed entry t = .
last observed exit t = .

Does anyone know how to address this problem? Any help would be much
appreciated.
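
One possible workaround, offered only as a sketch and not confirmed anywhere in this thread, is to give each replicated copy its own identifier so that stset sees a single record per id at each instant. The variable names dup, type, fail, and newid are hypothetical, x1 and x2 are placeholder covariates, and the sketch assumes that status is 0 on censored or non-final records and 1 or 2 for the two failure types on a subject's last record.

```stata
* Sketch: start from the original (unreplicated) multiple-record data,
* with one record per id-month combination.

* Replicate every record once; dup = 0 for originals, 1 for the copies
expand 2, generate(dup)

* Failure type that each copy is at risk for (1 or 2)
generate byte type = dup + 1

* A copy fails only if the observed failure is of its own type;
* status == 0 (censored or non-final record) gives fail = 0 in both copies
generate byte fail = (status == type)

* One identifier per subject-type pair, so id-time combinations are unique
egen long newid = group(id type)

* stset on the new identifier instead of the original id
stset month, failure(fail) id(newid)

* Cox model stratified by failure type, in the spirit of Method B
* (x1 and x2 are placeholders; interactions with type could be added
*  to let covariate effects differ by failure type)
stcox x1 x2, strata(type)
```

Because newid distinguishes the two copies of each subject, no two records share the same id and month, which is the condition the stset output above reports as violated. Whether the resulting standard errors need adjusting for the replication (for example by clustering on the original id) is left open here.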


Alexander Gelber
Ph.D. Candidate
Harvard University Department of Economics
Littauer Center
Cambridge, MA 02138
