The Stata listserver

st: using stcox with clustered data

From   "Seidel, Kristy" <>
To   "''" <>
Subject   st: using stcox with clustered data
Date   Tue, 1 Oct 2002 11:30:47 -0700


I have been asked to come up with a plan for analyzing an unusual clinical
data set that consists of paired time-to-event observations.  The setting is
an infant intensive care unit (IICU).  The research question is:  which of
two topical medications is better at shortening the duration of diaper rash
in IICU patients. Because there is so much variability in the types of
problems that infants are admitted to the IICU with, the investigators wish
to use a paired design in which the nursing staff will use medication A on
one side of the infant's rear and medication B on the other side.  Each
subject would then contribute two observations, time to resolution of rash
on side A and time to resolution on side B.  The times to resolution within
an individual are likely to be strongly correlated. They also expect a fair
degree of censoring in this setting -- up to half of the study subjects may
be discharged from the IICU prior to resolution of rash on one or both sides.

I initially thought that the -stcox- command with the -robust- and
-cluster()- options would work in this setting, but as I delve deeper into
it, I'm not 100% sure that is appropriate.  The Stata manual examples
suggest that the -cluster()- option in -stcox- is intended mainly to allow
proper handling of data sets with multiple records representing successive
time intervals on individual subjects.  The data in the IICU study proposed
above would have multiple records representing parallel time intervals on
individual subjects.  Can anyone tell me whether it is also appropriate to
use -stcox- with -robust cluster()- in the latter situation?

This study has not been carried out yet, and right now I am just trying to
do some power/sample size calculations via Monte Carlo simulation.  I've
created a program to simulate correlated time-to-resolution observation
pairs and applied the following analysis strategy to the simulated data:

stset resday, failure(res_i) time0(timezero)

stcox treatmnt, cluster(idnum) robust


where

  resday = time of resolution (or time of censoring, if resolution did not
occur before discharge from IICU)

  res_i = resolution indicator variable

  treatmnt = treatment indicator variable

  idnum = id variable for subject (each id has 2 records, one for the side
that got treatment A and one for the side that got B)

  timezero = a dummy variable set to 0 for all records (I'm not sure I need
this, but it seems to clarify that the two records per person are parallel
time periods, not successive intervals.  Also, I found that I could not use
the -id(idnum)- option in the -stset- step because it would force the two
records per person to be interpreted as successive intervals, regardless of
the -time0- option.)
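
One way to check how -stset- has interpreted the two records per subject is
to list the st variables it creates.  This is only a sketch and assumes the
variables defined above are already in memory.  For parallel records, both
rows for a subject should show _t0 = 0; for successive intervals, _t0 on the
second row would equal _t on the first.  As far as I know, -stset- without
-id()- sets _t0 to 0 by default, so the -time0(timezero)- option may be
redundant, but that is worth confirming against the listing.

```stata
* Sketch: assumes resday, res_i, timezero, idnum and treatmnt already exist
stset resday, failure(res_i) time0(timezero)

* _t0 = entry time, _t = exit time, _d = failure indicator (created by stset)
sort idnum treatmnt
list idnum treatmnt _t0 _t _d in 1/6, sepby(idnum)
```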

My simulation program with this code appears to work fine and it suggests
that the sample size needed to achieve good power with the paired design is
far smaller than that required with a two-independent-groups design (as
expected).  However, I cannot find any examples using -stcox- options in
this way and that makes me wonder whether I have been naive in equating the
-robust- option for Cox regression with the corresponding option from
-xtgee-.  Any comments or suggestions would be most welcome.
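
For anyone who wants to experiment, here is a minimal sketch of this kind of
power simulation.  It is not the program described above.  The program name
simpair and its options n(), hr(), theta() and disch() are hypothetical, and
the data-generating model (exponential resolution times, a gamma frailty
shared by both sides of an infant, one exponential discharge time per infant,
a baseline hazard of 0.1 per day) is an assumption chosen purely for
illustration.  It also assumes a Stata release that provides runiform() and
rgamma().

```stata
* Sketch only: simpair and its options are hypothetical names, and the
* frailty/exponential model is an assumption, not the study's actual design.
capture program drop simpair
program define simpair, rclass
    syntax [, n(integer 40) hr(real 1.5) theta(real 1) disch(real 10)]
    drop _all
    set obs `n'
    generate long idnum = _n
    * subject-level frailty (mean 1, variance theta) shared by both sides
    generate double frail = rgamma(1/`theta', `theta')
    * one discharge (censoring) time per infant, exponential with mean disch
    generate double tdisch = -`disch' * ln(runiform())
    * two records per infant: treatmnt = 0 (A) and treatmnt = 1 (B)
    expand 2
    bysort idnum: generate byte treatmnt = _n - 1
    * exponential time to resolution; baseline hazard 0.1/day is arbitrary
    generate double tres = -ln(runiform()) / (0.1 * frail * exp(ln(`hr') * treatmnt))
    generate double resday = min(tres, tdisch)
    generate byte res_i = tres <= tdisch
    generate byte timezero = 0
    * same analysis strategy as in the post
    quietly stset resday, failure(res_i) time0(timezero)
    quietly stcox treatmnt, cluster(idnum) robust
    * two-sided Wald p-value based on the robust standard error
    return scalar p = 2 * normal(-abs(_b[treatmnt] / _se[treatmnt]))
end

set seed 12345
simulate p = r(p), reps(500) nodots: simpair, n(40) hr(1.5)
count if p < 0.05
display "estimated power = " r(N) / _N
```

Rerunning with hr(1) gives the empirical type I error rate of the robust
test under the same assumed model, which is a useful check on whether the
clustered variance estimate behaves sensibly with paired records.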


Kristy Seidel
Clinical Research Center/Research Administration
Children's Hospital and Regional Medical Center
Seattle, WA
