Re: st: Interval censoring using intcens

From   Patrick Munywoki <>
Subject   Re: st: Interval censoring using intcens
Date   Wed, 1 Aug 2012 11:58:21 +0100

Many thanks for the suggestions.

The main problem in my dataset is that I do not have an exact date/time of when
the study participants either started or stopped shedding the respiratory
virus of interest. Note that I sample participants twice a week, so there are
intervals of 3 to 4 days (longer where a sample was not collected)
between sample collections for all the participants. Any further ideas on
how to analyse these data are welcome.

I am currently thinking of using imputation techniques to determine when
the infection episodes started and ended before I proceed with the survival
analysis. Your thoughts on this approach are also welcome.


On 29 July 2012 13:02, <> wrote:
> Steve Samuels provided very good advice. Some other reflections from me:
> -intcens- (on SSC) is a program that fits parametric _continuous_
> survival time distributions to interval-censored survival time data
> (a.k.a. grouped or discrete time data). The program doesn't allow
> time-varying covariates. It has one row per spell/obs -- convenient for
> the maximisation by -ml-.
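
For reference, the standard likelihood for interval-censored data can be written as below. The notation is generic and is not taken from the -intcens- documentation: S is the survivor function of whichever parametric distribution is assumed, x_i holds the covariates, and an episode known only to have ended between t0 and t1 contributes the probability of falling in that interval. Each spell supplies exactly one term, which is why a one-row-per-spell layout suits this kind of maximisation.

```latex
L_i(\beta) = S(t_{0i} \mid x_i) - S(t_{1i} \mid x_i),
\qquad
\log L(\beta) = \sum_{i=1}^{n} \log\bigl[\, S(t_{0i} \mid x_i) - S(t_{1i} \mid x_i) \,\bigr]
```
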
> I'm not sure that -stpm- (which you ask about) is appropriate for
> interval-censored data. I would check further if I were you. (If it is,
> then also check out -stpm2- which is more flexible and faster. Use
> -findit- to get the latest version -- it's from SJ or SSC.)
> You could think more generally about models for interval-censored data
> -- see the MS and lessons off my survival analysis webpages (URL below)
> for discussion and references.  This shows how you can fit models which
> make no assumption about the shape of the underlying survival time
> distribution. (You can assume shapes for the interval-hazard if you
> wish; but can also assume interval-specific values if you wish and your
> data allow it.) And time-varying covariates can be easily incorporated.
> More complicated is what to do with multiple spells. (You don't mention
> them explicitly, but it sounds as if you have them according to your
> description.)   The key issue is non-independence across spells from the
> same person. Steve Samuels remarked on this and suggested clustering the
> standard errors (persons as clusters). An alternative is to assume some
> parametric form for the individual-specific effect that generates the
> non-independence across spells from the same person -- this is 'frailty'
> a.k.a. 'unobserved heterogeneity'. The most straightforward way of handling
> this would be:
> * Reorganise (expand) your data so that you have one row in the data set for
> each interval that each person is at risk of infection, and create an
> event occurrence indicator y_it for person i and interval t (see my
> Lessons).
> * Create any time-varying covariates required. At minimum, this will be
> some specification for the duration dependence of the interval hazard.
> * Fit an -xtcloglog- model with the binary outcome variable being y_it.
> This assumes that the person-specific frailty is normal (Gaussian). Or
> just fit a -cloglog- model if you want to ignore frailty. Either way,
> you would be fitting the interval-censored model corresponding to an
> underlying continuous time model that satisfies the proportional hazards
> assumption. (That assumption can be tested using interactions between
> explanatory variables and the variables summarising duration
> dependence.)  An alternative would be to apply -xtlogit- and -logit- to data
> organised in the same way.
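
The link between the complementary log-log model and a continuous-time proportional hazards model can be made explicit with a short derivation. The symbols are assumptions introduced here for illustration: \theta_0 is the baseline hazard, H_0 its integral, a_{t-1} and a_t the boundaries of interval t, and h_t the probability of the event in interval t given survival to its start.

```latex
\theta(s \mid x) = \theta_0(s)\, e^{x'\beta}
\;\Longrightarrow\;
S(s \mid x) = \exp\bigl[-e^{x'\beta} H_0(s)\bigr],
\qquad H_0(s) = \int_0^s \theta_0(u)\, du

h_t(x) = 1 - \frac{S(a_t \mid x)}{S(a_{t-1} \mid x)}
       = 1 - \exp\bigl[-e^{x'\beta}\{H_0(a_t) - H_0(a_{t-1})\}\bigr]
       = 1 - \exp\bigl[-\exp(x'\beta + \gamma_t)\bigr]

\log\bigl[-\log\{1 - h_t(x)\}\bigr] = x'\beta + \gamma_t,
\qquad \gamma_t = \log\bigl[H_0(a_t) - H_0(a_{t-1})\bigr]
```

The \gamma_t terms are the duration dependence of the interval hazard: they can be given a functional form or estimated as interval-specific constants.
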
> [Cf. -pgmhaz8- and -hshaz- (on SSC) which also fit discrete time
> proportional hazards models with frailty (Gamma, and discrete mass
> point, respectively), but only to single spell data.  -xtcloglog- and
> -xtlogit- work with multiple spell data because the frailty is
> integrated out numerically.]
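
The three steps listed above can be sketched as follows. This is a minimal illustration on simulated data, not the poster's data: the variable names (id, spell, nint, ended, t, y, logt), the two spells per person, and the log(t) form of duration dependence are all assumptions chosen for the example.

```stata
* Hypothetical toy data: one row per infection episode (spell)
clear
set seed 12345
set obs 200
generate long id    = ceil(_n/2)               // person identifier, two spells each
generate byte male  = mod(id, 2)               // illustrative covariate
generate int  nint  = 1 + floor(6*runiform())  // number of sampling intervals at risk
generate byte ended = runiform() < 0.8         // 1 = end of episode observed, 0 = right-censored
generate long spell = _n

* Step 1: expand to one row per spell-interval and build the event indicator y_it
expand nint
bysort spell: generate int  t = _n                  // interval counter within spell
bysort spell: generate byte y = (_n == _N) & ended  // 1 only in the interval where the event occurs

* Step 2: duration dependence (assumed log(t); interval dummies i.t are another choice)
generate double logt = ln(t)

* Step 3a: ignore frailty, but cluster standard errors on person
cloglog y logt male, vce(cluster id)

* Proportional hazards check: interact the covariate with duration dependence
cloglog y c.logt##i.male, vce(cluster id)

* Step 3b: Gaussian person-specific frailty across spells
xtset id
xtcloglog y logt male, re
```

With real data, nint would come from the sampling calendar (the number of swab intervals each episode spans), and any time-varying covariates would be filled in row by row after the expand step.
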
> Stephen
> -------------------------------------
> Professor Stephen P. Jenkins <>
> Department of Social Policy
> London School of Economics and Political Science
> Houghton Street, London WC2A 2AE, U.K.
> Tel: +44 (0)20 7955 6527
> Changing Fortunes: Income Mobility and Poverty Dynamics in Britain, OUP
> 2011,
> Survival Analysis using Stata:
> Downloadable papers and software:
> ----------------------------------------------------------------------
> Date: Sat, 28 Jul 2012 09:29:15 +0100
> From: Patrick Munywoki <>
> Subject: st: Interval censoring using intcens
> Hi,
> I have been attempting to analyse interval-censored time-to-event data
> with the 'intcens' ado (Griffin et al 2006). My data arise from a
> longitudinal household-based study with nasal swab collections
> twice a week for a duration of 26 weeks, regardless of any
> symptoms. I want to be able to estimate the duration of the infectious
> period for one of the viruses we detected. I have reduced the data to
> one observation per infection episode in order to use the 'intcens'
> command, with t0 being the date of the last positive sample and t1 the
> date of the next negative sample. Is this conversion of the data to a
> single observation per infection episode all right?
> My questions:
> 1. How do I interpret the coefficient given in the results below?
> intcens t0 t1 male, dist(exp) time nolog
> Stata output:
> Exponential distribution - log acceleration factors
>
> Uncensored               0
> Right-censored           0
> Left-censored            0
> Interval-censored      188
>
>                                         Number of obs   =        188
>                                         Wald chi2(1)    =       0.00
> Log likelihood =  -1796.982             Prob > chi2     =     0.9990
>
>          |      Coef.  Std. Err.        z   P>|z|   [95% Conf.   Interval]
> ---------+----------------------------------------------------------------
>     male |  -.0001871   .1470683    -0.00   0.999    -.2884356    .2880615
>    _cons |   9.817517   .2234524    43.94   0.000     9.379558    10.25548
> Note the actual interval between the dates t0 and t1 is on average (SD)
> 3.6 (0.98) days; median (IQR) 4 (3-4) days; and range 2-7 days.
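
On question 1, one rough way to read the coefficients is to back-transform them. This assumes that -intcens- with the time option uses the usual accelerated failure-time parameterisation (the "log acceleration factors" header suggests so, but that is not confirmed here). Under that assumption, exp(_cons) is the time scale for the reference group (male == 0) in whatever units t0 and t1 are stored, and exp of the male coefficient is the ratio of time scales for males versus females. The last line is meaningful only under the further assumption that t0 and t1 were passed as Stata daily dates rather than as elapsed days; the post does not say how they are stored.

```stata
* Values copied from the output above; nothing here re-fits the model.
display exp(9.817517)      // implied time scale when male == 0, in the units of t0/t1
display exp(-.0001871)     // time ratio, male vs. female (very close to 1)

* Assumption: if t0 and t1 are daily dates (days since 01jan1960), this shows
* the calendar date that the implied time scale corresponds to.
display %td floor(exp(9.817517))
```

The first value is on the order of 18,000, far larger than the observed 2 to 7 day intervals. If the variables are indeed calendar dates, that would suggest time is being measured from the date origin rather than from the start of each episode.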
> 2. Whenever I try using any other distribution, this error message pops
> up. What could be the problem here?
> intcens t0 t1 male, dist(weib) time nolog
> initial values not feasible
> r(1400);
> 3. Is there an alternative method that
> allows me to use the multiple records per person while accounting for the
> interval censoring? I have tried stpm but am not sure whether it allows
> for this.
> I would greatly appreciate your help,
> Many thanks,
> - --
> Patrick Munywoki

Patrick Munywoki