
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Interval censoring using intcens

From	Patrick Munywoki <[email protected]>
To	[email protected]
Subject	Re: st: Interval censoring using intcens
Date	Wed, 1 Aug 2012 11:58:21 +0100

Many thanks for the suggestions.

The main problem in my dataset is that I do not have an exact date/time for
when the study participants started or stopped shedding the respiratory
virus of interest. Note that I sampled participants twice a week, hence there
are intervals of 3 to 4 days (longer where a sample was not collected)
between sample collections for all the participants. Any further ideas on
how to analyse these data are welcome.

I am currently thinking of using imputation techniques to determine when
the infection episodes started and ended before I proceed with the survival
analysis. Your thoughts on this approach are also welcome.

Thanks
Patrick

On 29 July 2012 13:02, <[email protected]> wrote:
>
> Steve Samuels provided very good advice. Some other reflections from me:
>
> -intcens- (on SSC) is a program that fits parametric _continuous_
> survival time distributions to interval-censored survival time data
> (a.k.a. grouped or discrete time data). The program doesn't allow
> time-varying covariates. It has one row per spell/obs -- convenient for
> the maximisation by -ml-.
>
> I'm not sure that -stpm- (which you ask about) is appropriate for
> interval-censored data. I would check further if I were you. (If it is,
> then also check out -stpm2- which is more flexible and faster. Use
> -findit- to get latest version -- it's from SJ or SSC.)
>
> You could think more generally about models for interval-censored data
> -- see the MS and lessons off my survival analysis webpages (URL below)
> for discussion and references.  This shows how you can fit models which
> make no assumption about the shape of the underlying survival time
> distribution. (You can assume shapes for the interval-hazard if you
> wish; but can also assume interval-specific values if you wish and your
> data allow it.) And time-varying covariates can be easily incorporated.
>
> More complicated is what to do with multiple spells. (You don't mention
> them explicitly, but it sounds as if you have them according to your
> description.)   The key issue is non-independence across spells from the
> same person. Steve Samuels remarked on this and suggested clustering the
> standard errors (persons as clusters). An alternative is to assume some
> parametric form for the individual-specific effect that generates the
> non-independence across spells from the same person -- this is 'frailty'
> a.k.a. 'unobserved heterogeneity'. The most straightforward way of
> handling this would be:
> * Reorganise (expand) your data so that you have one row in data set for
> each interval that each person is at risk of infection, and create an
> event occurrence indicator y_it for person i and interval t (see my
> Lessons)
> * Create any time-varying covariates required. At minimum, this will be
> some specification for the duration dependence of the interval hazard
> * Fit a -xtcloglog- model with the binary outcome variable being y_it.
> This assumes that the person-specific frailty is normal (Gaussian). Or
> just fit a -cloglog- model if you want to ignore frailty. Either way,
> you would be fitting the interval-censored model corresponding to an
> underlying continuous time model that satisfies the proportional hazards
> assumption. (That assumption can be tested using interactions between
> explanatory variables and the variables summarising duration
> dependence.)  An alternative would be to apply -xtlogit- and -logit- to
> data organised in the same way.
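
A minimal sketch of the three steps above, on simulated data. Everything here is an assumption made for illustration: the variable names (id, spell, male, nint, event, t, y, logt), the cap of 8 intervals, the log(t) form for duration dependence, and the simulated values themselves, none of which come from the study:

```stata
* --- simulated spell-level data (hypothetical, for illustration only) ---
clear
set seed 2012
set obs 100
generate id   = _n
generate male = mod(id, 2)
* person-specific frailty and a constant per-interval hazard
generate u = rnormal(0, 0.7)
generate h = 1 - exp(-exp(-1 + u))
* two spells per person
expand 2
bysort id: generate spell = _n
* geometric spell length in intervals, censored after 8 intervals
generate dur   = 1 + floor(ln(runiform())/ln(1 - h))
generate event = dur <= 8
generate nint  = min(dur, 8)

* --- step 1: one row per interval at risk, plus event indicator y_it ---
expand nint
bysort id spell: generate t = _n
bysort id spell (t): generate y = (_n == _N) & event

* --- step 2: duration dependence (log of interval number, one choice) ---
generate logt = ln(t)

* --- step 3: discrete-time proportional hazards models ---
* ignoring frailty, standard errors clustered on person
cloglog y male logt, vce(cluster id)
* normal (Gaussian) person-specific frailty
xtset id
xtcloglog y male logt, re
```

With real data, the simulation block would be replaced by the spell-level file: one row per infection episode, holding the number of sampling intervals at risk and an indicator for whether the episode was observed to end.
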
>
> [Cf. -pgmhaz8- and -hshaz- (on SSC) which also fit discrete time
> proportional hazards models with frailty (Gamma, and discrete mass
> point, respectively), but only to single spell data.  -xtcloglog- and
> -xtlogit- work with multiple spell data because the frailty is
> integrated out numerically.]
>
> Stephen
> -------------------------------------
> Professor Stephen P. Jenkins <[email protected]>
> Department of Social Policy
> London School of Economics and Political Science
> Houghton Street, London WC2A 2AE, U.K.
> Tel: +44 (0)20 7955 6527
> Changing Fortunes: Income Mobility and Poverty Dynamics in Britain, OUP
> 2011, http://ukcatalogue.oup.com/product/9780199226436.do
> Survival Analysis using Stata:
> http://www.iser.essex.ac.uk/survival-analysis
> Downloadable papers and software: http://ideas.repec.org/e/pje7.html
>
> ----------------------------------------------------------------------
>
> Date: Sat, 28 Jul 2012 09:29:15 +0100
> From: Patrick Munywoki <[email protected]>
> Subject: st: Interval censoring using intcens
>
> Hi,
> I have been attempting to analyse interval-censored time-to-event data
> with the 'intcens' ado (Griffin et al 2006). My data arise from a
> longitudinal household-based study with nasal swab collections
> twice a week for a duration of 26 weeks, regardless of any
> symptoms. I want to estimate the duration of the infectious
> period for one of the viruses we detected. I have reduced the data to
> one observation per infection episode in order to use the 'intcens'
> command, with t0 being the date of the last positive sample and t1 the
> date of the next negative sample. I hope this conversion to a
> single observation per infection episode is alright?
>
> My questions:
> 1. How do I interpret the coefficient given in the results below?
>
> intcens t0 t1 male, dist(exp) time nolog
>
> stata output
> Exponential distribution -      log acceleration factors
>
> Uncensored               0
> Right-censored           0
> Left-censored            0
> Interval-censored      188
>
>                                 Number of obs   =        188
>                                 Wald chi2(1)    =       0.00
> Log likelihood =  -1796.982     Prob > chi2     =     0.9990
>
>            Coef.   Std. Err.        z    P>|z|   [95% Conf.   Interval]
>
> male   -.0001871    .1470683    -0.00    0.999    -.2884356    .2880615
> _cons   9.817517    .2234524    43.94    0.000     9.379558    10.25548
>
> Note that the actual interval between the dates t0 and t1 has mean (SD)
> 3.6 (0.98) days; median (IQR) 4 (3-4) days; and range 2-7 days.
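
On question 1, a rough reading of the coefficients. This assumes that -intcens- with the time option uses the usual accelerated failure-time parameterisation (as -streg, dist(exp) time- does), so that exp(_cons) is the fitted mean time for the reference group (male = 0) and exp(coefficient on male) is a time ratio. The numbers are copied from the output above; the interpretation is an assumption, not something confirmed in this thread:

```stata
* exp(_cons): fitted mean time for male = 0, in the units of t0 and t1
display exp(9.817517)
* exp(coef on male): time ratio, male = 1 relative to male = 0
display exp(-.0001871)
* the same baseline value shown as a Stata daily date, for comparison
display %td floor(exp(9.817517))
```

The first value is roughly 18,350, nowhere near the 2 to 7 day gaps reported above. One possible explanation is that t0 and t1 were supplied as raw Stata dates (days since 01jan1960) rather than as days elapsed since the start of each episode; that is worth checking before interpreting the coefficients.
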
>
>
> 2. Whenever I try any other distribution, this error message pops
> up. What could be the problem here?
> intcens t0 t1 male, dist(weib) time nolog
> initial values not feasible
> r(1400);
>
> 3. Is there an alternative to the interval-censoring method that
> allows me to use multiple records per person while still accounting
> for the interval censoring? I have tried stpm but am not sure whether
> it allows for this.
>
> I would greatly appreciate your help,
>
> Many thanks,
>
> - --
> Patrick Munywoki
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

-- 
Patrick Munywoki
