Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Re: Interval censored survival model


From   plumsh <plumsh119@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: Interval censored survival model
Date   Fri, 25 Jan 2013 11:22:10 -0600

Thank you very much for responding. I'm involved in the research that
produced the original question so my response is to the point.

Two questions:
1) I guess my issue with INTCENS boils down to a technicality, namely
data formatting for intcens (searching statalist gives some hints but
I'd very much like to verify).
Again, suppose that observations on the same land parcel are recorded
on, say, Jan 1 of 1980, 1997, 2005, and 2010 (same dates for all
parcels in the sample). Say the intervals (t_0, t_1) are (1,8),
(8,16), and (16,21). [not sure if counting from 1 is necessary but
intcens ignores st settings] Should the data be in the following form
then:

id (land parcel)    t_0    t_1    event
1                   1      8      0
1                   8      16     0
1                   16     21     1
2                   1      8      0
2                   8      16     1
2                   16     21     0
3                   1      8      0
3                   8      16     0
3                   16     21     0
(event: 0 = stays as farmland, 1 = converted to housing)
As you see, parcel 1 gets converted in the third interval, parcel 2 in
the second, and parcel 3 does not get converted and is censored at
t=21 (end of the third period).

With the data in this form, is it OK to run the following:

. intcens t_0  t_1  flood, dist(*)

where FLOOD is floodplain level classification (i.e., time invariant).
Will add more covariates of course.

Knowing if I'm correct with this specification would make my day.
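
For comparison, here is a minimal Python sketch of the alternative
one-record-per-parcel layout, in which each parcel carries only the
bounds of the interval that contains its conversion. Whether intcens
wants this layout is an assumption (it would fit with its lack of
time-varying predictors, but the help file is the authority), as is
the use of a missing upper bound (None here) to mark a parcel that was
never seen to convert. The function name collapse is hypothetical.

```python
# Long-format rows copied from the table above: (id, t_0, t_1, event)
long_rows = [
    (1, 1, 8, 0), (1, 8, 16, 0), (1, 16, 21, 1),
    (2, 1, 8, 0), (2, 8, 16, 1), (2, 16, 21, 0),
    (3, 1, 8, 0), (3, 8, 16, 0), (3, 16, 21, 0),
]


def collapse(rows):
    """Collapse parcel-interval rows to {id: (lower, upper)}.

    (lower, upper) brackets the conversion time. upper is None when the
    parcel was still farmland at its last observation (assumed coding
    for right censoring; not taken from the intcens help file).
    """
    out = {}
    for pid, t0, t1, event in sorted(rows):
        if pid in out and out[pid][1] is not None:
            # Already converted: later rows carry no further information.
            continue
        if event == 1:
            out[pid] = (t0, t1)      # converted somewhere in (t0, t1]
        else:
            out[pid] = (t1, None)    # still farmland at t1
    return out


wide = collapse(long_rows)
for pid, (lower, upper) in wide.items():
    print(pid, lower, upper)
```

In this layout parcel 2's third row (16, 21, 0) disappears, since a
parcel that has already converted is no longer at risk.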


2) Regarding the reference to pgmhaz(8), I'm afraid I don't understand
how the unequal interval length can be ignored. Even with a piecewise
constant proportional hazard, the likelihood depends on the interval
length (t_1 - t_0). If there is no way to specify that in the syntax
(or in the dataset?), we can't use it even if the intervals are the
same for all the subjects.
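
To make the dependence on interval length concrete, here is a small
Python sketch of the interval conversion probability under a piecewise
constant proportional hazard, next to the complementary log-log form
used for grouped durations. The numbers (lam, xb) are made up, and the
function names are hypothetical. That pgmhaz8 parameterises the
baseline in this cloglog way is my assumption, not something confirmed
in this thread.

```python
import math


def interval_hazard(lam, t0, t1, xb=0.0):
    """P(convert in (t0, t1] | still farmland at t0).

    lam is the constant baseline hazard within the interval and
    exp(xb) is the proportional-hazards multiplier.
    """
    return 1.0 - math.exp(-math.exp(xb) * lam * (t1 - t0))


def cloglog_hazard(gamma, xb=0.0):
    """Same probability written with an interval-specific intercept gamma."""
    return 1.0 - math.exp(-math.exp(xb + gamma))


# Illustrative values only (not estimates from any data).
lam, t0, t1, xb = 0.02, 8, 16, 0.3

# The interval length enters only through the product lam * (t1 - t0),
# so it can be folded into the intercept.
gamma = math.log(lam * (t1 - t0))

h_direct = interval_hazard(lam, t0, t1, xb)
h_cloglog = cloglog_hazard(gamma, xb)
print(h_direct, h_cloglog)
```

The two probabilities agree, so when every parcel shares the same
interval boundaries and each interval has its own free intercept, the
length is absorbed into gamma rather than ignored. The cost is that an
estimated gamma mixes the hazard level with the interval length, which
matters when interpreting the baseline after estimation.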


Regards,

On Fri, Jan 25, 2013 at 3:37 AM,  <S.Jenkins@lse.ac.uk> wrote:
> ------------------------------
>
> Date: Thu, 24 Jan 2013 15:58:41 -0600
> From: plumsh <plumsh119@gmail.com>
> Subject: st: Re: Interval censored survival model
>
>> The manual (Page 20 of the Survival Analysis section) explicitly states
>> that there are no discrete-time models in Stata. The only user-made codes
>> for grouped (interval censored) data that I found are pgmhaz(8), hshaz, and
>> intcens. The first two don't accommodate intervals of unequal length and,
>> unfortunately, the model and the syntax for INTCENS seem a little obscure
>> (at least to me at this point).
>>
>> My setup: land plots in agricultural use (farmland) have been converted to
>> residential and other commercial uses. Observations on the same land parcel
>> are recorded on, say, Jan 1 of 1980, 1997, 2005, and 2010 (same dates for
>> all parcels in the sample). Thus, the intervals are of unequal length. Apart
>> from that, we have stock sampling (the land has been farmed since a long
>> time ago; no record when and it does not really matter).
>>
>> I want to do survival analysis using location (distance to beach, roads,
>> schools), demographic (population density, mix, etc.), and economic (plenty)
>> parcel attributes.
>>
>> The theory on Grouped Duration Data analysis (particularly the piecewise
>> constant proportional hazard) is pretty straightforward (section 20.4 in
>> Wooldridge, Econometric Analysis of Cross Section and Panel Data).
>>
>> Since I don't have the time to write a readily working function for the ml
>> command, I would greatly appreciate any advice on how to estimate my
>> interval censored (grouped) data on land parcels. Pity they didn't record
>> exact conversion times. My only alternative now is probit/logit codes (I
>> read most of the relevant posts on the Statalist archives).
>>
>> Regards
>>
>> Sheng
> =============
>
> To be frank, I don't see what the problem with using -intcens- (on SSC)
> is. To me, the help file gives examples of how to use it. The command
> line seeks, inter alia, the time points that define the intervals. To
> me, -intcens- is very nice because of (a) the flexibility regarding
> interval length (as you say), and (b) it's a convenient way of fitting a
> number of continuous time _parametric_ models in the situation where the
> available data are interval-censored. The restrictions of -intcens- to
> me are: (c) time-varying predictors are not allowed; (d) there is a
> particular set of parametric models and these may not suit you; (e) no
> unobserved heterogeneity ('frailty').
>
> The other user-written commands that you cite (by me, on SSC) handle (c)
> and (e). I think they would also be ok if the unequal-length intervals
> are the same unequal length for each person. That is, suppose 2 subjects
> have the same spell length (number of intervals) recorded. If the first
> interval is 2 months long for both (all) subjects, and the second
> interval is 1 month long for all subjects, etc., then the likelihood is
> fine. (One has to be careful about post-estimation interpretation,
> however.)
>
> Also check out -stpm- on SSC. I've not used it, but the help file states
> that it can handle interval-censored data. There is also -stpm2- on SSC
> which is a development of -stpm-, but I am not sure whether it handles
> interval-censored data (not mentioned in help file in the same way). If
> Paul Lambert or Michael Crowther are list members, perhaps they can
> clarify matters.
>
> I don't see how "probit/logit codes" would be a way forward, unless you
> were to ignore the impact of elapsed duration on the hazard rate, and
> simply model event occurrence.
>
> Stephen
> ------------------
> Stephen P. Jenkins <s.jenkins@lse.ac.uk>
> Professor of Economic and Social Policy
> Department of Social Policy
> London School of Economics and Political Science
> Houghton Street, London WC2A 2AE, UK
> Tel: +44(0)20 7955 6527
> The Great Recession and the Distribution of Household Incomes, OUP 2013,
> http://ukcatalogue.oup.com/product/9780199671021.do
> Changing Fortunes: Income Mobility and Poverty Dynamics in Britain, OUP
> 2011, http://ukcatalogue.oup.com/product/9780199226436.do
> Survival Analysis Using Stata:
> http://www.iser.essex.ac.uk/survival-analysis
> Downloadable papers and software: http://ideas.repec.org/e/pje7.html
>
> Please access the attached hyperlink for an important electronic communications disclaimer: http://lse.ac.uk/emailDisclaimer
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/