  # Re: st: Predicting with time varying covariates

 From rgutierrez@stata.com (Roberto G. Gutierrez, StataCorp) To statalist@hsphsun2.harvard.edu Subject Re: st: Predicting with time varying covariates Date Wed, 14 Apr 2004 17:44:56 -0500

```Deepa <bhat@economics.rutgers.edu> asks:

> I am working on a Survival Analysis model (under the lognormal distribution)
> for which I am trying to predict the mean duration while testing for some X
> (a policy variable which is binary) that affects it. The "predict" command
> in the manual shows me how I can achieve this with time constant variables.
> But in my model, the policy variables change over time and I also have
> demographic variables that affect the duration which change over time.

> How do I predict the mean duration for spells when I have time-varying
> covariates?

In the case of the parametric model, you would have to work out the
integration necessary to calculate the predicted mean manually.  That is,
instead of using predict to get the mean time-to-failure for a subject
with static covariate pattern (say)

ID        x1    x2    x3
101       0.5   4.5   0.2

you want the predicted mean time-to-failure for a subject with covariate
_history_

ID        x1    x2    x3    _t0    _t
101       0.5   4.5   0.2     0     5
101       0.5   4.5   0.4     5    10
101       0.5   4.5  -1.4    10    24

where we now allow variable -x3- to be time varying.  There are some issues
with this calculation, however.  In particular, we need to know the value
of -x3- from time 0 to infinity in order to perform a mean calculation.
Subject 101 was observed to fail (or be censored) at time _t=24, but in the
calculation of the mean, are we to assume that -x3- is -1.4 from time 24 to
infinity?  If not, what do we assume for this value?  If we can indeed assume
it stays at -1.4, then the calculation can proceed, but does the resulting mean
have the interpretation that Deepa wants?

Another issue:  what if one or more subjects have delayed entry or other gaps
in their histories?  What do you assume about -x3- when the subject is not
observed?

These issues aside, the integration necessary to calculate the mean would then
need to be done in pieces, in the above case, from 0 to 5, then from 5 to 10,
then from 10 to 24, and finally from 24 to infinity.  And you would have to do
this for each subject in the data according to when their covariates change.
Then you could just sum the pieces within each subject to get the estimated
mean for each.

If Deepa wishes to email me privately with some data and assumptions about the
time varying covariates, I can provide further assistance with the sequence of
commands necessary to perform the mean calculation.  I don't recall offhand
whether the piecewise integrals have a closed form; if they do not, that would
add a further level of complication.

--Bobby
rgutierrez@stata.com
```
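The piecewise calculation described above can be sketched in Python. This is only an illustration under stated assumptions, not the sequence of Stata commands the post refers to. It assumes that the survivor function for a covariate history is built by chaining conditional survival probabilities, S_j(t)/S_j(t0_j), across the pieces, and that the last covariate value is carried forward to infinity. The function names, the linear-predictor values `mu`, and `sigma` are all hypothetical. Under these assumptions each piece has a closed form, because the integral of a lognormal survivor function from a to infinity equals E[(T-a)+].

```python
import math


def norm_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def surv(t, mu, sigma):
    """Lognormal survivor function S(t) = 1 - Phi((ln t - mu) / sigma)."""
    if t <= 0:
        return 1.0
    return norm_cdf(-(math.log(t) - mu) / sigma)


def tail_integral(a, mu, sigma):
    """Integral of S(t) from a to infinity, which equals E[(T - a)+]."""
    full_mean = math.exp(mu + 0.5 * sigma ** 2)
    if a <= 0:
        return full_mean
    z = (mu + sigma ** 2 - math.log(a)) / sigma
    return full_mean * norm_cdf(z) - a * surv(a, mu, sigma)


def mean_with_history(history, sigma):
    """Mean time-to-failure for one subject with a covariate history.

    history: list of (t0, t1, mu) tuples, contiguous, starting at 0, with
    the last t1 equal to math.inf (ASSUMPTION: the final covariate values
    persist forever). mu is the linear predictor on that piece.
    """
    mean = 0.0
    carried = 1.0  # probability of surviving to the start of the piece
    for t0, t1, mu in history:
        s0 = surv(t0, mu, sigma)
        if s0 <= 0.0:
            break  # nothing left to integrate
        piece = tail_integral(t0, mu, sigma)
        if math.isfinite(t1):
            piece -= tail_integral(t1, mu, sigma)
        # ASSUMPTION: within a piece, survival is conditional on reaching t0.
        mean += carried * piece / s0
        if math.isfinite(t1):
            carried *= surv(t1, mu, sigma) / s0
    return mean


# Hypothetical linear predictors for subject 101 as x3 moves 0.2 -> 0.4 -> -1.4
history_101 = [(0, 5, 2.6), (5, 10, 2.7), (10, math.inf, 1.8)]
print(mean_with_history(history_101, sigma=0.8))
```

With a constant `mu` on every piece, the pieces sum to the usual lognormal mean, exp(mu + sigma^2/2), which is a useful sanity check. Delayed entry and gaps are not handled here; as the post notes, they need their own assumptions about the unobserved covariate values.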