# st: Survival Forecasting....help needed replicating results outside of Stata

From: Jon Haveman
To: statalist@hsphsun2.harvard.edu
Subject: st: Survival Forecasting....help needed replicating results outside of Stata
Date: Tue, 13 May 2008 09:59:17 -0700

Hi All,
I am working with mortgage data and am trying to forecast the probability of default in each of the next 72 months for particular loans. I have a question at the bottom about how to take Stata's predictions and replicate them outside of Stata.

My data are organized on a monthly basis, with time-varying covariates (e.g., home price appreciation).

Here is my approach:

1) stset a sample of historical data

stset enddate, time0(startdate) origin(time omonths) ///
    enter(time omonths) exit(prepay==1 time thismonth) ///
    id(numericloanid) failure(default==1)

where:
enddate is the current month, and startdate is enddate - 1;
omonths is the month in which the loan was originated;
numericloanid identifies the individual loans;
default is a dummy variable for default;
prepay is a dummy variable for prepayment;
thismonth indicates the current month, and right-censors any loan that is still alive today.

2) streg <covariates>, distribution(llogistic) iter(100)

3) stset a loan that is still alive, for which I want to predict default

stset enddate, time0(startdate) origin(time omonths) ///
    enter(time origmonths) exit(prepay==1 time thismonth+72) ///
    id(numericloanid) failure(default==1)

This is the same as before, but moves the right censoring out 72 months. All of the underlying covariates have been forecast out 72 months.

4) predict the survival rate and the hazard rate

predict s, surv
predict h, haz

5) generate the conditional probability of default in each month

gen f = s * h;
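
For anyone checking step 5 outside Stata, here is a minimal Python sketch of what s * h amounts to for a log-logistic model. It assumes the parameterization used in the formula further down, with lambda = exp(-xb) and scale parameter gamma, under which the product of the survivor function and the hazard is the density. The function names and the numeric inputs are hypothetical, chosen only for illustration; they are not taken from the fitted model.

```python
import math


def llogistic_surv(t, xb, gamma):
    """Survivor function S(t) = 1 / (1 + (lambda*t)^(1/gamma)), lambda = exp(-xb)."""
    lam_t = math.exp(-xb) * t
    return 1.0 / (1.0 + lam_t ** (1.0 / gamma))


def llogistic_haz(t, xb, gamma):
    """Hazard h(t) = lambda^p * t^(p-1) / (gamma * (1 + (lambda*t)^p)), p = 1/gamma."""
    lam = math.exp(-xb)
    p = 1.0 / gamma
    return (lam ** p) * t ** (p - 1.0) / (gamma * (1.0 + (lam * t) ** p))


def llogistic_density(t, xb, gamma):
    """Density f(t) = S(t) * h(t), the analogue of 'f = s * h' in step 5."""
    return llogistic_surv(t, xb, gamma) * llogistic_haz(t, xb, gamma)


if __name__ == "__main__":
    # Hypothetical linear predictor and scale, for illustration only.
    xb, gamma = 1.5, 0.5
    for month in range(1, 7):
        print(month,
              llogistic_surv(month, xb, gamma),
              llogistic_haz(month, xb, gamma),
              llogistic_density(month, xb, gamma))
```

With time-varying covariates, xb would differ from record to record, so each month's value would need to be computed with that month's own xb.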

Now, I get very sensible results this way, but I'm a bit confused on several matters. I had thought that I should be able to replicate s (the predicted survival rate) using the following:

gen s = (1+(exp(-xb)*_t)^(1/gamma))^(-1)

But this obviously does not work:

. predict s, surv;

. gen gamma = exp(-.657325);

. predict xb, xb;

. gen s1 = 1/(1+(exp(-xb)*_t)^(1/gamma));

. list enddate _t _t0 s s1;

     +-------------------------------------------+
     | enddate   _t   _t0          s          s1 |
     |-------------------------------------------|
  1. |     554    1     0   .9727553    .9526516 |
  2. |     555    2     1   .9034937    .8407519 |
  3. |     556    3     2   .8404005    .7034764 |
  4. |     557    4     3   .8025259    .5773274 |
  5. |     558    5     4   .7860863    .4721465 |
     |-------------------------------------------|
  6. |     559    6     5   .7856199     .393317 |
  7. |     560    7     6   .7904842    .3272628 |
  8. |     561    8     7   .7988772    .2743748 |
  9. |     562    9     8   .8090085    .2328688 |
 10. |     563   10     9   .8190303    .1975901 |
     +-------------------------------------------+
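
One way to narrow down the discrepancy outside Stata is to compute two candidate quantities per record and compare each against the predicted s. The Python sketch below does this under stated assumptions. It uses the log-logistic survivor function from the gen s1 line, and it also computes the survival conditional on having reached _t0, S(_t)/S(_t0). Stata's documentation for predict after streg describes surv as S(t|t0) for multiple-record data, so the conditional version may be the closer match. That cannot be the whole story here, though: the first row has _t0 = 0 and still differs. The xb values below are hypothetical placeholders; the real ones would come from predict xb, xb.

```python
import math


def llogistic_surv(t, xb, gamma):
    """Unconditional S(t) = 1 / (1 + (exp(-xb) * t)^(1/gamma)); S(0) = 1."""
    if t <= 0:
        return 1.0
    return 1.0 / (1.0 + (math.exp(-xb) * t) ** (1.0 / gamma))


def conditional_surv(t, t0, xb, gamma):
    """S(t | t0) = S(t) / S(t0), both evaluated with this record's xb."""
    return llogistic_surv(t, xb, gamma) / llogistic_surv(t0, xb, gamma)


if __name__ == "__main__":
    gamma = math.exp(-0.657325)  # same transformation as 'gen gamma' in the post

    # (_t, _t0, xb) per monthly record; the xb values are hypothetical.
    records = [(1, 0, 1.6), (2, 1, 1.7), (3, 2, 1.8)]

    for t, t0, xb in records:
        s_uncond = llogistic_surv(t, xb, gamma)        # what s1 computes
        s_cond = conditional_surv(t, t0, xb, gamma)    # S(_t | _t0)
        print(t, t0, round(s_uncond, 6), round(s_cond, 6))
```

If neither column lines up with s, the next things to check would be the value used for gamma and whether xb is the linear predictor on the scale the formula expects.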

My question: how do I replicate "s" without using the predict command?

Is this possible? Is this sensible?

Any assistance that anybody can provide would be very much appreciated.

Jon
