Statalist The Stata Listserver


st: RE: RE: Validity of hazard predictions in frailty models

From   "Steinar Fossedal" <>
To   <>
Subject   st: RE: RE: Validity of hazard predictions in frailty models
Date   Tue, 29 May 2007 16:04:03 +0200

Not my finest moment there. Thank you so much for correcting my
misinterpretation, Maarten. I will use (1 - surv_a) as the probability
of failure in each step.

On a side note, when working with these predictions, I noticed that I
sometimes get missing predictions from -predict csurv_a, csurv-. Does
anyone know what might cause this? I've included a listing below to
show the problem. It strikes me as odd that Stata is able to calculate
surv_a but not the cumulative csurv_a.

I have also been having problems with out-of-sample predictions of,
e.g., Cox-Snell residuals (I did stset the data to include the
validation sample). Maybe the two problems are related.


time  haz_a     surv_a    csurv_a   _st  _t  _t0  _d
   1  .0006299  .9993703  .9993706    1   1    0   0
   2  .0006299  .9993703  .9987423    1   2    1   0
   3  .0006299  .9993703  .           1   3    2   0
   4  .0006549  .9993452  .           1   4    3   0
   5  .0006549  .9993452  .           1   5    4   0
   6  .0007081  .9992921  .9980397    1   6    5   0
   7  .0007081  .9992921  .9973384    1   7    6   0
   8  .0007657  .9992346  .9965819    1   8    7   0
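
As a rough check on using (1 - surv_a) as the per-step failure
probability, the short Python sketch below compares the posted surv_a
values with exp(-haz_a), the survival over one unit of time under a
constant hazard. The (haz_a, surv_a) pairs are copied from the listing
above. The exp(-haz_a) comparison is an assumption made for
illustration, not a statement of how Stata computes surv_a in a frailty
model, and the variable names are hypothetical.

```python
import math

# Distinct (haz_a, surv_a) pairs copied from the listing above.
rows = [
    (0.0006299, 0.9993703),
    (0.0006549, 0.9993452),
    (0.0007081, 0.9992921),
    (0.0007657, 0.9992346),
]

# Largest gap between the posted survival and exp(-hazard).
# Assumption: constant hazard over an interval of length 1.
max_gap = max(abs(math.exp(-haz) - surv) for haz, surv in rows)

# Per-step failure probability as proposed in the post: 1 - surv_a.
fail_probs = [1.0 - surv for _, surv in rows]

for (haz, surv), p_fail in zip(rows, fail_probs):
    print(f"haz={haz:.7f}  surv={surv:.7f}  1-surv={p_fail:.7f}")
print(f"largest |exp(-haz) - surv| = {max_gap:.2e}")
```

For these rows the two quantities agree to within the rounding of the
posted values, and each failure probability is slightly smaller than the
corresponding hazard, which is what one would expect when the hazard is
small.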

-----Original Message-----
[] On Behalf Of Maarten Buis
Sent: 29. mai 2007 15:15
Subject: st: RE: Validity of hazard predictions in frailty models

---- Steinar Fossedal wrote:
> I have fitted an exponential survival model with gamma frailty, and
> find myself in a pickle trying to interpret and apply the predictive
> results. Specifically I'm getting predictions of individual hazard that
> exceed one. Such estimates are, of course, problematic to use in the
> next step of my analysis.

Actually you have no problem. Hazards are not the same as probabilities,
and can range between 0 and +infinity. The interpretation of a hazard is
the number of times in a unit time that you can on average be expected
to experience the event. Say the event is experiencing a cold. My hazard
for experiencing a cold when time is measured in months is probably less
than 1, but if time is measured in centuries it will clearly be larger
than one (until someone invents the cure for the common cold).
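
The point about time units can be sketched numerically. The Python
snippet below assumes a constant hazard and uses a made-up monthly
hazard of 0.2 colds per month (a hypothetical value, not taken from this
thread). The function name failure_probability is likewise invented for
the illustration. Rescaling time changes the hazard, which may exceed
one, while the probability of at least one event in an interval,
1 - exp(-hazard * duration), never exceeds one.

```python
import math


def failure_probability(hazard, duration):
    """Probability of at least one event in an interval of the given
    duration, assuming a constant hazard (exponential model)."""
    return 1.0 - math.exp(-hazard * duration)


# Hypothetical value chosen only for illustration.
hazard_per_month = 0.2
months_per_century = 12 * 100

# The same hazard expressed per century: a rate, so it scales with the unit.
hazard_per_century = hazard_per_month * months_per_century

p_one_month = failure_probability(hazard_per_month, 1.0)
p_one_century = failure_probability(hazard_per_century, 1.0)

print(f"hazard per month:   {hazard_per_month}")
print(f"hazard per century: {hazard_per_century}")
print(f"P(event within one month):   {p_one_month:.4f}")
print(f"P(event within one century): {p_one_century:.4f}")
```

The hazard per century is far above one, yet both probabilities remain
between zero and one.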

> I reason that this is caused by the multiplicative effect of the
> parameter on the hazard, and that the model only ensures the validity of
> the population hazard values - not the unobserved individual's. With
> validity I mean restrictions that ensure the hazard stays between zero
> and one. To me, this seems like an inherent weakness in frailty models.
> This may not matter much when investigating hazard ratios and
> differences between populations, but it does pose a problem when
> individual predictions are needed.

Actually the model without a frailty component (possibly with robust
standard errors) is the model that correctly looks at population values
of the hazard, while the model with a frailty component captures the
hazard at the individual level (if you believe your model, i.e. the
unobserved component of your model is gamma distributed, uncorrelated
with your observed variables, the effects of the observed variables are
correctly specified, etc. etc.). It is no coincidence that robust
standard errors are called robust, thus implying that other models are
less robust. (I still think that the name robust suggests more
robustness than it can deliver, but that is another issue.)

Hope this helps,

Maarten L. Buis
Department of Social Research Methodology 
Vrije Universiteit Amsterdam 
Boelelaan 1081 
1081 HV Amsterdam 
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434 

+31 20 5986715

*   For searches and help try:
