# Re: st: R: Estimating the probability of censoring

From: Michael McCulloch <[email protected]>
To: [email protected]
Subject: Re: st: R: Estimating the probability of censoring
Date: Thu, 25 Sep 2008 13:23:06 -0700

Antoine, thanks for your splendid explanation. This is just what I need to continue forward!
Have a wonderful day,
Michael

Michael,

What is calculated here is the survival function at time t conditional on covariates, i.e., in your case, the probability that the subject's time-to-censoring will be greater than t if dying from cancer were not a risk. If that makes sense to you, the formula is:

S(t|X)=exp(-IBH(t)*exp(XB))

where IBH(t) is the integrated baseline hazard at time t

and this is what is calculated in cens_adj, assuming that the -basec()- option of -stcox- does indeed compute the integrated baseline hazard.

Now, since the baseline survival function is

S0(t)=exp(-IBH(t))

one can write the survival function as

S(t|X)=exp(exp(XB)*ln(S0(t)))

which can be rewritten as

S(t|X)=S0(t)^(exp(XB))

That last formula avoids the missing value that ln(S0(t)) produces when the baseline survival is 0.

It is used to generate cens_adj2, assuming that the -basesurv()- option of -stcox- calculates the proper survival function.

As you can see, the results are slightly different, so I assumed after checking the web page that maybe -basec()- calculates something slightly different from IBH(t).
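The two formulas above can be compared side by side. What follows is only a sketch, assuming a recent Stata release in which the baseline quantities are obtained from -predict- after -stcox- rather than from the -basec()- and -basesurv()- options used in this thread; the variable name s0 is a placeholder that does not appear in the original posts.

```stata
sysuse cancer, clear
stset studytime, failure(died==0)                  // censoring is the "failure" event
stcox drug age, nohr
* recent releases: baseline quantities come from -predict-; the thread's
* 2008 syntax requested them as options, -stcox ..., basec(cum) basesurv(s0)-
predict double coeff_cens, xb                      // linear predictor XB
predict double cum, basechazard                    // baseline cumulative hazard
predict double s0, basesurv                        // baseline survivor function S0(t)
gen double cens_adj  = exp(-cum*exp(coeff_cens))   // S(t|X) = exp(-IBH(t)*exp(XB))
gen double cens_adj2 = s0^exp(coeff_cens)          // S(t|X) = S0(t)^exp(XB)
twoway scatter cens_adj cens_adj2                  // points off the diagonal show where they differ
```

If the two baseline estimates were exact transforms of one another, the scatter would fall on the 45-degree line; any departure reflects how each baseline quantity is estimated.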

Antoine

Michael McCulloch wrote:

Thank you kindly, Antoine, for pointing out that I need to specify the baseline cumulative hazard in -stcox-, and to multiply that baseline cumulative hazard by the regression coefficient.

I see that the two approaches you suggest provide very similar results (as seen by . twoway scatter cens_adj cens_adj2).

Could you explain how those two approaches differ, and why the probability varies so much given that died is y/n?
Michael

Hello _all

I might be missing something, but isn't the correct way to do this more like (the part where I generate cens_adj):

sysuse cancer.dta, clear
gen id = _n // generate individual IDs
stset studytime, failure(died==0) // note that total person-time is 744
* estimate the unadjusted probability of censoring
sts gen cens = s
* estimate the adjusted probability of censoring
stcox drug age, nohr basec(cum) // cum holds the baseline cumulative hazard
predict coeff_cens, xb // linear predictor XB of the censoring model
gen p_cens = exp(coeff_cens) // relative hazard exp(XB); can exceed 1
gen cens_adj = exp(-cum*exp(coeff_cens)) // S(t|X)=exp(-IBH(t)*exp(XB)); line absent from the archived post, reconstructed from that formula

*lists the results

list id drug age died cens p_cens cens_adj in 40/48, clean noobs

id  drug  age  died       cens    p_cens  cens_adj
40     3   50     0  .58476475  .8601028  .6811302
41     3   55     1  .58476475  .9429854  .6563864
42     3   57     1  .51166915  .9783332  .5705886
43     3   48     0  .51166915  .8290267  .6216006
44     3   56     0  .34111277  .9604967  .4199265
45     3   60     1  .34111277  1.033855  .3930005
46     3   62     0  .22740851  1.072609  .2585016
47     3   48     0  .11370426  .8290267  .2171345
48     3   52     0          0  .8923438  .0710848
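
A quick arithmetic check of the listing, assuming cens_adj was generated as exp(-cum*exp(coeff_cens)), i.e. S(t|X)=exp(-IBH(t)*exp(XB)); the symbol H_0(t) below is shorthand introduced here for the baseline cumulative hazard stored in cum. Observations with the same value of cens sit on the same step of the baseline, so -ln(cens_adj)/p_cens should return the same H_0(t) for each of them:

```latex
\begin{aligned}
\text{obs. 40:}\quad H_0(t) &= -\frac{\ln(0.6811302)}{0.8601028} \approx \frac{0.38400}{0.86010} \approx 0.4465 \\
\text{obs. 41:}\quad H_0(t) &= -\frac{\ln(0.6563864)}{0.9429854} \approx \frac{0.42101}{0.94299} \approx 0.4465 \\
\text{obs. 42:}\quad H_0(t) &= -\frac{\ln(0.5705886)}{0.9783332} \approx \frac{0.56109}{0.97833} \approx 0.5735 \\
\text{obs. 43:}\quad H_0(t) &= -\frac{\ln(0.6216006)}{0.8290267} \approx \frac{0.47546}{0.82903} \approx 0.5735
\end{aligned}
```

Within each pair the implied baseline cumulative hazard agrees, which is what the formula predicts: the covariates enter only through exp(XB), while the baseline depends on time alone.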

Antoine

Michael McCulloch wrote:

> At first glance, what strikes the eye is that the adjusted probability of
> being censored is sometimes above the usual upper bound of 1. How can that be? I
> must have missed something in your assumptions.

Yes, that's one of my main questions. Since I used the Stata-supplied cancer.dta file and provided all my methods, I'm hoping that someone on Statalist might have advice on how to correct the method for unadjusted estimation of censoring probability.
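
On the values above 1: exp(XB) from a Cox model is the hazard relative to the baseline, and nothing bounds it by 1. The sketch below, which assumes a recent Stata release and uses rh_cens as a placeholder name that is not in the original posts, checks that p_cens coincides with the relative hazard that -predict, hr- returns after -stcox-.

```stata
sysuse cancer, clear
stset studytime, failure(died==0)        // censoring is the "failure" event
stcox drug age, nohr
predict double coeff_cens, xb            // linear predictor XB
predict double rh_cens, hr               // relative hazard, exp(XB)
gen double p_cens = exp(coeff_cens)
assert reldif(p_cens, rh_cens) < 1e-8    // same quantity, computed two ways
summarize p_cens                         // inspect the range; it is not confined to [0,1]
```

To obtain something on the probability scale, the relative hazard has to be combined with a baseline quantity, as in S(t|X)=exp(-IBH(t)*exp(XB)).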

Kind Regards,
Carlo

-----Original message-----
From: [email protected]
[mailto:[email protected]] On behalf of Michael McCulloch
Sent: Thursday, 25 September 2008 17:56
To: Statalist
Subject: st: Estimating the probability of censoring

Hello,
I'm seeking guidance on a series of commands I've written to estimate
the probability of being censored. Might anyone be able to offer
commentary as to whether I've done this correctly? The resulting
unadjusted probability of censoring ranges from 0-1, while the
adjusted probability sometimes exceeds 1.

sysuse cancer.dta, clear
gen id = _n // generate individual IDs
stset studytime, failure(died==0) // note that total person-time is 744
*estimate the unadjusted probability of censoring
sts gen cens = s
*estimate the adjusted probability of censoring
stcox drug age, nohr
predict coeff_cens, xb // predicts linear coefficients of censoring
gen p_cens = exp(coeff_cens) // adjusted probability of time-to-censoring
*lists the results
list id drug age died cens p_cens in 40/48, clean noobs

id  drug  age  died       cens    p_cens
40     3   50     0  .58476475  .8601028
41     3   55     1  .58476475  .9429854
42     3   57     1  .51166915  .9783332
43     3   48     0  .51166915  .8290267
44     3   56     0  .34111277  .9604967
45     3   60     1  .34111277  1.033855
46     3   62     0  .22740851  1.072609
47     3   48     0  .11370426  .8290267
48     3   52     0          0  .8923438

--

Best wishes,
Michael McCulloch

Pine Street Foundation
124 Pine St., San Anselmo, CA 94960-2674
Tel: (415) 407-1357
Fax: (415) 485-1065
[email protected]
www.pinestreetfoundation.org
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
