Re: st: Bootstrapping Harrell's C - problem with freezing model to establish optimism - stepwise etc.

From	Roger Newson <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Bootstrapping Harrell's C - problem with freezing model to establish optimism - stepwise etc.
Date	Mon, 21 Feb 2011 14:50:37 +0000

I don't know who "Cboot" and "Corig" are, although I think I know which 2 papers by Harrell et al. you are referring to. You should include references, because not everybody on the list will know the papers to which you refer.

However, I have written a paper on the use of Harrell's c (and Somers' D) with models in general and survival models in particular (Newson, 2010). This stresses the importance of training sets and test sets, and discusses the Harrell et al. methods. The Harrell et al. methods are bootstrap-like, but are not the bootstrap. Instead, the user must divide the data into multiple pairs of a training set and a test set, and, for each training set-test set pair, estimate the optimism, and then calculate the confidence limits using methods similar to the bootstrap. The -bs- command does not do this for you. You will probably have to write your own program for defining multiple test sets and multiple training sets.
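A minimal sketch of such a program follows. It is an illustration under stated assumptions, not the method of Harrell et al. or of the paper cited below: it assumes the data are already -stset-, that the -somersd- package is installed from SSC, and that x1 and x2 stand in for the model covariates (hypothetical names). The number of splits (20), the 50:50 split, and the output file name are also arbitrary choices made for the example.

```stata
* Sketch only. Assumes: data already -stset-; -somersd- installed
* (ssc install somersd); x1 and x2 are hypothetical covariates.
set seed 12345
tempname memhold
postfile `memhold' split c_train c_test optimism using splits, replace

* Censoring indicator for the survival time, as required by -somersd-
generate byte censind = 1 - _d if _st == 1

forvalues i = 1/20 {
    * A random half of the observations forms the test set
    generate byte testset = runiform() < 0.5

    * Fit the model in the training set only, then "freeze" it:
    * -predict- applies the fitted coefficients to all observations
    stcox x1 x2 if testset == 0
    predict double hr, hr
    generate double invhr = 1/hr    // a high hazard should predict a short survival time

    * Harrell's c in the training set, then in the test set
    somersd _t invhr if _st == 1 & testset == 0, cenind(censind) transf(c)
    matrix b = e(b)
    scalar c_train = b[1,1]
    somersd _t invhr if _st == 1 & testset == 1, cenind(censind) transf(c)
    matrix b = e(b)
    scalar c_test = b[1,1]

    post `memhold' (`i') (c_train) (c_test) (c_train - c_test)
    drop testset hr invhr
}
postclose `memhold'

* One record per split; this replaces the data in memory
use splits, clear
summarize optimism
```

If variable selection is part of the modelling strategy, the selection step would presumably have to be rerun inside each training set in place of the fixed -stcox- call, so that the test-set c reflects the whole procedure and not only the final model.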


I hope this helps. Let me know if you have any more queries.

Best wishes

Roger


References

Newson RB. 2010. Comparing the predictive powers of survival models using Harrell's C or Somers' D. The Stata Journal 10(3): 339-358. Purchase from

http://www.stata-journal.com/article.html?article=st0198
or download a pre-publication draft from
http://www.imperial.ac.uk/nhli/r.newson/papers.htm#papers_in_journals


Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

On 19/02/2011 08:06, Jon Kroll Bjerregaard wrote:

Hello

I'm trying to determine the optimism - as described by Harrell et al. for a
Cox model established for pancreatic cancer (with Harrell's C instead of
Somers' D).

I have built a model that includes clinical variables forced into the model
and clinical variables chosen by stepwise selection. I have 150 events in
178 patients.

Selection statement:
xi: stepwise, pr(.15) lockterm1: stcox (i.AJCC i.inf_PS) zalder i.gender ///
    vol_GTV i.forb_regime i.hem_LNL zwbc zthromb i.LDH_UNL i.ALAT_UNL ///
    i.BASP_UNL sero_bili i.resection_perf
Ending with the final model:
xi: stcox i.AJCC i.inf_PS i.BASP_UNL vol_GTV i.resection_perf i.forb_regime
which is a mixture of continuous and categorical variables.

This is what I'm trying to do (as Harrell describes, 1996/2001):
Bootstrap Harrell's C from the full model, including the stepwise selection (Cboot)
"Freeze" the bootstrapped model, apply it to the original dataset and calculate Harrell's C (Corig)
Calculate the optimism as Cboot - Corig
Repeat for 200 bootstrap replicates
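
Written out, the steps above are usually combined as follows. This is a sketch in notation that is not taken from this thread: B is the number of bootstrap replicates (200 here), the superscript (b) indexes the replicate, and C_app is assumed to mean Harrell's C of the model fitted to and evaluated on the original data.

```latex
\[
\widehat{O} \;=\; \frac{1}{B}\sum_{b=1}^{B}
  \left( C_{\mathrm{boot}}^{(b)} - C_{\mathrm{orig}}^{(b)} \right),
\qquad
C_{\mathrm{corrected}} \;=\; C_{\mathrm{app}} - \widehat{O}
\]
```

For illustration only, with made-up numbers: if C_app were 0.70 and the average optimism 0.04, the corrected value would be 0.66.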

So this is where my problems start (or my lack of skills).

I use this program, adapted from another Statalist post:
****************************************************
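capture program drop b_conc
program define b_conc, rclass
    * Rerun the stepwise selection, with the first term locked in
    xi: stepwise, pr(.15) lockterm1: stcox (i.AJCC i.inf_PS) zalder i.gender ///
        vol_GTV i.forb_regime i.hem_LNL zwbc zthromb i.LDH_UNL i.ALAT_UNL ///
        i.BASP_UNL sero_bili i.resection_perf, efron
    * Harrell's C of the selected model, returned as r(c)
    estat concordance
    return scalar c = r(C)
end
* 200 bootstrap replicates of r(c), saved to myfile.dta
bs d=r(c), reps(200) seed(123456) saving(myfile, replace): b_conc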
***************************************************
Then I do another one with the final model and subtract them, but this is
not really the plan.

I have several problems with this, since it refuses to perform the bootstrap
(I get a lot of x's), which is most likely due to not using temporary
variables. I haven't figured out exactly what is wrong yet.
I also need to put in the "frozen" model and apply it to the original
dataset, and I'm not sure how to get that into a bootstrap routine.

Thanks in advance

Jon K. Bjerregaard, MD.
Dep. of Oncology, Odense University Hospital




*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

