[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: stcox output: p-value and CI don't agree

From   Philip Ryan <>
Subject   Re: st: stcox output: p-value and CI don't agree
Date   Wed, 08 Aug 2007 14:02:09 +0930

OK. I was wrong and I was (perhaps) right.

I was wrong is assuming you were merely doing:

.stcox drug, vce(bootstrap)

in which case the header of the "coefficient" column is, as it should be, "Observed Haz Ratio".

But in fact you were using a separate bootstrapping program, and, as you rightly say, the "coefficient" IS the bootstrap estimate of the hazard ratio (since you are bootstrapping exp_b[drug]).

But you are not bootstrapping estimates of the upper and lower confidence intervals on the exponential scale. Stata just sees the 100 instances of the "coefficients" as numbers, not hazard ratios, it gets their mean which it reports as the final estimate of "coefficient" (OK) and it gets their standard deviation and it just does the usual normal large sample approximation to a confidence interval. But the coverage of this confidence interval may not be very good, because the hazard ratio is not usually well approximated by a normal distribution. One usually deals with this by operating on the log scale - more closely approximating the normal distribution - and exponentiating afterwards.

So, what happens to your paradoxical example if you bootstrap _b[drug] rather than exp(_b[drug]) and exponentiate the reported coefficient, the lower limit and the upper limit as the final step? Does this resolve the inconsistency?


At 12:56 PM 8/08/2007, you wrote:

Finally, if you estimated the hazard ratio in -stcox- the header would be "Observed Haz. Ratio", not "Observed coefficient".
Thank you Philip, but with all due respect I promise it's the hazard ratio that is returned. The following is the code I used, but for the purpose of this question, applied to the Stata system data file <cancer.dta>. Note that the plain -stcox- and bootstrapped -Cox- have the same effect size, but a different header.

My question is this: in the case where Cox proportional hazards regression results in apparently contradictory p-value and 95% CI, what steps would one follow to investigate this observed result?

I'd appreciate any strategy pointers that anybody might have.
Thank you!

*--------------- begin code in question -----------------------

* plain cox
* load data and define as survival data
sysuse cancer, clear
stset studytim, failure(died)
* run cox
stcox drug

* bootstrap cox
* load data and define as survival data
sysuse cancer, clear
stset studytim, failure(died)

* define program
capture program drop boot_hr
program define boot_hr, rclass

* cox
stcox drug
indeplist, local
foreach var of varlist `X' {
return scalar `var' = exp(_b[`var'])

* set seed for reproducibility, since bootstrap is a random sampling
set seed 12358

* run program
bootstrap drug=r(drug), reps(100): boot_hr
*--------------- end code in question -----------------------

* For searches and help try:
Philip Ryan
Discipline of Public Health

Director, Data Management & Analysis Centre

Associate Dean (IT)
Faculty of Health Sciences

postal address:
Discipline of Public Health
Mail Drop 511
University of Adelaide 5005
South Australia

Level 6, Room 6-18
Bice Building
Royal Adelaide Hospital
North Terrace

tel +61 8 8303 3570
fax +61 8 8223 4075
CRICOS Provider Number 00123M
This email message is intended only for the addressee(s)
and contains information that may be confidential and/or
copyright. If you are not the intended recipient please
notify the sender by reply email and immediately delete
this email. Use, disclosure or reproduction of this email
by anyone other than the intended recipient(s) is strictly
prohibited. No representation is made that this email or
any attachments are free of viruses. Virus scanning is
recommended and is the responsibility of the recipient.
*   For searches and help try:

© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index