Re: st: reporting cox regression for ordinal variables

From   Maarten Buis
Subject   Re: st: reporting cox regression for ordinal variables
Date   Sun, 5 Oct 2008 23:24:07 +0100 (BST)

--- moleps islon wrote:
> I'm writing up a paper on survival in cancer. However I've got two
> ordinal variables. I've done the univariate analysis using xi: stcox
> a, and then tested for linear trend using testparm and discarded one
> of the variables due to p>0.5. However one of my variables (ecog)
> comes back highly significant using - testparm -.
> In a regular cox table how would you report this? Would you report
> each and every one of the levels of the ordinal variable with its
> coefficient? Or can I just use the beta from stcox without xi'ing it
> since I now know the variable has a linear relationship??

Running -stcox- with and without -xi- fits two different models: with
-xi- the variable enters non-linearly, as a set of dummies; without
-xi- it enters linearly, in this case linear in the log(hazard ratio).
Choosing which model to report is ultimately your choice, which means
that you and only you are responsible; you cannot delegate that
responsibility to Stata or to any test.
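
To make the difference between the two models concrete, here is a
sketch for a variable x with three levels coded 0, 1, 2. The coding and
the symbols gamma_1, gamma_2 and beta are assumptions made for
illustration, and other covariates are left out for brevity:

```latex
\begin{align*}
\text{dummies:} \quad & \log h(t \mid x) = \log h_0(t) + \gamma_1 \, 1[x = 1] + \gamma_2 \, 1[x = 2] \\
\text{linear:}  \quad & \log h(t \mid x) = \log h_0(t) + \beta x
\end{align*}
```

The linear model is the dummy model with the single restriction
gamma_2 = 2*gamma_1 (so that beta = gamma_1). The two models are
therefore nested, and with three levels a comparison between them has
one degree of freedom.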

For instance, if you find that a linear model is not significantly
different from a non-linear model, then that could just mean that your
sample size is not big enough to detect any non-linearity, and if you
find that the two models are significantly different, then that could
just mean that your sample size is so big that you detected irrelevant
deviations from linearity. Either way the test results are inconclusive.
I would limit testing to the hypotheses I really care about, and build
my model such that it includes at least all the variables I am
interested in, even if they are insignificant, and maybe some controls
(though keep in mind to include only possible confounding variables and
not to include intervening variables). 

I would decide whether or not to enter a variable linearly or as a set
of dummies using a graph, like the graph below. (Notice the
inconsistency in my argument here as I include confidence intervals in
the graph. What can I say: I am only human.) 

Also when entering a variable linearly you should think very carefully
about the spacing of the categories: do you have any information that
might help you give these categories more realistic values, are these
categories evenly spaced, etc?
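
As a rough sketch of what giving the categories more realistic values
could look like: the code below replaces the codes 0, 1, 2 of a
three-level age variable with the mean age observed in each category.
The variable name age_mid is hypothetical, and using the
within-category mean is only one assumed way of choosing the spacing;
substantive knowledge about the categories may suggest better values.

```stata
* Sketch only: age_mid is a hypothetical variable name, and the
* within-category mean is one assumed choice of spacing.
sysuse cancer, clear
gen cat_age = cond(age <= 50, 0, ///
              cond(age <= 60, 1, 2))
stset studytime, failure(died)

* replace the codes 0, 1, 2 by the mean age observed in each category
egen age_mid = mean(age), by(cat_age)
tabulate cat_age, summarize(age_mid)

* enter the rescaled variable linearly
xi: stcox i.drug age_mid
```

The coefficient of age_mid is then a change in the log(hazard ratio)
per year of age rather than per category, and unevenly spaced
categories are no longer forced to have equal steps.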

*------------- begin example ------------------
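* load the example data shipped with Stata
sysuse cancer, clear
* three age groups: 0 = up to 50, 1 = 51 to 60, 2 = over 60
gen cat_age = cond(age <= 50, 0, ///
              cond(age <= 60, 1, 2))
* declare the data to be survival-time data
stset studytime, failure(died)
* model a: age group entered as a set of dummies
xi: stcox i.drug i.cat_age
est store a

* model b: age group entered linearly
xi: stcox i.drug cat_age
est store b

* likelihood-ratio test of the linear against the dummy specification
lrtest a b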

est restore a
xi i.cat_age i.drug
adjust _Idrug_2=0 _Idrug_3=0, by(cat_age) ci replace
est restore b
adjust _Idrug_2=0 _Idrug_3=0, by(cat_age) ci replace
twoway scatter xb cat_age || ///
    rcap lb ub cat_age    || ///
    line _xb cat_age,        ///
    legend(off)              ///
    xlab(0 1 2)              ///
    ytitle("log(hazard ratio)")
*----------------- end example -----------------------
(For more on how to use examples I sent to the Statalist, see )

Hope this helps,

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715
