
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.



Re: st: Large standard error, Cox PH


From   Lee Savage <leemsavage@yahoo.co.uk>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Large standard error, Cox PH
Date   Sun, 29 Jul 2012 09:08:10 +0100 (BST)

Thanks for your help, Steve. I don't really know my way around R very well, but now might be a good time to learn. Is there any way to fit a Cox model using the lasso in Stata?



----- Original Message -----
From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Cc: 
Sent: Sunday, 29 July 2012, 3:46
Subject: Re: st: Large standard error, Cox PH

"but I would say that the ratio of the number of
failures to the number of predictors should be no more than 5:1"

That should be "no less than 5:1"

Steve

A scatter plot of "Minority" against your time variable is likely to show
very little overlap of minority/non-minority countries. If so, the effect of the
"Minority" variable is not accurately described by a proportional hazards model.
The ordinary solution would be to designate "Minority" as a stratum variable
in the Cox model.
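
As a side check on the "Minority" estimate itself, the figures in the output quoted below can be reconciled on the log-hazard scale. This is a minimal sketch, assuming the reported robust SE of the hazard ratio is the delta-method value HR * se(log HR) and that the z statistic and confidence interval are computed on the log scale. The variable names are illustrative only.

```python
import math

# Values copied from the Minority row of the stcox output quoted in this thread.
hazard_ratio = 77.01
se_hazard_ratio = 56.61  # robust SE, reported on the hazard-ratio scale

# Assumption: se(HR) = HR * se(log HR), so dividing recovers the log-scale SE.
log_hr = math.log(hazard_ratio)
se_log_hr = se_hazard_ratio / hazard_ratio

# Wald z statistic and 95% interval, both formed on the log scale.
z = log_hr / se_log_hr
z_crit = 1.959964  # two-sided 95% critical value of the standard normal
ci_low = math.exp(log_hr - z_crit * se_log_hr)
ci_high = math.exp(log_hr + z_crit * se_log_hr)

print(f"log HR = {log_hr:.3f}, SE(log HR) = {se_log_hr:.3f}")
print(f"z = {z:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

Under these assumptions the z statistic and interval agree with the printout to rounding, which is why an SE of 56.61 on the ratio scale can sit alongside a small p-value: the test is driven by the log-scale SE of about 0.74, not by the ratio-scale figure.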

But you have a far more serious problem: overfitting (Babyak, 2004). Rules of
thumb are not easy to come by, but I would say that the ratio of the number of
failures to the number of predictors should be no more than 5:1. At 19:10, you
are far over that limit. Thus you must throw the entire model out and start from
scratch. You simply cannot assess the simultaneous effects of all those
predictors.
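
The arithmetic behind the 19:10 figure can be sketched as follows. This assumes the rule of thumb as corrected in the follow-up message at the top of this thread (at least 5 failures per predictor), and the predictor count of 10 is simply the one used in "19:10" above. The variable names are illustrative.

```python
failures = 19      # "No. of failures" in the quoted stcox output
predictors = 10    # count used in the "19:10" figure in this message
min_ratio = 5.0    # rule of thumb, read as "no less than 5:1"

# Failures available per predictor in the fitted model.
events_per_predictor = failures / predictors

# Largest number of predictors the rule of thumb would allow with 19 failures.
max_predictors = int(failures // min_ratio)

print(f"failures per predictor: {events_per_predictor:.1f} (rule of thumb: >= {min_ratio:.0f})")
print(f"predictors supported by {failures} failures: about {max_predictors}")
```

With 19 failures the rule of thumb supports roughly three predictors, so the fitted model is well outside it whichever way the predictors are counted.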

For solutions, see Chapters 4 and 5 of Harrell (2001). If you have access to the
R statistical package, you can employ the lasso (Tibshirani, 1997) for
coefficient shrinkage, which is available in the packages -glmpath-, -glmnet-,
and -penalized-.
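
To give a feel for what "coefficient shrinkage" means, here is a minimal sketch of soft-thresholding, the closed-form lasso solution in the simplest case of orthonormal predictors. It is not how any of the R packages above fit a Cox model (that requires iterating over the penalized partial likelihood); the coefficient values and the penalty `lam` are hypothetical and chosen only for illustration.

```python
def soft_threshold(beta, lam):
    """Shrink beta toward zero by lam; return 0.0 when |beta| <= lam."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0

# Hypothetical unpenalized log-hazard coefficients, for illustration only.
coefs = {"x1": 1.20, "x2": -0.15, "x3": 0.40}
lam = 0.25  # hypothetical penalty; in practice chosen by cross-validation

shrunk = {name: soft_threshold(b, lam) for name, b in coefs.items()}

for name in coefs:
    print(f"{name}: {coefs[name]:+.2f} -> {shrunk[name]:+.2f}")
```

Large coefficients are pulled toward zero and small ones are set exactly to zero, which is how the lasso performs shrinkage and variable selection at the same time.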

References:

Babyak, MA. 2004. What you see may not be what you get: a brief, nontechnical
introduction to overfitting in regression-type models. Psychosom Med 66,
no. 3: 411-421. http://www.psychosomaticmedicine.org/cgi/content-nw/full/66/3/411/

Harrell, Frank E. 2001. Regression modeling strategies: with applications to
linear models, logistic regression, and survival analysis. New York: Springer.

Tibshirani, R. 1997. The lasso method for variable selection in the Cox model.
Stat Med 16, no. 4: 385-395.

Steve
sjsamuels@gmail.com


> On Jul 28, 2012, at 8:48 AM, Lee Savage wrote:
> 
> The study is an analysis of government termination, estimated using a Cox
> proportional hazards model. The problem variable is 'Minority', this is a binary
> variable that indicates whether or not a government holds a parliamentary
> majority. The problem is that the standard error of the coefficient is extremely
> high. I have only seen this before when the coefficient was insignificant but in
> this case the coefficient is significant (as you can see below).
> Multicollinearity isn't a problem. I'm looking for advice on whether or not this
> is a problem or can I simply report the model and just state that the high SE of
> the 'Minority' variable means that it can't really be generalized?
> 
> Here is the printout. 
> 
> Iteration 0:   log pseudolikelihood = -40.288812
> Iteration 1:   log pseudolikelihood = -28.304301
> Iteration 2:   log pseudolikelihood = -26.968036
> Iteration 3:   log pseudolikelihood = -26.902024
> Iteration 4:   log pseudolikelihood = -26.901788
> Refining estimates:
> Iteration 0:   log pseudolikelihood = -26.901788
> 
> Cox regression -- Breslow method for ties
> 
> No. of subjects      =           19                Number of obs   =       347
> No. of failures      =           19
> Time at risk         =          347
>                                                   Wald chi2(7)    =   1603.76
> Log pseudolikelihood =   -26.901788                Prob > chi2     =    0.0000
> 
>                     Haz.    Robust
>                    Ratio        SE        z     P>z   [95% Conf. Int.]
> Minority           77.01     56.61     5.91    0.00     18.23   325.28
> Ideology            0.84      0.20    -0.73    0.47      0.52     1.35
> formdays            0.94      0.03    -2.16    0.03      0.90     0.99
> nogovtpart~s        1.51      1.68     0.37    0.71      0.17    13.34
> ciep12              1.36      1.25     0.34    0.74      0.23     8.21
> ConsNoCon           1.28      1.18     0.27    0.79      0.21     7.86
> tvc
> Unemployment        0.99      0.01    -2.31    0.02      0.98     1.00
> GDP                 1.00      0.00     2.24    0.03      1.00     1.00
> Inflation           0.98      0.01    -4.38    0.00      0.97     0.99
> 
> 
> 
> 
> __________________________
> 
> 
> From: Steve Samuels <sjsamuels@gmail.com>
> To: statalist@hsphsun2.harvard.edu 
> Sent: Friday, 27 July 2012, 21:30
> Subject: Re: st: Large standard error, Cox PH
> 
> 
> To answer your questions, we'd need more detail.  Describe the study and the
> problem variable in particular.
> As the FAQ requests, "Say exactly what you typed and exactly what Stata typed (or
> did) in response".
> 
> Steve
> sjsamuels@gmail.com
> 
> 
> 
> 
> On Jul 27, 2012, at 2:20 PM, Lee Savage wrote:
> 
> I have estimated a Cox PH model using a small sample (n=19, 347 months at risk).
> For one of my covariates I have found a large hazard ratio (77.01) with a
> correspondingly large standard error (56.61). I have seen this before, but every
> time the covariate was insignificant; in the current model the covariate is
> significant (p=.001). I have tested the covariates for collinearity and
> everything looks fine. I think the probable cause is the small sample size.
> 
> 
> So my question is: is this a problem for my overall model? My inclination
> is to report the model as it is and just state that the significant effect for
> the covariate in question should be treated with extreme caution, perhaps even
> ignored.
> 
> I'd appreciate any advice on this.
> 
> Thanks.
> 
> *
> *   For searches and help try:
> *  http://www.stata.com/help.cgi?search
> *  http://www.stata.com/support/statalist/faq
> *  http://www.ats.ucla.edu/stat/stata/
> 
> 

