Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Large standard error, Cox PH
Date: Sun, 29 Jul 2012 07:25:18 -0400

On Jul 29, 2012, at 4:08 AM, Lee Savage wrote:

> Thanks for your help Steve. I don't really know my way around R very well but
> now might be a good time to learn. Is there any way to fit a Cox model using
> lasso in Stata?

Not that I know of. There are implementations for linear regression (-lars- from SSC) and for a related technique, penalized logistic regression (http://www.homepages.ucl.ac.uk/~ucakgam/stata.html).

Stepwise is a rightly condemned method for selecting variables, but bootstrapping has been proposed as a way of rehabilitating stepwise. Maarten Buis showed how to bootstrap stepwise Cox models in http://www.stata.com/statalist/archive/2011-05/msg01427.html .

The publication I referred to, and others, can be found on Rob Tibshirani's lasso page: http://www-stat.stanford.edu/~tibs/lasso.html

Steve
sjsamuels@gmail.com

----- Original Message -----
From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Cc:
Sent: Sunday, 29 July 2012, 3:46
Subject: Re: st: Large standard error, Cox PH

"but I would say that the ratio of the number of failures to the number of predictors should be no more than 5:1"

That should be "no less than 5:1"

Steve

A scatter plot of "Minority" against your time variable is likely to show very little overlap of minority/non-minority countries. If so, the effect of the "Minority" variable is not accurately described by a proportional hazards model. The ordinary solution would be to designate "Minority" as a stratum variable in the Cox model.

But you have a far more serious problem: overfitting (Babyak, 2004). Rules of thumb are not easy to come by, but I would say that the ratio of the number of failures to the number of predictors should be no more than 5:1. At 19:10, you are far over that limit. Thus you must throw the entire model out and start from scratch. You simply cannot assess the simultaneous effects of all those predictors.
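
As a rough arithmetic check of the rule of thumb, here is a minimal Python sketch. It assumes the corrected wording ("no less than 5:1") and takes the 19:10 figure from this message at face value; the variable names are illustrative and do not come from the thread.

```python
# Figures quoted in the message: 19 failures against 10 predictors ("19:10").
failures = 19
predictors = 10

# Corrected rule of thumb from the thread: at least 5 failures per predictor.
min_failures_per_predictor = 5

# Observed failures per predictor, to compare against the rule.
ratio = failures / predictors

# Largest number of predictors the rule would allow for this many failures.
max_predictors = failures // min_failures_per_predictor

print(f"failures per predictor: {ratio:.1f}")
print(f"predictors allowed under the rule: {max_predictors}")
print("rule satisfied" if ratio >= min_failures_per_predictor else "rule violated")
```

Under these assumptions the model has fewer than two failures per predictor, and the rule would leave room for about three predictors.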
For solutions see Chapters 4 and 5 of Harrell (2001). If you have access to the R statistical package, you can employ the lasso (Tibshirani, 1997) for coefficient shrinkage, which is available in packages -glmpath-, -glmnet-, and -penalized-.

References:

Babyak, MA. 2004. What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models. Psychosom Med 66, no. 3: 411-421. http://www.psychosomaticmedicine.org/cgi/content-nw/full/66/3/411/

Harrell, Frank E. 2001. Regression modeling strategies: with applications to linear models, logistic regression, and survival analysis. New York: Springer.

Tibshirani, R. 1997. The lasso method for variable selection in the Cox model. Stat Med 16, no. 4: 385-395.

Steve
sjsamuels@gmail.com

> On Jul 28, 2012, at 8:48 AM, Lee Savage wrote:
>
> The study is an analysis of government termination, estimated using a Cox
> proportional hazards model. The problem variable is 'Minority', this is a binary
> variable that indicates whether or not a government holds a parliamentary
> majority. The problem is that the standard error of the coefficient is extremely
> high. I have only seen this before when the coefficient was insignificant but in
> this case the coefficient is significant (as you can see below).
> Multicollinearity isn't a problem. I'm looking for advice on whether or not this
> is a problem or can I simply report the model and just state that the high SE of
> the 'Minority' variable means that it can't really be generalized?
>
> Here is the printout.
>
> Iteration 0: log pseudolikelihood = -40.288812
> Iteration 1: log pseudolikelihood = -28.304301
> Iteration 2: log pseudolikelihood = -26.968036
> Iteration 3: log pseudolikelihood = -26.902024
> Iteration 4: log pseudolikelihood = -26.901788
> Refining estimates:
> Iteration 0: log pseudolikelihood = -26.901788
>
> Cox regression -- Breslow method for ties
>
> No. of subjects      =         19        Number of obs =     347
> No. of failures      =         19
> Time at risk         =        347
>                                          Wald chi2(7)  = 1603.76
> Log pseudolikelihood = -26.901788        Prob > chi2   =  0.0000
>
>                  Haz.   Robust
>                 Ratio       SE       z    P>z    [95% Conf. Int.]
> Minority        77.01    56.61    5.91   0.00    18.23   325.28
> Ideology         0.84     0.20   -0.73   0.47     0.52     1.35
> formdays         0.94     0.03   -2.16   0.03     0.90     0.99
> nogovtpart~s     1.51     1.68    0.37   0.71     0.17    13.34
> ciep12           1.36     1.25    0.34   0.74     0.23     8.21
> ConsNoCon        1.28     1.18    0.27   0.79     0.21     7.86
> tvc
> Unemployment     0.99     0.01   -2.31   0.02     0.98     1.00
> GDP              1.00     0.00    2.24   0.03     1.00     1.00
> Inflation        0.98     0.01   -4.38   0.00     0.97     0.99
>
> __________________________
>
> From: Steve Samuels <sjsamuels@gmail.com>
> To: statalist@hsphsun2.harvard.edu
> Sent: Friday, 27 July 2012, 21:30
> Subject: Re: st: Large standard error, Cox PH
>
> To answer your questions, we'd need more detail. Describe the study and the
> problem variable in particular.
> As the FAQ requests, "Say exactly what you typed and exactly what Stata typed (or
> did) in response".
>
> Steve
> sjsamuels@gmail.com
>
> On Jul 27, 2012, at 2:20 PM, Lee Savage wrote:
>
> I have estimated a Cox PH model using a small sample (n=19, 347 months at risk).
> For one of my covariates I have found a large hazard ratio (77.01) with a
> correspondingly large standard error (56.61). I have seen this before but every
> time the covariate was insignificant; in the current model the covariate is
> significant (p=.001). I have tested the covariates for collinearity and
> everything looks fine. I think the probable cause is the small sample size.
>
> So my question is: is this a problem for my overall model? My inclination
> is to report the model as it is and just state that the significant effect for
> the covariate in question should be treated with extreme caution, perhaps even
> ignored.
>
> I'd appreciate any advice on this.
>
> Thanks.
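
One way to read the "Minority" row of the printout quoted above: the robust SE of 56.61 is on the hazard-ratio scale, while z and the confidence interval are computed on the log scale. The minimal Python sketch below assumes the usual delta-method relationship (SE of the hazard ratio = hazard ratio × SE of the log hazard ratio) and uses only the rounded figures from the printout. The variable names are illustrative.

```python
import math

# Rounded values for "Minority" copied from the printout.
hazard_ratio = 77.01
se_hazard_ratio = 56.61

# Assumption: the reported SE is on the hazard-ratio scale (delta method),
# so dividing by the hazard ratio recovers the SE of the log hazard ratio.
log_hr = math.log(hazard_ratio)
se_log_hr = se_hazard_ratio / hazard_ratio

# Wald z statistic and 95% interval, both formed on the log scale.
z = log_hr / se_log_hr
z_crit = 1.959964  # two-sided 95% critical value of the standard normal
ci_lower = math.exp(log_hr - z_crit * se_log_hr)
ci_upper = math.exp(log_hr + z_crit * se_log_hr)

print(f"SE of log hazard ratio = {se_log_hr:.3f}")
print(f"z = {z:.2f}")
print(f"95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```

Under that assumption the results agree with the printed z and interval up to rounding of the inputs, which is why a large SE on the ratio scale can sit alongside a small p-value. This does not address the overfitting concern raised in the reply.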
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**References:**

- **st: Large standard error, Cox PH** *From:* Lee Savage <leemsavage@yahoo.co.uk>
- **Re: st: Large standard error, Cox PH** *From:* Steve Samuels <sjsamuels@gmail.com>
- **Re: st: Large standard error, Cox PH** *From:* Lee Savage <leemsavage@yahoo.co.uk>
- **Re: st: Large standard error, Cox PH** *From:* Steve Samuels <sjsamuels@gmail.com>
- **Re: st: Large standard error, Cox PH** *From:* Lee Savage <leemsavage@yahoo.co.uk>
