

From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Dfbeta in Cox regression post-estimation with Stata 10.1
Date: Thu, 26 Apr 2012 16:59:24 -0400

A clarification: If you apply my jackknife-based approximation to DFBETA after -regress-, it will not match Stata's official calculation. My version is DFBETA_i = (b - b_i)/se(b), where se(b) is from the model fit to all the data. The correct version substitutes se_i(b), the standard error from the regression without observation i. In practice there will be little difference _unless_ the observation is a strong outlier.

Of course, DFBETA, even for -regress-, is a poor measure of influence because several nearby outliers can mask one another. That is why I recommend the very robust command -mmregress- (from SSC), which can also detect high-leverage observations.

Rereading BKW, I found that the authors used the term DFBETA for the unscaled difference in regression coefficients, i.e. DFBETA = (b - b_i). The scaled DFBETA was termed DFBETAS, a usage that thankfully disappeared, or we would be referring to "DFBETASs".

Steve
sjsamuels@gmail.com

Belsley, D. A., Kuh, E., & Welsch, R. E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York, NY: Wiley.

On March 15, 2012 at 17:39 PM, Steve Samuels wrote:

This email apparently didn't make it to the list, and I didn't get a bounce-back. Here's a version that works with an arbitrary number of predictors.

*************CODE BEGINS*************
version 10
sysuse auto, clear
stset length
local xvars turn price
* Jackknife all coefficients; -keep- saves the pseudo-values (used below as _b_<varname>)
jackknife _b, keep: stcox `xvars', nohr
* Refit on the full data so that _b[] and _se[] come from the ordinary fit
stcox `xvars', nohr
foreach z of varlist `xvars' {
    gen dfb_`z' = ///
        (1/_se[`z'])*(_b_`z' - _b[`z'])/(e(N_sub)-1)
}
sum dfb*
***********CODE ENDS******************

Steve
sjsamuels@gmail.com

On Feb 26, 2012, at 9:47 AM, Steve Samuels wrote:

Use -jackknife- to calculate dfbeta from first principles:

b_i = coefficient, omitting observation i from the analysis
Jackknife pseudo-value: v_i = n*b - (n-1)*b_i
dfbeta_i = (b - b_i)/se(b) = (v_i - b)/(se(b)*(n-1))

This is a direct calculation.
The values issued by later versions of Stata after -stcox- are approximations.

*************CODE BEGINS*************
sysuse auto, clear
stset length
version 10
* Jackknife the coefficient on turn; -keep- saves the pseudo-values (used below as _jk_1)
jackknife _b[turn], keep: stcox turn
* Refit on the full data so that _b[] and _se[] come from the ordinary fit
stcox turn
gen dfbjack1 = ///
    (1/(_se[turn])*(_jk_1 - _b[turn])/(e(N)-1))
* Compare with the built-in dfbeta available in later versions
version 12.1
stcox turn
predict dfbcox*, dfbeta
corr dfb*
**************CODE ENDS**************

Steve
sjsamuels@gmail.com

On Feb 25, 2012, at 6:57 AM, Annibale Cois wrote:

I need to calculate dfbeta statistics for Cox proportional hazards models. Stata 10 cannot calculate them directly (Stata 11 can). Does anyone know of a way to do the same indirectly in Stata 10?

Thanks for any help!

Annibale Cois
UCT School Of Public Health (Master Student)

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
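
The pseudo-value identity quoted in the Feb 26 message follows from one line of algebra. A minimal worked version, using the thread's own symbols (b for the full-data coefficient, b_i for the coefficient with observation i omitted, v_i for the jackknife pseudo-value, n for the number of observations):

```latex
\begin{aligned}
v_i &= n\,b - (n-1)\,b_i \\
v_i - b &= (n-1)\,b - (n-1)\,b_i = (n-1)\,(b - b_i) \\
\frac{v_i - b}{(n-1)\,\mathrm{se}(b)} &= \frac{b - b_i}{\mathrm{se}(b)} = \mathrm{dfbeta}_i
\end{aligned}
```

In the posted code, `_jk_1` (or `_b_turn` and `_b_price` in the multi-predictor version) appears to play the role of v_i, `_b[turn]` is b, `_se[turn]` is se(b), and `e(N)` or `e(N_sub)` is n. Taking the April 26 description of the official -regress- version at face value, that version would differ from this one only by the factor se(b)/se_i(b).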
