
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Dfbeta in Cox regression post-estimation with Stata 10.1

From	Steve Samuels <[email protected]>
To	[email protected]
Subject	Re: st: Dfbeta in Cox regression post-estimation with Stata 10.1
Date	Thu, 26 Apr 2012 16:59:24 -0400

A clarification:

If you apply my jackknife-based approximation to DFBETA after -regress-,
it will not match Stata's official calculation.
My version is DFBETA_i = (b - b_i)/se(b), where se(b) is the standard error from the model fit to all the data.
The official version substitutes se_i(b), the standard error from the regression without observation i.
In practice there will be little difference _unless_ the observation is a strong outlier.
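
A minimal sketch of that comparison after -regress-, for anyone who wants to see the size of the difference. It assumes the same pseudo-value formula used for -stcox- in the messages quoted below, and it assumes the kept pseudo-value variable is named _jk_1, as it is there. The choice of price and weight from the auto data, and the names dfb_jack, dfb_off and ratio, are only for illustration.

*************CODE BEGINS*************
sysuse auto, clear

* Leave-one-out pseudo-values for the weight coefficient (assumed to be kept in _jk_1)
jackknife _b[weight], keep: regress price weight

* Refit so that _b[] and _se[] are the full-data, model-based values
regress price weight

* Jackknife-based version, scaled by the full-data standard error
gen dfb_jack = (_jk_1 - _b[weight])/(_se[weight]*(e(N) - 1))

* Official version, obtained with -predict- after -regress-
predict dfb_off, dfbeta(weight)

* The ratio shows how far the two scalings differ, observation by observation
gen ratio = dfb_jack/dfb_off
sum dfb_jack dfb_off ratio
*************CODE ENDS*************

If the assumptions hold, the ratio should stay close to 1 except for observations that are strong outliers.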

Of course, even for -regress-, DFBETA is a poor measure of influence, because several nearby outliers
can mask one another. That is why I recommend the very robust command -mmregress- (from SSC), which
can also detect high-leverage observations.

Rereading BKW (Belsley, Kuh, & Welsch, 1980), I found that the authors used the term DFBETA for the unscaled difference in regression coefficients,
i.e. DFBETA = (b - b_i). The scaled DFBETA was termed DFBETAS, a usage that thankfully disappeared, or we would be
referring to "DFBETASs".

Steve
[email protected]

Belsley, D. A., Kuh, E., & Welsch, R. E. (1980).
Regression Diagnostics: Identifying Influential Data and Sources of Collinearity.
New York, NY: Wiley.


On March 15, 2012 at 17:39, Steve Samuels wrote:
This email apparently didn't make it to the list and I didn't get a bounce-back.

Here's a version that works with an arbitrary number of predictors.


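*************CODE BEGINS*************
version 10

sysuse auto, clear
stset length

local xvars turn price

* Keep the jackknife pseudo-values as new variables, one per coefficient (_b_turn, _b_price)
jackknife _b, keep: stcox `xvars', nohr

* Refit on the full data so that _b[] and _se[] are the model-based values
stcox `xvars', nohr

* dfbeta = (pseudo-value - b)/(se(b)*(n-1)), one new variable per predictor
foreach z of varlist `xvars' {
    gen dfb_`z' = ///
        (1/_se[`z'])*(_b_`z' - _b[`z'])/(e(N_sub) - 1)
}

sum dfb*
***********CODE ENDS*******************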


Steve
[email protected]

On Feb 26, 2012, at 9:47 AM, Steve Samuels wrote:



Use -jackknife- to calculate dfbeta from first principles:

b   = coefficient from the full data (n observations)
b_i = coefficient with observation i omitted from the analysis

Jackknife pseudo-value: v_i = n*b - (n-1)*b_i

dfbeta_i = (b - b_i)/se(b)
         = (v_i - b)/(se(b)*(n-1))

This is a direct calculation. The dfbeta values that later versions of Stata
(11 and up) produce after -stcox- are approximations.
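
The second line of the formula follows from v_i - b = n*b - (n-1)*b_i - b = (n-1)*(b - b_i). A rough sketch of how one might check this numerically for a single observation is given here. It assumes that -jackknife, keep- stores the pseudo-values in _jk_1 (as in the code that follows); the variable id and the scalar names b_full, n_obs and b1_implied are made up for the example.

*************CODE BEGINS*************
sysuse auto, clear
stset length
gen long id = _n

* Pseudo-values for the coefficient of turn, then the full-data fit
jackknife _b[turn], keep: stcox turn
stcox turn
scalar b_full = _b[turn]
scalar n_obs  = e(N)

* Pseudo-value for the first car, and the leave-one-out coefficient it implies
sum _jk_1 if id == 1, meanonly
scalar b1_implied = (n_obs*b_full - r(mean))/(n_obs - 1)

* Direct refit without that car, for comparison
stcox turn if id != 1
display "implied: " b1_implied "   refit: " _b[turn]
*************CODE ENDS*************

If the assumptions hold, the two displayed values should agree closely.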

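*************CODE BEGINS*************
sysuse auto, clear
stset length

version 10

* Jackknife pseudo-values for the coefficient of turn are kept in _jk_1
jackknife _b[turn], keep: stcox turn
* Refit on the full data for the model-based _b[turn] and _se[turn]
stcox turn
* dfbeta_i = (v_i - b)/(se(b)*(n-1))
gen dfbjack1 = (_jk_1 - _b[turn])/(_se[turn]*(e(N) - 1))

version 12.1

* Compare with the built-in approximation (needs Stata 11 or later)
stcox turn
predict dfbcox*, dfbeta
corr dfb*

**************CODE ENDS**************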

Steve
[email protected]


On Feb 25, 2012, at 6:57 AM, Annibale Cois wrote:

I need to calculate dfbeta statistics for Cox proportional hazards
models. Stata 10 cannot calculate them directly (Stata 11 can).
Does anyone know of a way to do the same indirectly in Stata 10?

Thanks for any help!

Annibale Cois
UCT School Of Public Health
(Master Student)
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


