^DATE: 4/24/91: J. Hilbe          Update 8/19/91               sqv1.hlp
^
^                 HELP ON EXTLOGIT.ADO EXTENSION

The following program listing adds further extensions to logistic
regression. However, rather than changing the CRC-supplied ^logiodds^,
a new command, ^logiodd2^, is provided. It is the same as ^logiodds^,
except that it includes the new extended option. Refer to ^logiodds.hlp^
for assistance with the command.

The proper syntax for use of the logiodd2 command is:

^    logiodd2 depvar varlist [=exp] [if exp] [in range]
^             [,[e][f][l][st[level#]] [roc [saving(string)]] ]

The '^e^' option gives access to the '^extlogit.ado^' file from within
the initial '^logiodds^' command.

To get a quick look at option capabilities, load an appropriate file
and type:

^    logiodd2 {depvar} {varlist}, e f l st roc  [ENTER]

The '^e^' option provides the following statistics:

  1. Model goodness-of-fit statistics, including a modified
     Hosmer-Lemeshow goodness-of-fit statistic and both -2LL(model)
     and -2LL(intercept) statistics. They are accompanied by
     appropriate chi-square significance levels. The modified H-L
     statistic is not based upon replications of covariate patterns;
     hence it will work when logifit GOF statistics may not.

  2. Wald statistics and their respective chi-square significance
     levels. Although the t significance values used by Stata are
     nearly the same as those for the Wald statistics, many programs
     supply Wald stats. Comparisons can thus be made.

  3. Partial correlation values for each coefficient. Values are
     interpreted as follows: as a value rises from 0 toward 1, the
     likelihood of the event occurring increases; as it falls from
     0 toward -1, the likelihood of the event occurring decreases.

The option also creates several diagnostic variables which can aid in
assessing influence and fit. Among the foremost are...

  1. ^logpred^ - the probability of success (1)
  2. ^resid^   - residual
  3. ^stresid^ - standardized (Pearson) residual
  4. ^hat^     - diagonal of hat matrix
  5. ^dev^     - deviance
  6. ^cmod^    - confidence interval displacement diagnostic; measure
                 of the effect of each covariate pattern on estimated
                 parameter values
  7. ^cook^    - Cook's distance
  8. ^deltad^  - measure of the effect of each covariate pattern on
                 model fit; based on deviance
  9. ^deltax^  - same as deltad, except based on Pearson chi-square

The hat values are based upon covariate replications; i.e., they
represent a more correct m-asymptotic distribution (unlike those of
several other major software packages). Delta* values, Cook, and the
modified Cook's distance statistic, since they are calculated using hat
values, are also based on m-asymptotic covariate distributions.

There are many obstacles in assessing influence and fit. Those who want
a rather quick look at their model, having first found it statistically
acceptable, can use the following three commands:

To check for cases influencing model fit...

     ^gr deltad logpred, xlab ylab yline(4)^     and
     ^gr deltax logpred, xlab ylab yline(4)^
     (those cases above 4 significantly affect fit)

To check for cases influencing parameters...

     ^gr cmod logpred, xlab ylab yline(.9)^
     (those cases above 0.9 significantly influence coefficient values)

The above methodology can serve well as a preliminary test.

                                 ^***^

I shall demonstrate use of the extension by means of a partial data
listing found on the Stata disks called cancer.dta. The classification
or dependent variable is died (1=died, 0=not died), while the two
independent variables are drug and age. Drug type 1 is the placebo and
will be considered as the referent. Only the 'e' option will be used.

^. use cancer, clear
(Patient Survival in Drug Trial)

^. tabulate drug, gen(drg)

  Drug type|
(1=placebo)|      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         20       41.67       41.67
          2 |         14       29.17       70.83
          3 |         14       29.17      100.00
------------+-----------------------------------
      Total |         48      100.00

^. logiodds died age drg2 drg3, e

Variable |  Odds Ratio   Std. Error    [95% Conf. Interval]   Delta*Coef
---------+--------------------------------------------------------------
     age |   1.975711*    .8030674    .8713186    4.479917    .6809284*
    drg2 |    .0287022    .0351004    .0024446    .3369952   -3.550783
    drg3 |    .0387934    .0458072    .0035966    .4184295   -3.249504
---------+--------------------------------------------------------------
(*) Note: Delta = 1 SD rather than 1 unit.

^MODEL GOODNESS OF FIT STATISTICS
^** ChiSq>.05 fails to reject hypothesis that model fits

H-L Goodness of Fit (Mod) =>   56.5838   ChiSq sign. (df:N-k) => 0.0966
-2 LL(Model)              =>   43.0511   ChiSq sign. (df:N-k) => 0.5122
-2 LL(Intercept)          =>   62.3988   ChiSq sign. (df:N-k) => 0.0352

^WALD STATISTICS & PARTIAL CORRELATIONS

 No   Var        Wald     Prob(Chi)    Partial Corr
================================================
  1   age       2.8064      0.0939        0.1137
  2   drg2      8.4305      0.0037        0.3210
  3   drg3      7.5733      0.0059        0.2989

^Additional diagnostic variables created...

  ^logindex^ = Logit; Index value
  ^sepred^   = Standard error of index
  ^logpred^  = Probability of success (1)
  ^resid^    = Residual
  ^stresid^  = Standardized residual (Pearson)
  ^hat^      = Hat matrix diagonal
  ^dev^      = Deviance
  ^cmod^     = Influence on est parameter values
  ^cook^     = Cook's distance
  ^deltad^   = Change in Deviance
  ^deltax^   = Change in Pearson chi-square

^. list died logpred age drg2 drg3 deltad if deltad>=4

        died    logpred    age   drg2   drg3     deltad
   1.      1   .1793184     47      1      0   4.141695
  41.      0   .9662145     58      0      0   9.167167

^. list died logpred age drg2 drg3 deltax if deltax>=4

        died    logpred    age   drg2   drg3     deltax
   1.      1   .1793184     47      1      0   5.281184
  41.      0   .9662145     58      0      0   30.99025

^. list died logpred age drg2 drg3 cmod if cmod>=.8

        died    logpred    age   drg2   drg3       cmod
   1.      1   .1793184     47      1      0    .812959
  41.      0   .9662145     58      0      0   2.591738

To obtain a graphical interpretation of the relation between deltad and
logpred, weighted by cmod, type...

^. gr deltad logpred =cmod, xlab ylab yline(4) border t1(DELTAD
^> by LOGPRED)

^END
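As a cross-check on the example output, the Wald statistics and partial correlations can be recomputed by hand from the printed odds ratios and standard errors. The sketch below is written in Python rather than Stata and is not taken from extlogit.ado. It assumes that the standard error of each coefficient is the odds-ratio standard error divided by the odds ratio, and that the partial correlation is sqrt((Wald - 2) / -2LL(intercept)). Both formulas are assumptions, chosen because they reproduce the printed figures; the function names are hypothetical.

```python
import math

# Figures copied from the example output: (odds ratio, std. error).
NEG2LL_INTERCEPT = 62.3988
ROWS = {
    "age":  (1.975711, 0.8030674),   # per 1 SD, as the table's note says
    "drg2": (0.0287022, 0.0351004),
    "drg3": (0.0387934, 0.0458072),
}


def wald_from_odds_ratio(odds_ratio, se_odds_ratio):
    """Return (coefficient, Wald chi-square) from an odds ratio and its SE.

    Assumption: SE(coef) = SE(odds ratio) / odds ratio (delta method).
    """
    coef = math.log(odds_ratio)
    se_coef = se_odds_ratio / odds_ratio
    return coef, (coef / se_coef) ** 2


def partial_corr(wald, neg2ll_intercept):
    """Unsigned partial correlation.

    Assumption: R = sqrt((Wald - 2) / -2LL(intercept)), set to 0 when
    Wald < 2. The example output prints unsigned values, so no sign is
    attached here.
    """
    return math.sqrt(max(wald - 2.0, 0.0) / neg2ll_intercept)


results = {}
for name, (odds_ratio, se) in ROWS.items():
    coef, wald = wald_from_odds_ratio(odds_ratio, se)
    results[name] = (coef, wald, partial_corr(wald, NEG2LL_INTERCEPT))
    print(f"{name:5s} coef={coef:9.6f} wald={wald:7.4f} "
          f"partial={results[name][2]:6.4f}")
```

Under these assumptions the recomputed coefficients agree with the Delta*Coef column, and the Wald and partial correlation values agree with the printed table to about three decimal places. Note that the printed partial correlations for drg2 and drg3 are positive even though their coefficients are negative.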