Welsch One-Step Bounded-Influence Estimator
-------------------------------------------

AUTHOR:  Richard Goldstein, Qualitas
SUPPORT: Written communication only, EMAIL goldst@@harvarda.bitnet or
         37 Kirkwood Road, Brighton MA 02135.

    ^bound^ depvar varlist [^in^ range] [^if^ exp]

^bound^ first estimates a standard linear regression, then computes a number
of regression diagnostics, and finally uses dffits to estimate the one-step
Welsch bounded-influence estimator. This is not the full Krasker-Welsch
bounded-influence estimator. The one-step estimator is suggested in Welsch, RE
(1980), "Regression Sensitivity Analysis and Bounded-Influence Estimation", in
^Evaluation of Econometric Models^, ed. by J Kmenta and JB Ramsey, New York:
Academic Press, pp. 153-167; Welsch claims that the "cutoff of 0.34 is chosen
for approximately 95% asymptotic efficiency." (p. 165)

As part of the output, between the standard and one-step regression results,
a list of cases appears with values on a number of diagnostics: hat, the
studentized residual, dffits, dfbeta on the first of the right-hand-side
variables, Cook's distance, the covariance ratio, the likelihood distance, and
a probability for that distance. Only cases that cross the 'rule-of-thumb'
cutoff for any ^one^ of these are shown. Above the list, some of the cutoffs
for that data set are presented. No cutoffs are shown for Cook's distance (I
just use 1.0) or for the studentized residual (I just use 2.0). The cutoff for
hat is 3p/n, not the BKW (Belsley, Kuh, and Welsch) suggestion of 2p/n, as I
have found this to be more informative in my own work. I have not found the
covariance ratio or the likelihood distance to be worth very much, but have
left them in case others find them helpful.

Do not use any standard regression options, or ^test^, as they interfere with
the purpose of this file, which is JUST to use some diagnostics.

The example below uses the ^auto.dta^ data set delivered with Stata:

Example:
--------

. ^use auto^
(1978 Automobile Data)

. ^bound mpg weight weightsq^

Some Regression Diagnostics

(obs=74)

  Source |      SS        df      MS            Number of obs =      74
---------+------------------------------        F(  2,   71)  =   72.80
   Model |  1642.52197     2  821.260986        Prob > F      =  0.0000
Residual |  800.937487    71  11.2808097        R-square      =  0.6722
---------+------------------------------        Adj R-square  =  0.6630
   Total |  2443.45946    73  33.4720474        Root MSE      =  3.3587

Variable | Coefficient    Std. Error         t   Prob > |t|         Mean
---------+--------------------------------------------------------------
     mpg |                                                       21.2973
---------+--------------------------------------------------------------
  weight |   -.0141581      .0038835    -3.646        0.001     3019.459
weightsq |    1.32e-06      6.26e-07     2.116        0.038      9713003
   _cons |    51.18308      5.767884     8.874        0.000            1
---------+--------------------------------------------------------------

(_dfbeta now contains DF-Betas for weight)
(13 changes made)

covratio is measure of influence of observation on covariance matrix
_ld is measure of influence of observation on the likelihood (b AND s^^2)
_ld is distributed approx. as chi-square with df=# of IV's w/constant
_ldp is the 'probability value' indicating that the removal of the case in
  question will displace the estimate the edge of an x% confidence region,
  where 'x'=value in table; thus, HIGH %'s sig.

dfbeta cut=0.2325  dffits cut=0.4027  hat cut=0.0811  covratio cut=1 +/- 0.1216

      _dfbeta  _dffits     cook     _hat     rstu covratio      _ld     _ldp
 11.    0.035   -0.067    0.002    0.085   -0.219   1.1384    0.011    0.000
 13.   -0.277    0.562    0.101    0.077    1.953   0.9638    0.388    0.057
 24.    0.138   -0.187    0.012    0.084   -0.616   1.1210    0.039    0.002
 26.    0.308   -0.399    0.054    0.307   -0.599   1.4837    0.169    0.018
 27.    0.255   -0.347    0.040    0.232   -0.630   1.3365    0.128    0.012
 42.    0.311    0.459    0.064    0.026    2.811   0.7771    0.552    0.093
 43.   -0.280    0.380    0.048    0.084    1.253   1.0660    0.153    0.015
 57.   -0.259    0.479    0.072    0.045    2.193   0.8955    0.344    0.048
 62.    0.182   -0.238    0.019    0.094   -0.738   1.1254    0.060    0.004
 65.    0.240   -0.333    0.037    0.077   -1.152   1.0689    0.117    0.010
 66.   -0.240    0.477    0.072    0.042    2.277   0.8790    0.364    0.052
 67.   -0.035   -0.318    0.032    0.023   -2.082   0.8917    0.186    0.020
 71.   -0.496    0.962    0.242    0.043    4.533   0.5038    3.354    0.660

(10 changes made)
(sum of wgt is   7.1771e+01)

(obs=74)

  Source |      SS        df      MS            Number of obs =      74
---------+------------------------------        F(  2,   71)  =   82.79
   Model |  1468.06748     2  734.033739        Prob > F      =  0.0000
Residual |  629.466469    71  8.86572491        R-square      =  0.6999
---------+------------------------------        Adj R-square  =  0.6914
   Total |  2097.53395    73  28.7333417        Root MSE      =  2.9775

Variable | Coefficient    Std. Error         t   Prob > |t|         Mean
---------+--------------------------------------------------------------
     mpg |                                                      20.99078
---------+--------------------------------------------------------------
  weight |   -.0123652      .0034982    -3.535        0.001     3025.622
weightsq |    1.07e-06      5.65e-07     1.894        0.062      9732317
   _cons |    47.99348      5.196598     9.236        0.000            1
---------+--------------------------------------------------------------
(1978 Automobile Data)

Results:
--------

Note that no case exceeds the 'rule-of-thumb' guideline on ^every^ diagnostic.
Also, as one might expect, certain diagnostics ^tend^ to agree with each
other; for example, though dffits and Cook's distance are closely related,
dffits "agrees" more often with dfbeta than Cook's distance does. Also note
that dffits is signed while Cook's distance is not; some people find the sign
helpful.

These added variables are not dropped in the ado file, since you might want to
use some of them for graphs. For example, neither the diagonal of the hat
matrix ("hat") nor the studentized residual is particularly informative by
itself; however, a case that is high on both is quite likely to be
influential. Thus, one might draw a scatterplot of these two variables, adding
guidelines at the cutoffs and looking for cases that are high on both
diagnostics.

It is important to compare the standard regression (at the top) with the
Welsch one-step bounded-influence regression (at the bottom). If they are very
different, then there are influential cases, regardless of what is in the list
of diagnostics. A difference could show up in the coefficients, in their
standard errors, or in the summary statistics for the regression (such as
R-squared). Here there is very little difference.
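
For readers who want to see the arithmetic behind the second regression, here
is a minimal sketch of a one-step dffits-weighted fit done by hand. It is not
the code of ^bound^ itself. It assumes a current version of Stata, that mpg,
weight, and weightsq are in memory as in the example, that the weight rule is
min(1, 0.34/|dffits|), and that the weights enter as analytic weights. The
variable names d, h, r, and w are made up for illustration. The guidelines in
the graph are the 2.0 quoted for the studentized residual and the 0.0811
printed as the hat cutoff in the example.

    * ordinary least-squares fit
    regress mpg weight weightsq

    * case diagnostics from that fit (names d, h, r are illustrative)
    predict double d, dfits
    predict double h, hat
    predict double r, rstudent

    * cases high on both hat and the studentized residual stand out here
    scatter r h, yline(-2 2) xline(0.0811)

    * weight is 1 unless |dffits| exceeds 0.34, then 0.34/|dffits|
    * (assumed rule; 0.34 is the cutoff quoted from Welsch)
    generate double w = 1
    replace w = 0.34/abs(d) if abs(d) > 0.34 & !missing(d)

    * refit with the downweighted cases (analytic weights are an assumption)
    regress mpg weight weightsq [aweight = w]

One way to check these assumptions against ^bound^ is to compare the number of
cases changed by the replace line, and the sum of w, with the corresponding
notes printed before the second regression in the example.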