Difference Test for Specification Error in Regression
-----------------------------------------------------

AUTHOR:   Richard Goldstein, Qualitas
SUPPORT:  Written communication only, EMAIL goldst@@harvarda.bitnet or
          37 Kirkwood Road, Brighton MA 02135.

        ^pswdiff^ varlist [^in^ range] [^if^ exp]

^pswdiff^ is a test for general (i.e., non-specific) specification error in a
standard linear regression. It is implemented as a test for omitted variables:
the added variables are functions of the lags and leads of the included
right-hand-side variables.

^DO NOT^ include a constant, seasonal dummies, or polynomial terms in your
model, and do not include a lagged version of the left-hand-side variable on
the right-hand side. A version of this test that allows a lagged dependent
variable is given in the citation below.

The reported regression itself is of little (NO) interest; what is of interest
is the joint test of significance reported after the regression. If it is
significant, you have left out at least one important variable or you have the
wrong functional form; if it is not significant, then you may not have one of
these problems.

The PSW difference test for specification error (see R. Davidson, L. Godfrey,
and J. G. MacKinnon (1985), "A Simplified Version of the Differencing Test",
^International Economic Review^, 26, pp. 639-47) is a general test for model
misspecification--applicable to time-series data. It is equivalent to a test
for omitted variables, and is carried out by adding to the regression, for each
right-hand-side variable, a new variable defined as the sum of that variable's
lagged and leaded values.

Example:
--------

 . ^use auto^
 (1978 Automobile Data)

 . ^pswdiff mpg weight weightsq^
 (2 missing values generated)
 (1 change made)
 (1 change made)
 (2 missing values generated)
 (1 change made)
 (1 change made)
 (obs=74)

   Source |       SS       df       MS                  Number of obs =      74
 ---------+------------------------------               F(  4,    70) =  458.15
    Model |  34683.1966     4  8670.79916               Prob > F      =  0.0000
 Residual |  1324.80338    70  18.9257625               R-square      =  0.9632
 ---------+------------------------------               Adj R-square  =  0.9611
    Total |    36008.00    74  486.594595               Root MSE      =  4.3504

 Variable | Coefficient    Std. Error       t     Prob > |t|        Mean
 ---------+--------------------------------------------------------------
      mpg |                                                      21.2973
 ---------+--------------------------------------------------------------
   weight |   .0070224      .0030402      2.310     0.024       3019.459
 weightsq |  -1.99e-06      5.13e-07     -3.875     0.000        9713003
    diff1 |   .0068591      .0015765      4.351     0.000       5966.757
    diff2 |  -1.13e-06      2.79e-07     -4.036     0.000       1.92e+07
 ---------+--------------------------------------------------------------

  ( 1)  diff1 = 0.0
  ( 2)  diff2 = 0.0

        F(  2,    70) =    9.63
             Prob > F =    0.0002

The joint test of diff1 and diff2 is highly significant (F(2, 70) = 9.63,
Prob > F = 0.0002), so something is wrong with our specification of the model.
If we were actually analyzing these data, we would next try to find out what
specifically is wrong and then correct it.
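
For readers who want to see the mechanics, the construction described above
(add the sum of each regressor's lag and lead, then test the added variables
jointly) can be sketched outside Stata. The Python below is a rough
illustration under stated assumptions, not the code of ^pswdiff^ itself: the
function names (lag_lead_sum, rss, diff_test) and the toy data are
hypothetical, and the first and last observations are simply dropped. The
example output above suggests that ^pswdiff^ instead fills in those two
observations; how it does so is not documented here.

```python
# Illustrative sketch only; all names and data below are hypothetical.

def lag_lead_sum(x):
    """z[t] = x[t-1] + x[t+1]; None at the endpoints, which lack a lag or lead."""
    n = len(x)
    return [x[t - 1] + x[t + 1] if 0 < t < n - 1 else None for t in range(n)]


def rss(y, cols):
    """Residual sum of squares from OLS of y on cols, with no constant."""
    n, k = len(y), len(cols)
    # Augmented normal equations [X'X | X'y].
    a = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
         + [sum(cols[i][t] * y[t] for t in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    b = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(b[i] * cols[i][t] for i in range(k)) for t in range(n)]
    return sum((y[t] - fitted[t]) ** 2 for t in range(n))


def diff_test(y, xs):
    """Joint F test of the added lag+lead variables; returns (F, q, df).

    Assumption: the two endpoint observations are dropped, which need not
    match what pswdiff does with them.
    """
    keep = range(1, len(y) - 1)
    y_in = [y[t] for t in keep]
    x_in = [[x[t] for t in keep] for x in xs]
    z_in = [[z[t] for t in keep] for z in (lag_lead_sum(x) for x in xs)]
    rss_r = rss(y_in, x_in)          # restricted: original regressors only
    rss_u = rss(y_in, x_in + z_in)   # unrestricted: plus added variables
    q = len(z_in)                    # number of restrictions tested
    df = len(y_in) - len(x_in) - q   # residual degrees of freedom
    return ((rss_r - rss_u) / q) / (rss_u / df), q, df


# Made-up series, purely to exercise the functions.
x = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0, 6.0, 8.0]
y = [1.2, 2.1, 4.3, 2.8, 5.1, 7.4, 5.9, 8.2]
f_stat, q, df = diff_test(y, [x])
print("F(%d, %d) = %.3f" % (q, df, f_stat))
```

The F statistic compares the residual sum of squares with and without the
added variables, which is the same joint test that ^pswdiff^ reports after its
regression. Solving the normal equations directly keeps the sketch free of
external libraries; it is adequate for small, well-scaled examples but is not
how one would fit a regression in practice.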