  # Re: st: multicollinearity test for probit?

From: John Hendrickx
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: multicollinearity test for probit?
Date: Thu, 11 Dec 2003 10:10:31 -0800 (PST)

```
--- "Matt A. Barreto" <Mbarreto@uci.edu> wrote:
> Is there a similar command to vif following regress when using probit
> or oprobit (or logit/ologit) to test for multicollinearity among
> independent variables in a probit equation?
>
Others have noted that collinearity is a problem among the
right-hand-side variables, so the vif and condition diagnostics from a
regression model with the same independent variables are also valid
for a probit model.
I've done some work on an alternative approach using SPSS and
recently I've been working on a Stata version.

The basic problem at the time was that, according to Belsley, D.A.
(1991), "Conditioning diagnostics, collinearity and weak data in
regression", New York: John Wiley & Sons, the usual collinearity
diagnostics aren't useful if there are interactions or polynomial
terms in the model. The trouble with collinearity itself is that,
because of the strong correlation among independent variables, a small
change in the value of one of them can result in very different
coefficients for the other collinear variables. (This is discernible
in the large standard errors of coefficients where collinearity is
strong.) In models where independent variables appear both as main
effects and in interactions, or in linear, squared, etc. terms, the
usual diagnostics don't take these relationships between model terms
into account.

This was Belsley's main argument against the use of collinearity
diagnostics for interactions as I recall. He also wasn't satisfied
with the VIF statistic and showed that it can't always detect
multicollinearity. Condition indices are better but even if they
indicate multicollinearity, the problem need not be serious if the
dataset is large enough to provide stable estimates.

Ok, the solution gleaned from Belsley and implemented by a group of us
was to add a small random value to selected independent variables and
re-estimate the model. Iterate 100 or so times, and assess the
stability of the estimates.

An SPSS macro for doing this is at
http://www.xs4all.nl/~jhckx/spss/perturb/perturb.html
along with a little more background information. Here's a Stata
do-file for performing the same procedure. It should be fairly simple
to adapt it to other problems: just modify the lines with prtb*
variables and change the macro (the exp2 program) if there are other
transformations of independent variables. It should work with probit
or any other procedure that produces an e(b) matrix of coefficients.
As said, I'm
working on a proper Stata ado file. Comments on this procedure are
very welcome.

HTH,
John Hendrickx

-------------------------------------
set memory 16m
set matsize 150
use recoded

xi3 ses fses*eyr educyr*eyr fses*exp educyr*exp exp2
corr  fses eyr _IfsXey educyr _IedXey exp _IfsXex _IedXex exp2
collin  fses eyr _IfsXey educyr _IedXey exp _IfsXex _IedXex exp2

xi3: regr ses fses*eyr educyr*eyr fses*exp educyr*exp exp2
vif
matrix allb=e(b)

gen prtb1 = 0
gen prtb2 = 0
gen nlin1 = 0
label var prtb1 "eyr + uniform(-2.5,2.5)"
label var prtb2 "exp + uniform(-2.5,2.5)"
label var nlin1 "exp^2"

* Recompute the squared term from the perturbed exp (prtb2)
capture program drop exp2
program define exp2
quietly replace nlin1=prtb2^2
end

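forval i=1/100 {

* Perturb eyr and exp by a uniform value in [-2.5, 2.5)
quietly replace prtb1=eyr+(uniform()-.5)*5
quietly replace prtb2=exp+(uniform()-.5)*5
exp2

* Re-estimate and append the coefficients as a new row of allb
quietly xi3: regr ses fses*prtb1 educyr*prtb1 fses*prtb2 educyr*prtb2 nlin1
matrix allb=allb\e(b)
}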

matrix list allb

drop _all

svmat allb, names(eqcol)
summarize
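
* A possible follow-up, sketched under the assumption that allb was
* built as above, so that observation 1 holds the original (unperturbed)
* estimates and the remaining observations hold the perturbed ones.
* Listing the first row next to summary statistics for the rest shows
* how far each coefficient moves under perturbation.
list in 1
tabstat _all if _n > 1, statistics(mean sd min max) columns(statistics)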
