Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | David Hoaglin <dchoaglin@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: cmp and condition numbers |
Date | Fri, 4 May 2012 08:30:24 -0400 |
In a least-squares regression problem, the condition number is the ratio of the largest to the smallest singular value of the matrix of regressors (X). The singular values of X are the square roots of the eigenvalues of (X-transpose)X, so the condition number of (X-transpose)X is the square of the condition number of X.

For a nonsingular square matrix A, the eigenvalues of A-inverse are the reciprocals of the eigenvalues of A. For a symmetric positive definite A, the product of the largest eigenvalues of A and A-inverse is therefore the ratio of the largest to the smallest eigenvalue of A.

Centering and rescaling the regressors changes the condition number, usually making it smaller. The correlation matrix should have a smaller condition number than (X-transpose)X, but that need not be smaller than the condition number of X.

As condition numbers go, 1000 is not terribly high.

David Hoaglin

On Fri, May 4, 2012 at 7:45 AM, David Roodman (droodman@cgdev.org) <DRoodman@cgdev.org> wrote:
> Philip, the two issues you raise may be unrelated.
>
> cmp is not designed for true simultaneous systems, by which I mean ones in which the matrix of coefficients of the dependent variables in each other's equations is not triangular.
>
> As for the condition number, there is more than one way to compute this. For each equation, cmp applies Mata's built-in cond() function to the correlation matrix of the non-constant regressors, taking cond()'s defaults. This amounts to computing the product of the maximum eigenvalues of the correlation matrix and its inverse. I think I got this formulation from Greene, but I can't check today because I am travelling.
>
> --David
>
> ---------------
> From "Bromiley, Philip" <bromiley@uci.edu>
> To "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
> Subject st: cmp and condition numbers
> Date Mon, 30 Apr 2012 23:30:28 +0000
> I'm trying to estimate a simultaneous system with three continuous and one discrete variable using cmp. I have been unable to get it to estimate properly: lots of "not concave" and "backed up" messages, and then it crashes saying it has hit a discontinuous or flat region.
>
> Cmp warns me that I have an ill-conditioned regressor matrix and reports high condition numbers for each of the equations (40 to 1000). However, when I run the equation with regress, I don't get high VIF's, and get a much lower condition number.
>
> Would someone know the reason for such a discrepancy? Any suggestions would be welcome.
>
> Phil
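
The different condition numbers discussed in this thread can be compared on a small example. The following is a minimal sketch in plain Python, not what cmp, regress, or Mata's cond() actually run: it is restricted to two regressors so that the eigenvalues of the 2x2 matrices have a closed form, and the data and the helper names (sym2_eigs, condition_numbers) are made up for illustration. It computes the condition number of X (ratio of singular values), of (X-transpose)X (ratio of eigenvalues), and of the correlation matrix, the last one as the product of the largest eigenvalues of the matrix and its inverse.

```python
import math

def sym2_eigs(a, b, c):
    """Return (largest, smallest) eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    half_gap = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    return mean + half_gap, mean - half_gap

def condition_numbers(x1, x2):
    """Return (cond of X, cond of X'X, cond of correlation matrix) for two regressors."""
    n = len(x1)

    # Raw cross-product matrix X'X, with no constant column.
    a = sum(v * v for v in x1)
    b = sum(u * v for u, v in zip(x1, x2))
    c = sum(v * v for v in x2)
    big, small = sym2_eigs(a, b, c)
    cond_xtx = big / small        # ratio of eigenvalues of X'X
    cond_x = math.sqrt(cond_xtx)  # singular values of X are square roots of those

    # Correlation matrix [[1, r], [r, 1]]: regressors centered and rescaled.
    m1, m2 = sum(x1) / n, sum(x2) / n
    s11 = sum((v - m1) ** 2 for v in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    r = s12 / math.sqrt(s11 * s22)
    r_big, r_small = sym2_eigs(1.0, r, 1.0)

    # Eigenvalues of the inverse are reciprocals, so its largest one is 1 / r_small.
    cond_corr = r_big * (1.0 / r_small)
    return cond_x, cond_xtx, cond_corr

# Hypothetical data: two regressors far from the origin, correlation 0.9.
x1 = [10.0, 11.0, 12.0, 13.0, 14.0]
x2 = [20.0, 22.0, 21.0, 23.0, 24.0]
raw = condition_numbers(x1, x2)

# The same data after subtracting each column's mean.
x1c = [v - sum(x1) / len(x1) for v in x1]
x2c = [v - sum(x2) / len(x2) for v in x2]
centered = condition_numbers(x1c, x2c)

print("raw:      cond(X)=%.2f  cond(X'X)=%.2f  cond(corr)=%.2f" % raw)
print("centered: cond(X)=%.2f  cond(X'X)=%.2f  cond(corr)=%.2f" % centered)
```

With these made-up numbers the correlation matrix has condition number 19 in both cases, since centering does not change correlations. For the raw data that is well below the condition number of (X-transpose)X (roughly 1300) and also below that of X (roughly 36). For the centered data, (X-transpose)X itself has condition number 19 and X has the square root of 19, so the correlation-matrix figure is larger than the condition number of X. Whether a gap of this kind explains the discrepancy Phil reports depends on which matrix each command uses, which this sketch does not establish.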