[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
"Salvati, Jean" <JSalvati@imf.org> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
RE: RE: st: _rmcoll query |

Date |
Tue, 8 Mar 2005 11:07:33 -0500 |

Bill, Thanks a lot for the detailed reply. I'll have to read more about the sweep operator. Jean Salvati > -----Original Message----- > From: owner-statalist@hsphsun2.harvard.edu > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of > William Gould, Stata > Sent: Tuesday, March 08, 2005 10:21 AM > To: statalist@hsphsun2.harvard.edu > Subject: Re: RE: st: _rmcoll query > > Jean Salvati <JSalvati@imf.org> wants to know more about how > syminv() works: > > > Specifically, I don't understand how looking for zeroes on the > > diagonal of > > syminv(X'X) allows you to determine which variables cause > singularity > > problems. > > The inverse of matrix A is defined as the matrix X solving > the equation > > A*X = I (1) > > where I is the identity matrix. Some matrices are singular, > meaning there is no solution to (1). In that case, it is > often sufficient to obtain a generalized inverse, defined as > matrix X meeting the restrictions > > A*X*A = A > (2) > X*A*X = X > > When A is singular, (2) is not sufficient to identify > uniquely an X -- there are lots of different X's that > suffice. For most problems, one X will work as well as another. > > syminv() produces generalized inverses for symmetric, > positive definite matrices, and chooses among the X's the X > that minimizes roundoff error given its approach, its > approach being to ignore rows and columns that look collinear > to it. Ignoring rows (columns) amounts to dropping variables. When > syminv() decides to ignore a row (column), it substitutes 0 > for the row and column in the result. Thus, the easy way to > determine whether syminv() ignored a row (column) is to check > the diagonal element for zero. > > That makes it easy to calculate the rank. If > > +- -+ > | 1 0 0 | > A * syminv(A) = | 0 1 0 | > | 0 0 1 | > +- -+ > > then A is full rank. If > > +- -+ > | 1 0 0 | > A * syminv(A) = | -1 0 1 | > | 0 0 1 | > +- -+ > > then A has rank 2. More correctly, we should say that > syminv() acted as if A had rank 2. If we tightened down one > some of syminv()'s secret tolerances, we could probably > convince syminv() to treat A as if it had full rank. > > Substituting zeros in the result also means that if you > calculate A*syminv(A), you can read the linear dependencies > from the rows of the result. For instance, if you got > > +- -+ > | 1 0 0 | > A * syminv(A) = | -1 0 1 | > | 0 0 1 | > +- -+ > > then that means that > > x2 = -x1 + x3 > > > > Do you use LU or QR decomposition? > > No. LU is a popular and accurate way to to obtain inverses > of square matrices, but it does not exploit symmetry (i.e., > LU can calculate the inverse of nonsymmetric matrices). The > more numerically intensive QR is even more general, in that > it can obtain inverses of nonsquare matrices (see equation > (2) above). > > syminv() is implemented in terms of a sweep operator. Some > researchers think the sweep operator is not adequately > accurate, but the sweep can be made accurate if one puts the > same effort in its development and implementation as, say, LU > decomposition. > > What is sometimes taught as LU decomposition is not the LU > decomposition actually used. LU decomposition is often > taught as finding L and U such that A = LU, where L is lower > triangular and U and upper triangular. > Numerically speaking, however, finding L and U such that A = > LU cannot be done very accurately, and so what is actually > used is A = PLU, where P is a permutation matrix. The > reordering performed by P makes all the difference, > accuracywise, but it complicates how one uses the results > from the decomposition. > > Similarly, the sweep operator, performed in its simplest > form, is inaccurate. > There are two issues, scaling and ordering, the latter being > similar to the reordering performed by P in LU decomposition. > To handle scaling, syminv() scales the original matrix by > the median of the diagonal. As for reordering, that cannot > be handled by pivot methods (permuation matrices) without > losing the symmetric structure of the problem. The > counterpart is to determine the order in which the rows are > to be swept rather than sweeping them in order in which they > appear, and syminv() makes calculations concerning the > roundoff error that are very similar to those made in constructing P. > > -- Bill > wgould@stata.com > > * > * For searches and help try: > * http://www.stata.com/support/faqs/res/findit.html > * http://www.stata.com/support/statalist/faq > * http://www.ats.ucla.edu/stat/stata/ > * * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

- Prev by Date:
**RE: st: Re: Making sure identifiers are unique** - Next by Date:
**st: Inteff problem** - Previous by thread:
**Re: RE: st: _rmcoll query** - Next by thread:
**RE: st: confirm** - Index(es):

© Copyright 1996–2014 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |