
From: "Stephen Soldz" <ssoldz@soldzresearch.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: Program to calculate minimum average partial correlation added to SSC
Date: Sat, 9 Nov 2002 11:25:38 -0500

Statalisters,

Thanks to Kit Baum, my minap program has been added to SSC. minap calculates the minimum average partial correlation criterion for the number of principal components to extract.

Velicer (1976) proposed that, when conducting principal components analysis as a version of factor analysis, the number of components one should extract is the number m at which the average partial correlation of the variables, after partialling out m principal components, is a minimum. minap calculates this criterion. It can take as input either a variable list or a correlation matrix.

Many criteria for estimating the number of components in principal components analysis, or of factors in factor analysis, have been proposed (Gorsuch, 1983). One of the less frequently used of these criteria is the minimum average partial correlation proposed by Velicer (1976). The minap criterion is useful when principal components analysis is being used as an approximation to factor analysis, as with the Stata pcf option to the factor command. Gorsuch also points out that, while minap was developed for principal components analysis, it may also be useful for common factor analysis.

This criterion has performed well in simulation studies with data with a relatively clear factor structure (Zwick & Velicer, 1986). Gorsuch (1976), however, warns that minimum average partial correlation may not perform well and may suggest underextraction when there are components or factors with only a few loadings. Similarly, in many applications of principal components analysis, one may be interested in components on which only one or two variables load. minap would be inappropriate in those cases.

For comparison purposes, the number of eigenvalues greater than one, claimed by Kaiser (1960) to be a good estimator of the number of components to extract, is also provided. In most cases, this rule will recommend the extraction of more components than minap will, and Zwick and Velicer (1986) claim that it leads to overextraction.
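For readers who want to see the arithmetic behind the criterion, here is a minimal sketch in Python. It is not minap's code. It assumes the usual statement of Velicer's criterion, in which the quantity averaged is the squared partial correlation over all off-diagonal pairs, and the function names (top_component, avg_sq_partial, map_values) are made up for illustration. Components are pulled out one at a time by power iteration so that the sketch needs nothing beyond the standard library.

```python
import math

def top_component(c, iters=2000):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix (power iteration)."""
    p = len(c)
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary start vector (an assumption)
    for _ in range(iters):
        w = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * c[i][j] * v[j] for i in range(p) for j in range(p))
    return lam, v

def avg_sq_partial(c):
    """Average squared off-diagonal entry after rescaling c to a correlation matrix."""
    p = len(c)
    total = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                denom = c[i][i] * c[j][j]
                if denom <= 1e-20:
                    raise ValueError("residual variance vanished; criterion undefined")
                total += c[i][j] ** 2 / denom
    return total / (p * (p - 1))

def map_values(r):
    """Return the criterion for m = 0, 1, ..., p-1 components partialled out of r."""
    p = len(r)
    c = [row[:] for row in r]
    values = [avg_sq_partial(c)]           # m = 0: plain squared correlations
    for _ in range(1, p):
        lam, v = top_component(c)
        # Partialling out a component is the same as removing lam * v v' from c.
        c = [[c[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
        values.append(avg_sq_partial(c))
    return values

# Hypothetical 3-variable correlation matrix, for illustration only.
r = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.4],
     [0.5, 0.4, 1.0]]
values = map_values(r)
best_m = min(range(len(values)), key=values.__getitem__)
print(values, best_m)
```

The suggested number of components is the m with the smallest value. With all but one component removed, the residual matrix has rank one and every partial correlation is plus or minus one, so the last value is always 1 and the search effectively runs over the earlier ones.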
It should be noted that no criterion can be counted on by itself to determine the number of components or factors to extract with real data. Considerations of interpretability (as Nick Cox points out) are also important. In general, determining the precise number of components to retain matters more when the component (or factor) solution will be rotated.

While I certainly don't take this criterion as gospel, I find it useful. For example, in a dataset I'm working on where theory strongly suggests 5 components (and the eigenvalue >1 rule, 7 components), the minap procedure suggests 4. Lo and behold, the 4 component solution seems preferable in several ways to the 5 (or 7).

If anyone wants to test it, I've included a little program to create a matrix containing the correlation matrix from a classic data set of Harmon's that Velicer analyzes in his paper:

program define harmon
************************************************************************
** Creates Harmon correlation matrix for testing map.do               **
** Harmon correlation matrix for 8 Physical variables for 305 girls   **
** Harmon(1976), p. 22                                                **
** for testing MAP program, map results in Velicer(1976).             **
************************************************************************
mat Harmon = /*
*/ [1.000,  .846,  .805,  .859,  .473,  .398,  .301,  .382\ /*
*/   .846, 1.000,  .881,  .826,  .376,  .326,  .277,  .415\ /*
*/   .805,  .881, 1.000,  .801,  .380,  .319,  .237,  .345\ /*
*/   .859,  .826,  .801, 1.000,  .436,  .329,  .327,  .365\ /*
*/   .473,  .376,  .380,  .436, 1.000,  .762,  .730,  .629\ /*
*/   .398,  .326,  .319,  .329,  .762, 1.000,  .583,  .577\ /*
*/   .301,  .277,  .237,  .327,  .730,  .583, 1.000,  .539\ /*
*/   .382,  .415,  .345,  .365,  .629,  .577,  .539, 1.000]
mat list Harmon
end

Bug reports, etc., to me.

Cheers,
Stephen

Stephen Soldz
The Center for Research, Evaluation, and Program Development
Boston Graduate School of Psychoanalysis
1581 Beacon St.
Brookline, MA 02446
ssoldz@bgsp.edu

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

