Stata 11 help for pca_postestimation


Title

[MV] pca postestimation -- Postestimation tools for pca and pcamat

Description

The following postestimation commands are of special interest after pca and pcamat:

    command              description
    -------------------------------------------------------------------------
    estat anti           anti-image correlation and covariance matrices
    estat kmo            Kaiser-Meyer-Olkin measure of sampling adequacy
    estat loadings       component-loading matrix in one of several
                           normalizations
    estat residuals      matrix of correlation or covariance residuals
    estat rotatecompare  compare rotated and unrotated components
    estat smc            squared multiple correlations between each variable
                           and the rest
  + estat summarize      display summary statistics over the estimation sample
    loadingplot          plot component loadings
    rotate               rotate component loadings
    scoreplot            plot score variables
    screeplot            plot eigenvalues
    -------------------------------------------------------------------------
    + estat summarize is not available after pcamat.

The following standard postestimation commands are also available:

    command        description
    -------------------------------------------------------------------------
  + estat          examine the VCE matrix
    estimates      cataloging estimation results
  * lincom         point estimates, standard errors, testing, and inference
                     for linear combinations of coefficients
  * nlcom          point estimates, standard errors, testing, and inference
                     for nonlinear combinations of coefficients
    predict        score variables, predictions, and residuals
  * predictnl      point estimates, standard errors, testing, and inference
                     for generalized predictions
  * test           Wald tests of simple and composite linear hypotheses
  * testnl         Wald tests of nonlinear hypotheses
    -------------------------------------------------------------------------
    + estat is available after pca and pcamat with the vce(normal) option.
    * lincom, nlcom, predictnl, test, and testnl are available only after pca
        with the vce(normal) option.

Special-interest postestimation commands

estat anti displays the anti-image correlation and anti-image covariance matrices. These are, respectively, minus the partial correlation and minus the partial covariance of all pairs of variables, holding all other variables constant.

estat kmo displays the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. KMO takes values between 0 and 1, with small values indicating that, overall, the variables have too little in common to warrant a PCA. Historically, the following labels are often given to values of KMO:

    0.00 to 0.49    unacceptable
    0.50 to 0.59    miserable
    0.60 to 0.69    mediocre
    0.70 to 0.79    middling
    0.80 to 0.89    meritorious
    0.90 to 1.00    marvelous
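As an illustration of how the KMO measure might be inspected in practice, here is a minimal do-file sketch. It assumes the auto dataset shipped with Stata and the same four variables used in the Examples section of this help file; the choice of data is for illustration only.

```stata
* Sketch only: the dataset and variable list are assumptions for illustration
sysuse auto, clear
pca trunk weight length headroom

* Overall KMO plus one KMO value per variable
estat kmo

* The overall measure is saved in r(kmo); compare it with the labels above
display "overall KMO = " %5.3f r(kmo)

* Overall measure only
estat kmo, novar
```

The per-variable values are saved in r(kmow), so they can be copied to a matrix for later use.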

estat loadings displays the component-loading matrix in one of several normalizations of the columns (eigenvectors).

estat residuals displays the difference between the observed correlation or covariance matrix and the fitted (reproduced) matrix using the retained components.

estat rotatecompare displays the unrotated (principal) components next to the most recent rotated components.

estat smc displays the squared multiple correlations between each variable and all other variables. SMC is a theoretical lower bound for communality and thus an upper bound for the unexplained variance.

estat summarize displays summary statistics of the variables in the principal component analysis over the estimation sample. This subcommand is not available after pcamat.

Syntax for predict

predict [type] {stub*|newvarlist} [if] [in] [, statistic options ]

    statistic    # of vars.   description (k = # of orig. vars.; f = # of components)
    -------------------------------------------------------------------------
    Main
      score      1,...,f      scores based on the components; the default
      fit        k            fitted values using the retained components
      residual   k            raw residuals from the fit using the retained
                                components
      q          1            residual sum of squares
    -------------------------------------------------------------------------

    options         description
    -------------------------------------------------------------------------
    Main
      norotated     use unrotated results, even when rotated results are
                      available
      center        base scores on centered variables
      notable       suppress table of scoring coefficients
      format(%fmt)  format for displaying the scoring coefficients
    -------------------------------------------------------------------------

Menu

Statistics > Postestimation > Predictions, residuals, etc.

Options for predict

Note on pcamat: predict requires that variables with the correct names be available in memory. Apart from centered scores, means() should have been specified with pcamat. If you used pcamat because you have access only to the correlation or covariance matrix, you cannot use predict.

        +------+
    ----+ Main +-------------------------------------------------------------

score calculates the scores for components 1, ..., #, where # is the number of variables in newvarlist.

fit calculates the fitted values, using the retained components, for each variable. The number of variables in newvarlist should equal the number of variables in the varlist of pca.

residual calculates for each variable the raw residuals (residual = observed - fitted), with the fitted values computed using the retained components.

q calculates the Rao statistics (i.e., the sum of squares of the omitted components) weighted by the respective eigenvalues. This equals the residual sum of squares between the original variables and the fitted values.

norotated uses unrotated results, even when rotated results are available.

center bases scores on centered variables. This option is relevant only for a PCA of a covariance matrix, in which the scores are based on uncentered variables by default. Scores for a PCA of a correlation matrix are always based on the standardized variables.

notable suppresses the table of scoring coefficients.

format(%fmt) specifies the display format for scoring coefficients. The default is format(%8.4f).
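The predict statistics and options described above can be sketched as follows. This is a hedged illustration, not a recommended workflow: it assumes the auto dataset, retains two components through the components(2) option of pca so that the fit is not exact, and uses new variable names (pc1, fit_trunk, and so on) that are made up for the example.

```stata
* Sketch only: dataset, variable list, and new variable names are assumptions
sysuse auto, clear
pca trunk weight length headroom, components(2)

* Scores for components 1 and 2 (score is the default statistic)
predict pc1 pc2, score

* Fitted values and raw residuals: one new variable per original variable
predict fit_trunk fit_weight fit_length fit_headroom, fit notable
predict res_trunk res_weight res_length res_headroom, residual notable

* Residual sum of squares for each observation
predict rss, q notable

* center is relevant only for a PCA of a covariance matrix
pca trunk weight length headroom, covariance components(2)
predict cpc1 cpc2, score center
```

The notable option is used on the later calls only to avoid repeating the table of scoring coefficients.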

Syntax for estat

Display the anti-image correlation and covariance matrices

estat anti [, nocorr nocov format(%fmt) ]

Display the Kaiser-Meyer-Olkin measure of sampling adequacy

estat kmo [, novar format(%fmt) ]

Display the component-loading matrix

estat loadings [, cnorm(unit|eigen|inveigen) format(%fmt) ]

Display the differences in matrices

estat residuals [, obs fitted format(%fmt) ]

Display the unrotated and rotated components

estat rotatecompare [, format(%fmt) ]

Display the squared multiple correlations

estat smc [, format(%fmt) ]

Display the summary statistics

estat summarize [, label noheader noweights]

Menu

Statistics > Postestimation > Reports and statistics

Options for estat

nocorr, an option used with estat anti, suppresses the display of the anti-image correlation matrix, i.e., minus the partial correlation matrix of all pairs of variables, holding constant all other variables.

nocov, an option used with estat anti, suppresses the display of the anti-image covariance matrix, i.e., minus the partial covariance matrix of all pairs of variables, holding constant all other variables.

format(%fmt) specifies the display format. The defaults differ between the subcommands.

novar, an option used with estat kmo, suppresses the Kaiser-Meyer-Olkin measures of sampling adequacy for the variables in the principal component analysis, displaying the overall KMO measure only.

cnorm(unit|eigen|inveigen), an option used with estat loadings, selects the normalization of the eigenvectors, the columns of the principal-component loading matrix. The following normalizations are available,

    unit        ssq(column) = 1 (default)
    eigen       ssq(column) = eigenvalue
    inveigen    ssq(column) = 1/eigenvalue

where ssq(column) is the sum of squares of the elements in a column, and eigenvalue is the eigenvalue associated with that column (eigenvector).
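The normalizations can be checked numerically. The sketch below assumes the auto dataset and relies on e(Ev), the eigenvalue vector that pca saves according to [MV] pca; that name does not appear elsewhere in this help file and is an assumption here.

```stata
* Sketch only: dataset and variable list are assumptions for illustration
sysuse auto, clear
pca trunk weight length headroom

* Loadings normalized so that ssq(column) = eigenvalue
estat loadings, cnorm(eigen)
matrix A = r(A)

* The column sums of squares are the diagonal elements of A'A
matrix S = A'*A
matrix list S

* Eigenvalues for comparison (e(Ev) is assumed from [MV] pca)
matrix list e(Ev)
```

With cnorm(unit), the same calculation should give ones on the diagonal of A'A.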

obs, an option used with estat residuals, displays the observed correlation or covariance matrix for which the PCA was performed.

fitted, an option used with estat residuals, displays the fitted (reconstructed) correlation or covariance matrix based on the retained components.
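A short sketch of how obs and fitted might be combined with the saved results of estat residuals. It assumes the auto dataset and two retained components; the matrix names R, F, and O are made up for the example.

```stata
* Sketch only: dataset, variable list, and matrix names are assumptions
sysuse auto, clear
pca trunk weight length headroom, components(2)

* Show the observed, fitted, and residual matrices
estat residuals, obs fitted

* Copy the saved results before another command overwrites r()
matrix R = r(residual)
matrix F = r(fit)

* Fitted plus residual should reproduce the observed matrix
matrix O = F + R
matrix list O, format(%8.4f)
```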

label, noheader, and noweights are the same as for the generic estat summarize command; see [R] estat.

Examples

    Setup
        . sysuse auto
        . pca trunk weight length headroom

    Statistics
        . estat residuals, fitted
        . estat loadings, cnorm(eigen)

    Scree plot
        . screeplot,
        . screeplot, ci(normal)

    Plots of component loadings and scores
        . loadingplot, component(3)
        . scoreplot, component(3) mlabel(make)

    Rotation of loadings
        . rotate
        . rotate, varimax
        . rotate, oblimin(0.5) oblique

    Individual scores for the components are obtained via predict
        . predict f1
        . predict f1 f2

    Residual sum of squares
        . predict t, q

Saved results

Let p be the number of variables and f the number of components.

predict, in addition to generating variables, also saves the following in r():

    Matrices
      r(scoef)       p x f matrix of scoring coefficients

estat anti saves the following in r():

    Matrices
      r(acov)        p x p anti-image covariance matrix
      r(acorr)       p x p anti-image correlation matrix

estat kmo saves the following in r():

    Scalars
      r(kmo)         the Kaiser-Meyer-Olkin measure of sampling adequacy

    Matrices
      r(kmow)        column vector of KMO measures for each variable

estat loadings saves the following in r():

    Macros
      r(cnorm)       component normalization: eigen, inveigen, or unit

    Matrices
      r(A)           p x f matrix of normalized component loadings

estat residuals saves the following in r():

    Matrices
      r(fit)         p x p matrix of fitted values
      r(residual)    p x p matrix of residuals

estat smc saves the following in r():

    Matrices
      r(smc)         vector of squared multiple correlations of variables
                       with all other variables

See [R] estat for the returned results of estat summarize and estat vce (available when vce(normal) is specified with pca or pcamat).

rotate after pca and pcamat adds the following to the existing e():

    Scalars
      e(r_f)              number of components in rotated solution
      e(r_fmin)           rotation criterion value

    Macros
      e(r_class)          orthogonal or oblique
      e(r_criterion)      rotation criterion
      e(r_ctitle)         title for rotation
      e(r_normalization)  kaiser or none

    Matrices
      e(r_L)              rotated loadings
      e(r_T)              rotation
      e(r_Ev)             explained variance by rotated components

The components in the rotated solution are in decreasing order of e(r_Ev).
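The results that rotate adds to e() can be inspected directly. The following is a minimal sketch that assumes the auto dataset and two retained components; it only reads the e() names listed above.

```stata
* Sketch only: dataset and variable list are assumptions for illustration
sysuse auto, clear
pca trunk weight length headroom, components(2)
rotate, varimax

* Class and criterion of the rotation just performed
display "`e(r_class)' rotation, criterion: `e(r_criterion)'"

* Rotated loadings and the variance explained by each rotated component
matrix list e(r_L)
matrix list e(r_Ev)

* Unrotated and rotated components side by side
estat rotatecompare
```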

Also see

Manual: [MV] pca postestimation

Help: [MV] pca; [MV] rotate, [MV] scoreplot, [MV] screeplot


© Copyright 1996–2009 StataCorp LP