Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: RE: principal component analysis-creating linear combinations


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: RE: principal component analysis-creating linear combinations
Date   Thu, 10 Mar 2011 15:12:44 +0000

Not so: There is an explicit example for exactly your need in the help. 

Individual scores for the components are obtained via predict
        . predict f1
        . predict f1 f2

That is, for 2, 3, ... components, specify as many names as you need. 
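
As a minimal sketch of the above, the following do-file lines run -pca-, ask -predict- for the first two component scores, and then rebuild the first score by hand as a check. The auto dataset and the names pc1, pc2, z_*, y1 and diff1 are assumptions made for illustration only; they stand in for x1-x4 and are not from the original question.

```stata
* Sketch only: Stata's bundled auto data stand in for x1-x4
sysuse auto, clear
pca price mpg weight length, components(3)

* One new variable name per component score wanted
predict pc1 pc2, score

* Optional check: rebuild the first score from standardised variables,
* using the stored eigenvectors in e(L) rather than retyped coefficients
matrix L = e(L)
foreach v of varlist price mpg weight length {
    egen double z_`v' = std(`v')
}
generate double y1 = L[1,1]*z_price + L[2,1]*z_mpg + L[3,1]*z_weight + L[4,1]*z_length

* y1 and pc1 should agree up to rounding
generate double diff1 = y1 - pc1
summarize diff1
```

Reading the coefficients from e(L) keeps full precision, which typing in the displayed four-decimal values would not.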

I am looking at Stata 11 documentation; if you are using an earlier version, you should state that as requested in the Statalist FAQ. 

Nick 
[email protected] 

James Wu [mailto:[email protected]] 

Nick, thank you very much.

But how can I obtain the second component scores (which would
correspond to what I earlier called Y2) by using predict?
I read the manual on pca postestimation, but I found no indication of
how to do this (it shows only the first component scores).

On Thu, Mar 10, 2011 at 9:56 AM, Nick Cox <[email protected]> wrote:

> The easiest and best way to create the principal components themselves is to use -predict- after -pca-. There is no need for you to do the calculation by typing out coefficients in a linear equation. That is at best problematic in terms of keeping precision.
>
> The default of -pca- is to use the correlation matrix; that is entirely equivalent to using standardised variables, so that there is absolutely no need to standardise yourself, except possibly as an exercise.
>
> I wouldn't call the eigenvectors the PCs myself, although there are varying habits on this.

James Wu

> Suppose we ran pca on four variables, x1, x2, x3, x4 as follows:

> . pca  x1 x2 x3 x4, components (3)
>
> Principal components/correlation                  Number of obs    =       659
>                                                   Number of comp.  =         3
>                                                   Trace            =         4
>     Rotation: (unrotated = principal)             Rho              =    0.9550
>
>     --------------------------------------------------------------------------
>        Component |   Eigenvalue   Difference         Proportion   Cumulative
>     -------------+------------------------------------------------------------
>            Comp1 |      2.42894      1.67142             0.6072       0.6072
>            Comp2 |      .757515      .124084             0.1894       0.7966
>            Comp3 |      .633431      .453314             0.1584       0.9550
>            Comp4 |      .180117            .             0.0450       1.0000
>     --------------------------------------------------------------------------
>
> Principal components (eigenvectors)
>
>     ----------------------------------------------------------
>         Variable |    Comp1     Comp2     Comp3 | Unexplained
>     -------------+------------------------------+-------------
>               x1 |   0.3894    0.8726   -0.2945 |   .00004265
>               x2 |   0.4517    0.0966    0.8858 |    .0003491
>               x3 |   0.5733   -0.3179   -0.2218 |      .09384
>               x4 |   0.5619   -0.3580   -0.2817 |      .08588
>     ----------------------------------------------------------
>
>
> Now, suppose that you decide to retain the first two principal
> components, and then you want to create two variables that are linear
> combinations of the original four variables.
>
> Question 1: Would it be enough simply to multiply the principal
> components (eigenvectors, columns) by the original variables, say,
> Y1=0.3894*x1+0.4517*x2+0.5733*x3+0.5619*x4 and
> Y2=0.8726*x1+0.0966*x2-0.3179*x3-0.3580*x4?
>
> Question 2: Assuming that I am correct in creating new variables by
> simply multiplying the principal components (eigenvectors) by the
> original variables (Question 1),
> if these four original variables are in different units of
> measurement, then should we standardize the original four variables
> (so that each standardized variable has mean 0 and standard deviation
> 1) before computing the products as in my Question 1?
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
