
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: principal component analysis-creating linear combinations


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: principal component analysis-creating linear combinations
Date   Thu, 10 Mar 2011 16:12:19 +0000

What was wrong? In your own calculation, you were using unstandardised variables, but you needed to standardise them. However, as said, none of it is necessary as -predict- does all the work for you. 
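The standardisation point can be checked directly. The following is only a sketch, assuming that -pca x1 x2 x3 x4- (on the correlation matrix, the default) is still the most recent estimation command and that pc1 was created by -predict pc1- as in the thread. The z_* variable names are made up for illustration, and the coefficients are the rounded loadings quoted in the thread.

```stata
* Assumption: -pca x1 x2 x3 x4- was just run and pc1 exists from -predict pc1-.
* Standardise each variable over the estimation sample (z_* names are hypothetical).
foreach v of varlist x1 x2 x3 x4 {
    egen double z_`v' = std(`v') if e(sample)
}

* Apply the loadings quoted in the thread to the standardised variables.
gen double Y1 = 0.3894*z_x1 + 0.4517*z_x2 + 0.5733*z_x3 + 0.5619*z_x4
gen double Y2 = 0.8726*z_x1 + 0.0966*z_x2 - 0.3179*z_x3 - 0.3580*z_x4

* Compare the home-made first component with the one from -predict-.
pwcorr pc1 Y1
```

Because the loadings are rounded to four decimal places, any remaining difference between Y1 and pc1 should be down to rounding only.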

Nick 
[email protected] 

Nick Cox 

This is largely superseded by answers already sent. But yes, something was wrong with your home-made attempt to create PC1, as the correlation with the PC1 from -predict- is not even 1. 

James Wu

Thank you, but "-predict-" generates only the first component scores.

(1) By the way, would it be wrong to construct the linear combinations
as I described earlier?
such as, Y1=0.3894*x1+0.4517*x2+0.5733*x3+0.5619*x4.

Here is the comparison:

. predict pc1
(output omitted)

. sum pc1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         pc1 |       659    2.97e-09    1.558505   -3.00555   6.801751


. gen Y1=0.3894*x1+0.4517*x2+0.5733*x3+0.5619*x4

. pwcorr  pc1 Y1

             |      pc1       Y1
-------------+------------------
         pc1 |   1.0000
          Y1 |   0.9724   1.0000


(2) As one can see from the original PCA, the second component has
positive signs on x1 and x2.
So I want to create the second component scores.

How can I obtain it (if I do not create it by
Y2=0.8726*x1+0.0966*x2-0.3179*x3-0.3580*x4)?

On Thu, Mar 10, 2011 at 9:46 AM, Maarten buis <[email protected]> wrote:

> --- On Thu, 10/3/11, James Wu wrote:
>> Suppose we ran pca on four variables, x1, x2, x3, x4 as
>> follows:
>> Now, suppose that you decide to retain the first two
>> principal components, and then you want to create two
>> variables that are linear combinations of the original
>> four variables.
>
> Then you need to use -predict-, see help -pca postestimation-.
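As a minimal sketch of that advice, assuming the four variables are named x1 to x4 as in the question: -predict- after -pca- accepts several new variable names at once and fills them with the scores of the leading components in order. The names pc1 and pc2 are arbitrary.

```stata
* Assumption: x1-x4 are in memory; pc1 and pc2 are arbitrary new names.
pca x1 x2 x3 x4, components(2)

* One new variable per retained component, in order.
predict pc1 pc2, score

summarize pc1 pc2
```

Listing two names is what yields the second component; a single name gives only the first.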
