Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: principal component analysis-creating linear combinations
From: James Wu <[email protected]>
To: [email protected]
Subject: Re: st: RE: principal component analysis-creating linear combinations
Date: Thu, 10 Mar 2011 10:26:16 -0500

Nick and Maarten,
Now it works. Thanks a lot for the tips.
James
On Thu, Mar 10, 2011 at 10:12 AM, Nick Cox <[email protected]> wrote:
> Not so: There is an explicit example for exactly your need in the help.
>
> Individual scores for the components are obtained via predict
> . predict f1
> . predict f1 f2
>
> That is, for 2, 3, ... components, specify as many names as you need.
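
The -predict- usage described above can be sketched on a dataset shipped with Stata. This is only an illustration: the auto data and the names pc1 and pc2 are assumptions made for the example, not the poster's x1-x4.

```stata
* Example data shipped with Stata (an assumption; not the poster's data)
sysuse auto, clear

* PCA on the correlation matrix (the default), retaining two components
pca price mpg weight length, components(2)

* One new variable name per component:
* pc1 receives the scores for component 1, pc2 those for component 2
predict pc1 pc2, score

* Inspect the resulting score variables
summarize pc1 pc2
correlate pc1 pc2
```

Supplying a third name would request a third score, provided that many components were retained by -pca-.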
>
> I am looking at Stata 11 documentation; if you are using an earlier version, you should state that as requested in the Statalist FAQ.
>
> Nick
> [email protected]
>
> James Wu [mailto:[email protected]]
>
> Nick, thank you very much.
>
> But how can I obtain the second component scores (that would
> correspond to Y2 that I called earlier) by using predict?
> I read the manual on pca postestimation, but there is no indication on
> it (only the first component scores).
>
> On Thu, Mar 10, 2011 at 9:56 AM, Nick Cox <[email protected]> wrote:
>
>> The easiest and best way to create the principal components themselves is to use -predict- after -pca-. There is no need for you to do the calculation by typing out coefficients in a linear equation. Doing so is problematic even at best, because the displayed coefficients are rounded and precision is lost.
>>
>> The default of -pca- is to use the correlation matrix; that is entirely equivalent to using standardised variables, so that there is absolutely no need to standardise yourself, except possibly as an exercise.
>>
>> I wouldn't call the eigenvectors the PCs myself, although there are varying habits on this.
>
> James Wu
>
>> Suppose we ran pca on four variables, x1, x2, x3, x4 as follows:
>
>> . pca x1 x2 x3 x4, components (3)
>>
>> Principal components/correlation          Number of obs    =      659
>>                                           Number of comp.  =        3
>>                                           Trace            =        4
>>     Rotation: (unrotated = principal)     Rho              =   0.9550
>>
>>     --------------------------------------------------------------------------
>>        Component |   Eigenvalue   Difference         Proportion   Cumulative
>>     -------------+------------------------------------------------------------
>>            Comp1 |      2.42894      1.67142             0.6072       0.6072
>>            Comp2 |      .757515      .124084             0.1894       0.7966
>>            Comp3 |      .633431      .453314             0.1584       0.9550
>>            Comp4 |      .180117            .             0.0450       1.0000
>>     --------------------------------------------------------------------------
>>
>> Principal components (eigenvectors)
>>
>>     ----------------------------------------------------------
>>         Variable |    Comp1     Comp2     Comp3 | Unexplained
>>     -------------+------------------------------+-------------
>>               x1 |   0.3894    0.8726   -0.2945 |   .00004265
>>               x2 |   0.4517    0.0966    0.8858 |    .0003491
>>               x3 |   0.5733   -0.3179   -0.2218 |      .09384
>>               x4 |   0.5619   -0.3580   -0.2817 |      .08588
>>     ----------------------------------------------------------
>>
>>
>> Now, suppose that you decide to retain the first two principal
>> components, and then you want to create two variables that are linear
>> combinations of the original four variables.
>>
>> Question 1: Would it be correct simply to multiply the principal
>> components (eigenvectors, columns) by the original variables, say,
>> Y1=0.3894*x1+0.4517*x2+0.5733*x3+0.5619*x4 and
>> Y2=0.8726*x1+0.0966*x2-0.3179*x3-0.3580*x4?
>>
>> Question 2: Assuming that I am correct in creating new variables by
>> simply multiplying the principal components (eigenvectors) by the
>> original variables (Question 1),
>> if these four original variables are in different units of
>> measurement, then should we standardize the original four variables
>> (so that each standardized variable has mean 0 and standard deviation
>> 1) before computing the products as in Question 1?
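
One way to check Questions 1 and 2 is to build the linear combinations by hand from standardised variables and compare them with what -predict- returns. The following is a minimal sketch under stated assumptions: it uses the auto data shipped with Stata rather than the poster's x1-x4, the names z_*, y1, y2, d1 and d2 are hypothetical, and it assumes that the scores are based on variables standardised with the sample standard deviation, as -egen, std()- does. The eigenvectors are taken from e(L) at full precision rather than retyped from the four-decimal display.

```stata
* Example data shipped with Stata (an assumption; not the poster's x1-x4)
sysuse auto, clear

* PCA on the correlation matrix (the default), keeping two components
pca price mpg weight length, components(2)
predict double pc1 pc2, score

* e(L) holds the eigenvectors: one row per variable, one column per component
matrix L = e(L)

* Standardise each variable to mean 0 and standard deviation 1
foreach v of varlist price mpg weight length {
    egen double z_`v' = std(`v')
}

* Hand-built linear combinations of the standardised variables
generate double y1 = L[1,1]*z_price + L[2,1]*z_mpg + L[3,1]*z_weight + L[4,1]*z_length
generate double y2 = L[1,2]*z_price + L[2,2]*z_mpg + L[3,2]*z_weight + L[4,2]*z_length

* Differences between the -predict- scores and the hand-built versions
generate double d1 = pc1 - y1
generate double d2 = pc2 - y2
summarize d1 d2
```

If the assumption about standardisation holds, d1 and d2 should be zero up to rounding error. Applying the same coefficients to the unstandardised variables would give a different result, which is the point of Question 2.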
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>>
>