Statalist



RE: st: RE: PCA vs. Factor Loadings


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: PCA vs. Factor Loadings
Date   Wed, 16 Dec 2009 13:27:27 -0000

If you're using Stata 10, please do make that clear in future postings.

I often calculate correlations directly between PCs (in my case always
vanilla flavour) and original variables. That way I know exactly what I
am getting without having to decode eigenspeak. -correlate- will do
that, naturally, but I sometimes use -cpcorr- from SSC, which shows me
just the submatrix I care about. Presumably that could also be used for
looking at results of other modes of PCA or factor analysis. 
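
For unrotated PCA of a correlation matrix, those correlations are tied to the eigen-output by a simple relation. The following is a sketch in notation that is not part of the original post: $v_{ij}$ is assumed to be the unit-length eigenvector entry for variable $i$ on component $j$, $\lambda_j$ the corresponding eigenvalue, $R$ the correlation matrix, and $z_i$ the standardised variable.

```latex
\begin{align*}
\operatorname{Cov}(z_i, \mathrm{PC}_j) &= (R\,v_j)_i = \lambda_j v_{ij}, \qquad
\operatorname{Var}(\mathrm{PC}_j) = \lambda_j, \qquad
\operatorname{Var}(z_i) = 1, \\
r(x_i, \mathrm{PC}_j) &= \frac{\lambda_j v_{ij}}{\sqrt{\lambda_j}} = v_{ij}\sqrt{\lambda_j}.
\end{align*}
```

As a check against the auto-data output in this message, trunk on the first component gives 0.4418 x sqrt(3.35594), which is about 0.809 and agrees with the 0.8094 reported by -cpcorr-. This should only be relied on for the unrotated, correlation-matrix case.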

. pca trunk displacement weight length

Principal components/correlation                  Number of obs    =        74
                                                  Number of comp.  =         4
                                                  Trace            =         4
    Rotation: (unrotated = principal)             Rho              =    1.0000

    --------------------------------------------------------------------------
       Component |   Eigenvalue   Difference         Proportion   Cumulative
    -------------+------------------------------------------------------------
           Comp1 |      3.35594      2.91098             0.8390       0.8390
           Comp2 |      .444963      .287221             0.1112       0.9502
           Comp3 |      .157742       .11639             0.0394       0.9897
           Comp4 |     .0413522            .             0.0103       1.0000
    --------------------------------------------------------------------------

Principal components (eigenvectors) 

    --------------------------------------------------------------------
        Variable |    Comp1     Comp2     Comp3     Comp4 | Unexplained 
    -------------+----------------------------------------+-------------
           trunk |   0.4418    0.8721    0.1981   -0.0700 |           0 
    displacement |   0.5007   -0.4020    0.7328    0.2252 |           0 
          weight |   0.5272   -0.2661   -0.2728   -0.7595 |           0 
          length |   0.5255   -0.0833   -0.5910    0.6063 |           0 
    --------------------------------------------------------------------

. predict pc1-pc4
(score assumed)

Scoring coefficients 
    sum of squares(column-loading) = 1

    ------------------------------------------------------
        Variable |    Comp1     Comp2     Comp3     Comp4 
    -------------+----------------------------------------
           trunk |   0.4418    0.8721    0.1981   -0.0700 
    displacement |   0.5007   -0.4020    0.7328    0.2252 
          weight |   0.5272   -0.2661   -0.2728   -0.7595 
          length |   0.5255   -0.0833   -0.5910    0.6063 
    ------------------------------------------------------

. cpcorr trunk displacement weight length \ pc1-pc4
(obs=74)

                  pc1      pc2      pc3      pc4
       trunk   0.8094   0.5818   0.0787  -0.0142
displacement   0.9172  -0.2682   0.2910   0.0458
      weight   0.9659  -0.1775  -0.1084  -0.1544
      length   0.9626  -0.0556  -0.2347   0.1233
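
The same submatrix can probably be rebuilt from the stored results, without -cpcorr-. What follows is a minimal sketch, assuming that -pca- leaves unit-length eigenvectors in e(L) and eigenvalues in e(Ev), as documented for recent Stata releases (not checked against Stata 10). The matrix names L, Ev, S and C are made up for illustration, and the sketch covers only the unrotated solution.

```stata
sysuse auto, clear
pca trunk displacement weight length

* e(L): eigenvectors, one column per component; e(Ev): eigenvalues (row vector)
matrix L  = e(L)
matrix Ev = e(Ev)

* S holds the square root of each retained component's eigenvalue
local k = colsof(L)
matrix S = J(1, `k', 0)
forvalues j = 1/`k' {
    matrix S[1, `j'] = sqrt(Ev[1, `j'])
}

* scale each eigenvector column by the square root of its eigenvalue
matrix C = L * diag(S)
matrix list C, format(%8.4f)

* cross-check: correlations between the original variables and the scores
predict pc1-pc4
correlate trunk displacement weight length pc1-pc4
```

If the assumptions hold, the entries of C should match the -cpcorr- table above to rounding, and the same numbers should appear in the lower-left block of the -correlate- output.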

Nick 
n.j.cox@durham.ac.uk 

Michael I. Lichter

Thanks, Nick! This would give me exactly what I was looking for ...
Except that -estat loadings- ignores rotations, as far as I can tell, in
Stata 10. I can do the normalization "by hand" in Excel, though, so
that's better than nothing.

Michael

Nick Cox wrote:
> I think the short answer is that you are not comparing like with like.
>
> Loadings can be presented in various ways. See for example the help for
> -pca postestimation- and then experiment with the different
> normalisations on offer for -estat loadings-.
>
> The default presentation of PCA loadings is not what you want, but a
> different normalisation shows that PCA and factor analysis coincide in
> the limit: 
>
> . pca headroom trunk weight length displacement
>
> Principal components/correlation                  Number of obs    =        74
>                                                   Number of comp.  =         5
>                                                   Trace            =         5
>     Rotation: (unrotated = principal)             Rho              =    1.0000
>
>     --------------------------------------------------------------------------
>        Component |   Eigenvalue   Difference         Proportion   Cumulative
>     -------------+------------------------------------------------------------
>            Comp1 |      3.76201        3.026             0.7524       0.7524
>            Comp2 |      .736006      .427915             0.1472       0.8996
>            Comp3 |      .308091      .155465             0.0616       0.9612
>            Comp4 |      .152627      .111357             0.0305       0.9917
>            Comp5 |     .0412693            .             0.0083       1.0000
>     --------------------------------------------------------------------------
>
> Principal components (eigenvectors) 
>
>     ------------------------------------------------------------------------------
>         Variable |    Comp1     Comp2     Comp3     Comp4     Comp5 | Unexplained
>     -------------+--------------------------------------------------+-------------
>         headroom |   0.3587    0.7640    0.5224   -0.1209    0.0130 |           0
>            trunk |   0.4334    0.3665   -0.7676    0.2914    0.0612 |           0
>           weight |   0.4842   -0.3329    0.0737   -0.2669    0.7603 |           0
>           length |   0.4863   -0.2372   -0.1050   -0.5745   -0.6051 |           0
>     displacement |   0.4610   -0.3390    0.3484    0.7065   -0.2279 |           0
>     ------------------------------------------------------------------------------
>
> . estat loadings, cnorm(eigen)
>
> Principal component loadings (unrotated)
>     component normalization: sum of squares(column) = eigenvalue
>
>     ----------------------------------------------------------------
>                  |    Comp1     Comp2     Comp3     Comp4     Comp5 
>     -------------+--------------------------------------------------
>         headroom |    .6958     .6554       .29   -.04724   .002635 
>            trunk |    .8405     .3144    -.4261     .1138    .01243 
>           weight |    .9392    -.2856    .04092    -.1043     .1545 
>           length |    .9432    -.2035   -.05829    -.2245    -.1229 
>     displacement |    .8942    -.2909     .1934      .276   -.04629 
>     ----------------------------------------------------------------
>
> . factor   headroom trunk weight length displacement, pcf
> (obs=74)
>
> Factor analysis/correlation                        Number of obs    =       74
>     Method: principal-component factors            Retained factors =        1
>     Rotation: (unrotated)                          Number of params =        5
>
>     --------------------------------------------------------------------------
>          Factor  |   Eigenvalue   Difference        Proportion   Cumulative
>     -------------+------------------------------------------------------------
>         Factor1  |      3.76201      3.02600            0.7524       0.7524
>         Factor2  |      0.73601      0.42791            0.1472       0.8996
>         Factor3  |      0.30809      0.15546            0.0616       0.9612
>         Factor4  |      0.15263      0.11136            0.0305       0.9917
>         Factor5  |      0.04127            .            0.0083       1.0000
>     --------------------------------------------------------------------------
>     LR test: independent vs. saturated:  chi2(10) =  373.68 Prob>chi2 = 0.0000
>
> Factor loadings (pattern matrix) and unique variances
>
>     ---------------------------------------
>         Variable |  Factor1 |   Uniqueness 
>     -------------+----------+--------------
>         headroom |   0.6958 |      0.5159  
>            trunk |   0.8405 |      0.2935  
>           weight |   0.9392 |      0.1180  
>           length |   0.9432 |      0.1103  
>     displacement |   0.8942 |      0.2003  
>     ---------------------------------------
>
> The broader question resembles the question of how big a correlation
> should be before one should pay attention. I doubt there's an answer
> independent of discipline and problem.
>
> Nick 
> n.j.cox@durham.ac.uk 
>
> Michael I. Lichter
>
> Why are component loadings from -pca- so much smaller than factor
> loadings from -factor-? Is there something about the procedure used by
> Stata that makes them systematically smaller? I get the sense (which may
> be mistaken; I don't have any evidence in my hand) that in other
> packages -pca- and -factor- loadings are more similar.
>
> For example, in the example below the variable -trunk- has a component
> loading of 0.5068 and a factor loading of .8807, which is a fairly large
> difference. Aside from the difference in the loading sizes, the
> solutions look comparable.
>
> My question is prompted by a more fundamental question, which is how
> large should a loading be before it is considered significant (in the
> sense of "worthy of notice")? Texts that give advice on interpretation
> seem to assume that -pca- and -factor- results are on the same scale,
> and I am a bit flustered about what to do with the low-ish loadings I'm
> getting from -pca-.
>
> Example:
>
> . sysuse auto
> . pca trunk weight length headroom, mineigen(1)
>
> Principal components/correlation                  Number of obs    =        74
>                                                   Number of comp.  =         1
>                                                   Trace            =         4
>     Rotation: (unrotated = principal)             Rho              =    0.7551
>
>     --------------------------------------------------------------------------
>        Component |   Eigenvalue   Difference         Proportion   Cumulative
>     -------------+------------------------------------------------------------
>            Comp1 |      3.02027      2.36822             0.7551       0.7551
>            Comp2 |      .652053       .37494             0.1630       0.9181
>            Comp3 |      .277113      .226551             0.0693       0.9874
>            Comp4 |     .0505616            .             0.0126       1.0000
>     --------------------------------------------------------------------------
>
> Principal components (eigenvectors)
>
>     --------------------------------------
>         Variable |    Comp1 | Unexplained
>     -------------+----------+-------------
>            trunk |   0.5068 |       .2243
>           weight |   0.5221 |       .1768
>           length |   0.5361 |       .1319
>         headroom |   0.4280 |       .4467
>     --------------------------------------
> . factor trunk weight length headroom, pcf
> (obs=74)
>
> Factor analysis/correlation                        Number of obs    =       74
>     Method: principal-component factors            Retained factors =        1
>     Rotation: (unrotated)                          Number of params =        4
>
>     --------------------------------------------------------------------------
>          Factor  |   Eigenvalue   Difference        Proportion   Cumulative
>     -------------+------------------------------------------------------------
>         Factor1  |      3.02027      2.36822            0.7551       0.7551
>         Factor2  |      0.65205      0.37494            0.1630       0.9181
>         Factor3  |      0.27711      0.22655            0.0693       0.9874
>         Factor4  |      0.05056            .            0.0126       1.0000
>     --------------------------------------------------------------------------
>     LR test: independent vs. saturated:  chi2(6)  =  257.89 Prob>chi2 = 0.0000
>
> Factor loadings (pattern matrix) and unique variances
>
>     ---------------------------------------
>         Variable |  Factor1 |   Uniqueness
>     -------------+----------+--------------
>            trunk |   0.8807 |      0.2243 
>           weight |   0.9073 |      0.1768 
>           length |   0.9317 |      0.1319 
>         headroom |   0.7438 |      0.4467 
>     ---------------------------------------
