



Re: st: wealth score using principal component analysis (PCA)


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: wealth score using principal component analysis (PCA)
Date   Thu, 27 Sep 2012 09:26:14 +0100

I can't give an answer to this question that is likely to satisfy you.
PCA and SEM are very different methods. PCA is in my view primarily a
multivariate transformation technique. SEM is, more obviously, a
family of modelling techniques. Even in this thread the use of PCA
appears to be part of a wider model-based strategy and that is likely
to be typical of most projects in which it appears. I don't think "use
PCA" is ever likely to be the core of the answer to "what should I
do?" but "use SEM" might be, sometimes.

Stas [sic] can speak for himself, but I suspect his position would be
close to mine on this.

Nick

On Thu, Sep 27, 2012 at 8:06 AM, 汪哲仁 <chejen.wang@gmail.com> wrote:

> Dear Nick and Stat,
>
> May I ask a question? In which circumstances is PCA a better
> choice than SEM?

2012/9/27 Nick Cox <njcoxstata@gmail.com> wrote:

>> You are confusing two different questions. Throughout I focus on the
>> case you are looking at where PCA is based on the correlation matrix.
>>
>> If the aim is to use the most important PC, then that is labelled 1,
>> but even if it weren't we could identify it by its having the largest
>> eigenvalue attached and no extra considerations arise.
>>
>> If the aim is to identify which PCs are "important" or "worthy of use"
>> (typically one or more) and should be used in later analyses, then
>> this is necessarily a looser, more open question and the best art is a
>> darker matter. There can't be an answer independent of what you are
>> trying to do. Some people do stress a rule of thumb such as
>> eigenvalues > 1 and some people look for a break in the eigenvalues
>> using a scree plot. In some projects PCs that are used later are good
>> if interpretable as having high correlations with particular
>> variables; in other projects the PCs are just composite variables with
>> the properties assigned to them and interpretability is less material.
>>
>> Every book I know on PCA stresses this open aspect of the method. The
>> books by Jolliffe and Jackson referenced in the -pca- documentation
>> certainly do.
>>
>> It's not clear exactly why you feel committed in advance to using PCA
>> like this. I sympathise with the advice given earlier by Stas
>> Kolenikov to consider something more like an SEM.
>>
>> Nick
>>
>> On Wed, Sep 26, 2012 at 9:33 PM, Shikha Sinha <shikha.sinha414@gmail.com> wrote:
>> > OK, I understand now that if I want to use one score, then PC1 is the
>> > most relevant one, and that to distinguish further between financial
>> > and social autonomy we need to look at the factor loadings on PC2,
>> > PC3, and so on, to figure out whether PC2 is better than PC1 when the
>> > focus is on social or financial autonomy.
>> >
>> > But I am struggling to understand the point of selecting components
>> > based on eigenvalues. What is the use of selecting PCs based on either
>> > eigenvalues or a scree plot if we are always (or most of the time)
>> > going to use the first component? An example of the importance of
>> > eigenvalues in selecting components (or any reference) would be very
>> > helpful.
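
As a rough illustration of the eigenvalue question raised above, here is a minimal sketch in plain Python. It is not anything posted in the thread. The correlation matrix is entirely hypothetical: four made-up indicators, the first two imagined as "financial" and the last two as "social", correlated 0.8 within each pair and 0.2 across pairs. The helper name jacobi_eigenvalues is likewise just an illustrative choice. The sketch computes the eigenvalues of the correlation matrix, orders them so that the first belongs to PC1, and applies the eigenvalues > 1 rule of thumb mentioned in the thread.

```python
import math

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    p = len(a)
    for _ in range(sweeps):
        # Stop once the off-diagonal part is numerically zero.
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if a[i][j] == 0.0:
                    continue
                # Rotation angle that zeroes a[i][j].
                theta = 0.5 * math.atan2(2 * a[i][j], a[i][i] - a[j][j])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # rotate columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i] = c * aki + s * akj
                    a[k][j] = -s * aki + c * akj
                for k in range(p):  # rotate rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k] = c * aik + s * ajk
                    a[j][k] = -s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)

# Hypothetical correlation matrix (illustrative values, not real data).
R = [
    [1.0, 0.8, 0.2, 0.2],
    [0.8, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.8],
    [0.2, 0.2, 0.8, 1.0],
]

eigenvalues = jacobi_eigenvalues(R)
total = sum(eigenvalues)  # equals the number of variables for a correlation matrix

# Rule of thumb from the thread: keep components with eigenvalue > 1.
retained = [k + 1 for k, ev in enumerate(eigenvalues) if ev > 1.0]

for k, ev in enumerate(eigenvalues, start=1):
    print(f"PC{k}: eigenvalue {ev:.3f}, share of variance {ev / total:.1%}")
print("Components with eigenvalue > 1:", retained)
```

For this made-up matrix the eigenvalues work out to 2.2, 1.4, 0.2 and 0.2. PC1 is identified simply as the component with the largest eigenvalue. The eigenvalues > 1 rule would also flag PC2, which here contrasts the two pairs of indicators. Whether that second component is worth carrying into later analyses still depends on what you are trying to do. In Stata itself, -pca- followed by -screeplot- shows the same eigenvalues.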
