The Stata listserver

st: RE: RE: -factor pcf- vs -pca- (was factor score postestimation)

From   "Garrard, Wendy M." <wendy.garrard@Vanderbilt.Edu>
Subject   st: RE: RE: -factor pcf- vs -pca- (was factor score postestimation)
Date   Sun, 11 Sep 2005 11:26:08 -0500

Thanks.  I am aware of the more basic differences between --factor-- and
--pca-- , but am still confused by what Stata is doing with the
"principal-components factors" option in the --factor-- command. I get
different results (loadings) for --pca-- and --factor, pcf-- even when I
restrict the number of components/factors to be the same for each.

I am most familiar with a stat package (SPSS) that has PCA as a special
case of FA, as you mention was so in earlier versions of Stata.
Therefore, I am especially confused by Stata having something called
"principal components" available as a separate --pca-- and also as a
special case of --factor--.  I naively expected both "principal
components" procedures to return roughly similar results, but now I see
that they can be very different. 

Thanks for the reference. I will do a bit of homework, although I am not
sure that my confusion due to the "pcf" and "pca" terms will be resolved
so easily.


-----Original Message-----
On Behalf Of Nick Cox
Sent: Sunday, September 11, 2005 11:06 AM
Subject: st: RE: -factor pcf- vs -pca- (was factor score postestimation)

You are asking me to describe a minefield. 

Many people regard PCA as a transformation procedure, as no error term
and thus no model is involved. Given the choice of either correlation or
covariance matrix, results are eigenvectors, eigenvalues and other
properties of that matrix, with (in a sense) no statistical arguments
being used at all. 
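
One concrete reason loadings can differ even when the number of components is the same is scaling. My understanding, which should be checked against the manual entries for -pca- and -factor-, is that -pca- reports eigenvectors normalized to unit length, while -factor, pcf- reports those eigenvectors multiplied by the square root of the corresponding eigenvalue. The sketch below is not Stata output; it is a small Python illustration of the two conventions for a two-variable correlation matrix, using the closed-form eigen decomposition. The correlation value and all names are hypothetical.

```python
import math


def first_component(r):
    """Largest eigenvalue and unit-length eigenvector of the
    correlation matrix [[1, r], [r, 1]], assuming 0 < r < 1."""
    return 1.0 + r, (1 / math.sqrt(2), 1 / math.sqrt(2))


r = 0.6  # hypothetical correlation, chosen only for illustration
eigenvalue, eigenvector = first_component(r)

# Convention 1: report the eigenvector as is (sum of squares = 1).
unit_length = eigenvector

# Convention 2: rescale by sqrt(eigenvalue) (sum of squares = eigenvalue).
rescaled = tuple(v * math.sqrt(eigenvalue) for v in eigenvector)

sum_sq_unit = sum(v * v for v in unit_length)
sum_sq_rescaled = sum(v * v for v in rescaled)

print("unit-length coefficients:", [round(v, 4) for v in unit_length])
print("rescaled loadings:       ", [round(v, 4) for v in rescaled])
print("sums of squares:", round(sum_sq_unit, 4), round(sum_sq_rescaled, 4))
```

Both conventions describe the same component; only the length of the reported vector changes. Under the second convention each entry is the correlation between a variable and the component, which is why such loadings look larger than unit-length coefficients whenever the eigenvalue exceeds 1.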

Conversely, FA is most usually regarded as a modelling technique. Its
invocation of latent variables is regarded as its worst and its best
feature, depending on tribal attitudes.

In many fields, one is regarded as wonderful or at least useful, and the
other is regarded as misguided if not pernicious. 

But there is a large literature on this. Standard texts include those by
Jolliffe and Jackson. 
In my opinion, any text that does _not_ explain that the choice between
PCA and FA is controversial is likely to be too elementary to be worth
your time. 

Originally in Stata, meaning from version 2.1, PCA was obtainable only
through -factor-, as a special case. The bifurcation of -factor- into -factor-
and -pca- in version 8 was partly based on a recognition that many
people want principal components without any of the latent modelling

When I use PCA, it is often to help choose predictors for a
regression, but the PCA is just a means to an end and is not necessarily
mentioned in the full report. Pretty much the same information is
given in a correlation or scatter plot matrix, which can be much more


Garrard, Wendy M.
> Thanks very much. The "predict" is just what I needed.  Also, I 
> appreciate your suggestion about using pca instead of factor since I 
> am using regression. I had noticed Stata has two commands that do 
> principal components; pca, and the pcf option within factor. I 
> generally use the pcf  factor option, since I usually want to reduce 
> several predictor variables to a single factor for purposes of 
> regression.
> I am a bit confused about the difference Stata is making with --pca-- 
> and --factor, pcf--, and should undoubtedly become familiar with this.
> Would you mind pointing out the gist, and perhaps a reference for more
> detail?
