Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Factors correlated after -predict-... What is going wrong?

From: Trevor Zink
Subject: Re: st: Factors correlated after -predict-... What is going wrong?
Date: Thu, 12 Dec 2013 09:28:15 -0800

```Thanks, Nick

Just to be clear, since Stata uses different terminology, when you say
PCA, do you mean -factor, pcf- or -factor, pf-? From what I can tell from
the manual, the PCF method is closer to PCA whereas the PF method is
closer to factor analysis.

Thanks
Trevor

On 12/12/2013 9:20 AM, Nick Cox wrote:
```
```The use of factor analysis rather than PCA is typically based on some
ideas about what structure might exist, whether they are called theory
or not. If you don't have well-developed grounds for FA, PCA does have
the advantage you seek of uncorrelated scores.
Nick
njcoxstata@gmail.com

On 12 December 2013 17:10, Trevor Zink <tzink@bren.ucsb.edu> wrote:
```
```Red and William,

Thanks for the replies. I initially also expected it was an estimation
sample issue, but I tried adjusting for that, and as Red's example shows, it
doesn't fix the issue. Thanks for the insight on varimax--I was indeed under
the impression that varimax would always produce perfectly orthogonal
factors. Interesting to know this is not the case.
Is there another method I should consider that produces less correlated
factor scores?

Thanks again,
Trevor

On 12/12/2013 4:26 AM, Red Owl wrote:
```
```I doubt Trevor's concern is due exclusively to a failure to
maintain the e(sample) in estimating the factor score correlations.  I
believe the problem is that he was expecting that varimax rotation would
always produce perfectly uncorrelated factor scores and that their
correlation matrix should match the identity matrix presented after
-estat common-.

See the following example, which demonstrates that (a) -estat common-
simply produces an identity matrix after varimax rotation, as the mv.pdf
documentation indicates, (b) the estimated factor scores in this case
are not perfectly orthogonal even after varimax rotation, and (c) the
correlation matrix of factor scores calculated with -if e(sample)- does
not reproduce the identity matrix with either pairwise or
listwise/casewise deletion of cases with missing values.

** Begin Example
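use http://www.stata-press.com/data/r13/sp2, clear
* extract three factors (principal-factor method is the default)
factor ghp31-ghp05, fac(3)
* orthogonal varimax rotation
rotate, varimax
* correlation matrix of the common factors (identity here)
estat common
* factor scores (regression scoring is the default)
predict f1-f3
* score correlations with pairwise deletion
pwcorr f1-f3 if e(sample), sig
* score correlations with listwise/casewise deletion
corr f1-f3 if e(sample)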
** End Example

Red Owl
redowl@liu.edu

```
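Nick's PCA suggestion can be tried on the same data. The following is a minimal sketch, assuming the example dataset from Red Owl's message; the new variable names (pc1-pc3, g1-g3) are illustrative and not from the thread. It also runs -factor, pcf- followed by varimax for comparison. Whether each correlation matrix is exactly the identity should be read from the output rather than taken on trust.

```stata
* Sketch only: same example data as Red Owl's example;
* pc1-pc3 and g1-g3 are illustrative names
use http://www.stata-press.com/data/r13/sp2, clear

* (1) Principal component analysis, keeping three components
pca ghp31-ghp05, components(3)
predict pc1 pc2 pc3, score
corr pc1 pc2 pc3 if e(sample)

* (2) Principal-component factor method, then varimax rotation
factor ghp31-ghp05, pcf factors(3)
rotate, varimax
predict g1 g2 g3
corr g1 g2 g3 if e(sample)
```

Comparing these two matrices with the one from the example above shows how far each scoring approach departs from uncorrelated scores on this dataset.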
```Did you restrict your prediction to your estimation sample?  Maybe
some observations that were excluded from fitting the PCA had predicted
values and the pattern of missingness was correlated across those
observations?
William Buchanan <william@williambuchanan.net>
Sent from my iPhone
```
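William's check can be sketched as follows, assuming the same example dataset as in Red Owl's example above; the score names f1, f2, f3 are illustrative. The -count- line reports how many observations received scores without having been used in estimation, and the -corr- line restricts the correlations to the estimation sample.

```stata
* Sketch only: example data as in Red Owl's example
use http://www.stata-press.com/data/r13/sp2, clear
factor ghp31-ghp05, factors(3)
rotate, varimax
predict f1 f2 f3

* Observations that were scored but not used in estimation
count if !e(sample) & !missing(f1, f2, f3)

* Score correlations restricted to the estimation sample
corr f1 f2 f3 if e(sample)
```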
```On Dec 12, 2013, at 4:32, Red Owl <rh.redowl@liu.edu> wrote:

Trevor,

See mv.pdf (from help factor postestimation) on p. 317 in Stata 13.x
documentation, which states:

"estat common displays the correlation matrix of the common factors. For
orthogonal factor loadings, the common factors are uncorrelated, and
hence an identity matrix is shown. estat common is of more interest
after oblique rotations."

I recommend that you rely on the results of -pwcorr- or -corr- after
-predict-. Although varimax rotation is an orthogonal procedure, it does
not guarantee perfectly uncorrelated factor scores.

Red Owl
redowl@liu.edu
```
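One way to see why the scores can be correlated is sketched below. It assumes the textbook regression scoring formula, with z the standardized observed variables, R their sample correlation matrix, and Λ the (rotated) loading matrix. These symbols are introduced here for illustration, and the exact formula Stata uses should be checked against the manual.

```latex
\hat{f} = \Lambda^{\top} R^{-1} z,
\qquad
\operatorname{Var}(\hat{f})
  = \Lambda^{\top} R^{-1}\,\operatorname{Var}(z)\,R^{-1}\Lambda
  = \Lambda^{\top} R^{-1} \Lambda .
```

Under this assumption, the covariance matrix of the estimated scores is Λ'R⁻¹Λ, which is generally neither the identity nor diagonal, even though the model's common factors have identity correlation. An orthogonal rotation T replaces Λ by ΛT, so the score covariance becomes T'Λ'R⁻¹ΛT, which in general still has nonzero off-diagonal elements. -estat common- reports the correlation of the model's common factors, whereas -corr- on the predicted variables reports the correlation of the estimated scores.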
```Hi Statalist,

I am using -factor- to develop three factors, rotating them using
-rotate, varimax- and then producing variables from the factors using
-predict-. Varimax is an orthogonal rotation, so it should produce
factors with zero correlation. Testing the factors' correlation after
rotation with -estat common- produces the expected result, that
correlation is 0.
However, after I produce variables from the factors using -predict-,
these new variables are correlated. How? Why? I tried replicating the
steps using the example dataset from the manual (/r12/sp2), and in that
case the predicted variables also have zero correlation. So, I guess
it's something unique to my dataset, but I have no idea what. Any
ideas?

<snipped>

Thanks,
Trevor Zink
```
```*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
```
```
--
Trevor Zink, MBA, MA
Ph.D. Candidate
UC Regents Special Fellow
Bren School of Environmental Science and Management
University of California, Santa Barbara
tzink@bren.ucsb.edu <mailto:tzink@bren.ucsb.edu>

```