Re: Re: Re: st: Factors correlated after -predict-... What is going wrong?

From	Red Owl <[email protected]>
To	<[email protected]>
Subject	Re: Re: Re: st: Factors correlated after -predict-... What is going wrong?
Date	Thu, 12 Dec 2013 13:13:38 -0500

Trevor,

I'm going to take some major liberties in this response. With great
trepidation, I will offer a heuristic (not technically accurate)
explanation that will probably be painful to the statisticians here.
I hope, however, that it helps explain why no rotation method is
likely to produce perfectly uncorrelated factor scores with
real-world data.

First, think of factor scores and factors as separate entities.

Think of the factor scores for any factor as equivalent to the estimates
or predictions from a regression model.  If those predictions were
perfect (i.e., error free with all residuals = 0), then the predicted
values would all fall exactly on the regression line implied by the
model.  That is not likely to happen, however, so the predictions likely
will not perfectly reproduce the regression line but, rather, will be
clustered around that regression line with some degree of error.

Think of the factors as equivalent to regression lines (which under
varimax or any other form of orthogonal rotation are kept 90 degrees
apart).  (The technical term for the factor lines is eigenvectors, and
varimax maintains orthogonal eigenvectors.)

Essentially, the same issue exists with factors and their factor scores
as we see with regression models and their estimates.

You can view factor scores as estimates of the factors, but those
estimates almost surely include error (i.e., residuals not equal to 0).
So, the factor scores will not usually fall exactly on the factor lines
(i.e., the eigenvectors).  Varimax will produce _factors_ (eigenvectors)
that are orthogonal, but the predicted _factor scores_ will only be
perfectly orthogonal in the rare case of perfect predictions of the
factors with all residuals = 0.

So, it is entirely possible to have orthogonal factors but factor scores
that are correlated to some degree.
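
To see this in Stata, something along the following lines should work.
It is only a sketch: it reuses the sp2 data and variable list from my
earlier example (quoted below), and the score names reg1-reg3 and
bart1-bart3 are arbitrary names chosen for illustration. It scores the
same varimax-rotated solution two ways, with the default regression
method and with the -bartlett- option of -predict-, and then correlates
each set of scores. I am assuming the Bartlett scores can be computed
for this particular solution.

```stata
** Begin Example
* Sketch only: data and variable list are taken from the earlier example
use http://www.stata-press.com/data/r13/sp2, clear
factor ghp31-ghp05, factors(3)
rotate, varimax
* Correlation matrix of the rotated common factors
estat common
* Regression-method scores (the default); reg1-reg3 are arbitrary names
predict reg1 reg2 reg3
* Bartlett-method scores; bart1-bart3 are arbitrary names
predict bart1 bart2 bart3, bartlett
* Correlations among the estimated scores, estimation sample only
correlate reg1 reg2 reg3 if e(sample)
correlate bart1 bart2 bart3 if e(sample)
** End Example
```

Compare the output of -estat common- with the two -correlate- tables:
the first describes the factors, the other two describe the estimated
scores, and there is no guarantee that the latter reproduce the former.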

It seems, then, that you were basically expecting a factor rotation
algorithm and a factor scores prediction algorithm to transform your
data into factor scores that can be measured without error and fall
exactly on the lines (eigenvectors) -- which would be like spinning
gold out of straw.

If you want to minimize the correlations between estimated factor
scores, varimax is as good a choice for rotation as any, but you should
not expect perfect orthogonality in the estimated factor scores.  If you
are lucky, the correlations between factor scores will be small and not
statistically significant, and you can treat the factor scores as
approximately uncorrelated.
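
A rough way to check whether the score correlations are distinguishable
from zero is -pwcorr- with the -sig- option. The sketch below again
assumes the sp2 data from my earlier example, and f1-f3 are arbitrary
names for the predicted scores. The -obs- and -bonferroni- options of
-pwcorr- add the number of observations and adjust the significance
levels for multiple comparisons.

```stata
** Begin Example
* Sketch only: same data and factor model as the earlier example
use http://www.stata-press.com/data/r13/sp2, clear
factor ghp31-ghp05, factors(3)
rotate, varimax
* f1-f3 are arbitrary names for the regression-method scores
predict f1 f2 f3
* Pairwise correlations with significance levels and observation counts
pwcorr f1 f2 f3 if e(sample), sig obs
* Same correlations, Bonferroni-adjusted significance levels
pwcorr f1 f2 f3 if e(sample), sig bonferroni
** End Example
```

Keep in mind that with a large sample even a very small correlation can
come out statistically significant, so look at the size of the
correlations as well as the significance levels.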

Hopefully, one of the statisticians on this list will "fix" my attempted
heuristic explanation if I have missed the mark with my regression analogy.

Red Owl
[email protected]

> Red and William,
>
> Thanks for the replies. I initially also expected it was an estimation
> sample issue, but I tried adjusting for that, and as Red's example
> shows, it doesn't fix the issue. Thanks for the insight on varimax--I
> was indeed under the impression that varimax would always produce
> perfectly orthogonal factors. Interesting to know this is not the case.
> Is there another method I should consider that produces less correlated
> factor scores?
>
>
> Thanks again,
> Trevor

>>On 12/12/2013 4:26 AM, Red Owl wrote:
>>
>>I doubt Trevor's concern is due exclusively to a failure to
>>maintain the e(sample) in estimating the factor score correlations.  I
>>believe the problem is that he was expecting that varimax rotation would
>>always produce perfectly uncorrelated factor scores and that their
>>correlation matrix should match the identity matrix presented after
>>-estat common-.
>>
>>See the following example, which demonstrates that (a) -estat common-
>>simply produces an identity matrix after varimax rotation, as the mv.pdf
>>documentation indicates, (b) the estimated factor scores in this case
>>are not perfectly orthogonal even after varimax rotation, and (c) the
>>correlation matrix of factor scores calculated with -if e(sample)- does
>>not reproduce the identity matrix with either pairwise or
>>listwise/casewise deletion of cases with missing values.
>>
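>>** Begin Example
>>* Load the example data and extract three factors
>>use http://www.stata-press.com/data/r13/sp2, clear
>>factor ghp31-ghp05, fac(3)
>>* Varimax (orthogonal) rotation, then the factor correlation matrix
>>rotate, varimax
>>estat common
>>* Predict factor scores from the rotated solution
>>predict f1-f3
>>* Score correlations with pairwise deletion, with significance levels
>>pwcorr f1-f3 if e(sample), sig
>>* Score correlations with listwise/casewise deletion
>>corr f1-f3 if e(sample)
>>** End Example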
>>
>>Red Owl
>>[email protected]