Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Red Owl <rh.redowl@liu.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: Re: Re: Re: st: Factors correlated after -predict-... What is going wrong?
Date: Thu, 12 Dec 2013 13:13:38 -0500

Trevor,

I'm going to take some major liberties in this response. With great trepidation, I will offer a heuristic (not technically accurate) explanation that will probably be painful to the statisticians here. Hopefully, however, it may help explain why no rotation method is likely to produce perfectly uncorrelated factor scores with real-world data.

First, think of factor scores and factors as separate entities.

Think of the factor scores for any factor as equivalent to the estimates or predictions from a regression model. If those predictions were perfect (i.e., error free with all residuals = 0), then the predicted values would all fall exactly on the regression line implied by the model. That is not likely to happen, however, so the predictions likely will not perfectly reproduce the regression line but, rather, will be clustered around that regression line with some degree of error.

Think of the factors as equivalent to regression lines (which under varimax or any other form of orthogonal rotation are kept 90 degrees apart). (The technical term for the factor lines is eigenvectors, and varimax maintains orthogonal eigenvectors.)

Essentially, the same issue exists with factors and their factor scores as we see with regression models and their estimates. You can view factor scores as estimates of the factors, but those estimates almost surely include error (i.e., residuals not equal to 0). So, the factor scores will not usually fall exactly on the factor lines (i.e., the eigenvectors).

Varimax will produce _factors_ (eigenvectors) that are orthogonal, but the predicted _factor scores_ will only be perfectly orthogonal in the rare case of perfect predictions of the factors with all residuals = 0. So, it is entirely possible to have orthogonal factors but factor scores that are correlated to some degree.

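For readers who want the heuristic above in symbols, here is a rough sketch under the standard common factor model. The notation is an assumption introduced for illustration, not something from the thread: x is the vector of standardized observed variables, \Lambda the (rotated) loading matrix, f the factors with identity covariance, \Psi the diagonal matrix of uniquenesses (assumed positive), and \hat f the regression-method scores, which I believe is what -predict- computes by default after -factor-. These are population formulas, so sample correlations will also carry sampling error on top of this.

```latex
\[
\begin{aligned}
x &= \Lambda f + e, \qquad \operatorname{Cov}(f) = I, \qquad \operatorname{Cov}(e) = \Psi, \\
\Sigma &= \operatorname{Cov}(x) = \Lambda \Lambda^{\top} + \Psi, \\
\hat f &= \Lambda^{\top} \Sigma^{-1} x, \\
\operatorname{Cov}(\hat f) &= \Lambda^{\top} \Sigma^{-1} \Sigma \, \Sigma^{-1} \Lambda
  = \Lambda^{\top} \Sigma^{-1} \Lambda
  = I - \left( I + \Lambda^{\top} \Psi^{-1} \Lambda \right)^{-1}.
\end{aligned}
\]
```

The last expression is not the identity matrix whenever the uniquenesses are nonzero, and its off-diagonal elements vanish only if \Lambda^{\top} \Psi^{-1} \Lambda happens to be diagonal, which a varimax-rotated loading matrix need not satisfy. So the factors can be orthogonal (Cov(f) = I) while the estimated scores are correlated.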
It seems, then, that you were basically expecting a factor rotation algorithm and a factor score prediction algorithm to transform your data into factor scores that are measured without error and fall exactly on the lines (eigenvectors) -- which would be like spinning gold out of straw.

If you want to minimize the correlations between estimated factor scores, varimax is as good a choice for rotation as any, but you should not expect perfect orthogonality in the estimated factor scores. If you are lucky, the correlations between factor scores will not be statistically significant, and you can treat the factor scores as possibly uncorrelated.

Hopefully, one of the statisticians on this list will "fix" my attempted heuristic explanation if I have missed the mark with my regression analogy.

Red Owl
redowl@liu.edu

> Red and William,
>
> Thanks for the replies. I initially also expected it was an estimation sample issue, but I tried adjusting for that, and as Red's example shows, it doesn't fix the issue. Thanks for the insight on varimax--I was indeed under the impression that varimax would always produce perfectly orthogonal factors. Interesting to know this is not the case. Is there another method I should consider that produces less correlated factor scores?
>
>
> Thanks again,
> Trevor

>>On 12/12/2013 4:26 AM, Red Owl wrote:
>>
>>I doubt Trevor's concern is due exclusively to a failure to
>>maintain the e(sample) in estimating the factor score correlations. I
>>believe the problem is that he was expecting that varimax rotation would
>>always produce perfectly uncorrelated factor scores and that their
>>correlation matrix should match the identity matrix presented after
>>-estat common-.
>>
>>See the following example, which demonstrates that (a) -estat common-
>>simply produces an identity matrix after varimax rotation, as the mv.pdf
>>documentation indicates, (b) the estimated factor scores in this case
>>are not perfectly orthogonal even after varimax rotation, and (c) the
>>correlation matrix of factor scores calculated with -if e(sample)- does
>>not reproduce the identity matrix with either pairwise or
>>listwise/casewise deletion of cases with missing values.
>>
>>** Begin Example
>>use http://www.stata-press.com/data/r13/sp2, clear
>>factor ghp31-ghp05, fac(3)
>>rotate, varimax
>>estat common
>>predict f1-f3
>>pwcorr f1-f3 if e(sample), sig
>>corr f1-f3 if e(sample)
>>** End Example
>>
>>Red Owl
>>redowl@liu.edu

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/