Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Trevor Zink <tzink@bren.ucsb.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Factors correlated after -predict-... What is going wrong?
Date: Thu, 12 Dec 2013 12:44:46 -0800

This is a great explanation--it makes much more sense now! Thank you!

Trevor

On 12/12/2013 10:13 AM, Red Owl wrote:

Trevor,

I'm going to take some major liberties in this response. With great trepidation, I will offer a heuristic (not technically accurate) explanation that will probably be painful to the statisticians here. Hopefully, however, this heuristic explanation may help explain why no rotation method is likely to produce perfectly uncorrelated factor scores with real world data.

First, think of factor scores and factors as separate entities.

Think of the factor scores for any factor as equivalent to the estimates or predictions from a regression model. If those predictions were perfect (i.e., error free with all residuals = 0), then the predicted values would all fall exactly on the regression line implied by the model. That is not likely to happen, however, so the predictions likely will not perfectly reproduce the regression line but, rather, will be clustered around that regression line with some degree of error.

Think of the factors as equivalent to regression lines (which under varimax or any other form of orthogonal rotation are kept 90 degrees apart). (The technical term for the factor lines is eigenvectors, and varimax maintains orthogonal eigenvectors.)

Essentially, the same issue exists with factors and their factor scores as we see with regression models and their estimates. You can view factor scores as estimates of the factors, but those estimates almost surely include error (i.e., residuals not equal to 0). So, the factor scores will not usually fall exactly on the factor lines (i.e., the eigenvectors).

Varimax will produce _factors_ (eigenvectors) that are orthogonal, but the predicted _factor scores_ will only be perfectly orthogonal in the rare case of perfect predictions of the factors with all residuals = 0. So, it is entirely possible to have orthogonal factors but factor scores that are correlated to some degree.
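The regression analogy above can be made a little more concrete with a short derivation. This is only a sketch under assumptions that are not stated in the thread: the common factor model holds exactly in the population, the observed variables x are centered, the factors are orthogonal with unit variance, and the scores are computed by the regression method (the default for -predict- after -factor-). The symbols \Lambda (rotated loadings), \Psi (diagonal matrix of unique variances), and J are introduced here purely for illustration.

```latex
% Assumed model (not from the thread): x = \Lambda f + e,
% Cov(f) = I, Cov(e) = \Psi (diagonal), f and e uncorrelated.
\begin{align*}
\Sigma &= \Lambda\Lambda' + \Psi, \\
\hat{f} &= \Lambda'\Sigma^{-1}x, \\
\operatorname{Cov}(\hat{f}) &= \Lambda'\Sigma^{-1}\,\Sigma\,\Sigma^{-1}\Lambda
  = \Lambda'\Sigma^{-1}\Lambda \\
 &= I - (I + J)^{-1}, \qquad J = \Lambda'\Psi^{-1}\Lambda .
\end{align*}
```

The last step follows from the Woodbury identity applied to \Sigma^{-1}. Under these assumptions, the covariance of the scores approaches the identity only as the unique variances shrink toward zero (the "perfect prediction" case), and the scores are mutually uncorrelated only when J happens to be diagonal, which a varimax-rotated solution does not generally guarantee. With real data, the sample correlation matrix stands in for \Sigma, so even this holds only approximately.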
It seems, then, that you were basically expecting a factor rotation algorithm and a factor scores prediction algorithm to transform your data into factor scores that can be measured without error and fall exactly on the lines (eigenvectors) -- which would be like spinning gold out of straw.

If you want to minimize the correlations between estimated factor scores, varimax is as good a choice for rotation as any, but you should not expect perfect orthogonality in the estimated factor scores. If you are lucky, the correlations between factor scores will not be statistically significant, and you can treat the factor scores as possibly uncorrelated.

Hopefully, one of the statisticians on this list will "fix" my attempted heuristic explanation if I have missed the mark with my regression analogy.

Red Owl
redowl@liu.edu

Red and William,

Thanks for the replies. I initially also expected it was an estimation sample issue, but I tried adjusting for that, and as Red's example shows, it doesn't fix the issue.

Thanks for the insight on varimax--I was indeed under the impression that varimax would always produce perfectly orthogonal factors. Interesting to know this is not the case. Is there another method I should consider that produces less correlated factor scores?

Thanks again,
Trevor

On 12/12/2013 4:26 AM, Red Owl wrote:

I doubt Trevor's concern is due exclusively to a failure to maintain the e(sample) in estimating the factor score correlations. I believe the problem is that he was expecting that varimax rotation would always produce perfectly uncorrelated factor scores and that their correlation matrix should match the identity matrix presented after -estat common-.
See the following example, which demonstrates that (a) -estat common- simply produces an identity matrix after varimax rotation, as the mv.pdf documentation indicates, (b) the estimated factor scores in this case are not perfectly orthogonal even after varimax rotation, and (c) the correlation matrix of factor scores calculated with -if e(sample)- does not reproduce the identity matrix with either pairwise or listwise/casewise deletion of cases with missing values.

** Begin Example
use http://www.stata-press.com/data/r13/sp2, clear
factor ghp31-ghp05, fac(3)
rotate, varimax
estat common
predict f1-f3
pwcorr f1-f3 if e(sample), sig
corr f1-f3 if e(sample)
** End Example

Red Owl
redowl@liu.edu

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

--
Trevor Zink, MBA, MA
Ph.D. Candidate
UC Regents Special Fellow
Bren School of Environmental Science and Management
University of California, Santa Barbara
tzink@bren.ucsb.edu

References:
Re: Re: Re: st: Factors correlated after -predict-... What is going wrong?
From: Red Owl <rh.redowl@liu.edu>
