
From: "Clive Nicholas" <Clive.Nicholas@newcastle.ac.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: -factor- with binary variables
Date: Sun, 28 Nov 2004 10:08:48 -0000 (GMT)

Patricia Sourdin wrote:

> just a query on -factor-.
> I am trying to construct an index where I have five variables which are
> binary indicators.
> I have read somewhere that it is not appropriate to use factor analysis if
> the variables are binary. Can anyone confirm, please?

Well, if you ever read what Chatfield and Collins (1980) had to say (or, should I say, spit?) on CFA, as they prefer to call it, it's such a useless and unreliable method of data analysis (largely, they say, because it's difficult to replicate) that there's little point in wasting your time doing it! I don't entirely share this view, however. :)

The whole point of factor analysis, as I understand it, is to explore (in a preliminary fashion) correlations between variables that appear to 'hang together', which in turn _could_ be combined into new variables in further analysis if it were both valid and desirable to do so. It's no accident that FA was part-invented by Karl Pearson back in the early 1900s.

Strictly speaking, you're not meant to run Pearson correlations between binary/discrete variables because they are designed for continuous variables only. You use chi-square tests for two binary variables and eta-coefficient tests if one variable is continuous and the other is discrete. But as Eric Morecambe would have said, "Come on now, be honest!" How many of us have run Pearson correlations inappropriately? I know I have, and I'm not proud of myself, either.

I have flicked through perhaps one of the most accessible books around on factor analysis (Kline, 1994). Although he does not say that the use of binary variables is disallowed, _all_ of the examples in the book use either scale measures or naturally continuous scores, such as years of education or age. Therefore, I would advise against using binary variables in factor analysis.

> Also, would -pca- be an alternative in this case?

Principal components analysis is similar to FA in that it's a data reduction technique. However, the 'factors' extracted in FA are hypothetical: it's left to you to describe why the variables that form the factor(s) just extracted have something in common. PCA is rather more mechanical in its approach: in practice, it's all about estimating how much variance the first couple of PCs account for, regardless of whether the variables _really_ have something in common or not. Thus, dropping binary variables into a PCA would be no more valid than using them in FA, in my view.

CLIVE NICHOLAS        |t: 0(044)7903 397793
Politics              |e: clive.nicholas@ncl.ac.uk
Newcastle University  |http://www.ncl.ac.uk/geps

References:

Chatfield C and Collins AJ (1980) INTRODUCTION TO MULTIVARIATE ANALYSIS, London: Chapman and Hall.

Kline P (1994) AN EASY GUIDE TO FACTOR ANALYSIS, London: Routledge.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
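For anyone who wants to see what the points above look like in numbers, here is a minimal sketch in Python with NumPy (not Stata, and not anything from the original thread). The data are simulated and entirely hypothetical: five binary indicators driven by one made-up latent trait, with thresholds chosen only for illustration. It shows two things: the Pearson correlation between two 0/1 variables coincides with the phi coefficient obtained from the 2x2 chi-square statistic, and a PCA of the correlation matrix simply reports the share of variance carried by each component.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 respondents, five binary indicators that all
# depend on one simulated latent trait plus noise (assumption for the demo).
n = 500
latent = rng.normal(size=n)
thresholds = np.array([-0.5, 0.0, 0.3, 0.6, 1.0])
X = (latent[:, None] + rng.normal(size=(n, 5)) > thresholds).astype(float)


def phi_from_chi2(a, b):
    """Signed phi coefficient from the 2x2 Pearson chi-square (no continuity correction)."""
    table = np.array(
        [[np.sum((a == i) & (b == j)) for j in (0, 1)] for i in (0, 1)],
        dtype=float,
    )
    total = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / total
    chi2 = np.sum((table - expected) ** 2 / expected)
    # The sign comes from the cross-product of the diagonal cells.
    sign = np.sign(table[0, 0] * table[1, 1] - table[0, 1] * table[1, 0])
    return sign * np.sqrt(chi2 / total)


# Pearson r on two binary variables versus phi from the chi-square statistic.
pearson_r = np.corrcoef(X[:, 0], X[:, 1])[0, 1]
phi = phi_from_chi2(X[:, 0], X[:, 1])

# PCA on the correlation matrix: eigenvalues give the variance per component.
corr = np.corrcoef(X, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
explained = eigenvalues / eigenvalues.sum()

print(f"Pearson r = {pearson_r:.4f}, phi = {phi:.4f}")
print("Share of variance by component:", np.round(explained, 3))
```

The sketch only demonstrates the arithmetic. It says nothing about whether a factor or component built from binary items is substantively meaningful, which is the question raised in the thread.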

**Follow-Ups**: **Re: st: -factor- with binary variables** *From:* Patricia Sourdin <patricia.sourdin@adelaide.edu.au>

**References**: **st: -factor- with binary variables** *From:* Patricia Sourdin <patricia.sourdin@adelaide.edu.au>

