Re: st: -factor- with binary variables
Thanks Clive, I'll check out the references.
Quoting Clive Nicholas <Clive.Nicholas@newcastle.ac.uk>:
> Patricia Sourdin wrote:
> > just a query on -factor-.
> > I am trying to construct an index where I have five variables which are
> > binary indicators.
> > I have read somewhere that it is not appropriate to use factor analysis if
> > the variables are binary. Can anyone confirm, please?
> Well, if you ever read what Chatfield and Collins (1980) had to say (or,
> should I say, spit?) on CFA, as they prefer to call it, you'll find they
> consider it such a useless and unreliable method of data analysis (largely,
> they say, because it's difficult to replicate) that there's little point in
> wasting your time doing it! I don't entirely share this view, however. :)
> The whole point of factor analysis, as I understand it, is to explore (in
> a preliminary fashion) correlations between variables that appear to 'hang
> together', which in turn _could_ be combined into new variables in further
> analysis if it were both valid and desirable to do so.
> It's no accident that FA was part-invented by Karl Pearson back in the
> early 1900s. Strictly speaking, you're not meant to run Pearson
> correlations between binary/discrete variables because they are designed
> for continuous variables only. You use chi-square tests for two binary
> variables and eta-coefficient tests if one variable is continuous and the
> other is discrete. But as Eric Morecambe would have said, "Come on now, be
> honest!" How many of us have run Pearson correlations inappropriately? I
> know I have: and I'm not proud of myself, either.
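
A small check of the point above, offered as a sketch and not as anything taken from the posts themselves: for two 0/1 variables the Pearson correlation can still be computed, and it coincides with the phi coefficient of the 2x2 table, while the (uncorrected) Pearson chi-square statistic equals n times phi squared. The data and the function names `pearson` and `phi_and_chi2` are made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def phi_and_chi2(x, y):
    """Phi coefficient and uncorrected chi-square from the 2x2 table of two 0/1 lists."""
    n = len(x)
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    num = n11 * n00 - n10 * n01
    den = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    phi = num / den
    return phi, n * phi ** 2

# Hypothetical binary indicators, invented for this example only.
x = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

r = pearson(x, y)
phi, chi2 = phi_and_chi2(x, y)
print(r, phi, chi2)
```

So the two approaches measure the same association here; whether a matrix of such coefficients is a sensible input to a factor analysis is the separate question discussed in this thread.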
> I have flicked through perhaps one of the most accessible books around on
> factor analysis (Kline, 1994). Although he does not say that the use of
> binary variables is disallowed, _all_ of the examples in the book
> use either scale measures or naturally continuous scores, such
> as years of education or age. Therefore, I would advise against using
> binary variables in factor analysis.
> > Also, would -pca- be an alternative in this case?
> Principal components analysis is similar to FA in that it's a data
> reduction technique. However, the 'factors' extracted in FA are
> hypothetical: it's left to you to describe why the variables that form the
> factor(s) just extracted have something in common. PCA is rather more
> mechanical in its approach: in practice, it's all about estimating how much
> variance the first couple of PCs account for, regardless of whether they
> _really_ have something in common or not. Thus, dropping binary variables
> into a PCA would make it no more valid, in my view.
> CLIVE NICHOLAS |t: 0(044)7903 397793
> Politics |e: firstname.lastname@example.org
> Newcastle University |http://www.ncl.ac.uk/geps
> Chatfield C and Collins AJ (1980) INTRODUCTION TO MULTIVARIATE ANALYSIS,
> London: Chapman and Hall.
> Kline P (1994) AN EASY GUIDE TO FACTOR ANALYSIS, London: Routledge.
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/