The Stata listserver

Re: st: -factor- with binary variables


From   Patricia Sourdin <patricia.sourdin@adelaide.edu.au>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: -factor- with binary variables
Date   Sun, 28 Nov 2004 21:04:02 +1030

Thanks Clive, I'll check out the references.

Quoting Clive Nicholas <Clive.Nicholas@newcastle.ac.uk>:

> Patricia Sourdin wrote:
>
> > just a query on -factor-.
> > I am trying to construct an index where I have five variables which are
> > binary indicators.
> > I have read somewhere that it is not appropriate to use factor analysis if
> > the variables are binary.  Can anyone confirm, please?
>
> Well, if you ever read what Chatfield and Collins (1980) had to say (or,
> should I say, spit?) on CFA, as they prefer to call it, it's such a
> useless and unreliable method of data analysis (largely, they say, because
> it's difficult to replicate), that there's little point in wasting your
> time doing it! I don't entirely share this view, however. :)
>
> The whole point of factor analysis, as I understand it, is to explore (in
> a preliminary fashion) correlations between variables that appear to 'hang
> together', which in turn _could_ be combined into new variables in further
> analysis if it were both valid and desirable to do so.
>
> It's no accident that FA was part-invented by Karl Pearson back in the
> early 1900s. Strictly speaking, you're not meant to run Pearson
> correlations between binary/discrete variables because they are designed
> for continuous variables only. You use chi-square tests for two binary
> variables and eta-coefficient tests if one variable is continuous and the
> other is discrete. But as Eric Morecambe would have said, "Come on now, be
> honest!"  How many of us have run Pearson correlations inappropriately? I
> know I have: and I'm not proud of myself, either.
>
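One way to see how the Pearson correlation and the chi-square test relate for two 0/1 variables: computed on binary data, Pearson's r is the phi coefficient, and its square times the sample size equals the Pearson chi-square statistic of the 2x2 table (without continuity correction). The sketch below is in Python rather than Stata, and the data and function names are made up purely for illustration.

```python
import math

def pearson_r(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def chi_square_2x2(x, y):
    """Pearson chi-square (no continuity correction) for two 0/1 variables."""
    n = len(x)
    counts = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    for a, b in zip(x, y):
        counts[(a, b)] += 1
    chi2 = 0.0
    for i in (0, 1):
        for j in (0, 1):
            row_total = counts[(i, 0)] + counts[(i, 1)]
            col_total = counts[(0, j)] + counts[(1, j)]
            expected = row_total * col_total / n
            chi2 += (counts[(i, j)] - expected) ** 2 / expected
    return chi2

# Hypothetical binary indicators (illustrative data only).
x = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

r = pearson_r(x, y)            # this is the phi coefficient for 0/1 data
chi2 = chi_square_2x2(x, y)
print(r, chi2, len(x) * r ** 2)
```

So on binary data the two statistics carry the same information about association; whether such correlations are a sensible input to -factor- is a separate question.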
> Having flicked through perhaps one of the most accessible books around on
> factor analysis (Kline, 1994), although he does not say that the use of
> binary variables is disallowed, _all_ of the examples in the book
> exclusively use either scale measures or naturally continuous scores, such
> as years of education or age. Therefore, I would advise against using
> binary variables in factor analysis.
>
> > Also, would -pca- be an alternative in this case?
>
> Principal components analysis is similar to FA in that it's a data
> reduction technique. However, the 'factors' extracted in FA are
> hypothetical: it's left to you to describe why the variables that form the
> factor(s) just extracted have something in common. PCA is rather more
> mechanical in its approach: in practice, it's all about estimating how much
> variance the first couple of PCs account for, regardless of whether they
> _really_ have something in common or not. Thus, dropping binary variables
> into a PCA would make it no more valid, in my view.
>
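To make "how much variance the first PC accounts for" concrete, here is a minimal sketch, again in Python rather than Stata. It builds the Pearson correlation matrix of five binary indicators, approximates the largest eigenvalue by power iteration, and divides by the number of variables. The data, the function names, and the choice of power iteration are all assumptions for illustration; this is not how -pca- is implemented.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length, non-constant lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlation matrix of a list of variables (each a list of values)."""
    return [[pearson_r(a, b) for b in columns] for a in columns]

def largest_eigenvalue(matrix, iterations=1000, tol=1e-12):
    """Approximate the largest eigenvalue of a symmetric PSD matrix.

    Uses power iteration from a fixed, non-uniform start vector. If that
    start vector happened to be orthogonal to the leading eigenvector,
    the result would be wrong; fine for a sketch, not for production.
    """
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]
    eig = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            return 0.0
        v = [c / norm for c in w]
        # Rayleigh quotient of the normalised vector
        new_eig = sum(
            v[i] * sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)
        )
        if abs(new_eig - eig) < tol:
            return new_eig
        eig = new_eig
    return eig

# Five hypothetical binary indicators (illustrative data only).
indicators = [
    [1, 1, 1, 0, 0, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 0, 1, 1, 1, 0],
    [0, 1, 1, 0, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 1, 1],
]

corr = correlation_matrix(indicators)
# The trace of a correlation matrix equals the number of variables.
share_first_pc = largest_eigenvalue(corr) / len(indicators)
print(share_first_pc)
```

The calculation runs on binary indicators just as it would on continuous scores, which is the point made above: the arithmetic says nothing about whether the resulting component is meaningful.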
> CLIVE NICHOLAS        |t: 0(044)7903 397793
> Politics              |e: clive.nicholas@ncl.ac.uk
> Newcastle University  |http://www.ncl.ac.uk/geps
>
> References:
>
> Chatfield C and Collins AJ (1980) INTRODUCTION TO MULTIVARIATE ANALYSIS,
> London: Chapman and Hall.
>
> Kline P (1994) AN EASY GUIDE TO FACTOR ANALYSIS, London: Routledge.
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>




