
RE: st: RE: Re: Cluster Analysis

From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: Re: Cluster Analysis
Date   Mon, 20 Oct 2003 19:04:42 +0100

Michael I. Lichter

> Could Nick or someone else explain (a) what is meant by 
> "continua," and
> (b) how you justify & properly handle 
> categorical/dichotomous variables
> in PCA? Thanks.
> -ml
> Quoting Nick Cox <[email protected]>:
> > 1. You are interested, naturally enough, in correlations 
> > among predictors. Whether observations are clustered 
> > together is a different issue. It is easy to think 
> > of continua with high correlations, continua with 
> > low correlations, cluster structure with high 
> > correlations and cluster structure with low 
> > correlations. 
> >
> > 2. If cluster structure exists, it will be evident 
> > in plots of the first few principal components. 
> > The fact that some of your variables
> > are categorical or binary would complicate a PCA without 
> > making it impossible. 

"continua" is the plural of "continuum". 

PCA to me is a transformation procedure. 
What is "justified" or "proper" in PCA 
may differ for you if you have different 
expectations of what it can achieve. 
Some analysts seem determined to try 
to turn it into a modelling or inferential 
procedure. 

I don't see why e.g. a 0-1 variable 
can't be an input variable to PCA. This to 
me is no more and no less problematic 
than putting such a variable as a predictor 
into a regression model. I would imagine 
that a purely nominal variable should be 
best inserted as a series of indicators. 
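
To make the indicator idea concrete, here is a minimal sketch in 
plain Python rather than Stata. Everything in it is an assumption made 
for illustration: the data are invented, the variable names (height, 
weight, smoker, region) are hypothetical, and power iteration stands 
in for a full eigen-decomposition. It expands a nominal variable into 
0-1 indicators, puts them next to a 0-1 variable and two measured 
variables, and extracts the first principal component of the 
correlation matrix.

```python
import math

def indicators(values):
    """Expand a nominal variable into one 0-1 indicator column per category."""
    levels = sorted(set(values))
    return {lev: [1.0 if v == lev else 0.0 for v in values] for lev in levels}

def standardize(col):
    """Centre a column and scale it to unit (population) standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
    return [(x - mean) / sd for x in col]

def correlation_matrix(columns):
    """Correlation matrix of a list of equal-length columns."""
    z = [standardize(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]

def first_component(corr, iterations=1000):
    """Leading eigenvalue and unit eigenvector of corr, by power iteration."""
    p = len(corr)
    vec = [1.0 / (k + 1) for k in range(p)]  # arbitrary non-zero start
    for _ in range(iterations):
        new = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    eig = sum(vec[i] * corr[i][j] * vec[j] for i in range(p) for j in range(p))
    return eig, vec

# Invented data: two measured variables, one 0-1 variable, one nominal variable.
height = [150, 160, 165, 170, 172, 180, 185, 190]
weight = [52, 58, 63, 70, 68, 82, 88, 95]
smoker = [0, 1, 0, 0, 1, 1, 0, 1]
region = ["north", "south", "east", "north", "south", "east", "north", "south"]

# The nominal variable enters as a series of indicators (east, north, south).
columns = [height, weight, smoker] + list(indicators(region).values())
corr = correlation_matrix(columns)
eig, vec = first_component(corr)

# Scores on the first component: weighted sums of the standardized inputs.
z = [standardize(c) for c in columns]
scores = [sum(v * zc[i] for v, zc in zip(vec, z)) for i in range(len(height))]

print("first eigenvalue:", round(eig, 3))
print("loadings:", [round(v, 3) for v in vec])
print("scores:", [round(s, 3) for s in scores])
```

Nothing in the arithmetic objects to 0-1 inputs; they are standardized 
and weighted like any other column. Note that a full set of indicators 
for one nominal variable sums to 1 in every observation, so the 
correlation matrix is singular and its smallest eigenvalue is zero. 
That does not stop the leading components from being computed.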

However, I do think that PCA works best 
when all variables are measured quantities 
rather than categories. 

[email protected] 
