Re: st: Using Cluster stop with Clustermat


From   [email protected]
To   [email protected]
Subject   Re: st: Using Cluster stop with Clustermat
Date   Fri, 22 Jul 2005 11:54:38 -0500

Allan Garland <[email protected]> asks:

> I am doing a cluster analysis on variables (columns), not observations 
> (rows).  So, I've got my dissimilarity matrix, and because this is being 
> done on the variables rather than the observations, the syntax of the 
> clustermat command is:
> 
> clustermat averagelinkage `DISMatrix', shape(full)  clear  
> labelvar(varnames)
> 
> Use of the "clear" option seems to be necessary, because if I don't use 
> it the dimension of my data clashes with that of the cluster solution, 
> and I get an error message.
> 
>  OK -- the problem then is that when I want to use cluster stop with 
> clustermat, I'm required to delineate the variables to use, BUT the 
> original dataset has been eliminated from memory and replaced with the 
> cluster solution -- so when I do:
> 
> cluster stop, rule(calinski) variables(`varlist')
> 
>  I get an error telling me it can't find the variable names (of course).
> 
> So, can anyone advise how I actually use cluster stop in this situation?

Stata thinks of a dataset as a data matrix with observations
(rows) and variables (columns), and -cluster stop- and
-clustermat stop- are no exception.  They compute the cluster
stopping measures by splitting the *ROWS* into the various
numbers of groups.  When you cluster variables (columns) instead
of observations (rows), the information they need is organized
along the COLUMNS instead of along the ROWS.

One of several possible uses of -clustermat- is to cluster
variables (as documented in the manual, see [MV] clustermat).
-clustermat stop- is difficult to use in this setting (though
still useful in other settings where -clustermat- is used).

There is another approach for clustering variables.  You can use
the -xpose- command to flip the variables into observations.
Then use -cluster- instead of -clustermat-.  Then you can use
-cluster stop- for the stopping rules.
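
As a rough sketch of that approach (untested against your data):
the local macro `varlist' is taken from your question and is
assumed to hold numeric variables, and the cluster name vclus is
made up for illustration.  Note that -cluster- computes its own
dissimilarities (Euclidean distance by default for continuous
data), which may not match the matrix you passed to -clustermat-;
the measure() option would need to be chosen accordingly.

```stata
* Sketch only: `varlist' and the name vclus are assumptions.
preserve                          // keep a copy of the original data
keep `varlist'                    // retain only the variables to be clustered

* Transpose: each original variable becomes one observation.
* The new variables are named v1, v2, ...; the varname option
* stores the original variable names in _varname.
xpose, clear varname

* Cluster the transposed observations (that is, the original variables).
cluster averagelinkage v*, name(vclus)

* Stopping rules now work because the information runs along the rows.
cluster stop vclus, rule(calinski)
cluster stop vclus, rule(duda)

restore                           // bring back the original dataset
```

Because -cluster stop- defaults to the variables used in the
cluster analysis, no variables() option is needed here.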

Ken Higbee    [email protected]
StataCorp     1-800-STATAPC



