Dear Statalisters,
I have a question about applying Ward's linkage method in a cluster analysis:
Our raw data form a 30 x 1200000 matrix of binary variables.
We then use the Jaccard similarity coefficient to compare each pair of the 30 objects and store the coefficients in a new matrix. This gives us a 30 x 30 similarity matrix with continuous values between 0 and 1.
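To make the step above concrete, here is a minimal sketch in Python (not our actual Stata code) of how such a similarity matrix can be built. The function names `jaccard` and `similarity_matrix`, the toy data, and the convention for two all-zero rows are assumptions made for illustration only.

```python
def jaccard(a, b):
    """Jaccard similarity of two equal-length 0/1 sequences."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumed convention: two all-zero rows are treated as identical.
    return both / either if either else 1.0


def similarity_matrix(rows):
    """Pairwise Jaccard similarities between all rows (objects)."""
    n = len(rows)
    return [[jaccard(rows[i], rows[j]) for j in range(n)] for i in range(n)]


# Toy data: 3 objects x 6 binary variables (the real matrix is 30 x 1200000).
toy = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
]

S = similarity_matrix(toy)                  # similarities in [0, 1]
D = [[1.0 - s for s in row] for row in S]   # one common conversion to dissimilarities
```

Objects 0 and 1 share two of the three variables on which either equals 1, so their similarity is 2/3; object 2 shares nothing with the others, so its similarity to both is 0.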
We first apply single linkage to this 30 x 30 similarity matrix in order to identify outliers among the objects. We then apply both average linkage and Ward's linkage to find appropriate clusters among the remaining objects (outliers excluded). The resulting dendrograms look quite reasonable, but I have read in the literature that variables must be measured on a metric scale when Ward's linkage is used for clustering.
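As an illustration of the outlier step, the following is a rough Python sketch of naive single linkage on a dissimilarity matrix (for example, 1 minus the Jaccard similarity). It is not our Stata code; the function name `single_linkage` and the 4 x 4 toy matrix are hypothetical, and reading a late merge at a large height as a sign of an outlier is only one possible criterion.

```python
def single_linkage(D):
    """Naive agglomerative single linkage on a symmetric dissimilarity matrix.

    Returns the merges in order as (members_a, members_b, height).
    """
    clusters = [[i] for i in range(len(D))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest nearest-neighbour distance.
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = min(D[i][j] for i in clusters[p] for j in clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        merges.append((sorted(clusters[p]), sorted(clusters[q]), d))
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return merges


# Hypothetical dissimilarities for 4 objects; object 3 is far from the rest.
D = [
    [0.00, 0.20, 0.30, 0.90],
    [0.20, 0.00, 0.25, 0.85],
    [0.30, 0.25, 0.00, 0.80],
    [0.90, 0.85, 0.80, 0.00],
]

merges = single_linkage(D)
for a, b, height in merges:
    print(a, b, height)
```

In this toy example object 3 joins the other three only in the final merge, at a much larger height than the earlier merges, which is the kind of pattern we look for in the single linkage dendrogram.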
Hence my question: can Ward's linkage method be applied in this case, where the raw data matrix is binary?
Thank you very much in advance,
Jochen
------------
Jochen Siegele
Universitaet Karlsruhe (TH)
Institut fuer Wirtschaftspolitik und Wirtschaftsforschung (IWW)
Sektion Verkehr und Kommunikation
Postfach 6980, D-76128 Karlsruhe