Re: st: Clustermat puzzle

From   "Abrams, Judith" <[email protected]>
To   <[email protected]>
Subject   Re: st: Clustermat puzzle
Date   Sat, 24 Mar 2012 11:43:40 -0400

Brendan Halpin <[email protected]> wrote:

I have a small matrix of pairwise distances (all integers) that I'm
passing to clustermat (Ward's method). I notice that if I scale the
distances by a constant, I get different results. On investigation, it
seems that scaling by an integer power of two gives one solution, and
scaling by any other factor gives another.

The code below demonstrates the problem. Experimentation with the code
shows that a factor of 0.11 times a power of two (e.g. 0.44, 1.76) also
returns the original solution.

While clustering is often vulnerable to small changes in the data, it
shouldn't be affected by a simple scale change. Presumably something
subtle is happening with the internal representations of the distances.
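
One plausible mechanism, offered as a sketch rather than a diagnosis: multiplying a double by a power of two is exact, so every rounded intermediate value scales exactly and all comparisons between candidate merges come out the same. Other factors do not commute with rounding, so near-ties can be broken differently. The snippet below assumes, purely for illustration, that the linkage update multiplies distances by fractions such as 1/3; how clustermat actually computes its updates is not known from this post.

```stata
* Illustrative sketch only; not part of the original post.
* Assumption: intermediate distances get divided by a cluster size such as 3.
foreach f of numlist 2 0.125 3 0.44 {
    local changed 0
    forvalues d = 1/200 {
        * Does "scale, then divide" equal "divide, then scale" in doubles?
        if (`f' * `d') / 3 != `f' * (`d' / 3) {
            local changed = `changed' + 1
        }
    }
    display "factor `f': " `changed' " of 200 values rounded differently"
}
```

For the power-of-two factors the count is 0 by construction, because such scaling only shifts the exponent. Any nonzero count for the other factors marks values where the scaled and unscaled calculations no longer agree to the last bit, which is enough to reorder tied or nearly tied merges.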


Code to download the distance matrix and compare solutions:


mkmat d1-d42, mat(D)

clustermat wards D, name(D) add
cluster generate a4=groups(4)

capture program drop cltest
program define cltest
    args mult
    tempname M
    tempvar n4 diff
    * Rescale the distance matrix and recluster
    matrix `M' = D * `mult'
    clustermat wards `M', name(`M') add
    cluster generate `n4' = groups(4)
    * Compare the new four-group solution with the original (a4)
    tab `n4' a4
    gen `diff' = `n4' - a4
    su `diff'
    di _newline
    if r(mean)!=0 {
        di "Cluster solutions differ, factor " `mult'
    }
    else {
        di "Cluster solutions identical, factor " `mult'
    }
    cluster drop `M'
end

cltest 2
cltest 3
cltest 1/40
cltest 0.125
cltest 0.44

Brendan Halpin,   Department of Sociology,   University of Limerick,   Ireland
Tel: w +353-61-213147  f +353-61-202569  h +353-61-338562;  Room F1-009 x 3147
mailto:[email protected]    ULSociology on Facebook:         twitter:@ULSociology
*   For searches and help try:
