Re: st: Clustering with Sequence analysis/optimal matching


From   Ulrich Kohler <kohler@wzb.eu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Clustering with Sequence analysis/optimal matching
Date   Thu, 22 Mar 2012 09:22:22 +0100

I am the author of the sq package that is involved here. However, note
that -clustermat- and -cluster tree- are official Stata and, frankly, I
do not know much about cluster analysis. So I can only add that when
verifying -sqclusterdat- I quite regularly encounter the "currently
can't handle dendrogram reversals" error from -cluster tree- (and I
don't remember whether it arises only with methods other than Ward's). I
usually work around the problem by changing the cutnumber to something
very close.

This is the first time I have heard that dendrogram reversals should
not happen with Ward's method. If the problem arises from some feature
of the distance matrix created by -sqom-, I would be happy to hear
what I can do about it.

Many regards

Uli




On Wednesday, 21.03.2012, at 21:39 +0000, Brendan Halpin wrote:
> On Wed, Mar 21 2012, Stefan Weih wrote:
> 
> > Thanks for your comments, Brendan.
> >
> > However, as indicated in the earlier correspondence, I did use Ward's
> > method, and there was no typo involved. For the complete syntax of my
> > clustering procedure, please see below:
> >
> > sqclusterdat
> > clustermat wardslinkage SQdist, name(wards) add
> > cluster tree wards, cutnumber(20)
> > sqclusterdat, return
> 
> 
> OK, that's fairly unambiguous. I'm puzzled. As far as I understand,
> reversals (where after combining clusters i and j, the distance from
> cluster k to the joint cluster is less than d(i,k) and/or d(j,k)) don't
> happen with Ward's method. 
> 
> Could your distance matrix be defective? For instance, could it be
> non-metric? Normally OM is guaranteed to generate metric distances, but
> if the substitution matrix is not metric, the distances are not
> guaranteed to be metric. That's a long shot, though -- I have no idea
> whether non-metric distances will give Ward's method indigestion. 
> 
> 
> Another thing that might be helpful would be to post the output from the
> following:
> 
> . sqclusterdat
> . clustermat wardslinkage SQdist, name(wards) add
> . cluster query wards
> . return list
> . cluster tree wards, cutnumber(20)
> 
> 
> Regards,
> 
> Brendan
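
The metric check suggested above (is the substitution matrix, or the
resulting distance matrix, metric?) could be sketched roughly as follows.
This is a minimal illustration in Python rather than Stata, and it assumes
the matrix has been exported as a square list of lists. The function name
triangle_violations, the tolerance, and the example cost matrix are all
hypothetical and are not part of the sq package or of the original posts.

```python
from itertools import permutations


def triangle_violations(d, tol=1e-9):
    """List every (i, j, k, excess) with d[i][j] > d[i][k] + d[k][j] + tol.

    An empty result means no triangle-inequality violation was found;
    symmetry and a zero diagonal are assumed, not checked here.
    """
    n = len(d)
    bad = []
    for i, j, k in permutations(range(n), 3):
        # How much the direct cost exceeds the detour through k.
        excess = d[i][j] - (d[i][k] + d[k][j])
        if excess > tol:
            bad.append((i, j, k, excess))
    return bad


# Hypothetical 3-state substitution cost matrix (not from the thread):
# going from state 0 to state 2 directly costs 5, but via state 1 only 2.
subcost = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 1.0],
    [5.0, 1.0, 0.0],
]

# A matrix with equal off-diagonal costs satisfies the inequality.
flat = [
    [0.0, 2.0, 2.0],
    [2.0, 0.0, 2.0],
    [2.0, 2.0, 0.0],
]

violations = triangle_violations(subcost)
no_violations = triangle_violations(flat)
```

Any triple reported for the substitution matrix would mean the OM
distances are no longer guaranteed to be metric. Whether that is what
triggers the reversal message in -cluster tree- remains an open question
in this thread.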

