Re: Re: st: -cluster measures- with Gower measure

Subject: Re: Re: st: -cluster measures- with Gower measure
Date: Mon, 20 Sep 2010 16:23:19 -0500

I previously wrote:

>> Generally I prefer the -matrix dissimilarity- command over the
>> -cluster measures- command for obtaining dissimilarities.  If the
>> dissimilarities Phil is after will fit in a Stata matrix, then
>> -matrix dissimilarity- is the solution to use.

and Phil Schumm replied:

> I appreciate the advice.  In this case, the dataset is rather large  
> (i.e., with numbers of observations in the 50-75k range), and all I  
> need is to repeatedly sort the observations WRT their distance from  
> specific records taken from another dataset.  The -cluster measures-  
> command seemed well suited to this (I'm not too concerned about speed  
> here), but I'd of course welcome other suggestions.

You are right.  Comparing that many observations against a few
other observations is best handled by -cluster measures-.
That is the situation the command was designed for.

If you need a solution to your problem before Stata's fix that
will allow the -Gower- option for -cluster measures-, you might
loop over smaller chunks of your 50-75k observations.  For each
chunk, use the -matrix dissim- command (with -if- or -in- to
restrict it to that subset of observations), then pull the results
from the resulting matrix into the corresponding portion of the
new variable(s) you wish to create.
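
A minimal sketch of such a loop follows, for illustration only.  It
assumes hypothetical variables x1, x2, and x3, that a single reference
record has been appended as the last observation, and that the rows of
the returned matrix follow the order of the selected observations.  The
chunk size of 300 and the variable name dist are also assumptions.
Gower's measure scales each continuous variable by its range, so it
is worth checking on a small subset that the chunked results agree
with an unchunked run.

```stata
* Sketch only: x1 x2 x3, dist, and the chunk size are hypothetical.
* Assumes the reference record was appended as observation _N.
local ref   = _N              // position of the reference record
local N     = _N - 1          // observations to be compared against it
local chunk = 300             // keep chunk + 1 within the matrix size limit
generate double dist = .

forvalues lo = 1(`chunk')`N' {
    local hi = min(`lo' + `chunk' - 1, `N')

    * Dissimilarities among this chunk plus the reference record
    matrix dissimilarity D = x1 x2 x3 ///
        if inrange(_n, `lo', `hi') | _n == `ref', gower

    * The reference record is assumed to be the last row of D
    local r = rowsof(D)
    forvalues i = 1/`=`r' - 1' {
        quietly replace dist = D[`r', `i'] in `=`lo' + `i' - 1'
    }
}

sort dist
```

With several reference records, the same loop could be repeated once
per record, filling a separate variable each time.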

Ken Higbee
StataCorp     1-800-STATAPC
