Statalist The Stata Listserver


Re: st: Cluster analysis - cluster kmeans-


From   khigbee@stata.com
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Cluster analysis - cluster kmeans-
Date   Fri, 11 May 2007 16:36:16 -0500

Herve STOLOWY <stolowy@hec.fr> asks:

> I have a group of 21 observations with one variable (a score) and
> would like to create three "homogeneous" groups.
> 
> I found the -cluster kmeans- command. Here are my command lines:
> 
> gsort - finance_aggregate
> cluster kmeans finance_aggregate, k(3)
> 
> Each time I run these commands, I get a different result (i.e., a
> different clustering: the three groups are different). I looked
> at the help file but don't understand. (It might be related to
> the start option but I am not sure).
> 
> Is there a way to obtain the same result every time?

You can -set seed 183289- (or any other number you like) before
each call of -cluster kmeans- so that the same set of random
starting values is selected each time.  Or, as you were
guessing, you can use the -start()- option to do the same thing
(it has several suboptions controlling the k starting groups); see
-help cluster kmeans- for details.
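
As a minimal sketch of these approaches, using the variable name from
the question (the cluster names km_a, km_b, and km_c are made up for
illustration, and the seed value is arbitrary):

    . * fix the seed, then use the default random starting centers
    . set seed 183289
    . cluster kmeans finance_aggregate, k(3) name(km_a)

    . * or pass the seed through the start() option itself
    . cluster kmeans finance_aggregate, k(3) name(km_b) start(krandom(183289))

    . * or avoid random numbers: first k observations as starting centers
    . cluster kmeans finance_aggregate, k(3) name(km_c) start(firstk)

With -start(firstk)- the result depends on the order of the data, so
it is reproducible only if the data are sorted the same way each time.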

SR Millis <srmillis@yahoo.com> said:

> You're going to need more than 1 variable. Cluster
> analysis is a multivariable technique.  In addition, a
> sample size of only 21 is often too small for cluster
> analysis.

While cluster analysis is a multivariate technique, it also works
with a single variable.  Having only 21 observations might or might
not be a problem; it depends on the data.  After you do your cluster
analysis, you might want to look at some summaries or graphs of the
resulting three groups.

    . set seed 12345
    . cluster kmeans myvar, k(3) name(myclus)
    . bysort myclus: summarize myvar
    . twoway dot myvar myclus

and possibly also

    . cluster stop

(or similarly -anova myvar myclus-) to get a feel for how
distinct the groups are.
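
A single -cluster stop- value is easiest to read next to the values
for other numbers of groups.  The following is only a sketch for a
do-file, not a recommendation to abandon three groups: the range 2/4,
the cluster names km2, km3, and km4, and the seed are all assumptions
chosen for illustration.

    * compare the Calinski-Harabasz pseudo-F (the -cluster stop- default)
    * across k = 2, 3, 4; larger values suggest more distinct groups
    forvalues k = 2/4 {
        cluster kmeans myvar, k(`k') name(km`k') start(krandom(12345))
        cluster stop km`k'
    }

Each pass creates a grouping variable named after the cluster
analysis (km2, km3, km4), so those names must not already be in use.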

Ken Higbee    khigbee@stata.com
StataCorp     1-800-STATAPC
