Hello All,
The start() option of the kmeans command allows specifying how the
initial values for the centroids are generated. I am using the random
option. As I understand it, each time the command is called the initial
values are generated only once.
What is the simplest way to repeat the command, say, 1000 times with
different random initial values and store the resulting clusterings?
More importantly, how does one decide which clustering should be
preferred? Should I choose the one that occurs most frequently? Also,
how can I detect repetitions that are identical up to the labels
(numbers) of the clusters?
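
To make the last point concrete, here is a minimal sketch of what I mean
by "identical up to the labels". It is written in Python purely for
illustration, not as Stata code, and the names canonical_labels,
count_partitions and runs are my own hypothetical choices. The idea is
to renumber the clusters of each run in order of first appearance, so
that two runs describing the same partition get the same label vector
and can be tallied:

```python
from collections import Counter


def canonical_labels(labels):
    """Renumber clusters 1, 2, 3, ... in order of first appearance."""
    mapping = {}
    relabeled = []
    for lab in labels:
        if lab not in mapping:
            mapping[lab] = len(mapping) + 1
        relabeled.append(mapping[lab])
    return tuple(relabeled)


def count_partitions(runs):
    """Tally how often each distinct partition occurs across runs."""
    return Counter(canonical_labels(r) for r in runs)


# Hypothetical cluster assignments for 5 observations from 3 runs.
runs = [
    [1, 1, 2, 2, 3],
    [3, 3, 1, 1, 2],  # same partition as the first run, labels permuted
    [1, 2, 2, 2, 3],  # a different partition
]
counts = count_partitions(runs)
most_common_partition, frequency = counts.most_common(1)[0]
```

With these made-up runs, the first two collapse into one partition that
occurs twice. Whether the most frequent partition is actually the one to
prefer is exactly what I am unsure about.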
Thank you for your help,
Serguei