
From: khigbee@stata.com
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Initial grouping variable in Kmeans/Kmedians
Date: Mon, 27 Jun 2005 09:52:26 -0500

Rob Hall <rob@environmetrics.com.au> asks concerning -cluster kmeans-
or -cluster kmedians-:

> In SPSS, I can use a matrix of cluster centres as the starting point
> for analysis. For example, I eight variables that will be used in the
> analysis and I have "target" means for each cluster on each variable.
> How can I get Stata to attend to the eight by 'n' matrix rather than
> just a single grouping variable?

Look at the -start()- option, and in particular -start(lastk, exclude)-.
The approach would be to append the target means to the data.
-exclude- indicates that these appended data points are not to be
clustered, but instead are only to act as starting center points for
the algorithm.

Let's say that the Stata matrix holding the starting points is X and
that you have 8 variables (named a1, a2, a3, ..., a8) and are
clustering to 10 groups and that your dataset has 1000 observations.

Here is one approach: set the number of observations to 10 (the # of
groups you desire) more than the current number of observations.

    set obs 1010

The newly created observations hold missing values until you fill them
with something else. We want to place the values that are in the X
matrix into those last observations.

    forvalues i = 1/8 {
        forvalues j = 1/10 {
            local k = 1000 + `j'
            replace a`i' = X[`j',`i'] in `k'
        }
    }

If you are new to Stata be careful to notice that I am using a left
single quote and a right single quote (different characters) in the
quoting around the i, j, and k in the code above.

Now I can call -cluster kmeans-

    cluster kmeans ... , k(10) ... start(lastk, exclude)

After the cluster analysis you might wish to remove the bottom
observations you added

    drop in 1001/1010

There are other approaches besides using the -forvalues- loops for
getting the starting center information from a matrix to the bottom of
your dataset. For instance you might use something like

    preserve
    drop _all
    svmat ...
    save ...
    restore
    append using ...
Ken Higbee    khigbee@stata.com
StataCorp     1-800-STATAPC

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
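
The -svmat- route mentioned at the end of the message leaves its details elided. As an illustration only, here is one way it might be filled in, following the setup described in the message: a 10 x 8 matrix X with one row per cluster and one column per variable, and variables a1 through a8 already in memory. The temporary file name `centers` and the cluster name `km10` are hypothetical, chosen for this sketch, and are not from the original post.

```stata
* Sketch only. Assumes matrix X is 10 x 8 and variables a1-a8 exist in memory.
preserve                      // set the working dataset aside
drop _all                     // start from an empty dataset
svmat double X, names(a)      // columns of X become variables a1 ... a8
tempfile centers              // hypothetical name for a temporary file
save `centers'
restore                       // bring the working dataset back
append using `centers'        // the 10 starting centers are now the last 10 obs

* start(lastk, exclude) takes the last k observations as starting centers
* and leaves them out of the clustering itself.
cluster kmeans a1-a8, k(10) start(lastk, exclude) name(km10)

* Remove the appended centers once the analysis is done.
drop in -10/l
```

Because the centers are appended rather than written into fixed row numbers, this version does not depend on the dataset having exactly 1000 observations. `drop in -10/l` removes the last ten rows whatever the dataset size.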
