
st: creating Hierarchical cluster analysis with a different measure of distance

From	Allan Garland <[email protected]>
To	[email protected]
Subject	st: creating Hierarchical cluster analysis with a different measure of distance
Date	Sun, 17 Jul 2005 16:53:31 -0400

Is there a relatively easy way in Stata 9 to run a hierarchical cluster analysis on the variables (not the observations), using a different measure of distance between the variables?

The "clv" program appears to assess the distances between variables with an approach similar to that of principal components. Using the built-in cluster commands on the variables instead requires transposing the rows and columns of the data. I looked at the clv routine, but it wasn't simple enough (for me) to see how to alter Jean-Benoit Hardouin's code to use a different distance (I'm an intermediate at program writing).
In any case, what I want to implement in Stata is what Frank Harrell describes in his textbook (FE Harrell Jr. (2001). Regression Modeling Strategies. New York, Springer), where he promotes the value of doing HCA (for data reduction) using Hoeffding's D as the measure of distance/similarity between variables (W Hoeffding. A Non-Parametric Test of Independence. Annals of Mathematical Statistics 19(4):546-557, 1948). I've written code that calculates the matrix of D's between all pairs of variables, and am HOPING that someone can point me in the direction of Stata programming code that will let me do something simple, i.e. just plug the matrix of D's in and thus obtain a HCA.

Any ideas would be most welcome.

Allan Garland
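
To make the request concrete, here is a minimal sketch (in Python, not Stata, and not the poster's code) of the two steps described above: computing Hoeffding's D for every pair of variables, then feeding the resulting matrix to a hierarchical clustering. Several things here are assumptions for illustration only: D is scaled by 30 so that a perfectly monotone pair gives 1, ties are handled with midranks, D is turned into a distance as 1 - D, and average linkage is used. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def _midranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    return [
        1 + sum(v < vi for v in values) + 0.5 * (sum(v == vi for v in values) - 1)
        for vi in values
    ]


def hoeffding_d(x, y):
    """Hoeffding's D for two equal-length sequences, scaled by 30 (assumed)."""
    n = len(x)
    if n != len(y) or n < 5:
        raise ValueError("need two sequences of equal length >= 5")
    r, s = _midranks(x), _midranks(y)

    def phi(a, b):
        # 1 if a < b, 1/2 if tied, 0 otherwise
        return 1.0 if a < b else (0.5 if a == b else 0.0)

    # Bivariate rank: 1 + (weighted) count of points below point i in both x and y
    q = [
        1 + sum(phi(x[j], x[i]) * phi(y[j], y[i]) for j in range(n) if j != i)
        for i in range(n)
    ]
    d1 = sum((qi - 1) * (qi - 2) for qi in q)
    d2 = sum((ri - 1) * (ri - 2) * (si - 1) * (si - 2) for ri, si in zip(r, s))
    d3 = sum((ri - 2) * (si - 2) * (qi - 1) for ri, si, qi in zip(r, s, q))
    num = (n - 2) * (n - 3) * d1 + d2 - 2 * (n - 2) * d3
    return 30 * num / (n * (n - 1) * (n - 2) * (n - 3) * (n - 4))


def dissimilarity_matrix(variables):
    """Pairwise 1 - D between variables (the 1 - D conversion is an assumption)."""
    names = list(variables)
    dist = [[0.0] * len(names) for _ in names]
    for i, j in combinations(range(len(names)), 2):
        d = 1.0 - hoeffding_d(variables[names[i]], variables[names[j]])
        dist[i][j] = dist[j][i] = d
    return names, dist


def average_linkage(dist, labels):
    """Naive agglomerative clustering; returns (cluster_a, cluster_b, height) merges."""
    index = {lab: i for i, lab in enumerate(labels)}
    clusters = {i: (lab,) for i, lab in enumerate(labels)}

    def avg(a, b):
        # Mean pairwise distance between the members of clusters a and b
        pairs = [(p, m) for p in clusters[a] for m in clusters[b]]
        return sum(dist[index[p]][index[m]] for p, m in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=lambda ab: avg(*ab))
        merges.append((clusters[a], clusters[b], avg(a, b)))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


# Toy data (hypothetical): b is a monotone function of a, c is not
variables = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [2, 4, 6, 8, 10, 12],
    "c": [3, 1, 6, 2, 5, 4],
}
names, dist = dissimilarity_matrix(variables)
merges = average_linkage(dist, names)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

The point of the sketch is only that the clustering step needs nothing but the square dissimilarity matrix, so any routine that accepts a precomputed matrix can stand in for `average_linkage`.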

