Statalist The Stata Listserver



Re: st: "testing" a cluster analysis


From   Steven Samuels <ssamuels@albany.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: "testing" a cluster analysis
Date   Wed, 7 Feb 2007 11:31:05 -0500

I agree with Nick's point about the distinct outcomes. I would list all combinations of 2, 3, 4, 5, 6, and 7 variables (starting with all 7). This is easily done after -contract- and -fillin-. Changing the order of the variables in the list might be revealing. This approach will also show you associations: which responses almost always or almost never appear together. Validating any "clusters" that you detect would require setting aside a validation sample.
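
To make the idea concrete, here is a minimal sketch of the logic in Python (an illustration only, not the Stata commands themselves). The data, the name contract_fillin, and the choice of items 0 and 3 are all hypothetical. For a chosen subset of variables it counts every observed pattern and then adds the unobserved patterns with a count of zero, so cells that never occur are visible.

```python
from collections import Counter
from itertools import combinations, product

# Hypothetical data: one tuple of 7 yes/no (1/0) answers per respondent.
rows = [
    (1, 0, 0, 1, 0, 0, 1),
    (1, 0, 0, 1, 0, 0, 1),
    (0, 0, 0, 0, 0, 0, 0),
    (1, 1, 0, 1, 0, 0, 1),
    (1, 0, 0, 1, 0, 0, 1),
    (0, 0, 0, 0, 0, 0, 0),
]
N_ITEMS = 7


def contract_fillin(rows, cols):
    """Count each 0/1 pattern on the chosen columns, zero cells included."""
    counts = Counter(tuple(r[c] for c in cols) for r in rows)
    # Enumerate all 2^k possible patterns so unobserved ones show up as 0.
    return {p: counts[p] for p in product((0, 1), repeat=len(cols))}


# Every subset of 2 to 7 variables, largest first.
subsets = [cols for k in range(N_ITEMS, 1, -1)
           for cols in combinations(range(N_ITEMS), k)]
n_subsets = len(subsets)

# One pair as an example: the cross-tabulation of items 0 and 3.
pair = contract_fillin(rows, (0, 3))
for pattern, freq in pair.items():
    print(pattern, freq)
```

In this made-up sample the two off-diagonal cells for items 0 and 3 are empty, which is the kind of "almost never appear together" pattern to look for. Looping over subsets and calling contract_fillin on each would cover all the combinations.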

Latent class analysis is another approach that might be considered.

Steve

On Feb 7, 2007, at 9:59 AM, Nick Cox wrote:

I think Ronán and Ken made excellent points.

I am also queasy about this for quite a different
reason. As I understand it, you have a discrete
outcome space with 2^7 = 128 distinct outcomes.
I am not clear that this lends itself to cluster
analysis, nor would calculating means be what springs
to my mind as natural.

In principle, you lose no information by tabulating the
frequencies of those 128 composite outcomes and sorting
the table.
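
A minimal sketch of that tabulation, written in Python purely as an illustration (the data and the name tabulate_patterns are hypothetical): each respondent's 7 binary answers form one composite outcome, and the table lists the outcomes by descending frequency.

```python
from collections import Counter

# Hypothetical data: one tuple of 7 yes/no (1/0) answers per respondent.
rows = [
    (1, 0, 0, 1, 0, 0, 1),
    (1, 0, 0, 1, 0, 0, 1),
    (0, 0, 0, 0, 0, 0, 0),
    (1, 1, 0, 1, 0, 0, 1),
    (1, 0, 0, 1, 0, 0, 1),
    (0, 0, 0, 0, 0, 0, 0),
]


def tabulate_patterns(rows):
    """Return (pattern, frequency) pairs, most frequent first."""
    counts = Counter(rows)
    # Sort by descending frequency, then by pattern to break ties.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


table = tabulate_patterns(rows)
for pattern, freq in table:
    # Print each composite outcome as a 7-character string of 0s and 1s.
    print("".join(map(str, pattern)), freq)

# The frequencies sum to the sample size, so nothing is lost.
total = sum(freq for _, freq in table)
```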





