Statalist The Stata Listserver


Re: st: "testing" a cluster analysis

From   "Adam Seth Litwin"
Subject   Re: st: "testing" a cluster analysis
Date   Wed, 07 Feb 2007 16:58:13 -0500

This was all great advice. Thank you very much.

You are all correct in that I wanted a way I could eyeball the data for theory-driven "clusters." So, I built the 128-cell table using the -contract- command. (Use of the zero option meant I did not need the -fillin- command.) This is exactly what I needed.
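For readers following along, a minimal sketch of that step might look like the following. The variable names x1-x7, the simulated data, and the frequency variable n are assumptions made up for illustration; they are not from the original data.

```stata
* Hypothetical setup: seven binary indicators x1-x7 (names and data invented)
clear
set seed 12345
set obs 500
forvalues j = 1/7 {
    generate byte x`j' = runiform() < 0.3
}

* Collapse to one row per response pattern; -zero- also keeps
* patterns that never occur, so the table has all 2^7 = 128 cells
contract x1-x7, zero freq(n)

* Put the most frequent patterns first and show the top of the table
gsort -n
list in 1/10, noobs
```

Sorting by descending frequency puts the dominant patterns at the top and the empty cells at the bottom, which is usually enough to eyeball candidate groupings.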

From: Steven Samuels
Subject: Re: st: "testing" a cluster analysis
Date: Wed, 7 Feb 2007 11:31:05 -0500

I agree with Nick's point about the distinct outcomes. I would list all combinations of 2, 3, 4, 5, 6, and 7 variables (start with 7). This is easily done after -contract- and -fillin-. Changing the order of the variables in the list might be revealing. This approach will also show you associations: which responses almost always or almost never appear together. Validation of any "clusters" that you detect would require that you set aside a validation sample.
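One possible way to sketch this in Stata is shown below. It is only an illustration: the variables x1-x7 and the simulated data are hypothetical, and the loop covers just the nested subsets x1-x7, x1-x6, down to x1-x2, not every possible subset. Reordering the variable list would give other subsets and sort orders.

```stata
* Hypothetical setup: seven binary indicators x1-x7 (names and data invented)
clear
set seed 12345
set obs 500
forvalues j = 1/7 {
    generate byte x`j' = runiform() < 0.3
}

* Tabulate patterns on the first k variables, starting with all 7
forvalues k = 7(-1)2 {
    preserve
    * One row per observed pattern, with its frequency
    contract x1-x`k', freq(n)
    * Add the patterns that never occur and give them frequency 0
    fillin x1-x`k'
    replace n = 0 if _fillin
    * Most common patterns first
    gsort -n
    display as text _n "Most common patterns on x1-x`k'"
    list x1-x`k' n in 1/4, noobs
    restore
}
```

The -preserve- and -restore- pair returns the original observations after each pass, so the same data can be contracted repeatedly.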

Latent class analysis is another approach that might be considered.


On Feb 7, 2007, at 9:59 AM, Nick Cox wrote:

I think Ronán and Ken made excellent points.

I am also queasy about this for quite a different
reason. As I understand it, you have a discrete
outcome space with 2^7 = 128 distinct outcomes.
I am not clear that this lends itself to cluster
analysis, nor would calculating means be what springs
to my mind as natural.

In principle, you lose no information by tabulating the
frequencies of those 128 composite outcomes and sorting
the table.
