I think Ronán and Ken made excellent points.
I am also queasy about this for quite a different
reason. As I understand it, you have a discrete
outcome space with 2^7 = 128 distinct outcomes.
I am not clear that this lends itself to cluster
analysis, nor would calculating means be what springs
to my mind as natural.
In principle, you lose no information by tabulating the
frequencies of those 128 composite outcomes and sorting
the table.
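
A minimal sketch of that tabulation, in Python purely for illustration. The six observations and the function name tabulate_patterns are made up and are not taken from the data under discussion. Each observation is treated as a tuple of seven 0/1 indicators, identical tuples are counted, and the table is sorted with the most frequent pattern first.

```python
from collections import Counter


def tabulate_patterns(rows):
    """Count each distinct tuple of binary indicators, most frequent first."""
    counts = Counter(tuple(r) for r in rows)
    # Sort by descending frequency; ties are broken by the pattern itself.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical data: six observations on seven binary indicators.
rows = [
    (1, 0, 0, 1, 0, 1, 1),
    (1, 0, 0, 1, 0, 1, 1),
    (0, 0, 0, 0, 0, 0, 0),
    (1, 0, 0, 1, 0, 1, 1),
    (0, 0, 0, 0, 0, 0, 0),
    (1, 1, 1, 1, 1, 1, 1),
]

table = tabulate_patterns(rows)
for pattern, n in table:
    # Print each pattern as a 7-character string followed by its frequency.
    print("".join(str(bit) for bit in pattern), n)
```

With real data, at most 128 rows can appear in such a table, and the patterns that dominate it (if any do) are visible at a glance without calculating any means.
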
However, I may be mis-reading your problem. For example,
there is a hint in this second posting -- but not in
the first -- that the real interest is in how yet
other variables relate to these magnificent seven.
(If not, what has OLS to do with seven binary indicators?)
Nick
n.j.cox@durham.ac.uk
Adam Seth Litwin wrote:
> Your point is well-taken, and conventional hypothesis test might not be
> the best tool. I have already analyzed the data more formally with OLS,
> but one of my advisors suggested I see how the observations cluster with
> respect to these seven binary indicators. So, I started playing around
> with different techniques for clustering observations. Now, I am trying
> to decide--"scientistically," I realize--just how
> well-defined/tight/distinct the clusters would be from one another if I
> clustered the data into 5 clusters. (Then, I might do the same thing with
> fewer or more clusters.) I am not truly testing a hypothesis; I am
> looking for some basis on which to decide just how many clusters there
> may be in the dataset...
>
> Does that make the original question any more valid, and if so, is there
> a way to do what I'm thinking...either by examining means, as I
> suggested, or some better way?