Statalist



Re: st: Cluster analysis variables - bug?


From   Steven Joel Hirsch Samuels <[email protected]>
To   [email protected]
Subject   Re: st: Cluster analysis variables - bug?
Date   Wed, 8 Aug 2007 14:16:20 -0400

--

The name() option names the group variable. In Stata 9.2, only one group variable is created at a time. Did you run a previous cluster analysis on the same data, possibly on different variables, with three groups and "name(cluster)"? If so, that would account for your finding.

--Steve

Example:

. sysuse auto
(1978 Automobile Data)

. cluster kmeans mpg weight displacement, k(3) name(cluster01)

. cluster list cluster01
cluster01  (type: partition,  method: kmeans,  dissimilarity: L2)
      vars: cluster01 (group variable)
     other: k: 3
            start: krandom
            range: 0 .
            cmd: cluster kmeans mpg weight displacement, k(3) name(cluster01)
            varlist: mpg weight displacement


. tab cluster01

  cluster01 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         15       20.27       20.27
          2 |         39       52.70       72.97
          3 |         20       27.03      100.00
------------+-----------------------------------
      Total |         74      100.00

. ds
make          mpg           headroom      weight        turn          gear_ratio    cluster01
price         rep78         trunk         length        displacement  foreign
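
A minimal sketch of how a second group variable could arise, assuming the extra -cluster- variable is left over from an earlier run with name(cluster). It uses the auto data rather than the q4_* variables, so the variable lists here are illustrative only. -cluster dir- lists every cluster analysis still defined in the data, and -cluster drop- removes one by name.

```stata
* Sketch only: reproduces the suspected sequence on the auto data,
* not on the original q4_* variables.
sysuse auto, clear

* An earlier analysis, saved under the name "cluster"
cluster kmeans mpg weight, k(3) name(cluster)

* A later analysis, saved under the name "c1", with a fixed random seed
cluster kmeans mpg weight displacement, k(3) name(c1) start(krandom(67492))

* Both analyses are still defined, each with its own group variable
cluster dir
cluster list c1

* Remove the leftover analysis; -ds- afterwards shows which variables remain
cluster drop cluster
ds
```

If -cluster dir- shows only one analysis, the extra variable came from somewhere else, and -describe- on it may show where.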





On Aug 8, 2007, at 12:56 PM, Ricardo Ovaldia wrote:


Thank you Nick. But I am not sure that helps. Both
variables are discrete, taking the values 1, 2 and 3
(I specified 3 clusters). Every value of -c1- appears
for every value of -cluster-. I am at a loss.


           |                c1
   cluster |         1          2          3 |     Total
-----------+---------------------------------+----------
         1 |        71         90         17 |       178
         2 |         4        200        596 |       800
         3 |       123        164          5 |       292
-----------+---------------------------------+----------
     Total |       198        454        618 |     1,270


--- Nick Cox <[email protected]> wrote:

A scatter plot of the two variables may
throw light on this.

Nick
[email protected]

Ricardo Ovaldia

This may be a very simple question, but I can't find
the answer. Doing a cluster analysis I typed:

. cluster kmeans q4_*, k(3) name(c1) s(kr(67492))

I get two new variables -c1- and -cluster-. These are
different. Could someone please tell me what is the
difference between these two variables? Which one is
the cluster indicator?
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


Ricardo Ovaldia, MS
Statistician
Oklahoma City, OK




Steven Joel Hirsch Samuels

[email protected]
18 Cantine's Island
Saugerties, NY 12477
Phone: 845-246-0774
EFax: 208-498-7441



