help cluster dendrogram
-------------------------------------------------------------------------------
Title
[MV] cluster dendrogram -- Dendrograms for hierarchical cluster analysis
Syntax
cluster dendrogram [clname] [if] [in] [, options ]
options                 description
-------------------------------------------------------------------------
Main
  quick                 do not center parent branches
  labels(varname)       name of variable containing leaf labels
  cutnumber(#)          display top # branches only
  cutvalue(#)           display branches above # (dis)similarity measure
                          only
  showcount             display number of observations for each branch
  countprefix(string)   prefix the branch count with string; default is
                          ``n=''
  countsuffix(string)   suffix the branch count with string; default is
                          empty string
  countinline           put branch count inline with branch label
  vertical              orient dendrogram vertically (default)
  horizontal            orient dendrogram horizontally
Plot
  line_options          affect rendition of the plotted lines
Add plots
  addplot(plot)         add other plots to the dendrogram
Y axis, X axis, Titles, Legend, Overall
  twoway_options        any option other than by() documented in
                          [G] twoway_options
-------------------------------------------------------------------------
Note: cluster tree is a synonym for cluster dendrogram.
In addition to the restrictions imposed by if and in, the observations
are automatically restricted to those that were used in the cluster
analysis.
Menu
Statistics > Multivariate analysis > Cluster analysis > Postclustering >
Dendrograms
Description
cluster dendrogram produces dendrograms (also called cluster trees) for a
hierarchical clustering. See [MV] cluster for a list of the available
cluster commands.
Dendrograms graphically present the information concerning which
observations are grouped together at various levels of (dis)similarity.
At the bottom of the dendrogram, each observation is considered its own
cluster. Vertical lines extend up for each observation, and at various
(dis)similarity values, these lines are connected to the lines from other
observations with a horizontal line. The observations continue to
combine until, at the top of the dendrogram, all observations are grouped
together.
The height of the vertical lines and the range of the (dis)similarity
axis give visual clues about the strength of the clustering. Long
vertical lines indicate more distinct separation between the groups.
Long vertical lines at the top of the dendrogram indicate that the groups
represented by those lines are well separated from one another. Shorter
lines indicate groups that are not as distinct.
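The merge heights can also be inspected numerically. As a minimal
sketch, assuming a hierarchical cluster analysis named L2clnk (as created
in the Examples) and assuming the linkage command stored the heights in a
variable named L2clnk_hgt:
. cluster list L2clnk
. summarize L2clnk_hgt
A few large heights near the maximum would be consistent with long lines
at the top of the dendrogram.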
Options
+------+
----+ Main +-------------------------------------------------------------
quick switches to a different style of dendrogram in which the vertical
lines only go straight up from the observations instead of the
default action of being recentered after each merge of observations
in the dendrogram hierarchy. Some people prefer this representation,
and it is quicker to render.
labels(varname) indicates that varname is to be used in place of
observation numbers for labeling the observations at the bottom of
the dendrogram.
cutnumber(#) displays only the top # branches of the dendrogram. With
large dendrograms, the lower levels of the tree can become too
crowded. With cutnumber(), you can limit your view to the upper
portion of the dendrogram. Also see the cutvalue() and showcount
options.
cutvalue(#) displays only those branches of the dendrogram that are above
the # (dis)similarity measure. With large dendrograms, the lower
levels of the tree can become too crowded. With cutvalue(), you can
limit your view to the upper portion of the dendrogram. Also see the
cutnumber() and showcount options.
showcount requests that the number of observations associated with each
branch be displayed below the branches. showcount is most useful
with cutnumber() and cutvalue() because, otherwise, the number of
observations for each branch is one. When this option is specified,
a label for each branch is constructed by using a prefix string, the
branch count, and a suffix string.
countprefix(string) specifies the prefix string for the branch count
label. The default is countprefix(n=). This option implies the
showcount option.
countsuffix(string) specifies the suffix string for the branch count
label. The default is an empty string. This option implies the
showcount option.
countinline requests that the branch count be put inline with the
corresponding branch label. The branch count is placed below the
branch label by default. This option implies the showcount option.
vertical and horizontal specify whether the x and y coordinates are to be
swapped before plotting -- vertical (the default) does not swap the
coordinates, whereas horizontal does.
+------+
----+ Plot +-------------------------------------------------------------
line_options affect the rendition of the lines; see [G] line_options.
+-----------+
----+ Add plots +--------------------------------------------------------
addplot(plot) allows adding more graph twoway plots to the graph; see [G]
addplot_option.
+-----------------------------------------+
----+ Y axis, X axis, Titles, Legend, Overall +--------------------------
twoway_options are any of the options documented in [G] twoway_options,
excluding by(). These include options for titling the graph (see [G]
title_options) and for saving the graph to disk (see [G]
saving_option).
Examples
Setup
. webuse labtech
. cluster completelinkage x1 x2 x3 x4, name(L2clnk)
. cluster generate g3 = group(3)
Draw dendrograms
. cluster dendrogram L2clnk, horizontal labels(labt)
. cluster dendrogram L2clnk, labels(labt) quick
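One possible way to combine line_options and twoway_options; the color,
title text, and label settings are illustrative choices only
. cluster dendrogram L2clnk, labels(labt) lcolor(navy) title("Complete linkage") xlabel(, angle(90) labsize(*.75))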
Tree is a synonym for dendrogram; show only the observations in group 3
. cluster tree if g3==3
Show only top 5 branches
. cluster tree, cutnumber(5) showcount
Show only branches with dissimilarity greater than 75.3
. cluster dendrogram, cutvalue(75.3)
. cluster tree, cutvalue(75.3) showcount countinline
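A sketch of customized branch-count labels; the prefix and suffix strings
are illustrative, and countprefix() implies showcount
. cluster dendrogram L2clnk, cutnumber(10) countprefix("N = ") countsuffix(" obs") countinline
A sketch of saving the graph to disk with the saving() twoway option; the
filename dendro is hypothetical
. cluster dendrogram L2clnk, cutnumber(10) showcount saving(dendro, replace)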
Also see
Manual: [MV] cluster,
[MV] cluster dendrogram
Help: [MV] cluster, [MV] clustermat