Cluster analysis

Hierarchical clustering

  • Single linkage
  • Complete linkage
  • Average linkage
  • Ward’s linkage (including Ward’s method)
  • Weighted-average linkage
  • Centroid linkage
  • Median linkage

Nonhierarchical

  • Kmeans
  • Kmedians

Cluster on observations

    Cluster using any dissimilarity matrix

      Dendrograms

      • Full trees
      • Subtrees
      • Upper portion of tree
      • Vertical or horizontal orientation
      • Branch counts

      Stopping rules

      • Caliński and Harabasz pseudo-F index
      • Duda and Hart Je(2)/Je(1) index
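
To make the first stopping rule concrete, here is a minimal Python sketch of the Caliński and Harabasz pseudo-F under its usual textbook definition: the between-cluster sum of squares divided by (g - 1), over the within-cluster sum of squares divided by (n - g), for n observations in g clusters. The function name `pseudo_f` and the sample data are hypothetical, and this is not Stata's implementation; larger values are generally read as indicating more distinct clustering.

```python
def pseudo_f(points, labels):
    """Calinski-Harabasz pseudo-F for points (sequences of floats) and cluster labels.

    Assumes at least two clusters and more observations than clusters.
    """
    n = len(points)
    dims = len(points[0])

    # Group the observations by cluster label.
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    g = len(groups)

    # Grand centroid of all observations.
    grand = [sum(p[j] for p in points) / n for j in range(dims)]

    between = 0.0  # between-cluster sum of squares
    within = 0.0   # within-cluster sum of squares
    for members in groups.values():
        centroid = [sum(p[j] for p in members) / len(members) for j in range(dims)]
        between += len(members) * sum((centroid[j] - grand[j]) ** 2 for j in range(dims))
        within += sum((p[j] - centroid[j]) ** 2 for p in members for j in range(dims))

    return (between / (g - 1)) / (within / (n - g))


# Hypothetical data: two tight groups that lie far apart.
sample_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
sample_labels = [1, 1, 2, 2]
sample_f = pseudo_f(sample_points, sample_labels)
```

A common way to use such an index is to compute it for several candidate numbers of clusters and compare the values.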

      Support tools

      • Generate summary and grouping variables
      • Attach notes to analyses

      Similarity/dissimilarity measures for continuous data

      • L2/Euclidean
      • L1/absolute/cityblock/manhattan
      • L(#)
      • Canberra
      • Correlation
      • Angular
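
As an illustration of the continuous measures listed above, the following Python sketch implements their standard textbook forms. It is not Stata's code: the function names are hypothetical, and details such as how zero denominators are handled are assumptions that may differ from Stata's conventions. L2 and L1 are the special cases of L(#) with # set to 2 and 1.

```python
import math


def minkowski(x, y, p):
    """L(p) distance: (sum of |x_i - y_i|^p)^(1/p). p=2 is Euclidean, p=1 is Manhattan."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def canberra(x, y):
    """Canberra distance: sum of |x_i - y_i| / (|x_i| + |y_i|).

    Terms whose denominator is zero are skipped (an assumption of this sketch).
    """
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def angular(x, y):
    """Angular similarity: cosine of the angle between x and y (nonzero vectors assumed)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def correlation(x, y):
    """Correlation similarity: the angular similarity of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return angular([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical observations measured on three variables.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]
l2 = minkowski(x, y, 2)   # Euclidean
l1 = minkowski(x, y, 1)   # absolute / city block / Manhattan
cb = canberra(x, y)
```

Note that the first three functions are dissimilarities (larger means further apart), whereas the correlation and angular measures are similarities (larger means more alike).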

      Similarity/dissimilarity measures for binary data

      • Matching
      • Jaccard
      • Russell
      • Hamann
      • Dice
      • Antidice
      • Sneath
      • Rogers
      • Ochiai
      • Yule
      • Anderberg
      • Kulczynski
      • Gower2
      • Pearson
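
Most of the binary measures above are functions of the four cell counts of a 2x2 cross-tabulation of two observations: a (both 1), b (first 1, second 0), c (first 0, second 1), and d (both 0). The Python sketch below computes five of them using their commonly cited formulas. The function names are hypothetical and this is not Stata's implementation; it assumes the observations are not all zero, so no denominator vanishes.

```python
def binary_counts(x, y):
    """Return the 2x2 cell counts (a, b, c, d) for two 0/1 sequences of equal length."""
    a = sum(1 for u, v in zip(x, y) if u and v)          # both 1
    b = sum(1 for u, v in zip(x, y) if u and not v)      # x is 1, y is 0
    c = sum(1 for u, v in zip(x, y) if not u and v)      # x is 0, y is 1
    d = sum(1 for u, v in zip(x, y) if not u and not v)  # both 0
    return a, b, c, d


def binary_similarities(x, y):
    """Compute several binary similarity coefficients from the 2x2 counts."""
    a, b, c, d = binary_counts(x, y)
    n = a + b + c + d
    return {
        "matching": (a + d) / n,            # share of agreeing positions
        "jaccard": a / (a + b + c),         # ignores joint absences
        "russell": a / n,                   # joint presences over all positions
        "dice": 2 * a / (2 * a + b + c),    # double weight on joint presences
        "hamann": ((a + d) - (b + c)) / n,  # agreements minus disagreements
    }


# Hypothetical observations on five binary variables.
obs1 = [1, 1, 0, 0, 1]
obs2 = [1, 0, 0, 1, 1]
sims = binary_similarities(obs1, obs2)
```

The measures differ mainly in how they treat joint absences (d): matching and Hamann count them as agreement, while Jaccard and Dice leave them out entirely.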

      Gower measure for mixed binary and continuous data

        Result-management utilities

        • Directory-style listing
        • Detailed listing of clusters
        • Drop cluster analyses
        • Mark a cluster analysis as the most recent one
        • Rename a cluster

        User-extensible commands

        • Ability to add new clustering methods and utilities
        • Full set of tools to ease making additions

        Additional resources

        See New in Stata 18 to learn what was added in that release.