17.7.3.2 Interpreting Results of Hierarchical Cluster Analysis


The Hierarchical Cluster Report Sheet

Descriptive Statistics

The descriptive statistics table lists the mean and standard deviation of each variable, which can reveal whether the variables are measured on different scales. In this example, the means and standard deviations differ greatly, so we may want to standardize the variables during analysis.
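As an illustration of what standardizing does, the sketch below converts each variable to z-scores (subtract the mean, divide by the standard deviation) using only the Python standard library. The data values and the `standardize` helper are hypothetical, and the use of the sample standard deviation is an assumption; this is not Origin's implementation.

```python
from statistics import mean, stdev

def standardize(column):
    """Return z-scores: (x - mean) / sample standard deviation."""
    m = mean(column)
    s = stdev(column)  # assumption: sample (n - 1) standard deviation
    return [(x - m) / s for x in column]

# Hypothetical data: two variables measured on very different scales.
income = [32000.0, 45000.0, 51000.0, 78000.0]
age = [23.0, 35.0, 41.0, 58.0]

z_income = standardize(income)
z_age = standardize(age)

# After standardizing, both variables have mean 0 and standard deviation 1,
# so neither dominates a distance calculation because of its units.
print([round(z, 3) for z in z_income])
print([round(z, 3) for z in z_age])
```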

Distance Matrix

The distance matrix lists the actual distance computed for every pair of observations or variables, which shows how similar the members of each pair are.
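A minimal sketch of how such a matrix can be built, assuming Euclidean distance between observations (other distance types are possible). The `distance_matrix` helper and the sample points are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+

def distance_matrix(points):
    """Return a symmetric matrix of pairwise Euclidean distances."""
    n = len(points)
    return [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

# Hypothetical observations with two variables each.
observations = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
d = distance_matrix(observations)

for row in d:
    print(row)
```

The diagonal is zero (each observation compared with itself) and the matrix is symmetric; smaller entries mean more similar pairs.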

Cluster Stages

The cluster stages table details how observations or variables are joined at each stage. It is practical when there are only a few variables and observations. If the sample size is large, we recommend using the dendrogram, which visualizes the cluster stages.
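To show the kind of information a stages table holds, here is a naive sketch of agglomerative clustering that records one row per merge. It assumes Euclidean distance and single (nearest neighbor) linkage; the `cluster_stages` function, the cluster numbering scheme, and the sample points are hypothetical and are not taken from Origin.

```python
from math import dist

def cluster_stages(points):
    """Naive single-linkage agglomeration; returns one row per merge."""
    clusters = {i: [i] for i in range(len(points))}  # cluster id -> members
    next_id = len(points)  # new clusters are numbered after the observations
    stages = []
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                # Single linkage: distance between the closest members.
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        stages.append((len(stages) + 1, a, b, d))  # stage, cluster, cluster, distance
        next_id += 1
    return stages

# Hypothetical observations forming two well-separated pairs.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5)]
for stage, a, b, d in cluster_stages(points):
    print(stage, a, b, round(d, 3))
```

With n observations there are always n - 1 stages, which is why the table becomes hard to read for large samples.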

Dendrogram

The dendrogram is the most important result of cluster analysis. It lists all samples and shows the level of similarity at which any two clusters were joined. The position of the line on the scale indicates the distance at which clusters were joined. The dendrogram is also a useful tool for choosing the number of clusters: look for a sudden increase in the joining distance between adjacent steps, which suggests an appropriate number of clusters to consider. Please see the tutorial for details.
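The "sudden increase" heuristic can be sketched numerically. Assuming the joining distances are listed in merge order, the function below finds the largest jump between adjacent steps and reports how many clusters remain if merging stops just before that jump. The `suggest_cluster_count` name and the sample distances are hypothetical, and the result is only a suggestion to check against the dendrogram.

```python
def suggest_cluster_count(merge_distances):
    """Suggest a cluster count from the largest jump in joining distance."""
    if len(merge_distances) < 2:
        raise ValueError("need at least two merges to compare adjacent steps")
    n_obs = len(merge_distances) + 1  # n observations produce n - 1 merges
    jumps = [b - a for a, b in zip(merge_distances, merge_distances[1:])]
    i = max(range(len(jumps)), key=jumps.__getitem__)
    # The largest jump lies between merge i and merge i + 1 (0-based),
    # so stop after i + 1 merges; each merge reduces the count by one.
    return n_obs - (i + 1)

# Hypothetical joining distances for four observations.
heights = [1.0, 1.5, 5.0]
print(suggest_cluster_count(heights))
```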

Using the Hierarchical Cluster Analysis dialog (hcluster), you can opt to output a phylogenetic tree with selectable nodes that can be manipulated via a shortcut menu:

  • Use Swap Subtrees to swap clusters immediately below the current selected node.
  • Reroot with This Node is disabled for hcluster dialog output.
  • Reset Tree will undo all swapping and restore the original tree structure.
  • Duplicate Branch to New Window outputs the currently selected cluster(s) to a new graph window.

Cluster Membership

The cluster membership table provides the detailed group structure after classification.
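As a sketch of how membership follows from the merge sequence, the function below replays the first n - k merges and assigns a label to every observation. The `cluster_membership` function, the convention that each new cluster is numbered after the observations, and the sample merges are all hypothetical; Origin's own numbering may differ.

```python
def cluster_membership(n_obs, merges, n_clusters):
    """Replay merges until n_clusters remain; return a 1-based label per observation."""
    members = {i: [i] for i in range(n_obs)}  # cluster id -> observation indices
    next_id = n_obs
    for a, b in merges[: n_obs - n_clusters]:
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    labels = [0] * n_obs
    # Number clusters by their lowest-indexed member for a stable ordering.
    ordered = sorted(members, key=lambda cid: min(members[cid]))
    for label, cid in enumerate(ordered, start=1):
        for obs in members[cid]:
            labels[obs] = label
    return labels

# Hypothetical merge sequence for four observations:
# 0 joins 1 (new cluster 4), 2 joins 3 (new cluster 5), then 4 joins 5.
merges = [(0, 1), (2, 3), (4, 5)]
labels = cluster_membership(4, merges, 2)
print(labels)
```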