Dendrogram

From Wikipedia, the free encyclopedia
Hierarchical clustering dendrogram of the Iris dataset (using R).

A dendrogram (from Greek dendron "tree" and gramma "drawing") is a tree diagram frequently used to illustrate the arrangement of the clusters produced by hierarchical clustering.[1] In computational biology, dendrograms often illustrate the clustering of genes or samples, sometimes drawn along the margins of heatmaps.

Clustering example

As a clustering example, suppose the data shown in the figure below are to be clustered, with Euclidean distance as the distance metric.

Raw data

The resulting hierarchical clustering dendrogram would look like this:

Traditional representation

The top row of nodes represents data (individual observations), and the remaining nodes represent the clusters to which the data belong, with the arrows representing the distance (dissimilarity).
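The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used for the article's figure: the five labelled points are hypothetical, and complete linkage (the largest pairwise Euclidean distance between two clusters) is an assumed choice of intergroup dissimilarity. Each merge it records corresponds to one internal node of the dendrogram, and the recorded height is the dissimilarity at which the two clusters join.

```python
from itertools import combinations
from math import dist

# Hypothetical 2-D observations; labels and coordinates are illustrative only.
points = {
    "a": (0.0, 0.0),
    "b": (0.0, 1.0),
    "c": (4.0, 0.0),
    "d": (4.0, 1.5),
    "e": (10.0, 0.0),
}

def complete_linkage(c1, c2):
    """Largest Euclidean distance between any member of c1 and any member of c2."""
    return max(dist(points[p], points[q]) for p in c1 for q in c2)

def agglomerate(labels):
    """Repeatedly merge the two closest clusters; return (left, right, height) per merge."""
    clusters = [(label,) for label in labels]  # every observation starts as its own cluster
    merges = []
    while len(clusters) > 1:
        left, right = min(
            combinations(clusters, 2),
            key=lambda pair: complete_linkage(*pair),
        )
        height = complete_linkage(left, right)
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
        merges.append((left, right, height))
    return merges

merges = agglomerate(sorted(points))
for left, right, height in merges:
    print(f"{'+'.join(left)} | {'+'.join(right)} at height {height:.2f}")
```

With n observations the loop performs n - 1 merges, so the resulting tree has n leaves and n - 1 internal nodes. The brute-force search over all cluster pairs keeps the sketch short; it is not intended to be efficient for large data sets.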

The distance between merged clusters is monotone increasing with the level of the merger: the height of each node in the plot is proportional to the value of the intergroup dissimilarity between its two daughters (the top nodes representing individual observations are all plotted at zero height).
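The monotonicity property can be checked directly on a tree whose nodes store their heights. The sketch below assumes a hypothetical `Node` class and made-up heights; it only illustrates the rule that leaves sit at zero height and that no node is plotted lower than either of its daughters. Some intergroup dissimilarities, such as centroid linkage, can violate this rule and produce so-called inversions, which is what the second tree imitates.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """A dendrogram node: a leaf (no daughters) or a merge of two daughters."""
    label: str
    height: float = 0.0            # leaves sit at zero height
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def is_monotone(node):
    """True if leaves are at height zero and no node is lower than either daughter."""
    if node.left is None or node.right is None:
        return node.height == 0.0
    return (
        node.height >= node.left.height
        and node.height >= node.right.height
        and is_monotone(node.left)
        and is_monotone(node.right)
    )

# Hypothetical tree: the heights are made-up dissimilarities.
ab = Node("ab", 1.0, Node("a"), Node("b"))
cd = Node("cd", 1.5, Node("c"), Node("d"))
root = Node("abcd", 4.3, ab, cd)

# An inversion: the parent is lower than one of its daughters.
inverted = Node("abcd", 1.2, ab, cd)

print(is_monotone(root), is_monotone(inverted))
```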

References

  1. Everitt, Brian (1998). Dictionary of Statistics. Cambridge, UK: Cambridge University Press. p. 96. ISBN 0-521-59346-8.

External links

  • Iris dendrogram - Example of using a dendrogram to visualize the 3 clusters from hierarchical clustering using the "complete" method vs the real species category (using R).
Retrieved from "https://en.wikipedia.org/w/index.php?title=Dendrogram&oldid=835970563"
This content was retrieved from Wikipedia : http://en.wikipedia.org/wiki/Dendrogram
This page is based on the copyrighted Wikipedia article "Dendrogram"; it is used under the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC-BY-SA). You may redistribute it, verbatim or modified, providing that you comply with the terms of the CC-BY-SA