Divisive Clustering

What is Divisive Clustering?

Divisive clustering is a 'top-down' hierarchical clustering method; its best-known algorithm is DIANA (DIvisive ANAlysis). It begins with all data points in a single, large cluster. At each step, the most heterogeneous cluster (for example, the one with the largest diameter) is split into two smaller, more cohesive clusters. This process is repeated until every data point resides in its own cluster or a stopping criterion, such as a target number of clusters, is met. The result is a tree-like structure called a dendrogram, which illustrates the hierarchy of clusters.
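The splitting step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: it uses Euclidean distance, picks the cluster with the largest diameter as the "most heterogeneous" one, and splits it with a DIANA-style splinter-group heuristic. The function names (avg_dist, split_cluster, diameter, divisive) and the stopping parameter k are hypothetical and chosen only for this example.

```python
import math

def avg_dist(p, group, points):
    """Mean Euclidean distance from point index p to the other indices in group."""
    others = [q for q in group if q != p]
    if not others:
        return 0.0
    return sum(math.dist(points[p], points[q]) for q in others) / len(others)

def split_cluster(cluster, points):
    """Split one cluster (a list of point indices) into two non-empty groups."""
    remaining = list(cluster)
    # Seed the splinter group with the point farthest, on average, from the rest.
    seed = max(remaining, key=lambda p: avg_dist(p, remaining, points))
    splinter = [seed]
    remaining.remove(seed)
    while len(remaining) > 1:
        # A positive gain means the point sits closer to the splinter group.
        gains = {
            p: avg_dist(p, remaining, points) - avg_dist(p, splinter, points)
            for p in remaining
        }
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break
        remaining.remove(best)
        splinter.append(best)
    return splinter, remaining

def diameter(cluster, points):
    """Largest pairwise distance inside a cluster (0.0 for a single point)."""
    return max(
        (math.dist(points[a], points[b]) for a in cluster for b in cluster),
        default=0.0,
    )

def divisive(points, k):
    """Top-down clustering: keep splitting until k clusters exist (assumed stop rule)."""
    clusters = [list(range(len(points)))]
    while len(clusters) < k:
        # Assumption: "most heterogeneous" means "largest diameter".
        target = max(clusters, key=lambda c: diameter(c, points))
        if len(target) < 2:
            break  # only singletons are left, nothing more to split
        clusters.remove(target)
        clusters.extend(split_cluster(target, points))
    return clusters

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
result = sorted(sorted(c) for c in divisive(points, 2))
print(result)
```

Recording the order of the splits, rather than stopping at k clusters, would give the information needed to draw the dendrogram.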

Where did the term "Divisive Clustering" come from?

As a hierarchical clustering method, its roots are in the fields of data analysis and taxonomy. It was developed as an alternative to the more common 'bottom-up' agglomerative approach. Divisive clustering is computationally more expensive when every possible split is considered at each step, but it can capture the global structure of the data more accurately, because the earliest splits are made with the complete data set in view.

How is "Divisive Clustering" used today?

Divisive clustering is less common in practice than agglomerative clustering because of its computational cost: an exhaustive search over every way of splitting a cluster of n points into two groups grows exponentially, O(2^n), so practical algorithms rely on heuristics instead. It is nevertheless used in fields such as bioinformatics, for gene expression analysis, and market research, for high-level market segmentation, where understanding the primary divisions in the data matters more than the fine-grained structure. The 'monothetic divisive' variant, which splits clusters using only one variable at a time, is computationally more feasible.
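To make the cost difference concrete, the sketch below counts the two-way splits an exhaustive search would have to examine, and contrasts that with a simple monothetic split. It is an illustration under assumptions, not a standard algorithm: the monothetic split here tries thresholds on one numeric variable at a time and keeps the one with the lowest within-group sum of squared deviations. That criterion, and the names count_bipartitions, sse and monothetic_split, are hypothetical choices made for this example.

```python
def count_bipartitions(n):
    """Number of ways to split n points into two non-empty groups."""
    return 2 ** (n - 1) - 1

def sse(rows):
    """Sum of squared deviations from the column means, over all columns."""
    if not rows:
        return 0.0
    total = 0.0
    for col in zip(*rows):
        mean = sum(col) / len(col)
        total += sum((v - mean) ** 2 for v in col)
    return total

def monothetic_split(rows):
    """Best split that uses a single variable and a single threshold.

    Returns (variable_index, threshold, left_rows, right_rows),
    or None if no variable has two distinct values.
    """
    best = None
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for r in rows if r[j] <= t]
            right = [r for r in rows if r[j] > t]
            cost = sse(left) + sse(right)  # assumed quality criterion
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    return None if best is None else best[1:]

print(count_bipartitions(20))  # candidate splits for just 20 points

rows = [(1.0, 5.0), (1.2, 9.0), (8.0, 5.5), (8.3, 9.5)]
variable, threshold, left, right = monothetic_split(rows)
print(variable, threshold, left, right)
```

With this approach, the number of candidate splits per step is bounded by the number of variables times the number of distinct values, rather than growing exponentially with the number of points.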

Related Terms