Hierarchical stability-based model selection for clustering algorithms.
We present an algorithm called HS-means, which learns the number of clusters in a mixture model through a hierarchical analysis of clustering stability. Our method extends the concept of clustering stability to one of hierarchical stability. It first estimates a stable model for the data; it then analyzes the stability of each component of that model and chooses a stable model for the component in turn. This recursive stability analysis continues until all the estimated components are unimodal. In so doing, the method is able to handle data symmetry, which existing stability-based algorithms have difficulty with. We test our algorithm on both synthetic and real-world datasets. The results show that HS-means outperforms existing stability-based model selection algorithms and is competitive with other commonly used model selection methods.
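The recursive procedure described above can be illustrated with a minimal sketch. It is not the HS-means implementation: the abstract does not specify the base clusterer, the stability measure, or the unimodality test, so everything here is an assumption made for illustration. The sketch uses plain k-means as the clusterer, the Rand index between clusterings fitted on two random halves as the stability score, and "no model with k >= 2 clears a threshold" as a stand-in for the unimodality check. The names `hs_split`, `select_k`, `stability`, and the parameters `threshold`, `k_max`, `min_size`, and `max_depth` are all hypothetical.

```python
import random
from itertools import combinations


def nearest(p, centers):
    """Index of the center closest to point p (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))


def mean(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(group) for col in zip(*group))


def kmeans(points, k, rng, iters=25):
    """Plain Lloyd's algorithm (assumed base clusterer); returns k centers."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


def stability(points, k, rng, n_trials=5, n_eval=40):
    """Assumed stability score: mean Rand index between the labelings
    induced by k-means models fitted on two disjoint random halves."""
    scores = []
    for _ in range(n_trials):
        shuffled = rng.sample(points, len(points))
        half = len(shuffled) // 2
        c1 = kmeans(shuffled[:half], k, rng)
        c2 = kmeans(shuffled[half:], k, rng)
        probe = shuffled[:n_eval]
        l1 = [nearest(p, c1) for p in probe]
        l2 = [nearest(p, c2) for p in probe]
        pairs = list(combinations(range(len(probe)), 2))
        agree = sum((l1[i] == l1[j]) == (l2[i] == l2[j]) for i, j in pairs)
        scores.append(agree / len(pairs))
    return sum(scores) / len(scores)


def select_k(points, rng, k_max=3, threshold=0.9):
    """Largest k in 2..k_max whose stability clears the threshold, else 1."""
    best = 1
    for k in range(2, k_max + 1):
        if stability(points, k, rng) >= threshold:
            best = k
    return best


def hs_split(points, rng, min_size=20, depth=0, max_depth=4):
    """Recursively split the data; stop when no multi-cluster model is
    stable (a stand-in for the unimodality check), or at the size/depth caps."""
    if len(points) < min_size or depth >= max_depth:
        return [points]
    k = select_k(points, rng)
    if k == 1:
        return [points]
    centers = kmeans(points, k, rng)
    groups = [[] for _ in range(k)]
    for p in points:
        groups[nearest(p, centers)].append(p)
    groups = [g for g in groups if g]
    if len(groups) < 2:
        return [points]
    leaves = []
    for g in groups:
        leaves.extend(hs_split(g, rng, min_size, depth + 1, max_depth))
    return leaves


# Synthetic example: two well-separated pairs of nearby Gaussian blobs.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (1.5, 0.0), (10.0, 0.0), (11.5, 0.0)]
data = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
        for cx, cy in blob_centers for _ in range(40)]
leaves = hs_split(data, rng)
print(len(leaves), [len(leaf) for leaf in leaves])
```

The number of leaves returned is the sketch's estimate of the number of clusters. The `min_size` and `max_depth` caps are there only to guarantee termination; a faithful implementation would replace the threshold rule with whatever unimodality criterion the method actually uses.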