Minimum Distribution Support Vector Clustering

Entropy (Basel). 2021 Nov 8;23(11):1473. doi: 10.3390/e23111473.

Abstract

Support vector clustering (SVC) is a boundary-based algorithm, which has several advantages over other clustering methods, including identifying clusters of arbitrary shapes and numbers. Leveraged by the high generalization ability of the large margin distribution machine (LDM) and the optimal margin distribution clustering (ODMC), we propose a new clustering method: minimum distribution for support vector clustering (MDSVC), for improving the robustness of boundary point recognition, which characterizes the optimal hypersphere by the first-order and second-order statistics and tries to minimize the mean and variance simultaneously. In addition, we further prove, theoretically, that our algorithm can obtain better generalization performance. Some instructive insights for adjusting the number of support vector points are gained. For the optimization problem of MDSVC, we propose a double coordinate descent algorithm for small and medium samples. The experimental results on both artificial and real datasets indicate that our MDSVC has a significant improvement in generalization performance compared to SVC.

Keywords: dual coordinate descent; margin theory; mean; support vector clustering; variance.