GQEO: Nearest neighbor graph-based generalized quadrilateral element oversampling for class-imbalance problem

Neural Netw. 2024 Dec 27:184:107107. doi: 10.1016/j.neunet.2024.107107. Online ahead of print.

Abstract

Class imbalance is one of the factors that most degrade the performance of traditional classifiers. Oversampling is the most common way to address it: it reduces the impact of class imbalance on traditional machine learning by augmenting the feature representation of minority instances. However, many SMOTE-based oversampling techniques perform linear interpolation only on the line segment between the anchor instance and its nearest neighbor. Such methods use only local information and ignore the influence of the global neighborhood relationship on the anchor instance. Therefore, inspired by finite element interpolation, a novel generalized quadrilateral element oversampling technique (GQEO) based on k-nearest neighbor graphs is proposed. First, GQEO uses k-nearest neighbor search to capture the global neighbor relationship and build a global neighbor graph. Then, this graph is searched for nodes that form generalized quadrilateral elements, using planar quadrilaterals as constraints. Finally, one-dimensional shape functions are used to synthesize minority instances within these quadrilateral elements. Experimental results on 30 imbalanced datasets show that GQEO alleviates the impact of class imbalance and prevents noise from participating in the synthesis process. GQEO obtains competitive results compared with state-of-the-art oversampling techniques that account for minority noise.

Keywords: Class-imbalance problem; Generalized quadrilateral elements; K-nearest neighbor graph; Oversampling; Shape function.
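
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It compares SMOTE-style interpolation on a line segment with interpolation inside a four-node element, where the element weights are products of one-dimensional linear shape functions. The function names (`knn_graph`, `smote_point`, `quad_point`), the reference coordinates `xi` and `eta`, and the bilinear weighting are assumptions made for illustration; the abstract does not specify GQEO's exact shape functions, and the search for planar quadrilateral elements in the neighbor graph is not reproduced here.

```python
import numpy as np


def knn_graph(X, k):
    """Return, for each row of X, the indices of its k nearest other rows.

    Hypothetical helper: a brute-force Euclidean k-NN search over minority
    instances, standing in for the neighbor graph described in the abstract.
    """
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # an instance is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def smote_point(anchor, neighbor, rng):
    """SMOTE-style sample: a random point on the segment anchor-neighbor."""
    t = rng.random()
    return anchor + t * (neighbor - anchor)


def quad_point(p1, p2, p3, p4, rng):
    """Sample a point inside the element with corners p1..p4 (in cyclic order).

    Assumption: the four nodal weights are products of the 1D linear shape
    functions (1 - s) / 2 and (1 + s) / 2 evaluated at random reference
    coordinates xi, eta in [-1, 1]. The weights are non-negative and sum
    to 1, so the result is a convex combination of the four corners.
    """
    xi, eta = rng.uniform(-1.0, 1.0, size=2)
    lo = lambda s: (1.0 - s) / 2.0  # 1D shape function for the node at s = -1
    hi = lambda s: (1.0 + s) / 2.0  # 1D shape function for the node at s = +1
    weights = np.array([
        lo(xi) * lo(eta),
        hi(xi) * lo(eta),
        hi(xi) * hi(eta),
        lo(xi) * hi(eta),
    ])
    return weights @ np.vstack([p1, p2, p3, p4])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    minority = rng.normal(size=(6, 2))  # toy minority instances
    neighbors = knn_graph(minority, k=3)
    anchor = minority[0]
    # Segment-based sample between the anchor and its closest neighbor.
    print(smote_point(anchor, minority[neighbors[0, 0]], rng))
    # Element-based sample; using the anchor plus its three neighbors as
    # corners is a placeholder for the paper's quadrilateral search.
    print(quad_point(anchor, *minority[neighbors[0]], rng))
```

The segment-based sample can only fall on a line between two instances, whereas the element-based sample can fall anywhere in the region spanned by four instances, which is the intuition behind using more than a single neighbor.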