Abstract
Classifying imbalanced data is a challenging task in real applications. In this work, a method is proposed for image classification under an imbalanced class distribution. The proposed method extracts an appropriate feature vector that can be fed into a simple classifier such as a support vector machine (SVM). First, clustering-based feature extraction (CBFE) is applied to reduce the feature space. Then, the relationships of each considered sample with the minority classes and with the majority classes are explored by constructing two individual graphs. The Laplacian matrices of these graphs are combined to provide a dedicated projection matrix for each sample under test while preserving the manifold structure of the minority and majority classes. The initial feature reduction needs only first-order data statistics, and the subsequent graph-based projection is unsupervised. Moreover, SVM is not sensitive to the training set size. The proposed method is therefore efficient not only for imbalanced data classification but also for classification in small-sample-size situations. Different subsets of the SVHN and CIFAR-10 datasets with various imbalance ratios are used in the experiments. The results show superior performance of the proposed method compared to SVM and a convolutional neural network (CNN). No data augmentation or oversampling is done to balance the data distribution in the proposed method. Because CNN performs weakly on imbalanced datasets, image augmentation is applied to the CNN competitor. Even so, the proposed method without data augmentation provides better results than CNN with data augmentation.
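The graph-based step can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything specific here is an assumption made for illustration: the k-nearest-neighbour graphs with heat-kernel weights, the weighted sum used to combine the two Laplacians, the trade-off parameter `alpha`, and the function names `knn_laplacian` and `sample_projection`. The sketch follows the general idea of locality preserving projections and should not be read as the proposed method itself.

```python
import numpy as np

def knn_laplacian(X, k=3, sigma=1.0):
    """Unnormalized Laplacian L = D - W of a symmetric kNN graph.

    Assumption: heat-kernel edge weights; the abstract does not specify them.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between rows of X.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(sq[i])
        nbrs = [j for j in order if j != i][:k]  # k nearest, excluding i
        W[i, nbrs] = np.exp(-sq[i, nbrs] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)  # make the graph undirected
    D = np.diag(W.sum(axis=1))
    return D - W

def sample_projection(x, X_min, X_maj, dim=2, alpha=1.0, k=3):
    """Per-sample projection matrix from two graphs (hypothetical combination).

    One graph links the sample x to minority-class samples, the other to
    majority-class samples. Their Laplacians are combined as a weighted sum.
    """
    A = np.vstack([x, X_min])  # sample under test plus minority samples
    B = np.vstack([x, X_maj])  # sample under test plus majority samples
    S = A.T @ knn_laplacian(A, k) @ A + alpha * (B.T @ knn_laplacian(B, k) @ B)
    S = (S + S.T) / 2  # remove numerical asymmetry
    # eigh returns eigenvalues in ascending order; the eigenvectors of the
    # smallest ones give directions that keep graph neighbours close.
    _, vecs = np.linalg.eigh(S)
    return vecs[:, :dim]

rng = np.random.default_rng(0)
X_min = rng.normal(size=(5, 4))    # toy minority-class features
X_maj = rng.normal(size=(20, 4))   # toy majority-class features
x = rng.normal(size=4)             # sample under test
P = sample_projection(x, X_min, X_maj, dim=2)
z = x @ P                          # projected feature vector for the classifier
```

In this sketch the projected vector `z` would be what is passed to the SVM. The orthonormal eigenvector basis is a simplification; a generalized eigenproblem with a degree-matrix constraint is another common choice.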