Instance categorization by support vector machines to adjust weights in AdaBoost for imbalanced data classification
- Citation data: Information Sciences, ISSN: 0020-0255, Vol: 381, Page: 92-103
- Publication Year:
- Computer Science; Engineering; Mathematics; Decision Sciences
To address class imbalance in data, we propose a new weight adjustment factor that is applied to a weighted support vector machine (SVM) used as the weak learner of the AdaBoost algorithm. Instances are categorized by the SVM margin, and a different factor score is computed for each category and assigned to the corresponding instances. The SVM margin is used to define borderline and noisy instances, and factor scores are assigned only to borderline instances and positive noise. The adjustment factor is then applied as a multiplier to the instance weight in the AdaBoost algorithm when learning a weighted SVM. Using 10 real class-imbalanced datasets, we compare the proposed method to a standard SVM and to other SVMs combined with various sampling and boosting methods. Numerical experiments show that the proposed method outperforms existing approaches in terms of F-measure and area under the receiver operating characteristic curve, indicating that the proposed method is useful for mitigating the class-imbalance problem by addressing well-known degradation issues such as the overlap, small disjuncts, and data shift problems.
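The weighting step described above can be illustrated with a minimal Python sketch. The abstract does not give the category thresholds or the factor scores, so everything concrete here is an assumption made for illustration: the signed margin y * f(x) with cut-offs at 1 and -1, the factor values 2.0 and 1.5, and the hypothetical helper names `categorize`, `adjustment_factor`, and `adjusted_weights`. The decision values f(x) are assumed to come from an already trained SVM, and labels are +1 for the positive (minority) class and -1 for the negative class.

```python
from typing import List, Sequence

SAFE, BORDERLINE, NOISE = "safe", "borderline", "noise"


def categorize(label: int, decision_value: float) -> str:
    """Categorize one instance by its signed SVM margin y * f(x).

    Assumed thresholds (not taken from the paper):
      margin >= 1       -> safe (correct side, outside the margin band)
      -1 < margin < 1   -> borderline (inside the margin band)
      margin <= -1      -> noise (wrong side, outside the margin band)
    """
    margin = label * decision_value
    if margin >= 1.0:
        return SAFE
    if margin > -1.0:
        return BORDERLINE
    return NOISE


def adjustment_factor(
    label: int,
    decision_value: float,
    borderline_factor: float = 2.0,      # assumed value, for illustration
    positive_noise_factor: float = 1.5,  # assumed value, for illustration
) -> float:
    """Return the multiplier for one instance weight.

    Only borderline instances and positive noise receive a factor;
    safe instances and negative noise keep a neutral multiplier of 1.
    """
    category = categorize(label, decision_value)
    if category == BORDERLINE:
        return borderline_factor
    if category == NOISE and label == 1:
        return positive_noise_factor
    return 1.0


def adjusted_weights(
    weights: Sequence[float],
    labels: Sequence[int],
    decision_values: Sequence[float],
) -> List[float]:
    """Multiply AdaBoost instance weights by the factor and renormalize."""
    scaled = [
        w * adjustment_factor(y, f)
        for w, y, f in zip(weights, labels, decision_values)
    ]
    total = sum(scaled)
    return [s / total for s in scaled]


# Five instances: safe positive, borderline positive, positive noise,
# safe negative, negative noise.
labels = [1, 1, 1, -1, -1]
decision_values = [1.5, 0.3, -2.0, -1.2, 2.0]
weights = [0.2] * 5
new_weights = adjusted_weights(weights, labels, decision_values)
```

In a full boosting loop, `new_weights` would be passed as per-instance weights when fitting the weighted SVM for the next round. With these assumed factors, the borderline positive ends up with the largest weight, followed by the positive noise instance, while the safe instances and the negative noise instance share the smallest weight.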