Searching for just a few words should be enough to get started. If you need to make more complex queries, use the tips below to guide you.
Article type: Research Article
Authors: Wu, Qina | Lin, Yapinga; * | Zhu, Tuanfeib | Zhang, Yuea
Affiliations: [a] College of Information Science and Engineering, Hunan University, Changsha, P.R. China | [b] College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
Correspondence: [*] Corresponding author. Yaping Lin, College of Information Science and Engineering, Hunan University, Changsha, Hunan Province, 410082, P.R. China. E-mail: [email protected].
Abstract: Learning from high-dimensional imbalanced data is prevalent in many vital real-world applications, which poses a severe challenge to traditional data mining and machine learning algorithms. The existing works generally use dimension reduction methods to deal with the curse of dimensionality, then apply traditional imbalance learning techniques to combat the problem of class imbalance. However, dimensionality reduction may cause the loss of useful information, especially for the minority classes. This paper introduces an ensemble-based method, HIBoost, to directly handle the imbalanced learning problem in high dimensional space. HIBoost takes into account the inherent high-dimensional hubness phenomenon, i.e., high-dimensional data tends to contain the singular points (hubs and anti-hubs) which frequently or rarely occur in k-nearest neighbors of other points. For the singular hubs and anti-hubs induced by high dimension, HIBoost introduces a discount factor to restrict the weight growth of them in the process of updating weight, so that the risk of over fitting can be reduced when training component classifiers. For class imbalance problem, HIBoost uses SMOTE to balance the training data in each iteration so as to alleviate the prediction bias of component classifiers. Experimental results based on sixteen high-dimensional imbalanced data sets demonstrate the effectiveness of HIBoost.
Keywords: Hubness, class imbalance, high dimension, SMOTE, Ada Boost
DOI: 10.3233/JIFS-190821
Journal: Journal of Intelligent & Fuzzy Systems, vol. 39, no. 1, pp. 133-144, 2020
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel: +1 703 830 6300
Fax: +1 703 830 2300
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
Tel: +31 20 688 3355
Fax: +31 20 687 0091
[email protected]
For editorial issues, permissions, book requests, submissions and proceedings, contact the Amsterdam office [email protected]
Inspirees International (China Office)
Ciyunsi Beili 207(CapitaLand), Bld 1, 7-901
100025, Beijing
China
Free service line: 400 661 8717
Fax: +86 10 8446 7947
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
如果您在出版方面需要帮助或有任何建, 件至: [email protected]