K-Means Cluster Based Undersampling Ensemble for Imbalanced Data Classification
S. Santha Subbulaxmi1, G. Arumugam2
1Ms. S. Santha Subbulaxmi,* Research Scholar, Madurai Kamaraj University, Madurai, Tamil Nadu, India.
2Dr. G. Arumugam, Professor & Head (Retd.), Department of Computer Science, Madurai Kamaraj University, Madurai, Tamil Nadu, India.
Manuscript received on January 06, 2020. | Revised Manuscript received on February 05, 2020. | Manuscript published on February 30, 2020. | PP: 2074-2079 | Volume-9 Issue-3, February 2020. | Retrieval Number: C5188029320/2020©BEIESP | DOI: 10.35940/ijeat.C5188.029320
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Imbalanced data classification is a critical and challenging problem in both data mining and machine learning. Such problems arise in many application areas, including the diagnosis of rare medical conditions, risk management, and fault detection. Traditional classification algorithms yield poor results on imbalanced classification problems. In this paper, a K-Means cluster-based undersampling ensemble algorithm is proposed to solve the imbalanced data classification problem. The proposed method combines K-Means cluster-based undersampling with boosting. Experimental results show that the proposed algorithm outperforms the sampling ensemble algorithms of previous studies.
Keywords: imbalanced data, classification, undersampling, ensemble, k-means clustering.
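
The abstract does not spell out the undersampling step, so the following is only a minimal sketch of one common form of K-Means cluster-based undersampling, not the authors' algorithm. It assumes that the majority class is clustered with plain Lloyd's k-means and that samples are then drawn from each cluster in proportion to its size. The names `kmeans` and `cluster_undersample`, the parameters `n_keep` and `k`, and the synthetic data are all hypothetical. The boosting stage is omitted.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples.
    Returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centres from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty cluster's centre to its mean.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels

def cluster_undersample(majority, n_keep, k=3, seed=0):
    """Keep roughly n_keep majority samples, drawn from each k-means
    cluster in proportion to that cluster's size (an assumed policy)."""
    rng = random.Random(seed)
    labels = kmeans(majority, k, seed=seed)
    kept = []
    for c in range(k):
        members = [p for p, l in zip(majority, labels) if l == c]
        quota = round(n_keep * len(members) / len(majority))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept

# Synthetic 9:1 imbalanced data, for illustration only.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(90)]
minority = [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(10)]

kept = cluster_undersample(majority, n_keep=len(minority), k=3)
# Label 0 = majority, 1 = minority; the classes are now roughly balanced.
balanced = [(p, 0) for p in kept] + [(p, 1) for p in minority]
```

Because the per-cluster quotas are rounded, the number of retained majority samples may differ slightly from `n_keep`. In an ensemble setting, one plausible design is to vary `seed` so that each base learner is trained on a different balanced subset.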