Imbalanced Multiclass Data Classification Using Combined Data Sampling and Deep Learning Method

Authors

  • Sivasankari Shunmugasundaram, Research Scholar

DOI:

https://doi.org/10.12723/mjs.sp2.11

Keywords:

Imbalanced Data, Deep Learning, Stratified Sampling, Optimisation Algorithm, Combined Random Oversampling

Abstract

Multiclass classification, used for pattern finding, assigns each data instance to one of more than two classes or labels. The foremost challenge is imbalanced data, in which a large portion of the samples belongs to the majority class and a small portion to the minority class; this leads to poor understanding of the samples and less accurate results. Existing works applied Random Upsampling, Random Downsampling, and SMOTE individually with a FeedForward Neural Network and found that Random Oversampling gave better results. However, it generates many duplicate records and has lower accuracy. Hence, this research work puts forward a Combined Random Over-Under Sampling approach, applied after preprocessing the data by replacing missing values with the mean, feature selection, and noise filtering. This work also extends the existing FeedForward Neural Network to Deep Learning. The proposed work is implemented in the RapidMiner tool and assessed with appropriate evaluation measures on the training and testing data separately.
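
The combined sampling idea can be illustrated with a minimal Python sketch. It is not the authors' RapidMiner implementation: the function name combined_random_resample, the toy data, and the choice of the mean class size as the per-class target are all assumptions made for illustration, since the abstract does not state how the target size is chosen. Classes larger than the target are randomly undersampled without replacement, and smaller classes are randomly oversampled with replacement.

```python
import random
from collections import Counter, defaultdict


def combined_random_resample(rows, labels, target=None, seed=0):
    """Bring every class to `target` rows by random under- or oversampling.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    if target is None:
        # Assumption: aim for the mean class size; the abstract does not
        # specify the target size.
        target = round(len(rows) / len(by_class))

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target:
            # Majority side: random undersampling without replacement.
            chosen = rng.sample(members, target)
        else:
            # Minority side: keep all rows, then add random duplicates.
            extra = rng.choices(members, k=target - len(members))
            chosen = members + extra
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy imbalanced data set: 8 rows of class "a", 3 of "b", 1 of "c".
rows = [[i] for i in range(12)]
labels = ["a"] * 8 + ["b"] * 3 + ["c"] * 1

new_rows, new_labels = combined_random_resample(rows, labels)
print(Counter(new_labels))
```

With this toy input the target is four rows per class, so class "a" loses four rows while classes "b" and "c" gain duplicates. Undersampling the majority class limits how many duplicates the minority classes need, which is the motivation the abstract gives for combining the two methods.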

Published

2023-12-27