Study of Multi-Class Classification Algorithms’ Performance on Highly Imbalanced Network Intrusion Datasets

Viktoras Bulavas; Virginijus Marcinkevičius; Jacek Rumiński

doi:10.15388/21-infor457

Abstract

This paper addresses the problem of class imbalance in machine learning, focusing on the detection of rare intrusion classes in computer networks. Class imbalance occurs when examples of one class heavily outnumber examples of the other classes. We are particularly interested in classifiers, because pattern recognition and anomaly detection can be posed as classification problems. Since most of the traffic in any organization's network is benign and malignant traffic is rare, researchers have to deal with class imbalance. Substantial research has been undertaken to identify methods or data features that allow these attacks to be identified accurately. The usual tactic, however, is to label all malignant traffic as one class and then solve a binary classification problem. In this paper, we instead choose not to group or drop rare classes and investigate what can be done to achieve good multi-class classification performance. Rare class records were up-sampled using the SMOTE method (Chawla et al., 2002) to preset ratio targets. Experiments were performed with three network traffic datasets, namely CIC-IDS2017, CSE-CIC-IDS2018 (Sharafaldin et al., 2018) and LITNET-2020 (Damasevicius et al., 2020), with the aim of achieving reliable recognition of the rare malignant classes available in these datasets. Popular machine learning algorithms were chosen in order to compare their readiness to support rare class detection. The algorithms' hyperparameters were tuned within a wide range of values, different feature selection methods were used, and tests were executed with and without over-sampling to evaluate multi-class classification performance on rare classes.
A ranking of the machine learning algorithms based on Precision, Balanced Accuracy Score, Ḡ, and a Bias and Variance decomposition of the prediction error shows that decision tree ensembles (AdaBoost, Random Forest Trees and Gradient Boosting Classifier) performed best on the network intrusion datasets used in this research.
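
The abstract ranks classifiers by imbalance-aware metrics rather than plain accuracy. The following is a minimal sketch, not the authors' code, of why that matters. It assumes that Ḡ denotes the geometric mean of per-class recall, which the abstract itself does not state. The class labels, class counts and function names are made up for illustration.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class that occurs in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / count for label, count in support.items()}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def geometric_mean_recall(y_true, y_pred):
    """Geometric mean of the per-class recalls (assumed meaning of G-bar)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1 / len(recalls))


# Hypothetical toy traffic: 90 benign flows and two rare attack classes.
y_true = ["benign"] * 90 + ["dos"] * 8 + ["botnet"] * 2
# A degenerate classifier that labels every flow as benign.
y_pred = ["benign"] * 100

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
g_mean = geometric_mean_recall(y_true, y_pred)
print(plain_accuracy, bal_acc, g_mean)
```

In this toy case plain accuracy looks high although both rare classes are missed entirely, while balanced accuracy falls to one third and the geometric mean to zero, which is the behaviour that makes such metrics suitable for ranking classifiers on rare classes.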

Publication version: Accepted or Published Version
Details

Category: Journal publication
Type: journal articles
Published in: Informatica no. 32, pages 441 - 475
ISSN: 0868-4952
Language: English
Publication year: 2021
Bibliographic description: Bulavas V., Marcinkevičius V., Rumiński J.: Study of Multi-Class Classification Algorithms’ Performance on Highly Imbalanced Network Intrusion Datasets// INFORMATICA-LITHUAN -, (2021), pp. 441-475
DOI: 10.15388/21-infor457
Verification: Politechnika Gdańska
