ML Based Model for NIDS MSc Updated Presentation.v2.pptx

JamalHussainArman, 23 slides, Jun 06, 2024


M.Sc. Thesis Title: Machine Learning Based Model for Network Intrusion Detection
2024/6/6

Outline
- Introduction
- Research Gap
- Methodology of Study
- Attack Types
- Types of Intrusion Detection System and Approaches
- Machine Learning Types
- 10-Fold Cross-validation Training Model
- Applied ML Algorithms
- Performance metrics
- Results
- Conclusion and Future work
- References

Introduction
- Growing demand for cyber security and protection.
- Popularity of the Internet of Things.
- Confidentiality, integrity and availability of network data [1].
- Smart grids currently rely on data-driven technology [2].
- Traditional defenses: firewalls and encryption.
- For Internet-based cyber-attacks, an intrusion detection system (IDS) is a better fit.
- Global ransomware damage costs were projected to exceed $20 billion by 2021 [3].
- Global cybercrime costs are expected to reach $10.5 trillion annually by 2025 [4].

Research Gaps
1. Traditional methods like firewalls and encryption
2. Usage of old datasets
3. Only one evaluation metric, i.e. accuracy
4. Problems arising in traditional techniques
   4.1 Low accuracy, high error, high false positive rate or low precision
5. Feature reduction

Proposed Solutions
- Adoption of modern methods like Machine Learning
- Usage of up-to-date datasets
- Multiple evaluation metrics considered
- Design of an ML-based model for intrusion detection.
- The priority is an optimal intrusion detection model that does not change the network data.

Methodology (Flowchart of the study)

Attack Types
- Probe
  - Scanning of the system
  - Low-level attack
- Denial of Service
  - Prevents authorized users from gaining access
  - Continuously engages the system
- Remote to User
  - Abuses the privileges of a system
  - Exploits vulnerabilities within the network
- User to Root
  - Attacker or a genuine person with minimal or normal privileges
  - Attacker looks for system flaws

Types of Intrusion Detection System
- Intrusion detection system (IDS)
  - Analyzes and monitors the traffic
  - Detects malicious activity
  - Protects the computer network
- Host Based IDS
  - HIDS relies on a single system
  - Keeps an eye on a host's internal environment
  - Resources, file system, and programs
- Network Based IDS
  - Spans the network rather than a single host
  - Monitors inbound and outbound traffic patterns [12]

Types of Intrusion Detection Approaches
- Misuse/Signature Based
  - Attacks have signatures
  - File fingerprints
  - Compares the signatures of every activity
- Anomaly Based
  - Looks for unexpected activity
  - Differs from the normal operational baseline
  - Likelihood of detecting novel (zero-day) threats
  - Preferable
- Hybrid Based
  - Detection rate of zero-day attacks rises
  - Combination of the signature and anomaly approaches
  - Generally deployed as a hybrid arrangement
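
The three approaches above can be illustrated with a minimal sketch. It is not the method used in the thesis: the signature set, the single numeric traffic feature, and the 3-sigma threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

# Hypothetical signature database: byte patterns assumed to indicate attacks.
SIGNATURES = {b"/etc/passwd", b"' OR 1=1"}

def signature_alert(payload: bytes) -> bool:
    """Misuse/signature detection: flag payloads containing a known pattern."""
    return any(sig in payload for sig in SIGNATURES)

def fit_baseline(normal_values):
    """Learn a simple baseline (mean, standard deviation) from normal traffic."""
    return mean(normal_values), stdev(normal_values)

def anomaly_alert(value, baseline, k=3.0):
    """Anomaly detection: flag values more than k standard deviations
    away from the normal operational baseline."""
    mu, sigma = baseline
    return abs(value - mu) > k * sigma

def hybrid_alert(payload, value, baseline):
    """Hybrid detection: raise an alert if either detector fires."""
    return signature_alert(payload) or anomaly_alert(value, baseline)

# Example: packets-per-second observed during normal operation (made-up numbers).
baseline = fit_baseline([100, 110, 90, 105, 95])
print(hybrid_alert(b"GET /index.html", 500, baseline))
```

In this sketch the request carries no known signature, so only the anomaly detector can catch it, which is the reason the slide lists anomaly detection as more likely to detect novel threats.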

Machine Learning Types
[Figure 2. Machine Learning Types]

Datasets
- NSL-KDD
  - Binary class dataset: either normal or anomaly
  - Reference for NIDS performance evaluation
  - Updated version of KDD Cup 99
  - No redundant or duplicate records
- UNSW-NB
  - Binary class dataset
  - Contains different types of new attacks
  - Includes a number of real-time normal activities
  - Available from the Australian Centre for Cyber Security website
- Kaggle
  - Multiclass dataset
  - 5 categories: normal, denial of service, r2l, probe and u2r [13]

10-Fold Cross-Validation Training Model
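
The slide's figure is not part of this transcript, so the following is only a minimal sketch of how 10-fold cross-validation partitions a dataset. The function names, the fixed shuffle seed, and the `evaluate` callback are assumptions made for illustration; whether the thesis used stratified folds is not stated.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_validate(n_samples, evaluate, k=10):
    """Average a per-fold score. `evaluate(train_idx, test_idx)` is a
    hypothetical callback that trains on one part and scores on the other."""
    scores = [evaluate(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

Each record is used for testing exactly once and for training in the other nine rounds, and the reported score is the average over the ten rounds.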

Applied ML Algorithms
- Random Forest
  - Supervised learning technique
  - Handles regression and classification problems
  - Combines a number of decision trees
  - Lowers the chance of overfitting
  - No feature scaling needed
  - Effective on big databases.
  - It is an ensemble classification technique based on the decision tree (DT) algorithm that produces individual trees as output.
  - The algorithm combines random feature selection with the bagging concept to generate a set of DTs with controlled variance [14]
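
The combination of bagging, random feature selection, and voting described above can be sketched in a few lines. This is a deliberately simplified illustration, not the implementation used in the thesis: each "tree" here is a one-split decision stump on a single randomly chosen feature, whereas a real Random Forest grows full trees and samples features at every split. All function names are hypothetical.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature):
    """Fit a one-split tree on one feature: pick the threshold that
    misclassifies the fewest training rows."""
    majority = Counter(labels).most_common(1)[0][0]
    best = (len(rows) + 1, None, majority, majority)  # errors, threshold, left, right
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return (feature,) + best[1:]

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bagging: bootstrap sample of the training set, drawn with replacement.
        sample = [rng.randrange(len(rows)) for _ in rows]
        # Random feature selection: each stump sees one randomly chosen feature.
        feature = rng.randrange(len(rows[0]))
        forest.append(train_stump([rows[i] for i in sample],
                                  [labels[i] for i in sample], feature))
    return forest

def predict_forest(forest, row):
    """Majority vote over the individual trees."""
    return Counter(predict_stump(s, row) for s in forest).most_common(1)[0][0]
```

Because every tree is trained on a different bootstrap sample and feature, individual trees disagree on borderline records, and the vote averages out that variance.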

Random Forest Architecture

Performance Metrics and Mathematical Forms
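
The formulas on this slide did not survive in the transcript. As a hedged stand-in, the sketch below uses the standard confusion-matrix definitions of the metrics named in the deck (accuracy, precision, TPR, FPR, error rate, MCC) for the binary case. The function name and the zero-denominator convention are assumptions, and the Error Rate values reported in the results tables may follow a different definition than the `(FP + FN) / total` assumed here. ROC Area is omitted because it needs ranked classifier scores rather than counts.

```python
from math import sqrt

def _ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def binary_metrics(tp, fp, tn, fn):
    """Compute standard metrics from true/false positive/negative counts."""
    total = tp + fp + tn + fn
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, total),    # (TP + TN) / all records
        "precision": _ratio(tp, tp + fp),      # TP / (TP + FP)
        "tpr": _ratio(tp, tp + fn),            # TP / (TP + FN), also called recall
        "fpr": _ratio(fp, fp + tn),            # FP / (FP + TN)
        "error_rate": _ratio(fp + fn, total),  # (FP + FN) / all records
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }

# Made-up counts for illustration only.
print(binary_metrics(tp=90, fp=10, tn=890, fn=10))
```

Reporting several of these together addresses the research gap noted earlier: accuracy alone can look high on an imbalanced dataset even when the false positive rate or MCC is poor.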

RF Technique Results
Performance Metrics   NSL-KDD   Kaggle    UNSW NB
Accuracy (%)          99.9174   99.8857   99.9053
Precision             0.999     0.999     0.999
TPR                   0.999     0.999     0.999
FPR                   0.001     0.001     0.001
Error Rate            0.0028    0.0014    0.0108
MCC                   0.998     0.998     0.998
ROC Area              1         1         1

NSL-KDD Dataset Results
Algorithm       Accuracy (%)  TPR    FPR    Precision  Error Rate  MCC     ROC Area
Random Forest   99.9174       0.999  0.001  0.999      0.0028      0.998   1
A1DE            99.7952       0.998  0.002  0.998      0.0022      0.996   1
Naïve Bayes     90.3813       0.904  0.101  0.905      0.0965      0.807   0.966
IBK/KNN         90.3813       0.904  0.101  0.905      0.0965      0.807   0.966
AdaBoostM1      94.5044       0.945  0.057  0.945      0.079       0.89    0.988
Random Tree     99.7658       0.998  0.002  0.998      0.0023      0.995   0.998
Decision Stump  92.215        0.922  0.079  0.922      0.1436      0.844   0.92
Hoeffding Tree  98.849        0.988  0.012  0.989      0.0161      0.977   0.995

Kaggle Dataset Results
Algorithm       Accuracy (%)  TPR    FPR    Precision  Error Rate  MCC     ROC Area
Random Forest   99.8857       0.999  0.001  0.999      0.0014      0.998   1
A1DE            99.792        0.998  0.001  0.998      0.0009      0.997   1
Naïve Bayes     83.3996       0.834  0.046  0.91       0.0665      0.786   0.966
IBK/KNN         99.665        0.997  0.002  0.997      0.0014      0.994   0.997
AdaBoostM1      83.1519       0.832  0.12   0.984      0.153       0.9677  0.952
Random Tree     99.7293       0.997  0.002  0.997      0.0011      0.996   0.998
Decision Stump  83.1519       0.832  0.12   0.984      0.1104      0.9677  0.882
Hoeffding Tree  97.2573       0.973  0.018  0.971      0.0152      0.954   0.989

UNSW-NB Dataset Results
Algorithm       Accuracy (%)  TPR    FPR    Precision  Error Rate  MCC     ROC Area
Random Forest   99.9053       0.999  0.001  0.999      0.0108      0.998   1
A1DE            99.4243       0.994  0.005  0.994      0.007       0.988   1
Naïve Bayes     76.8243       0.768  0.205  0.802      0.2335      0.572   0.864
IBK/KNN         98.721        0.987  0.013  0.987      0.0128      0.974   0.987
AdaBoostM1      99.3575       0.994  0.008  0.994      0.0373      0.987   0.999
Random Tree     99.2943       0.993  0.007  0.993      0.007       0.986   0.993
Decision Stump  76.6324       0.766  0.286  0.835      0.3287      0.579   0.738
Hoeffding Tree  96.6028       0.966  0.038  0.966      0.0529      0.932   0.981

Conclusion
- Random Forest gave the best performance
- Evaluated with 10-fold cross-validation
- 3 datasets: NSL-KDD, UNSW NB15 and Kaggle
- Compared against RT, A1DE, NB, KNN, AdaBoostM1, DS and HT
Future Directions
- Deep learning models
- Newer and real-time datasets

References
[1] K. NandhaKumar and S. Sukumaran, "A hybrid adaptive development algorithm and machine learning based method for intrusion detection and prevention system," Turkish J. Comput. Math. Educ., vol. 12, no. 5, pp. 1226–1236, 2021.
[2] S. N. Mohan, G. Ravikumar and M. Govindarasu, "Distributed Intrusion Detection System using Semantic-based Rules for SCADA in Smart Grid," 2020 IEEE/PES Transmission and Distribution Conference and Exposition (T&D), 2020, pp. 1–5.
[3] "Global ransomware damage costs to exceed $265 billion by 2031 - EIN Presswire." https://www.einnews.com/pr_news/542950077/global-ransomware-damage-costs-to-exceed-265-billion-by-2031 (accessed Jun. 03, 2022).
[4] "Cybercrime to cost the world $10.5 trillion annually by 2025." https://cybersecurityventures.com/cybercrime-damages-6-trillion-by-2021/ (accessed Jan. 17, 2022).
[5] M. Sarnovsky and J. Paralic, "Hierarchical intrusion detection using machine learning and knowledge model," Symmetry (Basel), vol. 12, no. 2, pp. 1–14, 2020.
[6] M. Shahzad Haroon and H. Mansoor Ali, "Adversarial training against adversarial attacks for machine learning-based intrusion detection systems," Comput. Mater. Contin., vol. 73, no. 2, pp. 3513–3527, 2022.
[7] S. A. Hussein, A. A. Mahmood and E. O. Oraby, "Network intrusion detection system using ensemble learning approaches," Webology, vol. 18, no. Special Issue, pp. 962–974, 2021.

References
[8] S. Razdan, H. Gupta and A. Seth, "Performance analysis of network intrusion detection systems using j48 and naive bayes algorithms," 2021 6th Int. Conf. Converg. Technol. I2CT 2021, pp. 1–7, 2021.
[9] Z. Ahmad, A. S. Khan, C. W. Shiang, J. Abdullah and F. Ahmad, "Network intrusion detection system: A systematic study of machine learning and deep learning approaches," Trans. Emerg. Telecommun. Technol., vol. 32, no. 1, pp. 1–29, 2021.
[10] M. Data and M. Aritsugi, "T-DFNN: An incremental learning algorithm for intrusion detection systems," IEEE Access, vol. 9, pp. 154156–154171, 2021.
[11] R. Panigrahi, S. Borah, A. K. Bhoi, M. F. Ijaz, M. Pramanik et al., "A consolidated decision tree-based intrusion detection system for binary and multiclass imbalanced datasets," Mathematics, vol. 9, no. 7, 2021.
[12] D. Chou and M. Jiang, "A survey on data-driven network intrusion detection," ACM Comput. Surv., vol. 54, no. 9, pp. 1–36, 2022.
[13] S. Lee, A. Abdullah, N. Jhanjhi and S. Kok, "Classification of botnet attacks in IoT smart factory using honeypot combined with machine learning," PeerJ Comput. Sci., vol. 7, pp. 1–23, 2021.
[14] Z. K. Maseer, R. Yusof, N. Bahaman, S. A. Mostafa, and C. F. M. Foozy, "Benchmarking of machine learning for anomaly based intrusion detection systems in the CICIDS2017 dataset," IEEE Access, vol. 9, pp. 22351–22370, 2021.
[15] P. Dini and S. Saponara, "Analysis, design, and comparison of machine-learning techniques for networking intrusion detection," Designs, vol. 5, no. 1, pp. 1–22, 2021.