
Enhancing network security through machine learning: A study on intrusion detection system using supervised algorithms
- 1 Virginia Polytechnic Institute and State University
- 2 Campbellsville University
- 3 Hebei Zhengzhong High School
* Author to whom correspondence should be addressed.
Abstract
Intrusion detection systems (IDS) have become a widely debated topic in cybersecurity. IDS fall broadly into two types: signature-based and anomaly-based. A signature-based IDS matches traffic against a collection of known network attacks to identify the specific attack the network is experiencing, while an anomaly-based IDS uses machine learning models to detect anomalies in network traffic that could indicate a potential attack. In this study, we concentrate on anomaly-based IDS and evaluate three supervised learning algorithms, Decision Tree (DT), Naive Bayes (NB), and K-Nearest Neighbor (KNN), testing each algorithm's performance to determine the most suitable one for each dataset based on its source. Our findings show that anomaly-based IDS is highly effective in enhancing network security, providing valuable insights for organizations looking to improve their security measures.
Keywords
Network Intrusion Detection, Decision Tree, Naive Bayes, K-Nearest Neighbor
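The per-dataset comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses a synthetic, imbalanced dataset as a stand-in for labeled network traffic. The helper name `compare_classifiers`, the 5-fold cross-validation, the accuracy metric, and all hyperparameters are illustrative assumptions, not the authors' experimental setup.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, folds=5):
    """Return the mean cross-validated accuracy of DT, NB, and KNN."""
    models = {
        "DT": DecisionTreeClassifier(random_state=0),
        "NB": GaussianNB(),
        # KNN is distance-based, so features are standardized first.
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    }
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


# Synthetic stand-in for labeled traffic (0 = benign, 1 = attack).
# This is NOT the data used in the study.
X, y = make_classification(
    n_samples=600,
    n_features=10,
    n_informative=5,
    weights=[0.8, 0.2],
    random_state=0,
)

scores = compare_classifiers(X, y)
best = max(scores, key=scores.get)  # highest mean accuracy wins

for name, acc in scores.items():
    print(f"{name}: mean accuracy = {acc:.3f}")
print("Best on this dataset:", best)
```

Running the same helper once per dataset yields one "best" algorithm for each, which mirrors the selection procedure the abstract outlines. On imbalanced traffic, accuracy alone can be misleading, so a metric such as F1 (`scoring="f1"`) may be a more informative choice.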
Cite this article
Huang, Z.; Li, Z.; Zhang, J. (2023). Enhancing network security through machine learning: A study on intrusion detection system using supervised algorithms. Applied and Computational Engineering, 19, 50-66.
Data availability
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
Disclaimer/Publisher's Note
The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
About volume
Volume title: Proceedings of the 5th International Conference on Computing and Data Science
© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish in this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see Open access policy for details).