Dimensionality Reduction Based Intrusion Detection System in Cloud Computing Environment Using Machine Learning
Abstract
The rapid expansion of cloud computing has been tempered by concerns about privacy and security. To address these concerns, intrusion detection systems (IDS) built on machine learning techniques are increasingly being deployed in cloud environments. However, computational cost and model complexity remain major challenges. To this end, the present study proposes a dimensionality-reduction-based IDS that minimizes computational cost in cloud computing environments. Using the CSE-CIC-IDS2018 dataset, which comprises about 16 million instances and covers a broad array of attack types, we applied three dimensionality reduction methods: Principal Component Analysis (PCA), Non-negative Matrix Factorization (NMF), and High Correlation Filter. The feature set was thereby narrowed down to 12 essential features, which include flow duration and rate metrics, packet count and size metrics in both forward and backward directions, inter-arrival time metrics for flows in both directions, TCP flag metrics, header size metrics, and bulk transfer metrics in both forward and backward directions. Machine learning models were then trained to classify each instance as either benign or attack. The classification models were, listed in order of performance, Gradient Boosting, Random Forest, Support Vector Machines (SVM), k-Nearest Neighbors (k-NN), and Logistic Regression. The study argues that dimensionality reduction not only simplifies the machine learning models but also reduces computational cost and the risk of overfitting, thereby improving the cost efficiency, computational efficiency, and reliability of intrusion detection systems in cloud computing.
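To make one of the reduction steps named in the abstract concrete, the following is a minimal sketch of a High Correlation Filter in plain Python. It is an illustration under stated assumptions, not the study's implementation: the abstract does not give a correlation threshold, so the 0.9 cutoff is an assumption, and the column names and synthetic values are hypothetical stand-ins rather than CSE-CIC-IDS2018 data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def high_correlation_filter(columns, threshold=0.9):
    """Keep a column only if its absolute correlation with every
    already-kept column is at or below `threshold` (assumed cutoff)."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Synthetic toy flows with hypothetical feature names (not real dataset rows).
rng = random.Random(0)
flow_duration = [rng.uniform(0, 100) for _ in range(200)]
fwd_packets = [rng.uniform(0, 50) for _ in range(200)]
toy_flows = {
    "flow_duration": flow_duration,
    # Exact rescaling of flow_duration, so it carries no new information.
    "flow_duration_ms": [1000 * d for d in flow_duration],
    "fwd_packets": fwd_packets,
    # Nearly linear in fwd_packets, with a little noise.
    "fwd_bytes": [60 * p + rng.gauss(0, 5) for p in fwd_packets],
    "bwd_iat_mean": [rng.uniform(0, 10) for _ in range(200)],
}

selected = high_correlation_filter(toy_flows, threshold=0.9)
print(selected)
```

In this toy example the two redundant columns are dropped and the three mutually uncorrelated ones are kept. A filter of this kind only removes pairwise linear redundancy; in a pipeline like the one described, PCA or NMF and the classifiers would then operate on the reduced columns.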