AI-Enabled System for Efficient and Effective Cyber Incident Detection and Response in Cloud Environments
Mohammed A. M. Farzaan, Mohamed C. Ghanem*, Ayman El-Hajjar
Abstract
The escalating sophistication and volume of cyber threats in cloud environments necessitate a paradigm shift in incident response strategies. Recognising the need for an automated and precise response to cyber threats, this research explores the application of AI and ML and proposes an AI-powered cyber incident response system for cloud environments. The system comprises Network Traffic Classification, Web Intrusion Detection, and post-incident Malware Analysis (built as a Flask application), and integrates with platforms such as Google Cloud and Microsoft Azure. The findings highlight the effectiveness of the Random Forest model, which achieves an accuracy of 90% for the Network Traffic Classifier and 96% for the Malware Analysis Dual Model application. The Random Forest model classifies cyber threats efficiently and robustly. Deep learning models significantly improve accuracy, and their resource demands can be managed using cloud-based TPUs and GPUs. Cloud environments provide a well-suited platform for hosting these AI/ML systems, while container technology provides both efficiency and scalability. These findings demonstrate the contribution of the AI-led system to a robust and scalable cyber incident response solution in the cloud.
Index Terms
Cyber Incident, Digital Forensics, Artificial Intelligence, Machine Learning, Cloud Security, Incident Response.
* Mohammed Chahine Ghanem is the corresponding author.
Mr M.A.M. Farzaan is with the Cyber Security Research Group, University of Westminster, London, UK. E-mail: w1945035@westminster.ac.uk
Dr M.C. Ghanem is with the Department of Computer Science, University of Liverpool, Liverpool, UK. E-mail: mohamed.chahine.ghanem@liverpool.ac.uk
Dr A. El-Hajjar is with the Cyber Security Research Group, University of Westminster, London, UK. E-mail: a.elhajjar@my.westminster.ac.uk
I. INTRODUCTION
In recent years, the proliferation of cyber attacks targeting organisations across various industries has underscored the critical need for robust incident response capabilities. According to the UK government’s Cyber Security Breaches Survey 2023 [1], a significant percentage of businesses and charities have experienced breaches or attacks, yet adoption rates of formal Incident Response capabilities remain alarmingly low. Consequently, there is a pressing demand for organisations to invest in Incident Response capabilities to safeguard against data breaches and cyber threats. Notably, organisations with well-tested Incident Response capabilities and high levels of AI and ML integration for threat detection and response reported substantially lower data breach costs, as highlighted by IBM’s Cost of a Data Breach Report 2022 [2]. This demonstrates how essential it is for organisations to embrace AI and ML technologies to bolster their cybersecurity posture.
This study investigates how AI contributes to cybersecurity and explores the potential of applying it in cloud environments to address the associated security challenges. It proposes a novel system that leverages AI and ML techniques to enhance cybersecurity within cloud environments. The proposed system includes three main components: a network traffic classifier,
a web intrusion detection system (WIDS), and a malware analysis system.
The network traffic classifier captures network traffic in real time and analyses ongoing activity for anomalies that may indicate malicious behaviour. The NSL-KDD dataset [23], a benchmark for network traffic analysis, serves as the foundation for training and evaluating the classifier. By classifying incoming traffic in real time based on this rich feature set, the classifier enables prompt identification and mitigation of potential cyber-attacks.
The Web Intrusion Detection System (WIDS) focuses on detecting suspicious behaviour in web traffic to prevent unauthorised access. It does so by extracting informative features from standard HTTP server logs. The key innovation of this design lies in its real-time deployment and distributed data collection using lightweight agents on web servers, which keeps log collection efficient and minimises the impact on individual servers. Anomaly detection employs the Isolation Forest algorithm, which is effective on the high-dimensional datasets commonly encountered in security applications. To reduce false positives, the application triggers an alert only when the number of detected anomalies exceeds a predefined threshold, based on the assumption that real-world attacks often involve rapid bursts of activity.
The Malware Analysis system streamlines the process of analysing suspicious files to determine whether they are malicious. It first extracts string features from training binaries and then uses them to train a model. To mitigate false positives, the system adopts a dual-model architecture: a Random Forest model serves as the primary model and is complemented by a secondary Keras/TensorFlow deep learning model. Both models were trained on a comprehensive dataset obtained from VirusTotal.com.
The system follows a logical flow in which uploaded files undergo initial processing and classification. If the primary model predicts a high likelihood of malicious content, the file is classified as "Malicious" and a detailed analysis report is generated. For files with an uncertain classification, the secondary deep learning model is invoked for a more precise prediction.
This paper explores the practical application of AI techniques in digital forensics, with a specific focus on developing an AI-enabled Cyber Incident Investigations Framework tailored for deployment in cloud environments. By leveraging AI and ML, this research seeks to make digital forensics processes more effective and efficient, enabling organisations to better detect, analyse, and mitigate cyber threats in cloud infrastructures. The work examines three distinct AI and ML applications of digital forensics: Network Traffic Classification, Web Intrusion Detection, and Malware Analysis. These applications are integrated within leading cloud platforms such as Google Cloud and Microsoft Azure to support forensic operations.
The findings shed light on several aspects of AI-driven digital forensics. Firstly, Random Forest proves well suited to classification tasks, demonstrating robust performance in distinguishing between network behaviours and identifying potential threats. Secondly, the integration of deep learning models in Malware Analysis shows potential for improved accuracy in digital forensics tasks. Thirdly, the research underscores the effectiveness and scalability of cloud environments as hosting platforms for AI and ML systems. By harnessing the computational power and flexibility of cloud infrastructures, organisations can significantly enhance their digital forensics capabilities and overcome the constraints of traditional on-premises solutions.
Additionally, the exploration of container technology underscores its pivotal role in the deployment and scalability of AI and ML-driven digital forensics systems within cloud environments. The agility and resource efficiency of containerisation offer compelling advantages for organisations seeking to streamline their forensic operations and adapt dynamically to evolving cyber threats. In summary, this research presents a novel and pragmatic approach to combating cybercrime in cloud environments by combining Artificial Intelligence with cloud resources. By bridging the gap between cutting-edge AI technologies and the demands of digital forensics, the proposed AI-led DFIR system represents a step towards strengthening organisational resilience against cyber threats.
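Two of the decision rules described in this section, the WIDS burst threshold and the two-stage malware triage, can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the window size, the anomaly threshold, both probability cut-offs (including the benign cut-off, which the text does not mention), and the function names `burst_alerts` and `triage` are hypothetical, and the secondary deep learning model is represented by any callable that returns a malicious-class score.

```python
from collections import deque

import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier

# Hypothetical settings; the paper does not state these values.
ANOMALY_THRESHOLD = 5     # anomalies needed inside one window before alerting
WINDOW_SIZE = 20          # number of most recent requests considered
MALICIOUS_CUTOFF = 0.80   # primary-model probability treated as "high likelihood"
BENIGN_CUTOFF = 0.20      # assumed: below this the primary model is trusted as benign


def burst_alerts(detector, feature_rows, window=WINDOW_SIZE, threshold=ANOMALY_THRESHOLD):
    """Yield row indices at which the anomaly count in the sliding window
    exceeds the threshold, so isolated outliers do not raise alerts."""
    recent = deque(maxlen=window)
    for i, row in enumerate(feature_rows):
        # IsolationForest.predict returns -1 for anomalies and 1 for inliers.
        is_anomaly = detector.predict(np.asarray(row).reshape(1, -1))[0] == -1
        recent.append(bool(is_anomaly))
        if sum(recent) > threshold:
            yield i


def triage(primary, secondary_predict, features):
    """Return (verdict, deciding_stage). The primary model decides when it is
    confident; otherwise the secondary model is consulted."""
    # Assumes binary labels where class index 1 means malicious.
    p_malicious = primary.predict_proba(features.reshape(1, -1))[0, 1]
    if p_malicious >= MALICIOUS_CUTOFF:
        return "Malicious", "primary"
    if p_malicious <= BENIGN_CUTOFF:
        return "Benign", "primary"
    verdict = "Malicious" if secondary_predict(features) >= 0.5 else "Benign"
    return verdict, "secondary"


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Synthetic stand-in for features extracted from HTTP server logs.
    normal = rng.normal(0.0, 1.0, size=(500, 4))
    detector = IsolationForest(random_state=0).fit(normal)
    burst = rng.normal(8.0, 1.0, size=(30, 4))
    stream = np.vstack([normal[:50], burst])
    print("first alert positions:", list(burst_alerts(detector, stream))[:3])

    # Synthetic stand-in for string features extracted from binaries.
    X = rng.normal(size=(200, 6))
    y = (X[:, 0] > 0).astype(int)  # 1 = malicious (synthetic labels)
    primary = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
    secondary = lambda f: float(f[0] > 0)  # placeholder for the deep learning model
    print(triage(primary, secondary, X[0]))
```

Counting anomalies over a sliding window is only one way to express "the number of detected anomalies exceeds a predefined threshold"; a time-based window over log timestamps would serve the same purpose.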
II. RELATED WORK
This section synthesises research related to digital forensics and incident response systems in cloud environments, as well as the integration of Artificial Intelligence (AI) and Machine Learning (ML) within these domains. The selected papers shed light on various methodologies, frameworks, and technologies aimed at enhancing cyber forensic capabilities and addressing emerging challenges in cloud computing security.
A. Incident Response and Investigation in Cloud Environments
Several works have addressed incident detection and response in cloud environments. Stelly and Roussev [8] introduced SCARF, a container-based software framework designed to enable digital forensic processing at cloud scale. The paper advocates containers as a solution to critical issues in digital forensics and offers practical insights into the integration capabilities and performance considerations of the solution. However, the absence of experiments in cloud environments limits the assessment of SCARF’s full potential. Hemdan and Manjaiah [10] presented a Cloud Forensics Investigation model centred on Digital Forensics as a Service (DFaaS), emphasising the deployment of a forensics server within cloud service providers’ infrastructures. While the proposed model’s performance and features look promising, its reliance on proprietary cloud environments restricts its applicability to public cloud deployments. Dykstra and Sherman [11] introduced FROST, a trusted digital forensics tool designed specifically for the OpenStack cloud computing platform. A noteworthy feature of FROST is its focus on evidence integrity: it enables the reliable acquisition of virtual disks and API logs. However, its compatibility is limited to OpenStack platforms, which presents challenges for investigations spanning diverse cloud infrastructures. Edington and Kishore [20] proposed a comprehensive forensics framework for cloud computing featuring a central forensics server and an external forensics monitoring plane. While the framework addresses key challenges in cloud forensics, its on-premise resource approach and lack of deployment in actual cloud environments necessitate further validation. Pătraşcu and Patriciu [21] suggested a secure framework focused on monitoring user activity in cloud environments, with a modular architecture tailored for KVM virtualisation technology. Although the framework offers insights into securing cloud environments, its narrow focus on KVM may limit its applicability in heterogeneous cloud infrastructures.
B. AI and ML in Digital Forensics and Incident Response
Several studies have proposed ways to integrate AI and ML techniques into the DFIR process. Temechu and Anteneh [12] proposed a hybrid Machine Learning (ML) approach for anomaly detection in IoT and cloud environments that uses Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) to address security threats. The authors acknowledged the limitations of traditional CNNs on small datasets and proposed Bayesian CNNs for improved performance. They emphasised the importance of large training datasets and suggested using datasets from sources such as CAIDA and Packt. Their solution involves data pre-processing, feature extraction using a CNN, and classification using an SVM, with the option of incorporating entropy-based anomaly detection for further improvement. While this research offers a promising direction with its exploration of Bayesian CNNs, it would benefit from addressing how the chosen ML algorithms handle zero-day attacks and the computational demands of complex models on resource-constrained IoT devices. Furthermore, the paper lacks details on the evaluation methodology used to assess the effectiveness of the proposed solution.
Baptista et al. [29] proposed a new approach to malware detection that combines machine learning with a method of visualising malware as images. While the reported accuracy for specific file types such as PDF and DOC appears promising, a thorough critical evaluation is still required. Firstly, the effectiveness against malware formats beyond PDF and DOC is unclear. Secondly, the generalisability of the self-organising neural network to unknown malware requires more exploration. Finally, the computational cost of image visualisation for real-time applications must be addressed. Overall, the approach holds merit, but broader testing and efficiency analysis are needed for a more comprehensive evaluation.
Al Balushi et al. [26] addressed the growing importance of ML in digital forensics, highlighting its potential to streamline investigations overwhelmed by vast amounts of digital evidence. Their paper investigated how ML techniques can automate tasks, improve accuracy, and expedite the forensic process, using different algorithms suited to different forensic scenarios. However, a deeper examination of specific algorithms, their implementation mechanisms, and their strengths and weaknesses for different tasks would strengthen the analysis. Du et al. [25] investigated the application of AI in digital forensics, emphasising its potential to tackle the backlog of cases caused by the ever-increasing volume of digital evidence. They explored how AI-based tools can automate evidence processing, thereby expediting investigations and increasing case throughput. The paper discussed challenges and future directions for AI in various digital forensic domains. However, a more detailed analysis of the specific AI techniques and their limitations in different forensic tasks could offer deeper insights.
Qadir et al. [16] highlighted the crucial role of machine learning in addressing challenges in digital forensics, proposing applications such as link analysis and fraud detection. Despite its insightful analysis, the paper lacked empirical validation of the proposed techniques. It also overlooked potential drawbacks of using machine learning in this context, such as the substantial amount of training data required and the possibility of bias within the algorithms themselves. Overall, the paper provides a springboard for exploring the potential of machine learning in digital forensics. Hilmand et al. [17] conducted a survey of ML applications in digital forensics, offering insights into various algorithms employed for tasks such as access control and image distortion detection. The authors discussed various applications of ML in the field without examining the specific strengths and weaknesses of each. The paper also did not address potential drawbacks of using ML, such as algorithm overheads and inherent biases.
Rughani [19] proposed a digital forensics framework that leverages AI to enhance tool performance and minimise user interaction. However, it remains unclear how the framework would handle entirely new types of cybercrime that are not represented in its training data. While the suggested framework shows potential as a viable solution, it still needs an in-depth evaluation and validation of the results it claims to achieve. Dunsin et al. [14] developed a multi-agent framework for digital investigations, showing reduced time for evidence file integrity checks. Despite promising results, the framework would benefit from validation in diverse cloud environments. In another study, Dunsin et al. [13] provided a thorough examination of AI and ML applications in digital forensics, summarising the contributions, drawbacks, and impacts of existing research.
The reviewed literature shows the growing significance of AI and ML in enhancing digital forensics and incident response capabilities, while highlighting the need for empirical validation and practical implementations to realise their full potential in cloud environments.
III. METHODOLOGY
In this section, we present the methodology employed in this research and outline the systematic approach utilised to achieve the study’s objectives. This section covers the design, development and deployment of the system in detail.
A. System Design and Development
1) Overview: Our research proposes a novel AI-powered system with a three-tier architecture designed for efficient cyber threat detection and investigation. This architecture leverages containerisation technology to isolate and deploy functionality across three distinct environments: Production, Honeypot, and DFIR, as illustrated in Figure 1. The Production environment securely hosts the critical infrastructure needed by the customer, ensuring the integrity, availability, and confidentiality of production data. It also securely mirrors network traffic to the DFIR environment VPC for analysis by the AI models. The Honeypot environment, a core component of our system’s innovation, uses a T-Pot honeypot to attract and deceive attackers. This deception facilitates the collection of valuable training data for our continuously learning AI models.
The DFIR environment acts as the central hub for analysis. It houses a suite of security applications built on trained models that perform predictions on new data points, including a network attack classifier that classifies network traffic in real time. A Web Intrusion Detection System (WIDS) analyses web server logs collected from the production environment for anomalies. A storage bucket monitor in the DFIR environment uses the Malware Analysis system (hosted on a separate subnet) to perform static analysis of suspicious files. Subnet 3, considered the nerve centre, hosts the ELK stack (Elasticsearch, Logstash, and Kibana) for centralised storage and analysis of the logs generated by the ML models. These logs, enriched with insights from both the production and honeypot environments, enable analysts to identify patterns and anomalies that might indicate potential threats. The final subnet, Research and Development, acts as a bridge between the honeypot environment and the rest of the system. Labelled training data from the honeypot and the computing power provided by the cloud support a continuous model training and deployment pipeline, keeping our AI models up to date with evolving threats.
This research explores different approaches and methods for addressing security problems by using AI/ML as a core defence mechanism. Our proposed system architecture, which was initially deployed on the Google Cloud platform as depicted in Figure 2, contributes to this goal by enabling efficient data flow and following a framework for a complete end-to-end AI system. Figure 3 depicts the data analysis framework for end-to-end AI system construction that served as the guiding structure for this research. This ten-step process, which begins with defining the business problem and culminates in the deployment of the trained model, was used to develop the AI system presented here. Following data selection and collection, the framework emphasises data pre-processing and feature engineering to prepare the data for model training and evaluation. An iterative loop was adopted in which model performance was assessed and, where necessary, earlier stages of the framework were revisited for refinement. The subsequent sub-sections also investigate techniques for deploying AI applications within a secure and efficient architecture. Table VIII provides a summary of the specific security problems addressed and the corresponding algorithms employed.
TABLE I: Summary of related works.
Reference | Data Source | Technique Used | Approach
Stelly & Roussev [9] | Experimental data | (1) Containerisation is used to encapsulate individual executable modules (2) ExifTool and OpenNSFW are used as worker modules | Propose a container-based software framework that integrates existing forensics tools into a processing pipeline as worker modules.
Nanda and Hansen [18] | Cloud resources | (1) Forensics as a Service (2) VM snapshots | Implement a Forensics as a Service (FaaS) solution, enabling digital forensics to be conducted efficiently through a cloud-based forensic server.
Dykstra & Sherman [11] | Virtual disks, API logs | OpenStack cloud | Suggest a set of three novel forensic tools designed for the OpenStack cloud platform, ensuring trustworthy acquisition of virtual disks, API logs, and guest firewall logs.
Philip et al. [27] | DNS logs | (1) Multi-agent system (2) Decentralised model | Propose a multi-agent model for forensics investigation in domains where devices are often distributed across a wide area.
Rughani [19] | Disk images | Acquisition, analysis and presentation of data for forensics | Introduce a framework aimed at optimising speed and performance in investigating cyber crimes and minimising user interactions.
Baptista et al. [29] | Malicious and benign files | (1) Malware detection based on binary visualisation (2) Neural networks | Describe a new approach to malware detection that combines machine learning with a creative method of visualising malware as images.
Temechu et al. [12] | Log files from CAIDA and Packt | Data pre-processing, feature extraction using CNNs, and classification using SVMs | Suggest a hybrid Machine Learning (ML) approach for anomaly detection in IoT and cloud environments using Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) to address security threats.
Our Model | Network traffic, HTTP server logs, .exe files | (1) Real-time feature engineering for classification (2) Docker containers and Kubernetes in cloud environments (3) TensorFlow deep learning model to reduce false positives | Propose and evaluate a system with multiple applications deployed to defend against cyber threats and respond to incidents. The system can handle large amounts of data by scaling and can predict with higher accuracy.
2) The network traffic classifier: Securing critical infrastructure is paramount, and network security plays a vital role in this endeavour. Attackers often exploit vulnerabilities within network systems, making network traffic analysis a crucial tool for defence. Network traffic is a rich source of data, containing valuable information about ongoing network activity. To effectively identify and mitigate potential threats, we propose the development of a network traffic classifier. Algorithm 1 illustrates the functioning of Network Traffic Classification. This classifier will leverage real-time network
Algorithm 1 Network Traffic Classification using Random Forest
Require: Training data: Network traffic dataset $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$
 1: where $x_i$ is a feature vector and $y_i$ is the attack type label.
Ensure: Traffic Classification model $M$
 2: Preprocess data:
 3:   Read data from CSV files with specified column names.
 4:   Drop irrelevant features (e.g., flags, protocols, services).
 5:   Separate features $(X)$ and labels $(y)$.
 6:   Split data into training and testing sets: $(X_{train}, y_{train})$, $(X_{test}, y_{test})$.
 7:   Standardize features using StandardScaler.
 8: Build Random Forest model $M$:
 9:   Define a Random Forest classifier with a desired number of estimators (e.g., 100).
10:   Set random state for reproducibility (e.g., 42).
11: Train model $M$ on $(X_{train}, y_{train})$.
12: Evaluate model $M$ on $(X_{test}, y_{test})$ using metrics (e.g., accuracy, classification report).
13: return Trained Traffic Classification model $M$
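The steps of Algorithm 1 map closely onto scikit-learn. The sketch below is one possible rendering under stated assumptions, not the authors' code: the dropped column names, the label column name, the 80/20 split, and the helper names `train_traffic_classifier` and `synthetic_traffic` are hypothetical, and a small synthetic DataFrame stands in for the NSL-KDD CSV files, which would normally be loaded with `pandas.read_csv(path, names=column_names)`.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Columns dropped in step 4; these names are illustrative placeholders.
DROPPED_COLUMNS = ["protocol_type", "service", "flag"]


def train_traffic_classifier(df, label_column="label", test_size=0.2):
    """Steps 4 to 12 of Algorithm 1, applied to an already-loaded DataFrame."""
    # Steps 4-5: drop irrelevant features, then separate features and labels.
    features = df.drop(columns=[label_column] + DROPPED_COLUMNS, errors="ignore")
    labels = df[label_column]

    # Step 6: split into training and testing sets (split ratio is assumed).
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=test_size, random_state=42, stratify=labels
    )

    # Step 7: standardise, fitting the scaler on the training data only.
    scaler = StandardScaler().fit(X_train)

    # Steps 8-11: define and train the Random Forest.
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(scaler.transform(X_train), y_train)

    # Step 12: evaluate on the held-out set.
    predictions = model.predict(scaler.transform(X_test))
    report = classification_report(y_test, predictions, zero_division=0)
    return model, scaler, accuracy_score(y_test, predictions), report


def synthetic_traffic(n_rows=600, seed=0):
    """Generate a toy stand-in for a labelled traffic dataset."""
    rng = np.random.default_rng(seed)
    labels = rng.choice(["normal", "dos", "probe"], size=n_rows)
    shift = pd.Series(labels).map({"normal": 0.0, "dos": 3.0, "probe": -3.0}).to_numpy()
    return pd.DataFrame({
        "duration": rng.normal(shift, 1.0),
        "src_bytes": rng.normal(shift * 2, 1.0),
        "dst_bytes": rng.normal(0.0, 1.0, size=n_rows),
        "protocol_type": rng.choice(["tcp", "udp"], size=n_rows),
        "service": rng.choice(["http", "ftp"], size=n_rows),
        "flag": rng.choice(["SF", "REJ"], size=n_rows),
        "label": labels,
    })


if __name__ == "__main__":
    model, scaler, accuracy, report = train_traffic_classifier(synthetic_traffic())
    print(f"accuracy on synthetic data: {accuracy:.3f}")
    print(report)
```

The function returns the fitted scaler together with the model because the same scaling has to be applied to live traffic before prediction. Accuracy on the synthetic data says nothing about performance on NSL-KDD.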
TABLE II: Performance Comparison of Algorithms for Traffic Classification
Algorithm | Accuracy
Random Forest | 90.92%
Logistic Regression | 85.75%
Decision Tree | 87.13%
KNN | 85.34%
Naive Bayes | 51.58%
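A comparison of the kind summarised in Table II can be set up by training each candidate on the same split and scoring it on the same held-out data. The sketch below is illustrative only: the hyperparameters, the choice of Gaussian Naive Bayes, the split ratio, and the helper name `compare_classifiers` are assumptions, and it runs on synthetic data, so its scores are unrelated to the figures in Table II.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Candidate algorithms named in Table II; all settings here are assumptions.
CANDIDATES = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "KNN": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
}


def compare_classifiers(X, y, test_size=0.2, seed=42):
    """Train every candidate on the same split and return test accuracy per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    scores = {}
    for name, estimator in CANDIDATES.items():
        # A fresh clone per run keeps the shared CANDIDATES objects unfitted.
        pipeline = make_pipeline(StandardScaler(), clone(estimator))
        pipeline.fit(X_train, y_train)
        scores[name] = pipeline.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    # Synthetic multi-class data standing in for the traffic dataset.
    X, y = make_classification(
        n_samples=1000, n_features=20, n_informative=8, n_classes=4, random_state=0
    )
    for name, score in sorted(compare_classifiers(X, y).items(), key=lambda kv: -kv[1]):
        print(f"{name:<20} {score:.4f}")
```

Using one fixed split for all candidates keeps the comparison like-for-like; cross-validation would give more stable estimates at a higher compute cost.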