2.1. Intrusion Detection Systems
There are three popular approaches to protecting networks. The first is to detect an attack by setting up alert triggers, for example, once a threshold is exceeded. Such a detection strategy informs administrators that an attack is occurring but does not prevent it. The second is to prevent an attack by deploying protection mechanisms that stop it from occurring; this raises a problem if a legitimate action is deemed illegitimate, resulting in a loss of service availability. The final approach is to block an attack, which entails putting protections in place retrospectively so that the attack is prevented or detected when it next occurs. In the latter two approaches, the IDS is configured as an Intrusion Prevention System (IPS) [5].
There are two locations in which an IDS can be placed, depending on its source of information: sensors can be placed on a host system (HIDS) or on the network (NIDS) [10,22]. A HIDS monitors a single host's activity, such as running processes, system calls, or code in memory [3,10,22]. A NIDS monitors network traffic by performing packet inspection to check protocol usage and services, or the IP addresses of communication partners [3,10,22]. A NIDS is commonly placed at the gateway for ingress and egress traffic at the organisation's network perimeter, making it possible to observe all traffic from external adversaries [23]. An IDS is an essential tool for identifying intrusions; therefore, one of the most widespread problems in cybersecurity is creating sophisticated methods and generating effective rules so that a NIDS is beneficial [1]. An IDS's success is often rated according to its accuracy when tested on controlled data of known legitimate and illegitimate activity. The accuracy is computed from four counts [24,25], which are used in the calculation seen in Equation (1). We have a True Positive (TP) when an attack is correctly identified and a True Negative (TN) when a benign incident is correctly identified. When a benign incident is identified as an attack, we have a False Positive (FP), and when an attack is identified as a benign incident, we have a False Negative (FN).
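For reference, the standard definition of accuracy in terms of these four counts, which is the form Equation (1) takes, is:

\[
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
\]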
An IDS can be classified based on its detection method into: (i) knowledge-based IDS, where detection is based on static knowledge, (ii) behaviour-based IDS, where detection is based on dynamically monitored behaviour, and (iii) hybrid IDS, which is a combination of (i) and (ii) [10]. More specifically, a knowledge-based IDS, also known as signature-based, is a system in which alerts are triggered based on a list of pre-determined knowledge, such as hashes, sequences of bytes, etc. A behaviour-based IDS, also known as anomaly-based, operates on the principle of maintaining a model of normal activity and triggering alerts when anomalous activity is detected. Creating a baseline of normal activity can be inaccurate if malicious activity is captured within the baseline. Moreover, a representative baseline of normal activity is unlikely to be achievable in an ever-changing and expanding network [5,26]. A behaviour-based IDS is capable of detecting zero-day attacks, which exploit vulnerabilities that have not yet been disclosed to the public [10]. However, because they alert on every deviation from normal activity, behaviour-based IDSs are known to produce high False Positive Rates (FPRs) [10,26]. Finally, a hybrid IDS makes use of both knowledge-based and behaviour-based detection and can therefore detect both zero-day and known attacks [10], as the toy sketch below illustrates for the first two principles.
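To make the distinction concrete, the following toy sketch contrasts the two detection principles; the signature value, baseline, and threshold are hypothetical and purely illustrative:

```python
# Toy sketch contrasting the two detection principles; the signature list,
# baseline values, and threshold k are hypothetical, not from any real IDS.
import hashlib
import statistics

KNOWN_BAD_HASHES = {"44d88612fea8a8f36de82e1278abb02f"}  # hypothetical signature list

def signature_alert(payload: bytes) -> bool:
    """Knowledge-based: alert when the payload hash matches a known signature."""
    return hashlib.md5(payload).hexdigest() in KNOWN_BAD_HASHES

def anomaly_alert(observed_rate: float, baseline: list, k: float = 3.0) -> bool:
    """Behaviour-based: alert when the observed rate deviates k std-devs from baseline."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return abs(observed_rate - mu) > k * sigma
```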
2.2. Internet of Things Security
The IoT market is populated with a myriad of device types. The purpose of the IoT is to interconnect physical objects seamlessly with the information network. The IoT thrives on heterogeneity, which enables the different devices to connect. As IoT devices are often manufactured using different hardware, specific protocols and platforms must be created for these devices [23]. On the other hand, heterogeneity limits security advancements because there is no universal international standard. Furthermore, because of the lack of security consideration, there is some reluctance to deploy IoT devices, as both businesses and independent parties are unwilling to risk their security [3].
As IoT attacks are increasing in popularity, and the significance of the data is high (e.g., within the health sector or critical infrastructure), the incentive to improve IoT security is growing. However, under profit-driven business models in a competitive market, IoT security is often treated as an afterthought. Data theft is regarded as a high-impact risk; however, some IoT data, such as temperature readings, may be considered trivial unless their integrity is manipulated so that deviations fail to trigger alerts. Until now, the most challenging attacks in the IoT field have been botnet attacks [6,14]. A botnet consists of numerous zombie devices controlled by an attacker. A large-scale botnet enables a full-scale Denial of Service (DoS) attack to be carried out, as with the Mirai and Bashlite botnets [6]. Moreover, botnets can be used for information-gathering attacks such as port scanning and port probing, which gather information about running services on their targets [6].
Since IoT networks pose many challenges and introduce new network varieties, a traditional NIDS can no longer act independently. The importance of protecting the network lies in the pursuit of privacy. Data privacy seeks to accomplish two things: firstly, to maintain data confidentiality, and secondly, to uphold anonymity by making it impossible to correlate data with their owner [27].
2.3. Machine Learning Models and Examples
To help behaviour-based IDSs reduce the high FPRs mentioned in Section 2.1, researchers have demonstrated success using machine learning IDS models [6]. Since a behaviour-based IDS does not rely on static knowledge, it can cope with the ever-growing network of devices and numerous zero-day attacks [10,28]. Machine learning models are trained on a preferably large dataset, with the intention that the individual models recognise patterns within the data and decide whether an activity is malicious or not.
Machine learning models are primarily divided into three groups: firstly, supervised learning; secondly, unsupervised learning; and finally, semi-supervised learning [10,28]. Supervised learning models are trained using labelled data, where the outcome of each sample is known; a label might classify network traffic as benign or as an attack. On the other hand, unsupervised learning occurs when a model is trained purely on unlabelled data, the aim being for the model to recognise patterns within the data in order to classify inputs. Semi-supervised learning uses a mixture of the two, meaning that some data is labelled and the rest is not; it can be useful when a dataset would be excessively time-consuming to label exhaustively.
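As a minimal illustration of the difference between the first two groups (placeholder data and model choices, not those used in this work):

```python
# Minimal sketch contrasting supervised and unsupervised learning on the same
# placeholder feature matrix; the data and model choices are illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))   # placeholder network-flow features
y = (X[:, 0] > 0).astype(int)    # placeholder labels: 0 = benign, 1 = attack

clf = LogisticRegression().fit(X, y)                        # supervised: uses labels
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)   # unsupervised: ignores labels
```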
Deep learning is a subset of machine learning that intends to imitate the structure of the human brain through the use of neural networks. Deep learning has proven successful with regard to image recognition, speech processing, and vehicle crash prediction, among other areas [29]. More recently, deep learning has shown advances in intrusion detection compared with traditional machine learning methods [26]. The success of deep learning comes from hierarchical pattern recognition within the features [22].
However, machine learning techniques are not the golden ticket towards the perfect IDS. It is essential to use a dataset that does not have a relatively high number of redundant values, because these can increase the computation time and reduce the IDS model's accuracy. Moreover, the dataset must not have unnecessary features; otherwise, the accuracy would be reduced [27]. A significant risk for machine learning IDSs is training the model on inaccurate or non-integral data. For example, if an adversarial insider labels malicious activity as benign, the IDS will not detect those attack vectors, reducing its accuracy.
Traditional machine learning models applied to NIDS may include the C4.5 decision tree, Random Forests (RF), and Support Vector Machines (SVM) [29]. The C4.5 model is increasingly considered out of date and is being replaced by RF models, which consist of a forest of many smaller decision trees, each specialising in different patterns. An SVM plots each data instance within an n-dimensional space, where n is the number of features [14]; classification then finds the hyperplane that distinguishes the classes [14].
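A minimal sketch of fitting these traditional models with scikit-learn, assuming placeholder features and binary labels rather than any particular IDS dataset, might look as follows:

```python
# Minimal sketch: fitting an RF and an SVM on labelled flow features with
# scikit-learn; the data, shapes, and hyperparameters are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))           # placeholder features (n = 10)
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # placeholder benign/attack labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
svm = SVC(kernel="rbf").fit(X_train, y_train)  # separating hyperplane in kernel space

print("RF accuracy: ", rf.score(X_test, y_test))
print("SVM accuracy:", svm.score(X_test, y_test))
```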
Deep learning models may include Artificial Neural Networks (ANN), Recurrent Neural Networks (RNN), and Convolutional Neural Networks (CNN) [15]. RNNs are extended neural networks that use cyclic links, learning with memory so that outputs further influence subsequent inputs of the network [14]. CNNs automatically generate features through rounds of learning processes [26].
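A hedged sketch of such a model, here a small feed-forward ANN in Keras with an assumed 10-feature input (not the architecture evaluated later in this work), could be:

```python
# Minimal sketch of a feed-forward ANN binary classifier in Keras; the
# 10-feature input and layer sizes are assumptions for illustration only.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),              # e.g., 10 flow features
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # outputs P(attack)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=10, batch_size=64)
```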
A confusion matrix is the foundation for further metrics known as precision, recall, and F1-score. Precision is the ratio of the number of true positives to the total number of positive predictions made, as seen in Equation (6). Recall measures the model's ability to find all positive instances, as seen in Equation (7). The F1-score is the harmonic mean of precision and recall, as seen in Equation (8). The final metric is the Receiver Operating Characteristic (ROC) curve together with the Area Under the Curve (AUC) score. The ROC curve plots the True Positive Rate (TPR) against the FPR, and the AUC is the area under the ROC curve. When evaluating the AUC score, higher values are desired.
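In symbols, and consistent with Equations (6)-(8), these metrics take their standard forms:

\[
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
\]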
2.4. Datasets
There are several datasets used in the IDS literature, such as the KDD-CUP-99, the UNSW-NB15, the CIC-DoS-2017, and the CSE-CIC-IDS [10]. Nonetheless, a primary challenge of applied machine learning in IDS is the lack of publicly available and realistic datasets that can be used for model training and evaluation [10,14]. Organisations are not willing, or not allowed, to disclose network traffic data, as it can reveal sensitive information about the network configuration and raise privacy concerns [30].
Moreover, several datasets have been used to study adversarial attacks in the relevant literature, such as the KDD-CUP-99, the UNSW-NB15, the CIC-DoS-2017, and the CSE-CIC-IDS [31]. The KDD-CUP-99 dataset is the most widely researched dataset for the evaluation of IDS models [15], containing seven weeks of network traffic [8]. However, the KDD-CUP-99 dataset has been heavily criticised by researchers, who state that the data is inadequate for machine learning models due to the high number of redundant rows [9]. The UNSW-NB15 dataset [32] was collected using various tools, such as IXIA PerfectStorm, Tcpdump, Argus, and Bro-IDS, in order to capture nine attack categories: fuzzers, analysis, backdoors, DoS, exploits, generic, reconnaissance, shellcode, and worms [32]. The CIC-DoS-2017 dataset [33] focuses on DoS attacks due to the recent increase in such attacks [33]. The CSE-CIC-IDS dataset [34] consists of seven attack categories: brute-force, Heartbleed, botnet, DoS, Distributed Denial of Service (DDoS), web attacks, and insider attacks.
Until recently, there were no IoT-specific datasets proposed for machine learning IDS evaluation [15]. The Bot-IoT dataset is the first to focus on an IoT network architecture and includes five different attacks: DDoS, DoS, operating system and service scans, keylogging, and data exfiltration [14]. The Bot-IoT dataset makes use of the Message Queuing Telemetry Transport (MQTT) protocol, a lightweight messaging protocol used in IoT architectures [15]. There are over 72 million rows of data in the complete dataset; therefore, a 5% version has also been released for more manageable use. The dataset was released with 46 features, and the authors proposed a top-10 feature list using the Pearson correlation matrix and entropy values [14]. Prior works have demonstrated the efficacy of both traditional and deep learning techniques on this dataset [5].
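As an illustration of this style of correlation-based feature ranking (a sketch only; the column names and file path are hypothetical, and the exact procedure of [14] may differ):

```python
# Sketch of Pearson-correlation-based feature ranking with pandas; the column
# names and file path are hypothetical, and the exact procedure of [14] may differ.
import pandas as pd

def top_k_correlated_features(df: pd.DataFrame, label_col: str, k: int = 10) -> list:
    """Rank features by absolute Pearson correlation with the label column."""
    corr = df.corr(numeric_only=True)[label_col].drop(label_col)
    return corr.abs().sort_values(ascending=False).head(k).index.tolist()

# Hypothetical usage:
# df = pd.read_csv("bot_iot_5pct.csv")
# top10 = top_k_correlated_features(df, label_col="attack", k=10)
```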
2.5. Related Work
Machine learning, and especially deep learning, models introduce significant advantages concerning FPRs and accuracy [10,22]; however, they are not without flaws. Machine learning models are often vulnerable to attacks intended to cause incorrect data classification [17,18]. Incorrectly classified data could enable an attacker to evade the IDS, putting the organisation at risk of undetected attacks. There are three primary categories of attacks: influential attacks, security violation attacks, and specificity attacks [20,35]. Influential attacks impact the model used for intrusion detection and include causative attacks, which alter the training process by manipulating the training data, and exploratory attacks, which attempt to gather a deeper understanding of the machine learning model [20,35].
Security violation attacks target the Confidentiality, Integrity, and Availability (CIA) principles as they apply to the machine learning model [20,35]. An attacker can violate confidentiality by inferring information through analysis after gaining access to the data used for model training. An attack on model integrity aims to influence the model's metrics, such as accuracy, F1-score, or AUC [16]. Consequently, if the model's integrity cannot be maintained, the attacker could force the model to classify data incorrectly and bypass its detection. An attack on model availability would trigger a high number of FP and FN alerts, which could render the model unusable. Moreover, with numerous alerts, an alert triage agent could experience alert fatigue, where the attention given to each alert drops; in turn, alert fatigue could lead the triage agent to dismiss a valid alert.
Adversarial attacks on traditional machine learning are more effective at the training stage than at the testing stage. Attacking the model at the training stage creates fundamental changes to the model's classification accuracy, whereas an attack at the testing or deployment stage can only take advantage of inherent weaknesses in the model [36]. An adversary can affect either the features of a dataset (feature noise) or its labels (label noise) [36]. Label noise, also referred to as label flipping, focuses on changing the labels used to train a model and appears to be the more effective method of tricking a machine learning model [21,37]. The threat model of a label-flipping attack is to influence the model by providing inaccurate labels during training; if the labels are not correct, the model cannot be trained correctly and with integrity [38]. There are different label-flipping strategies, including random and targeted attacks. Random label flipping entails selecting an arbitrary percentage of labels and changing them; for example, a label classifying a record as malicious could be switched to benign, or vice versa. Targeted label flipping focuses on flipping the labels that have the most significant effect on the model's performance. According to [39], SVMs can often be more robust against adversarial machine learning and are therefore a suitable choice for deployment in adversarial environments. If attackers can feed incorrect labels to the machine learning model during training, they can likely best achieve their objective [20].
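A minimal sketch of the random variant, assuming binary NumPy labels and an illustrative flip rate, could look as follows:

```python
# Minimal sketch of a random label-flipping attack on binary training labels
# (0 = benign, 1 = attack); the flip rate is an illustrative parameter.
import numpy as np

def random_label_flip(y: np.ndarray, flip_rate: float, seed: int = 0) -> np.ndarray:
    """Flip a random fraction of binary labels before training."""
    rng = np.random.default_rng(seed)
    y_poisoned = y.copy()
    n_flip = int(flip_rate * len(y))
    idx = rng.choice(len(y), size=n_flip, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]  # benign <-> attack
    return y_poisoned

# Hypothetical usage: y_train_poisoned = random_label_flip(y_train, flip_rate=0.2)
```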
Deep learning is a subset of machine learning that has shown success in classifying well-labelled data; therefore, if labels are not accurately assigned, the resulting model accuracy will also suffer [38]. Conventional adversarial machine learning methods such as label flipping do not transfer directly to deep learning models, because the training data is split into batches and labels would need to be flipped within every batch [17]. The authors of [40] proposed the Fast Gradient Sign Method (FGSM) to generate new, perturbed data. The FGSM is a linear manipulation of the data and is therefore quick at generating new samples [40]. The FGSM aims to increase the deep learning model's cost by perturbing the input in the direction of the sign of the gradient of the loss with respect to that input, thereby generating a new image. The work in [41] proposed a family of attacks under the name Jacobian-based Saliency Map Attack (JSMA). JSMA is more computationally demanding than FGSM; however, it can create adversarial examples that appear closer to the original sample [41]. The work of [42] extended the JSMA and introduced targeted and non-targeted methods for generating adversarial examples. Nevertheless, JSMA uses more computational power than FGSM [43].
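A hedged sketch of the FGSM step for a Keras binary classifier follows; the loss, output shape, and epsilon value are assumptions for illustration, not the exact setup of [40]:

```python
# Sketch of the FGSM perturbation x_adv = x + eps * sign(grad_x J(theta, x, y));
# assumes a Keras binary classifier with sigmoid output, and eps is illustrative.
import tensorflow as tf

def fgsm_perturb(model, x, y, eps=0.1):
    """Generate FGSM adversarial examples for inputs x with true labels y."""
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    y = tf.reshape(tf.cast(y, tf.float32), (-1, 1))
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.binary_crossentropy(y, model(x))
    grad = tape.gradient(loss, x)
    return x + eps * tf.sign(grad)  # one linear step in the gradient-sign direction
```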
In 2018, ref. [44] demonstrated the success of the FGSM, achieving on average 26% incorrect classifications per class with their neural network model. Ref. [44] used the NSL-KDD dataset and perturbed it using an epsilon value of 0.02. The experiments carried out in this paper achieved a higher percentage of incorrect classifications, but only with epsilon values of at least 0.1. Since a larger perturbation was required to degrade performance, this result suggests that the Bot-IoT dataset combined with neural networks is more robust against adversarial FGSM examples than the NSL-KDD dataset.
Nevertheless, ref. [45] raised the concern that the FGSM perturbs the entire dataset it is given. The impact of complete data perturbation is two-fold. Firstly, when the Bot-IoT dataset is perturbed, the features primarily impacted are those residing within network packet headers; a layer-seven firewall could likely detect these kinds of attacks, depending on the algorithm used. Secondly, ref. [45] propose using the JSMA method instead of FGSM, because JSMA perturbs only a portion of the dataset rather than the entire dataset. On the contrary, ref. [43] suggests that FGSM is better suited for experiments because of JSMA's drawback of high computational cost. FGSM against the NSL-KDD was also used by [46], who found that perturbing the dataset reduced the accuracy of their CNN from 99.05% to 51.35%. In addition, ref. [46] experimented on the NSL-KDD dataset using the JSMA method and found that the accuracy decreased from 99.05% to 50.85%. Comparing these results to the findings of our work's experiments, we observed a similar decrease in accuracy, as can be seen in Section 4.
Countermeasures can be categorised as reactive or proactive [16]. Reactive countermeasures can only detect adversarial behaviour after the neural network has been built and deployed; examples include input reconstruction and data sanitisation, which can be performed using multiple classifier systems or statistical analysis [21,38]. Yuan et al. propose re-training as a proactive countermeasure against adversaries, as adversarial training can better prepare a model [16]. Furthermore, if a machine learning model is to be deployed in an adversarial environment, it is crucial to use adversarial methods to train a robust model [17,21]. Machine learning in an adversarial environment requires analysing the attacker's methods for forcing inaccuracies from the machine learning model [20].
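One possible sketch of such proactive adversarial re-training, reusing the fgsm_perturb helper from the earlier FGSM sketch (epochs, batch size, and epsilon are illustrative assumptions, not a prescribed training recipe):

```python
# Sketch of proactive adversarial re-training: each pass augments the clean
# training set with FGSM-perturbed copies (reuses fgsm_perturb from the earlier
# sketch; epochs, batch size, and eps are illustrative assumptions).
import numpy as np

def adversarial_fit(model, X_train, y_train, eps=0.1, epochs=5):
    """Alternate between generating adversarial examples and fitting on the mix."""
    for _ in range(epochs):
        X_adv = fgsm_perturb(model, X_train, y_train, eps=eps).numpy()
        X_mix = np.concatenate([X_train, X_adv])
        y_mix = np.concatenate([y_train, y_train])  # adversarial copies keep true labels
        model.fit(X_mix, y_mix, epochs=1, batch_size=64, verbose=0)
    return model
```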
Our work differs from all previous approaches in that it is the first to attempt both label noise attacks and adversarial example generation using the Bot-IoT dataset [14]. Our results support the finding that the Bot-IoT dataset combined with neural networks yields a more robust setup than the NSL-KDD dataset.