A Systematic Literature Review of Methods and Datasets For Anomaly Based Network Intrusion Detection
Article info
Article history:
Received 27 September 2021
Revised 28 November 2021
Accepted 27 February 2022
Available online 1 March 2022
Keywords:
Intrusion detection
Systematic literature review
Machine learning
Datasets

Abstract
As network techniques rapidly evolve, attacks are becoming increasingly sophisticated and threatening. Network intrusion detection has been widely accepted as an effective method to deal with network threats. Many approaches have been proposed, exploring different techniques and targeting different types of traffic. Anomaly-based network intrusion detection is an important research and development direction of intrusion detection. Despite the extensive investigation of anomaly-based network intrusion detection techniques, a systematic literature review of recent techniques and datasets is lacking. We follow the methodology of systematic literature review to survey and study 119 top-cited papers on anomaly-based intrusion detection. Our study rigorously and comprehensively investigates the technical landscape of the field in order to facilitate subsequent research within this field. Specifically, our investigation is conducted from the following perspectives: application domains, data preprocessing and attack-detection techniques, evaluation metrics, coauthor relationships, and datasets. Based on the research results, we identify unsolved research challenges and unstudied research topics from each perspective. Finally, we present several promising high-impact future research directions.
© 2022 The Authors. Published by Elsevier Ltd.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
https://doi.org/10.1016/j.cose.2022.102675
0167-4048/© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
Z. Yang, X. Liu, T. Li et al. Computers & Security 116 (2022) 102675
(Axelsson, 2000; Ghorbani et al., 2009). Anomaly-based network intrusion detection is an important research and development direction of intrusion detection. We follow the methodology of the systematic literature review (SLR) to survey 119 highly cited papers on anomaly-based network intrusion detection. We diversify our analysis from multiple perspectives. First, we analyze research progress and identify potential bottlenecks in specific application scenarios, such as the Internet of Things (IoT) and industrial control networks. Second, we study preprocessing techniques, such as data cleaning, feature selection, and feature transformation, which can provide suggestions for data preparation. Third, we discuss intrusion detection techniques and analyze their principles and related applications by technology category. Fourth, we investigate evaluation methods, including metrics and datasets, which can help us to standardize them. Fifth, to study the current state of the community, we count the contributors and map the collaboration network. Finally, we conduct a systematic survey of cybersecurity datasets, so as to better understand them and evaluate their applicability.
In summary, the contributions of this paper are as follows:

• We are the first to use the SLR methodology to survey and study the 119 most highly cited papers in the field of network security intrusion detection, which were systematically screened from 14,942 candidates.
• We establish a comprehensive technical overview of the intrusion detection field from both coarse- and fine-grained perspectives. We provide a comprehensive overview of 52 cybersecurity datasets and label them according to their attributes.
• The analysis of approaches sheds light on future research directions.

The remainder of this paper is organized as follows. Section 2 summarizes existing literature review work related to intrusion detection systems and datasets, and Section 3 presents the methodology of our literature review. Our research results are presented in Section 4. Section 5 concludes this paper and discusses future research directions.

2. Related work

A wealth of literature covers various aspects of intrusion detection. In this section, we present existing related works and compare them with our study.

2.1. Survey of intrusion detection methods

Most of the related research focuses on intrusion detection methods. Bhuyan et al. (2013) briefly describe and compare a large number of network anomaly detection methods and systems. Ahmed et al. (2016) analyzed anomaly detection methods and the complexity of machine learning/data mining (ML/DM) algorithms. Milenkoski et al. (2015) evaluated common practices for intrusion detection systems by analyzing the existing standard evaluation parameters, including workloads and metrics. Buczak and Guven (2015) discussed machine learning and data mining methods for network analysis to support intrusion detection. Hodo et al. (2017) presented a classification of shallow and deep network intrusion detection systems, investigated the performance of machine learning techniques in detecting anomalies, and discussed false and true positive alarm rates. Wang and Jones (2017) reviewed the applications of data mining, machine learning, deep learning, and big data in intrusion detection. Haq et al. (2015) conducted extensive research on the application of machine learning techniques in intrusion detection. Mishra et al. (2018) discuss the application of machine learning methods in intrusion detection and provide attack classification and attack feature mapping for each attack. Nisioti et al. (2018) provide a comprehensive overview of unsupervised and hybrid intrusion detection methods and also emphasize the importance of feature engineering techniques in intrusion detection.
These studies classified intrusion detection techniques based on technical principles and detailed their advantages and disadvantages, but provided no research ideas or methods from the point of view of reproducibility. This weakens their rigor and is not conducive to further research. In addition, such papers lack comprehensiveness in the discussion of intrusion detection methods. We have more fully studied and discussed several aspects of intrusion detection, such as preprocessing methods, analytical models, and evaluation methods.

2.2. Survey of application of different fields in IDS

Zarpelão et al. (2017) surveyed IDS in the IoT. In an overview of IoT devices, they argued that the IoT paradigm has the phases of collection; transmission; and processing, management, and exploitation, and presented a range of technologies that can be used for IoT devices, focusing on wireless technologies. Hande and Muddana (2021) provide an overview of existing security solutions for SDNs and a comparative study of various IDS approaches based on deep learning models and machine learning methods.
These studies discuss the current state of research on intrusion detection for a single target network. Compared to these works, our research is broader, covering the Internet, the Internet of Things (IoT), Software Defined Networks (SDN), and Industrial Control Networks (ICN). We also investigate datasets from different domains for researchers’ reference.

2.3. Survey of intrusion detection datasets

Ring et al. (2019) identified 15 features of 34 intrusion detection datasets, categorized in five groups: general information, evaluation, recording environment, data volume, and data nature. Thakkar and Lohiya (2020) investigated different IDS datasets and research advances used to evaluate IDS models, focusing on the CIC-IDS-2017 and CSE-CIC-IDS-2018 datasets. The studies mentioned above focus on the characteristics of the datasets and the progress of research on them. Compared to these studies, our research also discusses intrusion detection principles and related methods.
Further, we categorized the relevant studies mentioned above according to the following criteria and present the results in Table 1.
• Methodology: whether the study is based on the SLR methodology.
• Intrusion detection technique: whether the study discusses intrusion detection techniques, further broken down into preprocessing approaches, analysis models, and evaluation methods.
• Multi-field: whether the study discusses the current state of research on intrusion detection in different network environments.
• Dataset: whether the study covers research on the relevant datasets.
As shown in Table 1, in contrast to other studies, our study follows the SLR methodology with comprehensive coverage of intrusion detection techniques (including preprocessing methods, analytical models, and evaluation methods) and datasets, and explores multiple target networks.

Table 1
Related studies on network intrusion detection survey.

3. Research methodology

Various intrusion detection systems have been proposed. We have developed a research protocol according to the methodology of the systematic literature review (SLR) (Keele et al., 2007), as shown in Fig. 1. This includes identification of research, research questions, study selection, data extraction, and data synthesis. We adopt a mixed-methods approach (qualitative and quantitative research methods) to present the results more clearly.

3.1. Identification of research content

Obtaining a comprehensive set of papers requires an unbiased search strategy for finding original research related to intrusion detection systems. The search process must be as rigorous and sensible as possible, and search terms must be defined. We found that some anomaly-based intrusion detection articles refer only to "intrusion detection" in their titles. Therefore, to fully cover anomaly-based intrusion detection articles, we defined the search term as “network intrusion detection”.
Before we started our literature search work, we evaluated three databases: Scopus, Google Scholar, and Web of Science. Scopus covers the major publishers in the field (ACM, Springer, IEEE) and is more inclusive than Web of Science, but less inclusive than Google Scholar. However, Google Scholar may include many papers that are not peer-reviewed, such as technical reports. For these reasons, we used Scopus to perform the publication search.

3.2. Research questions

We developed ideas for the analysis of a paper and articulated specific research questions (RQs), as shown in Table 2, including detailed sub-questions, to guide our research. First, we summarize the network environment in which intrusion detection techniques are applied (RQ1), which helps us to analyze the characteristics of the development and application of intrusion detection techniques. Second, we investigate the data preprocessing techniques (RQ2) and intrusion detection datasets (RQ6) commonly used in intrusion detection and make recommendations for the data preparation phase based on the findings. Third, we focus on the intrusion detection techniques proposed in the papers (RQ3), including framework (RQ3(a)), learning method (RQ3(b)), and types of supervision (RQ3(c)). We are also interested in the principles and applications of the models (RQ3(d)). Fourth, evaluation methods are important to measure the capability of intrusion detection techniques, so we would like to learn about the general evaluation metrics in this area (RQ4). Finally, we are also interested in the authors of the papers (RQ5).

3.3. Study selection

The following research principles ensure consistent evaluation and minimize subjectivity.
Table 2
Research questions.
RQ1 Application domains (a) What domains are covered by intrusion detection techniques?
(b) How are the studies distributed among the different areas?
(c) What are the reasons for this distribution?
(d) How does research in these domains differ from country to country?
RQ2 Data preprocessing methods (a) What are the common data preprocessing techniques used in network intrusion detection?
(b) How are the preprocessing technologies implemented and what are their technical features?
(c) What is the distribution of their applications in intrusion detection?
RQ3 Detection techniques (a) Which models are applied in intrusion detection techniques?
(b) How do machine learning and deep learning apply to intrusion detection?
(c) What is the distribution of supervision types?
(d) What are the principles and characteristics of different intrusion detection technologies?
RQ4 Evaluation metrics (a) How is the performance of intrusion detection technologies evaluated?
(b) How are these evaluation methods applied in the surveyed articles?
RQ5 Authors (a) Who are the main contributors to the articles?
(b) What does the network of co-authors look like?
RQ6 Datasets (a) What are the available public datasets?
(b) Which datasets are often used in network intrusion detection?
(c) Why are these datasets widely used?
Table 3
Inclusion and exclusion criteria and their interpretation.
Inclusion IC1 Research efforts are explicitly and specifically dedicated to intrusion detection systems.
IC2 Research on intrusion detection methods based on machine learning.
Exclusion EC1 Does not reach an average of 10 citations per year.
EC2 A paper shorter than 6 pages contains insufficient research content.
EC3 Not an anomaly-based intrusion detection paper.
EC4 Only a review paper.
Papers collected from the database were filtered using our defined criteria, as shown in Fig. 2. After filtering, we had 119 papers related to intrusion detection.
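The exclusion step of this screening can be illustrated with a short sketch. It is only an illustration under stated assumptions: the `Paper` record, its field names, the helper `passes_exclusion`, the reference year, and the way the citation average is computed (total citations divided by years since publication) are all hypothetical and are not taken from the study. Only the thresholds in EC1 to EC4 come from Table 3.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """Hypothetical record for one candidate paper (field names are assumptions)."""
    title: str
    year: int
    citations: int
    pages: int
    anomaly_based: bool
    review_only: bool


def passes_exclusion(paper: Paper, reference_year: int = 2021) -> bool:
    """Return True if none of the exclusion criteria EC1-EC4 applies."""
    # Assumption: citations are averaged over the years since publication,
    # counting at least one year to avoid division by zero.
    years = max(reference_year - paper.year, 1)
    if paper.citations / years < 10:  # EC1: fewer than 10 citations per year
        return False
    if paper.pages < 6:               # EC2: shorter than 6 pages
        return False
    if not paper.anomaly_based:       # EC3: not anomaly-based
        return False
    if paper.review_only:             # EC4: only a review paper
        return False
    return True


# Made-up candidates, one per outcome.
candidates = [
    Paper("A", 2015, 120, 10, True, False),   # kept
    Paper("B", 2019, 10, 12, True, False),    # dropped by EC1
    Paper("C", 2016, 200, 4, True, False),    # dropped by EC2
    Paper("D", 2014, 300, 15, False, False),  # dropped by EC3
    Paper("E", 2017, 400, 20, True, True),    # dropped by EC4
]
selected = [p.title for p in candidates if passes_exclusion(p)]
print(selected)
```

The inclusion criteria IC1 and IC2 require reading the paper and are therefore left out of this sketch.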
Relevant information on each paper was extracted and tagged for analysis. There are two categories of tags. The first can be obtained from the content of a paper: year of publication, author, number of citations, domain, model, evaluation metrics, and dataset. The second is based on the learning method and supervision type.
It was not necessary to carefully read the full text of a paper. We read the title, abstract, and introduction, which contained most of the information, and examined the text if necessary.
Fig. 2. Paper selection process.
3.5. Data synthesis
For learning methods and types of supervision, we annotated by analyzing the employed models. We classified domains through research of the traditional Internet (Web), IoT, industrial control networks (ICNs), and software defined networks (SDNs). Lacking a specific application scenario, we classified papers as such. The IoT is a network of objects embedded with sensors, software, and other technologies to connect and exchange data with other devices and systems via the Internet. IoT technologies are most often associated with the “smart home” concept. ICNs are networks of digital control systems connected using the Ethernet standard. Industrial networks implement communication protocols between field devices, digital controllers, various software suites, and external systems. SDNs enhance network control through programming. This combination of features can bring the benefits of enhanced configuration, improved performance, and new architecture.
It is also necessary to label datasets. Datasets for network packet analysis in commercial products are not readily available for privacy reasons. Publicly available datasets such as DARPA, KDD, and UNSW-NB15 are widely used as benchmarks. We defined labels, including “Year”, “Authenticity”, “Count”, “Labeled”,
Table 4
Number of papers by domain in intrusion detection methods.
Domain Number
Internet 88
Internet of Things (IoT) 20
Industry Control Network (ICN) 9
Software Defined Network (SDN) 2
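As a quick arithmetic check on Table 4, the counts can be converted into shares of the surveyed papers. The sketch below only restates the table; the variable names are illustrative.

```python
# Counts copied from Table 4.
domain_counts = {
    "Internet": 88,
    "Internet of Things (IoT)": 20,
    "Industry Control Network (ICN)": 9,
    "Software Defined Network (SDN)": 2,
}

total = sum(domain_counts.values())

# Share of each domain in percent, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in domain_counts.items()}

print(total)
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The total matches the 119 papers retained after screening, and roughly three quarters of them target the traditional Internet.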
Table 6
Most used machine learning algorithms in proposed methods.
Table 7
Most used deep learning algorithms in proposed methods.
inative classifier defined by a separating hyperplane that uses a kernel function to map training data to a high-dimensional space for linear classification of intrusions. Data used in intrusion detection usually have high dimensionality, with which SVM has high generalization capability and performs well. Decision trees are widely used due to their high efficiency and interpretability.
Deep learning is evolving rapidly, and is becoming the basis of more intrusion detection methods. To answer RQ3(b), we plotted
tion gain ratio based on ID3, and prunes the tree by replacing branches that do not help with leaf nodes. CART uses Gini impurity (an information-theoretic measure corresponding to Tsallis entropy) as a metric, solving the problem that ID3 cannot handle regression tasks.
DT has an intuitive classification strategy, is interpretable and simple to implement, and often allows for better generalization through post-construction pruning, making it a common model in intrusion detection. Anthi et al. (2019) proposed a three-layer intrusion detection system (IDS) that identifies IoT devices based on MAC addresses, classifies messages as bona fide or malicious, and employs DTs to classify attacks. Abbes et al. (2010) classified records as benign or anomalous by analyzing application protocols, using separate and distinct adaptive DTs for each. The system achieved good results identifying DoS attacks, scanning attacks, and botnets. Muniyandi et al. (2012) proposed an anomaly-detection method that uses k-means to form k clusters of training instances based on Euclidean distance similarity, and C4.5 on each cluster to construct DTs of normal and abnormal instance density regions.
The disadvantage of DT is weak robustness; small changes in training data may result in a completely different DT. Furthermore, information gain is biased toward attributes with more levels (Deng et al., 2011), so larger DTs may require manual pruning.
- SVM (Özgür and Erdem, 2016) constructs an N-dimensional hyperplane to optimally classify data. SVM can be linear or nonlinear. Linear SVM is used for linearly separable data, i.e., datasets that can be divided into two categories by a straight line. Nonlinear SVM is used for nonlinearly separable data. For this, a kernel trick maps data points into a higher-dimensional space where they can be separated using planes or other functions.
SVM can simplify the solution of high-dimensional problems. It is based on small-sample statistical theory, has good generalization ability, and is often used in intrusion detection. Jan et al. (2019) developed a lightweight attack-detection strategy using supervised machine learning-based SVMs to detect attempts to inject unwanted data into IoT networks. It obtains a feature pool from samples, and uses it with a label vector to train the SVM. The method has good classification accuracy and detection times. Teng et al. (2017) proposed an intrusion detection method based on SVM, which constructs four two-stage SVMs based on the structure of DT. SVM1, SVM2, SVM3, and SVM4 detect normal data, DoS/DDoS attacks, probing attacks, and R2L or U2R attacks, respectively. Experiments show that this method outperforms a single SVM in terms of detection rate and recall. De la Hoz et al. (2015) proposed a hybrid statistical technique and Self Organizing Map (SOM) for network anomaly detection and classification. The method uses PCA and the Fisher discriminant ratio (FDR) for feature selection and noise removal, and probabilistic self-organizing mapping (PSOM)-based modeling of the feature space to distinguish normal and malicious traffic.
- Clustering groups objects that are more similar to each other than to objects in other groups. It is generally understood as a task to be solved rather than an algorithm. Since the concept of a cluster (i.e., the similarity between objects) cannot be described precisely, there are widely different clustering algorithms. Clustering can be considered hard or soft, according to matching rules between objects and clusters. Hard clustering strictly assigns objects to classes. The most representative algorithm is k-means clustering, which assigns objects to clusters by the Euclidean distance between them; the k-nearest neighbor (KNN) method, although strictly a supervised classifier, relies on the same distance computation. Soft (or fuzzy) clustering calculates the degree (e.g., probability) of each object’s belonging to a cluster. Data often cannot be divided into clearly separated clusters, and soft classification is used to obtain more flexible results. Fuzzy c-means is a widely used soft clustering algorithm that calculates the membership coefficient of each object in each cluster based on the distances between them, which improves the clustering of complex data.
Clustering algorithms can be categorized as connectivity-based (e.g., hierarchical), centroid-based (k-means, fuzzy c-means), distribution-based (GMM), density-based (DBSCAN), or grid-based (STING). Clustering is generally simple to implement and easy to interpret, but is sensitive to outliers, and initial values of parameters have too much influence on the results.
Peng et al. (2018) proposed a method for intrusion detection systems using mini-batch k-means for clustering and PCA to reduce data dimensionality. Experimental results and time complexity analysis showed that the method is effective. Casas et al. (2012) proposed UIDS, an unsupervised network intrusion detection system capable of detecting unknown network attacks without the use of signatures, labeled traffic, or training. UIDS uses an unsupervised outlier detection method based on subspace clustering, and multiple evidence accumulation techniques to identify types of attacks.
- Naive Bayes (NB) is a probabilistic classifier based on Bayes’ theorem [44]. All naïve Bayesian classifiers are based on the principle that, given the class, the value of a feature is independent of the value of any other feature, i.e.,
\[ \hat{y} = \underset{k \in \{1,\ldots,K\}}{\operatorname{argmax}} \; p(C_k) \prod_{i=1}^{n} p(x_i \mid C_k), \tag{4} \]
where $\hat{y}$ is the predicted class, i.e., the class with the highest posterior probability, $K$ is the number of classes, $C_k$ is the kth class, $n$ is the number of features, $p(C_k)$ is the prior probability of $C_k$, and $p(x_i \mid C_k)$ is the conditional probability of feature $x_i$ given class $C_k$. A feature distribution (i.e., an event model) or nonparametric model generated from the training set must be assumed in order to compute the class-conditional probabilities. The multinomial and Bernoulli distributions are usually used for discrete features, and the Gaussian distribution for continuous features. Bayesian classifiers can be trained on both labeled and unlabeled datasets by certain semi-supervised training algorithms [15].
Koc et al. (2012) proposed an approach based on the hidden naïve Bayes (HNB) model, which can be applied to intrusion detection problems affected by dimensionality, highly correlated features, and high network data stream capacity. HNB is a data mining model that relaxes the conditional independence assumptions of the NB approach. Experimental results show that the HNB model outperforms the traditional NB model in terms of accuracy, error rate, and misclassification cost. To address the potential threat of DDoS attacks in the IoT, Mehmood et al. (2018) proposed an NB algorithm with multi-agent-based IDS (NB-MAIDS) and implemented multi-agents in the whole network.
Although the independence assumption of NB is often violated in practice, it still has relatively high accuracy. In addition, as a linear algorithm, NB has high training efficiency. These qualities have led to its widespread application as a baseline for classification problems.
- Ensemble learning combines multiple classifiers through an algorithm to find a (hopefully) better hypothesis in a mixed multiple hypothesis space. It should be noted that the combination of multiple classifiers does not guarantee better performance than the best individual classifier, but it reduces the risk of a particularly poor selection.
One of the earliest and most intuitive ensemble-based algorithms, bagging (bootstrap aggregating) (Breiman, 1996) obtains the diversity of classifiers by randomly drawing a subset of the entire training set to train classifiers of the same type, and allows each classifier in the set to vote with the same weight to combine individual classifiers. The random forest classifier (Breiman, 2001) is a common machine learning method that combines bagging with DTs. The boosting method iteratively builds an ensemble by training a new classifier to emphasize the training data misclassified by its previous classifier. Based on this algorithm, several well-known machine learning algorithms have been proposed, such as adaptive boosting (AdaBoost) (Freund et al., 1996), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost).
Ensemble learning improves the generalizability and accuracy of the final model by ensembling multiple classifiers, and is less prone to overfitting. Its training and prediction speeds are naturally lower than those of a single classifier, and the interpretability of the model is largely lost in some complex ensembles (Madeh Piryonesi and El-Diraby, 2021). Singh et al. (2014) developed an RF-based DT model for the quasi-real-time peer-to-peer botnet detection problem. Li et al. (2018) proposed an artificial intelligence-based two-stage intrusion detection method that uses software-defined techniques. It uses the swarm partitioning and binary difference variants of the bat algorithm to select typical features, and RFs to classify streams by adaptively changing the weights of samples using a weighted voting mechanism. Hu et al. (2013) proposed an online intrusion detection algorithm that constructs a local parameterized detection model at each node using the online AdaBoost algorithm. A global detection model is constructed in each node using a small number of samples in the nodes, combined with the local parametric model. Experimental results show that the improved online AdaBoost has a higher detection rate and lower false-alarm rate.
- Evolutionary algorithms are global optimization algorithms inspired by biological evolution, typically solving problems by population-based trial and error. Initial candidate solutions are repeatedly updated and iterated, with poorly performing solutions removed at each generation and random variations introduced, consistent with the concept of natural selection and variation.
Most widely used are genetic algorithms, genetic programming, evolutionary algorithms, particle swarm optimization (PSO), and artificial immune systems, which differ mainly in how the iterations are performed. Genetic algorithms and genetic programming calculate a fitness value for each individual in a population, and select individuals with high fitness values for the mating pool with high probability to produce the next generation through the exchange of genetic material and mutations between individuals. The genetic algorithm considers the bit string as an individual, while genetic programming considers the program as the individual. Evolutionary algorithms generally simulate the biological learning process in nature. For example, the artificial bee colony (ABC) algorithm simulates the process of bees searching for food sources. The artificial immune system simulates the immune system function by cloning and mutating antibodies with high affinity to a “virus” (i.e., the sample to be detected) in order to iterate.
Evolutionary computation is characterized by a variety of iterative methods. The iterative approach typically requires the manual definition of multiple parameters and evaluation functions for the problem to be solved. Thus, the algorithm has problem-independent fast search capability and wide applicability, and the population-based principle brings parallelism, which increases the speed of the search for the optimal solution. However, the performance of evolutionary computation depends strongly on the evaluation function and parameters (which are usually set empirically), which affects the efficiency of the solution. Some algorithms converge too easily to a local optimum, or even an arbitrary point, while others are poor at locating local optima. Although this can be alleviated by replacing the evaluation function and parameters (Taherdangkoo et al., 2013), the “no free lunch” theorem (Wolpert and Macready, 1997) shows that this problem has no general solution.
Khammassi and Krichen (2017) proposed a GA-LR wrapper method for feature selection in network intrusion detection, using a genetic algorithm-based wrapper method as a search strategy and logistic regression as a learning algorithm to select the best feature subset. The method effectively improves the intrusion detection performance. Hajisalem and Babaie (2018) proposed a hybrid classification method based on ABC and artificial fish swarm (AFS) algorithms, using fuzzy C-Means (FCM) clustering and correlation-based feature selection (CFS) to divide the training dataset and remove irrelevant features. Based on the selected features, if-then rules are generated by a CART technique to distinguish normal and abnormal records. The generated rules are then used to train the detection model. In simulations on the NSL-KDD and UNSW-NB15 datasets, the method achieved a detection rate of 99% and a false-positive rate of 0.01%.
- The DNN is an artificial neural network (ANN) with multiple layers between the input and output layers (Bengio, 2009). In a narrow sense, it is a fully connected neural network with a structure similar to a multilayer perceptron (MLP). The lower-layer neurons of a fully connected DNN can form connections with all upper-layer neurons. A DNN uses backpropagation to perform a supervised learning task with nonlinear activation functions.
Vinayakumar et al. (2019) built a DNN-based distributed deep learning model for an intrusion detection framework for real-time processing and analysis of very large-scale data. Xu et al. (2018) proposed an IDS consisting of an RNN with gated recurrent units (GRUs), MLP, and softmax module. The DNN can theoretically approximate any continuous function (Cybenko, 1989).
- The CNN is an artificial neural network with a shared-weight structure based on convolutional kernels or filters. Inspired by biological processes (Hubel and Wiesel, 1968), a CNN slides the convolutional kernel along the input features to extract translation-equivariant responses called feature maps.
The CNN and its related architectures have received considerable attention due to their excellent performance in computer vision (CV) (He et al., 2016). Starting with LeNet-5 (LeCun et al., 1998), numerous CNN architectures, including AlexNet (Krizhevsky et al., 2012) and ResNet (He et al., 2016), have been proposed. Although CNN architectures are usually applied to CV problems, they have shown good results in IDS as well (Dong et al., 2019; Vinayakumar et al., 2017). Li et al. (2017) proposed an image conversion method for NSL-KDD data, in which CNNs automatically learn the features of graphic NSL-KDD transformations.
Compared to the DNN, the CNN’s extraction of local features reduces the number of weights, as well as the computational complexity, thus improving the training and prediction speed. However, this can lead to problems; for example, some trained CNN models extract the features of wheels in an image and immediately judge the image to be a truck.
- The RNN is a class of artificial neural network that can temporally exhibit memory behavior. This dynamic behavior is implemented by connections between nodes to form a directed graph along a time sequence (Dupond, 2019). The internal state of the RNN allows it to process variable-length input sequences.
Depending on whether the constructed graph has a loop, an RNN can be further classified as finite- or infinite-impulse (Miljanovic, 2012). Finite-impulse networks can be unrolled and replaced with strictly feedforward neural networks (FNNs), while infinite-impulse recurrent networks cannot. Moreover, there can be additional stored states in an RNN, thus improving it to a network that can be implemented with time delays or feedback loops (e.g., long short-term memory networks).
The RNN was proposed to solve the problem that a DNN has difficulty fitting data that changes temporally. Therefore, RNNs have played an important role in areas such as natural language processing and action recognition (Tang et al., 2018). RNNs are increasingly applied to IDS, whose data mostly consist of temporally continuous data streams
Z. Yang, X. Liu, T. Li et al. Computers & Security 116 (2022) 102675
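\[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{6} \]
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{7} \]
\[ \mathrm{FAR} = \frac{FP}{TN + FP} \tag{8} \]
\[ F\text{-measure} = 2 \times \frac{\mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \tag{9} \]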
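As a minimal sketch of Eqs. (6)–(9), the function below computes the four metrics from confusion-matrix counts. The function name `detection_metrics`, the convention of returning 0.0 when a denominator is zero, and the example counts are illustrative assumptions and are not taken from the surveyed papers.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute precision, recall, FAR and F-measure from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives, with "positive" meaning "attack".
    """

    def ratio(numerator, denominator):
        # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)            # Eq. (6)
    recall = ratio(tp, tp + fn)               # Eq. (7)
    far = ratio(fp, tn + fp)                  # Eq. (8)
    f_measure = ratio(2 * precision * recall, precision + recall)  # Eq. (9)
    return {
        "precision": precision,
        "recall": recall,
        "far": far,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical counts: 100 attacks and 900 benign samples.
    for name, value in detection_metrics(tp=80, fp=20, tn=880, fn=20).items():
        print(f"{name}: {value:.4f}")
```

Because attack traffic is usually a small minority of all traffic, FAR and recall computed this way tend to be more informative than accuracy alone, which is one reason to report them together.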
(Hochreiter and Schmidhuber, 1997; Yin et al., 2017). However, since RNNs do not have a special treatment of the activation function, the continuous product of their partial derivatives can easily lead to vanishing or even exploding gradients when the number of layers of the network is high.
- LSTM solves the gradient vanishing problem of the classical RNN by introducing additional storage states (Gers et al., 2000). LSTM effectively controls the degree of gradient vanishing by using a gate function as the activation function to selectively allow a portion of the information to pass through.
Based on the original LSTM architecture, Gers et al. (2000) introduced forget gates to enable the LSTM to reset its state, simulating the forgetting process of memory. Because of its excellent performance (Capes et al., 2017; Wu et al., 2016), it is considered the most classical LSTM architecture. Based on this, Cho et al. (2014) proposed the gated recurrent unit (GRU), consisting of a reset gate and an update gate, which maintains the performance of the LSTM as much as possible with fewer parameters.
The LSTM has become one of the most used RNN variants because it solves the gradient vanishing problem of traditional RNNs. Many IDS studies use LSTM networks (Bontemps et al., 2016; Roy et al., 2017) because they are well-suited for classification and prediction based on time-series data, and the forgetting mechanism is a better match for the detection of data streams. However, due to the inherent nature of RNNs, the classical LSTM architecture cannot be trained in parallel (Bai et al., 2018), making LSTM-based models sometimes too costly to run.
4.4. Evaluation metrics
We introduce commonly used evaluation criteria in intrusion detection papers. To answer RQ4(a), Fig. 9 shows evaluation metrics and the number of times they were used.
Fig. 9. Evaluation metrics used in papers.
As shown in Fig. 9, accuracy, precision, recall, F1 value, and false-alarm rate (FAR) are most commonly used (RQ4(b)). Recall and accuracy are used in most papers. Recall, also called detection rate or true positive rate (TPR), is the proportion of correctly classified attacks to all attacks. Recall can measure the accuracy of the classifier in identifying attacks. Accuracy is the ratio of the number of correctly predicted samples to the total number of pre-
Detection time is also a common evaluation metric in the field of intrusion detection. There are 74 articles in our research that discuss time performance. Detection time means the time spent to classify a sample with the trained model. Due to the complexity of network traffic, even with methods such as feature selection for dimensionality reduction, IDS research usually faces the curse of dimensionality, which is eventually reflected in high detection time. Some of the numerous existing algorithms for intrusion detection are almost unusable in engineering practice, and one important reason is their high detection time. From the application point of view, the main goal of intrusion detection is to achieve an appropriate detection rate with minimal resource consumption, which requires an ideal model structure for IDS as well as parameter settings. A high detection time of a model usually means that its algorithm complexity is too high. Reviewing previous studies, a clear trade-off between the performance and complexity of the model can be found.
Although deep learning based methods usually perform better in terms of detection capability compared to other methods, their detection times are too long, making these methods difficult to use in scenarios such as big data. While computational complexity is the most direct influence on detection time, considering that the computational complexity of some algorithms is difficult to calculate or controversial under different assumptions, most papers only provide the training and testing time of their algorithms on the specified dataset. Since the platforms used to obtain each result and the preprocessing methods for the datasets differ, it is still difficult to judge the superiority of an algorithm in terms of time complexity just from the running time. In summary, we believe that there is still a need for a unified complexity evaluation standard in the current IDS research, rather than just in terms of detection time.
4.5. Authors
We assessed the main contributing authors in intrusion detection by examining the total number of citations of the included publications through Scopus, answering RQ5(a). As can be seen in Fig. 10, Ambusaidi et al. (2016), Tan et al. (2014), and Yin et al. (2017) contributed most to the field. We found that citations were not very high for all but the top few authors. This indicates that citations are concentrated among a few researchers.
The three most cited articles among the articles we researched are shown in Table 8. The article “A Deep Learning Approach for
Table 8
The three most cited articles.
Title | Year | Citations | Citations per year
A Deep Learning Approach for Intrusion Detection Using Recurrent Neural Networks | 2018 | 430 | 143
A Deep Learning Approach to Network Intrusion Detection | 2017 | 313 | 78
Fuzziness based semi-supervised learning approach for intrusion detection system | 2017 | 286 | 71
Intrusion Detection Using Recurrent Neural Networks” was published in 2018 and has been cited more than 430 times in total, with an average of 86 citations per year. In this article, the authors propose a deep learning approach for intrusion detection using Recurrent Neural Networks (RNN-IDS). Moreover, the authors also investigate the performance of the model in binary and multiclass classification, and the effect of the number of neurons and different learning rates on the model performance. In the paper “A Deep Learning Approach to Network Intrusion Detection”, the authors also propose a deep learning-based intrusion detection model. The model is built based on stacked NDAEs and achieves excellent results. From these two highly cited articles, we can see the great impact and potential of deep learning in the field of intrusion detection. In another paper, “Fuzziness based semi-supervised learning approach for intrusion detection system”, the authors propose a fuzziness-based semi-supervised learning approach that uses unlabeled samples to assist a supervised learning algorithm and improve the performance of the classifier. Unlike the previous two papers, this paper aims to reduce the manual effort of the data labeling process, treating the complexity of data labeling as the pain point.
4.6. Datasets
To answer RQ6(a), we investigated existing network intrusion detection datasets. As shown in Table 10, we collected a total of 52 datasets through the survey. Based on the information provided by the dataset publishers and additional searches, we extracted the year of creation, creation method, data volume, annotation status, number of tags, and links for each dataset. In terms of time, starting with the DARPA 1998 dataset, new datasets continue to appear in the community. From 2009 onward, research related to network intrusion detection datasets started to increase. From the fourth column of Table 10, it can be seen that more than half of the datasets were obtained through simulation experiments. This indirectly reflects the sensitive nature of data in the field of network intrusion detection. However, the accuracy and authenticity of such datasets have been questioned (Mahoney et al., 2003), and the validity of intrusion detection models constructed based on such datasets is poor.
Further, we summarize the frequency of use of the dataset in Table 9 (RQ6(b)). As can be seen from the table, KDD99 and NSL-KDD are the two most commonly used datasets, although both of them are simulation experimental data. This is mainly because
KDD99 and NSL-KDD are datasets that have been publicly available for a long time. Researchers have published many articles based on these two datasets. When a new intrusion detection technique is proposed, it often needs to be compared with previous techniques, which leads to the constant use of KDD99 and NSL-KDD (RQ6(c)). However, the contents of the KDD99 and NSL-KDD datasets are obsolete. In future studies, we recommend that researchers evaluate the performance of intrusion detection methods using newer intrusion detection datasets, such as the CIRA-CIC-DoHBrw 2020 dataset, by referring to the information in our table. In addition, researchers should try to experiment with some real datasets, such as the ISOT CID dataset, to ensure the validity of their approach.
Finally, to facilitate the work of researchers, we provide links to the datasets in the table and present some of the datasets in more detail.
Table 10
Existing network intrusion detection datasets.
- DARPA datasets are most popular for intrusion detection, and were created at the MIT Lincoln Laboratory in an emulated network environment. The DARPA 1998 and DARPA 1999 datasets contain seven and five weeks, respectively, of network traffic in packet-based format, including such attacks as DoS, buffer overflow, port scans, and rootkits. Despite (or because of) their wide distribution, the datasets are often criticized for artificial attack injections or large amounts of redundancy.
- KDD99 dataset was created from DARPA network dataset files by Lee and Stolfo (2000). The dataset was constructed through data mining to analyze the features of the DARPA dataset and preprocess the data. The dataset contains seven weeks of network traffic, with approximately 4.9 million vectors. Attacks are classified as: (1) user-to-root (U2R); (2) remote-to-local (R2L); (3) probing; and (4) DoS. Each instance is represented by 41 features in three categories: (1) basic; (2) traffic; and (3) content. Basic features are extracted from TCP/IP connections. Traffic characteristics are grouped into those with the same host characteristics or the same service characteristics. Content characteristics relate to suspicious behavior in the data portion. This is the most extensively used dataset for evaluating intrusion detection models.
- NSL-KDD is a dataset suggested to solve some of the inherent problems of the KDD99 dataset. Although this new version of the KDD dataset still suffers from some of the problems discussed by Tavallaee et al. (2009) and may not be a perfect representative of existing real networks, because of the lack of public data sets for network-based IDSs, we believe it still can be applied as an effective benchmark dataset to help researchers compare different intrusion detection methods. Furthermore, the number of records in the NSL-KDD train and test sets is reasonable. This advantage makes it affordable to run the experiments on the complete set without the need to randomly select a small portion. Consequently, evaluation results of different research work will be consistent and comparable.
- UNSW-NB15 was created by the Cyber Range Laboratory of the Australian Cyber Security Center. It is widely used due to its
variety of novel attacks. Types of attacks consist of Fuzzers, Analysis, Backdoor, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms. It has a training set with 175,341 records, and a testing set with 82,332 records.
- CICIDS2017 contains benign and common attacks, with both source data (PCAPs) and results of network traffic analysis (CSV files) with flows labeled based on timestamps, source and destination IPs, source and destination ports, protocols, and attacks. The researchers used the B-Profile system (Sharafaldin et al., 2016) to analyze the abstract behavior of human interactions and to generate benign background traffic. The dataset includes abstracted behaviors of 25 users based on HTTP, HTTPS, FTP, SSH, and email protocols. Attacks include brute-force FTP, brute-force SSH, DoS, Heartbleed, web attack, infiltration, botnet, and DDoS.
- CICDoS2017 is a publicly available intrusion detection dataset with application layer DoS attacks from the Canadian Institute for Cybersecurity. The authors executed eight DoS attacks on the application layer. Normal user behavior was generated by combining the resulting traces with attack-free traffic from the ISCX 2012 dataset. The dataset is available in packet-based format and contains 24 h of network traffic.
- CICDDoS2019 contains the latest DDoS attacks, which are similar to real-world data. It includes the results of network traffic analysis using CICFlowMeter-V3, with flows labeled based on timestamp, source and destination IPs, source and destination ports, protocols, and attacks.
- Kyoto 2006+ is a publicly available honeypot dataset of real network traffic that includes only a small number and small range of realistic, normal user behavior. The researchers transformed packet-based traffic into a new format called sessions. Each session has 24 attributes, 14 of which are statistical information features inspired by the KDD CUP 99 dataset, and the remaining 10 attributes are typical traffic-based attributes such as IP address (anonymized), port, and duration. The data were collected over three years and include approximately 93 million sessions.
- NDSec-1 contains trace and log files of network attacks synthesized by researchers from network facilities. It is publicly available, and was captured in packet-based format in 2016. It contains additional syslog and Windows eventlog information. Attack compositions include botnet, brute force (against FTP, HTTP, and SSH), DoS (HTTP, SYN, and UDP flooding), exploits, port scans, spoofing, and XSS/SQL injection.
- CTU-13 was captured in 2013 and is available in packet, unidirectional flow, and bidirectional flow formats. Captured in a university network, its 13 scenarios include different botnet attacks. Additional information about infected hosts is provided at the website.3 Traffic was labeled in three stages: 1) all traffic to and from infected hosts was labeled as botnet; 2) traffic matching specific filters was labeled as normal; 3) remaining traffic was labeled as background. Consequently, background traffic can be normal or malicious.
- BoT-IoT contains more than 72 million records, including DDoS, DoS, OS and service scan, keylogging, and data exfiltration attacks. The Node-red tool was used to simulate the network behavior of IoT devices. MQTT, a lightweight communication protocol, links machine-to-machine (M2M) communications. The testbed IoT scenarios are weather station, smart fridge, motion activated lights, remotely activated garage door, and smart thermostat.
- IoT-23 consists of 23 network captures (called scenarios) of IoT traffic, including 20 (PCAP files) from infected IoT devices and three of real IoT network traffic. Raspberry Pi malware was executed in each malicious scenario using several protocols and performing different actions. The network traffic capture for benign scenarios was obtained from the network traffic of three real IoT devices: a Philips HUE smart LED lamp, Amazon Echo home intelligent personal assistant, and Somfy smart door lock. Both malicious and benign scenarios were run in a controlled network environment with unrestrained internet connection, like any real IoT device.
- PUF was captured over three days from a campus network and contains exclusively DNS connections, where 38,120 of 298,463 unidirectional flows are malicious. All flows are labeled using logs of an intrusion prevention system. IP addresses were removed for privacy reasons.
- LBNL was created to analyze the network traffic characteristics in an enterprise network. The dataset can be used as background traffic for security research, as it contains almost exclusively normal user behavior. The dataset is not labeled, is anonymized, and contains more than 100 h of network traffic in packet-based format. The dataset can be downloaded at the website.4
- The IEEE 300-bus power test system provides the topological and electrical structure of a power grid, to be used to detect false data injection attacks in the smart grid. The system has 411 branches, and an average degree of 2.74. For details about this standard test system, we refer the reader to the work of Hines et al. (2010). The IEEE 300-bus power test system has been used in much work related to cyber-attack classification.
- The ICS cyber attack datasets consist of: (1) power system dataset; (2) gas pipeline dataset; (3) energy management system dataset; (4) new gas pipeline dataset; and (5) gas pipeline and water storage tank dataset. The power system dataset contains 37 scenarios divided into eight natural events, one non-event, and 28 attacks. Attacks are categorized as: (1) relay setting change; (2) remote tripping command injection; and (3) data injection. These datasets can be used for cybersecurity intrusion detection in industrial control systems.
3 http://mcfp.weebly.com/the-ctu-13-dataset-a-labeled-dataset-with-botnet-normal-and-background-traffic.html.
4 http://icir.org/enterprise-tracing/download.html.
5. Conclusion
We provided a comprehensive overview and analysis of research work on intrusion detection in network security. The survey covered 119 of the most highly cited papers in the field of network security intrusion detection, including preprocessing and intrusion detection techniques, and analyzed the community from multiple perspectives. We analyzed the research progress and bottlenecks in different scenarios. We investigated preprocessing and intrusion detection techniques. We examined evaluation methods, including metrics and datasets, so as to standardize performance assessment. We counted contributors in the community and mapped their collaborative network. Our publication data and category descriptions are publicly available to facilitate repeatability and further research.
Our results show that research on network anomaly detection is unbalanced under different target networks. In the ICN domain, researchers often do not disclose their datasets due to the sensitivity and confidentiality of industrial network data. The lack of available datasets limits cybersecurity research in the ICN domain. The lack of datasets is also a key factor limiting research in the SDN domain. Before conducting security research, researchers often need to build SDN network environments to simulate the data. In terms of the intrusion detection techniques currently used, supervised learning is still the mainstream direction. However, these studies need to be built on top of already labeled data. At the time of practical application, the data we obtain is unlabeled. La-
beling the data is a time-consuming and tedious task. We believe that unsupervised learning and semi-supervised learning are the way forward for network anomaly detection. Similarly, we believe that automated labeling of network data is also a direction worthy of in-depth study. In addition, the adversarial environment has been shown to impact machine learning-based network anomaly detection algorithms. Therefore, anti-perturbation anomaly detection in adversarial environments also needs more research.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work is partially supported by the National Natural Science Foundation of China (No. 61902010), the Major Research Plan of National Natural Science Foundation of China (92167102), the Project of Beijing Municipal Education Commission (No. KM202110005025).
References
Abbes, T., Bouhoula, A., Rusinowitch, M., 2010. Efficient decision tree for protocol analysis in intrusion detection. Int. J. Secur. Netw. 5 (4), 220–235.
ADFA-LD, 2013. https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-IDS-Datasets/.
Ahmed, M., Mahmood, A.N., Hu, J., 2016. A survey of network anomaly detection techniques. J. Netw. Comput. Appl. 60, 19–31.
Alhajjar, E., Maxwell, P., Bastian, N., 2021. Adversarial machine learning in network intrusion detection systems. Expert Syst. Appl. 186, 115782.
Alkasassbeh, M., Al-Naymat, G., Hassanat, A., Almseidin, M., 2016. Detecting distributed denial of service attacks using data mining techniques. Int. J. Adv. Comput. Sci. Appl. 7 (1), 436–445.
Ambusaidi, M.A., He, X., Nanda, P., Tan, Z., 2016. Building an intrusion detection system using a filter-based feature selection algorithm. IEEE Trans. Comput. 65 (10), 2986–2998.
An, J., Cho, S., 2015. Variational autoencoder based anomaly detection using reconstruction probability. Spec. Lect. IE 2 (1), 1–18.
Anthi, E., Williams, L., Słowińska, M., Theodorakopoulos, G., Burnap, P., 2019. A supervised intrusion detection system for smart home IoT devices. IEEE Internet Things J. 6 (5), 9042–9053.
AWID, 2015. http://icsdweb.aegean.gr/awid/download.html.
Axelsson, S., 2000. Intrusion Detection Systems: A Survey and Taxonomy. Technical Report.
Bach, F.R., 2008. Bolasso: model consistent Lasso estimation through the bootstrap. In: Proceedings of the 25th International Conference on Machine Learning, pp. 33–40.
Bai, S., Kolter, J.Z., Koltun, V., 2018. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271.
Beer, F., Hofer, T., Karimi, D., Bühler, U., 2017. A new attack composition for network security. 10. DFN-Forum Kommunikationstechnologien.
Beigi, E.B., Jazi, H.H., Stakhanova, N., Ghorbani, A.A., 2014. Towards effective feature selection in machine learning-based botnet detection approaches. In: 2014 IEEE Conference on Communications and Network Security, pp. 247–255.
Bengio, Y., 2009. Learning Deep Architectures for AI. Now Publishers Inc.
Bermingham, M.L., Pong-Wong, R., Spiliopoulou, A., Hayward, C., Rudan, I., Campbell, H., Wright, A.F., Wilson, J.F., Agakov, F., Navarro, P., et al., 2015. Application of high-dimensional feature selection: evaluation for genomic prediction in man. Sci. Rep. 5 (1), 1–12.
Bhattacharya, S., Selvakumar, S., 2014. SSENet-2014 dataset: a dataset for detection of multiconnection attacks. In: 2014 3rd International Conference on Eco-friendly Computing and Communication Systems, pp. 121–126.
Bhuyan, M.H., Bhattacharyya, D.K., Kalita, J.K., 2013. Network anomaly detection: methods, systems and tools. IEEE Commun. Surv. Tutor. 16 (1), 303–336.
Bontemps, L., McDermott, J., Le-Khac, N.-A., et al., 2016. Collective anomaly detection based on long short-term memory recurrent neural networks. In: International Conference on Future Data and Security Engineering, pp. 141–152.
BoT-IoT, 2019. https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/bot_iot.php.
Botnet-2014, 2014. https://www.unb.ca/cic/datasets/botnet.html.
Breiman, L., 1996. Bagging predictors. Mach. Learn. 24 (2), 123–140.
Breiman, L., 2001. Random forests. Mach. Learn. 45 (1), 5–32.
Buczak, A.L., Guven, E., 2015. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. Tutor. 18 (2), 1153–1176.
Bulavas, V., 2018. Investigation of network intrusion detection using data visualization methods, 1–6.
CAIDA, 2017. https://www.impactcybertrust.org/dataset_view?idDataset=834.
Caminero, G., Lopez-Martin, M., Carro, B., 2019. Adversarial environment reinforcement learning algorithm for intrusion detection. Comput. Netw. 159, 96–109.
Capes, T., Coles, P., Conkie, A., Golipour, L., Hadjitarkhani, A., Hu, Q., Huddleston, N., Hunt, M., Li, J., Neeracher, M., et al., 2017. Siri on-device deep learning-guided unit selection text-to-speech system. In: INTERSPEECH, pp. 4011–4015.
Casas, P., Mazel, J., Owezarski, P., 2012. Unsupervised network intrusion detection systems: detecting the unknown without knowledge. Comput. Commun. 35 (7), 772–783.
CDX, 2009. https://www.usma.edu/centers-and-research/cyber-research-center/data-sets.
Cermak, M., Jirsik, T., Velan, P., Komarkova, J., Spacek, S., Drasar, M., Plesnik, T., 2018. Towards provable network traffic measurement and analysis via semi-labeled trace datasets. In: 2018 Network Traffic Measurement and Analysis Conference (TMA), pp. 1–8.
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P., 2002. SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y., 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
CICDDoS-2019, 2019. https://www.unb.ca/cic/datasets/ddos-2019.html.
CICIDS-2017, 2017. https://www.unb.ca/cic/datasets/ids-2017.html.
CIDDS, 2017. http://www.hs-coburg.de/cidds.
CIRA-CIC-DoHBrw-2020, 2020. https://www.unb.ca/cic/datasets/dohbrw-2020.html.
Creech, G., Hu, J., 2013. Generation of a new IDS test dataset: time to retire the KDD collection. In: 2013 IEEE Wireless Communications and Networking Conference (WCNC), pp. 4487–4492.
CSIC-HTTP-2010, 2010. https://petescully.co.uk/research/csic-2010-http-dataset-in-csv-format-for-weka-analysis/.
CTU-13, 2014. http://mcfp.weebly.com/.
Cybenko, G., 1989. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2 (4), 303–314.
DARPA, 1998,1999. http://www.tp-ontrol.hu/index.php/TP_Toolbox.
DDos-2016, 2016. www.researchgate.net/publication/292967044_Dataset-_Detecting_Distributed_Denial_of_Service_Attacks_Using_Data_Mining_Techniques.
DEFCON, 2000. https://defcon.org/html/links/dc-ctf.html.
Deng, H., Runger, G., Tuv, E., 2011. Bias of importance measures for multi-valued attributes and solutions. In: International Conference on Artificial Neural Networks, pp. 293–300.
Dong, Y., Wang, R., He, J., 2019. Real-time network intrusion detection system based on deep learning. In: 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS), pp. 1–4.
Dupond, S., 2019. A thorough review on the current advance of neural network structures. Annu. Rev. Control 14, 200–230.
Ertekin, S., Huang, J., Bottou, L., Giles, L., 2007. Learning on the border: active learning in imbalanced data classification. In: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, pp. 127–136.
Estabrooks, A., Jo, T., Japkowicz, N., 2004. A multiple resampling method for learning from imbalanced data sets. Comput. Intell. 20 (1), 18–36.
Fernández, A., Garcia, S., Herrera, F., Chawla, N.V., 2018. SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary. J. Artif. Intell. Res. 61, 863–905.
Freund, Y., Schapire, R.E., et al., 1996. Experiments with a new boosting algorithm. In: International Conference on Machine Learning, vol. 96, pp. 148–156.
Gers, F.A., Schmidhuber, J., Cummins, F., 2000. Learning to forget: continual prediction with LSTM. Neural Comput. 12 (10), 2451–2471.
Ghorbani, A.A., Lu, W., Tavallaee, M., 2009. Network Intrusion Detection and Prevention: Concepts and Techniques, vol. 47. Springer Science & Business Media.
Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep Learning, vol. 1. MIT Press Cambridge.
Guyon, I., Elisseeff, A., 2003. An introduction to variable and feature selection. J. Mach. Learn. Res. 3 (Mar), 1157–1182.
Haider, W., Hu, J., Slay, J., Turnbull, B.P., Xie, Y., 2017. Generating realistic intrusion detection system dataset based on fuzzy qualitative modeling. J. Netw. Comput. Appl. 87, 185–192.
Hajisalem, V., Babaie, S., 2018. A hybrid intrusion detection system based on ABC-AFS algorithm for misuse and anomaly detection. Comput. Netw. 136, 37–50.
Hamid, Y., Sugumaran, M., 2020. A t-SNE based non linear dimension reduction for network intrusion detection. Int. J. Inf. Technol. 12 (1), 125–134.
Hande, Y., Muddana, A., 2021. A survey on intrusion detection system for software defined networks (SDN). In: Research Anthology on Artificial Intelligence Applications in Security. IGI Global, pp. 467–489.
Haq, N.F., Onik, A.R., Hridoy, M.A.K., Rafni, M., Shah, F.M., Farid, D.M., 2015. Application of machine learning approaches in intrusion detection system: a survey. IJARAI-Int. J. Adv. Res. Artif. Intell. 4 (3), 9–18.
He, H., Bai, Y., Garcia, E.A., Li, S., 2008. ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pp. 1322–1328.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
Hines, P., Blumsack, S., Sanchez, E.C., Barrows, C., 2010. The topological and electrical structure of power grids. In: 2010 43rd Hawaii International Conference on System Sciences, pp. 1–10.
Hinton, G., Roweis, S.T., 2002. Stochastic neighbor embedding. In: NIPS, vol. 15. Citeseer, pp. 833–840.
Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput. 9 (8), 1735–1780.
Hodo, E., Bellekens, X., Hamilton, A., Tachtatzis, C., Atkinson, R., 2017. Shallow and deep networks intrusion detection system: a taxonomy and survey. arXiv preprint arXiv:1701.02145.
Hofstede, R., Hendriks, L., Sperotto, A., Pras, A., 2014. SSH compromise detection using NetFlow/IPFIX. ACM SIGCOMM Comput. Commun. Rev. 44 (5), 20–26.
Host, U., Network, 2016. https://csr.lanl.gov/data/cyber1/.
De la Hoz, E., De La Hoz, E., Ortiz, A., Ortega, J., Prieto, B., 2015. PCA filtering and probabilistic SOM for network intrusion detection. Neurocomputing 164, 71–81.
Hsu, C.-W., Chang, C.-C., Lin, C.-J., et al., 2003. A practical guide to support vector classification.
Hu, W., Gao, J., Wang, Y., Wu, O., Maybank, S., 2013. Online adaboost-based parameterized methods for dynamic distributed network intrusion detection. IEEE Trans. Cybern. 44 (1), 66–82.
Hubel, D.H., Wiesel, T.N., 1968. Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195 (1), 215–243.
Madeh Piryonesi, S., El-Diraby, T.E., 2021. Using machine learning to examine impact of type of performance indicator on flexible pavement deterioration modeling. J. Infrastruct. Syst. 27 (2), 04021005.
Mahoney, M.V., Chan, P.K., 2003. An analysis of the 1999 DARPA/Lincoln Laboratory evaluation data for network anomaly detection. In: International Workshop on Recent Advances in Intrusion Detection. Springer, Berlin, Heidelberg, pp. 220–237.
Mani, I., Zhang, I., 2003. kNN approach to unbalanced data distributions: a case study involving information extraction. In: Proceedings of Workshop on Learning from Imbalanced Datasets, vol. 126.
Martinez, A.M., Kak, A.C., 2001. PCA versus LDA. IEEE Trans. Pattern Anal. Mach. Intell. 23 (2), 228–233.
MAWILab, 2014. http://www.fukuda-lab.org/mawilab/documentation.html.
McCarthy, K., Zabar, B., Weiss, G., 2005. Does cost-sensitive learning beat sampling for classifying rare classes? In: Proceedings of the 1st International Workshop on Utility-Based Data Mining, pp. 69–77.
Mehmood, A., Mukherjee, M., Ahmed, S.H., Song, H., Malik, K.M., 2018. NBC-MAIDS: Naïve Bayesian classification technique in multi-agent system-enriched IDS for securing IoT against DDoS attacks. J. Supercomput. 74 (10), 5156–5170.
Milenkoski, A., Vieira, M., Kounev, S., Avritzer, A., Payne, B.D., 2015. Evaluating computer intrusion detection systems: a survey of common practices. ACM Comput. Surv. (CSUR) 48 (1), 1–41.
Miljanovic, M., 2012. Comparative analysis of recurrent and finite impulse response neural networks in time series prediction. Indian J. Comput. Sci. Eng. 3 (1), 180–191.
ICML-09, 2009. http://www.sysnet.ucsd.edu/projects/url/. Mishra, P., Varadharajan, V., Tupakula, U., Pilli, E.S., 2018. A detailed investigation
InSDN, 2020. http://aseados.ucd.ie/?p=177. and analysis of using machine learning techniques for intrusion detection. IEEE
IoT-23, 2020. https://mcfp.felk.cvut.cz/publicDatasets/IoT- 23- Dataset/iot_23_datasets Commun. Surv. Tutor. 21 (1), 686–728.
_small.tar.gz. MontazeriShatoori, M., Davidson, L., Kaur, G., Lashkari, A.H., 2020. Detection of
ISCX-IDS-2012, 2012. https://www.unb.ca/cic/datasets/ids.html. DoH tunnels using time-series classification of encrypted traffic. In: 2020
ISOT-Botnet, 2010. https://www.uvic.ca/engineering/ece/isot/datasets/botnet- IEEE Intl. Conf. on Dependable, Autonomic and Secure Computing, Intl.
ransomware/index.php. Conf. on Pervasive Intelligence and Computing, Intl. Conf. on Cloud and
ISOT-CID, 2018. https://www.uvic.ca/engineering/ece/isot/datasets/cloud-security/ Big Data Computing, Intl. Conf. on Cyber Science and Technology Congress
index.php. (DASC/PiCom/CBDCom/CyberSciTech), pp. 63–70.
ISTS-12, 2015. http://ists.sparsa.org/. Moustafa, N., Slay, J., 2015. UNSW-NB15: a comprehensive data set for network
ISOT, 2017. https://www.uvic.ca/engineering/ece/isot/datasets/botnet-ransomware/ intrusion detection systems (UNSW-NB15 network data set). In: 2015 Military
index.php. Communications and Information Systems Conference (MilCIS), pp. 1–6.
Jan, S.U., Ahmed, S., Shakhov, V., Koo, I., 2019. Toward a lightweight intrusion detec- Muniyandi, A.P., Rajeswari, R., Rajaram, R., 2012. Network anomaly detection by cas-
tion system for the internet of things. IEEE Access 7, 42450–42471. cading k-means clustering and C4. 5 decision tree algorithm. Procedia Eng. 30,
Jazi, H.H., Gonzalez, H., Stakhanova, N., Ghorbani, A.A., 2017. Detecting http-based 174–182.
application layer dos attacks on web servers in the presence of sampling. Com- NDSec-1, 2016. https://www2.hs- fulda.de/NDSec/NDSec- 1/Files/.
put. Netw. 121, 25–36. NGIDS-DS, 2016. research.unsw.edu.au/people/professor- jiankun- hu.
Jonker, M., King, A., Krupp, J., Rossow, C., Sperotto, A., Dainotti, A., 2017. Mil- Nisioti, A., Mylonas, A., Yoo, P.D., Katos, V., 2018. From intrusion detection to at-
lions of targets under attack: a macroscopic characterization of the dos ecosys- tacker attribution: acomprehensive survey of unsupervised methods. IEEE Com-
tem. In: Proceedings of the 2017 Internet Measurement Conference, pp. 100– mun. Surv. Tutor. 20 (4), 3369–3388.
113. NSL-KDD, 2009. https://www.unb.ca/cic/datasets/nsl.html.
KDD99, 1999. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html. OPCUA, 2020. https://digi2-feup.github.io/OPCUADataset/.
Keele, S., et al., 2007. Guidelines for Performing Systematic Literature Reviews in Özgür, A., Erdem, H., 2016. A review of KDD99 dataset usage in intrusion detection
Software Engineering. Technical Report. Citeseer. and machine learning between 2010 and 2015. PeerJ Preprints 4, e1954v1.
Khammassi, C., Krichen, S., 2017. A GA-LR wrapper approach for feature selection in Peng, K., Leung, V.C., Huang, Q., 2018. Clustering approach based on mini
network intrusion detection. Comput. Secur. 70, 255–277. batch Kmeans for intrusion detection system over big data. IEEE Access 6,
Kharon, 2016. http://kharon.gforge.inria.fr/dataset/index.html. 11897–11906.
Kiss, N., Lalande, J.-F., Leslous, M., Tong, V.V.T., 2016. Kharon dataset: android mal- Pyle, D., 1999. Data Preparation for Data Mining. Morgan Kaufmann.
ware under a microscope. In: The {LASER} Workshop: Learning from Authorita- Quinlan, J.R., 1983. Learning efficient classification procedures and their application
tive Security Experiment Results ({LASER} 2016), pp. 1–12. to chess end games. Mach. Learn. 463–482.
Koc, L., Mazzuchi, T.A., Sarkani, S., 2012. A network intrusion detection system Quinlan, J.R., 1986. Induction of decision trees. Mach. Learn. 1 (1), 81–106.
based on a Hidden Naïve bayes multiclass classifier. Expert Syst. Appl. 39 (18), Quinlan, J.R., 2014. C4. 5: Programs for Machine Learning. Elsevier.
13492–13500. Raskutti, B., Kowalczyk, A., 2004. Extreme re-balancing for SVMs: a case study. ACM
Kolias, C., Kambourakis, G., Stavrou, A., Gritzalis, S., 2015. Intrusion detection in Sigkdd Explor. Newsl. 6 (1), 60–69.
802.11 networks: empirical evaluation of threats and a public dataset. IEEE Ring, M., Wunderlich, S., Grüdl, D., Landes, D., Hotho, A., 2017. Flow-based bench-
Commun. Surv. Tutor. 18 (1), 184–208. mark data sets for intrusion detection. In: Proceedings of the 16th European
Koroniotis, N., Moustafa, N., Sitnikova, E., Turnbull, B., 2019. Towards the develop- Conference on Cyber Warfare and Security. ACPI, pp. 361–369.
ment of realistic botnet dataset in the internet of things for network forensic Ring, M., Wunderlich, S., Scheuring, D., Landes, D., Hotho, A., 2019. A survey of net-
analytics: bot-IoT dataset. Future Gener. Comput. Syst. 100, 779–796. work-based intrusion detection data sets. Comput. Secur. 86, 147–167.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. ImageNet classification with deep Roy, S.S., Mallik, A., Gulati, R., Obaidat, M.S., Krishna, P.V., 2017. A deep learning
convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105. based artificial neural network approach for intrusion detection. In: Interna-
Kyoto-20 06+, 20 06. http://www.takakura.com/Kyoto_data/. tional Conference on Mathematics and Computing, pp. 44–53.
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to Ruan, Z., Miao, Y., Pan, L., Patterson, N., Zhang, J., 2017. Visualization of big data
document recognition. Proc. IEEE 86 (11), 2278–2324. security: a case study on the KDD99 cup data set. Digit. Commun. Netw. 3 (4),
Lee, W., Stolfo, S.J., 20 0 0. A framework for constructing features and models for 250–259.
intrusion detection systems. ACM Trans. Inf. Syst. Secur.(TiSSEC) 3 (4), 227– Safavian, S.R., Landgrebe, D., 1991. A survey of decision tree classifier methodology.
261. IEEE Trans. Syst. Man Cybern. 21 (3), 660–674.
Li, J., Zhao, Z., Li, R., Zhang, H., 2018. Ai-based two-stage intrusion detection for Sarangi, S., Sahidullah, M., Saha, G., 2020. Optimization of data-driven filterbank for
software defined IoT networks. IEEE Internet Things J. 6 (2), 2093–2102. automatic speaker verification. Digit. Signal Process. 104, 102795.
Li, Z., Qin, Z., Huang, K., Yang, X., Ye, S., 2017. Intrusion detection using convolu- Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A., 2018. Toward generating a new
tional neural networks for representation learning. In: International Conference intrusion detection dataset and intrusion traffic characterization. In: ICISSp,
on Neural Information Processing, pp. 858–866. pp. 108–116.
Liu, X.-Y., Wu, J., Zhou, Z.-H., 2008. Exploratory undersampling for class-imbalance Sharafaldin, I., Lashkari, A.H., Hakak, S., Ghorbani, A.A., 2019. Developing realistic
learning. IEEE Trans. Syst. Man Cybern. Part B 39 (2), 539–550. distributed denial of service (DDoS) attack dataset and taxonomy. In: 2019 In-
Loh, W.-Y., 2011. Classification and regression trees. Wiley Interdiscip. Rev. Data ternational Carnahan Conference on Security Technology (ICCST), pp. 1–8.
Min.Knowl. Discov. 1 (1), 14–23. Shiravi, A., Shiravi, H., Tavallaee, M., Ghorbani, A.A., 2012. Toward developing a sys-
Ma, J., Saul, L.K., Savage, S., Voelker, G.M., 2009. Beyond blacklists: learning to de- tematic approach to generate benchmark datasets for intrusion detection. Com-
tect malicious web sites from suspicious URLs. In: Proceedings of the 15th ACM put. Secur. 31 (3), 357–374.
SIGKDD International Conference on Knowledge Discovery and Data Mining, Singh, K., Guntuku, S.C., Thakur, A., Hota, C., 2014. Big data analytics framework for
pp. 1245–1254. peer-to-peer botnet detection using random forests. Inf. Sci. 278, 488–497.
18
Yunwei Zhao received her PhD from Tsinghua University in 2015 and then worked as a postdoctoral researcher at Nanyang Technological University. She joined CNCERT/CC in 2017. Her research interests are data analytics, network security, data interdependence, behavior modeling, and social media analytics. Her publications appear in top-tier venues including IJCAI, IJCNN, and WI-IAT.

Han Han is an engineer at CNCERT/CC. He specializes in software engineering, AI, and cybersecurity. His research has bridged the gap between the theory and practical usage of AI-assisted software systems for better quality assurance and security.