A Systematic Literature Review of Methods and Datasets For Anomaly Based Network Intrusion Detection

Computers & Security 116 (2022) 102675

Zhen Yang a, Xiaodong Liu a, Tong Li a,∗, Di Wu a, Jinjiang Wang a, Yunwei Zhao b, Han Han b
a Department of Faculty of Information Technology, Beijing University of Technology, Beijing, China
b CNCERT/CC, Beijing, China
Article info

Article history:
Received 27 September 2021
Revised 28 November 2021
Accepted 27 February 2022
Available online 1 March 2022

Keywords:
Intrusion detection
Systematic literature review
Machine learning
Datasets

Abstract

As network techniques rapidly evolve, attacks are becoming increasingly sophisticated and threatening. Network intrusion detection has been widely accepted as an effective method to deal with network threats. Many approaches have been proposed, exploring different techniques and targeting different types of traffic. Anomaly-based network intrusion detection is an important research and development direction of intrusion detection. Despite the extensive investigation of anomaly-based network intrusion detection techniques, a systematic literature review of recent techniques and datasets is still lacking. We follow the methodology of systematic literature review to survey and study 119 top-cited papers on anomaly-based intrusion detection. Our study rigorously and comprehensively investigates the technical landscape of the field in order to facilitate subsequent research within this field. Specifically, our investigation is conducted from the following perspectives: application domains, data preprocessing and attack-detection techniques, evaluation metrics, coauthor relationships, and datasets. Based on the research results, we identify unsolved research challenges and unstudied research topics from each perspective. Finally, we present several promising high-impact future research directions.

© 2022 The Authors. Published by Elsevier Ltd.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
1. Introduction

Computer and network technologies play an increasingly important role in our daily life; however, their benefits have been offset somewhat by serious network attacks in recent years. The 2020 NTSC (National Technology Security Coalition) security report points out that significant network security issues have been aggravated every month.1 In 2019, about 620 million account details were leaked by hackers and sold on the dark web. This threatening situation has been exacerbated by the COVID-19 pandemic, because many people have to work from home, resulting in a significant increase in network traffic. According to the 2020 CIRA (Canadian Internet Registration Authority) Cybersecurity Survey, two-thirds of IT workers were required to work from home because of COVID-19.2

An intrusion detection system (IDS) is an effective security mechanism that monitors network traffic and prevents malicious requests. Research on intrusion detection is evolving rapidly with the development of machine learning. Traditional machine learning techniques have been widely used in intrusion detection, such as decision tree (DT) (Safavian and Landgrebe, 1991), random forest (RF) (Zhang et al., 2008), and support vector machine (SVM) (Hsu et al., 2003). With the development of deep learning, convolutional neural network (CNN) (Vinayakumar et al., 2017), recurrent neural network (RNN) (Yin et al., 2017), and long short-term memory (LSTM) (Roy et al., 2017) are becoming popular in intrusion detection. These techniques are based on different principles, and how to effectively exploit their advantages to address intrusion detection tasks in particular domains remains an open research question. Moreover, because the data are high-dimensional and complex, a common solution is to use data preprocessing techniques that reduce the dimensionality and thus make these high-dimensional spaces tractable. Preprocessing methods can affect detection performance, and should be carefully considered in the design of intrusion-detection methods.

Given the above research issues, a systematic and comprehensive literature review can contribute to the development of the community. Existing IDSs can be divided into two categories based on the detection method: anomaly-based detection and misuse-based (signature-based) detection (Axelsson, 2000; Ghorbani et al., 2009). Anomaly-based network intrusion detection is an important research and development direction of intrusion detection. We follow the methodology of the systematic literature review (SLR) to survey 119 highly cited papers on anomaly-based network intrusion detection. We diversify our analysis from multiple perspectives. First, we analyze research progress and identify potential bottlenecks in specific application scenarios, such as the Internet of Things (IoT) and industrial control networks. Second, we study preprocessing techniques, such as data cleaning, feature selection, and feature transformation, which can provide suggestions for data preparation. Third, we discuss intrusion detection techniques and analyze their principles and related applications by technology category. Fourth, we investigate evaluation methods, including metrics and datasets, which can help us to standardize them. Fifth, to study the current state of the community, we count the contributors and map the collaboration network. Finally, we conduct a systematic survey of cybersecurity datasets, so as to better understand them and evaluate their applicability.

In summary, the contributions of this paper are as follows:

• We are the first to use the SLR methodology to survey and study the 119 most highly cited papers in the field of network security intrusion detection, which were systematically screened from 14,942 candidates.
• We establish a comprehensive technical overview of the intrusion detection field from both coarse- and fine-grained perspectives. We provide a comprehensive overview of 52 cybersecurity datasets and label them according to their attributes.
• The analysis of approaches sheds light on future research directions.

The remainder of this paper is organized as follows. Section 2 summarizes existing literature review work related to intrusion detection systems and datasets, and Section 3 presents the methodology of our literature review of intrusion detection systems. Our research results are presented in Section 4. Section 5 concludes this paper and discusses future research directions.

∗ Corresponding author. E-mail address: [email protected] (T. Li).
1 https://www.ntsc.org/assets/pdfs/cybersecurity-report-2020.pdf
2 https://www.cira.ca/cybersecurity-report-2020
https://doi.org/10.1016/j.cose.2022.102675
0167-4048/© 2022 The Authors. Published by Elsevier Ltd.

2. Related work

A wealth of literature covers various aspects of intrusion detection. In this section, we present existing related works and compare them with our study.

2.1. Survey of intrusion detection methods

Most of the related research focuses on intrusion detection methods. Bhuyan et al. (2013) briefly describe and compare a large number of network anomaly detection methods and systems. Ahmed et al. (2016) analyzed anomaly detection methods and the complexity of machine learning/data mining (ML/DM) algorithms. Milenkoski et al. (2015) evaluated common practices for intrusion detection systems by analyzing the existing standard evaluation parameters, including workloads and metrics. Buczak and Guven (2015) discussed machine learning and data mining methods for network analysis to support intrusion detection. Hodo et al. (2017) presented a classification of shallow and deep network intrusion detection systems, investigated the performance of machine learning techniques in detecting anomalies, and discussed false and true positive alarm rates. Wang and Jones (2017) reviewed the applications of data mining, machine learning, deep learning, and big data in intrusion detection. Haq et al. (2015) conducted extensive research on the application of machine learning techniques in intrusion detection. Mishra et al. (2018) discuss the application of machine learning methods in intrusion detection and provide attack classification and attack feature mapping for each attack. Nisioti et al. (2018) provide a comprehensive overview of unsupervised and hybrid intrusion detection methods and emphasize the importance of feature engineering techniques in intrusion detection.

These studies classified intrusion detection techniques based on technical principles and detailed their advantages and disadvantages, but did not describe their research ideas or methods in a reproducible way. This weakens their rigor and is not conducive to further research. In addition, such papers lack comprehensiveness in the discussion of intrusion detection methods. We have more fully studied and discussed several aspects of intrusion detection, such as preprocessing methods, analytical models, and evaluation methods.

2.2. Survey of IDS in different application fields

Zarpelão et al. (2017) surveyed IDS in the IoT. In an overview of IoT devices, they argued that the IoT paradigm has three phases (collection; transmission; and processing, management, and exploitation), and presented a range of technologies that can be used for IoT devices, focusing on wireless technologies. Hande and Muddana (2021) provide an overview of existing security solutions for SDNs and a comparative study of various IDS approaches based on deep learning models and machine learning methods.

These studies discuss the current state of research on intrusion detection for a single target network. Compared to these works, our research is broader, covering the Internet, the Internet of Things (IoT), Software Defined Networks (SDN), and Industrial Control Networks (ICN). We also investigate datasets from different domains for researchers' reference.

2.3. Survey of intrusion detection datasets

Ring et al. (2019) identified 15 features of 34 intrusion detection datasets, categorized in five groups: general information, evaluation, recording environment, data volume, and data nature. Thakkar and Lohiya (2020) investigated different IDS datasets and research advances used to evaluate IDS models, focusing on the CIC-IDS-2017 and CSE-CIC-IDS-2018 datasets. The studies mentioned above focus on the characteristics of the datasets and the progress of research on them. Compared to these studies, our research also discusses intrusion detection principles and related methods.

Further, we categorized the relevant studies mentioned above according to the following criteria and present the results in Table 1.

• Methodology: whether the study is based on the SLR methodology.
• Intrusion detection technique: whether the study discusses intrusion detection techniques, and more specifically preprocessing approaches, analysis models, and evaluation methods.
• Multi-field: whether the study discusses the current state of research on intrusion detection in different network environments.
• Dataset: whether the study covers research on the relevant datasets.

As shown in Table 1, in contrast to other studies, our study follows the SLR methodology with comprehensive coverage of intrusion detection techniques (including preprocessing methods, analytical models, and evaluation methods) and datasets, and explores multiple target networks.

3. Research methodology

Various intrusion detection systems have been proposed. We have developed a research protocol according to the methodology
Table 1
Related studies on network intrusion detection survey.
Related work Year SLR-based Intrusion detection method Multi-field Dataset
Preprocessing Model Evaluation

√ √ √ √
Bhuyan et al. (2013) 2014 ✕ ✕
√
Milenkoski et al. (2015) 2015 ✕ ✕ ✕ ✕ ✕
√ √ √ √
Haq et al. (2015) 2015 ✕ ✕
√ √ √ √
Buczak and Guven (2015) 2015 ✕ ✕
√ √
Ahmed et al. (2016) 2016 ✕ ✕ ✕ ✕
√ √
Hodo et al. (2017) 2017 ✕ ✕ ✕ ✕
√ √
Wang and Jones (2017) 2017 ✕ ✕ ✕ ✕
√ √
Zarpelão et al. (2017) 2017 ✕ ✕ ✕ ✕
√ √ √
Nisioti et al. (2018) 2018 ✕ ✕ ✕
√ √
Mishra et al. (2018) 2019 ✕ ✕ ✕ ✕
√
Ring et al. (2019) 2019 ✕ ✕ ✕ ✕ ✕
√
Thakkar and Lohiya (2020) 2020 ✕ ✕ ✕ ✕ ✕
√
Hande and Muddana (2021) 2021 ✕ ✕ ✕ ✕ ✕
√ √ √ √ √ √
Our study
Fig. 1. SLR process.
of the systematic literature review (SLR) (Keele et al., 2007), as shown in Fig. 1. This includes identification of research, research questions, study selection, data extraction, and data synthesis. The protocol combines qualitative and quantitative research methods so that the results can be presented more intuitively.

3.1. Identification of research content

Obtaining a comprehensive set of papers requires an unbiased search strategy to find original research related to intrusion detection systems. The search process must be as rigorous and sensible as possible, and search terms must be defined. We found that some anomaly-based intrusion detection articles use only the term “intrusion detection” in their titles. Therefore, to fully cover anomaly-based intrusion detection articles, we defined the search term as “network intrusion detection”.

Before we started our literature search work, we evaluated three databases: Scopus, Google Scholar, and Web of Science. Scopus covers the major publishers of RE (ACM, Springer, IEEE) and is more inclusive than Web of Science, but less inclusive than Google Scholar. However, Google Scholar may include many papers that are not peer-reviewed, such as technical reports. For these reasons, we used Scopus to perform the publication search.

3.2. Research questions

We developed ideas for the analysis of a paper and articulated specific research questions (RQs), as shown in Table 2, including detailed sub-questions, to guide our research. First, we summarize the network environment in which intrusion detection techniques are applied (RQ1), which helps us to analyze the characteristics of the development and application of intrusion detection techniques. Second, we investigate the data preprocessing techniques (RQ2) and intrusion detection datasets (RQ6) commonly used in intrusion detection and make recommendations for the data preparation phase based on the findings. Third, we focus on the intrusion detection techniques proposed in the papers (RQ3), including framework (RQ3(a)), learning method (RQ3(b)), and types of supervision (RQ3(c)). We are also interested in the principles and applications of the models (RQ3(d)). Fourth, evaluation methods are important for measuring the capability of intrusion detection techniques, so we would like to learn about the general evaluation metrics in this area (RQ4). Finally, we are also interested in the authors of the papers (RQ5).

3.3. Study selection

The following research principles ensure consistent evaluation and minimize subjectivity.
Table 2
Research questions.
RQ1 Application domains (a) What domains are covered by intrusion detection techniques?
(b) How are the studies distributed among the different areas?
(c) What are the reasons for this distribution?
(d) How does research in these domains differ from country to country?
RQ2 Data preprocessing methods (a) What are the common data preprocessing techniques used in network intrusion detection?
(b) How are the preprocessing technologies implemented and what are their technical features?
(c) What is the distribution of their applications in intrusion detection?
RQ3 Detection techniques (a) Which models are applied in intrusion detection techniques?
(b) How do machine learning and deep learning apply to intrusion detection?
(c) What is the distribution of supervision types?
(d) What are the principles and characteristics of different intrusion detection technologies?
RQ4 Evaluation metrics (a) How is the performance of intrusion detection technologies evaluated?
(b) In our research articles how are these evaluation methods applied?
RQ5 Authors (a) Who are the main contributors to the articles?
(b) What does the network of co-authors look like?
RQ6 Datasets (a) What are the available public datasets?
(b) Which datasets are often used in network intrusion detection?
(c) Why are these datasets widely used?
Table 3
Inclusion and exclusion criteria and their interpretation.
Criteria IC/EC Criteria explanation
Inclusion IC1 Research efforts are explicitly and specifically dedicated to intrusion detection systems.
IC2 Research on intrusion detection methods based on machine learning.
Exclusion EC1 Does not meet an average of 10 citations per year.
EC2 Is shorter than 6 pages and therefore contains insufficient research content.
EC3 Is not an anomaly-based intrusion detection paper.
EC4 Is only a review paper.
• Explicit inclusion and exclusion criteria. These should be explicitly outlined; see Table 3. We screened papers for quality, length, and type to obtain a collection that could be used effectively for research and analysis.
• Objective review strategy. Papers should be reviewed for inclusion or exclusion by at least two reviewers with knowledge in the field. Information determination for datasets should be reviewed for applicability by at least two reviewers. A third reviewer makes a final decision if there is disagreement.
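The two principles above can be read as a small screening procedure. The following is a minimal sketch under stated assumptions, not the authors' actual tooling: the `Paper` fields, the function names, the reference year of 2021, and the way "citations per year" is computed (total citations divided by years since publication, counting at least one year) are all hypothetical, since the text does not specify them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Paper:
    title: str
    year: int            # publication year
    citations: int       # total citation count at screening time
    pages: int
    anomaly_based: bool  # reviewer judgement for EC3
    is_review: bool      # reviewer judgement for EC4


def passes_exclusion_criteria(paper: Paper, reference_year: int = 2021) -> bool:
    """Return True if none of EC1-EC4 from Table 3 applies."""
    # Assumption: citations per year = total citations / years since
    # publication (at least one year). The source gives no formula.
    years = max(reference_year - paper.year, 1)
    if paper.citations / years < 10:  # EC1: below 10 citations per year
        return False
    if paper.pages < 6:               # EC2: shorter than 6 pages
        return False
    if not paper.anomaly_based:       # EC3: not anomaly-based
        return False
    if paper.is_review:               # EC4: review paper only
        return False
    return True


def final_decision(reviewer_a: bool, reviewer_b: bool,
                   reviewer_c: Optional[bool] = None) -> bool:
    """Two reviewers vote; a third vote settles a disagreement."""
    if reviewer_a == reviewer_b:
        return reviewer_a
    if reviewer_c is None:
        raise ValueError("reviewers disagree: a third reviewer must decide")
    return reviewer_c


kept = passes_exclusion_criteria(Paper("Example IDS paper", 2017, 60, 10, True, False))
dropped = passes_exclusion_criteria(Paper("Short note", 2015, 30, 4, True, False))
```

Keeping the reviewer judgements (`anomaly_based`, `is_review`) as explicit inputs separates the mechanical checks on citations and length from the parts that need human review.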
3.4. Data extraction
Papers collected from the database are filtered using our defined criteria, as shown in Fig. 2. After filtering, we had 119 papers related to intrusion detection.

Relevant information on each paper was extracted and tagged for analysis. There are two categories of tags. The first can be obtained from the content of a paper: year of publication, author, number of citations, domain, model, evaluation metrics, and dataset. The second is based on the learning method and supervision type.

It was not necessary to carefully read the full text of a paper. We read the title, abstract, and introduction, which contained most of the information, and examined the text if necessary.
Fig. 2. Paper selection process.
3.5. Data synthesis
For learning methods and types of supervision, we annotated by analyzing the employed models. We classified domains through research of the traditional Internet (Web), IoT, industrial control networks (ICNs), and software defined networks (SDNs). Papers that lack a specific application scenario were classified as Internet. The IoT is a network of objects embedded with sensors, software, and other technologies to connect and exchange data with other devices and systems via the Internet. IoT technologies are most often associated with the “smart home” concept. ICNs are networks of digital control systems connected using the Ethernet standard. Industrial networks implement communication protocols between field devices, digital controllers, various software suites, and external systems. SDNs enhance network control through programming. This combination of features can bring the benefits of enhanced configuration, improved performance, and new architecture.

It is also necessary to label datasets. Datasets for network packet analysis in commercial products are not readily available for privacy reasons. Publicly available datasets such as DARPA, KDD, and UNSW-NB15 are widely used as benchmarks. We defined labels, including “Year”, “Authenticity”, “Count”, “Labeled”, and “Number of labels” to reflect the availability, novelty, authenticity, and data volume of a dataset.

4. Analysis of intrusion detection literature review

In this section, we present and analyze our findings. We provide an in-depth analysis and present our recommendations in the field of network intrusion detection from various perspectives, including method application areas, data preprocessing methods, detection techniques, evaluation metrics, datasets, and authors.

4.1. Application domain

From the perspective of application domain, the 119 articles can be categorized as Internet, IoT, ICN, and SDN (RQ1(a)). It is worth noting that we classify all articles that do not specify a particular application domain as Internet. We summarize the number of articles in each application domain in Table 4, answering RQ1(b). More than 70% of the papers apply to the Internet, which indicates that the security of traditional networks is an important research topic. On the one hand, this distribution is due to the relative maturity of intrusion detection research in the traditional network domain, where many papers are devoted to the improvement or refinement of existing methods (RQ1(c)). On the other hand, more public datasets are available in this field, such as KDD99 and UNSW-NB15, which are easy for researchers to analyze and experiment with. There are fewer research articles in the fields of ICN and SDN, with only 9 and 2 articles, respectively. Through our research, we found that articles in the ICN domain do not disclose their datasets due to the confidentiality of industrial network data, which also limits the development of security research in this domain. The lack of datasets is also a key factor limiting research in the SDN field. Researchers often need to build SDN network environments to simulate data before conducting security research.

Table 4
Number of papers by domain in intrusion detection methods.
Domain Number
Internet 88
Internet of Things (IoT) 20
Industrial Control Network (ICN) 9
Software Defined Network (SDN) 2

Different countries or regions also show different trends in different application domains. As can be seen in Fig. 3, China and India have a clear lead in the number of articles published in the field of network intrusion detection (RQ1(d)). This indicates that the two major developing countries, China and India, currently attach great importance to cybersecurity. With the rapid development of IoT technology, IoT security has also been a hot research topic in recent years, with half of the surveyed articles published in the past three years. As in the traditional network environment, China leads in the number of articles published in the IoT field, accounting for 1/5 of the total. Given the rapid development of 5G technology and its empowerment of the IoT, the IoT will continue to develop, and its security issues and related research will also become a research hotspot in the field of security.

Fig. 3. Five countries with the most published papers.

4.2. Data preprocessing methods

The representation and quality of the data is of primary importance in any data analysis process (Pyle, 1999). Raw data usually contain noisy and unreliable records that can affect training and analysis. Moreover, datasets used in intrusion detection are characterized by high dimensionality, making it more difficult to discover knowledge during training. Building high-performance detectors requires efficient preprocessing. Fig. 4 summarizes the common data preprocessing methods used in intrusion detection, including data cleaning, imbalanced learning, data transformation, feature selection, feature extraction, and visualization (RQ2(a)).

Fig. 4. Data preprocessing techniques commonly used in intrusion detection.

The above methods preprocess the data from different perspectives to enable the analysis model to better learn useful information from the data. In order to study the technical characteristics of different methods, we summarize their implementation principles and applications to answer RQ2(b).

a) Data cleaning

Data cleaning corrects corrupt or inaccurate records. Quality criteria may include the following.

• Validity. Data might have to be of a certain type, such as Boolean or numeric.
• Accuracy. Data must conform to the situation. For example, outliers may exist due to the recording process. Accuracy is difficult to guarantee through data cleaning because real data sources are needed for validation.
• Completeness. Some data may have unknown or missing values. Completeness issues are generally resolved through default values, setting zeros, or removal.
• Uniformity. Inconsistency occurs when there are conflicts in a dataset. For example, a source IP may differ between two receivers. Fixing this type of problem requires determining which datum is most reliable.

Data analysis based on means, standard deviations, or clustering algorithms can reveal errors, whose values can sometimes be set to a mean or other statistical measure.
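As an illustration of the completeness and accuracy checks described above, the following minimal sketch fills missing values and replaces values that lie far from the mean. The function name `clean_column`, the z-score threshold, and the choice to substitute the mean of the remaining values are assumptions made for this example; the text only says that errors can "sometimes be set to a mean or other statistical measure".

```python
from statistics import mean, stdev


def clean_column(values, z_threshold=3.0, default=0.0):
    """Fill missing entries (None) and replace extreme values.

    Assumption: a value is treated as an error when it lies more than
    z_threshold sample standard deviations from the column mean.
    """
    present = [v for v in values if v is not None]
    if not present:
        # Nothing observed: fall back to a default value.
        return [default for _ in values]

    mu = mean(present)
    sigma = stdev(present) if len(present) > 1 else 0.0

    def is_outlier(v):
        return sigma > 0 and abs(v - mu) > z_threshold * sigma

    inliers = [v for v in present if not is_outlier(v)]
    fill = mean(inliers) if inliers else mu
    return [fill if v is None or is_outlier(v) else v for v in values]


# Hypothetical feature column with one missing value and one recording error.
raw = [1.0, 2.0, None, 3.0, 2.0, 1.0, 3.0, 2.0, 2.0, 1.0, 3.0, 500.0]
cleaned = clean_column(raw, z_threshold=2.0)
```

A lower threshold is used in the example because, in a small sample, a single extreme value inflates the standard deviation and can hide itself from a stricter test.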
b) Imbalanced learning

A sample with different proportions of positive and negative cases will bias the learning process toward the class with the higher proportion. For example, in the extreme case of a dataset with 95% positive and 5% negative cases, a model that always predicts the positive class reaches 95% accuracy yet has no practical value. Since attacks tend to be sparse, datasets in intrusion detection are often unbalanced. The following methods may improve performance.

• Sampling methods. A balanced dataset usually provides better overall classification performance (Estabrooks et al., 2004; Weiss and Provost, 2001), and obtaining the same proportion of positive and negative examples is the most common imbalanced learning method. The simplest method, undersampling the majority class, obviously leads to information loss. Liu et al. (2008) introduced the EasyEnsemble and BalanceCascade algorithms, which combine subsets of the majority class with the minority class and perform ensemble learning on the resulting classifiers. NearMiss uses a KNN classifier to select the majority-class samples with the smallest average distance to the minority class (Mani and Zhang, 2003). Oversampling of minority examples is usually accomplished by synthetic sampling. The synthetic minority oversampling technique (SMOTE) synthesizes data based on feature space similarity between minority examples (Chawla et al., 2002). Modified and extended sampling algorithms (Fernández et al., 2018) have been proposed to address the problem of over-generalization in SMOTE (Wang and Japkowicz, 2004). The adaptive synthetic sampling method (ADASYN) (He et al., 2008) adaptively creates synthetic data based on the distribution of minority examples and addresses the problem of overlapping between classes.
• Cost-sensitive approach. This approach considers the costs associated with misclassification and uses a cost matrix to assign different costs to different misclassifications (Ting, 2002). It has been shown to be a viable alternative to sampling methods (McCarthy et al., 2005).
• Additional methods. Many algorithms obtain good performance from other perspectives. Based on kernel methods as well as active learning, Ertekin et al. (2007) proposed an SVM-based active learning method that restricts the query process in each iteration of active learning to the data pool rather than the entire dataset. The SVM is trained during this process, and the instances closest to the hyperplane, which are the most informative, are extracted to form a new training set. One-class learning uses mainly or only single-class samples for recognition, which distinguishes it from traditional discrimination-based induction, and gives good results with extremely unbalanced data (especially with high feature space dimensions) (Raskutti and Kowalczyk, 2004).

c) Data conversion

Training data must often be transformed and mapped before being fed to the model to accommodate requirements, and to improve detection speed and accuracy. This affects two types of data in IDS datasets.

• Non-numeric data. Taking the UNSW-NB15 dataset as an example, features in nominal form include the type of transport protocol, state, service type, and attack type, stored as strings, which most machine learning algorithms do not support. The most straightforward way is to number the values under a feature and map them, but this will cause errors. For example, in the calculation of mean square error,

$$\mathrm{MSE} = \frac{1}{M}\sum_{m=1}^{M}\left(y_m - \hat{y}_m\right)^2, \tag{1}$$

the squared error of misclassifying a class marked as 0 as the class marked as 9 is 81 times that of misclassifying it as the class marked as 1, which is unreasonable. One-hot encoding is a common way to handle this. It uses an n-bit vector to encode n states, and exactly one bit is set when a given state is in effect.
• Numeric data. The range of values differs by feature. Deep learning frameworks avoid its impact on model accuracy by introducing bias, but the time spent on model learning may still be affected when the ranges of values of two features are too different. For example, in ML optimization using gradient descent, the eigenvalues of $X^{T}X$ (i.e., the scales of features) determine the speed of convergence to a global or local minimum. Therefore, value ranges of features are often unified by data scaling before training, such as by min-max normalization. Every value of a feature is mapped to between 0 and 1,

$$x' = \frac{x - \min(x)}{\max(x) - \min(x)}. \tag{2}$$

However, this method cannot handle outliers. For example, with nine values between 0 and 1 and an outlier equal to 100, the nine smaller values will be mapped to values between 0 and 0.01. This can be avoided by z-score standardization,

$$x' = \frac{x - \mu}{\sigma}, \tag{3}$$

where μ and σ are the mean and standard deviation, respectively, of the feature. This can scale the values to near 0 while preserving the distribution of the features, but the features may not be on exactly the same scale.

d) Feature selection

Feature selection is the selection of a subset of the original features as model input. This can avoid the curse of dimensionality and enhance generalization (Bermingham et al., 2015). Feature selection presumes that the data contain redundant or irrelevant features, which can be removed without excessive information loss. Feature selection can be accomplished in several ways.

• Manual selection. Whether to remove a feature is manually determined. See, for example, Zhang et al. (2018).
• Exhaustive search. Testing every possible subset of features to find the subset with the lowest error rate can be computationally intensive (Guyon and Elisseeff, 2003).
• Embedded method. Feature selection is performed during model construction. The Bolasso algorithm (Bach, 2008) fits an L1-penalized linear model (Lasso), which reduces many regression coefficients to zero, on bootstrap samples of the data and keeps the features selected across them. FeaLect (Zare et al., 2013) scores and selects features based on the combinatorial analysis of regression coefficients.
• Wrapper approach. A prediction model is trained with each subset and tested on a holdout set. The score of a subset is obtained from the error rate of the model test. This is computationally intensive, and usually used only to find the best subset of features.
• Filtering method. Methods such as mutual information (Guyon and Elisseeff, 2003), Pearson correlation coefficients, and significance scores, such as inter- or intra-class distances (Yang and Pedersen, 1997), can be used to score features. These methods rank features but do not produce the best subset.

e) Feature extraction

Unlike feature selection (Sarangi et al., 2020), feature extraction, i.e., the creation of new features to facilitate learning, is considered a key factor in building a model. This can be performed by the following algorithms.
• Principal Component Analysis (PCA). One of the most used linear dimensionality reduction methods, PCA changes the basis of data according to principal components, which are essentially eigenvectors of the data covariance matrix. De la Hoz et al. (2015) used PCA for feature extraction, while Xiao et al. (2019) combined PCA with autoencoders to compress high-dimensional features for input to CNNs. Variants include probabilistic PCA (PPCA), which utilizes probability distributions; kernel PCA, which uses kernel functions to map low-dimensional spaces to high-dimensional spaces before using PCA to reduce the dimensions; and independent component analysis (ICA), which requires no hidden variables to obey Gaussian distributions.
• Linear Discriminant Analysis (LDA). A classic dimensionality-reduction method, LDA finds linear combinations of features to describe multiple classes of objects. As a supervised learning algorithm, it searches the low-dimensional space for the vectors that best distinguish classes of data (Martinez and Kak, 2001), projecting data in low dimensions to minimize intra-class distances and maximize inter-class distances. Subba et al. (2015) built an intrusion-detection model using LDA with logistic regression, with significant advantages in computational efficiency.
• Autoencoder. This method uses hidden layers for unsupervised learning, mapping high-dimensional features by a nonlinear transformation (Goodfellow et al., 2016) to produce a representation as close as possible to the original input. Regularized autoencoders (sparse, denoising, and contractive) are commonly used in representation learning (An and Cho, 2015). Zhang et al. (2018) achieved 98.80% accuracy on the UNSW-NB15 dataset with a denoising autoencoder (DAE).
f) Visualization
Data visualization is a graphical representation of data that helps researchers understand characteristics such as data distribution. In intrusion detection, visualization helps us further understand the characteristics of an attack by refining data attributes and characteristics. Moreover, because machine learning algorithms are hard to interpret, we are often unable to analyze the causes of classifier misclassification. Using visualization techniques, we can better identify attack behaviors for deeper analysis. Data dimensionality reduction works by extracting a subset of the original features or transforming the original data to a lower dimensional space. In intrusion detection, t-SNE and PCA are often used to implement network traffic visualization.
• t-distributed Stochastic Neighbor Embedding (t-SNE). The t-SNE (Van Der Maaten, 2014) technique is a dimensionality reduction technique used to visualize high-dimensional data sets by representing them in a low-dimensional space in two or three dimensions. It improves on Stochastic Neighbor Embedding (SNE) (Hinton and Roweis, 2002), alleviating the crowded sample distribution and inconspicuous boundaries that SNE produces after visualization. Visualization can help researchers understand information about data distribution, sample overlap, etc. For example, Fig. 5 shows the distribution of normal samples (green points) and attack samples (gray points) after visualization. Visualization methods have been practically applied to intrusion detection. Hamid and Sugumaran (2020) performed data dimensionality reduction and visualization based on t-SNE and combined it with a support vector machine for classification. The results showed that the detection rate was improved for almost all attack groups. Yao et al. (2020) proposed a new unsupervised intrusion detection algorithm based on t-SNE and hierarchical neural networks. In this study, the authors used two-dimensional visualization techniques to visually determine the effect of dimensionality reduction.
• Principal Component Analysis (PCA). As discussed in the previous section, PCA is commonly used for dimensionality reduction of high-dimensional data and can be used to extract the main features of the data. It is also often used to visualize data. Fig. 6 shows the distribution of the data after visualization based on PCA. The green points are normal samples and the gray points are attack samples. In studies related to intrusion detection, Ruan et al. (2017) visualized the KDD99 dataset based on PCA and proposed a new sampling method that can visually identify normal classes because it has the compactness and uniqueness of internal classes. Bulavas (2018) proposed an intrusion detection method based on PCA for data visualization combined with decision trees. Experiments show that the method performs well in detecting a variety of attacks.
Fig. 5. Visualization plots based on t-SNE.
Fig. 6. Visualization plots based on PCA.
Compared with other preprocessing methods, feature selection and feature extraction are the research points of many articles. Among all the papers we investigated, 38 focus on making improvements to feature engineering algorithms (including feature selection and feature extraction), while the others focus on improving classification algorithms. In general, it can be considered that current research on IDS is more focused on improving the performance of classification algorithms. We believe this trend is caused by the fact that the format of each dataset is too different, a problem implying that the generalization of feature-related algorithms is usually worse, further leading to inefficiency in the application of feature engineering algorithms. We summarize the algorithms used in these 38 articles in Table 5. In the results shown in Table 5, swarm intelligence algorithms can be found to be relatively more popular, which we believe is again related to the dataset (RQ2(c)). Given that IDS datasets usually have too many features, researchers usually focus more on performing feature selection. With the difficulty of determining the importance of features, swarm intelligence algorithms with a certain degree of randomness become the preferred choice.
Table 5
Statistics of feature engineering methods.
Method                          Count
Swarm intelligence algorithms   12
Manually defined rules          8
PCA                             6
Deep learning                   5
Clustering                      4
SVM                             2
Decision tree                   1
In contrast to feature engineering, researchers have tended to improve classification algorithms based on ensemble learning and deep learning, which have higher capabilities, in order to obtain more accurate classification results. It is worth mentioning that deep learning itself can also perform feature engineering, which is one of the reasons why deep learning is widely used in IDS research. In future work, we also propose to combine feature engineering with visualization, which will help us understand features such as data distribution to further understand the characteristics of attacks.

4.3. Detection technique

We summarize the classification models used in the articles in Tables 6 and 7, answering RQ3(a). The most commonly used machine learning algorithm for intrusion detection is SVM, a discriminative classifier defined by a separating hyperplane that uses a kernel function to map training data to a high-dimensional space for linear classification of intrusions. Data used in intrusion detection usually have high dimensionality, with which SVM has high generalization capability and performs well. Decision trees are widely used due to their high efficiency and interpretability.
Table 6
Most used Machine Learning Algorithms in Proposed Methods.
Type                    Method                   Count
Supervised learning     Support vector machine   21
                        Decision tree            11
                        Naive Bayes              8
                        K-nearest neighbors      7
                        Random forest            5
                        AdaBoost                 2
                        Hidden Markov model      1
Unsupervised learning   K-means                  4
                        DBSCAN                   1
Table 7
Most used Deep Learning Algorithms in Proposed Methods.
Type                    Method                   Count
Supervised learning     DNN                      7
                        RNN                      7
                        CNN                      7
                        LSTM                     4
                        DBN                      2
                        DFFNN                    2
                        BPNN                     1
                        ELM                      1
Unsupervised learning   Autoencoder              5
                        Self-taught learning     3
                        RBM                      1
Deep learning is evolving rapidly, and is becoming the basis of more intrusion detection methods. To answer RQ3(b), we plotted
the number of ML and DL based intrusion detection articles per year, see Fig. 7. Based on the specificity of our research subject, the number of studies in 2019–2020 declined compared to previous years. However, the search criteria may have filtered out some quality papers with fewer citations. There is an upward trend in the number of annual papers because, with more frequent network attacks, people are paying more attention to network security. As the Internet carries increasing amounts of information, it has become a profitable target for attackers. In addition, hacking tools and techniques are readily available.
Fig. 7. Changes in numbers of papers published over time.
We can see that traditional machine learning methods are still the mainstream technology. These are easier to deploy and implement than deep learning methods, are not limited by computing power, and are more interpretable. However, the trend in the number of machine learning papers is similar to that of deep learning. The rapid development of deep learning has added to research of intrusion detection. Both SVM and DT are supervised learning methods, and they require labels during training. Labeling is time-consuming and tedious for large datasets, and clustering algorithms may be a better choice. K-means and DBSCAN are commonly used clustering algorithms, as shown in Table 6. We classify methods by the supervision type, as shown in Fig. 8, from which it can be seen that supervised learning is most widely used (RQ3(c)). This is because many publicly available datasets are already labeled and researchers prefer supervised learning. As mentioned earlier, annotation is time-consuming and labor-intensive. Therefore, unsupervised and semi-supervised learning should also be of interest.
Fig. 8. Number of published papers over time.
Among the four target networks, SVM and DT are the most widely used machine learning algorithms in the “Internet” and have achieved excellent performance. Unsupervised learning methods are more popular in the “IoT” and “ICN”. For “SDN”, only two studies are included in our work, and they use RNN and RF algorithms, respectively.
Although machine learning and deep learning are increasingly being used for network intrusion detection, the effectiveness of these methods can be significantly reduced in adversarial environments. An adversarial environment is one in which adversaries consciously limit or prevent accurate performance by some means. For example, adversaries design adversarial examples and add them to the training set to trick the model into producing incorrect outputs. Alhajjar et al. (2021) generate adversarial examples using evolutionary computation (particle swarm optimization and genetic algorithms) and deep learning (generative adversarial networks). Their adversarial example generation technique caused high misclassification rates in 11 different machine learning models and voting classifiers. To improve the robustness of intrusion detection algorithms in adversarial environments, researchers have also conducted related research. Caminero et al. (2019) proposed an adversarial environment reinforcement learning algorithm for intrusion detection. They added an adversarial agent strategy to the training to increase the classifier’s false predictions and force it to learn the most difficult cases, ultimately obtaining better results. However, in general, more efforts are needed to study intrusion detection algorithms in adversarial environments.
In order to study the principles and characteristics of different intrusion detection models, we present them in the following and analyze their principles and related applications to answer RQ3(d).
- DT supports decision making through a tree-like model consisting of decisions and their outcomes, is widely used in classification tasks (often called classification trees), and therefore is a common supervised learning classification method in IDS. A trained DT makes multiple selections of a packet’s features to determine its class. An optimal DT holds the most data with a minimum number of levels (Quinlan, 1983). Several algorithms have been proposed to generate optimal trees, such as ID3 (Quinlan, 1986), C4.5 (Quinlan, 2014), and classification and regression tree (CART) (Loh, 2011). There are different metrics to measure DT performance. ID3 uses information gain (entropy) and makes decisions by selecting attributes with the highest information gain, but does not support missing and continuous values in features, which limits its applicability. C4.5 uses the information gain ratio based on ID3, and prunes the tree by replacing branches that do not help with leaf nodes. CART uses Gini impurity (an information-theoretic measure corresponding to Tsallis entropy) as a metric, solving the problem that ID3 cannot handle the regression task.
DT has an intuitive classification strategy, is interpretable and simple to implement, and often allows for better generalization through post-construction pruning, making it a common model in intrusion detection. Anthi et al. (2019) proposed a three-layer intrusion detection system (IDS) that identifies IoT devices based on MAC addresses, classifies messages as bona fide or malicious, and employs DTs to classify attacks. Abbes et al. (2010) classified records as benign or anomalous by analyzing application protocols, using separate and distinct adaptive DTs for each. The system achieved good results identifying DoS attacks, scanning attacks, and botnets. Muniyandi et al. (2012) proposed an anomaly-detection method that uses k-means to form k clusters of training instances based on Euclidean distance similarity, and C4.5 on each cluster to construct DTs of normal and abnormal instance density regions.
The disadvantage of DT is weak robustness; small changes in training data may result in a completely different DT. Furthermore, information gain is biased toward attributes with more levels (Deng et al., 2011), so larger DTs may require manual pruning.
- SVM (Özgür and Erdem, 2016) constructs an N-dimensional hyperplane to optimally classify data. SVM can be linear or nonlinear. Linear SVM is used for linearly separable data, i.e., datasets that can be divided into two categories by a straight line. Nonlinear SVM is used for nonlinearly separable data. For this, we use a kernel trick that sets data points in a higher dimension where they can be separated using planes or other functions.
SVM can simplify the solution of high-dimensional problems. It is based on small-sample statistical theory, has good generalization ability, and is often used in intrusion detection. Jan et al. (2019) developed a lightweight attack-detection strategy using supervised machine learning-based SVMs to detect attempts to inject unwanted data into IoT networks. It obtains a feature pool from samples, and uses it with a label vector to train the SVM. The method has good classification accuracy and detection times. Teng et al. (2017) proposed an intrusion detection method based on SVM, which constructs four two-stage SVMs based on the structure of DT. SVM1, SVM2, SVM3, and SVM4 detect normal data, DoS/DDoS attacks, probing attacks, and R2L or U2R attacks, respectively. Experiments show that this method outperforms the method of a single SVM in terms of detection rate and recall. De la Hoz et al. (2015) proposed a hybrid statistical technique and Self Organizing Map (SOM) for network anomaly detection and classification. The method uses PCA and the Fisher discriminant ratio (FDR) for feature selection and noise removal, and probabilistic self-organizing mapping (PSOM)-based modeling of the feature space to distinguish normal and malicious traffic.
- Clustering groups objects that are more similar to each other than to objects in other groups. It is generally understood as a task to be solved rather than an algorithm. Since the concept of a cluster (i.e., the similarity between objects) cannot be described precisely, there are widely different clustering algorithms. Clustering can be considered hard or soft, according to matching rules between objects and clusters. Hard clustering strictly assigns objects to classes. The most representative algorithms are k-means clustering and k-nearest neighbor (KNN), which calculate the Euclidean distance between objects to classify clusters. Soft (or fuzzy) clustering calculates the degree (e.g., probability) of each object’s belonging to a cluster. Data often cannot be divided into clearly separated clusters, and soft classification is used to obtain more flexible results. Fuzzy c-means is a widely used soft clustering algorithm that calculates the membership coefficient of each object in each cluster based on the distances between them, which improves the clustering of complex data.
Clustering algorithms can be categorized as connectivity-based (e.g., hierarchical), centroid-based (k-means, fuzzy c-means), distribution-based (GMM), density-based (DBSCAN), or grid-based (STING). Clustering is generally simple to implement and easy to interpret, but is sensitive to outliers, and initial values of parameters have too much influence on the results.
Peng et al. (2018) proposed a method for intrusion detection systems using mini-batch k-means for clustering and PCA to reduce data dimensionality. Experimental results and time complexity analysis showed that the method is effective. Casas et al. (2012) proposed UIDS, an unsupervised network intrusion detection system capable of detecting unknown network attacks without the use of signatures, labeled traffic, or training. UIDS uses an unsupervised outlier detection method based on subspace clustering, and multiple evidence accumulation techniques to identify types of attacks.
- Naive Bayes (NB) is a probabilistic classifier based on Bayes’ theorem [44]. All naive Bayesian classifiers are based on the principle that the value of a feature is independent of the value of any other feature, i.e.,
\[ \hat{y} = \underset{k \in \{1,\dots,K\}}{\arg\max}\; p(C_k) \prod_{i=1}^{n} p(x_i \mid C_k), \tag{4} \]
where $\hat{y}$ is the predicted class, i.e., the class with the highest posterior probability for the data, $K$ is the number of classes, $C_k$ is the $k$th class, $n$ is the number of features, $p(C_k)$ is the prior probability of $C_k$, and $p(x_i \mid C_k)$ is the conditional probability of feature $x_i$ given class $C_k$. A feature distribution (i.e., an event model) or nonparametric model generated from the training set must be assumed in order to compute a class prior. The multinomial and Bernoulli distributions are usually used for discrete features, and the Gaussian distribution for continuous features. Bayesian classifiers can be trained on both labeled and unlabeled datasets by certain semi-supervised training algorithms [15].
Koc et al. (2012) proposed an approach based on the hidden naive Bayes (HNB) model, which can be applied to intrusion detection problems affected by dimensionality, highly correlated features, and high network data stream capacity. HNB is a data mining model that relaxes the conditional independence assumptions of the NB approach. Experimental results show that the HNB model outperforms the traditional NB model in terms of accuracy, error rate, and misclassification cost. To address the potential threat of DDoS attacks in the IoT, Mehmood et al. (2018) proposed an NB algorithm with multi-agent-based IDS (NB-MAIDS) and implemented multi-agents in the whole network.
Although the independence assumption of NB is often violated in practice, it still has relatively high accuracy. In addition, as a linear algorithm, NB has high training efficiency. These qualities have led to its widespread application as a baseline for classification problems.
- Ensemble learning combines multiple classifiers through an algorithm to find a (hopefully) better hypothesis in a mixed multiple hypothesis space. It should be noted that the combination of multiple classifiers does not guarantee better performance than the best individual classifier, but it reduces the risk of a particularly poor selection.
One of the earliest and most intuitive ensemble algorithms, bagging (bootstrap aggregating) (Breiman, 1996) obtains the diversity of classifiers by randomly drawing a subset of the entire training set to train classifiers of the same type, and allows each classifier in the set to vote with the same weight to combine individual classifiers. The random forest classifier (Breiman, 2001) is a common machine learning method that combines bagging with
DTs. The boosting method recursively builds an ensemble by training a new classifier to emphasize the training data misclassified by its previous classifier. Based on this algorithm, several well-known machine learning algorithms have been proposed, such as adaptive boosting (AdaBoost) (Freund et al., 1996), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost).
Ensemble learning improves the generalizability and accuracy of the final model by ensembling multiple classifiers, and is less likely to be overfitted. Its training and prediction speeds are naturally lower than those of a single classifier, and the interpretability of the model is largely lost in some complex ensembles (Madeh Piryonesi and El-Diraby, 2021). Singh et al. (2014) developed an RF-based DT model for the quasi-real-time peer-to-peer botnet detection problem. Li et al. (2018) proposed an artificial intelligence-based two-stage intrusion detection method that uses software-defined techniques. It uses the swarm partitioning and binary difference variants of the bat algorithm to select typical features, and RFs to classify streams by adaptively changing the weights of samples using a weighted voting mechanism. Hu et al. (2013) proposed an online intrusion detection algorithm that constructs a local parameterized detection model at each node using the online AdaBoost algorithm. A global detection model is constructed in each node using a small number of samples in the nodes, combined with the local parametric model. Experimental results show that the improved online AdaBoost has a higher detection rate and lower false-alarm rate.
- Evolutionary algorithms are global optimization algorithms inspired by biological evolution, usually the trial-and-error problem of populations. Initial candidate solutions are repeatedly updated and iterated, with poorly performing solutions removed at each generation and random variations introduced, consistent with the concept of natural selection and variation.
Most widely used are genetic algorithms, genetic programming, evolutionary algorithms, particle swarm optimization (PSO), and artificial immune systems, which differ mainly in how the iterations are performed. Genetic algorithms and genetic programming calculate a fitness value for each individual in a population, and select individuals with high fitness values for the mating pool with high probability to produce the next generation through the exchange of genetic material and mutations between individuals. The genetic algorithm considers the bit string as an individual, while genetic programming considers the program as the individual. Evolutionary algorithms generally simulate the biological learning process in nature. For example, the artificial bee colony (ABC) algorithm simulates the process of bees searching for food sources. The artificial immune system simulates the immune system function by cloning and mutating antibodies with high affinity to a “virus” (i.e., the sample to be detected) in order to iterate.
Evolutionary computation is characterized by a variety of iterative methods. The iterative approach typically requires the manual definition of multiple parameters and evaluation functions for the problem to be solved. Thus, the algorithm has problem-independent fast search capability and wide applicability, and the population-based principle brings parallelism, which increases the speed of the search for the optimal solution. However, the performance of the evolutionary computation depends strongly on the evaluation function and parameters (which are usually set empirically), which affects the efficiency of the solution. Some algorithms converge too easily to a local optimum, or even an arbitrary point, while others are poor at finding local optima. Although this can be alleviated by replacing the evaluation function and parameters (Taherdangkoo et al., 2013), the “no free lunch” theorem (Wolpert and Macready, 1997) has proved that this problem has no general solution.
Khammassi and Krichen (2017) proposed a GA-LR wrapper method for feature selection in network intrusion detection, using a genetic algorithm-based wrapper method as a search strategy and logistic regression as a learning algorithm to select the best feature subset. The method effectively improves the intrusion detection performance. Hajisalem and Babaie (2018) proposed a hybrid classification method based on ABC and artificial fish swarm (AFS) algorithms, using fuzzy C-Means (FCM) clustering and correlation-based feature selection (CFS) to divide the training dataset and remove irrelevant features. Based on the selected features, if-then rules are generated by a CART technique to distinguish normal and abnormal records. The generated rules are used to train the detection model. In simulations on the NSL-KDD and UNSW-NB15 datasets, the method achieved a detection rate of 99% and a false-positive rate of 0.01%.
- The DNN is an artificial neural network (ANN) with multiple layers between the input and output layers (Bengio, 2009). In a narrow sense, it is a fully connected neural network with a structure similar to a multilayer perceptron (MLP). The lower-layer neurons of a fully connected DNN can form connections with all upper-layer neurons. A DNN uses backpropagation to perform a supervised learning task with nonlinear activation functions.
Vinayakumar et al. (2019) built a DNN-based distributed deep learning model for an intrusion detection framework for real-time processing and analysis of very large-scale data. Xu et al. (2018) proposed an IDS consisting of an RNN with gated recurrent units (GRUs), MLP, and softmax module. The DNN can theoretically approximate any function (Cybenko, 1989).
- The CNN is an artificial neural network with a shared-weight structure based on convolutional kernels or filters. Inspired by biological processes (Hubel and Wiesel, 1968), a CNN slides the convolutional kernel along the input features to extract translation-equivariant responses called feature maps.
The CNN and its related architectures have received considerable attention due to their excellent performance at computer vision (He et al., 2016). Starting with LeNet-5 (LeCun et al., 1998), numerous CNN architectures, including AlexNet (Krizhevsky et al., 2012) and ResNet (He et al., 2016), have been proposed. Although CNN architectures are usually applied to CV problems, they have shown good results in IDS as well (Dong et al., 2019; Vinayakumar et al., 2017). Li et al. (2017) proposed an image conversion method for NSL-KDD data, in which CNNs automatically learn the features of graphic NSL-KDD transformations.
Compared to the DNN, the CNN’s extraction of local features reduces the number of weights, as well as the computational complexity, thus improving the training and prediction speed. However, this can lead to problems; some trained CNN models extract the features of wheels in an image and immediately judge the image as a truck.
- The RNN is a class of artificial neural network that can temporally exhibit memory behavior. This dynamic behavior is implemented by connections between nodes to form a directed graph along a time sequence (Dupond, 2019). The internal state of the RNN allows it to process variable-length input sequences.
Depending on whether the constructed graph has a loop, an RNN can be further classified as finite- or infinite-impulse (Miljanovic, 2012). Finite-impulse networks can be unrolled and replaced with strict feedforward neural networks (FNNs), while infinite-impulse recurrent networks cannot. Moreover, there can be additional stored states in an RNN, thus improving it to a network that can be implemented with time delays or feedback loops (e.g., long short-term memory networks).
The RNN was proposed to solve the problem that a DNN has difficulty fitting data that changes temporally. Therefore, RNNs have played an important role in areas such as natural language processing and action recognition (Tang et al., 2018). RNNs are increasingly applied to IDS, whose data mostly consist of temporally continuous data streams
dicted samples, and it measures the overall recognition of the classifier. FAR is a critical metric to evaluate intrusion detection methods. False alarms are a manifestation of false positives, whose large number will increase the load on the system and human resources. After classification, the data can be divided into four categories: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). The calculation formulas are as follows:
\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{5} \]
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{6} \]
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{7} \]
\[ \mathrm{FAR} = \frac{FP}{TN + FP} \tag{8} \]
\[ \text{F-measure} = 2 \times \frac{\mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \tag{9} \]
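As an illustration of Eqs. (5)-(9), the following Python sketch computes the five metrics from the four classification counts. The names `ConfusionCounts` and `ids_metrics`, the example counts, and the convention of returning 0 when a denominator is zero are assumptions made for this illustration; they are not taken from the surveyed papers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of the four outcome categories (hypothetical helper type)."""
    tp: int  # attacks correctly flagged
    fp: int  # normal traffic wrongly flagged (false alarms)
    tn: int  # normal traffic correctly passed
    fn: int  # attacks missed


def _ratio(numerator: float, denominator: float) -> float:
    # Assumption: report 0.0 instead of raising when a denominator is zero.
    return numerator / denominator if denominator else 0.0


def ids_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall, FAR and F-measure, Eqs. (5)-(9)."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        # Accuracy divides by all four counts: TP + TN + FP + FN.
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn),
        "precision": precision,
        "recall": recall,
        # FAR is the share of normal traffic that raised an alarm.
        "far": _ratio(c.fp, c.tn + c.fp),
        "f_measure": _ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    counts = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
    for name, value in ids_metrics(counts).items():
        print(f"{name}: {value:.3f}")
```

Reporting FAR next to accuracy matters on imbalanced traffic: a detector can reach high accuracy simply because most flows are normal, while still raising too many false alarms.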
(Hochreiter and Schmidhuber, 1997; Yin et al., 2017). However, since RNNs do not have a special treatment of the activation function, the continuous product of their partial derivatives can easily lead to vanishing or even exploding gradients when the number of layers of the network is high.
- LSTM solves the gradient vanishing problem of the classical RNN by introducing additional storage states (Gers et al., 2000). LSTM effectively controls the degree of gradient vanishing by using a gate function as the activation function to selectively allow a portion of the information to pass through.
Based on the original LSTM architecture, Gers et al. (2000) introduced forget gates to enable the LSTM to reset its state, simulating the forgetting process of memory. Because of its excellent performance (Capes et al., 2017; Wu et al., 2016), it is considered the most classical LSTM architecture. Based on this, Cho et al. (2014) proposed the gated recurrent unit (GRU), consisting of a reset gate and an update gate, which maintains the performance of the LSTM as much as possible with fewer parameters.
The LSTM has become one of the most used RNN variants because it solves the gradient vanishing problem of traditional RNNs. Many IDS studies use LSTM networks (Bontemps et al., 2016; Roy et al., 2017) because they are well-suited for classification and prediction based on time-series data, and the forgetting mechanism is a better match for the detection of data streams. However, due to the inherent nature of RNNs, the classical LSTM architecture cannot be trained in parallel (Bai et al., 2018), making LSTM-based models sometimes too costly to run.

4.4. Evaluation metrics

We introduce commonly used evaluation criteria in intrusion detection papers. To answer RQ4(a), Fig. 9 shows evaluation metrics and the number of times they were used.
As shown in Fig. 9, accuracy, precision, recall, F1 value, and false-alarm rate (FAR) are most commonly used (RQ4(b)). Recall
Fig. 9. Evaluation metrics used in papers.
Detection time is also a common evaluation metric in the field of intrusion detection. There are 74 articles in our research that discuss time performance. Detection time means the time spent to classify a sample with the trained model. Due to the complexity of network traffic, even with methods such as feature selection for dimensionality reduction, IDS research usually faces the curse of dimensionality, which is eventually reflected in high detection time. Some of the numerous existing algorithms for intrusion detection are almost unusable in engineering practice, and one important reason is their high detection time. From the application point of view, the main goal of intrusion detection is to achieve an appropriate detection rate with minimal resource consumption, which requires an ideal model structure for IDS as well as parameter settings. A high detection time of a model usually means that its algorithm complexity is too high. Reviewing previous studies, a clear trade-off between the performance and complexity of the model can be found.
Although deep learning based methods usually perform better in terms of detection capability compared to other methods, their detection times are too long, making these methods difficult to use in scenarios such as big data. While computational complexity is the most direct influence on detection time, considering that the computational complexity of some algorithms is difficult to calculate or controversial under different assumptions, most papers only provide the training and testing time of their algorithms on the specified dataset. Since the platforms used to obtain each result and the preprocessing methods for the datasets differ, it is still difficult to judge the superiority of an algorithm in terms of time complexity just from the running time. In summary, we believe that there is still a need for a unified complexity evaluation standard in the current IDS research, rather than just in terms of detection time.

4.5. Authors

We assessed the main contributing authors in intrusion detection by examining the total number of citations of the included publications through Scopus, answering RQ5(a). As can be seen in Fig. 10, The (Ambusaidi et al., 2016; Tan et al., 2014; Yin et al.,
and accuracy are used in most papers. Recall, also called detec- 2017) contributed most to the field. We found that citations were
tion rate or true positive rate (TPR), is the proportion of correctly not very high for all but the top few authors. This indicates that
classified attacks to all attacks. Recall can measure the accuracy few researchers are cited.
of the classifier in identifying attacks. Accuracy is the ratio of the The three most cited articles among the articles we researched
number of correctly predicted samples to the total number of pre- are shown in Table 8. The article “A Deep Learning Approach for
Intrusion Detection Using Recurrent Neural Networks” was published in 2018 and has been cited more than 430 times in total, with an average annual citation of 86. In this article, the authors propose a deep learning approach for intrusion detection using Recurrent Neural Networks (RNN-IDS). Moreover, the authors also investigate the performance of the model in binary and multiclass classification, and the effect of the number of neurons and different learning rates on the model performance. In the paper “A Deep Learning Approach to Network Intrusion Detection”, the authors also propose a deep learning-based intrusion detection model. The model is built based on stacked NDAEs and achieves excellent results. From these two highly cited articles, we can see the great impact and potential of deep learning in the field of intrusion detection. In another paper, “Fuzziness based semi-supervised learning approach for intrusion detection system”, the authors propose a fuzzy-based semi-supervised learning approach that uses unlabeled samples to assist a supervised learning algorithm and improve the performance of the classifier. Unlike the previous two papers, this paper aims to reduce the labor consumption in the data labeling process by taking the complexity of data labeling as a pain point.

Fig. 10. Top authors by total number of Scopus citations.

Table 8
The three most cited articles.
Paper | Year | Citations | Average citations
A Deep Learning Approach for Intrusion Detection Using Recurrent Neural Networks | 2018 | 430 | 143
A Deep Learning Approach to Network Intrusion Detection | 2017 | 313 | 78
Fuzziness based semi-supervised learning approach for intrusion detection system | 2017 | 286 | 71

To answer RQ5(b), we plotted the author network, as shown in Fig. 11. As seen in Fig. 11, the distribution of author networks is dispersed. Note that the sizes of circles for these highlighted authors are scaled according to their numbers of papers, whereas the circle sizes for all other authors are fixed and small, for ease of reading. The figure includes 82 unconnected clusters, with 451 authors. The two largest clusters, as shown in Fig. 12, both contain 32 authors, representing only 14% of all authors. This indicates a low level of collaboration among authors in the community.

Fig. 11. Full coauthor network.
Fig. 12. Author network subgraph.

4.6. Datasets
To answer RQ6(a), we investigated existing network intrusion detection datasets. As shown in Table 10, we collected a total of 52 datasets through the survey. Based on the information provided by the dataset publishers and additional searches, we extracted the year of creation, creation method, data volume, annotation status, number of tags and links for each dataset. In terms of time, starting with the DARPA 1998 dataset, new datasets continue to appear in the community. Taking 2009 as a turning point, research related to network intrusion detection datasets started to increase. From the fourth column of Table 10, it can be seen that more than half of the datasets were obtained through simulation experiments. This indirectly reflects the sensitive nature of data in the field of network intrusion detection. However, the accuracy and authenticity of such datasets have been questioned (Mahoney et al., 2003), and the validity of intrusion detection models constructed based on such datasets is poor.

Table 9
Datasets used in papers.
Dataset | Count
KDD99 | 47
NSL-KDD | 35
UNSW-NB15 | 8
ISCX 2012 | 6
Kyoto 2006+ | 2
Botnet | 2

Further, we summarize the frequency of use of the datasets in Table 9 (RQ6(b)). As can be seen from the table, KDD99 and NSL-KDD are the two most commonly used datasets, although both of them are simulation experimental data. This is mainly because KDD99 and NSL-KDD are datasets that have been publicly available for a long time. Researchers have published many articles based on these two datasets. When a new intrusion detection technique is proposed, it often needs to be compared with previous techniques, which leads to the constant use of KDD99 and NSL-KDD (RQ6(c)). However, the contents of the KDD99 and NSL-KDD datasets are obsolete. In future studies, we recommend that researchers evaluate the performance of intrusion detection methods using newer intrusion detection datasets, such as the CIRA-CIC-DoHBrw 2020 dataset, by referring to the information in our table. In addition, researchers should try to experiment with some real datasets, such as the ISOT CID dataset, to ensure the validity of their approach.
Finally, to facilitate the work of researchers, we provide links to the datasets in Table 10 and present some of the datasets in more detail.
14
Table 10
Existing network intrusion detection datasets.
No. | Dataset | Year | Authenticity | Count | Labeled | Number of labels | Link
1 | 1998 DARPA | 1998 | emulated | 7,000,000 | yes | 4 | DARPA (1998,1999)
2 | 1999 DARPA | 1999 | emulated | huge | yes | 4 | DARPA (1998,1999)
3 | KDD99 | 1999 | emulated | 5,000,000 | yes | 4 | KDD99 (1999)
4 | 2000 DARPA | 2000 | emulated | huge | yes | 4 | DARPA (1998,1999)
5 | DEFCON | 2000 | real | unknown | yes | unknown | DEFCON (2000)
6 | Kyoto 2006+ (Song et al., 2006) | 2006 | real | unknown | yes | unknown | Kyoto-2006+ (2006)
7 | NSL-KDD (Tavallaee et al., 2009) | 2009 | emulated | 148,517 | yes | 4 | NSL-KDD (2009)
8 | LDID | 2009 | emulated | huge | no | unknown | unknown
9 | ICML-09 (Ma et al., 2009) | 2009 | real | 2,400,000 | yes | 1 | ICML-09 (2009)
10 | Twente (Sperotto et al., 2009) | 2009 | emulated | unknown | yes | unknown | Twente (2009)
11 | CDX | 2009 | real | 5771 | yes | 2 | CDX (2009)
12 | ISOT Botnet | 2010 | real | 1,675,424 | yes | unknown | ISOT-Botnet (2010)
13 | CSIC HTTP 2010 | 2010 | emulated | 223,585 | yes | 1 | CSIC-HTTP-2010 (2010)
14 | SSENet-2011 | 2011 | real | unknown | yes | 3 | unknown
15 | ISCX-IDS 2012 (Shiravi et al., 2012) | 2012 | real | 2,450,324 | yes | unknown | ISCX-IDS-2012 (2012)
16 | ADFA-LD (Creech and Hu, 2013) | 2013 | emulated | 5266 | yes | 6 | ADFA-LD (2013)
17 | CTU-13 | 2014 | real | huge | yes | 7 | CTU-13 (2014)
18 | Botnet 2014 (Beigi et al., 2014) | 2014 | real | 283,770 | yes | 16 | Botnet-2014 (2014)
19 | SANTA | 2014 | real | unknown | yes | 6 | unknown
20 | MAWILab | 2014 | emulated | unknown | yes | 3 | MAWILab (2014)
21 | SSENet-2014 (Bhattacharya and Selvakumar, 2014) | 2014 | real | 201,707 | yes | 3 | unknown
22 | SSHCure (Hofstede et al., 2014) | 2014 | real | unknown | yes | unknown | SSHCure (2014)
23 | UNSW-NB15 (Moustafa and Slay, 2015) | 2015 | emulated | 2,540,044 | yes | 9 | UNSW-NB15 (2015)
24 | ISTS-12 | 2015 | emulated | huge | no | unknown | ISTS-12 (2015)
25 | AWID (Kolias et al., 2015) | 2015 | emulated | huge | yes | 16 | AWID (2015)
26 | UCSD (Jonker et al., 2017) | 2015 | emulated | unknown | yes | 1 | UCSD (2015)
27 | IRSC | 2015 | real | unknown | yes | unknown | unknown
28 | NDSec-1 (Beer et al., 2017) | 2016 | emulated | huge | yes | 8 | NDSec-1 (2016)
29 | DDoS 2016 (Alkasassbeh et al., 2016) | 2016 | emulated | 734,627 | yes | 4 | DDos-2016 (2016)
30 | NGIDS-DS (Haider et al., 2017) | 2016 | emulated | unknown | yes | 8 | NGIDS-DS (2016)
31 | UGR’16 (Cermak et al., 2018) | 2016 | real | unknown | yes | 5 | UGR’16 (2016)
32 | Witty Worm | 2016 | real | huge | yes | unknown | unknown
33 | Unified Host and Network | 2016 | real | unknown | yes | unknown | Host and Network (2016)
34 | CDMC 2016 | 2016 | real | 61,730 | yes | 1 | unknown
35 | Kharon (Kiss et al., 2016) | 2016 | real | 55,733 | yes | 19 | Kharon (2016)
36 | CIDDS-001 (Ring et al., 2017) | 2017 | emulated | 31,959,267 | yes | 6 | CIDDS (2017)
37 | CIDDS-002 (Ring et al., 2017) | 2017 | emulated | 16,161,183 | yes | 5 | CIDDS (2017)
38 | CAIDA (Jonker et al., 2017) | 2017 | real | huge | no | unknown | CAIDA (2017)
39 | CICIDS 2017 (Sharafaldin et al., 2018) | 2017 | emulated | 2,830,743 | yes | 7 | CICIDS-2017 (2017)
40 | NCCDC | 2017 | real | unknown | yes | unknown | unknown
41 | CICDoS 2017 (Jazi et al., 2017) | 2017 | emulated | 32,925 | yes | 8 | CICDDoS-2019 (2019)
42 | TRAbID | 2017 | emulated | huge | yes | 2 | TRAbID (2017)
43 | SUEE 2017 | 2017 | emulated | 19,301,217 | yes | 3 | unknown
44 | ISOT HTTP Botnet | 2017 | emulated | huge | yes | 9 | ISOT HTTP Botnet (2017)
45 | PUF | 2018 | emulated | 6,000,000 | yes | 4 | unknown
46 | ISOT CID | 2018 | real | 36,938,985 | yes | 18 | ISOT-CID (2018)
47 | CICDDoS 2019 (Sharafaldin et al., 2019) | 2019 | emulated | huge | yes | 11 | CICDDoS-2019 (2019)
48 | BoT-IoT (Koroniotis et al., 2019) | 2019 | real | 73,360,900 | yes | 2 | BoT-IoT (2019)
49 | IoT-23 | 2020 | real | unknown | yes | 20 | IoT-23 (2020)
50 | InSDN | 2020 | emulated | 343,939 | yes | 7 | InSDN (2020)
51 | CIRA-CIC-DoHBrw 2020 (MontazeriShatoori et al., 2020) | 2020 | emulated | 1,185,286 | yes | 3 | CIRA-CIC-DoHBrw-2020 (2020)
52 | OPCUA | 2020 | emulated | 107,634 | yes | 3 | OPCUA (2020)
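The observation that more than half of the datasets are emulated can be checked directly against the Authenticity column of Table 10. The sketch below is a minimal tally in Python; the one-letter encoding and the function name `authenticity_counts` are illustrative assumptions of ours, and the values are transcribed from the table rows in order.

```python
from collections import Counter

# Authenticity column of Table 10, rows 1..52 in order.
# Encoding (ours, for brevity): "e" = emulated, "r" = real.
AUTHENTICITY = (
    "eeeerreere"  # rows 1-10
    "rrerrerrre"  # rows 11-20
    "rreeeereee"  # rows 21-30
    "rrrrreerer"  # rows 31-40
    "eeeeererre"  # rows 41-50
    "ee"          # rows 51-52
)


def authenticity_counts(codes):
    """Count emulated vs. real datasets from the one-letter codes."""
    names = {"e": "emulated", "r": "real"}
    return Counter(names[c] for c in codes)


counts = authenticity_counts(AUTHENTICITY)
emulated_share = counts["emulated"] / len(AUTHENTICITY)

print(dict(counts))
print(f"emulated share: {emulated_share:.1%}")
```

With the table as transcribed, this gives 29 emulated and 23 real datasets, so roughly 56% of the listed datasets are emulated.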
- DARPA datasets are most popular for intrusion detection, and were created at the MIT Lincoln Laboratory in an emulated network environment. The DARPA 1998 and DARPA 1999 datasets contain seven and five weeks, respectively, of network traffic in packet-based format, including such attacks as DoS, buffer overflow, port scans, and rootkits. Despite (or because of) their wide distribution, the datasets are often criticized for artificial attack injections or large amounts of redundancy.
- KDD99 dataset was created from DARPA network dataset files by Lee and Stolfo (2000). The dataset was constructed through data mining to analyze the features of the DARPA dataset and preprocess the data. The dataset contains seven weeks of network traffic, with approximately 4.9 million vectors. Attacks are classified as: (1) user-to-root (U2R); (2) remote-to-local (R2L); (3) probing; and (4) DoS. Each instance is represented by 41 features in three categories: (1) basic; (2) traffic; and (3) content. Basic features are extracted from TCP/IP connections. Traffic characteristics are grouped into those with the same host characteristics or the same service characteristics. Content characteristics relate to suspicious behavior of the data part. This is the most extensive dataset used to evaluate intrusion detection models.
- NSL-KDD is a dataset suggested to solve some of the inherent problems of the KDD99 dataset. Although this new version of the KDD dataset still suffers from some of the problems discussed by Tavallaee et al. (2009) and may not be a perfect representative of existing real networks, because of the lack of public data sets for network-based IDSs, we believe it still can be applied as an effective benchmark dataset to help researchers compare different intrusion detection methods. Furthermore, the number of records in the NSL-KDD train and test sets is reasonable. This advantage makes it affordable to run the experiments on the complete set without the need to randomly select a small portion. Consequently, evaluation results of different research work will be consistent and comparable.
- UNSW-NB15 was created by the Cyber Range Laboratory of the Australian Cyber Security Center. It is widely used due to its
variety of novel attacks. Types of attacks consist of Fuzzers, Analysis, Backdoor, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms. It has a training set with 175,341 records and a testing set with 82,332 records.
- CICIDS2017 contains benign and common attacks, with both source data (PCAPs) and results of network traffic analysis (CSV files) with flows labeled by timestamp, source and destination IPs, source and destination ports, protocols, and attack. The researchers used the B-Profile system (Sharafaldin et al., 2016) to analyze the abstract behavior of human interactions and to generate benign background traffic. The dataset includes abstracted behaviors of 25 users based on HTTP, HTTPS, FTP, SSH, and email protocols. The implemented attacks include brute-force FTP, brute-force SSH, DoS, Heartbleed, web attack, infiltration, botnet, and DDoS.
- CICDoS2017 is a publicly available intrusion detection dataset with application layer DoS attacks from the Canadian Institute for Cybersecurity. The authors executed eight DoS attacks on the application layer. Normal user behavior was generated by combining the resulting traces with attack-free traffic from the ISCX 2012 dataset. The dataset is available in packet-based format and contains 24 h of network traffic.
- CICDDoS2019 contains the latest DDoS attacks, which are similar to real-world data. It includes the results of network traffic analysis using CICFlowMeter-V3, with flows labeled by timestamp, source and destination IPs, source and destination ports, protocols, and attack.
- Kyoto 2006+ is a publicly available honeypot dataset of real network traffic that includes only a small number and small range of realistic, normal user behavior. The researchers transformed packet-based traffic into a new format called sessions. Each session has 24 attributes, 14 of which are statistical information features inspired by the KDD CUP 99 dataset, and the remaining 10 attributes are typical traffic-based attributes such as IP address (anonymous), port, and duration. The data were collected over three years and include approximately 93 million sessions.
- NDSec-1 contains trace and log files of network attacks synthesized by researchers from network facilities. It is publicly available, and was captured in packet-based format in 2016. It contains additional syslog and Windows eventlog information. Attack compositions include botnet, brute force (against FTP, HTTP, and SSH), DoS (HTTP, SYN, and UDP flooding), exploits, port scans, spoofing, and XSS/SQL injection.
- CTU-13 was captured in 2013 and is available in packet, unidirectional flow, and bidirectional flow formats. Captured in a university network, its 13 scenarios include different botnet attacks. Additional information about infected hosts is provided at the website.3 Traffic was labeled in three stages: 1) all traffic to and from infected hosts was labeled as a botnet; 2) traffic matching specific filters was labeled as normal; 3) remaining traffic was labeled as background. Consequently, background traffic can be normal or malicious.
- BoT-IoT contains more than 72 million records, including DDoS, DoS, OS and service scan, keylogging, and data exfiltration attacks. The Node-red tool was used to simulate the network behavior of IoT devices. MQTT, a lightweight communication protocol, links machine-to-machine (M2M) communications. The testbed IoT scenarios are weather station, smart fridge, motion activated lights, remotely activated garage door, and smart thermostat.
- IoT-23 consists of 23 network captures (called scenarios) of IoT traffic, including 20 (PCAP files) from infected IoT devices and three of real IoT network traffic. A malware sample was executed on a Raspberry Pi in each malicious scenario, using several protocols and performing different actions. The network traffic capture for benign scenarios was obtained from the network traffic of three real IoT devices: a Philips HUE smart LED lamp, Amazon Echo home intelligent personal assistant, and Somfy smart door lock. Both malicious and benign scenarios were run in a controlled network environment with unrestrained internet connection, like any real IoT device.
- PUF was captured over three days from a campus network and contains exclusively DNS connections, where 38,120 of 298,463 unidirectional flows are malicious. All flows are labeled using logs of an intrusion prevention system. IP addresses were removed for privacy reasons.
- LBNL was created to analyze the network traffic characteristics in an enterprise network. The dataset can be used as background traffic for security research, as it contains almost exclusively normal user behavior. The dataset is not labeled, is anonymized, and contains more than 100 h of network traffic in packet-based format. The dataset can be downloaded at the website.4
- The IEEE 300-bus power test system provides the topological and electrical structure of a power grid, to be used to detect false data injection attacks in the smart grid. The system has 411 branches, and an average degree of 2.74. For details about this standard test system, we refer the reader to the work of Hines et al. (2010). The IEEE 300-bus power test system has been used in much work related to cyber-attack classification.
- The ICS cyber attack datasets consist of: (1) power system dataset; (2) gas pipeline dataset; (3) energy management system dataset; (4) new gas pipeline dataset; and (5) gas pipeline and water storage tank dataset. The power system dataset contains 37 scenarios divided into eight natural events, one non-event, and 28 attacks. Attacks are categorized as: (1) relay setting change; (2) remote tripping command injection; and (3) data injection. These datasets can be used for cybersecurity intrusion detection in industrial control systems.
3 http://mcfp.weebly.com/the-ctu-13-dataset-a-labeled-dataset-with-botnet-normal-and-background-traffic.html.
4 http://icir.org/enterprise-tracing/download.html.
5. Conclusion
We provided a comprehensive overview and analysis of research work on intrusion detection in network security. The survey covered 119 of the most highly cited papers in the field of network security intrusion detection, including preprocessing and intrusion detection techniques, and analyzed the community from multiple perspectives. We analyzed the research progress and bottlenecks in different scenarios. We investigated preprocessing and intrusion detection techniques. We examined evaluation methods, including metrics and datasets, so as to standardize performance assessment. We counted contributors in the community and mapped their collaborative network. Our publication data and category descriptions are publicly available to facilitate repeatability and further research.
Our results show that research on network anomaly detection is unbalanced under different target networks. In the ICN domain, researchers often do not disclose their datasets due to the sensitivity and confidentiality of industrial network data. The lack of available datasets limits cybersecurity research in the ICN domain. The lack of datasets is also a key factor limiting research in the SDN domain. Before conducting security research, researchers often need to build SDN network environments to simulate the data. In terms of the intrusion detection techniques currently used, supervised learning is still the mainstream direction. However, these studies need to be built on top of already labeled data. At the time of practical application, the data we obtain is unlabeled. Labeling the data is a time-consuming and tedious task. We believe that unsupervised learning and semi-supervised learning are the way forward for network anomaly detection. Similarly, we believe that automated labeling of network data is also a direction worthy of in-depth study. In addition, the adversarial environment has been shown to impact machine learning-based network anomaly detection algorithms. Therefore, anti-perturbation anomaly detection in adversarial environments also needs more research.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work is partially supported by the National Natural Science Foundation of China (No. 61902010), the Major Research Plan of National Natural Science Foundation of China (92167102), the Project of Beijing Municipal Education Commission (No. KM202110005025).
References
Abbes, T., Bouhoula, A., Rusinowitch, M., 2010. Efficient decision tree for protocol analysis in intrusion detection. Int. J. Secur. Netw. 5 (4), 220–235.
ADFA-LD, 2013. https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-IDS-Datasets/.
Ahmed, M., Mahmood, A.N., Hu, J., 2016. A survey of network anomaly detection techniques. J. Netw. Comput. Appl. 60, 19–31.
Alhajjar, E., Maxwell, P., Bastian, N., 2021. Adversarial machine learning in network intrusion detection systems. Expert Syst Appl 186, 115782.
Alkasassbeh, M., Al-Naymat, G., Hassanat, A., Almseidin, M., 2016. Detecting distributed denial of service attacks using data mining techniques. Int. J. Adv. Comput. Sci. Appl. 7 (1), 436–445.
Ambusaidi, M.A., He, X., Nanda, P., Tan, Z., 2016. Building an intrusion detection system using a filter-based feature selection algorithm. IEEE Trans. Comput. 65 (10), 2986–2998.
An, J., Cho, S., 2015. Variational autoencoder based anomaly detection using reconstruction probability. Spec. Lect. IE 2 (1), 1–18.
Anthi, E., Williams, L., Słowińska, M., Theodorakopoulos, G., Burnap, P., 2019. A supervised intrusion detection system for smart home IoT devices. IEEE Internet Things J. 6 (5), 9042–9053.
AWID, 2015. http://icsdweb.aegean.gr/awid/download.html.
Axelsson, S., 2000. Intrusion Detection Systems: A Survey and Taxonomy. Technical Report.
Bach, F.R., 2008. Bolasso: model consistent Lasso estimation through the bootstrap. In: Proceedings of the 25th international conference on Machine learning, pp. 33–40.
Bai, S., Kolter, J.Z., Koltun, V., 2018. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271.
Beer, F., Hofer, T., Karimi, D., Bühler, U., 2017. A new attack composition for network security. 10. DFN-Forum Kommunikationstechnologien.
Beigi, E.B., Jazi, H.H., Stakhanova, N., Ghorbani, A.A., 2014. Towards effective feature selection in machine learning-based botnet detection approaches. In: 2014 IEEE Conference on Communications and Network Security, pp. 247–255.
Bengio, Y., 2009. Learning Deep Architectures for AI. Now Publishers Inc.
Bermingham, M.L., Pong-Wong, R., Spiliopoulou, A., Hayward, C., Rudan, I., Campbell, H., Wright, A.F., Wilson, J.F., Agakov, F., Navarro, P., et al., 2015. Application of high-dimensional feature selection: evaluation for genomic prediction in man. Sci. Rep. 5 (1), 1–12.
Bhattacharya, S., Selvakumar, S., 2014. SSENet-2014 dataset: a dataset for detection of multiconnection attacks. In: 2014 3rd International Conference on Eco-friendly Computing and Communication Systems, pp. 121–126.
Bhuyan, M.H., Bhattacharyya, D.K., Kalita, J.K., 2013. Network anomaly detection: methods, systems and tools. IEEE Commun. Surv. Tutor. 16 (1), 303–336.
Bontemps, L., McDermott, J., Le-Khac, N.-A., et al., 2016. Collective anomaly detection based on long short-term memory recurrent neural networks. In: International Conference on Future Data and Security Engineering, pp. 141–152.
BoT-IoT, 2019. https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/bot_iot.php.
Botnet-2014, 2014. https://www.unb.ca/cic/datasets/botnet.html.
Breiman, L., 1996. Bagging predictors. Mach. Learn. 24 (2), 123–140.
Breiman, L., 2001. Random forests. Mach. Learn. 45 (1), 5–32.
Buczak, A.L., Guven, E., 2015. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. Tutor. 18 (2), 1153–1176.
Bulavas, V., 2018. Investigation of network intrusion detection using data visualization methods, 1–6.
CAIDA, 2017. https://www.impactcybertrust.org/dataset_view?idDataset=834.
Caminero, G., Lopez-Martin, M., Carro, B., 2019. Adversarial environment reinforcement learning algorithm for intrusion detection. Comput. Netw. 159, 96–109.
Capes, T., Coles, P., Conkie, A., Golipour, L., Hadjitarkhani, A., Hu, Q., Huddleston, N., Hunt, M., Li, J., Neeracher, M., et al., 2017. Siri on-device deep learning-guided unit selection text-to-speech system. In: INTERSPEECH, pp. 4011–4015.
Casas, P., Mazel, J., Owezarski, P., 2012. Unsupervised network intrusion detection systems: detecting the unknown without knowledge. Comput. Commun. 35 (7), 772–783.
CDX, 2009. https://www.usma.edu/centers-and-research/cyber-research-center/data-sets.
Cermak, M., Jirsik, T., Velan, P., Komarkova, J., Spacek, S., Drasar, M., Plesnik, T., 2018. Towards provable network traffic measurement and analysis via semi-labeled trace datasets. In: 2018 Network Traffic Measurement and Analysis Conference (TMA), pp. 1–8.
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P., 2002. Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y., 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
CICDDoS-2019, 2019. https://www.unb.ca/cic/datasets/ddos-2019.html.
CICIDS-2017, 2017. https://www.unb.ca/cic/datasets/ids-2017.html.
CIDDS, 2017. http://www.hs-coburg.de/cidds.
CIRA-CIC-DoHBrw-2020, 2020. https://www.unb.ca/cic/datasets/dohbrw-2020.html.
Creech, G., Hu, J., 2013. Generation of a new IDS test dataset: time to retire the KDD collection. In: 2013 IEEE Wireless Communications and Networking Conference (WCNC), pp. 4487–4492.
CSIC-HTTP-2010, 2010. https://petescully.co.uk/research/csic-2010-http-dataset-in-csv-format-for-weka-analysis/.
CTU-13, 2014. http://mcfp.weebly.com/.
Cybenko, G., 1989. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2 (4), 303–314.
DARPA, 1998,1999. http://www.tp-ontrol.hu/index.php/TP_Toolbox.
DDos-2016, 2016. www.researchgate.net/publication/292967044_Dataset-_Detecting_Distributed_Denial_of_Service_Attacks_Using_Data_Mining_Techniques.
DEFCON, 2000. https://defcon.org/html/links/dc-ctf.html.
Deng, H., Runger, G., Tuv, E., 2011. Bias of importance measures for multi-valued attributes and solutions. In: International Conference on Artificial Neural Networks, pp. 293–300.
Dong, Y., Wang, R., He, J., 2019. Real-time network intrusion detection system based on deep learning. In: 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS), pp. 1–4.
Dupond, S., 2019. A thorough review on the current advance of neural network structures. Annu. Rev. Control 14, 200–230.
Ertekin, S., Huang, J., Bottou, L., Giles, L., 2007. Learning on the border: active learning in imbalanced data classification. In: Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management, pp. 127–136.
Estabrooks, A., Jo, T., Japkowicz, N., 2004. A multiple resampling method for learning from imbalanced data sets. Comput. Intell. 20 (1), 18–36.
Fernández, A., Garcia, S., Herrera, F., Chawla, N.V., 2018. Smote for learning from imbalanced data: progress and challenges, marking the 15-year anniversary. J. Artif. Intell. Res. 61, 863–905.
Freund, Y., Schapire, R.E., et al., 1996. Experiments with a new boosting algorithm. In: International Conference on Machine Learning, vol. 96, pp. 148–156.
Gers, F.A., Schmidhuber, J., Cummins, F., 2000. Learning to forget: continual prediction with LSTM. Neural Comput. 12 (10), 2451–2471.
Ghorbani, A.A., Lu, W., Tavallaee, M., 2009. Network Intrusion Detection and Prevention: Concepts and Techniques, vol. 47. Springer Science & Business Media.
Goodfellow, I., Bengio, Y., Courville, A., Bengio, Y., 2016. Deep Learning, vol. 1. MIT Press Cambridge.
Guyon, I., Elisseeff, A., 2003. An introduction to variable and feature selection. J. Mach. Learn. Res. 3 (Mar), 1157–1182.
Haider, W., Hu, J., Slay, J., Turnbull, B.P., Xie, Y., 2017. Generating realistic intrusion detection system dataset based on fuzzy qualitative modeling. J. Netw. Comput. Appl. 87, 185–192.
Hajisalem, V., Babaie, S., 2018. A hybrid intrusion detection system based on ABC-AFS algorithm for misuse and anomaly detection. Comput. Netw. 136, 37–50.
Hamid, Y., Sugumaran, M., 2020. A t-SNE based non linear dimension reduction for network intrusion detection. Int. J. Inf. Technol. 12 (1), 125–134.
Hande, Y., Muddana, A., 2021. A survey on intrusion detection system for software defined networks (SDN). In: Research Anthology on Artificial Intelligence Applications in Security. IGI Global, pp. 467–489.
Haq, N.F., Onik, A.R., Hridoy, M.A.K., Rafni, M., Shah, F.M., Farid, D.M., 2015. Application of machine learning approaches in intrusion detection system: a survey. IJARAI-Int. J. Adv. Res. Artif. Intell. 4 (3), 9–18.
He, H., Bai, Y., Garcia, E.A., Li, S., 2008. ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pp. 1322–1328.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
Hines, P., Blumsack, S., Sanchez, E.C., Barrows, C., 2010. The topological and electrical structure of power grids. In: 2010 43rd Hawaii International Conference on System Sciences, pp. 1–10.
Hinton, G., Roweis, S.T., 2002. Stochastic neighbor embedding. In: NIPS, vol. 15. Citeseer, pp. 833–840.
Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput. 9 (8), 1735–1780.
Hodo, E., Bellekens, X., Hamilton, A., Tachtatzis, C., Atkinson, R., 2017. Shallow and deep networks intrusion detection system: a taxonomy and survey. arXiv preprint arXiv:1701.02145.
Hofstede, R., Hendriks, L., Sperotto, A., Pras, A., 2014. SSH compromise detection using NetFlow/IPFIX. ACM SIGCOMM Comput. Commun. Rev. 44 (5), 20–26.
Host, U., Network, 2016. https://csr.lanl.gov/data/cyber1/.
De la Hoz, E., De La Hoz, E., Ortiz, A., Ortega, J., Prieto, B., 2015. PCA filtering and probabilistic SOM for network intrusion detection. Neurocomputing 164, 71–81.
Hsu, C.-W., Chang, C.-C., Lin, C.-J., et al., 2003. A practical guide to support vector classification.
Hu, W., Gao, J., Wang, Y., Wu, O., Maybank, S., 2013. Online adaboost-based parameterized methods for dynamic distributed network intrusion detection. IEEE Trans. Cybern. 44 (1), 66–82.
Hubel, D.H., Wiesel, T.N., 1968. Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195 (1), 215–243.
ICML-09, 2009. http://www.sysnet.ucsd.edu/projects/url/.
InSDN, 2020. http://aseados.ucd.ie/?p=177.
IoT-23, 2020. https://mcfp.felk.cvut.cz/publicDatasets/IoT-23-Dataset/iot_23_datasets_small.tar.gz.
ISCX-IDS-2012, 2012. https://www.unb.ca/cic/datasets/ids.html.
ISOT-Botnet, 2010. https://www.uvic.ca/engineering/ece/isot/datasets/botnet-ransomware/index.php.
ISOT-CID, 2018. https://www.uvic.ca/engineering/ece/isot/datasets/cloud-security/index.php.
ISTS-12, 2015. http://ists.sparsa.org/.
ISOT, 2017. https://www.uvic.ca/engineering/ece/isot/datasets/botnet-ransomware/index.php.
Jan, S.U., Ahmed, S., Shakhov, V., Koo, I., 2019. Toward a lightweight intrusion detection system for the internet of things. IEEE Access 7, 42450–42471.
Jazi, H.H., Gonzalez, H., Stakhanova, N., Ghorbani, A.A., 2017. Detecting http-based application layer dos attacks on web servers in the presence of sampling. Comput. Netw. 121, 25–36.
Jonker, M., King, A., Krupp, J., Rossow, C., Sperotto, A., Dainotti, A., 2017. Millions of targets under attack: a macroscopic characterization of the dos ecosystem. In: Proceedings of the 2017 Internet Measurement Conference, pp. 100–113.
KDD99, 1999. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.
Keele, S., et al., 2007. Guidelines for Performing Systematic Literature Reviews in Software Engineering. Technical Report. Citeseer.
Khammassi, C., Krichen, S., 2017. A GA-LR wrapper approach for feature selection in network intrusion detection. Comput. Secur. 70, 255–277.
Kharon, 2016. http://kharon.gforge.inria.fr/dataset/index.html.
Kiss, N., Lalande, J.-F., Leslous, M., Tong, V.V.T., 2016. Kharon dataset: android malware under a microscope. In: The LASER Workshop: Learning from Authoritative Security Experiment Results (LASER 2016), pp. 1–12.
Koc, L., Mazzuchi, T.A., Sarkani, S., 2012. A network intrusion detection system based on a Hidden Naïve bayes multiclass classifier. Expert Syst. Appl. 39 (18), 13492–13500.
Kolias, C., Kambourakis, G., Stavrou, A., Gritzalis, S., 2015. Intrusion detection in 802.11 networks: empirical evaluation of threats and a public dataset. IEEE Commun. Surv. Tutor. 18 (1), 184–208.
Koroniotis, N., Moustafa, N., Sitnikova, E., Turnbull, B., 2019. Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: bot-IoT dataset. Future Gener. Comput. Syst. 100, 779–796.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105.
Kyoto-2006+, 2006. http://www.takakura.com/Kyoto_data/.
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86 (11), 2278–2324.
Lee, W., Stolfo, S.J., 2000. A framework for constructing features and models for intrusion detection systems. ACM Trans. Inf. Syst. Secur. (TISSEC) 3 (4), 227–261.
Li, J., Zhao, Z., Li, R., Zhang, H., 2018. AI-based two-stage intrusion detection for software defined IoT networks. IEEE Internet Things J. 6 (2), 2093–2102.
Li, Z., Qin, Z., Huang, K., Yang, X., Ye, S., 2017. Intrusion detection using convolutional neural networks for representation learning. In: International Conference on Neural Information Processing, pp. 858–866.
Liu, X.-Y., Wu, J., Zhou, Z.-H., 2008. Exploratory undersampling for class-imbalance learning. IEEE Trans. Syst. Man Cybern. Part B 39 (2), 539–550.
Loh, W.-Y., 2011. Classification and regression trees. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 1 (1), 14–23.
Ma, J., Saul, L.K., Savage, S., Voelker, G.M., 2009. Beyond blacklists: learning to detect malicious web sites from suspicious URLs. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1245–1254.
Madeh Piryonesi, S., El-Diraby, T.E., 2021. Using machine learning to examine impact of type of performance indicator on flexible pavement deterioration modeling. J. Infrastruct. Syst. 27 (2), 04021005.
Mahoney, M.V., Chan, P.K., 2003. An analysis of the 1999 DARPA/Lincoln Laboratory evaluation data for network anomaly detection. In: International Workshop on Recent Advances in Intrusion Detection. Springer, Berlin, Heidelberg, pp. 220–237.
Mani, I., Zhang, I., 2003. kNN approach to unbalanced data distributions: a case study involving information extraction. In: Proceedings of Workshop on Learning from Imbalanced Datasets, vol. 126.
Martinez, A.M., Kak, A.C., 2001. PCA versus LDA. IEEE Trans. Pattern Anal. Mach. Intell. 23 (2), 228–233.
MAWILab, 2014. http://www.fukuda-lab.org/mawilab/documentation.html.
McCarthy, K., Zabar, B., Weiss, G., 2005. Does cost-sensitive learning beat sampling for classifying rare classes? In: Proceedings of the 1st International Workshop on Utility-Based Data Mining, pp. 69–77.
Mehmood, A., Mukherjee, M., Ahmed, S.H., Song, H., Malik, K.M., 2018. NBC-MAIDS: Naïve Bayesian classification technique in multi-agent system-enriched IDS for securing iot against DDoS attacks. J. Supercomput. 74 (10), 5156–5170.
Milenkoski, A., Vieira, M., Kounev, S., Avritzer, A., Payne, B.D., 2015. Evaluating computer intrusion detection systems: a survey of common practices. ACM Comput. Surv. (CSUR) 48 (1), 1–41.
Miljanovic, M., 2012. Comparative analysis of recurrent and finite impulse response neural networks in time series prediction. Indian J. Comput. Sci. Eng. 3 (1), 180–191.
Mishra, P., Varadharajan, V., Tupakula, U., Pilli, E.S., 2018. A detailed investigation and analysis of using machine learning techniques for intrusion detection. IEEE Commun. Surv. Tutor. 21 (1), 686–728.
MontazeriShatoori, M., Davidson, L., Kaur, G., Lashkari, A.H., 2020. Detection of DoH tunnels using time-series classification of encrypted traffic. In: 2020 IEEE Intl. Conf. on Dependable, Autonomic and Secure Computing, Intl. Conf. on Pervasive Intelligence and Computing, Intl. Conf. on Cloud and Big Data Computing, Intl. Conf. on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pp. 63–70.
Moustafa, N., Slay, J., 2015. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6.
Muniyandi, A.P., Rajeswari, R., Rajaram, R., 2012. Network anomaly detection by cascading k-means clustering and C4.5 decision tree algorithm. Procedia Eng. 30, 174–182.
NDSec-1, 2016. https://www2.hs-fulda.de/NDSec/NDSec-1/Files/.
NGIDS-DS, 2016. research.unsw.edu.au/people/professor-jiankun-hu.
Nisioti, A., Mylonas, A., Yoo, P.D., Katos, V., 2018. From intrusion detection to attacker attribution: a comprehensive survey of unsupervised methods. IEEE Commun. Surv. Tutor. 20 (4), 3369–3388.
NSL-KDD, 2009. https://www.unb.ca/cic/datasets/nsl.html.
OPCUA, 2020. https://digi2-feup.github.io/OPCUADataset/.
Özgür, A., Erdem, H., 2016. A review of KDD99 dataset usage in intrusion detection and machine learning between 2010 and 2015. PeerJ Preprints 4, e1954v1.
Peng, K., Leung, V.C., Huang, Q., 2018. Clustering approach based on mini batch Kmeans for intrusion detection system over big data. IEEE Access 6, 11897–11906.
Pyle, D., 1999. Data Preparation for Data Mining. Morgan Kaufmann.
Quinlan, J.R., 1983. Learning efficient classification procedures and their application to chess end games. Mach. Learn. 463–482.
Quinlan, J.R., 1986. Induction of decision trees. Mach. Learn. 1 (1), 81–106.
Quinlan, J.R., 2014. C4.5: Programs for Machine Learning. Elsevier.
Raskutti, B., Kowalczyk, A., 2004. Extreme re-balancing for SVMs: a case study. ACM Sigkdd Explor. Newsl. 6 (1), 60–69.
Ring, M., Wunderlich, S., Grüdl, D., Landes, D., Hotho, A., 2017. Flow-based benchmark data sets for intrusion detection. In: Proceedings of the 16th European Conference on Cyber Warfare and Security. ACPI, pp. 361–369.
Ring, M., Wunderlich, S., Scheuring, D., Landes, D., Hotho, A., 2019. A survey of network-based intrusion detection data sets. Comput. Secur. 86, 147–167.
Roy, S.S., Mallik, A., Gulati, R., Obaidat, M.S., Krishna, P.V., 2017. A deep learning based artificial neural network approach for intrusion detection. In: International Conference on Mathematics and Computing, pp. 44–53.
Ruan, Z., Miao, Y., Pan, L., Patterson, N., Zhang, J., 2017. Visualization of big data security: a case study on the KDD99 cup data set. Digit. Commun. Netw. 3 (4), 250–259.
Safavian, S.R., Landgrebe, D., 1991. A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 21 (3), 660–674.
Sarangi, S., Sahidullah, M., Saha, G., 2020. Optimization of data-driven filterbank for automatic speaker verification. Digit. Signal Process. 104, 102795.
Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A., 2018. Toward generating a new intrusion detection dataset and intrusion traffic characterization. In: ICISSP, pp. 108–116.
Sharafaldin, I., Lashkari, A.H., Hakak, S., Ghorbani, A.A., 2019. Developing realistic distributed denial of service (DDoS) attack dataset and taxonomy. In: 2019 International Carnahan Conference on Security Technology (ICCST), pp. 1–8.
Shiravi, A., Shiravi, H., Tavallaee, M., Ghorbani, A.A., 2012. Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput. Secur. 31 (3), 357–374.
Singh, K., Guntuku, S.C., Thakur, A., Hota, C., 2014. Big data analytics framework for peer-to-peer botnet detection using random forests. Inf. Sci. 278, 488–497.
Song, J., Takakura, H., Okabe, Y., 2006. Description of kyoto university benchmark data. Available at link: http://www.takakura.com/Kyoto_data/BenchmarkData-Description-v5.pdf [Accessed on 15 March 2016].
Sperotto, A., Sadre, R., Van Vliet, F., Pras, A., 2009. A labeled data set for flow-based intrusion detection. In: International Workshop on IP Operations and Management, pp. 39–50.
SSHCure, 2014. www.simpleweb.org/wiki/index.php.
Subba, B., Biswas, S., Karmakar, S., 2015. Intrusion detection systems using linear discriminant analysis and logistic regression. In: 2015 Annual IEEE India Conference (INDICON), pp. 1–6.
Taherdangkoo, M., Paziresh, M., Yazdi, M., Bagheri, M.H., 2013. An efficient algorithm for function optimization: modified stem cells algorithm. Cent. Eur. J. Eng. 3 (1), 36–50.
Tan, Z., Jamdagni, A., He, X., Nanda, P., Liu, R.P., Hu, J., 2014. Detection of denial-of-service attacks based on computer vision techniques. IEEE Trans. Comput. 64 (9), 2519–2533.
Tang, T.A., Mhamdi, L., McLernon, D., Zaidi, S.A.R., Ghogho, M., 2018. Deep recurrent neural network for intrusion detection in SDN-based networks. In: 2018 4th IEEE Conference on Network Softwarization and Workshops (NetSoft), pp. 202–206.
Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A., 2009. A detailed analysis of the KDD CUP 99 data set. In: 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–6.
Teng, S., Wu, N., Zhu, H., Teng, L., Zhang, W., 2017. SVM-DT-based adaptive and collaborative intrusion detection. IEEE/CAA J. Autom. Sin. 5 (1), 108–118.
Thakkar, A., Lohiya, R., 2020. A review of the advancement in intrusion detection datasets. Procedia Comput. Sci. 167, 636–645.
Ting, K.M., 2002. An instance-weighting method to induce cost-sensitive trees. IEEE Trans. Knowl. Data Eng. 14 (3), 659–665.
TRAbID, 2017. https://secplab.ppgia.pucpr.br/?q=trabid.
Twente, 2009. www.simpleweb.org/wiki/index.php.
UCSD, 2015. https://www.impactcybertrust.org/dataset_view?idDataset=915.
UGR’16, 2016. https://nesg.ugr.es/nesg-ugr16/index.php.
UNSW-NB15, 2015. https://cloudstor.aarnet.edu.au/plus/index.php/s/2DhnLGDdEECo4ys?path=2FUNSW-NB1520-20CSV20Files.
Van Der Maaten, L., 2014. Accelerating t-SNE using tree-based algorithms. J. Mach. Learn. Res. 15 (1), 3221–3245.
Vinayakumar, R., Alazab, M., Soman, K., Poornachandran, P., Al-Nemrat, A., Venkatraman, S., 2019. Deep learning approach for intelligent intrusion detection system. IEEE Access 7, 41525–41550.
Vinayakumar, R., Soman, K., Poornachandran, P., 2017. Applying convolutional neural network for network intrusion detection. In: 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 1222–1228.
Wang, B.X., Japkowicz, N., 2004. Imbalanced data set learning with synthetic samples. In: Proc. IRIS Machine Learning Workshop, vol. 19.
Wang, L., Jones, R., 2017. Big data analytics for network intrusion detection: a survey. Int. J. Netw. Commun. 7 (1), 24–31.
Weiss, G.M., Provost, F., 2001. The effect of class distribution on classifier learning: an empirical study.
Wolpert, D.H., Macready, W.G., 1997. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1 (1), 67–82.
Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., Macherey, K., et al., 2016. Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144.
Xiao, Y., Xing, C., Zhang, T., Zhao, Z., 2019. An intrusion detection model based on feature reduction and convolutional neural networks. IEEE Access 7, 42210–42219.
Xu, C., Shen, J., Du, X., Zhang, F., 2018. An intrusion detection system using a deep neural network with gated recurrent units. IEEE Access 6, 48697–48707.
Yang, Y., Pedersen, J.O., 1997. A comparative study on feature selection in text categorization. ICML 97 (412–420), 35.
Yao, H., Li, C., Sun, P., 2020. Using parametric t-distributed stochastic neighbor embedding combined with hierarchical neural network for network intrusion detection. Int. J. Netw. Secur. 22 (2), 265–274.
Yin, C., Zhu, Y., Fei, J., He, X., 2017. A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access 5, 21954–21961.
Zare, H., Haffari, G., Gupta, A., Brinkman, R.R., 2013. Scoring relevancy of features based on combinatorial analysis of Lasso with application to lymphoma diagnosis. BMC Genomics 14 (1), 1–9.
Zarpelão, B.B., Miani, R.S., Kawakani, C.T., de Alvarenga, S.C., 2017. A survey of intrusion detection in internet of things. J. Netw. Comput. Appl. 84, 25–37.
Zhang, H., Wu, C.Q., Gao, S., Wang, Z., Xu, Y., Liu, Y., 2018. An effective deep learning based scheme for network intrusion detection. In: 2018 24th International Conference on Pattern Recognition (ICPR), pp. 682–687.
Zhang, J., Zulkernine, M., Haque, A., 2008. Random-forests-based network intrusion detection systems. IEEE Trans. Syst. Man Cybern. Part C 38 (5), 649–659.

Zhen Yang is currently a full professor of computer science and engineering at Beijing University of Technology. He received the PhD degree in signal processing from the Beijing University of Posts and Telecommunications. His research interests include data mining, machine learning, trusted computing, and content security. He has published more than 30 papers in highly ranked journals and top conference proceedings. He is a senior Member of the Chinese Institute of Electronics and a member of the IEEE.

Xiaodong Liu is currently studying for a master’s degree in the School of Computer Science and Technology at Beijing University of Technology. Research Fields: Network Security, Intrusion Detection, Machine Learning.

Tong Li holds a lecturer position in the Faculty of Information Technology at the Beijing University of Technology, China. He received his PhD degree in Computer Science from the University of Trento in 2016. He has been an author or co-author of more than 70 papers in peer-reviewed journals, conferences, or workshops in the areas of requirements engineering, security engineering, and conceptual modeling. He is currently focusing on analyzing security requirements for social engineering attacks. He is now hosting a National Natural Science Foundation of China, a subtask of a National Key Research and Development Program of China, and a Beijing Education Science Planning Funding. He is an expert of ISO/IEC JTC 1/SC 27/WG 4 and works as a co-editor of ISO/IEC 24392.

Di Wu is currently pursuing the PhD degree in college of computer science and technology at Beijing University of Technology, Beijing, China. Her research interests include many-objective optimization algorithm and knowledge graph embedding.

Jinjiang Wang is a current undergraduate student majoring in information security at Beijing University of Technology, Beijing, China. His research interests include machine learning-based network intrusion detection algorithm, and reinforcement learning.
Yunwei Zhao received her PhD from Tsinghua University in 2015 and worked as a postdoctoral researcher in Nanyang Technological University afterwards. She joined CNCERT/CC in 2017. Her research interest is data analytics, network security, data interdependence, behavior modeling, and social media analytics. Her publications appear in top-tier venues including IJCAI, IJCNN, WI-IAT, etc.

Han Han is an engineer of CNCERT/CC. He specializes in software engineering, AI, and cybersecurity. His research has bridged the gap between the theory and practical usage of AI-assisted software systems for better quality assurance and security.