Detection of DDoS Attacks and Flash Events Occurring Simultaneously in Network Traffic Using Deep Learning Techniques
By
CARL EGINALD MIHANJO
DECLARATION AND COPYRIGHT
I, Carl E. Mihanjo, declare that this dissertation is my original work and that it has
not been presented and will not be presented to any other University for a
similar or any other degree award.
Signature: ……………………
CERTIFICATION
“The undersigned certifies that he has read and hereby recommends for acceptance by
the University of Dodoma the dissertation entitled Detection of DDoS Attacks and Flash
Events Occurring Simultaneously in Network Traffic Using Deep Learning in partial
fulfilment of the requirements for the degree of Master of Science in
Telecommunication Engineering of the University of Dodoma”.
DR. MONGI, A
ABSTRACT
ACKNOWLEDGEMENT
The space is not enough to mention all who assisted me in accomplishing this
dissertation. My heart is full of gratitude to all staff of the College of Informatics
and Virtual Education for their positive comments that made this piece of work
look as it is today. I pray that the Almighty God bless you all abundantly for your
good services.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENT
2.1.2 FE
2.1.9 Optimizer
2.2 Related Work
3.7.6 Optimizer
3.7.7 Output
3.7.8 Design Summary in Table and Deep Neuron Network diagrams
3.7.10 Model Validation
4.1 Description of the DDoS attacks and FE in network traffic with reference to
a normal traffic Results and Finding for FE
4.2 Modelling of described pattern using deep learning technique for detection
of DDoS attacks and FE
4.3 Validate performance of the developed model for detection of DDoS attacks
and FE
REFERENCES
LIST OF TABLES
LIST OF FIGURES
Figure 4. 5: Accuracy, Loss of the Model with Learning Rate 0.01
Figure 4. 6: Accuracy, Loss of the Model with Learning Rate 0.1
Figure 4. 7: Accuracy, Loss of the model with learning rate 0.01
Figure 4. 8: Accuracy, Loss of the Model with Learning Rate 0.1
Figure 4. 9: Accuracy, Loss of the Model with Learning Rate 0.01
Figure 4. 10: Accuracy, Loss of the model with learning rate 0.1
Figure 4. 11: Accuracy, Loss of the Model with a Learning Rate of 0.1
Figure 4. 12: Accuracy, Loss of the Model with a Learning Rate of 0.01
Figure 4. 13: Three Hidden Layers, 0.01 Learning Rate on a Real Dataset Detection
Figure 4. 14: Three Hidden Layers, Learning Rate 0.1 on a Real Dataset Detection
Figure 4. 15: Two Hidden Layers, 0.01 Learning Rate on a Real Dataset Detection
Figure 4. 16: Two Hidden Layers, Learning Rate 0.1 on a Real Dataset Detection
Figure 4. 17: One Hidden Layer, 0.01 Learning Rate on a Real Dataset Detection
Figure 4. 18: One Hidden Layer, Learning Rate 0.1 on a Real Dataset Detection
LIST OF ABBREVIATIONS AND ACRONYMS
DL Deep Learning
FE Flash Events
PC Personal Computer
CHAPTER ONE
INTRODUCTION
In an attempt to distinguish the effects of DDoS and FE on network traffic over
the internet, Daneshgadeh et al. (2019b) proposed a model that uses Shannon
entropy, the Kernel Online Anomaly Detection (KOAD) algorithm and the Mahalanobis
distance metric together with a machine learning technique. The authors reduced
false alarms and improved the detection rate for High Rate and Low Rate DDoS
(HR-DDoS and LR-DDoS) and FE independently. Sun et al. (2019) proposed a method
that uses K-nearest neighbours (KNN), a machine learning approach, to detect DDoS
and FE based on flow characteristics of the network traffic. The authors considered
flow features such as protocol type and the entropy of the source/destination IP.
As a result, the proposed model reduced false alarms and improved the detection
rate. In another study, Sahoo et al. (2018) evaluated Shannon entropy and
Kullback-Leibler divergence as information metrics to distinguish HR-DDoS and FE
in SDN network traffic. The study used two information-theoretic measures,
generalized entropy and generalized information distance; both metrics were shown
to detect and distinguish, to a great extent, FE from DDoS attacks.
It is evident that several studies have been able to detect DDoS and FE
independently. However, DDoS and FE may happen at the same time and hence confuse
the detection mechanisms proposed so far. Furthermore, the studies that detect
DDoS and FE (Daneshgadeh et al., 2019b; Sun et al., 2019) focused on machine
learning. There are ongoing arguments in the research community about the
efficiency of this strategy in detecting DDoS and FE. For instance, Imamverdiyev
and Abdullayeva (2018) note that in machine learning feature extraction is done by
humans, which means that for big data it becomes impractical for a human to
extract the hidden features and patterns. For that reason, there is a need to
find a different technique.
Therefore, this study focused on issues that were not covered by previous
scholars, in an attempt to find a way to detect and separate DDoS attacks and FE
in networks, especially when they occur simultaneously.
Large traffic volumes have been observed to attract attacks on dedicated application
systems from different corners of the globe (Zhang, Zhang, & Yu, 2018). DDoS
consumes system bandwidth and other computing resources such as CPU and memory,
which leads to the degradation of network services (Daneshgadeh, Kemmerich, & Ahmed,
2019a). FE has similar effects on services as DDoS (Sahoo et al., 2018).
Thus, several detection techniques have been proposed to distinguish DDoS attacks
from FE, as the two need different countermeasures.
Daneshgadeh et al. (2019b) proposed a model that uses Shannon entropy and KOAD
algorithms based on machine learning to detect anomalies in network traffic. The
study reduced false alarms and improved the detection rate for HR-DDoS and
LR-DDoS. Sun et al. (2019) proposed a method that uses KNN, a machine learning
approach, to detect DDoS and FE based on flow characteristics of the network
traffic. The flow features included protocol type and the entropy of the
source/destination IP. The proposed model reduced false alarms and improved the
detection rate. However, these studies considered only a few of the features found
in flow characteristics. Moreover, the proposed models work with separate data that
did not come from the same network. The proposed techniques used features such as
flow characteristics and information metrics, but they all used machine learning as
their key method. The major disadvantage of machine learning in this scenario is
that feature extraction is done by human beings, which may be a source of error
(Imamverdiyev & Abdullayeva, 2018).
Therefore, this study aimed to develop a model to detect DDoS attacks and FE
occurring simultaneously in network traffic using deep learning techniques. Deep
learning was selected to overcome a weakness of traditional machine learning:
there, feature extraction is done by human beings, whereas in deep learning it is
done by the machine (Imamverdiyev & Abdullayeva, 2018).
i. To study the DDoS attack and FE patterns in network traffic with reference to
normal traffic
ii. To construct a model using deep learning techniques for detection of DDoS
attacks and FE
1.4 Research Questions
i. How do DDoS attack and FE patterns behave in network traffic with
reference to normal traffic?
ii. What model suits the described pattern using DL techniques for DDoS attacks
and FE in network traffic?
iii. What is the performance of the developed model for detection of DDoS
attacks and FE?
CHAPTER TWO
LITERATURE REVIEW
This chapter surveys previous written works relevant to the study, drawn from
different sources of information, including the internet, reports, journals and
books. It also presents several concepts used in the study, namely DDoS attacks,
FE, deep learning, feature extraction, classification algorithms and CAPTCHA,
followed by related works. The research gap concludes the chapter by pointing out
issues that need to be addressed by future researchers.
2.1.2 FE
FE occurs when many legitimate users access shared resources at once, leading to
the degradation of services (Bhatia, 2017). According to Sahoo et al. (2018), FE
are events in which several thousand legitimate users try to access shared services
because of important announcements or breaking news, which leads to high network
traffic that causes response delays.
2.1.4 Deep Learning Classification
Classification is one of the most common and frequently tackled problems in the
machine learning and deep learning domain. It consists of assigning entities to
categories. When there are more than two possible outcomes, the problem is a
multi-class classification. Binary classification is the simplest form, in which
an entity is classified into one of two possible categories/outcomes, for example
cat or dog, pass or fail. This classification problem has been addressed in several
studies, as reported by Alkhaleefah & Wu (2019), Rahman, Wang, Sun, & Zhou (2006)
and Shu (2019). Deep learning has been applied by different scholars, namely Harms
(2019), Li (2017) and Rymarczyk, Kozłowski & Niderla (2019). This study is a
classification problem by nature: deciding between FE and DDoS attacks.
Classification is therefore suited to the kind of problem the study aimed to solve.
2.1.7 Activation Function
An activation function is applied when the output layer of a neural network is
required to give a value of 1 or 0. It is a mathematical function whose choice
affects gradient descent, since a suitable activation keeps gradients from
diminishing or saturating towards zero (Leskovec et al., 2020). It is used to
solve nonlinear problems (Wang, Li, Song, & Rong, 2020). In general, an activation
function performs a mathematical operation on the signal output of a neuron.
Moreover, its choice depends on the type of problem. Popular activation functions
include the linear activation function, the Rectified Linear Unit (ReLU), the
hyperbolic tangent function and the sigmoid function. A linear activation function
returns an output proportional to its input, whereas the sigmoid function maps any
real number to a value between 0 and 1; for binary classification, an output below
0.5 is read as 0 and an output above 0.5 is read as 1 (Mhaskar & Micchelli, 1994;
Panchal & Panchal, 2014). The most used non-saturating activation function in
neural networks is ReLU (Feng & Lu, 2019). It is popular in ANNs, especially in
Convolutional Neural Networks (CNN) and deep learning.
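The activation functions named above can be sketched in plain Python as follows. This is an illustration rather than the study's code; the function names and the 0.5 decision threshold applied to the sigmoid output are assumptions.

```python
import math


def linear(x: float) -> float:
    """Linear (identity) activation: the output equals the input."""
    return x


def relu(x: float) -> float:
    """Rectified Linear Unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def tanh(x: float) -> float:
    """Hyperbolic tangent: maps any real input into (-1, 1)."""
    return math.tanh(x)


def sigmoid(x: float) -> float:
    """Sigmoid: maps any real input into (0, 1), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def to_binary(probability: float, threshold: float = 0.5) -> int:
    """Turn a sigmoid output into a 0/1 label (the threshold is an assumption)."""
    return 1 if probability >= threshold else 0


for x in (-2.0, 0.0, 2.0):
    print(x, relu(x), round(tanh(x), 3), round(sigmoid(x), 3), to_binary(sigmoid(x)))
```

The sigmoid never returns exactly 0 or 1 for moderate inputs; the binary value only appears after the threshold is applied.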
[Figure: activation function plot, Output (ticks at 0.5 and 1) against Activation]
2.1.8 Loss Function
A loss function, or cost function, measures how far the model output is from the
label or true value. The output of the model is compared with the label, and the
difference helps the model adjust its weights towards the true values (Poirot,
2019). Consider the equation below, which shows a cost function with model output
y1 and label y:
Loss(y1, y) = how much y1 differs from the true y    (2)
Logistic regression and other classification problems use the cross-entropy loss
function to calculate the loss. Cross-entropy loss (LCE) is the negative log
likelihood loss (Keren, Sabato, & Schuller, 2020). The mathematical form of the
cross-entropy loss is given in equations:
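For a binary problem with label y in {0, 1} and model output y1 in (0, 1), the standard binary cross-entropy is LCE(y1, y) = -[y log(y1) + (1 - y) log(1 - y1)]. The sketch below evaluates this standard form in plain Python; it is an illustration, and the clipping constant `eps` is an assumption added to avoid taking the logarithm of zero.

```python
import math


def binary_cross_entropy(y1: float, y: int, eps: float = 1e-12) -> float:
    """Negative log-likelihood of label y (0 or 1) under model output y1."""
    y1 = min(max(y1, eps), 1.0 - eps)  # keep y1 strictly inside (0, 1)
    return -(y * math.log(y1) + (1 - y) * math.log(1.0 - y1))


def mean_binary_cross_entropy(outputs, labels) -> float:
    """Average loss over a batch of (output, label) pairs."""
    losses = [binary_cross_entropy(o, t) for o, t in zip(outputs, labels)]
    return sum(losses) / len(losses)


# A confident correct output gives a small loss, a confident wrong one a large loss.
print(binary_cross_entropy(0.9, 1), binary_cross_entropy(0.1, 1))
```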
2.1.9 Optimizer
After the loss of the model is known, the next step is to update the weights of
the model, using the gradient of the loss, to new values that bring the output
closer to the true label. This is where gradient descent is done. Moreover, the
learning rate parameter is chosen by looking at the size of the data; if it is
small, training takes a long time but can produce more reliable results. Which
parameter values are suitable for the model, especially a neural network, depends
on the problem at hand.
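The update step can be illustrated on a toy one-parameter loss, loss(w) = (w - 3)^2, whose gradient is 2(w - 3). This is a sketch, not the study's training code; the loss, the starting point and the step counts are assumptions chosen for illustration.

```python
def gradient_descent(learning_rate: float, steps: int, w: float = 0.0) -> float:
    """Minimise loss(w) = (w - 3)^2 by repeated gradient steps."""
    for _ in range(steps):
        gradient = 2.0 * (w - 3.0)         # derivative of the toy loss
        w = w - learning_rate * gradient   # move against the gradient
    return w


for rate in (0.01, 0.1, 1.1):
    print(rate, gradient_descent(rate, steps=10))
```

With the same number of steps, the small rate is still far from the minimum at w = 3, the moderate rate is close to it, and the overly large rate overshoots further on every step and diverges.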
machine learning technique to distinguish DDoS from FE. The study reduced false
alarms and improved the detection rate for high-rate and low-rate DDoS (HR-DDoS
and LR-DDoS). However, the proposed model works with separate data that did not
come from the same network.
Also, Daneshgadeh et al. (2019a) proposed another model in a different study that
used Shannon entropy and KOAD to detect anomalies in network traffic, similar to
other studies, but used Support Vector Machine (SVM), a machine learning method,
to distinguish FE from DDoS attacks. In the study, the authors detected DDoS
attacks and FE when they occur separately. Nevertheless, the study did not
consider the case where DDoS attacks and FE occur simultaneously.
Dhingra (2018) revealed several parameters that distinguish DDoS from FE. He
asserts that FE needs different countermeasures, since legitimate users should be
allowed through and not blocked or treated as attackers. This can only succeed if
FE is separated from DDoS. Moreover, the author distinguishes FE from DDoS by
pointing out that DDoS traffic comes from pre-programmed bots. Bots are
pre-programmed for the rate of requests, the payload and the time interval between
requests; also, the systems under the control of the botmaster are pre-defined.
The author concludes that the traffic generated by bots is highly similar. FE is a
totally different scenario, as genuine requests are difficult to predict: a normal
user sends a request depending on the information he/she seeks. Furthermore, the
study gives the parameters by which the two can be differentiated as follows: the
rate of incoming requests, the number of requests from new IPs, the geographical
distribution of request sources, the requested files and the patterns among source
IPs. Through these parameters the two kinds of traffic can be distinguished.
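Two of the listed parameters, the rate of incoming requests and the number of requests from new IPs, can be computed from a request log as sketched below. The log format (timestamp in seconds, source IP), the 60-second window and the function name are assumptions for illustration, not Dhingra's implementation; the addresses are made-up examples.

```python
from collections import defaultdict


def window_features(requests, known_ips, window_seconds=60):
    """Per-window request count and count of requests from unknown IPs.

    requests: iterable of (timestamp_seconds, source_ip) tuples.
    known_ips: set of source IPs observed before the capture (baseline).
    """
    features = defaultdict(lambda: {"requests": 0, "from_new_ips": 0})
    for timestamp, source_ip in requests:
        window = int(timestamp // window_seconds)
        features[window]["requests"] += 1
        if source_ip not in known_ips:
            features[window]["from_new_ips"] += 1
    return dict(features)


sample_log = [
    (0.5, "10.0.0.1"),
    (10.0, "10.0.0.2"),
    (61.0, "10.0.0.9"),
    (65.0, "10.0.0.9"),
]
result = window_features(sample_log, known_ips={"10.0.0.1"})
print(result)
```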
Sun et al. (2019) proposed a method that used KNN, a machine learning approach,
to detect DDoS and FE based on flow characteristics of the network traffic. Among
the flow features were protocol type and the entropy of the source/destination IP.
The proposed model reduced false alarms and improved the detection rate. However,
the study considered only a few of the features found in flow characteristics.
In the study of Sahoo et al. (2018), the authors evaluated Shannon entropy and
Kullback-Leibler divergence as information metrics to distinguish HR-DDoS
and FE in SDN network traffic. The proposed metrics reduced false alarms. However,
the study focused only on information metrics to detect HR-DDoS and FE.
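The two information metrics mentioned above have standard definitions that can be sketched in a few lines. The code below computes the Shannon entropy of a source-IP distribution and the Kullback-Leibler divergence between two windows; it shows the textbook formulas only, not the generalized metrics or the code of Sahoo et al., and the helper names, the `eps` fallback and the example addresses are assumptions.

```python
import math
from collections import Counter


def distribution(items):
    """Relative frequency of each distinct item (e.g. source IP)."""
    counts = Counter(items)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def shannon_entropy(p):
    """H(P) = -sum p log2 p, in bits."""
    return -sum(value * math.log2(value) for value in p.values() if value > 0)


def kl_divergence(p, q, eps=1e-12):
    """D(P || Q) in bits; eps stands in for keys that are missing from Q."""
    return sum(
        value * math.log2(value / max(q.get(key, 0.0), eps))
        for key, value in p.items()
        if value > 0
    )


spread = distribution(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])
single = distribution(["10.0.0.1"] * 4)
print(shannon_entropy(spread), shannon_entropy(single), kl_divergence(spread, single))
```

Requests spread evenly over many sources give high entropy, while requests concentrated on one source give zero entropy.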
With regard to DL, the technique proposed in this study, the majority of
researchers used it to detect DDoS attacks only in network traffic and did not
consider FE, as it was not their focus. Imamverdiyev and Abdullayeva (2018)
proposed an application of deep learning based on a Gaussian-Bernoulli type
restricted Boltzmann machine (RBM), using the NSL-KDD dataset to detect DDoS
attacks. The application outperformed traditional machine learning methods, namely
SVM, Decision Tree and others. However, that study focused only on DDoS, although
it shows the strength of the deep learning technique.
According to Li (2018), using a deep learning technique not only improved accuracy
in detecting DDoS attacks but also reduced the dependence on physical hardware and
software, while making the real-time detection mechanism easier to update. The
author achieved a high accuracy of 98-99% in the training phase using the ISCX
dataset on a Software Defined Network (SDN). McDermott, Majdani, and Petrovski
(2018) used a deep learning technique based on a Bidirectional Long Short Term
Memory Recurrent Neural Network (BLSTM-RNN) to detect Mirai botnet DDoS attacks
and obtained a validation accuracy of 98-99% while reducing the loss.
On the other hand, Priyadarshini and Barik (2019) proposed a defence mechanism
using deep learning based on LSTM to detect DDoS attacks in a fog environment and
obtained 98.88% accuracy on testing data. According to them, the model uses 128
input nodes, 3 hidden layers and one dense layer to achieve steadily growing
accuracy and reduced error.
Currently, the most widely used DDoS attack mitigation technique is CAPTCHA
(Al-Ali, Al-Duwairi, & Al-Hammouri, 2016). CAPTCHA stands for Completely Automated
Public Turing test to tell Computers and Humans Apart. It uses a challenge test to
tell humans and computers apart (Saikirthiga & Vaithyasubramanian, 2016). A
CAPTCHA is issued whenever there is an anomaly in network traffic, so in the case
of both DDoS attacks and FE a CAPTCHA will be issued. However, FE traffic comes
from legitimate users, who should not be tested by CAPTCHA; instead they should
continue to the requested service.
Several studies focused on detecting anomalies in the network, as this is an
important step when trying to detect DDoS and FE. Chen et al. (2019) used a hybrid
technique combining unsupervised and supervised machine learning to detect
anomalies in network traffic. The study obtained high detection accuracy with 1%
false positive and false negative rates.
Garg et al. (2019) proposed a model using deep learning techniques to detect
anomalous traffic in social media in SDN. The proposed model achieved over 99%
accuracy using the TIET, KDD99 and CMU datasets. This shows the strength of deep
learning techniques in feature extraction.
The previous studies obtained high accuracy using computational intelligence and
processing that involve machine learning methods, while deep learning is gaining
success in major industrial applications owing to its ability to learn features
from big data (Yu & Zhou, 2020). A promising and currently well-known programming
language for implementing such methods is Python, as it has built-in libraries for
scientific research (Kumar & Panda, 2019). Combining activation functions can also
improve the performance of a model (Manessi & Rozza, 2018). In the data
manipulation and preparation part, data were divided into three groups, namely
train, test and validation data. Testing data help to check that the model
generalizes and has not simply memorized the data after being trained for a long
time on a single source of data (training) (Allmer, 2014).
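A three-way split of this kind can be sketched as follows. The 70/15/15 proportions, the fixed seed and the function name are assumptions for illustration; the study's actual proportions are not stated here.

```python
import random


def split_dataset(samples, train_fraction=0.7, validation_fraction=0.15, seed=42):
    """Shuffle once, then cut into train, validation and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_fraction)
    n_validation = round(len(shuffled) * validation_fraction)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))
```

Because the three parts do not overlap, accuracy on the test part reflects data the model has never seen during training.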
Unfortunately, most of the proposed solutions to detect FE and DDoS were designed
to deal with each effect separately. However, in a real network environment DDoS
and FE may happen at the same time and should be identified and dealt with
appropriately.
This study therefore created a model which can detect and differentiate between
DDoS attacks and FE happening simultaneously in computer networks.
CHAPTER THREE
METHODOLOGY
3.1 Introduction
This chapter presents the research setting, design, strategy, simulation setup,
generation of DDoS and FE, experimental tools, data collection, data analysis and
model development. The next chapter focuses on the results obtained using the
methods described in this chapter.
3.4.1 Simulation Setup
The simulation setup comprised computers for the DDoS attacker, the normal user,
the FE generator, Wireshark and the target server. The network was simulated as a
normal network using GNS3, with switches, routers and an ISP. This was done to
reflect a real network situation.
3.5.1 Studying of the DDoS attacks and FE in network traffic with reference to
a normal traffic
In the simulation, the Scapy and Wireshark tools were used; the patterns were
observed and collected for DDoS attacks and FE. The procedures were as follows:
3.5.1.1 FE Generation
FE was generated based on the FIFA World Cup 98 dataset using the Python Scapy
tool; the FIFA World Cup 1998 dataset is the only available dataset representing a
predictable FE. The researcher's focus was to generate a pattern similar to that
in the FIFA World Cup 98 dataset. According to Daneshgadeh et al. (2019b), the
highest FE occurred on the 66th day between 23:30 and 23:46, covering 16 minutes
towards the match between Argentina and England. The focus was to replicate these
16 minutes in the simulator. In that dataset, IP addresses were replaced by code
ids for the purpose of retaining privacy. The following table shows important
details about the dataset, such as the number of requests, range of code ids,
maximum code id and destination id.
1. 2,712,425
Table 3. 4: Server and Network device specifications
Moreover, among network layer attacks (volume-based attacks) the most effective
attack is the flood (Sahi, Lai, Li, & Diykh, 2017). Flood DDoS attacks include
UDP, ICMP, TCP SYN and HTTP floods. In this study the focus was on the TCP SYN
flood (SYN flood), so the DDoS generator produced a volume-based TCP SYN flood
attack. In a TCP SYN flood, the attacker exploits the normal TCP three-way
handshake by sending a connection request; when the server replies with an
acknowledgement, the attacker does not send the final acknowledgement, which
leaves the server waiting for it for some period of time. The server then denies
other clients, as many connections are held open waiting for acknowledgement. This
scenario exhausts network bandwidth, CPU and other server resources.
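From the server's point of view, the effect described above shows up as connections that received a SYN but never the final ACK. The sketch below counts such half-open connections from a simplified event list; the event format, the function name and the example addresses are assumptions for illustration, and it is an observation aid rather than part of the study's generator.

```python
def count_half_open(events):
    """Count clients that sent SYN but have not completed the handshake.

    events: iterable of (client, flag) in time order, where client is an
    (ip, port) tuple and flag is "SYN" or "ACK" as sent by the client.
    """
    half_open = set()
    for client, flag in events:
        if flag == "SYN":
            half_open.add(client)        # server now waits for the final ACK
        elif flag == "ACK":
            half_open.discard(client)    # handshake completed
    return len(half_open)


events = [
    (("10.0.0.5", 40001), "SYN"),
    (("10.0.0.5", 40001), "ACK"),      # normal client completes the handshake
    (("203.0.113.7", 50001), "SYN"),   # never acknowledged
    (("203.0.113.7", 50002), "SYN"),   # never acknowledged
]
print(count_half_open(events))
```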
1 4 4 100 2110
3.5.1.10 Data Preparation for FE and DDoS Attacks Generation at the Same
Time with Python Language
In this stage, the data were obtained by running both generators at the same time
and capturing the traffic using Wireshark. The simulated data were then graphed
based on the number of requests over time (minutes).
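The per-minute request counts behind such a graph can be derived from packet timestamps as sketched below. It is assumed here that the capture has been exported as a list of timestamps in seconds; the function name and the sample values are illustrative.

```python
from collections import Counter


def requests_per_minute(timestamps, start):
    """Number of requests in each whole minute since `start` (seconds)."""
    counts = Counter(int((t - start) // 60) for t in timestamps if t >= start)
    last_minute = max(counts) if counts else -1
    # Minutes with no traffic are reported as 0 so the series has no gaps.
    return [counts.get(minute, 0) for minute in range(last_minute + 1)]


series = requests_per_minute([0.0, 30.0, 59.9, 60.0, 185.0], start=0.0)
print(series)
```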
activation function, and output were selected and used in the model. As the input
depends on the dataset generated and the output is categorical, different hidden
layers and activation functions were tested and analysed. As before, performance
metrics were used to analyse the models created with several hidden layers and
activation functions. The third and last specific objective, validation of the
developed model, was accomplished and analysed using test and validation data. The
performance metrics of DL were used for the analysis.
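Common performance metrics for a binary detector can be computed from predicted and true labels as sketched below. The choice of accuracy, false positive rate and false negative rate is an assumption made for illustration, with label 1 taken as the positive class; the labels in the example are made up.

```python
def binary_metrics(predicted, actual):
    """Accuracy, false positive rate and false negative rate for 0/1 labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


scores = binary_metrics(predicted=[1, 0, 1, 1, 0, 0], actual=[1, 0, 0, 1, 1, 0])
print(scores)
```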
3.7.1 Introduction
In the previous stage, the data obtained contained FE and DDoS happening at the
same time. The aim here was to design a model that could detect and distinguish
the two. The nature of this problem is classification, as indicated earlier;
therefore, the researcher selected deep learning techniques as the method for
detecting and distinguishing. A classification problem depends on how many
categories are to be classified, and here there are two categories: FE or DDoS
attack. Therefore, it is a logistic (binary) classification whose output can be 0
or 1. To implement the model, the researcher used the Python language, relying on
the PyTorch library.
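A binary classifier of this kind can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the study's model: the number of input features (4), the hidden layer sizes (16 and 8), the ReLU hidden activations and the 0.5 threshold are all hypothetical values chosen for the example.

```python
import torch
from torch import nn


def build_classifier(n_features: int, hidden_sizes=(16, 8)) -> nn.Sequential:
    """Feed-forward network with a single sigmoid output for two classes."""
    layers = []
    previous = n_features
    for size in hidden_sizes:
        layers += [nn.Linear(previous, size), nn.ReLU()]
        previous = size
    layers += [nn.Linear(previous, 1), nn.Sigmoid()]  # output in (0, 1)
    return nn.Sequential(*layers)


model = build_classifier(n_features=4)
batch = torch.rand(5, 4)                  # 5 random samples, 4 features each
probabilities = model(batch)              # shape (5, 1)
labels = (probabilities >= 0.5).int()     # thresholded to 0 or 1
print(probabilities.shape, labels.flatten().tolist())
```

Changing `hidden_sizes` to a tuple of one or three values gives networks with one or three hidden layers without touching the rest of the code.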
number of epochs to train the model also depends on the size of the data. The
researcher settled on the following at this stage: 1 to 3 hidden layers, starting
with 1 epoch of training and then increasing the epochs until the desired training
and validation accuracy were achieved. Moreover, there is no clear and systematic
way to select the number of nodes/neurons. The researcher tested different values
and settled on the one that had the most desirable performance.
3.7.6 Optimizer
The researcher picked two optimizers, namely Adam and SGD, which can help the
model learn faster and so speed up the training and test phases. The researcher
decided to test both optimizers with learning rates of 0.1 and 0.01 for better
performance in the training phase.
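The combinations of optimizer, learning rate and number of hidden layers can be enumerated as sketched below. The full cross product is an assumption about how such trials could be organised; the study need not have run every combination, and the dictionary keys are illustrative names.

```python
from itertools import product

OPTIMIZERS = ("Adam", "SGD")
LEARNING_RATES = (0.1, 0.01)
HIDDEN_LAYERS = (1, 2, 3)


def experiment_grid():
    """Every (optimizer, learning rate, hidden layers) combination as a dict."""
    return [
        {"optimizer": name, "learning_rate": rate, "hidden_layers": depth}
        for name, rate, depth in product(OPTIMIZERS, LEARNING_RATES, HIDDEN_LAYERS)
    ]


grid = experiment_grid()
for configuration in grid:
    print(configuration)
```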
3.7.7 Output
In the output part, only one output is given: either FE or DDoS attack. The
researcher used a binary value to represent the output as follows: 0 for FE and 1
for DDoS attacks.
Table 3. 6: Summary of Model Design Parameter
Figure 3. 2: Deep Neuron Network Diagram with 2 Hidden Layers
3.7.10 Model Validation
3.7.11 Introduction
In the previous stage, several model designs were obtained that can detect and
distinguish between FE and DDoS attacks when they happen at the same time. The
focus here was to test the performance of the models on real data from the network
of the University of Dodoma.
Figure 3. 4: Training, Testing and Validation phase flow chart
The second phase is the collection of data for testing the model in a live
network. In this study, the UDOM Local Area Network (LAN) was used. This raised
some ethical concerns, such as the privacy, confidentiality and integrity of data
generated by network users and applications. Therefore, the researcher preserved
these attributes by securing the IP addresses, service port numbers and server
credentials that were made available for the purpose of this study.
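One way to secure IP addresses in a capture is to replace each address with a keyed pseudonym, so that the same address always maps to the same code but cannot be read back. The sketch below is an assumption about how this could be done, not a description of the procedure used in the study; the key, the function name and the 16-character length are hypothetical.

```python
import hashlib
import hmac


def pseudonymise_ip(ip_address: str, secret_key: bytes) -> str:
    """Replace an IP address with a stable keyed code (HMAC-SHA256, truncated)."""
    digest = hmac.new(secret_key, ip_address.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


key = b"example-key-kept-outside-the-dataset"  # hypothetical secret
code_a = pseudonymise_ip("192.0.2.10", key)
code_b = pseudonymise_ip("192.0.2.11", key)
print(code_a, code_b)
```

Keeping the mapping stable preserves per-source patterns in the traffic, which the detection features rely on, while the original addresses stay out of the dataset.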
CHAPTER FOUR
RESULTS AND DISCUSSIONS
4.1 Description of the DDoS attacks and FE in network traffic with reference
to a normal traffic Results and Finding for FE
4.1.1 FE pattern
Execution of the script took 16 minutes, and the results were plotted on a graph.
The purpose of this stage was to replicate the FIFA dataset as it happened. The
following graph shows the number of requests in each of the 16 minutes for both
the traffic generator and FIFA World Cup 98.
The researcher was able to simulate and generate traffic almost similar to the
FIFA dataset, as shown in fig. 3, with a tolerance of about +/- 5%. The challenge
faced was a hardware issue: when using a specification below what the researcher
indicated, the outcome may diverge from the target. Moreover, the researcher faced
a packet rate limitation of the Scapy tool when sending packets. This could be
improved by a deeper understanding of the Scapy library, but time did not permit.
For future work, the researcher plans to study the Scapy library in more depth to
improve the FE traffic generator.
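A tolerance of this kind can be checked minute by minute as sketched below. The function name is illustrative and the counts in the example are made-up numbers, not values from the FIFA dataset or from the study's generator.

```python
def within_tolerance(generated, reference, tolerance=0.05):
    """True if every generated count is within +/- tolerance of the reference."""
    if len(generated) != len(reference):
        raise ValueError("both series must cover the same number of minutes")
    return all(
        abs(g - r) <= tolerance * r
        for g, r in zip(generated, reference)
    )


print(within_tolerance([100, 209], [104, 200]))   # both minutes inside 5%
print(within_tolerance([100, 230], [104, 200]))   # second minute is 15% off
```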
4.1.2 DDoS attacks pattern
Execution of the script took 8 minutes for both scenarios. The results were
plotted on a graph. The following graph shows the number of requests in each of
the 8 minutes for both scenarios, with a 4-domain and a 1-domain network.
The results were similar to those of other researchers who reported on how a flood
DDoS attack behaves. The DDoS attack produces a square shape, as it maintains the
number of packets for a certain period of time.
Figure 4. 3: Network Traffic of FE with DDoS Attack in 16 Minutes
The researcher was able to simulate and generate network traffic that contained
both FE and DDoS attacks happening simultaneously. This provides a new direction,
as it gives an alternative to researchers who wish to try their models in
scenarios like this, where FE and DDoS attacks happen at the same time. The
challenge faced was a hardware issue: when using a specification below what the
researcher indicated, the outcome could diverge from the target. Moreover, the
researcher faced a packet rate limitation of Scapy when sending packets. This
could be improved by a deeper understanding of the Scapy library, but time did not
permit. For future work, the researcher plans to study the Scapy library in more
depth so as to improve the FE traffic generator.
In the first trial, where the learning rate was 0.01, the model took less than 50
seconds to obtain the highest accuracy of about 99% while the loss dropped
dramatically to almost 0.01. The model took short, small steps in gradient
descent, which produced more desirable gradients and led to the high accuracy.
This is shown in Figure 4.5:
In the second trial, where the learning rate was 0.1, the model took more than 50
seconds to obtain the highest accuracy of about 99% while the loss dropped
dramatically to 0.01, as shown in Figure 4.6. Here the model took much larger
steps, which can lead to quick learning, but it overshot and missed the true and
desirable gradient(s). That is why it took longer than with the learning rate of
0.01.
In the first trial for two hidden layers with a learning rate of 0.01, the model
again took less than 50 seconds to obtain the highest accuracy of about 99% while
the loss dropped dramatically to almost 0.01. The reason is the same as for the
previous model with the same learning rate of 0.01: the model took small steps
toward the desirable gradient(s). This is shown in Figure 4.7:
Figure 4. 7: Accuracy, Loss of the model with learning rate 0.01
In the second trial, where the learning rate was 0.1, the model took between 10
and 15 seconds to obtain the highest accuracy of about 99% while the loss dropped
dramatically to almost 0.01. In this scenario, with a large learning rate, the
model learned more quickly, which did not happen with the previous model at the
same learning rate. This is because the number of hidden layers and nodes was
reduced, meaning the model did not go deep enough to capture all features and
instead generalized the result. The result in Fig 4.7 was not affected as much
because the learning rate of 0.01 is small compared to 0.1. This is shown in
Figure 4.8:
In the first trial for one hidden layer with a learning rate of 0.01, the model
took more than 50 seconds to obtain the highest accuracy of about 99% while the
loss dropped dramatically to almost 0.01. The reason is the same as for the
previous model with the same learning rate of 0.01: the model took small steps
toward the desirable gradient(s). The model took little time, as it did not go
deep enough to capture all features, having only one hidden layer. This is shown
in Figure 4.9:
In the second trial for one hidden layer with a learning rate of 0.1, the model
again took between 10 and 15 seconds to obtain the highest accuracy of about 99%
while the loss dropped dramatically to almost 0.01. In this scenario, as in Fig
4.11, with a large learning rate the model learned more quickly, which did not
happen with the previous model at the same learning rate. This is because the
number of hidden layers and nodes was reduced, meaning the model did not go deep
enough to capture all features and instead generalized the result. This is shown
in Figure 4.10:
Figure 4. 10: Accuracy, Loss of the model with learning rate 0.1
In the first trial for three hidden layers with a learning rate of 0.1 and the SGD
optimizer, the model took almost 130 seconds to obtain the highest accuracy of
about 99% while the loss dropped dramatically to almost 0.01. This shows that the
optimizer is not suitable, as it took a long time to capture the desirable
gradient(s); this is shown in Figure 4.11.
Figure 4. 11: Accuracy, Loss of the Model with a Learning Rate of 0.1
In the second trial for three hidden layers with a learning rate of 0.01 and the
SGD optimizer, the model trained for 400 seconds and obtained the highest accuracy
of about 70-72% for test data and 75-82% for training data, while the loss dropped
to almost 0.55 for test and 0.5 for training data. The reason is that the model
took small steps but the optimizer failed to learn the features from the data;
hence it trained for a long time without capturing the true gradient(s). This is
evidenced in Figure 4.12.
Figure 4.12: Accuracy, Loss of the Model with a Learning Rate of 0.01
As shown in Figures 4.11 and 4.12, the model took too long to learn the pattern when
SGD was used as the optimizer. For the same learning rate, Adam learned the pattern
in the dataset more quickly than SGD. This indicates that Adam is the better
optimizer for training the model, so there was no need to evaluate SGD with a
smaller number of hidden layers. Moreover, with the Adam optimizer and a learning
rate of 0.1, training took only a few seconds compared to the 0.01 configuration.
Also, among the numbers of hidden layers considered, two hidden layers gave the
highest and quickest learning.
However, the number of hidden layers depends on the size of the dataset and the
number of inputs, so the model may need to change. For this study, the selection of
the appropriate model is discussed in the next step.
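The difference between the SGD and Adam update rules compared above can be sketched
as follows. This is a minimal illustration on a hypothetical two-weight quadratic
loss with one steep and one shallow direction, not the model trained in this study.
The curvature values, starting point and step count are assumptions, and the Adam
hyperparameters shown are the commonly used defaults.

```python
import math

# Toy loss (an assumption for illustration only):
# L(w) = 0.5 * (1.0 * w[0]**2 + 0.01 * w[1]**2)
CURVATURE = [1.0, 0.01]


def loss(w):
    return 0.5 * sum(c * x * x for c, x in zip(CURVATURE, w))


def gradient(w):
    return [c * x for c, x in zip(CURVATURE, w)]


def train_sgd(learning_rate, steps, start=(1.0, 1.0)):
    """Plain gradient descent: every weight uses the same step size."""
    w = list(start)
    for _ in range(steps):
        g = gradient(w)
        w = [x - learning_rate * gi for x, gi in zip(w, g)]
    return loss(w)


def train_adam(learning_rate, steps, start=(1.0, 1.0),
               beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: per-weight step sizes from running moments of the gradient."""
    w = list(start)
    m = [0.0] * len(w)   # running mean of gradients
    v = [0.0] * len(w)   # running mean of squared gradients
    for t in range(1, steps + 1):
        g = gradient(w)
        for i in range(len(w)):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)   # bias-corrected first moment
            v_hat = v[i] / (1 - beta2 ** t)   # bias-corrected second moment
            w[i] -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return loss(w)


sgd_loss = train_sgd(learning_rate=0.1, steps=200)
adam_loss = train_adam(learning_rate=0.1, steps=200)
print(f"SGD loss after 200 steps:  {sgd_loss:.6f}")
print(f"Adam loss after 200 steps: {adam_loss:.6f}")
```

On this toy loss, SGD moves slowly along the shallow direction because its step is
proportional to the small gradient there, while Adam rescales each weight's step and
reaches a lower loss in the same number of updates. This only illustrates the
mechanism; it does not reproduce the timings reported in this chapter.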
Figure 4.13: Three Hidden Layers, 0.01 Learning Rate on a Real Dataset Detection
Figure 4.14: Three Hidden Layers, Learning Rate 0.1 on a Real Dataset Detection
Figure 4.15: Two Hidden Layers, 0.01 Learning Rate on a Real Dataset Detection
Figure 4.16: Two Hidden Layers, Learning Rate 0.1 on a Real Dataset Detection
Figure 4.17: One Hidden Layer, 0.01 Learning Rate on a Real Dataset Detection
Figure 4.18: One Hidden Layer, Learning Rate 0.1 on a Real Dataset Detection
As summarized in Table 4.1, the models detected both FE and DDoS attacks.
However, the numbers of FE and DDoS attacks detected differed between models. This
implies that the model design affected the outcome, owing to the different numbers
of hidden layers and learning rates. As the number of hidden layers increased, the
difference became smaller. This implies that a deeper model (more layers) improves
accuracy. This is significant because FE and DDoS attacks both involve a large
volume of traffic or requests in a short period of time.
Therefore, the model with three hidden layers, at either learning rate, can classify
and detect FE and DDoS attacks. However, the learning rate of 0.01 gives a smooth
learning curve, without steps that are too large or too small. This implies that it
is better to take small steps when training the model, because with a high learning
rate it can overshoot or miss the desired weights. The model can detect with an
accuracy of 99% and a false alarm rate of less than 1%. This is consistent with
several studies that used DL for detection and reported high performance and
accuracy (Garg et al., 2019; C. Li et al., 2018; Priyadarshini & Barik, 2019), whose
proposed models obtained accuracies of between 98% and 99%.
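A minimal sketch of how accuracy and false alarm rate can be computed from predicted
labels is given here. It assumes a binary encoding (1 for DDoS, 0 for FE) and takes
the false alarm rate to be the fraction of FE samples flagged as DDoS. The encoding,
the function name and the sample labels are illustrative assumptions, not values
from Table 4.1.

```python
def accuracy_and_false_alarm(y_true, y_pred, attack_label=1):
    """Return (accuracy, false_alarm_rate) for binary attack/non-attack labels."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == attack_label:
            if actual == attack_label:
                tp += 1          # attack correctly detected
            else:
                fp += 1          # FE wrongly flagged as an attack (false alarm)
        else:
            if actual == attack_label:
                fn += 1          # attack missed
            else:
                tn += 1          # FE correctly passed
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    negatives = fp + tn
    false_alarm_rate = fp / negatives if negatives else 0.0
    return accuracy, false_alarm_rate


# Hypothetical labels: 3 DDoS samples followed by 5 FE samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
accuracy, false_alarm_rate = accuracy_and_false_alarm(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, false alarm rate={false_alarm_rate:.2f}")
```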
CHAPTER FIVE
CONCLUSION, RECOMMENDATION AND FUTURE RESEARCH
5.1 Conclusion
In this dissertation, a deep learning technique for detecting DDoS attacks and FE
occurring simultaneously in network traffic was developed and tested with data from
real network traffic, as shown in Chapter Four. Performance parameters such as
accuracy, loss and false alarm were used to compare model designs, as shown in
several figures, with respect to how long the training phase took and whether the
model could detect and distinguish attacks and FE. The effect of the hidden layers
on performance was also examined, as shown in several figures and Table 4.1 in
Chapter Four. The model achieved an accuracy of 99%, the same as other studies, but
this study addressed a different scenario in which DDoS attacks and FE occur
simultaneously, which the other studies did not.
5.2 Recommendations
The developed model can detect and distinguish the two scenarios with an accuracy of
99%, which shows that it can be applied to live network traffic. A defence mechanism
can then be applied after the proposed technique, since FE need more server
processing power and network resources while DDoS attacks should be blocked. The
use of different numbers of hidden layers and nodes helped to improve the developed
model with the two optimizers tested. The Adam optimizer showed higher performance
than plain SGD, as shown in Chapter Four. The study recommends the Adam optimizer
for deep learning problems similar to this one. Moreover, in this study the
researcher observed the use of one public IP for all internet users in the
organisation. This is not a good network configuration, because when all users
access a server such as Google at once, DDoS detection techniques classify them all
as a DDoS attack. The organisation has more than 1,400 staff and more than 25,000
students, with access points and internet places for both. In such scenarios, the IP
will almost certainly be grouped as suspicious or placed on a blacklist. The quick
fix for these scenarios is to use IP version 6.
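The shared public IP problem described above can be illustrated with a minimal
sketch. It assumes a simple detector that flags any source address exceeding a
request threshold in one time window. The detector, the threshold, the user count
and the addresses (taken from the documentation ranges 203.0.113.0/24 and
2001:db8::/32) are all hypothetical and are not the detection technique of this
study.

```python
from collections import Counter


def flag_sources(requests, threshold):
    """Return the source addresses whose request count exceeds the threshold."""
    counts = Counter(source for source, _user in requests)
    return {source for source, count in counts.items() if count > threshold}


USERS = [f"user{i}" for i in range(300)]   # hypothetical users, 2 requests each
SHARED_PUBLIC_IP = "203.0.113.10"          # one public address for everyone

# Case 1: every user leaves the organisation through the same public address.
shared_requests = [(SHARED_PUBLIC_IP, user) for user in USERS for _ in range(2)]

# Case 2: every user has an individual address, as IPv6 would allow.
individual_requests = [
    (f"2001:db8::{index:x}", user)
    for index, user in enumerate(USERS, start=1)
    for _ in range(2)
]

flagged_shared = flag_sources(shared_requests, threshold=100)
flagged_individual = flag_sources(individual_requests, threshold=100)
print("flagged with shared IP:", flagged_shared)
print("flagged with individual IPs:", flagged_individual)
```

The same 600 legitimate requests cause the single shared address to be flagged,
while no individual address comes close to the threshold.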
The study used the FIFA World Cup 1998 dataset, which was recorded more than 21
years ago. This may produce results that are not valid for present-day traffic.
However, the study was able to simulate the FIFA events. The FE generator needs
further improvement so that it mimics FE as they occur today.
Moreover, the input parameters used in this study to distinguish DDoS attacks and FE
were the time interval and the number of requests/packets. To improve detection,
more parameters are needed, for example the distribution (entropy) of source and
destination IPs. Future researchers will need to include those parameters in the
input vector of the model. Also, organisational network configuration, especially
in Tanzania, needs to be investigated, as this study pointed out the bad
configuration of public IPs, which may be the reason organisation IPs are
blacklisted on several sites around the world. In the DDoS generation, only one
type of attack was used in this study. Therefore, future studies will have to focus
on testing other DDoS attacks and improving the technique.
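The suggested extension of the input vector can be sketched as follows. This is a
minimal illustration, assuming packets are available as (timestamp, source IP)
pairs; it groups them into fixed time intervals and computes, per interval, the
number of requests and the Shannon entropy of the source addresses. The function
names, the window length and the sample packets are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def window_features(packets, window_seconds):
    """Map each time window to (request_count, source_ip_entropy).

    packets: iterable of (timestamp_in_seconds, source_ip) pairs.
    """
    windows = defaultdict(list)
    for timestamp, source_ip in packets:
        windows[int(timestamp // window_seconds)].append(source_ip)
    return {
        index: (len(sources), shannon_entropy(sources))
        for index, sources in sorted(windows.items())
    }


# Hypothetical packets: window 0 has four distinct sources,
# window 1 has four requests from a single source.
sample_packets = [
    (0.1, "192.0.2.1"), (0.4, "192.0.2.2"), (0.6, "192.0.2.3"), (0.9, "192.0.2.4"),
    (1.2, "192.0.2.9"), (1.3, "192.0.2.9"), (1.5, "192.0.2.9"), (1.8, "192.0.2.9"),
]
features = window_features(sample_packets, window_seconds=1.0)
for index, (count, entropy) in features.items():
    print(f"window {index}: requests={count}, source entropy={entropy:.2f} bits")
```

Both windows contain the same number of requests, so the request count alone cannot
separate them, whereas the entropy shows how widely the requests are spread across
sources.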
REFERENCES
Alkhaleefah, M., & Wu, C. C. (2019). A Hybrid CNN and RBF-Based SVM
Approach for Breast Cancer Classification in Mammograms. Proceedings -
2018 IEEE International Conference on Systems, Man, and Cybernetics, SMC
2018, 894–899. https://doi.org/10.1109/SMC.2018.00159
Alsirhani, A., Sampalli, S., & Bodorik, P. (2018). DDoS Detection System: Utilizing
Gradient Boosting Algorithm and Apache Spark. Canadian Conference on
Electrical and Computer Engineering, 2018-May, 1–6.
https://doi.org/10.1109/CCECE.2018.8447671
Bhatia, S. (2017). Ensemble-based model for DDoS attack detection and flash event
separation. FTC 2016 - Proceedings of Future Technologies Conference,
(December), 958–967. https://doi.org/10.1109/FTC.2016.7821720
Chen, X., Li, B., Proietti, R., Zhu, Z., & Yoo, S. J. B. (2019). Self-taught anomaly
detection with hybrid unsupervised/supervised machine learning in optical
networks. Journal of Lightwave Technology, 37(7), 1742–1749.
https://doi.org/10.1109/JLT.2019.2902487
Daneshgadeh, S., Kemmerich, T., & Ahmed, T. (2019b). Detection of DDoS Attacks
and Flash Events Using Shannon Entropy, KOAD and Mahalanobis Distance.
2019 22nd Conference on Innovation in Clouds, Internet and Networks and
Workshops (ICIN), 222–229.
Dhingra, A., & Sachdeva, M. (2018). DDoS detection and discrimination from flash
events: A compendious review. ICSCCC 2018 - 1st International Conference on
Secure Cyber Computing and Communications, 518–524.
https://doi.org/10.1109/ICSCCC.2018.8703335
Feng, J., & Lu, S. (2019). Performance Analysis of Various Activation Functions in
Artificial Neural Networks. Journal of Physics: Conference Series, 1237(2).
https://doi.org/10.1088/1742-6596/1237/2/022030
Garg, S., Kumar, N., & Rodrigues, J. J. P. C. (2019). Hybrid
deep-learning-based anomaly detection scheme for suspicious flow detection in
SDN: A social multimedia perspective. IEEE Transactions on Multimedia,
21(3), 566–578. https://doi.org/10.1109/TMM.2019.2893549
Gupta, T. K., & Raza, K. (2020). Optimizing Deep Feedforward Neural Network
Architecture: A Tabu Search Based Approach. Neural Processing Letters,
51(3), 2855–2870. https://doi.org/10.1007/s11063-020-10234-7
Imamverdiyev, Y., & Abdullayeva, F. (2018). Deep Learning Method for Denial of
Service Attack Detection Based on Restricted Boltzmann Machine. Big Data,
6(2), 159–169. https://doi.org/10.1089/big.2018.0023
Keren, G., Sabato, S., & Schuller, B. (2020). Analysis of loss functions for fast
single-class classification. Knowledge and Information Systems, 62(1), 337–358.
https://doi.org/10.1007/s10115-019-01395-6
Kumar, A., & Panda, S. P. (2019). A Survey: How Python Pitches in IT-World.
Proceedings of the International Conference on Machine Learning, Big Data,
Cloud and Parallel Computing: Trends, Prespectives and Prospects,
COMITCon 2019, 248–251. https://doi.org/10.1109/COMITCon.2019.8862251
Leskovec, J., Rajaraman, A., & Ullman, J. D. (2020). Neural Nets and Deep
Learning. Mining of Massive Datasets, 498–543.
https://doi.org/10.1017/9781108684163.014
Li, C., Wu, Y., Yuan, X., Sun, Z., Wang, W., Li, X., & Gong, L. (2018). Detection
and defense of DDoS attack–based on deep learning in OpenFlow-based SDN.
International Journal of Communication Systems, 31(5), 1–15.
https://doi.org/10.1002/dac.3497
McDermott, C. D., Majdani, F., & Petrovski, A. V. (2018). Botnet Detection in the
Internet of Things using Deep Learning Approaches. Proceedings of the
International Joint Conference on Neural Networks, 2018-July, 1–8.
https://doi.org/10.1109/IJCNN.2018.8489489
Poirot, H. (2019). Logistic Regression.
Rahman, S., Wang, L., Sun, C., & Zhou, L. (2006). Review.
Rymarczyk, T., Kozłowski, E., Kłosowski, G., & Niderla, K. (2019). Logistic
regression for machine learning in process tomography. Sensors (Switzerland),
19(15), 1–19. https://doi.org/10.3390/s19153400
Sahi, A., Lai, D., Li, Y., & Diykh, M. (2017). An Efficient DDoS TCP Flood Attack
Detection and Prevention System in a Cloud Environment. IEEE Access, 5(c),
6036–6048. https://doi.org/10.1109/ACCESS.2017.2688460
Sahoo, K. S., Tiwary, M., & Sahoo, B. (2018). Detection of High Rate DDoS Attack
From Flash Events Using Information Metrics in Software Defined Networks.
421–424.
Shu, M. (2019). Deep learning for image classification on very small datasets using
transfer learning.
Sun, G., Jiang, W., Gu, Y., Ren, D., & Li, H. (2019). DDoS Attacks and Flash Event
Detection Based on Flow Characteristics in SDN. Proceedings of AVSS 2018 -
2018 15th IEEE International Conference on Advanced Video and Signal-Based
Surveillance. https://doi.org/10.1109/AVSS.2018.8639103
Wang, Y., Li, Y., Song, Y., & Rong, X. (2020). The influence of the activation
function in a convolution neural network model of facial expression recognition.
Applied Sciences (Switzerland), 10(5). https://doi.org/10.3390/app10051897
Yu, J., & Zhou, X. (2020). One-Dimension Residual Convolutional Auto-Encoder-Based
Feature Learning for Gearbox Fault Diagnosis. IEEE Transactions on
Industrial Informatics, PP(c), 1. https://doi.org/10.1109/TII.2020.2966326
Zhang, B., Zhang, T., & Yu, Z. (2018). DDoS detection and prevention based on
artificial intelligence techniques. 2017 3rd IEEE International Conference on
Computer and Communications, ICCC 2017, 2018-Janua, 1276–1280.
https://doi.org/10.1109/CompComm.2017.8322748
Zou, W., Li, Y., & Tang, A. (2009). Effects of the number of hidden nodes used in a
structured-based neural network on the reliability of image classification.
Neural Computing and Applications, 18(3), 249–260.
https://doi.org/10.1007/s00521-008-0177-3
Appendix
FE generator code
Deep Neural Network diagram with 1 Hidden Layer