The Evaluation of Network Anomaly Detection Systems: Statistical Analysis of

Information Security Journal: A Global Perspective
ISSN: 1939-3555 (Print) 1939-3547 (Online) Journal homepage: http://www.tandfonline.com/loi/uiss20
The evaluation of Network Anomaly Detection

Systems: Statistical analysis of the UNSW-NB15
data set and the comparison with the KDD99 data
set
Nour Moustafa & Jill Slay
To cite this article: Nour Moustafa & Jill Slay (2016): The evaluation of Network Anomaly
Detection Systems: Statistical analysis of the UNSW-NB15 data set and the comparison
with the KDD99 data set, Information Security Journal: A Global Perspective, DOI:
10.1080/19393555.2015.1125974
To link to this article: http://dx.doi.org/10.1080/19393555.2015.1125974
Published online: 11 Jan 2016.
Submit your article to this journal
Article views: 13
View related articles
View Crossmark data
Full Terms & Conditions of access and use can be found at

http://www.tandfonline.com/action/journalInformation?journalCode=uiss20
Download by: [Mount Allison University 0Libraries] Date: 13 January 2016, At: 17:28
INFORMATION SECURITY JOURNAL: A GLOBAL PERSPECTIVE
http://dx.doi.org/10.1080/19393555.2015.1125974
The evaluation of Network Anomaly Detection Systems: Statistical analysis of

the UNSW-NB15 data set and the comparison with the KDD99 data set
Nour Moustafa and Jill Slay
School of Engineering and Information Technology, University of New South Wales at the Australian Defence Force Academy, Canberra,
Australia
ABSTRACT KEYWORDS
Over the last three decades, Network Intrusion Detection Systems (NIDSs), particularly, Anomaly Feature correlations;
Detection Systems (ADSs), have become more significant in detecting novel attacks than multivariate analysis; NIDSs;
Signature Detection Systems (SDSs). Evaluating NIDSs using the existing benchmark data sets of UNSW-NB15 data set
Downloaded by [Mount Allison University 0Libraries] at 17:28 13 January 2016
KDD99 and NSLKDD does not reflect satisfactory results, due to three major issues: (1) their lack of
modern low footprint attack styles, (2) their lack of modern normal traffic scenarios, and (3) a
different distribution of training and testing sets. To address these issues, the UNSW-NB15 data set
has recently been generated. This data set has nine types of the modern attacks fashions and new
patterns of normal traffic, and it contains 49 attributes that comprise the flow based between
hosts and the network packets inspection to discriminate between the observations, either
normal or abnormal. In this paper, we demonstrate the complexity of the UNSW-NB15 data set
in three aspects. First, the statistical analysis of the observations and the attributes are explained.
Second, the examination of feature correlations is provided. Third, five existing classifiers are used
to evaluate the complexity in terms of accuracy and false alarm rates (FARs) and then, the results
are compared with the KDD99 data set. The experimental results show that UNSW-NB15 is more
complex than KDD99 and is considered as a new benchmark data set for evaluating NIDSs.
1. Introduction FAR than the ANIDS (Lee et al., 1999), but the
ANIDS has the ability of detecting novel attacks
Because of the ubiquitous usage of computer
(Lazarevic et al., 2003). Therefore, the ANIDSs are
networks and the plurality of applications running
becoming a necessity rather than the MNIDS (Aziz
on them, cyber attackers attempt to exploit weak
et al., 2014; Bhuyan, Bhattacharyya, & Kalita, 2014;
points of network architectures to steal, corrupt, or
García-Teodoro, Díaz-Verdejo, Maciá-Fernández, &
destroy valuable information (DeWeese, 2009; Eom
Vázquez, 2009).
et al., 2012; Vatis, 2001). Consequently, the function
Evaluating the efficiency of any NIDS requires a
of a NIDS is to detect and identify anomalies in
modern comprehensive data set that contains
network systems (Denning, 1987). NIDSs are classi-
contemporary normal and attack activities.
fied into Misuse based (MNIDS) and Anomaly based
McHugh (2000), Tavallaee et al. (2009), and
(ANIDS) (Lee, Stolfo, & Mok, 1999; Moustafa &
Moustafa and Slay (2015a) stated that the existing
Slay, 2015a; Valdes & Anderson, 1995). In MNIDS,
benchmark data sets, especially, KDD99 and
the known attacks are detected by matching the
NSLKDD, negatively affect the NIDS results
stored signatures of those attacks (Lee et al., 1999;
because of three major problems. Firstly, lack of
Vigna & Kemmerer, 1999). While ANIDS creates a
modern low footprint attack fashions, for instance,
profile of normal activities, any deviation from this
stealthy or spy attacks that change their styles over
profile is considered as an anomaly (Ghosh,
time to become similar to normal behaviors
Wanken, & Charron, 1998; Valdes & Anderson,
(Cunningham & Lippmann, 2000; Tavallaee
1995). Several studies stated that the MNIDS
et al., 2009). Second, the existing data sets were
can often accomplish higher accuracy and lower
created two decades ago, indicating that the
CONTACT Nour Moustafa [email protected] School of Engineering and Information Technology, University of New
South Wales at the Australian Defence Force Academy, Northcott Drive, Campbell, ACT 2600, Canberra, Australia.
Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/uiss.
© 2016 Taylor & Francis
2 N. MOUSTAFA AND J. SLAY
normal traffic of the existing benchmark data sets Clustering (Sharif, Prugel-Benett, & Wills, 2012)
is different from the current normal traffic because are executed on the training and testing sets to
of the revolution in networks speed and applica- assess the complexity in terms of accuracy and
tions (McHugh, 2000). Third, the testing set of the FARs. Further, the results of this data set are com-
existing benchmark data sets has some attack types pared with the KDD99 data set (KDDCUP1999,
which are not in the training set; this means that 2007) to identify the capability of the UNSW-
the training and testing set have different distribu- NB15 data set in appraising existing and novel
tion (Tavallaee et al., 2009). The difference in the classifiers.
distribution persuades classifier systems to skew The objective of the paper is to analyse the
toward some observations causing the FAR UNSW-NB15 data set statistically and practically.
(Cieslak & Chawla, 2009; Tavallaee et al., 2009). First, in the statistical aspect, the distribution of
In the light of the above discussion, to address data points specifies the suitable algorithms of
these challenges the UNSW-NB15 data set has classification. To be clear, if a data set follows
recently been released (Moustafa & Slay, 2014, Gaussian distribution, many statistical algorithms,
2015b). This data set includes nine categories of for instance, HMM and Kalman filter are used.
the modern attack types and involves realistic However, if a data set does not fit Gaussian
activities of normal traffic that were captured distribution, other algorithms, for example,
with the change over time. In addition, it contains particle filter and mixture models are applied.
49 features that comprised the flow based between Second, in the practical aspect, the adoption of
hosts (i.e., client-to-server or server-to-client) the best attributes decrease false alarm rates and
and the packet header which covers in-depth reduce the execution costs. For that purpose, fea-
characteristics of the network traffic. ture correlations with label and without label are
A part of UNSW-NB15 data set was decomposed demonstrated.
into two partitions of the training and testing sets to The rest of this paper is organised as follows:
determine the analysis aspects. The goal of the three Section 2 describes the UNSW-NB15 data set.
aspects is to evaluate the complexity of the training Section 3 discusses the training and the testing
and testing sets. First, the Kolmogorov-Smirnov sets extracted from this data set. Section 4 discusses
Test (Justel, Peña, & Zamar, 1997; Massey, 1951) the statistical mechanisms used on the two sets.
defines and compares the distribution of the Section 5 presents the feature correlation methods.
training and testing sets; skewness (Mardia, 1970) Section 6 identifies the classification techniques
measures the asymmetry of the features; and which are involved to evaluate the complexity of
kurtosis (Mardia, 1970) estimates the flatness of the KDD99 and UNSW-NB15 data sets. Section 7
the features. The reliability of results can be presents the experimental results of the statistical
achieved when these statistics are approximately techniques, the feature correlations, and the
similar to the features of the training and testing complexity evaluation. Finally, section 8 provides a
sets. Second, the feature correlations are measured conclusion to the paper and examines the future
in two perspectives: (1) the feature correlations research area.
without the class label, (2) and the feature correla-
tions with the class label. To achieve the first per-
2. Description of the UNSW-NB15 data set
spective, Pearson’s Correlation Coefficient (PCC)
(Bland & Altman, 1995) is used. Gain Ratio (GR) The UNSW-NB 15 data set (Moustafa & Slay, 2014,
method (Hall & Smith, 1998) is utilised to achieve 2015b) was created using an IXIA PerfectStorm tool
the second perspective. Third, five existing techni- (IXIA PerfectStormOne Tool, 2014) in the Cyber
ques, namely, Naïve Bayes (NB)(Panda & Patra, Range Lab of the Australian Centre for Cyber
2007), Decision Tree (DT) (Bouzida & Cuppens, Security (ACCS) (Australian Center for Cyber
2006), Artificial Neural Network (ANN) (Bouzida Security (ACCS), 2014) to generate a hybrid of the
& Cuppens, 2006; Mukkamala, Sung, & Abraham, realistic modern normal activities and the synthetic
2005), Logistic Regression (LR) (Mukkamala et al., contemporary attack behaviors from network traffic.
2005), and Expectation-Maximisation (EM) A tcpdump tool (tcpdump tool, 2014) was used to
INFORMATION SECURITY JOURNAL: A GLOBAL PERSPECTIVE 3
capture 100 GB of a raw network traffic. Argus Table 5. Additional generated features.
(Argus tool, 2014), Bro-IDS (Bro-IDS Tool, 2014) 36 is_sm_ips_ports If srcip (1) equals to dstip (3) and sport (2)
equals to dsport (4), this variable assigns to 1
tools were used and 12 models were developed for otherwise 0
extracting the features of Tables 1, 2, 3, 4 and 5, 37 ct_state_ttl No. for each state (6) according to specific
respectively. These techniques were configured in a range of values of sttl (10) and dttl (11)
38 ct_flw_http_mthd No. of flows that has methods such as Get
parallel processing to extract 49 features with the and Post in http service
class label. After finishing the implementation of the 39 is_ftp_login If the ftp session is accessed by user and
password then 1 else 0
40 ct_ftp_cmd No of flows that has a command in ftp
session
Table 1. Flow features. 41 ct_srv_src No. of records that contain the same service
(14) and srcip (1) in 100 records according to
No. Name Description
the ltime (26)
1 Srcip Source IP address 42 ct_srv_dst No. of records that contain the same service
2 Sport Source port number (14) and dstip (3) in 100 records according to
3 Dstip Destination IP address the ltime (26)
4 Dsport Destination port number 43 ct_dst_ltm No. of records of the same dstip (3) in 100
5 Proto Protocol type (such as TC, UDP) records according to the ltime (26)
44 ct_src_ ltm No. of records of the srcip (1) in 100 records
according to the ltime (26)
Table 2. Basic features. 45 ct_src_dport_ltm No of records of the same srcip (1) and the
6 stateIndicates to the state and its dependent protocol (such dsport (4) in 100 records according to the
as ACC, CLO and CON). ltime (26)
7 dur Record total duration 46 ct_dst_sport_ltm No of records of the same dstip (3) and the
8 sbytes Source to destination bytes sport (2) in 100 records according to the ltime
9 dbytes Destination to source bytes (26)
10 sttl Source to destination time to live 47 ct_dst_src_ltm No of records of the same srcip (1) and the
11 dttl Destination to source time to live dstip (3) in in 100 records according to the
12 sloss Source packets retransmitted or dropped ltime (26)
13 dloss Destination packets retransmitted or dropped
14 service Such as http, ftp, smtp, ssh, dns and ftp-data.
15 sload Source bits per second configured techniques, the total number of records,
16 dload Destination bits per second 2,540,044, were stored in four CSV files. The records
17 spkts Source to destination packet count
18 dpkts Destination to source packet count and the features of the UNSW-NB15 data set are
described in-depth as follows.
Table 3. Content features.
19 swin Source TCP window advertisement value 2.1. Attack types
20 dwin Destination TCP window advertisement value
21 stcpb Source TCP base sequence number Attack types can be classified into nine groups:
22 dtcpb Destination TCP base sequence number
23 smeansz Mean of the flow packet size transmitted by the src
(1) Fuzzers: an attack in which the attacker
24 dmeansz Mean of the flow packet size transmitted by the attempts to discover security loopholes in a
dst program, operating system, or network by
25 trans_depth Represents the pipelined depth into the
connection of http request/response transaction feeding it with the massive inputting of
26 res_bdy_len Actual uncompressed content size of the data random data to make it crash.
transferred from the server’s http service
(2) Analysis: a type of variety intrusions that
penetrate the web applications via ports
Table 4. Time features. (e.g., port scans), emails (e.g., spam), and
27 sjitSource jitter (mSec) web scripts (e.g., HTML files).
28 djitDestination jitter (mSec)
(3) Backdoor: a technique of bypassing a
29 stime
record start time
30 ltime
record last time stealthy normal authentication, securing
31 sintpkt
Source interpacket arrival time (mSec) unauthorized remote access to a device,
32 dintpkt
Destination interpacket arrival time (mSec)
33 tcprtt
TCP connection setup round-trip time, the sum of and locating the entrance to plain text as it
’synack’ and ’ackdat’ is struggling to continue unobserved.
34 synack TCP connection setup time, the time between the SYN (4) DoS: an intrusion which disrupts the com-
and the SYN_ACK packets
35 ackdat TCP connection setup time, the time between the puter resources via memory, to be extremely
SYN_ACK and the ACK packets
busy in order to prevent the authorized To label this data set, two attributes were
requests from accessing a device. provided: attack_cat represents the nine categories
(5) Exploit: a sequence of instructions that of the attack and the normal, and label is 0 for
takes advantage of a glitch, bug, or vulner- normal and otherwise 1.
ability to be caused by an unintentional or
unsuspected behavior on a host or network.
(6) Generic: a technique that establishes against 3. Training and testing set distribution
every block-cipher using a hash function to
collision without respect to the configura- A NIDS data set can be conceptualized as a
tion of the block-cipher. relational table (T) (Witten & Mining, 2005). The
(7) Reconnaissance: can be defined as a probe; an input to any NIDS is a set of instances (I) (e.g.,
attack that gathers information about a com- normal and attack records). Each instance consists
puter network to evade its security controls. of features (F) that have different data types
(8) Shellcode: an attack in which the attacker (i.e.,"f fR [ Sg, where "f means each feature
penetrates a slight piece of code starting from in T; R is real numbers and S denotes characters).
a shell to control the compromised machine. It is observed that NIDS techniques face challenges
(9) Worm: an attack whereby the attacker repli- for using these features because no standard format
cates itself in order to spread on other com- for feature values (e.g., number or nominal) is
puters. Often, it uses a computer network to offered(Shyu et al., 2005). In statistical perspective,
spread itself, depending on the security fail- T is a multivariate data representation which is codi-
ures on the target computer to access it. fied in Definition 1.

Definition 1: Let I1:N 2 T; I1:N ¼ fij 2 F ;
Y1:N ¼ fci 2 Cg; where i, j = 1, 2. . .., N. Suppose
2.2. Features
F is iid (independently and identically distributed).
Features are categorized into five groups: Defining I1:N and Y1:N as a column-vector, as given
(1) Flow features: includes the identifier in Eq. (1).
attributes between hosts (e.g., client-to-serve
f11 f12 :: c
or server-to-client), as reflected in Table 1. I1:N ¼ ; Y1:N ¼ 1 (1)
f21 f22 fij ci
(2) Basic features: involves the attributes that
represent protocols connections, as shown
in Table 2. such that I represents the observations of T,Y is
(3) Content features: encapsulates the attributes the class label ðCÞ for each I, N is the number of
of TCP/IP; also they contain some attributes instances, F denotes the features of I.
of http services, as reflected in Table 3.
(4) Time features: contains the attributes time, Proposition 1: A standard format for features (F)
for example, arrival time between packets, is prepared to have a same type (i.e., (number
start/end packet time, and round trip time onlyÞ"F fRg) to make the analysis of the data
of TCP protocol, as shown in Table 4. points easier. it assigns each nominal feature (S) to
(5) Additional generated features: in Table 5, a sequence of numbers (i.e.,"S ! R0:R ;
this category can be further divided into two wheref0 : Rg denotes as equence of numbers
groups: general purpose features (i.e., (Salem & Buehler, 2012). For instance, the UNSW-
36–40), whereby each feature has its own NB15 data set has three major nominal features
purpose, according to protect the service of (e.g., protocol types (e.g., TCP, UPD), States (e.g.,
protocols, and (2) connection features (i.e., CON, ACC) and services (e.g., HTTP, FTP)).This
41–47) that are built from the flow of 100 issue can be tackled by converting each value in
record connections based on the sequential these features into ordered numbers such as
order of the last time feature. TCP = 1, UDP = 2 and so on.
Table 6. A part of UNSW-NB15 data set distribution.

max fij min fij <1\ max fij min fij
Category Training set Testing set
Normal 56,000 37,000 (2)
Analysis 2,000 677
Backdoor 1,746 583 In Eq. (2), each feature values are not
DoS 12,264 4089 specified into a confidence interval, for instance

Exploits 33,393 11,132
Fuzzers 18,184 6,062 [−1, 1], and the maximum value (i.e., Max fij Þof
Generic 40,000 18,871 the feature (F) is much larger than the minimum
Reconnaissance 10,491 3,496
Shellcode 1,133 378 value (i:e:; Min fij ). This causes a noise distribu-
Worms 130 44 tion, because the smallest and the largest values
Total Records 175,341 82,332
highly deviate from their mean (M) (Cherkassky &
Mulier, 2007).
Table 6 demonstrates the creation of the train-
ing and the testing sets from the UNSW-NB15 Proposition 2: To tackle Dilemma 1, the z-score
data set; a part of the data set records has been function is utilised as formulated in Eq. (3). It is a
divided with an approximate 60%:40% ratio of the linear transformation to standardise the format of
training and testing sets, respectively. To achieve the fij values, this makes it easier to compare
the authenticity of NIDS evaluations, no redun- values in diverse distributions without changing
dant records among the training and testing set. the shape of the original distribution.
4. Statistical descriptive observations fij M

zij ¼ (3)
In this section, the statistical analysis of the δ
training and the testing sets attributes are elabo-
rated to measure the relationship of the two set. In Eq. (3), the z-score of each value in I1: N is
The training and testing sets are prepared to in calculated by subtracting "f from its M, and then
the form of Eq. (1), namely, TRIN and TSIN : the value is divided by the standard deviationðδÞ of
Kolmogorov-Smirnov (K-S) test (Justel et al., "f to measure how far away "f from its M. The
1997; Massey, 1951), Multivariate skewness and sign of the z-score demonstrates that the value of
kurtosis functions (Mardia, 1970) are customized "f is either below or above M. After the transfor-
to examine the relationship and the distribution mation into z-score, all the distributions of the
nature of the TRIN and TSIN . The values of the fij values are standardised with M = 0 and δ ¼ 1.
features are not in a confidence interval, the
z-score transformation (Jain, Nandakumar, &
Ross, 2005) is used to make a standard format 4.2. Kolmogorov-Smirnov (K-S) test
for these values of the attributes.
The K-S test is used to decide the proper distribu-
tion of the features (Massey, 1951; Shyu et al.,
4.1. Z-score transformation of the feature values 2005). Let "f has x1 ; x2 ; . . . :; xn be an ascending
The attributes of the TRIN and TSIN have a large order sample, an empirical distribution function
scale between the minimum and the maximum Fn ðxÞis the fraction of sample observations less
value. Therefore, it is extremely difficult to esti- than or equal to the value x, then the empirical
mate their variance from the central tendency of distribution function is defined as:
the distribution. This issue is formulated as a mul- 0; x < x1
tivariate problem of the varied distributions in Fn ðxÞ ¼ f k=n; x x < x ; k ¼ 1; 2; . . . ; n 1
k kþ1
Dilemma 1. 1; x xn
(4)
Dilemma 1: "f involves in IN has a wide range of
values, such that The Kolmogorov distribution function is denoted as:
pffiffiffiffiffi 1
2π X ð2n1Þ2 π2 =ð8x2 Þ It is acknowledged that if skewness and kurtosis
f ðxÞ ¼ e (5) values tend to be 0, then the distribution
x n¼1
approximates a normal distribution.
From Eqs. (4) and (5), K-S test achieves Fn ðxÞ fits
f ðxÞ by maximizing the absolute difference as follows:
4.5. The statistical functions utilisation on the
Dn ¼ maxx jf ðxÞ Fn ðxÞj (6) TRIN and TSIN
The K-S test, Multivariate skewness and kurtosis
In the case of the critical value (Dn;/ ) functions are customised on the TRIN and TSIN to
(i.e., / denotes significance level) falls into the estimate the compatibility of them, as declared in
Kolmogorov-Smirnov table (KDDCUP1999, Eqs. (9)–(11).
2007), do PðDn Dn;/ Þ ¼ 1 / :Dn that can be
ρ
used to test "f within f ðxÞ. Hence, the suitable "f TRIN Dn;TRIN , "f TSIN Dn;TSIN Dn;/;"f (9)
fitting is accomplished in the case of

maxx jf ðxÞ Fn ðxÞj Dn;/ .Fn ðxÞ Dn;/ affords a
confidence interval to f ðxÞ to present the proper ρ
distribution of "f . "f TRIN skeTRIN , "f TSIN skeTSIN (10)
ρ
4.3. Multivariate skewness "f TRIN kurTRIN , "f TSIN kurTSIN (11)
The skewness method (Mardia, 1970) is an asym-
metry measure of the probability distribution of Eq. (9) estimates the best fitting of the distribution
"f that has x1 ; x2 ; . . . :; xn from its M; skewness to TRIN and TSIN of "f , achieving the two sides are
function can be defined as: less than or equal K-S test (i.e., Dn;/;"f ), while Eqs.
Pn (10) and (11) calculate the skewness and the kurtosis
ðxi MÞ3
ske ¼ i¼1 3 (7) to the TRIN and TSIN of"f , respectively. It is observed
nδ
that ρ assigns to a threshold operator (e.g., = , < or >)
which compares the results between the two sides of
In Eq. (7), if result is positive, the distribution
the equations.
with an asymmetric tail spreads toward major
Based on the above explanation, the TRIN
positive values. On the other hand, a negative
value indicates that the distribution with an asym- and TSIN of "f is analysed to evaluate the statisti-
metric tail extends toward more negative values. cal relationship of them as in the following
algorithm:
4.4. Multivariate kurtosis Algorithm 1: The statistical relationship

between TRIN and TSIN of "f
The kurtosis technique (Mardia, 1970) is a peaki-
ness measure of the probability distribution of "f
Input ← TRIN and TSIN of "f
that has x1 ; x2 ; . . . :; xn ; kurtosis is denoted as:
1: for "f do // for each feature contains in the
Pn training and testing sets do
ðxi MÞ4 2: convert the values of the nominal features to
Kur ¼ i¼1
(8)
nδ4 numerical as described in proposition 1.
3: apply z-score from Eq. (3), subject to Eq. (2)
In Eq. (8), if the outcome is positive, the distribu- as discussed in section 4.1.
tion is more peaked than the normal distribution. 4: apply Kolmogorov-Smirnov and multivariate
Nevertheless, a negative value indicates a flatter skewness and kurtosis measures as declared in
distribution. sections 4.2, 4.3 and 4.4 respectively).
5: compare the results of step 4 to the where covðÞ is the covariance and σ is the stan-
TRIN and TSIN as in formulated equations 9, 10 PN Pn
and 11 respectively. dard deviation, Mf1 ¼ 1=N xi and Mf2 ¼ N yi

i i
6: end for indicate the means of f1 and f2 respectively.
Output ! the relationship of the TRIN and TSIN Based on Eq. (12), the output of the PCC is in a
of "f via the results of step 5 range [−1, 1]. If the value is near to −1 or 1, it
indicates a strong correlation between the two fea-
tures. However, if the value is close to 0, it shows
5. Feature correlations of the TRIN and TSIN that there is no correlation between the features. A
Correlation analysis is another aspect to identify positive sign means that the two features are in the
the relationship of the TRIN and TSIN features. Two same direction, while a negative sign indicates that
correlation analysis mechanisms are used. First, a the two features are in the opposite trend.
Pearson’s correlation coefficient technique (PCC) To rank the strongest features, the mean for
each PCC feature (i.e., Mpccfi Þis calculated, such
(Bland & Altman, 1995) measures the relevance

between features without a label. Second, a Gain PN
that Mpccfi ¼ 1=N PCCfi , then the means are
Ratio method (GR) (Hall & Smith, 1998) is applied i¼1
to rank the correlation between features and the ordered descendingly to define the closest
label. The major goal of these techniques is to dependency features.
recognise the correlation scores between the fea-
tures either with or without the class label on
theTRIN and TSIN , to estimate the efficiency of the 5.2. Feature correlations with labeled
features for discriminating the normal and attack
observations. The Gain Ratio technique (Hall & Smith, 1998)
estimates the ratio between an information Gain
Definition 1.1 (the extension of Definition 1): method (IG) (Jain et al., 2005) and feature values.
Let data set (T) has multiple features f1 ; f2 ; The GA is prepared to solve the problem of the IG
. . . :; fd 2 I1:N and class (C), each feature and the when a feature has a large number of values com-
class have multiple values. For instance, pared with other features values. The IG is a feature
f1 ¼ fx1 ; x2 ; . . . :; xN g, f2 ¼ fy1 ; y2 ; . . . :; yN g and selection method depends on an entropy function
C ¼ fc1 ; c2 ; . . . :; cw g; where d is the number of which is a measure of uncertainty of any feature.
features, N denotes the number of instances and Let I be set of samples with w distinct classes. The
w is the number of classes. entropy or the expected IG that is required to
classify the observations is given by:
X
m
GðI Þ ¼ pi log2 ðpi Þ (13)
5.1. Feature correlations without labeled i¼1
where pi is a likely of the instance I that belongs to

Pearson’s correlation coefficient (PCC) (Bland &
Altman, 1995) is one of the simplest linear correla- class Ci and is calculated by Ii =IN :
tion methods to measure the dependency between For partitioning the Ii into subsets for each
features. From Definition1.1, the PCC of the features feature f, the expected IG is denoted as:
f1 andf2 is formulated as: X
m
I1i þ I2i þ . . . . . . þ Imi
Eð f Þ ¼ GðI Þ
covðf1 ; f2 Þ IN
PCCðf1 ; f2 Þ ¼ i¼1
σ :σ (14)
PN f 1 f 2
i¼1 xi Mf1 yi Mf2
¼ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
PN 2ffi qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
PN 2ffi
From Eqs. (13) and (14), the encoding information
i¼1 x i M f1 : i¼1 y i M f2 that is gained to F is:
(12) Gainð f Þ ¼ GðI Þ Eð f Þ (15)
The splitting value between the subsets Iir is the DT classifier is a structure similar to a flowchart
indicated as: which consists of root, nodes and branches to
represent the classification rules. Each node denotes
X
r
Ii Ii
Split ðI Þ ¼ log2 (16) rules or procedures on a feature, each branch con-
i¼1
IN IN tains the results of the rules; and each leaf node
expresses a class label. Third, the ANN learning is
In Eq. (16), the split value of I expresses the used to approximate an activation function that
information generated by dividing I into r parts depends on a large number of input observations
conforming to r on the features. From Eqs. (14) and I. The basic ANN function can be defined as:
!
(15), the GA can be defined as: GRð f Þ ¼ Gainð f Þ= X
SplitðI Þ, where the feature with the highest Gain ratio f ðI Þ ¼ τ Wj :Ij (18)
is selected as the splitting feature. Thus, the strongest j
features with the class label are evaluated and ranked where f ðI Þ represents a predicted output of the class
by utilising the GR, where the scores of the features label, τ is an activation function (i.e., Sigmoid), Wj
are in the descending order. is a weight of each input instance Ij . Fourth, Logistic
Regression algorithm establishes the correlation
between a dependent variable (C) and independent
6. Techniques for evaluating the complexity
variables (F). It uses the maximum likelihood func-
This section discusses the techniques that are used tion to estimate the regression parameters. Fifth,
to evaluate the complexity in terms of accuracy Expectation-Maximization (EM) clustering techni-
and false alarm rates (FAR) on the UNSW-NB15 que depends on maximizing the probability density
data set. The five techniques used are Naïve Bayes function of a Gaussian distribution to calculate the
(NB) (Panda & Patra, 2007), Decision Tree (DT) mean and the covariance of each instance I in T.
(Bouzida & Cuppens, 2006), Artificial Neural The EM clustering algorithm encompasses into two
Network (ANN) (Bouzida & Cuppens, 2006; steps (i.e., Expectation (E-step) and Maximization
Mukkamala et al., 2005), Logistic Regression (LR) (M)). In the E-step, it estimates the likelihood for
(Mukkamala et al., 2005), and Expectation- each instance I in T. whilst, the M-step is utilised to
Maximization (EM) Clustering (Sharif et al., re-estimate the parameter values from the E-step to
2012). Each technique has its own characteristics achieve the best expected output.
to learn and evaluate data points of the Two parameters (accuracy and false alarm rate)
TRIN and TSIN which are described respectively in are calculated from the outcomes of these techniques
the following section. First, the NB classifier is a to measure the complexity of the UNSW-NB15
conditional probability model which constructs data set. Let the factors of the classification are FC ¼
the classification of the two classes (i.e., normal fTP; TN; FP; FN g where TP (i.e., true
(0) or anomaly (1)). It is applied by the maximum positive) denotes a number of the correctly attack
a posterior (MAP) function which is denoted as: classified, (i.e., true negative) expresses a number of
the correctly normal classified, FP (i.e., false positive)
Y
N
PðCjI Þ ¼ argmax PðCw Þ PðIj jCw Þ (17) is a number of the misclassified attacks and FN (i.e.,
w2f1;2;::;N g j¼1 false negative) is a number of the misclassified
normal records (Sokolova, Japkowicz, &
where C is the class label, I is the observation of each Szpakowicz, 2006). The accuracy (So-In et al., 2014;
class, w is the class number, P(C|I) denotes the Sokolova et al., 2006) is the rate of the correctly
probability of the class given a specified observation classified records to all the records, whether correctly
Q
N or incorrectly classified, which is denoted as
and PðIj jCw Þ indicates to multiply all the prob-
j¼1 TP þ TN
accuracy ¼ (19)
abilities of the instances conditionally to their TP þ TN þ FP þ FN
classes to achieve the maximum outcome. Second,
The false alarm rate (FAR) is the average ratio of Table 7. The features of the analysis.
the misclassified to classified records either normal or Id Names Id Names
1 dur 22 synack
abnormal as denoted in Eq. (22). It is designed from
2 spkts 23 ackdat
Eqs. (20) and (21) to calculate the false positive rate 3 dpkts 24 smean
(FPR) and the false negative rate (FNR), respectively. 4 sbytes 25 dmean
5 dbytes 26 trans_depth
FP 6 rate 27 response_body_len
FPR ¼ (20) 7 sttl 28 ct_srv_src
FP þ TN 8 dttl 29 ct_state_ttl
FN 9 sload 30 ct_dst_ltm
FNR ¼ (21) 10 dload 31 ct_src_dport_ltm
FN þ TP 11 sloss 32 ct_dst_sport_ltm
12 dloss 33 ct_dst_src_ltm
FPR þ FNR
FAR ¼ (22) 13
14
sinpkt
dinpkt
34
35
is_ftp_logn
ct_ftp_cmd
2
15 sjit 36 ct_flw_http_mthd
16 djit 37 ct_src_ltm
17 swin 38 ct_srv_dst
7. Results and discussion 18 stcpb 39 is_sm_ips_ports
19 dtcpb 40 proto
This paper examines analytical approaches to mea- 20 dwin 41 service
sure the complexity of the UNSW-NB15 data set 21 tcprtt 42 state
which was developed to evaluate NIDSs. The study
uses three approaches: (7.1) the statistical explana-
tion (i.e., K-T test, multivariate skewness and kur-
tosis measures), (7.2) the features correlations (i.e.,
PCC and GR), and (7.3) and the complexity evalua-
tion using the five classifiers. To measure the com-
plexity of the UNSW-NB15 data set within the
adopted part of the training and the testing sets, as
presented in Table 6. The features that are selected
to execute these aspects are reflected in Table 7.
7.1. The statistical explanation

The SPSS tool (SPSS tool, 2014) is utilised to analyse
the statistical explanation and to determine the dis-
tribution nature of the training and the testing sets.
Figure 1 shows the probability of the features on the
training (TR) and the testing (TS) sets. The results
demonstrate that the features distribution is a non-
Figure 1. The probability distribution of the features on the
linearity and nonnormality representation. The fit- training and testing sets.
ting percentage of these features in the two sets is
78%, when the two lines of the TR and the TS are
identical. On the other hand, the nonfitting percen- The TS is approximately similar to the TR, almost
tage is 22%, which is caused of the TR records are majority of the features are positive, except the
excessively more than the TS records. features 7, 17 and 20. Figure 2 demonstrates the
To measure the asymmetry of the TR and the TS skewness of the TR and TS features, where the
sets features (F); a multivariate skewness (skw) func- relationship percentage is 82% when the two lines
tion is executed. In the TR, the empirical outcomes are similar. The most skewed features are 25, 26, 27
show that all the features, except 7, are positive. and 28, conversely, the lowest skewness are 6, 7, 8,
Therefore, the majority of the features are on the 17, 18, 19, 29, 30, 31, 32, 37, 38 and 42.
right side of the probability density function distri- To estimate the Preakness of the TR and TS
bution which is longer or fatter than the left side. features, a multivariate kurtosis function is
Based on the above observation, the features of

the training and testing sets are a highly statistical
correlation. Consequently, this part of the UNSW-
NB15 data set is reliable to evaluate classifier tech-
niques because the training and the testing sets
have the same characteristics of the non-linearity
and non-normality, and the compatibility of the
skewness and kurtosis values is acceptable. Overall,
this shows that the data set can be used to evaluate
the existing and novel methods of NIDSs.
7.2. Feature correlations

The correlation of features is evaluated based on
two aspects; without the label using the PCC and

with the label using the GR using the MTLAB tool
Figure 2. The skewness of the features on the training and (Matlab Tool, 2014) to identify the percentage of
testing sets. the closeness or the remoteness of the features.
The PCC estimates the score of each feature in
implemented. In the TR, the experimental results the TR and the TS. As can be seen in Figure 4, in
indicate that the 7 features, 7, 8, 17, 18, 19, 20 and the TR and the TS, the features have the same
40, have a negative value which is a flatter distribu- correlated score. These features are ranked into a
tion. However, the rest of the features are positive; specified range [−0.01, 0.11]. The highest related
this leads to the distribution are higher than a nor- features are 5, 3, 6, 12, 13, 29, 31, 32, 33, 34, 38
mal distribution. In Figure 3, the kurtosis compara- and 39. On the contrary, the lowest correlated fea-
tive of the TR and TS features (F) illustrates that the tures are 7, 8, 10, 11, 14, 40 and 41. Otherwise, the
two lines fit 76% of the features. The kurtosis of the features almost fall into the middle of the range.
TS is higher than the TR in the features 2−6, 10−12, The correlated features are classified into high,
15−16, 25−28 and 35−37. Conversely, the other middle and low, which get probability values 12 23
42 , 42
features are almost in the close proximity. and 427 . This means that 87.5% are acceptable corre-
lations of the high and middle correlated features.
Figure 3. The kurtosis of the features on the training and

testing sets. Figure 4. The PCC of the features on the training and testing sets.
Figure 6. The comparison of the fifth techniques on the UNSW-

NB15 data set.
Figure 5. The Gain Ratio of the features on the training and the
testing set.
five techniques (e.g., NB, DT, ANN, LR, and EM

In the TR and TS, the GR measures the correlation clustering) are executed. These techniques are built
of each feature with the class label. Figure 5 shows in Visual Studio Business Intelligence 2008 (Visual
the features are extremely similar in the correlation. Studio 2008, 2014) and are implemented with the
The specified range of the score is [0.01,0.56] which default input parameters. Figure 6 represents the
is classified into low, middle and high according to comparison of the fifth techniques in the terms of
the ranges [0.01, 0.2], [0.21,0.3], [0.31,0.56], respec- the accuracy and FAR in the x-axis and in the y-axis
tively. The lowest related features are 1, 2, 3, 7, 12, 13, shows the percentage. The DT technique accom-
14, 16, 17, 27, 28, 29, 34, 35, 36, 37, 38 and 39. The plishes the highest accuracy (i.e., 85.56%) and the
middle ranked features are 4, 5, 6, 10, 11, 15, 18, 19, lowest FAR (15.78%). On the other hand, the EM-
20, 21, 22, 23, 24, 25, 26, 31, 32 and 40. The highest clustering achieves the lowest efficiency where the
correlated features are 8, 9, 30, 33, 41, and 42. The accuracy is 78.47% and the FAR is 23.79%.
probability of this group is 18 42 , 42 ; and 42 respec-
18 6
In Table 8, the comparative results of the KDD99
tively, thus the acceptable correlation rate is 57.2% of and the UNSW-NB15 data sets are elaborated.
the high and the middle correlated features. Overall, the results of the accuracy and the FAR
of these techniques using the KDD99 data set are
better than the UNSW-NB15 data set. There are
7.3. Complexity evaluation of the classifier
two perspectives demonstrate the complexity of the
techniques
UNSW-NB15 data set compared to the KDD99
To evaluate the complexity of the UNSW-NB15 data set. First, from the perspective of the network
data set (i.e., the training and the testing sets) in traffic behavior, the UNSW-NB15 data set contains
terms of accuracy and false alarm rate (FAR), the a variety of the contemporary attack and normal
Table 8. Comparison between the results of the KDD99 and UNSW-NB15 data set.
KDD99 data set UNSW-NB15 data set
Techniques Reference Accuracy (%) FAR (%) Accuracy (%) FAR (%)
DT (Bro-IDS Tool, 2014) 92.30 11.71 85.56 15.78
LR (Witten & Mining, 2005) 92.75 - 83.15 18.48
NB (Shyu et al., 2005) 95 5 82.07 18.56
ANN (Witten & Mining, 2005) 97.04 1.48 81.34 21.13
EM clustering (Salem & Buehler, 2012) 78.06 10.37 78.47 23.79
behaviors. On the contrary, the attack and normal the results of the two data sets, the efficiency
behaviors of the KDD99 data set are outdated. techniques using the KDD99 data set are better
Additionally, the similarities of the normal and than the UNSW-NB15 data set. As a consequence,
the attack observations in majority of the features the UNSW-NB15 data set is considered complex
add another factor to the complexity of the UNSW- due to the similar behaviours of the modern attack
NB15 data set. and normal network traffic. This means that this
Second, from the perspective of the statistical data set can be used to evaluate the existing and
based test, as shown in Figures 1, 2, and 3, the the novel methods of NIDSs in a reliable way.
features of the training and the testing sets are a In the future, we plan to develop a new classifica-
highly correlation because the features are almost tion technique to identify the anomalies from the
similar in the skewness and the kurtosis indicators. nonlinearity and non-normality data representation.
Further, the training and the testing sets have the
same distribution which is non-linear and non-nor-
mal. As a result, the two perspectives demonstrate ORCID
the major reasons of the complexity of the UNSW- Nour Moustafa http://orcid.org/0000-0001-
NB15 data set compared to the KDD99 data set. 6127-9349
8. Conclusion and future work References
In this paper, the analysis and the evaluation of the Argus tool. (2014). Retrieved from http://qosient.com/argus/
UNSW-NB15 data set are discussed. A part from flowtools.shtml.
Australian Center for Cyber Security (ACCS). (2014).
this data set is divided into a training set and
Retrieved from http://www.accs.unsw.adfa.edu.au/
testing set to examine this data set. The training Aziz, A. S. A., Azar, A. T., Hassanien, A. E., & Hanafy, S. E.
and testing sets are analysed in three aspects of the (2014). Continuous features discretization for anomaly
statistical analysis phase, the feature correlation intrusion detectors generation. In Proceedings of the 17th
phase and the complexity evaluation phase. First, Online World Conference on Soft Computing in Industrial
the features of the two sets are converted into Applications (pp. 209–221). Switzerland: Springer.
Bhuyan, M. H., Bhattacharyya, D. K., & Kalita, J. K. (2014).
numerical values to be statistically processed and
Network anomaly detection: Methods, systems and
normalized using the z-score transformation to tools. IEEE Communications Surveys & Tutorials, 16 (1),
prevent the change in the original distribution. 303–336. doi:10.1109/SURV.2013.052213.00046
The statistical results show that the two sets are Bland, J. M., & Altman, D. G. (1995). Statistics notes:
of the same distribution, nonnormal and non- Calculating correlation coefficients with repeated observa-
linear, using the Kolmogorov-Smirnov test. tions: Part 2—correlation between subjects. Bmj, 310
(6980), 633. doi:10.1136/bmj.310.6980.633
Further, the skewness and kurtosis indicators of
Bouzida, Y., & Cuppens, F. (2006). Neural networks vs. deci-
the training and the testing set are statistically sion trees for intrusion detection. IEEE/IST Workshop on
similar. Second, the feature correlations of the Monitoring, Attack Detection and Mitigation (MonAM),
training and the testing sets are measured either Tuebingen, Germany.
with the class label (i.e., the Pearson’s correlation Bro-IDS Tool. (2014). Retrieved from https://www.bro.org/.
coefficient method) or without the label (i.e., the Cherkassky, V., & Mulier, F. M. (2007). Learning from data:
Concepts, theory, and methods. Hoboken, NJ: John Wiley &
Gain Ratio technique). The feature correlations
Sons.
results demonstrate that these features are highly Cieslak, D. A., & Chawla, N. V. (2009). A framework for
relevant observations. Third, the five techniques of monitoring classifiers’ performance: When and why
the DT, LR, NB, ANN, and EM clustering are used failure occurs? Knowledge and Information Systems, 18
to measure the complexity in terms of accuracy (1), 83–108. doi:10.1007/s10115-008-0139-1
and False Alarm Rate (FAR) of this data set, and Cunningham, R., & Lippmann, R. (2000). Detecting computer
attackers: Recognizing patterns of malicious stealthy behavior.
then the results are compared using the KDD99
MIT Lincoln Laboratory–Presentation to CERIAS, 11, 29.
data set. The evaluation results of the five techni- Denning, D. E. (1987). An intrusion-detection model. IEEE
ques show that the DT technique accomplishes the Transactions on Software Engineering, SE-13 (2), 222–232.
best efficiency compared to others. For comparing doi:10.1109/TSE.1987.232894
DeWeese, S. (2009). Capability of the People’s Republic of Moustafa, N., & Slay, J. (2015a). Creating novel features to
China (PRC) to conduct cyber warfare and computer net- anomaly network detection using DARPA-2009 data set.
work exploitation. Darby, PA: DIANE Publishing. 14th European Conference on Cyber Warfare and Security
Eom, J.-H., Kim, S.-H., & Chung, T.-M. (2012). Cyber mili- ECCWS-2015. The University of Hertfordshire, Hatfield,
tary strategy for cyberspace superiority in cyber warfare. in UK.
2012 International Conference on Cyber Security, Cyber Moustafa, N., & Slay, J. (2015b). UNSW-NB15: A compre-
Warfare and Digital Forensic (CyberSec), IEEE. hensive data set for network intrusion detection. 2015
García-Teodoro, P., Díaz-Verdejo, J., Maciá-Fernández, G., & Military Communications and Information Systems
Vázquez, E. (2009). Anomaly-based network intrusion Conference. Canberra, Australia: MilCIS 2015-IEEE
detection: Techniques, systems and challenges. Computers Stream.
& Security, 28 (1–2), 18–28. doi:10.1016/j.cose.2008.08.003 Mukkamala, S., Sung, A. H., & Abraham, A. (2005). Intrusion
Ghosh, A. K., Wanken, J., & Charron, F. (1998). Detecting detection using an ensemble of intelligent paradigms.
anomalous and unknown intrusions against programs. Journal of Network and Computer Applications, 28 (2),
Computer Security Applications Conference, 1998. 167–182. doi:10.1016/j.jnca.2004.01.003
Proceedings. 14th Annual. IEEE. Panda, M., & Patra, M. R. (2007). Network intrusion detec-
Hall, M. A., & Smith, L. A. (1998). Practical feature subset tion using naive bayes. International Journal of Computer
selection for machine learning. In McDonald, C. (Ed.), Science and Network Security, 7 (12), 258–263.
Computer Science ’98 Proceedings of the 21st Australasian Salem, M., & Buehler, U. (2012). Mining techniques in net-
Computer Science Conference ACSC’98 (pp. 181–191). work security to enhance intrusion detection systems.
Berlin, Germany: Springer. International Journal of Network Security & Its
IXIA PerfectStormOne Tool. (2014). Retrieved from http:// Applications (IJNSA), 4 (6). doi:10.5121/ijnsa
www.ixiacom.com/products/perfectstorm Sharif, I., Prugel-Benett, A., & Wills, G. (2012).
Jain, A., Nandakumar, K., & Ross, A. (2005). Score normal- Unsupervised clustering approach for network anomaly
ization in multimodal biometric systems. Pattern detection, networked digital technologies. In Benlamri,
Recognition, 38 (12), 2270–2285. doi:10.1016/j.patcog.2005. R., (Ed), Networked Digital Technologies (Vol. 293, pp.
01.012 135–145). Communications in Computer and
Justel, A., Peña, D., & Zamar, R. (1997). A multivariate Information Science. Berlin, Germany: Springer Berlin
Kolmogorov-Smirnov test of goodness of fit. Statistics & Heidelberg.
Probability Letters, 35 (3), 251–259. doi:10.1016/S0167- Shyu, M.-L., Sarinnapakorn, K., Kuruppu-Appuhamilage, I.,
7152(97)00020-5 Chen, S.-C., Chang, L., & Goldring, T. (2005). Handling
KDDCUP1999. (2007). Retrieved from http://kdd.ics.uci.edu/ nominal features in anomaly intrusion detection problems.
databases/kddcup99/KDDCUP99.html 15th International Workshop on Research Issues in Data
Lazarevic, A., Ertoz, L., Kumar, V., Ozgur, A., & Srivastava, J. Engineering: Stream Data Mining and Applications, 2005.
(2003). A comparative study of anomaly detection schemes in RIDE-SDMA 2005. IEEE.
network intrusion detection. SDM. SIAM. So-In, C., Mongkonchai, N., Aimtongkham, P., Wijitsopon,
Lee, W., Stolfo, S. J., & Mok, K. W. (1999). A data mining K., & Rujirakul, K. (2014). An evaluation of data mining
framework for building intrusion detection models. classification models for network intrusion detection. 2014
Proceedings of the 1999 IEEE Symposium on Security Fourth International Conference on Digital Information
and Privacy, 1999. IEEE. and Communication Technology and its Applications
Mardia, K. V. (1970). Measures of multivariate skewness and (DICTAP), IEEE.
kurtosis with applications. Biometrika, 57 (3), 519–530. Sokolova, M., Japkowicz, N., & Szpakowicz, S. (2006). Beyond
doi:10.1093/biomet/57.3.519 accuracy, F-score and ROC: A family of discriminant
Massey, F. J., Jr (1951). The Kolmogorov-Smirnov test for good- measures for performance evaluation. In AI 2006:
ness of fit. Journal of the American Statistical Association, 46 Advances in artificial intelligence (Vol. 4304, pp.
(253), 68–78. doi:10.1080/01621459.1951.10500769 1015–1021). Lecture Notes in Computer Science. Berlin,
Matlab Tool. (2014). Retrieved from http://au.mathworks. Germany: Springer.
com/products/matlab/?refresh=true SPSS tool. (2014). Retrieved from http://www-01.ibm.com/
McHugh, J. (2000). Testing intrusion detection systems: A software/analytics/spss/
critique of the 1998 and 1999 DARPA intrusion detection Tavallaee, M., (2009). A detailed analysis of the KDD CUP 99
system evaluations as performed by Lincoln Laboratory. data set. In Proceedings of the Second IEEE Symposium on
ACM Transactions on Information and System Security, 3 Computational Intelligence for Security and Defence
(4), 262–294. doi:10.1145/382912.382923 Applications 2009 (pp. 53–58). Piscataway, NJ: IEEE.
Moustafa, N., & Slay, J. (2014, May) UNSW-NB15 DataSet for tcpdump tool. (2014). Retrieved from http://www.tcpdump.
Network Intrusion Detection Systems. Retrieved from org/
http://www.cybersecurity.unsw.adfa.edu.au/ADFA% Valdes, A., & Anderson, D. (1995). Statistical methods for
20NB15%20Datasets computer usage anomaly detection using NIDES (Next-
Generation Intrusion Detection Expert System). In Biographies

Proceedings of the Third International Workshop on
Rough Sets and Soft Computing (RSSC94), (pp. 306–311). Nour Moustafa is a PhD candidate at the School of Engineering
San Jose, CA: USW. and Information Technology (SEIT) in the University of New
Vatis, M. A. (2001). Cyber attacks during the war on terror- South Wales, Canberra, Australia. He is an IEEE student mem-
ism: A predictive analysis. DTIC Document. Hanover, NH: ber. He received his bachelor degree in 2009 and his master’s
Institute for Security Technology Studies at Dartmouth degree in 2014, at the faculty of computer and Information,
College. Helwan University, Egypt. His areas of interests include cyber
Vigna, G., & Kemmerer, R. A. (1999). NetSTAT: A network- security, in particular, network intrusion detection systems, data
based intrusion detection system. Journal of Computer mining, and machine learning mechanisms
Security, 7, 37–71.
Visual Studio 2008. (2014). Retrieved from http://msdn.micro Professor Jill Slay is director of the Australian Centre for Cyber
soft.com/en-us/library/ms175595(v=sql.100).aspx Security at UNSW Canberra at ADFA. She has established an
Witten, I. H., & Mining, E. F. D. (2005). Practical machine international research reputation in cyber security and has
learning tools and techniques (The Morgan Kaufmann worked in collaboration with many industrial partners. She
Series in Data Management Systems). San Francisco, CA: has published more than 92 research outputs in information
Elsevier. assurance and supervised 16 PhDs.

The Evaluation of Network Anomaly Detection Systems: Statistical Analysis of

Uploaded by

Copyright:

Available Formats

The Evaluation of Network Anomaly Detection Systems: Statistical Analysis of

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

The Evaluation of Network Anomaly Detection Systems: Statistical Analysis of

Uploaded by

Copyright:

Available Formats

Information Security Journal: A Global Perspective

ISSN: 1939-3555 (Print) 1939-3547 (Online) Journal homepage: http://www.tandfonline.com/loi/uiss20

The evaluation of Network Anomaly Detection

Nour Moustafa & Jill Slay

To link to this article: http://dx.doi.org/10.1080/19393555.2015.1125974

Published online: 11 Jan 2016.

Submit your article to this journal

View related articles

View Crossmark data

Full Terms & Conditions of access and use can be found at

The evaluation of Network Anomaly Detection Systems: Statistical analysis of

Table 6. A part of UNSW-NB15 data set distribution.

4. Statistical descriptive observations fij M

fitting is accomplished in the case of

4.4. Multivariate kurtosis Algorithm 1: The statistical relationship

and 11 respectively. dard deviation, Mf1 ¼ 1=N xi and Mf2 ¼ N yi

(Bland & Altman, 1995) measures the relevance

where pi is a likely of the instance I that belongs to

7.1. The statistical explanation

Based on the above observation, the features of

7.2. Feature correlations

two aspects; without the label using the PCC and

Figure 3. The kurtosis of the features on the training and

Figure 6. The comparison of the fifth techniques on the UNSW-

five techniques (e.g., NB, DT, ANN, LR, and EM

8. Conclusion and future work References

Generation Intrusion Detection Expert System). In Biographies

Elsevier. assurance and supervised 16 PhDs.

You might also like