An Agent-Based Bayesian Forecasting Model For Enhanced Network Security
J. PIKOULAS, W.J. BUCHANAN, Napier University, Edinburgh, UK.
M. MANNION, Glasgow Caledonian University, Glasgow, UK.
K. TRIANTAFYLLOPOULOS, University of Warwick, UK.
Security monitoring software. This monitors different operations of a computer network and
outputs the results to system administrators (for example, Abacus Sentry, COPS, Tripwire and Tiger).
Network monitoring software. This monitors users'
behaviour or monitors incoming or outgoing traffic
(for example, Argus, Arpwatch and ISS).
Firewall software and hardware. This runs on the
Internet/intranet entrance to a computer network, and
checks all incoming network traffic for its contents at
the network and transport layers of the OSI model. At
the network layer, typically the Internet Protocol (IP)
addresses are filtered for their source and/or destination, and at the transport layer, the TCP ports are
monitored (thus FTP and TELNET traffic could be
blocked for incoming data traffic, but SMTP (electronic mail) could be allowed).
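The firewall filtering described above can be sketched as follows. This is only a minimal illustration: the `Packet` type, the blocked source network and the function name are assumptions introduced here, and only the TCP port numbers for FTP (21), TELNET (23) and SMTP (25) are standard values.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical policy values, chosen only for illustration.
BLOCKED_SOURCES = [ip_network("203.0.113.0/24")]  # network-layer rule
BLOCKED_TCP_PORTS = {21, 23}                      # FTP, TELNET
ALLOWED_TCP_PORTS = {25}                          # SMTP (electronic mail)


@dataclass
class Packet:
    src_ip: str    # source IP address (network layer)
    dst_port: int  # destination TCP port (transport layer)


def allow_incoming(packet: Packet) -> bool:
    """Return True if an incoming packet passes both filtering layers."""
    src = ip_address(packet.src_ip)
    # Network layer: drop traffic from blocked source networks.
    if any(src in net for net in BLOCKED_SOURCES):
        return False
    # Transport layer: drop explicitly blocked services.
    if packet.dst_port in BLOCKED_TCP_PORTS:
        return False
    # Default deny: only explicitly allowed ports get through.
    return packet.dst_port in ALLOWED_TCP_PORTS
```

For example, `allow_incoming(Packet("198.51.100.7", 25))` lets mail through, while the same source on port 21 or 23 is rejected.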
These methods are generally centralised applications with
no real-time response and have no mechanism to foresee
future user events. They also have a central focal
point for security (typically a main server), which could
itself become the focus of an attack (such as a denial-of-service attack, where the server is bombarded with hoax
requests, which eventually reduces its quality of service to
its clients).
The method involved in this research is distributed,
and does not depend on a central point of failure. It also
gathers user behavioural information and makes a prediction of what the user might do in the future. This paper
presents a distributed approach to network security using
agents, and presents a novel application of the Bayesian
forecasting technique to predict user actions. The Bayesian
method has, in the past, been used for weather forecasting
and has been expanded so that it can be used to provide
enhanced network security by trying to predict user actions. With this, a system can determine if a user is acting
unpredictably or has changed their normal working pattern. Results are also given which show that the new
model can predict user actions, and a set of experiments
is proposed for further exploitation of the method.
In choosing a computer network security solution, the
dominant issues are: cost; the desired level of security; and
the characteristics of the existing operating system envi-
Abstract
Security has become a major issue in many organisations,
but most systems still rely on operating systems, and a
user ID and password system, to provide user authentication and validation. They also tend to be centralized in
their approach, which makes them open to an attack. This
paper presents a distributed approach to network security
using agents, and presents a novel application of the
Bayesian forecasting technique to predict user actions.
The Bayesian method has been used in the past for
weather forecasting and has been expanded so that it can
be used to provide enhanced network security by trying to
predict user actions. With this, a system can determine if a
user is acting unpredictably or has changed their normal
working pattern. Results are also given which show that
the new model can predict user actions, and a set of experiments is proposed for further exploitation of the
method.
1. Introduction
Computer security is a major concern for organizations.
Whilst security violations can be caused by external users
(hackers), Carter and Catz [1] have shown that the primary
threat comes from individuals inside an organisation.
Hence much more emphasis has to be placed on internal
security mechanisms.
External network attacks can be categorised [4] into IP
spoofing attacks [5], packet-sniffing [6], sequence number
prediction attacks and trust-access attacks. Categories of
internal attack include password attacks [7], session hijacking attacks, shared library attacks, social engineering
attacks, and technological vulnerability attacks.
Computer network security programs can be categorised
as follows [3]:
Security enhancement software. This enhances or
replaces an operating system's built-in security software (for example, Mangle It, Passwd+ and Shadow).
Authentication and encryption software. This encrypts and decrypts computer files (for example, Kerberos, MD5, RIPEM, and TIS Firewall Toolkit).
0-7695-1086-8/01 $10.00 © 2001 IEEE
ronment. Three mechanisms for illegal behaviour detection are commonly used in computer network security
programs [8], and can be applied to all five categories of
computer security program.
Hybrid Detection
Hybrid detection systems are a combination of statistical
anomaly detection and rule-based detection systems.
These typically use rules to detect known methods of
intrusion and statistical methods to detect new
methods of intrusion.
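A hybrid detector of this kind can be sketched in a few lines. The example below is an assumption-laden illustration, not any named product's method: the rule set, the z-score threshold and all function names are hypothetical, and the statistical part is a plain mean and standard deviation test.

```python
from statistics import mean, stdev
from typing import Sequence

# Hypothetical signatures of known intrusion methods (rule-based part).
KNOWN_BAD_COMMANDS = {"cat /etc/shadow", "rm -rf /var/log"}


def rule_based_alert(command: str) -> bool:
    """Rule-based detection: match against known intrusion signatures."""
    return command in KNOWN_BAD_COMMANDS


def anomaly_alert(history: Sequence[float], value: float,
                  threshold: float = 3.0) -> bool:
    """Statistical detection: flag values far from the user's history."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    sigma = stdev(history)
    if sigma == 0:
        return value != history[0]
    return abs(value - mean(history)) / sigma > threshold


def hybrid_alert(command: str, history: Sequence[float], value: float) -> bool:
    """Raise an alert if either detector fires."""
    return rule_based_alert(command) or anomaly_alert(history, value)
```

Here `history` and `value` stand for any numeric usage measure (for example, hours of use of one application); the rules catch known attacks and the statistical test catches unusual behaviour that no rule describes.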
CMDS (Computer Misuse Detection System) [14] is a
security-monitoring package that provides a method to
watch for intrusions, such as bad logins or file modifications. It also monitors for difficult detection problems
such as socially engineered passwords, trusted user file
browsing and data theft that might indicate industrial espionage. CMDS supports a wide variety of operating systems and application programs. The drawback of this system is that it uses statistical analysis to make additional
rules for the system. As a result, it can only
detect attack patterns that have been used in the past and
identified as attack patterns, or that have been predefined by the system
operators. It also generates long reports and graphs of the
system performance that need to be interpreted by a
security expert.
Figure 1: Agent Environment Topology
4 Faster processing
5 Prediction Model
user.
2. Kill the specific application that has caused invalid
behaviour.
3. Prevent the end user from running any further applications.
Cases 2 and 3 can be achieved locally at the client workstation, and in Case 1, the user agent informs the core
agent and the core agent informs the systems administrator. The user agent terminates when a user logs off.
Figure 2 shows the complete model for the forecasting
system, where a core agent reads the user profile, which is
then received by the user agent. The user agent then checks the usage against the forecast. Eventually, when the
user logs off, the user profile is updated and sent back to
the core agent.
In the traditional method of forecasting, a user event
would be averaged over long time intervals (Figure 3).
[Figure labels: "New user profile"; "Current user profile"]
When our intrusion detection system is installed, the prediction part monitors the user behaviour 15 times. After that, it evaluates itself five times. After this it is
ready to make an accurate prediction. Our model has three
stages of operation. The stages are:
1. Observation stage. In this stage the model monitors the user and records their behaviour.
2. Evaluation stage. In this stage the model makes a
prediction, monitors the user's actual actions and calculates the prediction error. This stage is critical,
because the model modifies itself according to the
environment that it operates in.
3. One-step prediction. In this stage the model makes
a single-step prediction. For example, assume that the
user has logged in 15 times, the model is configured, and it is ready to start predicting user moves.
Instead of making a five- or ten-step prediction, like
other mathematical models, our model makes a prediction for the next step only. When the user logs in and
out, our model takes the actual behaviour of the
user, compares it with the one-step prediction that it
made before and calculates the error. So the
next time a prediction is made for this user it will also include the data from the last user behaviour. With
this procedure we maximise the accuracy of the prediction system.
[Figure labels: "Core agent sends forecasting information"; "User agent returns the updated model for the user"]
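The three stages can be pictured as a small state machine driven by the number of completed sessions. The sketch below is an illustration under stated assumptions: the class and method names are hypothetical, and the predictor inside it is a simple exponential-smoothing stand-in, not the Bayesian model of this paper. Only the counts (15 observations, 5 evaluations) and the one-step, update-at-log-off behaviour come from the text.

```python
class StagedPredictor:
    """Tracks the stage of the model for one user and one usage figure."""

    def __init__(self, n_observe: int = 15, n_evaluate: int = 5,
                 alpha: float = 0.3):
        self.n_observe = n_observe    # sessions spent only observing
        self.n_evaluate = n_evaluate  # sessions spent self-evaluating
        self.alpha = alpha            # weight of the stand-in predictor
        self.sessions = 0             # completed login sessions so far
        self.level = None             # current one-step prediction
        self.errors = []              # recorded one-step prediction errors

    @property
    def stage(self) -> str:
        if self.sessions < self.n_observe:
            return "observation"
        if self.sessions < self.n_observe + self.n_evaluate:
            return "evaluation"
        return "prediction"

    def predict(self):
        """One-step prediction; None while the model is only observing."""
        return None if self.stage == "observation" else self.level

    def record(self, actual: float) -> None:
        """Call at log-off with the actual behaviour of the session."""
        forecast = self.predict()
        if forecast is not None:
            self.errors.append(actual - forecast)
        # Stand-in update: fold the latest session into the next prediction.
        if self.level is None:
            self.level = actual
        else:
            self.level += self.alpha * (actual - self.level)
        self.sessions += 1
```

Because each prediction looks only one step ahead and is corrected at the next log-off, the error list grows by exactly one entry per session once the observation stage is over.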
The proposed forecasting method improves on this by requiring much less memory storage. Figure 4 shows a generic
model for prediction, using parameters for a given window size (n time units) and prediction number (z).
some kind of knowledge, that is a subjective belief, expressed in terms of a distribution. This is the prior distribution of $(\theta_t \mid D_{t-1})$ at time $t$. In other words, it is what we
know before $Y_t$ becomes available. Once this happens, we
revise this prior belief, using the likelihood function, to
find the posterior (or revised) distribution $(\theta_t \mid D_t)$, which is
better and more accurate. Then, by simple calculations, we find the prior for time $t+1$, and we calculate the
posterior at $t+1$ only when the data
$Y_{t+1}$ comes in to the system (in our case, the real behaviour of the user). The model used becomes:
Figure 4: Forecasting calculation (sample parameters: n = 15, z = 5, time unit = 1 hour)
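The prior-to-posterior cycle described above can be illustrated with the simplest dynamic linear model. The sketch below is a deliberate simplification and an assumption on our part: it is univariate, treats the level as locally constant, and takes the observation variance `V` and evolution variance `W` as known, whereas the model in this paper is multivariate with an unknown variance matrix. The names `Belief`, `forecast` and `update` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    mean: float  # m_t: posterior mean of theta_t given D_t
    var: float   # C_t: posterior variance of theta_t given D_t


def forecast(prev: Belief, W: float, V: float):
    """Prior for theta_t given D_{t-1}, and the one-step forecast of Y_t."""
    a = prev.mean     # prior mean: the level is assumed to carry over
    R = prev.var + W  # prior variance grows by the evolution variance
    f = a             # one-step forecast mean of Y_t
    Q = R + V         # one-step forecast variance of Y_t
    return a, R, f, Q


def update(prev: Belief, y: float, W: float, V: float) -> Belief:
    """Posterior for theta_t once the observation Y_t = y arrives."""
    a, R, f, Q = forecast(prev, W, V)
    e = y - f  # one-step forecast error
    A = R / Q  # adaptive coefficient: how far to move towards y
    return Belief(mean=a + A * e, var=R - A * A * Q)
```

Only the current `Belief` has to be stored between sessions, which is one way to see why a recursive scheme of this kind needs far less memory than averaging over a long history.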
We use multivariate models because we want to incorporate and forecast several variables simultaneously. Again,
note that the parameters $\theta_t$ change both deterministically (through $G_t$) and stochastically (through the
variance $W_t$), and thus make the model dynamic. Also,
standard ARIMA (Auto-Regressive Integrated Moving
Average) models are a special and restrictive case of the
above model, obtained when you set $F_t = F$, $G_t = G$ and $W_t = W$
(all three components are constant over time). This is
restrictive, since all these components are likely to change
over time because, e.g., (1) changes over time and there are
other external sources of variation (such as extra subjective information about a variable). Moreover, equation (2)
is not observable. This means that we are never going to
see any evolution or trend in a diagram or a graph. This is
a hidden model that cannot assume $W_t$ to be constant over
time. There is another large problem that we cannot ignore
in multivariate models. The variance matrix
will not be
known. Often, in standard time series, it is assumed known
and the analysis moves straight on to another problem. However, in
practice, it is extremely difficult to set it as a known
matrix. It is very difficult to propose what variance to use
for a system where 20 applications are considered and only
20 or 30 vectors are collected as data.
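For orientation, the symbols $F_t$, $G_t$ and $W_t$ used above match the usual notation for a dynamic linear model. Assuming that equations (1) and (2) follow the standard observation and evolution form (the numbering and the observation variance symbol $V_t$ are assumptions here), such a model is usually written as:

```latex
\begin{align}
Y_t      &= F_t^{\top}\theta_t + \nu_t,      & \nu_t    &\sim N(0, V_t), \tag{1}\\
\theta_t &= G_t\,\theta_{t-1} + \omega_t,    & \omega_t &\sim N(0, W_t). \tag{2}
\end{align}
```

Under this reading, the first equation links the observed behaviour $Y_t$ to the hidden parameters $\theta_t$, the second describes how those parameters evolve, and fixing $F_t = F$, $G_t = G$ and $W_t = W$ gives the constant model that contains ARIMA as a special case.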
So for all these reasons we need to consider dynamic
models. Also, the system can forecast as
far ahead as we like, and proves very accurate according to
the results. For this purpose we used a Bayesian framework, which essentially means that at time $t$ we will have
In addition to the standard autoregressive and moving average parameters, ARIMA models may also include a constant, as described above. The interpretation of a statistically significant constant depends on the model that is fitted.
Specifically:
if there are no autoregressive parameters in the
model, then the expected value of the constant is
μ, the mean of the series;
if there are autoregressive parameters in the series, then the constant represents the intercept.
If the series is differenced, the constant represents the
mean or intercept of the differenced series. For example, if
the series is differenced once, and there are no autoregressive parameters in the model, the constant represents the
mean of the differenced series, and therefore the linear
trend slope of the un-differenced series.
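The last statement can be checked with a short derivation. The symbols $c$ (the constant), $\varepsilon_t$ (zero-mean noise) and $\phi$ (an autoregressive coefficient) are introduced here for illustration only:

```latex
% Differenced once, no autoregressive parameters:
\begin{align*}
y_t - y_{t-1} &= c + \varepsilon_t, \qquad E[\varepsilon_t] = 0
  &&\Rightarrow\; E[y_t - y_{t-1}] = c,\\
y_t &= y_0 + c\,t + \sum_{s=1}^{t}\varepsilon_s
  &&\Rightarrow\; E[y_t] = y_0 + c\,t .
\end{align*}
% With one autoregressive parameter on the differenced series w_t = y_t - y_{t-1}:
\begin{align*}
w_t = c + \phi\,w_{t-1} + \varepsilon_t
  \;\Rightarrow\; E[w_t] = \frac{c}{1-\phi}, \qquad |\phi| < 1 .
\end{align*}
```

So without autoregressive terms the constant is both the mean of the differenced series and the slope of the original series, while with an autoregressive term it is an intercept and no longer equals the mean.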
ARIMA models are similar to our model. They use the
existing data to calculate the parameters of the model. But
suppose, for example, that some external information is available.
We may know of a particular user x that, although he
does not have an illegal user profile, it is very probable
that at a specific point in time he will perform a huge invasion of an important application. ARIMA will try to
change the parameters to adjust the model, but even in this
case, it is doubtful how well the model will do across all the
applications. With our DLM (dynamic linear model) this is not a problem. We simply
add the external information to the prior information we have. This is named expert intervention, and the revised posterior takes into account the new knowledge. Our
system is not assumed to be perfect when the model is fitted,
and we let information, whatever its sort, make us
learn and improve the system.
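One common way to express expert intervention is to adjust the prior before the next observation arrives. The sketch below assumes a univariate normal prior with known observation variance; the function names and the additive form of the adjustment (a shift in the mean plus extra variance) are illustrative assumptions, not the exact mechanism of the system described here.

```python
def intervene(prior_mean: float, prior_var: float,
              shift: float, extra_var: float):
    """Return the prior after adding external (expert) information.

    shift     -- expected change in behaviour according to the expert
    extra_var -- additional uncertainty attached to that opinion
    """
    if extra_var < 0:
        raise ValueError("extra_var must be non-negative")
    return prior_mean + shift, prior_var + extra_var


def posterior(prior_mean: float, prior_var: float,
              y: float, obs_var: float):
    """Standard normal update of a prior with a single observation y."""
    gain = prior_var / (prior_var + obs_var)
    return prior_mean + gain * (y - prior_mean), (1 - gain) * prior_var
```

For example, `intervene(1.0, 1.0, shift=2.0, extra_var=3.0)` moves the prior to mean 3.0 with variance 4.0; the larger variance also makes the subsequent `posterior` step react more strongly to what the user actually does.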
Now our model is slightly different than the one we use
for illustration purposes. We find recurrence relationships.
[Figure: prediction plot; x-axis: Time (hours); sample parameters: n = 15, z = 5, time unit = 1 hour]
Intervention
Intervention is a mechanism for improving the prediction
accuracy. It is used when there is additional information
about the future behaviour of the system, which can be added
to the model prior to the prediction. For example, if there are
some users that are keen on using illegal software, or there
are new users about whose behaviour there is not enough information,
by applying the intervention mechanism,
less precise with the actual readings and they fail to predict the action of the user in application 1 at the time interval nineteen, in comparison with our model that predicted it with a very close figure.
[Figure: Prediction for Application 1 (using ARIMA); x-axis: Time (hours); parameters: n = 15, z = 5, time unit = 1 hour; annotation: variation of prediction window (1 to 100)]
Evaluation
8 References
[1] Carter and Catz, Computer Crime: an emerging challenge for law enforcement, FBI Law Enforcement Bulletin, pp 1-8, December 1996.
[2] Roger Blake, Hackers in the Mist, Northwestern University, December 2, 1994.
[3] National Institutes of Health, Center for Information Technology, http://www.alw.nih.gov/Security/securityprog.html#commercial, October 1998.
[4] W.J. Buchanan, Handbook of Data Communications and Networks, Kluwer, 1998.
[5] SamsNet, A Hacker's Guide to Protecting Your Internet Site and Network, URL: http://mx.nsu.ru/Mas-Security/ch28/ch28.htm
[6] NetworkICE Corporation, Packet Sniffing, http://www.networkice.com/advice/Underground/Hacking/Methods/Technical/Packet_sniffing/default.htm
[7] Alan Ramsbottom, FAQ: NT Cryptographic Password Attacks & Defences, 1997, http://www.omikron.de/~ecr/nthack/samfaq.htm.
[8] Chris Herringshaw, Detecting Attacks on Networks, IEEE Computer Magazine, pp 16-17, Dec. 1997.
[9] Debra Anderson, Detecting Unusual Program Behavior Using the NIDES Statistical Component, IDS Report SRI Project 2596, Contract Number 910097C (Trusted Information Systems) under F30602-91-C-0067 (Rome Labs), 1995.
[10] T. Lunt, H. Javitz, A. Valdes, et al., A Real-Time Intrusion Detection Expert System (IDES), SRI Project 6784,
Future Work
The experiments conducted up to now were set up