ELET442 - Artificial Neural Networks (ANNs)


Overview of Artificial Neural Networks

‘Artificial Neural Network’ or ANN is the term used to describe a machine or
computer constructed and implemented to model and emulate the human brain.

It consists of a set of interconnected simple processing units (artificial
neurons, or nodes) that combine to produce an output signal suitable for
solving a certain problem based on the applied input signals.

These interconnected processing units have adjustable weights that are
gradually tuned over iterations under the influence of the input-output
patterns given to the ANN (the training & learning process).
Overview of Artificial Neural Networks
Roughly speaking, a neural network is a collection of artificial neurons, each
described by a mathematical model of a biological neuron in its simplest form.
The mathematical model of an artificial neuron is based on the following facts:
1) Neurons are the elementary units in a nervous system at which
information processing occurs.
2) Incoming information takes the form of signals passed between neurons
through connection links.
3) Each connection link has an adjustable weight that multiplies the signal
transmitted through it.
4) Each neuron has an internal action, depending on a bias or firing
threshold: an activation function is applied to the weighted sum of all
input signals entering the neuron to produce an activated output signal.
Biological and Artificial Neurons

Biological findings inspire the development of the artificial neural net:
Input → weights → logic function → output
Terminology Relation Between Biological and Artificial Neuron
Some Criteria of Comparison Between the Biological Brain and Artificial
Neural Networks
General Architecture of Artificial Neural Networks
BASIC OPERATION OF ARTIFICIAL NEURAL NETWORK

o X1, X2, …, Xn (inputs to the neuron)
o Y (output of the neuron)
o W1, W2, …, Wn (adjustable weights)
o Bi (given bias value)

o Net input calculation: Yin = X1·W1 + X2·W2 + … + Xn·Wn + Bi

o Output: Y = f(Yin)
Where f is the activation function.
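A minimal sketch of this basic operation in Python. The function name
neuron_output is our own; the sample numbers are those of Example 1 below:

    # Sketch of the basic operation of a single artificial neuron.
    import math

    def neuron_output(inputs, weights, bias, f):
        # Net input: Yin = X1*W1 + X2*W2 + ... + Xn*Wn + Bi
        y_in = sum(x * w for x, w in zip(inputs, weights)) + bias
        # Output: Y = f(Yin), where f is the activation function
        return f(y_in)

    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    print(neuron_output([3, 2, 0, -2], [0.15, -0.1, 0.8, -0.75], 0.0, sigmoid))
    # sigmoid(1.75) ≈ 0.8520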
Most Used types of Activation functions

1) The Logistic Sigmoid Function: Its curve has an S shape. It is especially
used for models where we have to predict a probability as the output, since a
probability exists only in the range between 0 and 1.
2) Tanh or Hyperbolic Tangent Activation Function: It is like the logistic
sigmoid, but often better. The range of the tanh function is (-1 to 1), so it
has the advantage of including negative values and a wider output range, and
it is also differentiable.
Most Used types of Activation functions
For the threshold function: if the weighted sum is less than 0, the TF will
pass on the value 0; if the value is equal to or more than 0, the TF passes
on 1. It is a yes-or-no, black-or-white decision (also known as the binary or
unit step function).
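The three activation functions above, sketched in Python (plain definitions;
the function names are our own):

    # The three activation functions discussed above.
    import math

    def unit_step(x):
        # Threshold / binary function: 0 below the threshold (0), else 1.
        return 1 if x >= 0 else 0

    def logistic_sigmoid(x):
        # S-shaped curve; output range (0, 1).
        return 1.0 / (1.0 + math.exp(-x))

    def tanh(x):
        # Output range (-1, 1); includes negative values.
        return math.tanh(x)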
Example 1: Find the summation of all input signals (Yin) before it is
processed by the activation function, for the following given data
(no bias value, Bi = 0):
o Inputs: (3, 2, 0, -2)
o Weights: (0.15, -0.1, 0.8, -0.75)
[Figure: inputs X1–X4 feeding a summation node ∑ to produce Yin]
Solution 1
Processing ➔ Yin = (3 × 0.15) + (2 × (-0.1)) + (0 × 0.8) + ((-2) × (-0.75))

Yin = 0.45 - 0.2 + 0 + 1.50 = 1.75


Example 2: What is the exact output (Y) of the previous example if the
activation function is given as:
1. Unit step function?
2. Logistic sigmoid function?
3. Tanh function?
Solution 2:
1. For the unit step function:
Y = f(1.75) = 1 ➔ because Yin = 1.75 > 0

2. For the logistic sigmoid function:
Y = f(1.75) = 1 / (1 + e^(-1.75)) = 0.8520

3. For the hyperbolic tangent function:
Y = f(1.75) = (e^1.75 - e^(-1.75)) / (e^1.75 + e^(-1.75)) = 0.9414
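Examples 1 and 2 can be checked numerically with a short script (a
verification sketch, not part of the original slides):

    # Verify Examples 1 and 2 numerically.
    import math

    inputs = [3, 2, 0, -2]
    weights = [0.15, -0.1, 0.8, -0.75]
    y_in = sum(x * w for x, w in zip(inputs, weights))   # no bias (Bi = 0)
    print(y_in)                                          # 1.75

    print(1 if y_in >= 0 else 0)                         # unit step -> 1
    print(1.0 / (1.0 + math.exp(-y_in)))                 # sigmoid   -> 0.8520
    print(math.tanh(y_in))                               # tanh      -> 0.9414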
Example 3: Find the output of the given ANN, if the hidden-layer neuron has a
logistic sigmoid as its activation function and the output neuron has a
binary activation function.

[Figure: a hidden neuron H1 (sigmoid) and two further inputs (0.6 and 0.015)
feed the output neuron, with weights 0.45, -0.25, and 0.85 respectively]

Yin_H1 = … (weighted sum of the figure's inputs to H1)

Yo_H1 = f(Yin_H1) = 0.3775

Yout_in = (0.3775 × 0.45) + (0.6 × (-0.25)) + (0.015 × 0.85) = 0.033

For the output neuron having a binary activation function:

Yout = f(0.033) = 1 ➔ since Yout_in = 0.033 > 0
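A quick check of the output stage of Example 3, assuming (as we read the
figure) that the output neuron receives 0.3775, 0.6, and 0.015 with weights
0.45, -0.25, and 0.85:

    # Output stage of Example 3; input/weight pairing follows our reading
    # of the figure.
    inputs = [0.3775, 0.6, 0.015]        # Yo_H1 and the two other inputs
    weights = [0.45, -0.25, 0.85]
    y_out_in = sum(x * w for x, w in zip(inputs, weights))
    print(round(y_out_in, 3))            # 0.033
    print(1 if y_out_in >= 0 else 0)     # binary activation -> 1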


TYPES OF ARTIFICIAL NEURAL NETWORKS

Generally, ANNs are classified by:


1. Its pattern of connections between the neurons (also called its
architecture or model).
The arrangement of neurons into layers and the connection patterns
between layers is called the network architecture.
2. Its learning algorithm (the method of adjusting and determining its
weights).
Two types of learning are available in ANNs:
• Supervised learning: learning in which teacher signals or targets are given
• Unsupervised learning: learning without the use of teacher signals
TYPES OF ARTIFICIAL NEURAL NETWORKS
1. Based on Architectures
Usually ANNs can be categorized into 2 models:
1) Feedforward Network:
All signals flow in one direction only, i.e., from lower layers (input) to
upper layers (output).
2) Feedback or Recurrent Network:
Signals from neurons in upper layers are fed back either to their own inputs
or to neurons in lower layers.
TYPES OF ARTIFICIAL NEURAL NETWORKS
Feedforward Networks

i) Single-Layer Perceptron (SLP) Network:


✓ A single-layer perceptron (SLP) is a feed-forward network. It is the
simplest type of artificial neural network and is usually used for
classifying linearly separable cases with a binary target (1, 0).
✓ Inputs are directly linked to the output layer.
✓ Inputs are connected to the output processing nodes with various weights,
resulting in a series of outputs, one per node.
ii) Multi-Layer Perceptron (MLP) Network:
✓ A multi-layer perceptron (MLP) has the same structure as a single-layer
perceptron, with one or more additional hidden layers.
✓ The function of the hidden neurons is to intervene between the external
input and the network output in some useful manner, extracting higher-order
mapping relationships between inputs and outputs.
✓ The input signals are propagated in a forward direction on a layer-by-layer
basis.
✓ The more hidden layers, the greater the complexity of the network, but the
richer the mapping it can produce.
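A minimal sketch of layer-by-layer forward propagation through an MLP (the
helper names and example layer sizes are our own; the sample weights are
those of Example 5 later in these notes, with zero biases assumed):

    # Layer-by-layer forward propagation through an MLP.
    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def layer(inputs, weights, biases):
        # weights[j][i] connects input i to neuron j of this layer.
        return [sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
                for row, b in zip(weights, biases)]

    def forward(x, layers):
        # Propagate the input through each (weights, biases) pair in turn.
        for weights, biases in layers:
            x = layer(x, weights, biases)
        return x

    # A tiny 2-2-1 network (same shape as Example 5, zero biases).
    net = [([[0.3, 0.7], [0.4, 0.9]], [0.0, 0.0]),   # input -> hidden
           ([[0.6, 0.9]], [0.0])]                    # hidden -> output
    print(forward([0.8, 0.5], net))                  # ≈ [0.7313]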
TYPES OF ARTIFICIAL NEURAL NETWORKS
Feedback and Recurrent Networks
i) Feedback Network:
✓ A feedforward neural network with at least one feedback loop is known as a
feedback network.
✓ When outputs are directed back as inputs to nodes of the same or a
preceding layer, a feedback network is formed.
TYPES OF ARTIFICIAL NEURAL NETWORKS
Feedback and Recurrent Networks

ii) Recurrent Network:
✓ It is a feedback network with a self-closed feedback loop, i.e., one in
which the output of a neuron is fed back to its own input.
✓ Sometimes the feedback loops involve unit-delay elements, which results in
useful nonlinear dynamic behavior.
Exercise 1
For the ANN given below, answer the following:

• How many input and output neurons? • Ans: 4 input and 2 output neurons

• How many hidden layers does this network have? • Ans: 3 layers

• How many adjustable weights in total?

• Ans: The first hidden layer has 4x4 weights, the second hidden layer has
4x3, the third hidden layer has 3x3, and the last hidden layer to the output
layer has 3x2 weights.
Total = 16 + 12 + 9 + 6 = 43
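A one-line check of this weight count (illustrative):

    # Count fully connected weights between consecutive layer sizes 4-4-3-3-2.
    sizes = [4, 4, 3, 3, 2]
    print(sum(a * b for a, b in zip(sizes, sizes[1:])))  # 16+12+9+6 = 43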
LEARNING & TRAINING OF ANNs
Once a network has been structured for a particular application, it is ready
for training. At the beginning, the initial weights are chosen randomly, and
then the training or learning begins. There are two approaches to training:
supervised and unsupervised.

In all of the neural paradigms, the application of an ANN involves two phases:
(1) Learning & Training phase
(2) Recall phase
▪ In the learning & training phase (usually offline), the ANN is trained
until it has learned its task (through the adaptation of its weights).
▪ In the recall phase (usually online), the ANN is used to solve tasks
similar to the ones it was trained on.
LEARNING & TRAINING OF ANNs

✓ An ANN solves a task when its weights are adapted through a learning
phase.

✓ All neural networks have to be trained before they can be used.

✓ They are given training patterns and their weights are adjusted
iteratively until the output error function is minimized.

✓ Once the ANN has been trained, no more training is needed.

Two types of learning prevail in ANNs:

❑ Supervised learning: learning in which teacher signals or targets are given
❑ Unsupervised learning: learning done without the use of teacher signals
Supervised Learning
Example: an ANN trained to work as an XOR gate:
[Figure: the actual output of the network is compared with the target output;
the resulting error drives the weight adjustments]
Unsupervised Learning
LEARNING ALGORITHMS
The Delta Rule:
This rule is based on the simple idea of continuously modifying the
strengths of the connection weights to reduce the difference (the delta)
between the desired output value and the actual output of a processing
element.

The Back-propagation Rule:
This is the most popular method used for training MLPs. It is similar to the
Delta Rule in that it minimizes the least mean square error (LMSE) when
modifying and adjusting the connection weights.

BP is used to update the weights to minimize the error function.
Back-propagation Training Algorithm (BP)

In a back-propagation neural network, the learning algorithm has two phases.
1. A training input pattern is presented to the network input
layer. The network propagates the input pattern from layer
to layer until the output pattern is generated by the output
layer.
2. If this pattern is different from the desired output, an error
is calculated and then propagated backwards through the
network from the output layer to the input layer.
The weights are modified as the error is propagated.
Back-propagation Training Algorithm (BP)

[Figure: input signals are propagated forward; error signals are propagated
backward]

Back-propagation Training Algorithm (BP)

1. Calculate the outputs of all neurons in the hidden layer:

x = Σi (xi · wij) + bias,  summed over i = 1 … n
Oj = f(x) = 1 / (1 + e^(-x))

2. Calculate the outputs of (all) neuron(s) in the output layer:

x = Σj (Oj · wjk) + bias,  summed over j = 1 … n
Ok = f(x) = 1 / (1 + e^(-x))
Back-propagation Training Algorithm (BP)

3. Calculate the output error:

δk = Ok · (1 - Ok) · (t - Ok)

4. Update the weights between the hidden and output layers:

Δw(jk) = α · Oj · δk
w(jk)(t+1) = w(jk)(t) + Δw(jk)

5. Calculate the hidden error:

δj = Oj · (1 - Oj) · Σk (δk · w(jk)),  summed over k = 1 … n
Back-propagation Training Algorithm (BP)

6. Update the weights between the input and hidden layers:

Δw(ij) = α · xi · δj
w(ij)(t+1) = w(ij)(t) + Δw(ij)

Where,
i = input layer index
j = hidden layer index
k = output layer index
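A compact sketch of one back-propagation iteration for a 2-2-1 network,
following steps 1–6 above (zero biases assumed; function and variable names
are our own):

    # One back-propagation iteration (steps 1-6) for a 2-2-1 network.
    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def bp_step(x, w_ih, w_ho, target, lr):
        # Steps 1-2: forward pass through hidden and output layers.
        o_h = [sigmoid(sum(xi * w for xi, w in zip(x, col))) for col in w_ih]
        o_k = sigmoid(sum(oj * w for oj, w in zip(o_h, w_ho)))
        # Step 3: output error, delta_k = Ok(1-Ok)(t-Ok).
        d_k = o_k * (1 - o_k) * (target - o_k)
        # Step 5: hidden errors, delta_j = Oj(1-Oj) * sum_k(delta_k * w_jk),
        # computed with the pre-update weights.
        d_h = [oj * (1 - oj) * d_k * w for oj, w in zip(o_h, w_ho)]
        # Step 4: update hidden-output weights, w += lr * Oj * delta_k.
        w_ho = [w + lr * oj * d_k for w, oj in zip(w_ho, o_h)]
        # Step 6: update input-hidden weights, w += lr * xi * delta_j.
        w_ih = [[w + lr * xi * dj for w, xi in zip(col, x)]
                for col, dj in zip(w_ih, d_h)]
        return w_ih, w_ho, o_k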
LEARNING LAWS (ALGORITHMS)
LEARNING RATE Constant:

✓ Most learning functions have some provision for a learning rate
(learning constant).

✓ Usually this term is positive and between 0 and 1.

✓ If the learning rate is greater than 1, it is easy for the learning
algorithm to overshoot in correcting the weights, and the network will
oscillate.

✓ Small values of the learning rate will not correct the current error as
quickly, but if small steps are taken in correcting errors, there is a
good chance of converging to the best minimum.
Flow diagram of the ANN with back propagation algorithm
Summary of Back-propagation Training Algorithm


https://www.youtube.com/watch?v=WZDMNM36PsM
Example 4:
Learning example for AND gate using Delta Rule:
Step activation function with threshold θ = 0.2; learning rate α = 0.1
(see the note below the table).

Pattern 1: (0 × 0.3) + (0 × (-0.1)) = 0, less than 0.2, output = 0 (target 0)
Pattern 2: (0 × 0.3) + (1 × (-0.1)) = -0.1, less than 0.2, output = 0 (target 0)
Pattern 3: (1 × 0.3) + (0 × (-0.1)) = 0.3, greater than 0.2, output = 1 (target 0)
Weight update rule: wi ← wi + (α × xi × (t - o))
w1 = 0.3 + (0.1 × 1 × (0 - 1)) = 0.2
w2 = -0.1 + (0.1 × 0 × (0 - 1)) = -0.1
Pattern 4: (1 × 0.2) + (1 × (-0.1)) = 0.1, less than 0.2, output = 0 (target 1)
Weight update: …
The completed learning cycle, updating the weights until the desired output
is achieved, is shown in the table below.

Epoch | Inputs (x1 x2) | Desired output (Yd) | Initial weights (w1 w2) | Actual output (Y) | Error (e) | Final weights (w1 w2)

1 | 0 0 | 0 | 0.3 -0.1 | 0 |  0 | 0.3 -0.1
  | 0 1 | 0 | 0.3 -0.1 | 0 |  0 | 0.3 -0.1
  | 1 0 | 0 | 0.3 -0.1 | 1 | -1 | 0.2 -0.1
  | 1 1 | 1 | 0.2 -0.1 | 0 |  1 | 0.3  0.0
2 | 0 0 | 0 | 0.3  0.0 | 0 |  0 | 0.3  0.0
  | 0 1 | 0 | 0.3  0.0 | 0 |  0 | 0.3  0.0
  | 1 0 | 0 | 0.3  0.0 | 1 | -1 | 0.2  0.0
  | 1 1 | 1 | 0.2  0.0 | 1 |  0 | 0.2  0.0
3 | 0 0 | 0 | 0.2  0.0 | 0 |  0 | 0.2  0.0
  | 0 1 | 0 | 0.2  0.0 | 0 |  0 | 0.2  0.0
  | 1 0 | 0 | 0.2  0.0 | 1 | -1 | 0.1  0.0
  | 1 1 | 1 | 0.1  0.0 | 0 |  1 | 0.2  0.1
4 | 0 0 | 0 | 0.2  0.1 | 0 |  0 | 0.2  0.1
  | 0 1 | 0 | 0.2  0.1 | 0 |  0 | 0.2  0.1
  | 1 0 | 0 | 0.2  0.1 | 1 | -1 | 0.1  0.1
  | 1 1 | 1 | 0.1  0.1 | 1 |  0 | 0.1  0.1
5 | 0 0 | 0 | 0.1  0.1 | 0 |  0 | 0.1  0.1
  | 0 1 | 0 | 0.1  0.1 | 0 |  0 | 0.1  0.1
  | 1 0 | 0 | 0.1  0.1 | 0 |  0 | 0.1  0.1
  | 1 1 | 1 | 0.1  0.1 | 1 |  0 | 0.1  0.1

Threshold: θ = 0.2; learning rate: α = 0.1
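A short sketch that reproduces this table (delta-rule training of an AND
gate; variable names are our own):

    # Delta-rule training of an AND gate, reproducing the epoch table above.
    patterns = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    w = [0.3, -0.1]          # initial weights
    theta, lr = 0.2, 0.1     # threshold and learning rate (from the slides)

    for epoch in range(1, 6):
        for (x1, x2), t in patterns:
            net = round(x1 * w[0] + x2 * w[1], 10)  # round off float artifacts
            y = 1 if net >= theta else 0
            e = t - y
            w[0] += lr * x1 * e   # wi <- wi + a*xi*(t - o)
            w[1] += lr * x2 * e
        print(epoch, [round(v, 2) for v in w])
    # The weights converge to [0.1, 0.1], matching epoch 5 of the table.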
Example 5:

Input (X)     Weight (Input to Hidden)    Weight (Hidden to Output)
X1 = 0.8      X1 to H1 = 0.3              H1 to O1 = 0.6
X2 = 0.5      X1 to H2 = 0.4              H2 to O1 = 0.9
              X2 to H1 = 0.7
              X2 to H2 = 0.9
Learning rate α = 0.6
Target output = 1.0

[Figure: 2-2-1 network; X1 and X2 feed H1 and H2, which feed O1, with the
weights listed above]
1. Calculate the outputs of all neurons in the hidden layer:

H1 = (0.8 × 0.3) + (0.5 × 0.7) = 0.59, sigmoid(0.59) = 1/(1 + e^(-0.59)) = 0.6434
H2 = (0.8 × 0.4) + (0.5 × 0.9) = 0.77, sigmoid(0.77) = 0.6835

2. Calculate the outputs of (all) neuron(s) in the output layer:

O1 = (0.6434 × 0.6) + (0.6835 × 0.9) = 1.0012, sigmoid(1.0012) = 0.7313

3. Calculate the output error (with target t = 1.0):

δk = 0.7313 × (1 - 0.7313) × (1 - 0.7313) = 0.0528

4. Update the weights between the hidden and output layers:

W(H1-O1) = 0.6 + (0.6 × 0.6434 × 0.0528) = 0.6204
W(H2-O1) = 0.9 + (0.6 × 0.6835 × 0.0528) = 0.9217

5. Calculate the hidden errors, and
6. Update the weights between the input and hidden layers:

Error gradient δ(H1) = (0.0528 × 0.6) × ((1 - 0.6434) × 0.6434) = 0.0073
W(X1-H1) = 0.3 + (0.6 × 0.8 × 0.0073) = 0.3035
W(X2-H1) = 0.7 + (0.6 × 0.5 × 0.0073) = 0.7021

Error gradient δ(H2) = (0.0528 × 0.9) × ((1 - 0.6835) × 0.6835) = 0.0103
W(X1-H2) = 0.4 + (0.6 × 0.8 × 0.0103) = 0.4049
W(X2-H2) = 0.9 + (0.6 × 0.5 × 0.0103) = 0.9031

• Then start another epoch, and repeat until the error becomes sufficiently small.
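Using the bp_step sketch from the back-propagation section earlier in these
notes, Example 5's numbers can be reproduced (a verification, with our own
variable names):

    # Reproduce Example 5 with the bp_step sketch defined earlier.
    w_ih = [[0.3, 0.7],    # weights into H1 (from X1, X2)
            [0.4, 0.9]]    # weights into H2 (from X1, X2)
    w_ho = [0.6, 0.9]      # weights H1->O1, H2->O1
    w_ih, w_ho, o1 = bp_step([0.8, 0.5], w_ih, w_ho, target=1.0, lr=0.6)
    print(round(o1, 4))                              # 0.7313
    print([round(w, 4) for w in w_ho])               # [0.6204, 0.9217]
    print([[round(w, 4) for w in row] for row in w_ih])
    # [[0.3035, 0.7022], [0.4049, 0.9031]]
    # (0.7022 vs. the slides' 0.7021 is a rounding difference.)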


Artificial Neural Networks in Control
Neural networks have been applied very successfully in the identification
and control of many dynamic systems.
The universal approximation capability of the multilayer perceptron
makes it a popular choice for modeling nonlinear systems and for
implementing general-purpose nonlinear controllers.

There are three popular architectures for prediction and control that
have been implemented using neural networks:
1) Inverse Modelling and Control
2) Model Predictive Control
3) Model Reference Adaptive Control
Artificial Neural Networks in Control
There are typically two steps involved when using neural networks for
control:
✓ System Identification Stage
✓ Control Design Implementation
▪ In the system identification stage, we train and develop a neural
network to act as a model of the plant that we want to control.
▪ In the control design stage, we use the trained neural network model to
design (or fine-tune) the controller of the system.
▪ In each of the three control architectures stated earlier, the system
identification stage is identical.
▪ The control design stage, however, is different for each architecture.
1. Inverse Modelling and Control
There are two basic design approaches for inverse control:
1) Generalized Training (off-line):
• In this architecture, the input signal (u) is applied to the system input.
• The output signal (y) is obtained at the system output and forwarded to
the proposed neural network model, which produces a signal (uN).
• The difference between the incoming signal (u) and the neural model
output (uN) is the error (eN = u - uN), which is utilized for the neural
network learning to identify the inverse model of the system. A minimal
sketch of this training loop follows.
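A minimal sketch of generalized (off-line) inverse training, assuming a
simple static toy plant and a delta-rule-style parameter update (the plant
function and all names here are illustrative assumptions, not from the
slides):

    # Generalized inverse training: learn uN = g(y) so that g inverts the plant.
    import random

    def plant(u):
        return 2.0 * u + 1.0          # toy plant y = 2u + 1 (assumed)

    a, b = 0.0, 0.0                   # inverse-model parameters: uN = a*y + b
    lr = 0.01
    for _ in range(5000):
        u = random.uniform(-1, 1)     # excite the system with a random input
        y = plant(u)                  # measure the system output
        uN = a * y + b                # inverse neural model output
        eN = u - uN                   # training error eN = u - uN
        a += lr * eN * y              # delta-rule-style updates
        b += lr * eN
    print(round(a, 3), round(b, 3))   # approaches (0.5, -0.5), i.e. u = (y-1)/2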
1. Inverse Modelling and Control
2) Specialized Training (on-line):
• Now the already-trained inverse neural model is implemented in the
system and used as a controller for the process.
• The neural controller is fine-tuned and adjusted online by utilizing
the error (ec).
• The error (ec) is obtained as the difference between the desired
signal (yr) and the signal (y) that represents the actual system output
(ec = yr - y).
2. The Neural Network Predictive Controller
i. System Identification (off-line):
• The first stage of model predictive control is to train a neural network
to represent the forward dynamics of the plant.
• The prediction error between the plant output and the neural network
output (e = yp - ym) is used as the neural network training signal.
• The neural network plant model uses previous plant outputs to predict
future values of the plant output.
• This network can be trained offline in batch mode, using data collected
from the operation of the plant.
2. The Neural Network Predictive Controller
ii. Predictive Controller (on-line):
The following block diagram illustrates the model predictive control process.
The controller consists of the already-trained neural network plant model
and an optimization block.
The optimization block determines the values of the tentative control signal
(u') that minimize the cost computed from the predicted output (ym), and the
resulting optimal (u) is then applied as the input to the plant. A minimal
sketch of this optimization step follows.
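A minimal sketch of the on-line optimization block, assuming a one-step-ahead
neural plant model and a simple grid search over candidate controls (the toy
model function and all names are illustrative assumptions; a real
implementation would optimize over a prediction horizon):

    # Sketch of the MPC optimization block: pick u' minimizing predicted cost.
    def nn_plant_model(y_prev, u):
        # Stand-in for the trained neural network plant model (assumed).
        return 0.8 * y_prev + 0.5 * u

    def optimize_control(y_prev, y_ref, candidates):
        # Evaluate each tentative u' and keep the one with the smallest
        # predicted tracking cost (yr - ym)^2.
        best_u, best_cost = None, float("inf")
        for u in candidates:
            ym = nn_plant_model(y_prev, u)
            cost = (y_ref - ym) ** 2
            if cost < best_cost:
                best_u, best_cost = u, cost
        return best_u

    grid = [i / 100.0 for i in range(-200, 201)]   # u' in [-2, 2], step 0.01
    u_opt = optimize_control(y_prev=0.4, y_ref=1.0, candidates=grid)
    print(u_opt)   # ≈ 1.36, since 0.8*0.4 + 0.5*1.36 = 1.0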
3. The Model Reference Adaptive Control

[Block diagram: the reference input feeds both a reference model and the
neural-controller/system loop; the error e between the actual system output
and the reference-model output drives the learning algorithm]

Here, the neural controller is trained and fine-tuned by the learning
algorithm based on minimizing the error between the actual output of the
system and the output of a well-known reference model of the controlled
process, so that the closed-loop system behaves like the model (desired rise
time, overshoot, etc.).
Example of Using the NN Predictive Controller
This example shows the application of the NN predictive controller to a
catalytic continuous stirred tank reactor process (file name: predcstr).
[Figure: graph showing the plant output tracking the reference signal after
the neural network has been trained and has learned its task]
