Chapter 7 - Neural-Networks
Neural Networks
Compiled By: Bal Krishna Nyaupane
[email protected]
Basic Components of Biological Neurons
The brain is a collection of about 10
billion interconnected neurons.
Each neuron is a cell that uses
biochemical reactions to receive,
process and transmit information.
The majority of neurons encode
their activation or outputs as a series
of brief electrical pulses.
A neuron's dendritic tree is connected to a thousand neighbouring neurons. When one of those neurons fires, a positive or negative charge is received by one of the dendrites. The strengths of all the received charges are added together through the processes of spatial and temporal summation.
The neuron’s cell body (soma) processes the incoming activations and converts them into output
activations.
The neuron’s nucleus contains the genetic material (DNA)
Dendrites are fibers which emanate from the cell body and provide the receptive zone that receive
activation from other neurons.
Axons are fibers acting as transmission lines that send action potentials to other neurons.
Each terminal button is connected to other neurons across a small gap called a synapse. The synapses allow signal transmission between the axons and the dendrites.
Biological NN      Artificial NN
Soma               Neuron
Dendrite           Input
Axon               Output
Synapse            Weight
Introduction to Neural Networks
McCulloch & Pitts (1943) are generally recognised as the designers of the first neural network.
The inventor of the first neurocomputer, Dr. Robert Hecht-Nielsen, defines a neural network as:
a computing system made up of a number of simple, highly interconnected processing elements,
which process information by their dynamic state response to external inputs.
An Artificial Neural Network (ANN) is an information processing paradigm that is inspired by
the biological nervous systems, such as the human brain’s information processing mechanism.
An artificial network consists of a pool of simple processing units which communicate by
sending signals to each other over a large number of weighted connections.
Artificial Neural Networks (ANNs) are networks of Artificial Neurons and hence constitute
crude approximations to parts of real brains.
From a computer's point of view, an ANN is just a parallel computational system consisting of many
simple processing elements connected together in a specific way in order to perform a particular
task.
An Artificial Neural Network is composed of a large number of highly interconnected
processing elements (neurons) working in unison to solve specific problems. NNs, like people,
learn by example.
Neural networks are a powerful technique to solve many real world problems. They have the
ability to learn from experience in order to improve their performance and to adapt themselves
to changes in the environment. In addition, they are able to deal with incomplete
information or noisy data and can be very effective especially in situations where it is not
possible to define the rules or steps that lead to the solution of a problem.
Why use Artificial Neural Networks?
• They are extremely powerful computational devices.
• Massive parallelism makes them very efficient
• They can learn and generalize from training data, so there is no need for enormous feats of
programming
• They are particularly fault tolerant.
• They are very noise tolerant, so they can cope with situations where normal symbolic
systems would have difficulty
• In principle, they can do anything a symbolic/logic system can do, and more.
• They can perform tasks that a linear program cannot perform.
• Widely applied in data classification, clustering, pattern recognition.
Neural Network Applications
Brain modelling
• Aid our understanding of how the brain works, how behavior emerges from the
interaction of networks of neurons, what needs to “get fixed” in brain damaged
patients.
Real world applications
• Financial modelling – predicting the stock market
• Time series prediction – climate, weather
• Computer games – intelligent agents, chess, backgammon (A board game for two
players; pieces move according to throws of the dice)
• Robotics – autonomous adaptable robots
• Pattern recognition – speech recognition, seismic activity, sonar signals (acoustic
pulse in water and measures distances in terms of the time for the echo of the pulse to
return)
• Data analysis – data compression, data mining
• Bioinformatics – DNA sequencing, alignment
Learning Processes in Neural Networks
A neural network has the ability to learn from its environment, and to improve its performance
through learning. The improvement in performance takes place over time in accordance with
some prescribed measure.
A neural network learns about its environment through an iterative process of adjustments
applied to its synaptic weights and thresholds. The network becomes more knowledgeable
about its environment after each iteration of the learning process.
There are three broad types of learning:
1. Supervised learning (i.e. learning with an external teacher)
2. Unsupervised learning (i.e. learning with no help)
3. Reinforcement learning (i.e. learning with limited feedback)
Supervised learning
• Supervised learning is the machine learning task of inferring a function from training data,
where the training data consist of a set of training examples; i.e. a supervised learning algorithm
analyzes the training data and produces an inferred function, which can be used for mapping
new examples.
• In supervised learning, the variables under investigation can be split into two groups: explanatory
variables and one (or more) dependent variables. The target of the analysis is to specify
a relationship between the explanatory variables and the dependent variable as it is done in
regression analysis.
• In supervised training, both the inputs and the outputs are provided. The network then processes
the inputs and compares its resulting outputs against the desired outputs.
• Errors are then propagated back through the system, causing the system to adjust the weights
which control the network. This process occurs over and over as the weights are continually
tweaked.
• The set of data which enables the training is called the training set. During the training of a
network the same set of data is processed many times as the connection weights are continually
refined.
• Used for: classification, regression
Unsupervised learning
• Unsupervised learning is the task of finding hidden structure in unlabeled data. Since the
examples given to the learner are unlabeled, there is no error or reward signal to evaluate a
potential solution.
• In unsupervised learning situations all variables are treated in the same way, there is no
distinction between explanatory and dependent variables.
• In unsupervised training, the network is provided with inputs but not with desired outputs.
The system itself must then decide what features it will use to group the input data. This is
often referred to as self-organization or adaptation.
• The most common unsupervised learning method is cluster analysis, which is used for
exploratory data analysis to find hidden patterns or grouping in data.
• Used for: clustering
Reinforcement learning
• Reinforcement learning: the agent acts on its environment and receives some evaluation of its
action (reinforcement), but is not told which action is the correct one to achieve its goal.
• It allows machines and software agents to automatically determine the ideal behaviour
within a specific context, in order to maximize its performance. Simple reward feedback is
required for the agent to learn its behaviour; this is known as the reinforcement signal.
• In the reinforcement learning, the learner receives feedback about the appropriateness of its
response. For correct responses, reinforcement learning resembles supervised learning.
• However, the two forms of learning differ significantly for errors, situations in which the
learner's behavior is in some way inappropriate. In these situations, supervised learning lets
the learner know exactly what it should have done, whereas reinforcement learning only
says that the behavior was inappropriate and (usually) how inappropriate it was.
• Consider an animal that has to learn some aspects of how to walk. It tries out various
movements. Some work -- it moves forward -- and it is rewarded. Others fail -- it stumbles
or falls down -- and it is punished with pain.
McCulloch-Pitts (M-P) Neuron Equation
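The equation for this slide did not survive extraction. As a sketch of the classical McCulloch-Pitts model: the neuron fires (outputs 1) when the weighted sum of its binary inputs reaches a threshold θ, and outputs 0 otherwise. The helper names below are illustrative:

```python
def mp_neuron(inputs, weights, theta):
    """McCulloch-Pitts neuron: fire (1) iff the weighted sum reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= theta else 0

# With unit weights, the threshold alone selects the logic function:
AND = lambda x1, x2: mp_neuron([x1, x2], [1, 1], theta=2)
OR  = lambda x1, x2: mp_neuron([x1, x2], [1, 1], theta=1)
```

For example, AND(1, 1) fires while AND(1, 0) does not, since only the first input pattern reaches the threshold of 2.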
Artificial Neuron - Basic Elements
Activation Function
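The figures for the activation-function slides were lost in extraction. As a minimal sketch, the functions typically listed here are the binary hard limiter (step), the bipolar hard limiter (sign), the sigmoid, and the linear activation; the function names below are illustrative:

```python
import math

def step(x):
    """Binary hard limiter: 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0

def sign(x):
    """Bipolar hard limiter: +1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1

def sigmoid(x):
    """Smooth squashing function with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def linear(x):
    """Identity activation: output equals the neuron input."""
    return x
```

The hard limiters appear in the perceptron and Hopfield sections below; the sigmoid is the activation assumed in the back-propagation section.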
Types of Layers in ANN
The input layer
• Introduces input values into the network.
• No activation function or other processing.
The hidden layer(s)
• Perform classification of features.
• In principle, two hidden layers are sufficient to solve any problem.
• In practice, more layers may make particular features easier to learn.
The output layer
• Functionally just like the hidden layers
• Outputs are passed on to the world outside the neural network.
Types/Architectures/Structures/Topologies of Neural Network
Single layer feed forward Network
• The single layer feed forward network consists of a single layer of weights, where the inputs are
directly connected to the outputs via a series of weights. The synaptic links carrying weights
connect every input to every output, but not the other way around. In this sense it is considered a
network of feed-forward type.
• The sum of the products of the weights and the inputs is calculated in each neuron node, and if the
value is above some threshold (typically 0) the neuron fires and takes the activated value (typically
1); otherwise it takes the deactivated value (typically -1).
• For example, a simple Perceptron.
Multi-layer feed forward Network
• One input layer, one output layer, and one or more hidden layers of processing units. The hidden layers sit in
between the input and output layers, and are thus hidden from the outside world. The computational
units of the hidden layers are known as hidden neurons.
• The hidden layer does intermediate computation before directing the input to output layer. The input layer
neurons are linked to the hidden layer neurons.
• A multi-layer feedforward network with l input neurons, m1 neurons in the first hidden layer, m2
neurons in the second hidden layer, and n neurons in the output layer is written as (l - m1 - m2 - n).
• For example, a Multi-Layer Perceptron.
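To make the (l - m1 - m2 - n) notation concrete, here is a sketch of a forward pass through a small (2 - 2 - 2 - 1) network. The weight values are arbitrary illustrative numbers, and a sigmoid activation is assumed for every computational neuron:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    """One layer: each neuron applies sigmoid to its weighted sum of the inputs."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, ws))) for ws in weights]

def forward(x, network):
    """Propagate the input vector through each layer in turn."""
    for weights in network:
        x = layer(x, weights)
    return x

# A (2 - 2 - 2 - 1) network: 2 inputs, two hidden layers of 2 neurons, 1 output.
net = [
    [[0.5, -0.4], [0.3, 0.8]],   # first hidden layer: 2 neurons, 2 inputs each
    [[0.2, 0.7], [-0.6, 0.1]],   # second hidden layer
    [[1.0, -1.0]],               # output layer: 1 neuron
]
y = forward([1.0, 0.0], net)     # a single sigmoid output in (0, 1)
```

The nested list structure mirrors the notation: one inner list of weights per neuron, one outer list per layer.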
Recurrent Network
• A recurrent network has at least one feedback loop.
• There could be neurons with a self-feedback loop; that is, the output of a neuron is fed back
into itself as input.
The Perceptron
First studied in the late 1950s.
Also known as Layered Feed-Forward Networks.
The operation of Rosenblatt’s perceptron is based on the McCulloch and Pitts neuron model.
The model consists of a linear combiner followed by a hard limiter. The weighted sum of the
inputs is applied to the hard limiter, which produces an output equal to +1 if its input is positive
and -1 if it is negative.
Figure: Single-layer two-input perceptron. Inputs x1 and x2, weighted by w1 and w2, feed a linear
combiner followed by a hard limiter (with a threshold), which produces the output Y.
The perceptron learning rule
w_i(p+1) = w_i(p) + a · x_i(p) · e(p)
where p = 1, 2, 3, . . .
a is the learning rate, a positive constant less than unity.
The perceptron learning rule was first proposed by Rosenblatt in 1960. Using this rule we can
derive the perceptron training algorithm for classification tasks.
Step 2: Activation
• Activate the perceptron by applying inputs x1(p), x2(p),…, xn(p) and desired output Yd (p).
• Calculate the actual output at iteration p = 1:

  Y(p) = step[ Σ_{i=1}^{n} x_i(p) · w_i(p) − θ ]

• where n is the number of the perceptron inputs, θ is the threshold, and step is a step activation function.
• If at iteration p, the actual output is Y(p) and the desired output is Yd (p), then the error
is given by:
e(p) = Yd(p) − Y(p)     where p = 1, 2, 3, . . .
Perceptron’s training algorithm
Step 3: Weight training
• Update the weights of the perceptron:
  w_i(p+1) = w_i(p) + Δw_i(p)
  where Δw_i(p) = a · x_i(p) · e(p) is the weight correction at iteration p.
Step 4: Iteration
Increase iteration p by one, go back to Step 2 and repeat the process until
convergence.
Example of perceptron learning: the logical operation AND
Threshold: θ = 0.2; learning rate: a = 0.1

        Inputs   Desired   Initial     Actual   Error   Final
Epoch   x1  x2   output    weights     output           weights
                   Yd      w1    w2      Y       e      w1    w2
  1     0   0      0       0.3  -0.1     0       0      0.3  -0.1
        0   1      0       0.3  -0.1     0       0      0.3  -0.1
        1   0      0       0.3  -0.1     1      -1      0.2  -0.1
        1   1      1       0.2  -0.1     0       1      0.3   0.0
  2     0   0      0       0.3   0.0     0       0      0.3   0.0
        0   1      0       0.3   0.0     0       0      0.3   0.0
        1   0      0       0.3   0.0     1      -1      0.2   0.0
        1   1      1       0.2   0.0     1       0      0.2   0.0
  3     0   0      0       0.2   0.0     0       0      0.2   0.0
        0   1      0       0.2   0.0     0       0      0.2   0.0
        1   0      0       0.2   0.0     1      -1      0.1   0.0
        1   1      1       0.1   0.0     0       1      0.2   0.1
  4     0   0      0       0.2   0.1     0       0      0.2   0.1
        0   1      0       0.2   0.1     0       0      0.2   0.1
        1   0      0       0.2   0.1     1      -1      0.1   0.1
        1   1      1       0.1   0.1     1       0      0.1   0.1
  5     0   0      0       0.1   0.1     0       0      0.1   0.1
        0   1      0       0.1   0.1     0       0      0.1   0.1
        1   0      0       0.1   0.1     0       0      0.1   0.1
        1   1      1       0.1   0.1     1       0      0.1   0.1
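The training run in the table can be reproduced in a few lines. This sketch follows the learning rule w_i(p+1) = w_i(p) + a·x_i(p)·e(p) with θ = 0.2, a = 0.1, and initial weights (0.3, -0.1); the function name is illustrative:

```python
def step(x):
    return 1 if x >= 0 else 0

def train_and(theta=0.2, a=0.1, epochs=5):
    w1, w2 = 0.3, -0.1                                     # initial weights
    data = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]    # (x1, x2, Yd)
    for _ in range(epochs):
        for x1, x2, yd in data:
            y = step(x1 * w1 + x2 * w2 - theta)   # actual output
            e = yd - y                            # error e = Yd - Y
            w1 = round(w1 + a * x1 * e, 2)        # perceptron learning rule;
            w2 = round(w2 + a * x2 * e, 2)        # rounding avoids float drift
    return w1, w2
```

After five epochs this returns (0.1, 0.1), the converged weights in the last row of the table; those weights fire only for the input (1, 1), implementing AND.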
Perceptron: Linear separability
The single layer perceptron algorithm converges if examples are linearly separable.
A single layer perceptron can only learn linearly separable concepts.
A single layer perceptron can learn the operations AND, OR, and NOT , but not Exclusive-OR.
Figure: linear separability of the AND and OR operations in the two-dimensional input space.
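One way to see why Exclusive-OR resists a single-layer perceptron is to search a grid of weights and thresholds for any (w1, w2, θ) that reproduces a given truth table. The grid below is illustrative, and the search is a sanity check rather than a proof:

```python
def step(x):
    return 1 if x >= 0 else 0

def realizable(truth_table):
    """Brute-force search for (w1, w2, theta) reproducing the truth table."""
    grid = [i / 10 for i in range(-20, 21)]      # -2.0 .. 2.0 in steps of 0.1
    for w1 in grid:
        for w2 in grid:
            for theta in grid:
                if all(step(x1 * w1 + x2 * w2 - theta) == y
                       for x1, x2, y in truth_table):
                    return True
    return False

AND = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
XOR = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
```

The search finds weights for AND (for example w1 = w2 = 0.1, θ = 0.2) but none for XOR, since no single line can separate the two XOR classes.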
Multilayer Perceptron Neural Networks
A multilayer perceptron is a feedforward neural network with one or more hidden layers.
The network consists of an input layer of source neurons, at least one middle or hidden layer
of computational neurons, and an output layer of computational neurons.
The input signals are propagated in a forward direction on a layer-by-layer basis.
Figure: Multilayer perceptron with two hidden layers
Backpropagation Algorithm
In a back-propagation neural network, the learning algorithm has two phases.
First, a training input pattern is presented to the network input layer. The network propagates the
input pattern from layer to layer until the output pattern is generated by the output layer.
If this pattern is different from the desired output, an error is calculated and then propagated
backwards through the network from the output layer to the input layer. The weights are modified as
the error is propagated.
Figure: Three-layer back-propagation neural network. Inputs x1, …, xn feed hidden neurons j through
weights wij; hidden neurons feed output neurons k through weights wjk, producing outputs y1, …, yl.
Error signals propagate in the opposite direction.
The Back-Propagation Algorithm
Step 1: Initialization
Set all the weights and threshold levels of the network to random
numbers uniformly distributed inside a small range:
  ( −2.4/F_i , +2.4/F_i )
where F_i is the total number of inputs of neuron i in the network.
(a) Calculate the error gradient for the neurons in the output layer:
  δ_k(p) = y_k(p) · [1 − y_k(p)] · e_k(p)
  where e_k(p) = y_d,k(p) − y_k(p)
Calculate the weight corrections:
  Δw_jk(p) = a · y_j(p) · δ_k(p)
Update the weights at the output neurons:
  w_jk(p+1) = w_jk(p) + Δw_jk(p)
(b) Calculate the error gradient for the neurons in the hidden layer:
  δ_j(p) = y_j(p) · [1 − y_j(p)] · Σ_{k=1}^{l} δ_k(p) · w_jk(p)
Calculate the weight corrections:
  Δw_ij(p) = a · x_i(p) · δ_j(p)
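The gradient and weight-correction formulas can be collected into a single backward-pass step. This sketch assumes one hidden layer, sigmoid activations in both layers (so y(1 − y) is the activation derivative), and illustrative argument names:

```python
def backprop_step(x, y_hidden, y_out, y_desired, w_hidden_out, a=0.1):
    """One backward pass over a single-hidden-layer sigmoid network.
    w_hidden_out[j][k] is the weight from hidden neuron j to output neuron k."""
    # (a) output layer: delta_k = y_k (1 - y_k) e_k, with e_k = y_dk - y_k
    delta_k = [yk * (1 - yk) * (yd - yk) for yk, yd in zip(y_out, y_desired)]
    # weight corrections Dw_jk = a * y_j * delta_k
    dw_jk = [[a * yj * dk for dk in delta_k] for yj in y_hidden]
    # (b) hidden layer: delta_j = y_j (1 - y_j) * sum_k delta_k * w_jk
    delta_j = [yj * (1 - yj) * sum(dk * wjk for dk, wjk in zip(delta_k, w_jk))
               for yj, w_jk in zip(y_hidden, w_hidden_out)]
    # weight corrections Dw_ij = a * x_i * delta_j
    dw_ij = [[a * xi * dj for dj in delta_j] for xi in x]
    return delta_k, dw_jk, delta_j, dw_ij
```

Each returned list maps one-to-one onto the formulas above: the output-layer gradients, the hidden-to-output corrections, the hidden-layer gradients, and the input-to-hidden corrections.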
Now the actual output of neuron 5 in the output layer is determined.
The next step is weight training. To update the weights and threshold levels in
our network, we propagate the error, e, from the output layer backward to the
input layer.
First, we calculate the error gradient for neuron 5 in the output layer: δ5 = −0.1274.
Then we determine the weight corrections assuming that the learning rate
parameter, a, is equal to 0.1:
  Δw35 = a · y3 · δ5 = 0.1 × 0.5250 × (−0.1274) = −0.0067
  Δw45 = a · y4 · δ5 = 0.1 × 0.8808 × (−0.1274) = −0.0112
  Δθ5  = a · (−1) · δ5 = 0.1 × (−1) × (−0.1274) = 0.0127
Next we calculate the error gradients for neurons 3 and 4 in the hidden layer:
  δ3 = y3 · (1 − y3) · δ5 · w35 = 0.5250 × (1 − 0.5250) × (−0.1274) × (−1.2) = 0.0381
  δ4 = y4 · (1 − y4) · δ5 · w45 = 0.8808 × (1 − 0.8808) × (−0.1274) × 1.1 = −0.0147
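Assuming the example's values (y3 = 0.5250, y4 = 0.8808, δ5 = −0.1274, w35 = −1.2, w45 = 1.1, a = 0.1), the arithmetic can be checked directly:

```python
a, y3, y4, d5 = 0.1, 0.5250, 0.8808, -0.1274
w35, w45 = -1.2, 1.1

dw35 = a * y3 * d5              # weight correction for w35: -0.0067
dw45 = a * y4 * d5              # weight correction for w45: -0.0112
dtheta5 = a * (-1) * d5         # threshold correction:       0.0127
d3 = y3 * (1 - y3) * d5 * w35   # hidden gradient, neuron 3:  0.0381
d4 = y4 * (1 - y4) * d5 * w45   # hidden gradient, neuron 4: -0.0147
```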
Step 3: Use the delta rule to update the bias and weights.
Step 4: Stop if the largest weight change across all the training samples is less
than a specified tolerance; otherwise cycle through the training set again.
The Learning Rate, a
The performance of an ADALINE neuron depends heavily on the choice of the
learning rate
• if it is too large the system will not converge
• if it is too small the convergence will take too long
Typically, a is selected by trial and error
• typical range: 0.01 < a < 10.0
• often start at 0.1
• sometimes it is suggested that:
  0.1 < n·a < 1.0
  where n is the number of inputs
Example: Construct an AND function for an ADALINE neuron with a = 0.1.
Activation Function
Neuron input: y_in = b + Σ x_i·w_i
Output: y = −1 if y_in < 0; y = +1 if y_in ≥ 0
Continue to cycle through the four training inputs until the largest change in the weights over a
complete cycle is less than some small number (say 0.01).
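The cycle just described can be sketched with the delta rule Δw_i = a·(t − y_in)·x_i on the AND data. Bipolar coding of the inputs and targets (values in {−1, +1}) is the usual ADALINE convention and is assumed here; an epoch cap is also added, because with a constant learning rate the per-cycle weight change may never fall below the tolerance:

```python
def train_adaline(a=0.1, tol=0.01, max_epochs=50):
    """ADALINE trained by the delta rule on the bipolar AND function."""
    # bipolar AND: target +1 only for input (+1, +1)
    data = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
    w1 = w2 = b = 0.0
    for _ in range(max_epochs):
        largest = 0.0
        for (x1, x2), t in data:
            y_in = b + x1 * w1 + x2 * w2          # neuron input
            e = t - y_in                          # delta rule uses the raw error
            dw1, dw2, db = a * e * x1, a * e * x2, a * e
            w1, w2, b = w1 + dw1, w2 + dw2, b + db
            largest = max(largest, abs(dw1), abs(dw2), abs(db))
        if largest < tol:                         # the stopping test from the text
            break
    return w1, w2, b
```

The trained neuron then classifies AND through the hard limiter: y = +1 if b + x1·w1 + x2·w2 ≥ 0, else −1.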
Hopfield Neural Network
A Hopfield network is a form of recurrent artificial neural network popularized by John
Hopfield in 1982, but described earlier by Little in 1974.
A recurrent neural network has feedback loops from its outputs to its inputs. The presence of
such loops has a profound impact on the learning capability of the network.
Figure: Hopfield network with n neurons (inputs x1, x2, …, xn; outputs y1, y2, …, yn).

The 3 × 3 identity matrix I is

      | 1  0  0 |
  I = | 0  1  0 |
      | 0  0  1 |
Thus, we can now determine the weight matrix. For the two fundamental memories
Y1 = (1, 1, 1) and Y2 = (−1, −1, −1):

                               | 0  2  2 |
  W = Y1·Y1^T + Y2·Y2^T − 2I = | 2  0  2 |
                               | 2  2  0 |
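This construction can be checked numerically. The sketch below assumes the storage rule W = Y1·Y1^T + Y2·Y2^T − 2I and a sign activation, and verifies that both fundamental memories are stable states:

```python
def outer(u, v):
    return [[a * b for b in v] for a in u]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def sign(x):
    return 1 if x >= 0 else -1

Y1, Y2 = [1, 1, 1], [-1, -1, -1]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
minus2I = [[-2 * e for e in row] for row in I3]

# Storage rule: W = Y1 Y1^T + Y2 Y2^T - 2I  (symmetric, zero diagonal)
W = mat_add(mat_add(outer(Y1, Y1), outer(Y2, Y2)), minus2I)

# Both fundamental memories are stable: sign(W y) returns the pattern itself.
stable1 = [sign(v) for v in mat_vec(W, Y1)] == Y1
stable2 = [sign(v) for v in mat_vec(W, Y2)] == Y2
```

Subtracting 2I (one I per stored pattern) zeroes the diagonal, so no neuron feeds back directly into itself.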
The Kohonen network
The Kohonen model provides a topological mapping. It places a fixed number of input
patterns from the input layer into a higher-dimensional output or Kohonen layer.
Training in the Kohonen network begins with the winner’s neighbourhood of a fairly large size.
Then, as training proceeds, the neighbourhood size gradually decreases.
Figure: Kohonen network with two input-layer neurons (x1, x2) fully connected to three
output-layer (Kohonen) neurons (y1, y2, y3).
where xi and wij are the ith elements of the vectors X and Wj, respectively.
To identify the winning neuron, jX, that best matches the input vector X, we may apply the
following condition:
  j_X = min_j ‖ X − W_j ‖ ,   j = 1, 2, . . ., m
The weight vector W3 of the winning neuron 3 becomes closer to the input vector X with
each iteration.
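One competitive-learning iteration can be sketched as follows: find the winning neuron by the minimum-distance condition above, then move its weight vector toward X. The learning rate, the input vector, and the three weight vectors below are illustrative values:

```python
import math

def euclidean(x, w):
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))

def kohonen_step(x, weights, a=0.1):
    """One iteration: the winner (minimum distance to x) moves toward x."""
    jx = min(range(len(weights)), key=lambda j: euclidean(x, weights[j]))
    weights[jx] = [wi + a * (xi - wi) for xi, wi in zip(x, weights[jx])]
    return jx

X = [0.52, 0.12]
W = [[0.27, 0.81], [0.42, 0.70], [0.43, 0.21]]   # one weight vector per output neuron
winner = kohonen_step(X, W)   # the third neuron (index 2) wins and moves toward X
```

Only the winner's weight vector is updated; repeating such steps while shrinking the neighbourhood and learning rate gives the training behaviour described above.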