Neural Networks NN 1
Course Outline
The course covers basic topics in neural network
theory and their application to supervised and
unsupervised learning.
What are Neural Networks?
• Simple computational elements forming a
large network
– Emphasis on learning (pattern recognition)
– Local computation (neurons)
History
• Roots of work on NNs are in:
• Neurobiological studies (more than a century ago):
– How do nerves behave when stimulated by different magnitudes
of electric current? Is there a minimal threshold needed for
nerves to be activated? Given that no single nerve cell is long
enough, how do different nerve cells communicate with each
other?
• Psychological studies:
– How do animals learn, forget, recognize and perform other types
of tasks?
– Psycho-physical experiments helped to understand how individual
neurons and groups of neurons work.
• McCulloch and Pitts introduced the first mathematical model of a
single neuron, widely applied in subsequent work.
History
Prehistory:
• Golgi and Ramon y Cajal study the nervous system and discover
neurons (end of 19th century)
History (brief):
• McCulloch and Pitts (1943): the first artificial neural network with
binary neurons
• Hebb (1949): learning = neurons that fire together wire together
• Minsky (1954): neural networks for reinforcement learning
• Taylor (1956): associative memory
• Rosenblatt (1958): perceptron, a single neuron for supervised
learning
History
• Widrow and Hoff (1960): Adaline
• Minsky and Papert (1969): limitations of single-layer perceptrons (and
they erroneously claimed that the limitations hold for multi-layer
perceptrons)
Stagnation in the 70's:
• Individual researchers continue laying foundations
• von der Malsburg (1973): competitive learning and self-organization
Big neural-nets boom in the 80's
• Grossberg: adaptive resonance theory (ART)
• Hopfield: Hopfield network
• Kohonen: self-organising map (SOM)
Course Topics
Learning Tasks

Supervised
– Data: labeled examples (input, desired output)
– Tasks: classification, pattern recognition, regression
– NN models: perceptron, Adaline, feed-forward NN,
radial basis functions, support vector machines

Unsupervised
– Data: unlabeled examples (different realizations of the input)
– Tasks: clustering, content-addressable memory
– NN models: self-organizing maps (SOM), Hopfield networks
NNs: goal and design
– Knowledge about the learning task is given in the
form of a set of examples (a dataset) called training
examples.
– A NN is specified by:
• an architecture: a set of neurons and links connecting
the neurons, where each link has a weight,
• a neuron model: the information processing unit of the
NN,
• a learning algorithm: used for training the NN by
modifying the weights in order to solve the particular
learning task correctly on the training examples.
The aim is to obtain a NN that generalizes well, that
is, that behaves correctly on new examples of the
learning task.
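The three ingredients above can be sketched for the simplest case: a single perceptron trained on labeled examples. This is a minimal illustration, not the course's reference implementation; the dataset (logical AND) and the learning rate are illustrative choices.

```python
# Minimal sketch of the three ingredients of a NN:
# architecture = one neuron with weights and a bias,
# neuron model = step activation on the weighted sum,
# learning algorithm = the perceptron rule on training examples.

def predict(weights, bias, x):
    """Step-activation neuron: fires if the weighted sum plus bias is >= 0."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s >= 0 else 0

def train_perceptron(examples, epochs=20, eta=0.1):
    """Adjust weights and bias on each misclassified training example."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, target in examples:
            error = target - predict(weights, bias, x)
            weights = [w + eta * error * xi for w, xi in zip(weights, x)]
            bias += eta * error
    return weights, bias

# Logical AND is linearly separable, so the perceptron learns it exactly.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(data)
```

After training, the learned weights classify all four training examples correctly; whether the network also generalizes is exactly the question raised above.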
Example: ALVINN
• Autonomous driving at 70 mph
• Camera image of a public highway: 30x32 pixels as inputs
• 4 hidden units (30x32 weights into each of the four hidden units)
• 30 outputs for steering
Dimensions of a Neural Network
• network architectures
• types of neurons
• learning algorithms
• applications
Network architectures
Single Layer Feed-forward
Multi-layer feed-forward
Example: a 3-4-2 network (input layer of 3 units, hidden layer of 4 neurons, output layer of 2 neurons)
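A 3-4-2 network can be sketched as two successive layer computations. This is a minimal illustration; the weight and bias values below are illustrative placeholders, not trained values, and the sigmoid is one common choice of activation.

```python
import math

# Sketch of one forward pass through a 3-4-2 feed-forward network.
# All weight/bias values are illustrative placeholders.

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def layer(inputs, weights, biases):
    """Each neuron: weighted sum of the inputs plus bias, passed through sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# 4 hidden neurons with 3 input weights each; 2 output neurons with 4 weights each.
W_hidden = [[0.1, -0.2, 0.3], [0.4, 0.1, -0.1], [-0.3, 0.2, 0.1], [0.2, 0.2, 0.2]]
b_hidden = [0.0, 0.1, -0.1, 0.0]
W_out = [[0.5, -0.5, 0.3, 0.1], [-0.2, 0.4, 0.2, -0.3]]
b_out = [0.0, 0.0]

x = [1.0, 0.5, -1.0]                   # 3 inputs
hidden = layer(x, W_hidden, b_hidden)  # 4 hidden activations
output = layer(hidden, W_out, b_out)   # 2 outputs
```

The "3-4-2" name simply counts the units per layer; signals flow strictly forward, with no feedback links.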
Recurrent network
A recurrent network with hidden neurons: the unit-delay
operator z^-1 is used to model a dynamic system, feeding
delayed copies of unit outputs back as inputs.
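The effect of the unit delay can be sketched with a single recurrent neuron whose previous output is fed back as an extra input. The weights and the tanh activation here are illustrative assumptions, not values from the slides.

```python
import math

# Sketch: a recurrent neuron where the unit-delay operator z^-1
# feeds the previous output back as an extra input.
# Weight values (w_in, w_rec) are illustrative.

def step_recurrent(x_t, h_prev, w_in=0.8, w_rec=0.5, b=0.0):
    """New state depends on the current input AND the delayed previous state."""
    return math.tanh(w_in * x_t + w_rec * h_prev + b)

inputs = [1.0, 0.0, 0.0, 0.0]  # a single input pulse, then silence
h = 0.0
states = []
for x_t in inputs:
    h = step_recurrent(x_t, h)  # h_prev is the z^-1 delayed copy of h
    states.append(h)
# After the pulse ends, the state decays gradually through the feedback loop,
# which is what makes the network a dynamic system with memory.
```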
The Neuron
Diagram of a neuron: input values x1, ..., xm enter through links with
weights w1, ..., wm; a summing function combines them with the bias b
to produce the local field v; an activation function φ(·) then yields
the output y.
Input Signal and Weights

Input signals:
• An input may be either a raw/preprocessed signal or an
image; alternatively, some specific features can be used.
• If specific features are used as input, their number and
selection are crucial and application dependent.

Weights:
• Weights connect an input to a summing node and affect
the summing operation.
• The quality of the network can be seen from its weights.
• The bias is a constant input with its own weight.
• Usually the weights are randomized at the beginning.
The Neuron
• The neuron is the basic information processing unit of
a NN. It consists of:
1 A set of links, describing the neuron inputs, with
weights w1, w2, ..., wm
2 An adder function (linear combiner) for computing
the weighted sum of the inputs (real numbers):
  u = Σ_{j=1}^{m} wj xj
3 An activation function φ applied to the biased sum,
producing the output:
  y = φ(u + b)
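The three parts above translate directly into a few lines of code. This is a minimal sketch; the logistic sigmoid used for φ is one common choice (assumed here), and the input, weight, and bias values are illustrative.

```python
import math

# Sketch of the neuron model: weighted sum u, bias b, activation φ.
# φ is taken to be the logistic sigmoid; all numbers are illustrative.

def neuron(x, w, b):
    u = sum(wj * xj for wj, xj in zip(w, x))  # 2: adder (linear combiner)
    v = u + b                                 # induced local field
    return 1.0 / (1.0 + math.exp(-v))         # 3: y = φ(u + b)

y = neuron([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], b=-0.1)  # single output in (0, 1)
```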
Bias of a Neuron
• The bias b has the effect of applying an affine
transformation to the weighted sum u:
  v = u + b
• v is called the induced field of the neuron.
• Example: for u = x1 − x2, the lines x1 − x2 = −1,
x1 − x2 = 0 and x1 − x2 = 1 are parallel lines in the
(x1, x2) plane; the bias shifts which line the neuron uses
as its boundary.
Bias as extra input
• The bias is an external parameter of the neuron. It can be
modeled by adding an extra input x0 = +1 with weight w0 = b:
  v = Σ_{j=0}^{m} wj xj
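The equivalence of the two formulations is easy to check in code. This is a small sketch; the particular numbers are illustrative.

```python
# Sketch: absorbing the bias into the weights via an extra input x0 = +1.

def field_with_bias(x, w, b):
    """v = sum(w_j x_j, j = 1..m) + b"""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def field_augmented(x, w, b):
    """v = sum(w_j x_j, j = 0..m) with x0 = +1 and w0 = b"""
    x_aug = [1.0] + list(x)  # x0 = +1
    w_aug = [b] + list(w)    # w0 = b
    return sum(wj * xj for wj, xj in zip(w_aug, x_aug))

x, w, b = [0.5, -1.0], [0.4, 0.3], 0.2
# Both formulations give the same induced field v (up to rounding).
```

Treating the bias as just another weight lets a learning algorithm update it with the same rule as the other weights.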
Activation Function
There are different activation functions used in different applications. The
most common ones are:
• Step function:
  φ(v) = 1 if v ≥ 0; 0 if v < 0
• Piecewise-linear function:
  φ(v) = 1 if v ≥ 1/2; v if −1/2 < v < 1/2; 0 if v ≤ −1/2
• Hyperbolic tangent:
  φ(v) = tanh(v)
• Sigmoid function:
  φ(v) = 1 / (1 + exp(−a v))
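The four functions above can be written out directly (a sketch; the sigmoid slope defaults to a = 1, and tanh comes from the standard library):

```python
import math

# The four common activation functions listed above.

def step(v):
    return 1.0 if v >= 0 else 0.0

def piecewise_linear(v):
    if v >= 0.5:
        return 1.0
    if v <= -0.5:
        return 0.0
    return v  # φ(v) = v between -1/2 and 1/2

def sigmoid(v, a=1.0):
    return 1.0 / (1.0 + math.exp(-a * v))

# hyperbolic tangent: math.tanh(v)
```

The step function gives hard binary decisions, while tanh and the sigmoid are smooth, which matters later for gradient-based learning.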
Neuron Models
• The choice of the activation function φ determines the neuron model. Examples:
• Step function:
  φ(v) = a if v < c; b if v ≥ c
• Ramp function:
  φ(v) = a if v ≤ c; b if v ≥ d; a + (v − c)(b − a)/(d − c) otherwise
• Sigmoid function (with parameters x, y, z):
  φ(v) = z + 1 / (1 + exp(−x v + y))
• Gaussian function (with mean μ and standard deviation σ):
  φ(v) = (1 / (√(2π) σ)) exp(−(1/2) ((v − μ)/σ)²)
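The parameterized models above can be sketched as follows; all default parameter values are illustrative choices, not values from the slides.

```python
import math

# Parameterized neuron models from the list above (defaults are illustrative).

def step(v, a=0.0, b=1.0, c=0.0):
    return a if v < c else b

def ramp(v, a=0.0, b=1.0, c=-1.0, d=1.0):
    if v <= c:
        return a
    if v >= d:
        return b
    return a + (v - c) * (b - a) / (d - c)  # linear interpolation on (c, d)

def sigmoid(v, x=1.0, y=0.0, z=0.0):
    return z + 1.0 / (1.0 + math.exp(-x * v + y))

def gaussian(v, mu=0.0, sigma=1.0):
    return (1.0 / (math.sqrt(2 * math.pi) * sigma)) \
        * math.exp(-0.5 * ((v - mu) / sigma) ** 2)
```

Note how the extra parameters generalize the earlier functions: the step with a = 0, b = 1, c = 0 is exactly the common step function, and the sigmoid with x = a, y = 0, z = 0 is the common sigmoid.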
Learning Algorithms
Applications
• Classification:
– Image recognition
– Speech recognition
– Diagnosis
– Fraud detection
– …
• Regression:
– Forecasting (prediction on base of past history)
– …
• Pattern association:
– Retrieve an image from a corrupted one
– …
• Clustering:
– client profiles
– disease subtypes
– …
Taxonomy of NN models:
• Linear classifiers: Perceptron, Adaline
• Non-linear classifiers: Feed-forward networks, Radial basis function networks
• Unsupervised learning