Notes Lecture 17 : Auto-associative Memory - Hopfield Model


• Auto-associative Memory Model - Hopfield model (single layer)

Auto-associative memory means that patterns, rather than associated pattern pairs, are stored in memory. The Hopfield model is a one-layer unidirectional auto-associative memory.
Fig. Hopfield model with four units (alternate view) : a single layer of four neurons (1, 2, 3, 4); each neuron receives an external input Ii, produces an output Vi, and is connected to every other neuron through connection weights wij.

 the model consists of a single layer of processing elements, where each unit is connected to every other unit in the network but not to itself.

 the connection weight from neuron j to neuron i is given by a number wij. The collection of all such numbers is represented by the weight matrix W, which is square and symmetric, i.e., wij = wji for i, j = 1, 2, . . . , m.
 each unit has an external input Ij, which leads to a modification in the computation of the net input to the units as

   input j = Σ_{i=1}^{m} xi wij + Ij    for j = 1, 2, . . . , m,

   where xi is the i-th component of pattern Xk.

 each unit acts as both an input and an output unit. As in the linear associator, a single associated pattern pair is stored by computing the weight matrix as Wk = Xkᵀ Yk, where Xk = Yk.
 Weight Matrix : construction of the weight matrix W is accomplished by summing the individual correlation matrices, i.e., W = α Σ_{k=1}^{p} Wk, where α is the constant of proportionality (for normalizing), usually set to 1/p to store p different associated pattern pairs. Since the Hopfield model is an auto-associative memory model, it is the patterns, rather than associated pattern pairs, that are stored in memory.
 Decoding : After memorization, the network can be used for retrieval; the process of retrieving a stored pattern is called decoding. Given an input pattern X, decoding or retrieving is accomplished by first computing the net input as

   input j = Σ_{i=1}^{m} xi wij

   where input j stands for the weighted sum of the inputs (the activation value) of node j, for j = 1, 2, . . . , n, and xi is the i-th component of pattern Xk, and then determining the output of the units using a bipolar output function:

   Yj = +1  if input j ≥ θj
        -1  otherwise

   where θj is the threshold value of output neuron j.

Note : The output units behave like linear threshold units that compute a weighted sum of the inputs and produce -1 or +1 depending on whether the weighted sum is below or above a certain threshold value.

Decoding in the Hopfield model is achieved by a collective and


recursive relaxation search for a stored pattern given an initial
stimulus pattern. Given an input pattern X, decoding is accomplished
by computing the net input to the units and determining the output of
those units using the output function to produce the pattern X'. The
pattern X' is then fed back to the units as an input pattern to produce
the pattern X''. The pattern X'' is again fed back to the units to
produce the pattern X'''. The process is repeated until the network
stabilizes on a stored pattern where further computations do not
change the output of the units.
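The storage and decoding equations above can be put together in a short program. The following is a minimal NumPy sketch (function and variable names are illustrative, not from the lecture): it builds the weight matrix from bipolar patterns by summing outer products, then relaxes an input pattern until the output no longer changes, assuming thresholds θj = 0.

import numpy as np

def store(patterns):
    # weight matrix W = sum of outer products Xk^T Xk (auto-association: Yk = Xk)
    W = sum(np.outer(p, p) for p in patterns)
    # np.fill_diagonal(W, 0)  # uncomment to forbid self-connections, as in the strict Hopfield model
    return W

def decode(W, x, max_steps=20):
    # repeatedly compute the net input and apply the bipolar output function
    x = np.asarray(x)
    for _ in range(max_steps):
        net = x @ W                                        # net input to every unit
        x_new = np.where(net > 0, 1, np.where(net < 0, -1, x))  # keep old value when net input is zero
        if np.array_equal(x_new, x):                       # stable state reached
            break
        x = x_new
    return x

For example, with the three patterns used in the auto-correlator example later in these notes, decode(store([A1, A2, A3]), A2) returns A2 unchanged, since the stored pattern is a stable state.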
In the next section, the working of an auto-correlator is explained: how to store patterns, how to recall a pattern from the stored patterns, and how to recognize a noisy pattern.
• Bidirectional Associative Memory (two-layer)

Kosko (1988) extended the Hopfield model, which is a single-layer network, by incorporating an additional layer to perform recurrent auto-associations as well as hetero-associations on the stored memories. The network structure of the bidirectional associative memory model is similar to that of the linear associator, but the connections are bidirectional, i.e.,

wij = wji , for i = 1, 2, . . . , n and j = 1, 2, . . . , m.

Fig. Bidirectional Associative Memory model : an input layer of neurons (x1, . . . , xn) fully connected, through weights w11 . . . wnm, to an output layer of neurons (y1, . . . , ym); signals flow in both directions.

 In the bidirectional associative memory, a single associated pattern pair is stored by computing the weight matrix as Wk = Xkᵀ Yk.

 The construction of the connection weight matrix W, to store p different associated pattern pairs simultaneously, is accomplished by summing the individual correlation matrices Wk, i.e.,

   W = α Σ_{k=1}^{p} Wk

   where α is the proportionality or normalizing constant.

1. Auto-associative Memory (auto-correlators)

In the previous section, the structure of the Hopfield model has been
explained. It is an auto-associative memory model which means patterns,
rather than associated pattern pairs, are stored in memory. In this
section, the working of an auto-associative memory (auto-correlator) is
illustrated using some examples.

Working of an auto-correlator :

 how to store the patterns,

 how to retrieve / recall a pattern from the stored patterns, and


 how to recognize a noisy pattern

• How to Store Patterns : Example

Consider the three bipolar patterns A1, A2, A3 to be stored as an auto-correlator:

A1 = ( -1, 1, -1, 1 )
A2 = (  1, 1,  1, -1 )
A3 = ( -1, -1, -1, 1 )

Note that the outer product of two vectors U = (U1, U2, U3, U4) and V = (V1, V2, V3) is the matrix

            U1V1  U1V2  U1V3
 Uᵀ V  =    U2V1  U2V2  U2V3
            U3V1  U3V2  U3V3
            U4V1  U4V2  U4V3

Thus, the outer products of each of the three bipolar patterns A1, A2, A3 are

                          1 -1  1 -1
 [A1]ᵀ4x1 [A1]1x4  =     -1  1 -1  1
                          1 -1  1 -1
                         -1  1 -1  1

                          1  1  1 -1
 [A2]ᵀ4x1 [A2]1x4  =      1  1  1 -1
                          1  1  1 -1
                         -1 -1 -1  1

                          1  1  1 -1
 [A3]ᵀ4x1 [A3]1x4  =      1  1  1 -1
                          1  1  1 -1
                         -1 -1 -1  1

Therefore the connection matrix is

                                                   3  1  3 -3
 T = [t i j] = Σ_{i=1}^{3} [Ai]ᵀ4x1 [Ai]1x4  =     1  3  1 -1
                                                   3  1  3 -3
                                                  -3 -1 -3  3

This is how the patterns are stored.
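A quick way to check this connection matrix is to compute it numerically. The NumPy sketch below (illustrative names) builds T for the three patterns above by summing the outer products.

import numpy as np

A1 = np.array([-1,  1, -1,  1])
A2 = np.array([ 1,  1,  1, -1])
A3 = np.array([-1, -1, -1,  1])

# connection matrix T = sum of outer products Ai^T Ai
T = sum(np.outer(A, A) for A in (A1, A2, A3))
print(T)
# expected:
# [[ 3  1  3 -3]
#  [ 1  3  1 -1]
#  [ 3  1  3 -3]
#  [-3 -1 -3  3]]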
• Retrieve a Pattern from the Stored Patterns (ref. previous slide)

The previous slide shows the connection matrix T of the three bipolar patterns A1, A2, A3 stored as

                                                   3  1  3 -3
 T = [t i j] = Σ_{i=1}^{3} [Ai]ᵀ4x1 [Ai]1x4  =     1  3  1 -1
                                                   3  1  3 -3
                                                  -3 -1 -3  3

and one of the three stored patterns is A2 = ( 1, 1, 1, -1 ).

 Retrieve or recall this pattern A2 from the three stored patterns.

 The recall equation is

   aj_new = ƒ ( Σ_i ai t i j , aj_old )    for j = 1 , 2 , . . . , p

Computation of the recall equation for A2 yields α = Σ_i ai t i j and then β :

   i =                1      2      3      4            α     β
 α = Σ ai t i,j=1 :  1x3  + 1x1  + 1x3  + -1x-3    =    10     1
 α = Σ ai t i,j=2 :  1x1  + 1x3  + 1x1  + -1x-1    =     6     1
 α = Σ ai t i,j=3 :  1x3  + 1x1  + 1x3  + -1x-3    =    10     1
 α = Σ ai t i,j=4 :  1x-3 + 1x-1 + 1x-3 + -1x3     =   -10    -1

Therefore aj_new = ƒ ( Σ_i ai t i j , aj_old ) for j = 1 , 2 , . . . , p is ƒ ( α , β ) :

 a1_new = ƒ (  10 ,  1 )
 a2_new = ƒ (   6 ,  1 )
 a3_new = ƒ (  10 ,  1 )
 a4_new = ƒ ( -10 , -1 )

The values of β form the vector pattern ( 1, 1, 1, -1 ), which is A2.

This is how to retrieve or recall a pattern from the stored patterns.

Similarly, retrieval of the vector pattern A3 gives

 ( a1_new , a2_new , a3_new , a4_new ) = ( -1, -1, -1, 1 ) = A3
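The recall step can also be checked numerically. The sketch below (illustrative names, threshold zero) applies the recall equation to A2 using the matrix T computed earlier.

import numpy as np

T = np.array([[ 3,  1,  3, -3],
              [ 1,  3,  1, -1],
              [ 3,  1,  3, -3],
              [-3, -1, -3,  3]])

def recall(T, a):
    # aj_new = f( sum_i ai*tij , aj_old ): sign of the net input,
    # keeping the old value when the net input is exactly zero
    net = a @ T
    return np.where(net > 0, 1, np.where(net < 0, -1, a))

A2 = np.array([1, 1, 1, -1])
print(recall(T, A2))    # -> [ 1  1  1 -1], i.e. A2 is recalled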

• Recognition of Noisy Patterns (ref. previous slide)

Consider a vector A' = ( 1, 1, 1, 1 ), which is a noisy presentation of one of the stored patterns.

 First find the proximity of the noisy vector to the stored patterns using the Hamming distance measure.

 Note that the Hamming distance (HD) of a vector X from Y, where X = (x1 , x2 , . . . , xn) and Y = (y1 , y2 , . . . , yn), is given by

   HD (X , Y) = Σ_{i=1}^{n} | xi - yi |

The HDs of A' from each of the stored patterns A1, A2, A3 are

 HD (A' , A1) = |1 - (-1)| + |1 - 1| + |1 - (-1)| + |1 - 1| = 4
 HD (A' , A2) = 2
 HD (A' , A3) = 6

Therefore the vector A' is closest to A2 and so resembles it. In other words, the vector A' is a noisy version of vector A2.
Computation of the recall equation using vector A' yields :

   i =                1      2      3      4            α     β
 α = Σ ai t i,j=1 :  1x3  + 1x1  + 1x3  + 1x-3     =     4     1
 α = Σ ai t i,j=2 :  1x1  + 1x3  + 1x1  + 1x-1     =     4     1
 α = Σ ai t i,j=3 :  1x3  + 1x1  + 1x3  + 1x-3     =     4     1
 α = Σ ai t i,j=4 :  1x-3 + 1x-1 + 1x-3 + 1x3      =    -4    -1

Therefore aj_new = ƒ ( Σ_i ai t i j , aj_old ) for j = 1 , 2 , . . . , p is ƒ ( α , β ) :

 a1_new = ƒ (  4 ,  1 )
 a2_new = ƒ (  4 ,  1 )
 a3_new = ƒ (  4 ,  1 )
 a4_new = ƒ ( -4 , -1 )

The values of β form the vector pattern ( 1, 1, 1, -1 ), which is A2.


Note : In the presence of noise, or in the case of a partial representation of a vector, an auto-correlator refines the pattern, i.e., removes the noise, and retrieves the closest matching stored pattern.
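The whole noisy-recall procedure, Hamming distances followed by one application of the recall equation, can be reproduced with the short sketch below (illustrative names, threshold zero, reusing the matrix T from the storage example).

import numpy as np

T = np.array([[ 3,  1,  3, -3],
              [ 1,  3,  1, -1],
              [ 3,  1,  3, -3],
              [-3, -1, -3,  3]])
patterns = {"A1": np.array([-1,  1, -1,  1]),
            "A2": np.array([ 1,  1,  1, -1]),
            "A3": np.array([-1, -1, -1,  1])}

A_noisy = np.array([1, 1, 1, 1])

# Hamming distance as defined in the notes: sum of |xi - yi|
for name, p in patterns.items():
    print(name, np.sum(np.abs(A_noisy - p)))   # A1: 4, A2: 2, A3: 6

# one recall step pulls the noisy vector onto the nearest stored pattern
net = A_noisy @ T
recalled = np.where(net > 0, 1, np.where(net < 0, -1, A_noisy))
print(recalled)    # -> [ 1  1  1 -1], i.e. A2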
Bidirectional Hetero-associative Memory

The Hopfield one-layer unidirectional auto-associator was discussed in the previous section. Kosko (1987) extended this network to a two-layer bidirectional structure called the Bidirectional Associative Memory (BAM), which can achieve hetero-association. The important performance attribute of the BAM is its ability to recall stored pairs, particularly in the presence of noise.

Definition : If the associated pattern pairs (X, Y) are different and if the
model recalls a pattern Y given a pattern X or vice-versa, then it is
termed as hetero-associative memory.

This section illustrates the bidirectional associative memory :


 Operations (retrieval, addition and deletion) ,
 Energy Function (Kosko's correlation matrix, incorrect recall of pattern),
 Multiple training encoding strategy (Wang's generalized correlation matrix).

Bidirectional Associative Memory (BAM) Operations

BAM is a two-layer nonlinear neural network.


Denote one layer as field A with elements Ai and the other layer
as field B with elements Bi.

The basic coding procedure of the discrete BAM is as follows.

Consider N training pairs { (A1 , B1) , (A2 , B2) , . . , (Ai , Bi) , . . , (AN , BN) } where Ai = (ai1 , ai2 , . . . , ain) and Bi = (bi1 , bi2 , . . . , bip), and aij , bij are either in the ON or OFF state.

 in binary mode, ON = 1 and OFF = 0; in bipolar mode, ON = 1 and OFF = -1

 the original correlation matrix of the BAM is

   M0 = Σ_{i=1}^{N} [ Xi ]ᵀ [ Yi ]

   where Xi = (xi1 , xi2 , . . . , xin) and Yi = (yi1 , yi2 , . . . , yip), and xij (yij) is the bipolar form of aij (bij).

The energy function E for the pair ( α , β ) and correlation matrix M is

   E = - α M βᵀ

With this background, the decoding process, meaning the operations to retrieve the nearest pattern pairs, and the addition and deletion of pattern pairs, are illustrated in the next few slides.
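As a concrete illustration of this coding step, the sketch below (illustrative names and a tiny made-up pair of patterns, not from the lecture) builds the correlation matrix M0 from bipolar pattern pairs and evaluates the energy E = -α M βᵀ for a given pair.

import numpy as np

def correlation_matrix(X_list, Y_list):
    # M0 = sum_i Xi^T Yi for bipolar pattern pairs
    return sum(np.outer(x, y) for x, y in zip(X_list, Y_list))

def energy(alpha, beta, M):
    # E = - alpha M beta^T
    return -float(alpha @ M @ beta)

# tiny illustrative pairs (assumed, not from the lecture)
X = [np.array([ 1, -1,  1]), np.array([-1,  1,  1])]
Y = [np.array([ 1,  1]),     np.array([-1,  1])]
M = correlation_matrix(X, Y)
print(M)
print(energy(X[0], Y[0], M))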

• Retrieve the Nearest of a Pattern Pair, given any pair
(ref : previous slide)
Example
Retrieve the nearest of the (Ai , Bi) pattern pairs, given any pair ( α , β ).

The method and the equations for retrieval are :

 start with an initial condition which is any given pattern pair ( α , β ),

 determine a finite sequence of pattern pairs ( α' , β' ) , ( α'' , β'' ) , . . . until an equilibrium point ( αf , βf ) is reached, where

   β'  = φ ( α M )     and   α'  = φ ( β' Mᵀ )
   β'' = φ ( α' M )    and   α'' = φ ( β'' Mᵀ )

 φ (F) = G = ( g1 , g2 , . . . , gr ) , F = ( f1 , f2 , . . . , fr ) , M is the correlation matrix, and

        1                              if fi > 0
   gi = 0 (binary) , -1 (bipolar)      if fi < 0
        previous gi                    if fi = 0

Kosko has proved that this process will converge for any
correlation matrix M.
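A minimal sketch of this bidirectional relaxation is given below (illustrative names; bipolar coding assumed). It alternates between the two layers using the threshold function φ that keeps the previous value when the net input is zero, and stops when the pair no longer changes.

import numpy as np

def phi(f, previous):
    # bipolar threshold: +1 if f>0, -1 if f<0, keep previous value if f==0
    return np.where(f > 0, 1, np.where(f < 0, -1, previous))

def bam_retrieve(M, alpha, beta, max_steps=50):
    alpha, beta = np.asarray(alpha), np.asarray(beta)
    for _ in range(max_steps):
        beta_new  = phi(alpha @ M, beta)        # beta'  = phi(alpha M)
        alpha_new = phi(beta_new @ M.T, alpha)  # alpha' = phi(beta' M^T)
        if np.array_equal(alpha_new, alpha) and np.array_equal(beta_new, beta):
            break                               # equilibrium (alpha_f, beta_f) reached
        alpha, beta = alpha_new, beta_new
    return alpha, beta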

• Addition and Deletion of Pattern Pairs

Given a set of pattern pairs (Xi , Yi) , for i = 1 , 2 , . . . , n , and the corresponding correlation matrix M :
 a new pair (X' , Y') can be added, or
 an existing pair (Xj , Yj) can be deleted from the memory model.

Addition : to add a new pair (X' , Y') to the existing correlation matrix M , the new correlation matrix Mnew is given by

   Mnew = X1ᵀ Y1 + X2ᵀ Y2 + . . . . + Xnᵀ Yn + X'ᵀ Y'  =  M + X'ᵀ Y'

Deletion : to delete an existing pair (Xj , Yj) , subtract the matrix corresponding to that pair from the correlation matrix M ; the new correlation matrix Mnew is given by

   Mnew = M - ( Xjᵀ Yj )

Note : The addition and deletion of information resembles the functioning of human memory, exhibiting learning and forgetting.
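These two updates are one-line matrix operations; a minimal sketch (illustrative names, bipolar vectors assumed) is shown below.

import numpy as np

def add_pair(M, x_new, y_new):
    # Mnew = M + X'^T Y'
    return M + np.outer(x_new, y_new)

def delete_pair(M, x_j, y_j):
    # Mnew = M - Xj^T Yj
    return M - np.outer(x_j, y_j)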

Energy Function for BAM

Note : A system that changes with time is a dynamic system. There are two types of dynamics in a neural network: during the training phase the weights are iteratively updated, and during the production phase the network asymptotically converges to a solution pattern. The state is a collection of qualitative and quantitative items that characterize the system, e.g., weights and data flows. The energy function (or Lyapunov function) is a bounded function of the system state that decreases with time; the system solution corresponds to a minimum of the energy.
Let a pair (A , B) define the state of a BAM.
 to store a pattern, the value of the energy function for that pattern has to occupy a minimum point in the energy landscape.
 also, adding new patterns must not destroy the previously stored patterns.
The stability of a BAM can be proved by identifying the energy function E with each state (A , B).
 For auto-associative memory : the energy function is E(A) = - A M Aᵀ
 For bidirectional hetero-associative memory : the energy function is E(A , B) = - A M Bᵀ ; for the particular case A = B, it reduces to the Hopfield auto-associative energy function.

We wish to retrieve the nearest (Ai , Bi) pair when any ( α , β ) pair is presented as the initial condition to the BAM. The neurons change their states until a bidirectional stable state (Af , Bf) is reached. Kosko has shown that such a stable state is reached for any matrix M and that it corresponds to a local minimum of the energy function. Each cycle of decoding lowers the energy E if the energy for a point ( α , β ) is given by

   E = - α M βᵀ

If the energy E = - Ai M Biᵀ evaluated using the coordinates of the pair (Ai , Bi) does not constitute a local minimum, then that pair cannot be recalled, even if one starts with α = Ai. Thus Kosko's encoding method does not ensure that the stored pairs are at local minima.

• Example : Kosko's BAM for Retrieval of Associated Pair
This example shows the working of Kosko's BAM for the retrieval of an associated pair: start with X3 and hope to retrieve the associated pair Y3.

Consider N = 3 pattern pairs (A1 , B1 ) , (A2 , B2 ) , (A3 , B3 ) given by

A1 = ( 1 0 0 0 0 1 ) B1 = ( 1 1 0 0 0 )
A2 = ( 0 1 1 0 0 0 ) B2 = ( 1 0 1 0 0 )
A3 = ( 0 0 1 0 1 1 ) B3 = ( 0 1 1 1 0 )

Convert these three binary patterns to bipolar form, replacing each 0 by -1.


X1 = ( 1 -1 -1 -1 -1 1 ) Y1 = ( 1 1 -1 -1 -1 )
X2 = ( -1 1 1 -1 -1 -1 ) Y2 = ( 1 -1 1 -1 -1 )
X3 = ( -1 -1 1 -1 1 1 ) Y3 = ( -1 1 1 1 -1 )

The correlation matrix M is calculated as the 6 x 5 matrix

                                    1  1 -3 -1  1
                                    1 -3  1 -1  1
 M = X1ᵀ Y1 + X2ᵀ Y2 + X3ᵀ Y3  =   -1 -1  3  1 -1
                                   -1 -1 -1  1  3
                                   -3  1  1  3  1
                                   -1  3 -1  1 -1

Suppose we start with α = X3 and hope to retrieve the associated pair Y3. The calculations for the retrieval of Y3 yield :

 α M          = ( -1 -1 1 -1 1 1 ) M = ( -6 6 6 6 -6 )
 φ ( α M )    = ( -1 1 1 1 -1 ) = β'
 β' Mᵀ        = ( -5 -5 5 -3 7 5 )
 φ ( β' Mᵀ )  = ( -1 -1 1 -1 1 1 ) = α'
 α' M         = ( -1 -1 1 -1 1 1 ) M = ( -6 6 6 6 -6 )
 φ ( α' M )   = ( -1 1 1 1 -1 ) = β'' = β'

This retrieved pattern β' is the same as Y3.

Hence, ( αf , βf ) = ( X3 , Y3 ) is correctly recalled, a desired result.
• Example : Incorrect Recall by Kosko's BAM

This example shows incorrect recall by Kosko's BAM: start with X2 and hope to retrieve the associated pair Y2.
Consider N = 3 pattern pairs (A1 , B1 ) , (A2 , B2 ) , (A3 , B3 ) given by

A1 = ( 1 0 0 1 1 1 0 0 0 ) B1 = ( 1 1 1 0 0 0 0 1 0 )
A2 = ( 0 1 1 1 0 0 1 1 1 ) B2 = ( 1 0 0 0 0 0 0 0 1 )
A3 = ( 1 0 1 0 1 1 0 1 1 ) B3 = ( 0 1 0 1 0 0 1 0 1 )

Convert these three binary patterns to bipolar form, replacing each 0 by -1.

X1 = ( 1 -1 -1 1 1 1 -1 -1 -1 )    Y1 = ( 1 1 1 -1 -1 -1 -1 1 -1 )
X2 = ( -1 1 1 1 -1 -1 1 1 1 )      Y2 = ( 1 -1 -1 -1 -1 -1 -1 -1 1 )
X3 = ( 1 -1 1 -1 1 1 -1 1 1 )      Y3 = ( -1 1 -1 1 -1 -1 1 -1 1 )

The correlation matrix M is calculated as the 9 x 9 matrix

 M = X1ᵀ Y1 + X2ᵀ Y2 + X3ᵀ Y3

       -1  3  1  1 -1 -1  1  1 -1
        1 -3 -1 -1  1  1 -1 -1  1
       -1 -1 -3  1 -1 -1  1 -3  3
        3 -1  1 -3 -1 -1 -3  1 -1
   =   -1  3  1  1 -1 -1  1  1 -1
       -1  3  1  1 -1 -1  1  1 -1
        1 -3 -1 -1  1  1 -1 -1  1
       -1 -1 -3  1 -1 -1  1 -3  3
       -1 -1 -3  1 -1 -1  1 -3  3

Let the pair (X2 , Y2) be recalled.

 X2 = ( -1 1 1 1 -1 -1 1 1 1 )     Y2 = ( 1 -1 -1 -1 -1 -1 -1 -1 1 )

Start with α = X2 and hope to retrieve the associated pair Y2.
The calculations for the retrieval of Y2 yield :

 α M          = ( 5 -19 -13 -5 1 1 -5 -13 13 )
 φ ( α M )    = ( 1 -1 -1 -1 1 1 -1 -1 1 ) = β'
 β' Mᵀ        = ( -11 11 5 5 -11 -11 11 5 5 )
 φ ( β' Mᵀ )  = ( -1 1 1 1 -1 -1 1 1 1 ) = α'
 α' M         = ( 5 -19 -13 -5 1 1 -5 -13 13 )
 φ ( α' M )   = ( 1 -1 -1 -1 1 1 -1 -1 1 ) = β'' = β'

Here β'' = β'. Hence the cycle terminates with

 αF = α' = ( -1 1 1 1 -1 -1 1 1 1 ) = X2
 βF = β' = ( 1 -1 -1 -1 1 1 -1 -1 1 ) ≠ Y2

But β' is not Y2. Thus the vector pair (X2 , Y2) is not recalled correctly by Kosko's decoding process.


Check with Energy Function : compute the energy functions

 for the coordinates of the pair (X2 , Y2) , the energy E2 = - X2 M Y2ᵀ = -71
 for the coordinates of the pair (αF , βF) , the energy EF = - αF M βFᵀ = -75

However, that the coordinates of the pair (X2 , Y2) are not at a local minimum can be shown by evaluating the energy E at a point which is "one Hamming distance" away from Y2. To do this, consider a point

 Y2' = ( 1 -1 -1 -1 1 -1 -1 -1 1 )

where the fifth component -1 of Y2 has been changed to 1. Now

 E = - X2 M Y2'ᵀ = -73

which is lower than E2, confirming the hypothesis that (X2 , Y2) is not at a local minimum of E.
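The three energies above can be checked with a few lines of NumPy (illustrative names; the pattern vectors are those defined earlier in this example).

import numpy as np

X = np.array([[ 1, -1, -1,  1,  1,  1, -1, -1, -1],
              [-1,  1,  1,  1, -1, -1,  1,  1,  1],
              [ 1, -1,  1, -1,  1,  1, -1,  1,  1]])
Y = np.array([[ 1,  1,  1, -1, -1, -1, -1,  1, -1],
              [ 1, -1, -1, -1, -1, -1, -1, -1,  1],
              [-1,  1, -1,  1, -1, -1,  1, -1,  1]])
M = sum(np.outer(x, y) for x, y in zip(X, Y))

def E(a, b):                   # energy E = - a M b^T
    return -int(a @ M @ b)

X2, Y2 = X[1], Y[1]
bF  = np.array([1, -1, -1, -1, 1,  1, -1, -1, 1])   # pair actually recalled from X2
Y2p = np.array([1, -1, -1, -1, 1, -1, -1, -1, 1])   # Y2 with its fifth component flipped
print(E(X2, Y2), E(X2, bF), E(X2, Y2p))             # -> -71 -75 -73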

Note : The correlation matrix M used by Kosko does not guarantee


that the energy of a training pair is at its local minimum. Therefore , a
pair Pi can be recalled if and only if this pair is at a local minimum
of the energy surface.

Multiple Training Encoding Strategy

Note : (Ref. the example in the previous section.) Kosko extended the unidirectional auto-associative process to a bidirectional associative process, using the correlation matrix M = Σ Xiᵀ Yi computed from the pattern pairs. The system proceeds to retrieve the nearest pair given any pair ( α , β ), with the help of the recall equations. However, Kosko's encoding method does not ensure that the stored pairs are at local minima and hence can result in incorrect recall.

Wang and others introduced the multiple training encoding strategy, which ensures the correct recall of pattern pairs. This encoding strategy is an enhancement / generalization of Kosko's encoding strategy. Wang's generalized correlation matrix is M = Σ qi Xiᵀ Yi , where qi is viewed as the pair weight for Xiᵀ Yi , a positive real number. It denotes the minimum number of times a pattern pair (Xi , Yi) must be used for training to guarantee recall of that pair.

To recover a pair (Ai , Bi) using multiple training of order q, let us augment or supplement the matrix M with a matrix P defined as

   P = ( q - 1 ) Xiᵀ Yi

where (Xi , Yi) is the bipolar form of (Ai , Bi).

The augmentation implies adding (q - 1) more pairs located at (Ai , Bi) to the existing correlation matrix. As a result, the energy E' can be reduced to an arbitrarily low value by a suitable choice of q. This also ensures that the energy at (Ai , Bi) does not exceed the energy at points which are one Hamming distance away from this location.

The new value of the energy function E evaluated at (Ai , Bi) then becomes

   E' (Ai , Bi) = - Ai M Biᵀ - ( q - 1 ) Ai Xiᵀ Yi Biᵀ

The next few slides explain the step-by-step implementation of the multiple training encoding strategy for the recall of the three pattern pairs (X1 , Y1) , (X2 , Y2) , (X3 , Y3) using one and the same augmented correlation matrix M. An algorithm summarizing the complete process of multiple training encoding is also given.
• Example : Multiple Training Encoding Strategy

The working of the multiple training encoding strategy, which ensures the correct recall of pattern pairs, is illustrated below.

Consider N = 3 pattern pairs (A1 , B1 ) , (A2 , B2 ) , (A3 , B3 ) given by


A1 = ( 1 0 0 1 1 1 0 0 0 ) B1 = ( 1 1 1 0 0 0 0 1 0 )
A2 = ( 0 1 1 1 0 0 1 1 1 ) B2 = ( 1 0 0 0 0 0 0 0 1 )
A3 = ( 1 0 1 0 1 1 0 1 1 ) B3 = ( 0 1 0 1 0 0 1 0 1 )

Convert these three binary patterns to bipolar form, replacing each 0 by -1.

X1 = ( 1 -1 -1 1 1 1 -1 -1 -1 )    Y1 = ( 1 1 1 -1 -1 -1 -1 1 -1 )
X2 = ( -1 1 1 1 -1 -1 1 1 1 )      Y2 = ( 1 -1 -1 -1 -1 -1 -1 -1 1 )
X3 = ( 1 -1 1 -1 1 1 -1 1 1 )      Y3 = ( -1 1 -1 1 -1 -1 1 -1 1 )

Let the pair (X2 , Y2) be recalled.

 X2 = ( -1 1 1 1 -1 -1 1 1 1 )     Y2 = ( 1 -1 -1 -1 -1 -1 -1 -1 1 )

Choose q = 2, so that P = X2ᵀ Y2 ; the augmented correlation matrix M becomes

 M = X1ᵀ Y1 + 2 X2ᵀ Y2 + X3ᵀ Y3

       -2  4  2  2  0  0  2  2 -2
        2 -4 -2 -2  0  0 -2 -2  2
        0 -2 -4  0 -2 -2  0 -4  4
        4 -2  0 -4 -2 -2 -4  0  0
   =   -2  4  2  2  0  0  2  2 -2
       -2  4  2  2  0  0  2  2 -2
        2 -4 -2 -2  0  0 -2 -2  2
        0 -2 -4  0 -2 -2  0 -4  4
        0 -2 -4  0 -2 -2  0 -4  4

Now give α = X2 and see that the corresponding pattern pair β = Y2 is correctly recalled, as shown below.

 α M          = ( 14 -28 -22 -14 -8 -8 -14 -22 22 )
 φ ( α M )    = ( 1 -1 -1 -1 -1 -1 -1 -1 1 ) = β'
 β' Mᵀ        = ( -16 16 18 18 -16 -16 16 18 18 )
 φ ( β' Mᵀ )  = ( -1 1 1 1 -1 -1 1 1 1 ) = α'
 α' M         = ( 14 -28 -22 -14 -8 -8 -14 -22 22 )
 φ ( α' M )   = ( 1 -1 -1 -1 -1 -1 -1 -1 1 ) = β'' = β'

Here β'' = β'. Hence the cycle terminates with

 αF = α' = ( -1 1 1 1 -1 -1 1 1 1 ) = X2
 βF = β' = ( 1 -1 -1 -1 -1 -1 -1 -1 1 ) = Y2

Thus, (X2 , Y2) is correctly recalled using the augmented correlation matrix M. But it is not possible to recall (X1 , Y1) using the same matrix M, as shown in the next slide.

Note : The previous slide showed that the pattern pair (X2 , Y2) is correctly recalled using the augmented correlation matrix

 M = X1ᵀ Y1 + 2 X2ᵀ Y2 + X3ᵀ Y3

but the same matrix M cannot correctly recall the other pattern pair (X1 , Y1), as shown below.

X1 = ( 1 -1 -1 1 1 1 -1 -1 -1 )    Y1 = ( 1 1 1 -1 -1 -1 -1 1 -1 )

Let α = X1; to retrieve the associated pair Y1 the calculation shows

 α M          = ( -6 24 22 6 4 4 6 22 -22 )
 φ ( α M )    = ( -1 1 1 1 1 1 1 1 -1 ) = β'
 β' Mᵀ        = ( 16 -16 -18 -18 16 16 -16 -18 -18 )
 φ ( β' Mᵀ )  = ( 1 -1 -1 -1 1 1 -1 -1 -1 ) = α'
 α' M         = ( -14 28 22 14 8 8 14 22 -22 )
 φ ( α' M )   = ( -1 1 1 1 1 1 1 1 -1 ) = β'' = β'

Here β'' = β'. Hence the cycle terminates with

 αF = α' = ( 1 -1 -1 -1 1 1 -1 -1 -1 ) ≠ X1
 βF = β' = ( -1 1 1 1 1 1 1 1 -1 ) ≠ Y1

Thus, the pattern pair (X1 , Y1) is not correctly recalled using the augmented correlation matrix M.

To tackle this problem, the correlation matrix M needs to be further


augmented by multiple training of (X1 , Y1 ) as shown in the next slide.
The previous slide shows that the pattern pair (X1 , Y1) cannot be recalled under the same augmented matrix M that is able to recall (X2 , Y2). However, this problem can be solved by multiple training of (X1 , Y1), which yields a further change in M, defined as

 M = 2 X1ᵀ Y1 + 2 X2ᵀ Y2 + X3ᵀ Y3
       -1  5  3  1 -1 -1  1  3 -3
        1 -5 -3 -1  1  1 -1 -3  3
       -1 -3 -5  1 -1 -1  1 -5  5
        5 -1  1 -5 -3 -3 -5  1 -1
   =   -1  5  3  1 -1 -1  1  3 -3
       -1  5  3  1 -1 -1  1  3 -3
        1 -5 -3 -1  1  1 -1 -3  3
       -1 -3 -5  1 -1 -1  1 -5  5
       -1 -3 -5  1 -1 -1  1 -5  5

Now observe in the next slide that all three pairs can be correctly recalled.

Recall of pattern pair (X1 , Y1) :

 X1 = ( 1 -1 -1 1 1 1 -1 -1 -1 )    Y1 = ( 1 1 1 -1 -1 -1 -1 1 -1 )

Let α = X1; to retrieve the associated pair Y1 the calculation shows

 α M          = ( 3 33 31 -3 -5 -5 -3 31 -31 )
 φ ( α M )    = ( 1 1 1 -1 -1 -1 -1 1 -1 ) = β'
 β' Mᵀ        = ( 13 -13 -19 23 13 13 -13 -19 -19 )
 φ ( β' Mᵀ )  = ( 1 -1 -1 1 1 1 -1 -1 -1 ) = α'
 α' M         = ( 3 33 31 -3 -5 -5 -3 31 -31 )
 φ ( α' M )   = ( 1 1 1 -1 -1 -1 -1 1 -1 ) = β'' = β'

Here β'' = β'. Hence the cycle terminates with

 αF = α' = ( 1 -1 -1 1 1 1 -1 -1 -1 ) = X1
 βF = β' = ( 1 1 1 -1 -1 -1 -1 1 -1 ) = Y1

Thus, the pattern pair (X1 , Y1) is correctly recalled.

Recall of pattern pair (X2 , Y2) :

 X2 = ( -1 1 1 1 -1 -1 1 1 1 )      Y2 = ( 1 -1 -1 -1 -1 -1 -1 -1 1 )

Let α = X2; to retrieve the associated pair Y2 the calculation shows

 α M          = ( 7 -35 -29 -7 -1 -1 -7 -29 29 )
 φ ( α M )    = ( 1 -1 -1 -1 -1 -1 -1 -1 1 ) = β'
 β' Mᵀ        = ( -15 15 17 19 -15 -15 15 17 17 )
 φ ( β' Mᵀ )  = ( -1 1 1 1 -1 -1 1 1 1 ) = α'
 α' M         = ( 7 -35 -29 -7 -1 -1 -7 -29 29 )
 φ ( α' M )   = ( 1 -1 -1 -1 -1 -1 -1 -1 1 ) = β'' = β'

Here β'' = β'. Hence the cycle terminates with

 αF = α' = ( -1 1 1 1 -1 -1 1 1 1 ) = X2
 βF = β' = ( 1 -1 -1 -1 -1 -1 -1 -1 1 ) = Y2

Thus, the pattern pair (X2 , Y2) is correctly recalled.

Recall of pattern pair (X3 , Y3) :

 X3 = ( 1 -1 1 -1 1 1 -1 1 1 )      Y3 = ( -1 1 -1 1 -1 -1 1 -1 1 )

Let α = X3; to retrieve the associated pair Y3 the calculation shows

 α M          = ( -13 17 -1 13 -5 -5 13 -1 1 )
 φ ( α M )    = ( -1 1 -1 1 -1 -1 1 -1 1 ) = β'
 β' Mᵀ        = ( 1 -1 17 -13 1 1 -1 17 17 )
 φ ( β' Mᵀ )  = ( 1 -1 1 -1 1 1 -1 1 1 ) = α'
 α' M         = ( -13 17 -1 13 -5 -5 13 -1 1 )
 φ ( α' M )   = ( -1 1 -1 1 -1 -1 1 -1 1 ) = β'' = β'

Here β'' = β'. Hence the cycle terminates with

 αF = α' = ( 1 -1 1 -1 1 1 -1 1 1 ) = X3
 βF = β' = ( -1 1 -1 1 -1 -1 1 -1 1 ) = Y3

Thus, the pattern pair (X3 , Y3) is correctly recalled.

Thus, the multiple training encoding strategy ensures the correct recall of a pair for a suitable augmentation of M. The generalization of the correlation matrix, for the correct recall of all training pairs, is written as

 M = Σ_{i=1}^{N} qi Xiᵀ Yi    where the qi's are positive real numbers.

This modified correlation matrix is called the generalized correlation matrix. Using one and the same augmented matrix M, it is possible to recall all the training pattern pairs.

• Algorithm (for the Multiple training encoding strategy)

To summarize the complete process of multiple training encoding, an algorithm is given below.

Algorithm Mul_Tr_Encode ( N , Xi , Yi , qi ) where

 N : number of stored pattern pairs
 Xi , Yi : the bipolar pattern pairs
    X = ( X1 , X2 , . . . , XN )  where Xi = ( x i 1 , x i 2 , . . . , x i n )
    Y = ( Y1 , Y2 , . . . , YN )  where Yi = ( y i 1 , y i 2 , . . . , y i p )
 q : the weight vector ( q1 , q2 , . . . , qN )

 Step 1  Initialize the correlation matrix M to the null matrix : M ← [0]

 Step 2  Compute the correlation matrix M as
         For i ← 1 to N
            M ← M ⊕ [ qi ∗ Transpose ( Xi ) ⊗ ( Yi ) ]
         end
         (symbols : ⊕ matrix addition, ⊗ matrix multiplication, ∗ scalar multiplication)

 Step 3  Read the input bipolar pattern A

 Step 4  Compute A_M where A_M ← A ⊗ M

 Step 5  Apply the threshold function φ to A_M to get B' , i.e., B' ← φ ( A_M )
         where φ is defined as φ (F) = G = ( g1 , g2 , . . . , gn )

 Step 6  Output B' , the associated pattern pair

end
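The algorithm above maps directly onto a few lines of NumPy. This is a minimal sketch with illustrative names; φ is the threshold function defined earlier, with ties at zero resolved by a caller-supplied previous value.

import numpy as np

def mul_tr_encode(X, Y, q):
    # Steps 1-2: M = sum_i qi * Xi^T Yi  (generalized correlation matrix)
    return sum(qi * np.outer(x, y) for x, y, qi in zip(X, Y, q))

def recall_B(M, A, B_prev=None):
    # Steps 3-6: A_M = A M, then B' = phi(A_M)
    net = np.asarray(A) @ M
    if B_prev is None:
        B_prev = np.ones_like(net)   # simplifying assumption when no previous B exists
    return np.where(net > 0, 1, np.where(net < 0, -1, B_prev))

For the three pattern pairs of the preceding example and weights q = (2, 2, 1), recall_B(mul_tr_encode(X, Y, q), X[i]) returns Y[i] for each i, matching the recalls shown above.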
2. References : Textbooks

1. "Neural Network, Fuzzy Logic, and Genetic Algorithms - Synthesis and Applications", by S. Rajasekaran and G.A. Vijayalaksmi Pai, (2005), Prentice Hall, Chapter 4, pages 87-116.

2. "Elements of Artificial Neural Networks", by Kishan Mehrotra, Chilukuri K. Mohan and Sanjay Ranka, (1996), MIT Press, Chapter 6, pages 217-263.

3. "Fundamentals of Neural Networks: Architecture, Algorithms and Applications", by Laurene V. Fausett, (1993), Prentice Hall, Chapter 3, pages 101-152.

4. "Neural Network Design", by Martin T. Hagan, Howard B. Demuth and Mark Hudson Beale, (1996), PWS Publishing Company, Chapter 13, pages 13-1 to 13-37.

5. "An Introduction to Neural Networks", by James A. Anderson, (1997), MIT Press, Chapters 6-7, pages 143-208.

6. Related documents from open sources, mainly the internet. An exhaustive list is being prepared for inclusion at a later date.
