The Neural-Network Analysis & Its Applications Data Filters: Saint-Petersburg State University JASS 2006


THE NEURAL-NETWORK ANALYSIS
& ITS APPLICATIONS
DATA FILTERS

Saint-Petersburg State University

JASS 2006

About me
Name: Alexey Minin
Place of study: Saint-Petersburg State University
Current semester: 7th semester
Fields of interest: Neural Nets, Data Filters for Optics
(Holography), Computational Physics, Econophysics.

Content:
What is Neural Net & its applications
Neural Net analysis
Self-organizing Kohonen maps
Data filters
Obtained results

What is a Neural Net & its applications

Image recognition
Processing of noisy signals
Image completion
Associative search
Classification
Scheduling
Optimization
Forecasting
Diagnostics
Risk prediction

What is a Neural Net & its applications

Image recognition

[Figure: image-recognition example]

Neural Net analysis

PARADIGMS of neurocomputing
Connectionism
Localness and parallelism of calculations
Training based on data (instead of programming)
Universality of training algorithms

Neural Net analysis


What is a Neuron?
A typical formal neuron performs an elementary operation: it weighs the values
of its inputs with locally stored weights and applies a nonlinear
transformation to their sum:

$y = f(u), \qquad u = w_0 + \sum_i w_i x_i$

[Figure: formal neuron - the inputs x_1 ... x_n are weighted and summed into
u = w_0 + sum_i w_i x_i, and the output is y = f(u)]

A neuron performs a nonlinear operation on a linear combination of its inputs.
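
As a minimal illustration (not part of the original slides), a formal neuron
can be written in a few lines of Python/NumPy; the logistic activation, the
weights w, and the bias w0 below are assumptions chosen for the sketch:

    import numpy as np

    def logistic(u):
        """Logistic (sigmoid) activation: one common choice for f."""
        return 1.0 / (1.0 + np.exp(-u))

    def formal_neuron(x, w, w0, f=logistic):
        """Weigh the inputs, sum with the bias, apply a nonlinear transformation."""
        u = w0 + np.dot(w, x)   # u = w_0 + sum_i w_i x_i
        return f(u)             # y = f(u)

    # Example: 3 inputs with arbitrary weights
    y = formal_neuron(x=np.array([0.5, -1.0, 2.0]),
                      w=np.array([0.1, 0.4, -0.3]),
                      w0=0.2)
    print(y)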

Neural Net analysis


Connectionism

Global connections
Formal neurons
Layers

Neural Net analysis


Localness and parallelism of calculations

Localness of information processing

Each neuron reacts only to information coming from the neurons connected to
it, without reference to any global plan of computation.

Parallelism of calculations

Neurons are able to operate in parallel.

Comparison of ANN & BNN

BRAIN:                                 PC (IBM):
  clock rate ~100 Hz                     clock rate ~10^9 Hz
  V_prop = 100 m/s                       V_prop = 3*10^8 m/s
  N = 10^10 - 10^11 neurons              N = 10^9 elements

The degree of parallelism is ~10^14: the brain works like 10^14 processors
with 100 Hz frequency, with ~10^4 of them connected at the same time.

Neural Net analysis


Training based on data (instead of programming)

Absence of a global plan: information is distributed over the network, with a
corresponding adaptation of the neurons.
The algorithm is not set in advance; it is generated by the data.
Each neuron locally changes its adjustable parameters, the synaptic weights.
Training of a network: the network is trained on a small share of all possible
situations (the training patterns), after which the trained network is able to
operate over a much wider range of patterns.
This is the ability to generalize.

Neural Net analysis


Universality of training algorithms

The single principle of learning is to find the minimum of the empirical error:
W - the set of synaptic weights
E(W) - the error function

The task is to find the global minimum of E(W).
Stochastic optimization is a way to avoid getting stuck in a local minimum.
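
To make the stochastic-optimization idea concrete, here is a small sketch (not
from the original slides) of gradient descent with decaying random
perturbations on a toy error function; the function E(w), the noise schedule,
and the learning rate are all assumptions:

    import numpy as np

    def E(w):
        """Toy error function with several local minima (an assumption for the sketch)."""
        return np.sin(3 * w) + 0.1 * w**2

    def dE(w, eps=1e-5):
        """Numerical gradient of E."""
        return (E(w + eps) - E(w - eps)) / (2 * eps)

    rng = np.random.default_rng(0)
    w = 4.0                      # start near a local, not the global, minimum
    lr, noise = 0.05, 1.0
    for step in range(2000):
        # gradient step plus a decaying random perturbation: the noise lets the
        # search jump out of shallow local minima (simulated-annealing style)
        w -= lr * dE(w) + noise * rng.normal() * lr
        noise *= 0.995           # cool down the noise over time
    print("found w =", w, "E(w) =", E(w))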

Neural Net analysis


BASIC NEURAL NETS
Perceptron
Hopfield network
Kohonen maps
Probabilistic neural nets
General regression neural nets
Polynomial nets

Neural Net analysis


The architecture of NN
PROTOTYPES OF ANY NEURAL ARCHITECTURE

RECURRENT - with feedback (Elman-Jordan)
LAYER-BY-LAYER - without feedback

Neural Net analysis


Classification of NN

By type of training:

Supervised (with a teacher):
$E(w) = E\{x, y, y(x, w)\}$

Unsupervised (without a teacher):
$E(w) = E\{x, y(x, w)\}$

In the unsupervised case the network is asked to find the hidden regularities
in the data set on its own. Redundancy of the data allows compression of the
information, and the network can learn to find the most compact representation
of such data, i.e. to perform optimal coding of the given kind of input
information.

Methodology of self-organizing maps


Self-organizing Kohonen maps are a type of neural network trained without a
teacher. The network forms its outputs independently, adapting to the signals
arriving at its input. The only "teacher" of the network is the data itself -
the information contained in it, the regularities that distinguish the input
data from random noise.
Maps combine two types of information compression:

Reduction of the dimensionality of the data with minimal loss of information
Reduction of the variety of the data by selecting a finite set of prototypes
and assigning each data point to one of these types

Methodology of self-organizing maps


Schematic representation of a self-organizing network

Neurons in the output layer are ordered and correspond to the cells of a
two-dimensional map, which can be colored according to the affinity of
attributes.

Hebb training rule


The change of a weight at the presentation of the i-th example is proportional
to its input and output (Hebb, 1949):

$\Delta w_j = \eta\, y\, x_j$

Vector representation: $\Delta \mathbf{w} = \eta\, y\, \mathbf{x}$, where
$\eta$ is the learning rate.

If training is formulated as an optimization problem, a neuron trained by the
Hebb rule strives to increase the amplitude of its output:

$\Delta \mathbf{w} = -\eta\,\frac{\partial E}{\partial \mathbf{w}}, \qquad
E(\mathbf{w}) = -\tfrac{1}{2}\left\langle (\mathbf{w}\cdot\mathbf{x})^2 \right\rangle
= -\tfrac{1}{2}\left\langle y^2 \right\rangle,$

where the averaging is taken over the training sample x.


NB: in this case the error has no minimum. Hebbian training in the form
described above is not useful in practice, since it leads to an unlimited
growth of the weight amplitudes.

Oja training rule


$\Delta w_j = \eta\, y\, (x_j - y\, w_j)$

The added term prevents the unlimited growth of the weights.

Vector representation: $\Delta \mathbf{w} = \eta\, y\, (\mathbf{x} - y\, \mathbf{w})$

The Oja rule maximizes the sensitivity of the neuron's output for a limited
amplitude of the weights. It is easy to verify this by setting the average
change of the weights to zero and then multiplying the right-hand side of the
equality by w: in equilibrium

$\left\langle y^2 \right\rangle \left( 1 - |\mathbf{w}|^2 \right) = 0.$

Thus, the weights of the trained neuron lie on the hypersphere $|\mathbf{w}| = 1$.

During training by the Oja rule the weight vector settles on the hypersphere,
in the direction that maximizes the projection of the input vectors.
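
A small NumPy sketch of the Oja update (not from the slides); the learning
rate and the toy data distribution are assumptions. The weight norm converges
toward 1 and the direction toward the dominant direction of the data:

    import numpy as np

    rng = np.random.default_rng(1)
    # Toy 2-D data stretched along one direction (assumed for the demo)
    X = rng.normal(size=(5000, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

    w = rng.normal(size=2)           # initial weights
    eta = 0.01                       # learning rate (assumption)
    for x in X:
        y = w @ x                    # linear neuron output y = w . x
        w += eta * y * (x - y * w)   # Oja rule: Hebb term minus the decay y^2 w

    print("|w| =", np.linalg.norm(w))   # close to 1: weights lie on the unit sphere
    print("w =", w)                     # points along the dominant data direction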

Competition of neurons: the winner takes all


Base algorithm:

$y_i = \sum_{j=1}^{d} w_{ij}\, x_j$

Training of the competitive layer:

$\Delta \mathbf{w}_i = \eta\, y_i \left( \mathbf{x} - \sum_k y_k \mathbf{w}_k \right)$

Winner (index of the winning neuron):

$i^* : \; \mathbf{w}_{i^*}\cdot\mathbf{x} \ge \mathbf{w}_i\cdot\mathbf{x} \quad \forall i$

Training of the winner:

$\Delta \mathbf{w}_{i^*} = \eta\, (\mathbf{x} - \mathbf{w}_{i^*})$

If $|\mathbf{w}_i| = 1$, then $\mathbf{w}_{i^*}\cdot\mathbf{x} \ge \mathbf{w}_i\cdot\mathbf{x}$
is equivalent to $|\mathbf{w}_{i^*} - \mathbf{x}| \le |\mathbf{w}_i - \mathbf{x}|$ for all $i$,
i.e. the winner is the neuron giving the greatest response to the given input
stimulus:

$y_{i^*} = 1, \qquad y_i = 0 \;\; \text{for } i \ne i^*$

The winner takes all


One variant of modifying the base training rule of a competitive layer
consists in training not only the winning neuron but also its "neighbors",
though at a smaller rate. Such "pulling up" of the neurons nearest to the
winner is applied in topographic Kohonen maps.

Winner: $i^* : \; |\mathbf{w}_{i^*} - \mathbf{x}| = \min_i |\mathbf{w}_i - \mathbf{x}|$

Modified Kohonen training rule:

$\mathbf{w}_i(t+1) = \mathbf{w}_i(t) + \eta\, \Lambda\!\left(|i - i^*|, t\right)\left( \mathbf{x}(t) - \mathbf{w}_i(t) \right)$

$\Lambda(|i - i^*|, t)$ is the neighborhood function: it equals one for the
winning neuron with index $i^*$ and gradually falls off with distance from the
winner.
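
A minimal sketch of this modified Kohonen rule on a one-dimensional chain of
prototypes (not from the slides); the Gaussian neighborhood, the decay
schedules, and the toy data are assumptions:

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.uniform(size=(2000, 3))          # toy 3-D data (assumption)

    n_units = 25                             # 1-D chain of prototypes
    W = rng.uniform(size=(n_units, 3))       # prototype vectors w_i
    idx = np.arange(n_units)

    eta, sigma = 0.5, 5.0                    # learning rate and neighborhood width
    for t, x in enumerate(X):
        winner = np.argmin(np.linalg.norm(W - x, axis=1))        # i* = argmin |w_i - x|
        Lambda = np.exp(-((idx - winner) ** 2) / (2 * sigma**2))  # neighborhood function
        W += eta * Lambda[:, None] * (x - W)                      # Kohonen update
        eta *= 0.999                         # gradually shrink the step
        sigma = max(0.5, sigma * 0.999)      # and the neighborhood radius

    print(W[:5])   # prototypes stretched like an "elastic chain" over the data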

Kohonen training resembles stretching an elastic grid of prototypes over the
data of the training sample.

Two-dimensional topographic map of a set of three-dimensional data:
each point in three-dimensional space falls into the cell of the grid whose
coordinates belong to the neuron of the two-dimensional map nearest to it.

Visualization of the topographic map induced by the i-th component of the
input data, x_i.

A convenient tool for visualizing data is coloring the topographic map,
similar to what is done on ordinary geographical maps. Each data attribute
generates its own coloring of the map cells - by the average value of this
attribute over the data points that fall into the given cell.

Collecting together the maps of all attributes of interest, we obtain a
topographic atlas that gives an integrated representation of the structure of
the multivariate data.

Methodology of self-organizing maps


Classified SOM for the NASDAQ100 index for the period
from 10-Nov-1997 till 27-Aug-2001

[Figure: Ln Y(t) vs. time - change in time of the log-price of shares of
JP Morgan Chase (upper curve) and American Express (lower curve) for the
period from 10-Jan-1994 to 27-Oct-1997]

[Figure: Ln Y(t) vs. time - change in time of the log-price of shares of
JP Morgan Chase (upper curve) and Citigroup (lower curve) for the period
from 10-Nov-1997 to 27-Aug-2001]

How to choose a variant?


Annual prediction

[Figure: annual Caspian Sea level, test segment vs. prediction, 1988-2038]

This is the forecast of the Caspian Sea level.

DATA FILTERS
Custom filters (e.g. Fourier filter)
Adaptive filters (e.g. Kalman filter)
Empirical mode decomposition
Holder exponent

$y(n) = b(1)x(n) + b(2)x(n-1) + \dots + b(n_b+1)x(n-n_b) - a(2)y(n-1) - \dots - a(n_a+1)y(n-n_a)$

Adaptive filters
In what follows, keep in mind that we are going to make forecasts; that is why
we need filters which won't change the phase of the signal.

$y(n) = b(1)x(n) + b(2)x(n-1) + \dots + b(n_b+1)x(n-n_b) - a(2)y(n-1) - \dots - a(n_a+1)y(n-n_a)$

[Figure: direct-form IIR filter - the delayed inputs x(n-1), ..., x(n-n_b) are
weighted by b(2), ..., b(n_b+1), and the delayed outputs y(n-1), ..., y(n-n_a)
are fed back with weights -a(2), ..., -a(n_a+1) through unit delays Z^{-1}]
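
A direct transcription of this difference equation (a sketch, not code from
the talk): with Python's zero-based indexing, b[0] corresponds to b(1) and
a_fb[0] to a(2); the example coefficients are assumptions.

    import numpy as np

    def iir_filter(x, b, a_fb):
        """Direct-form IIR filter:
        y(n) = sum_k b[k] x(n-k) - sum_m a_fb[m] y(n-1-m)."""
        y = np.zeros_like(x, dtype=float)
        for n in range(len(x)):
            acc = 0.0
            for k in range(len(b)):            # feed-forward part b(1)...b(nb+1)
                if n - k >= 0:
                    acc += b[k] * x[n - k]
            for m in range(len(a_fb)):         # feedback part a(2)...a(na+1)
                if n - 1 - m >= 0:
                    acc -= a_fb[m] * y[n - 1 - m]
            y[n] = acc
        return y

    # Example: a simple smoothing filter (coefficients chosen arbitrarily)
    x = np.random.default_rng(3).normal(size=200)
    y = iir_filter(x, b=[0.25, 0.25, 0.25, 0.25], a_fb=[-0.1])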

Adaptive filters

[Figure: Siemens share value, adjusted close (scaled), raw vs. filtered]

All the maxima are preserved; there is no phase distortion.
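
Zero-phase filtering is commonly obtained by running a filter forward and then
backward over the signal; a sketch using SciPy's filtfilt (the Butterworth
design parameters and the toy series are assumptions, not the settings used in
the talk):

    import numpy as np
    from scipy.signal import butter, filtfilt

    rng = np.random.default_rng(4)
    price = np.cumsum(rng.normal(size=500))   # toy "price" series (assumption)

    # Low-pass Butterworth filter; filtfilt applies it forward and backward,
    # which cancels the phase shift, so the maxima stay where they are.
    b, a = butter(N=4, Wn=0.05)
    smooth = filtfilt(b, a, price)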

Adaptive filters
Let's try to predict the next value using the zero-phase filtered series,
given the historical price information.
I used a perceptron with 3 hidden layers, logistic activation function, the
"rotation" training algorithm, 20 minutes of training.

Adaptive filters
Kalman filter

$\hat{x}(n) = a\,\hat{x}(n-1) + k(n)\left[\, y(n) - a\,c\,\hat{x}(n-1) \,\right]$,

where $\hat{x}(n)$ is the filter's estimate of the state,

$x(n) = a\,x(n-1) + w(n-1)$ is the model of the generating signal, $w(n)$ is white noise, and

$y(n) = c\,x(n) + \nu(n)$ is the signal after the neural net, $\nu(n)$ is white noise.

[Figure: Kalman filter block diagram - the gain K(n) scales the innovation
y(n) - a c \hat{x}(n-1), the result is added to the prediction a \hat{x}(n-1),
and the new estimate is fed back through a unit delay Z^{-1}]

Adaptive filters
Let's use the Kalman filter as an error estimator for the forecast of the
zero-phase filtered data.
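
A minimal scalar Kalman filter matching the update above (a sketch only; the
model parameters a, c, the noise variances q, r, and the toy observations are
assumptions):

    import numpy as np

    def kalman_1d(y, a=1.0, c=1.0, q=1e-3, r=1e-1):
        """Scalar Kalman filter for x(n) = a x(n-1) + w(n-1), y(n) = c x(n) + v(n).
        q and r are the variances of the process noise w and observation noise v."""
        x_hat, p = 0.0, 1.0                 # initial state estimate and its variance
        out = []
        for yn in y:
            # predict
            x_pred = a * x_hat
            p_pred = a * a * p + q
            # update: the gain k(n) weighs the innovation y(n) - c * a * x_hat(n-1)
            k = p_pred * c / (c * c * p_pred + r)
            x_hat = x_pred + k * (yn - c * x_pred)
            p = (1 - k * c) * p_pred
            out.append(x_hat)
        return np.array(out)

    y = np.cumsum(np.random.default_rng(5).normal(size=300))   # noisy toy observations
    x_est = kalman_1d(y)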

Empirical Mode Decomposition


What is it?
We can heuristically define a (local) high-frequency part
$\{d(t),\; t_- \le t \le t_+\}$, or local detail, which corresponds to the
oscillation terminating at the two minima and passing through the maximum
which necessarily exists in between them. For the picture to be complete, one
still has to identify the corresponding (local) low-frequency part m(t), or
local trend, so that we have x(t) = m(t) + d(t) for $t_- \le t \le t_+$.

Empirical Mode Decomposition


What is it?

Eventually, the original signal x(t) is first decomposed through the main loop as

$x(t) = d_1(t) + m_1(t),$

and the first residual $m_1(t)$ is itself decomposed as

$m_1(t) = d_2(t) + m_2(t),$

so that

$x(t) = d_1(t) + m_1(t) = d_1(t) + d_2(t) + m_2(t) = \dots = \sum_{k=1}^{K} d_k(t) + m_K(t).$

Empirical Mode Decomposition


Algorithm

Given a signal x(t), the effective algorithm of EMD can be summarized as follows:
1. identify all extrema of x(t)
2. interpolate between minima (resp. maxima), ending up with an envelope
   e_min(t) (resp. e_max(t))
3. compute the mean m(t) = (e_min(t) + e_max(t)) / 2
4. extract the detail d(t) = x(t) - m(t)
5. iterate on the residual m(t)
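
A compact sketch of one sifting pass and the outer EMD loop (illustrative
only; the fixed iteration counts used as stopping criteria are simplified
assumptions, and SciPy cubic splines are used for the envelopes):

    import numpy as np
    from scipy.signal import argrelextrema
    from scipy.interpolate import CubicSpline

    def sift_once(x, t):
        """One sifting pass: detail d(t) = x(t) - mean of the two envelopes."""
        maxima = argrelextrema(x, np.greater)[0]
        minima = argrelextrema(x, np.less)[0]
        if len(maxima) < 3 or len(minima) < 3:
            return None                                # too few extrema: x is the trend
        e_max = CubicSpline(t[maxima], x[maxima])(t)   # upper envelope
        e_min = CubicSpline(t[minima], x[minima])(t)   # lower envelope
        m = (e_max + e_min) / 2.0                      # local trend m(t)
        return x - m                                   # local detail d(t)

    def emd(x, t, n_imfs=4, n_sift=8):
        """Very simplified EMD: fixed number of sifting iterations per IMF."""
        imfs, residual = [], x.copy()
        for _ in range(n_imfs):
            d = residual.copy()
            for _ in range(n_sift):
                d_new = sift_once(d, t)
                if d_new is None:
                    return imfs, residual
                d = d_new
            imfs.append(d)
            residual = residual - d        # iterate on the residual m(t)
        return imfs, residual

    t = np.linspace(0, 1, 512)
    x = np.sin(2 * np.pi * 5 * t) + np.sin(2 * np.pi * 40 * t**2)   # tone + chirp
    imfs, res = emd(x, t)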

[Figures: EMD sifting of a "tone + chirp" test signal - IMF 1 over sifting
iterations 0-8, IMF 2 over sifting iterations 0-5, the running residue after
each step, and the final decomposition into imf1-imf6 plus the residual]

Empirical Mode Decomposition


Let's do it for the Siemens index

Empirical Mode Decomposition


Let's do it for the Siemens index

All strong maxima are preserved and there is no phase distortion.

Empirical Mode Decomposition


Let's make a forecast for the Siemens index

There was no delay in the forecast at all!

Holder exponent
The main idea is as follows. Hölder showed that, for a function f(t) on its
domain D_f,

$| f(t + \Delta t) - f(t) | \le \mathrm{const}\cdot(\Delta t)^{\alpha(t)}, \qquad \alpha(t) \in [0, 1].$

$\alpha \to 0$ means that we have a discontinuity of the second kind;
$\alpha \to 1$ means that the increment behaves as $O(\Delta t)$.

So this formula is a kind of bridge between "bad" (irregular) functions and
"good" (smooth) functions. Looking at this formula more closely, we notice
that we can catch the moments in time when our function "knows" that it is
going to change its behavior from one regime to another. It means that today
we can make a forecast of tomorrow's behavior. One should mention, though,
that we don't know the sign of the coming change in behavior.
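
One rough way to estimate a local Hölder exponent is to regress
log |f(t + dt) - f(t)| against log dt over a range of scales; the window sizes
and this regression-based estimator are assumptions for the sketch, not the
method used in the talk (dedicated tools such as FracLab, listed at the end,
are normally used for this):

    import numpy as np

    def local_holder(f, i, scales=(1, 2, 4, 8, 16)):
        """Estimate alpha(t_i) from |f(t_i + dt) - f(t_i)| ~ const * dt^alpha
        by a log-log least-squares fit over several increments dt."""
        dts, incs = [], []
        for s in scales:
            if i + s < len(f):
                inc = abs(f[i + s] - f[i])
                if inc > 0:
                    dts.append(np.log(s))
                    incs.append(np.log(inc))
        if len(dts) < 2:
            return np.nan
        slope, _ = np.polyfit(dts, incs, 1)   # slope of log-increment vs. log-scale
        return slope                          # estimate of alpha(t_i)

    # Toy signal whose roughness changes halfway through (an assumption)
    rng = np.random.default_rng(6)
    x = np.concatenate([np.cumsum(rng.normal(size=500)),   # rough part
                        np.linspace(0, 50, 500)])          # smooth part
    alphas = np.array([local_holder(x, i) for i in range(len(x) - 20)])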

Results

Thank you!
Any QUESTIONS?
SUGGESTIONS?
IDEAS?
Software I'm using:
1) MatLab
2) NeuroShell
3) FracLab
4) Statistica
5) Builder C++
