
University of Kaiserslautern

Department of Mathematics
Research Group Algebra, Geometry and Computeralgebra
Dissertation
Topological Methods for the
Representation and Analysis of
Exploration Data in Oil Industry
by
Oleg Artamonov
Supervisor: Prof. Dr. Gerhard Pfister
Kaiserslautern, 2010
1. Reviewer: Prof. Dr. Gerhard Pfister
2. Reviewer: Prof. Dr. Bernd Martin
Day of the defense: 13 August 2010
To my dear Grandfather.
Abstract
The purpose of exploration in the oil industry is to discover an oil-containing
geological formation from exploration data. In the context of this PhD project, this
oil-containing geological formation plays the role of a geometrical object, which
may have any shape. The exploration data may be viewed as a cloud of points,
that is, a finite set of points, related to the geological formation surveyed in the
exploration experiment. Extensions of topological methodologies, such as homology,
to point clouds are helpful in studying them qualitatively and are capable of resolving
the underlying structure of a data set. Estimation of topological invariants of the
data space is a good basis for asserting the global features of the simplicial model
of the data. For instance, the basic statistical idea of clustering corresponds to the
dimension of the zeroth homology group of the data. Statistics of the higher Betti
numbers can provide us with further connectivity information. This work presents
a method for topological feature analysis of exploration data on the basis of
so-called persistent homology. Loosely, this is the homology of a growing space that
captures the lifetimes of topological attributes in a multiset of intervals called a
barcode. Constructions from algebraic topology make it possible to transform the data,
to distill it into some persistent features, and then to understand how it is
organized on a large scale, or at least to obtain low-dimensional information which
can point to areas of interest. The algorithm for computing the persistent Betti
numbers via the barcode is implemented in the computer algebra system Singular
within the scope of this work.
Contents
1 Introduction
1.1 Motivations of the research.
1.2 Input exploration data.
2 Structures for Point Sets
2.1 Homological approximation of real objects.
2.2 Preliminary constructions.
2.3 Nerves of coverings, a geometric realization of the point cloud data and the similarity theorem.
2.4 Čech and Rips complexes.
2.5 Witness complexes.
2.6 Voronoi diagrams and Delaunay triangulations.
2.6.1 The dual complex.
2.6.2 The α-shape complex.
3 Multiresolution and Persistence
3.1 Levels of resolution.
3.2 Persistence homology.
4 Persistence Structures
4.1 Persistence Betti numbers of different dimensions and barcode.
4.2 The persistence module.
4.2.1 The Artin-Rees correspondence.
4.2.2 A structure theorem for graded modules over a graded PID.
5 The Realized Singular Software
5.1 The program structure.
5.2 The persistence algorithm.
5.2.1 Matrix representations.
5.2.2 A pseudo-code of the revised version of the persistence algorithm.
5.3 Examples of data processing.
5.4 Summary and concluding remarks.
A Basic Notions and Concepts
B The Singular Code
B.1 Computation of persistence Betti numbers of noisy point cloud data.
B.2 Computation of the Gröbner basis by a realization of the Buchberger-Möller algorithm.
B.3 Computation of the Gröbner basis by a realization of the approximative version of the Buchberger-Möller algorithm.
C A Reference Mapping Way and a Representative Graph
C.1 Filtering.
C.2 Clustering.
C.3 The cluster complex.
C.4 The Mayer-Vietoris blowup.
C.5 The similarity graph.
D Brief Description of the Project
References
List of Symbols and Abbreviations
List of Figures
List of Tables
Index
Acknowledgements
This work was done with the financial support of the Deutscher Akademischer
Austausch Dienst. The financial support of the University of Kaiserslautern is also
gratefully acknowledged.
First of all, I would like to thank my supervisor Prof. Dr. Gerhard Pfister for
his encouragement, support, and valuable help with Singular programming. He
is the one who gave me a great opportunity to do my Ph.D. research in a supportive
and comfortable atmosphere.
Next, I would like to say "Thank you so much!" to Dr. Hennie Poulisse,
Principal Research Mathematician at Shell. It may be said with absolute certainty
that the results of this thesis would not have appeared without my cooperation with
Hennie. He became the thesis advisor during my one-year internship at the
Department of Exploratory Research of Shell Research, Rijswijk, The Netherlands.
My special thanks go to Prof. Dr. Dirk Siersma from Utrecht University. It
was kind of him to help resolve problems that arose unexpectedly in The
Netherlands.
I want to express my sincere appreciation to Matthew Heller for the fruitful
accompanying discussions.
Finally, I want to acknowledge the many members of the Research Group
Algebra, Geometry and Computeralgebra of the Department of Mathematics of the
University of Kaiserslautern, among whom I spent all these years.
Chapter 1
Introduction
1.1 Motivations of the research.
In the oil industry, in order to attain more effective extraction, it is necessary to
obtain information about the location of oil fields and also about the shapes of the
underground reservoirs in which petroleum gathers. The reason is clear: since drilling
a single oil well is extremely expensive, it is crucial to understand roughly what the
underground geological formation looks like. This kind of knowledge requires processing
huge amounts of experimental exploration data, which always contains a lot of noise
and also has missing information. The data obtained after a series of explosions
corresponds to the times of arrival of post-explosion reflected waves at a network of
special sensor detectors.
Fig. 1.1: Seismic acquisition on land using a dynamite source and a cable of geophones.
The data coming from real applications is massive, and it is not possible to
discern and visualize its structure even at low resolution. The purpose of this work
is to modify recently developed techniques of topological data analysis and to apply
them to these specific applied objectives. The main idea is a partial clustering of
the data guided by the construction of simplicial complexes that serve as
topological approximations.
Algebraic topology can be loosely described as the study of spaces through
their algebraic images [18]. Since most of the information about topological spaces
can be obtained through diagrams of discrete sets, the gist of the method is to
reduce high-dimensional data sets and to find a simplicial representation of the
reduced data, with far fewer points, which still encodes some essential topological
and geometric information of the original data at a specified resolution.
So the method is based on ideas of algebraic topology. We map our cloud
of noisy data to a combinatorial object, some simplicial complex, whose
interconnections reflect important aspects of the topological features of the geological
object under investigation.
1.2 Input exploration data.
The initial input information about the geological formation is data of a multilevel
nature which may be viewed as a cloud of points, that is, a finite set of
points, related to the geological formation surveyed in the exploration experiment.
It can therefore be obtained only as a noisy sampling of the real geometrical object.
By sampling with noise we mean sampling points from a probability
distribution concentrated near the surface of the underground geological formation. As
was mentioned, the data is obtained after surface or underground explosions and
corresponds to the times of arrival of post-explosion reflected waves at a network of
geophones or hydrophones.
In the simplest case, the detectors are located on a straight line starting from the
explosion point, with equal distances between them. In this case the data
can be coarsely represented as a plot of points: the horizontal axis of the plot
shows the distances of the detectors from the explosion point, and the vertical
one the times of arrival of the reflected sound waves at the detectors. The points of the
diagram are distributed around a series of hyperbolas which, after straightening
and rectification of the curves, give an approximate image of the underground rock
strata.
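The hyperbolic pattern can be illustrated with a minimal Python sketch. It assumes the textbook idealization of a single flat horizontal reflector under a medium of constant velocity, which the thesis does not state explicitly; the function name and the numerical values (depth, velocity, detector spacing) are hypothetical and chosen only for illustration.

```python
import math

def reflection_time(offset_m, depth_m, velocity_m_s):
    """Two-way travel time of a wave reflected by a flat horizontal layer.

    Assumption: constant velocity and a horizontal reflector, so that
    t(x)^2 = t0^2 + (x / v)^2 with t0 = 2 * depth / v (a hyperbola in x).
    """
    t0 = 2.0 * depth_m / velocity_m_s          # zero-offset two-way time
    return math.sqrt(t0 ** 2 + (offset_m / velocity_m_s) ** 2)

# Hypothetical line of equally spaced detectors (offsets in metres).
offsets = [100.0 * i for i in range(6)]
times = [reflection_time(x, depth_m=1000.0, velocity_m_s=2000.0) for x in offsets]

for x, t in zip(offsets, times):
    print(f"offset {x:6.1f} m   arrival time {t:.4f} s")
```

In this idealized setting every point (offset, time) lies exactly on one hyperbola; real data scatter around several such curves because of noise and multiple reflectors.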
The curvature carries information about the a priori unknown density of the
inhomogeneous rock medium. Since the underground geological formation can
have an extremely complicated shape, the signals arriving after the reflections
from such a surface can reach the detectors in a complicated succession. For
example, a syncline reflector yields a bow-tie shape in the zero-offset section.
Fig. 1.2: 3D marine seismic acquisition, with multiple streamers towed behind a vessel.
Fig. 1.3: A syncline reflector (left) yields a bow-tie shape in the zero-offset section (right).
In practice it happens very often that several signals from different directions
arrive at one detector. Every such signal is a minimal bit of information for us,
and therefore we will treat the post-explosion reflected waves arriving at
a sensor detector as our initial input points. In the case considered, a signal
can be parametrized by $(\rho_i, t_i, A_i)$, where $\rho_i$ is the distance of the i-th sensor
detector from the explosion point, $t_i$ is the interval between the explosion time and
the time when the signal arrived at the i-th detector, and $A_i$ is the amplitude of
the signal.
Fig. 1.4: Reections in time (a) and in depth (b).
In the case of a network of sensor detectors with the explosion point in the
middle, we just take a polar system of coordinates and increase the dimension
of the parametrization. Let us first choose a system of coordinates with the
horizontal axis from West to East and the vertical one from South to North.
Also let $\rho_i$ be the modulus of the radius vector from the origin of coordinates to
the i-th detector, and let $\theta_i$ be the angle between the positive horizontal semi-axis
and the radius vector. Then we have $(\rho_i, \theta_i, t_i, A_i)$ as the parametrization
of the signal arriving at time $t_i$ at the i-th sensor detector.
Fig. 1.5: (a) Transmission response of the noise sources in the subsurface observed at the
surface. (b) Synthesized reection response, obtained by seismic interferometry. (c) Synthesized
reection depth image from reection responses as in (b).
So, in the simplest case, we have as input the experimentally obtained point
cloud $X \overset{\text{def}}{=} \{x_i \mid x_i = (\rho_i, t_i, A_i)\}$, which can be treated as a finite set of N points
equipped with the Euclidean distance function $d(x_i, x_j)$ between any $x_i, x_j \in X$.
Of course, any other metric which is a reasonable proxy for an intuitive notion
of similarity can also be used.
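As a minimal sketch of this input structure, the following Python fragment stores a point cloud of (distance, time, amplitude) triples and computes its matrix of pairwise distances. The sample values and the function names are hypothetical; the metric is passed as a parameter, so the Euclidean distance can be swapped for any other notion of similarity. Since the three coordinates carry different physical units, some rescaling before taking Euclidean distances is presumably needed in practice; that step is not shown here.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def distance_matrix(points, metric=euclidean):
    """Full symmetric matrix of pairwise distances d(x_i, x_j)."""
    n = len(points)
    return [[metric(points[i], points[j]) for j in range(n)] for i in range(n)]

# Hypothetical signals (distance to detector, arrival time, amplitude).
cloud = [(0.0, 1.00, 0.8), (100.0, 1.01, 0.7), (200.0, 1.05, 0.5)]
D = distance_matrix(cloud)
```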
How to derive a proper sampling of underground objects from exploration
experiments is an entirely separate science. This very challenging
task requires a sufficient number of explosions and sensors, as well as proper
locations for them. So the problem of extracting topological and geometrical
information about an underground geological formation has two separate aspects:
geophysical and mathematical. The first one is beyond our control, and here we can
only rely on the professionalism of geophysicists. As this work is devoted to the
mathematical side of the problem, we further assume that we have obtained a suitably
nice sampling of the object under investigation, which is uniformly distributed
with a sufficient density.
For everybody who is interested in the geophysical aspect of the problem,
there is a lot of easily available literature. For example, see the websites listed
below:
• five books by Jon Claerbout: http://sepwww.stanford.edu/sep/prof;
• Biondo Biondi's publications:
http://sepwww.stanford.edu/data/media/public/sep//biondo/biblio_frame.html;
• books from Samizdat Press: http://samizdat.mines.edu;
• Guy G. Drijkoningen's lecture notes and pictures:
http://geodus1.ta.tudelft.nl/PrivatePages/G.G.Drijkoningen.
See also [2] and [33].
Chapter 2
Structures for Point Sets
2.1 Homological approximation of real objects.
A principal problem within computational topology is recovering the topology
of a finite point set. The assumption is that the acquired point set is sampled
from some underlying topological space whose connectivity is lost during the
sampling process. Strictly speaking, we are not able to make any computation
from the input point cloud directly. Therefore, we need techniques for computing
structures that topologically approximate the underlying space of a given point
set. In other words, we have to supply additional intermediate input information
ourselves by using techniques based on a special kind of mathematical
formalism. For the sake of clarity, we begin with a topological construction
and then proceed to develop the analogous construction for the experimentally
obtained sampling.
In order to be able to do any calculations, we first have to encode our space
in a special approximation complex, which may be considered as a combinatorial
version of the topological space and whose properties may then be studied from
combinatorial, topological or algebraic aspects. In most general terms, algebraic
topology offers two methods for gauging the global properties of a particular
topological space, X, by associating with it a collection of algebraic objects. The first
set of invariants are the homotopy groups, $\pi_i(X)$.* A much less computationally
expensive and, therefore, more practical approach is the second set of invariants,
which will be the main object of study in this work. The k-dimensional homology
groups, $H_k(X)$, provide information about properties of chains formed
from simple oriented units known as simplices. As opposed to homotopy groups,
homology groups can be computed using the methods of linear algebra.
* Homotopy groups contain information on the number and kind of ways one can map a
k-dimensional sphere $S^k$ into X, with two spheres in X considered equivalent if they are
homotopic, i.e. belong to the same equivalence class relative to some fixed basepoint. The main
object here is the so-called fundamental group, the group of homotopy classes of loops in the
space. Here a path in X is a continuous map $\gamma : [0, 1] \to X$, and a loop is a path with
$\gamma(0) = \gamma(1)$, i.e. one that starts and ends at the same basepoint.
We want to recover accessible information about the solid shape of a geometric
object in $\mathbb{R}^3$ from the finite point cloud of noisy exploration data
empirically sampled from the object. What attributes of the original space can
be recovered from this data? Briefly, the idea is as follows.
Imagine a volume of oil and gas in some reservoir. This volume can be considered
as a manifold, i.e. as an algebraic surface in 3-space. The surface is embedded
in the reservoir rock and also captures faults, as well as impermeable layers in the
reservoir rock, in which, as a result, there can be no oil or gas. In this context,
these anomalies can be interpreted as holes of the algebraic surface. Complete
information about the number and type of the holes contained in a topological
space goes beyond standard homological approaches. The features of a geometric model
which can be obtained by the mathematical techniques at our disposal are the three
types of holes characterizing its connectivity:
• the gaps that separate components;
• the tunnels that pass through the shape;
• the voids that are components of the complement space inaccessible from
the outside.
The decomposition into path-connected components is the level-zero connectivity
information, loops are the level-one connectivity information, and so forth. Homology
groups offer a formal algebraic framework for studying and counting holes in a
topological space. Due to the Alexander duality property [18], the ranks of the
first three homology groups, or Betti numbers, $\beta_0$, $\beta_1$, $\beta_2$, count the numbers of the
above-mentioned gaps, tunnels, and voids (see [42]).* In this manner, homology
gives a finite compact description of the connectivity of the object's shape. For
instance, the picture and table below show simple subspaces of the
three-dimensional sphere, $S^3$, and their Betti numbers, respectively. E.g. we
can see that the torus is one connected component, has two tunnels, and encloses
one void; correspondingly, $\beta_0 = 1$, $\beta_1 = 2$, and $\beta_2 = 1$. The skeleton of a tetrahedron has
no voids and forms three tunnels,** so $\beta_1 = 3$ and $\beta_2 = 0$ in this case.
* In an informal sense, the k-th Betti number $\beta_k(X)$ measures the number of k-dimensional
holes in the space X. Tunnels and voids belong to the complement of the complex considered.
** The edges of the skeleton of a tetrahedron form four triangle-shaped cycles but, since one
triangle may be represented as a linear sum of the other ones, the four together span a vector
space of 1-cycles of rank three.
Fig. 2.1: Four simple subspaces of $S^3$.

                              $\beta_0$   $\beta_1$   $\beta_2$
Sphere $S^2$                      1           0           1
Torus $T^2$                       1           2           1
Skeleton of a Tetrahedron         1           3           0
Crystals Grid                     1           3           0

Tab. 2.1: Betti numbers $\beta_0$, $\beta_1$ and $\beta_2$ of the geometrical objects from Figure 2.1.
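Two rows of the table can be checked by hand with linear algebra. The sketch below is not the Singular implementation of this thesis; it is a small self-contained Python illustration that computes Betti numbers over the rationals as dim C_k - rank d_k - rank d_{k+1}, where d_k is the simplicial boundary matrix. The sphere is modelled, as an assumption, by the hollow boundary of a tetrahedron; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank of a matrix (list of rows) over the rationals, by Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def boundary_matrix(k_simplices, faces):
    """Rows are (k-1)-faces, columns are k-simplices, entries are orientation signs."""
    index = {f: i for i, f in enumerate(faces)}
    mat = [[0] * len(k_simplices) for _ in faces]
    for col, s in enumerate(k_simplices):
        for pos in range(len(s)):
            face = s[:pos] + s[pos + 1:]      # delete the vertex at position pos
            mat[index[face]][col] = (-1) ** pos
    return mat

def betti_numbers(simplices_by_dim):
    """simplices_by_dim[k] lists the k-simplices as sorted vertex tuples."""
    top = len(simplices_by_dim) - 1
    ranks = [0] * (top + 2)                   # ranks[k] = rank of d_k, with d_0 = 0
    for k in range(1, top + 1):
        if simplices_by_dim[k]:
            ranks[k] = rank(boundary_matrix(simplices_by_dim[k], simplices_by_dim[k - 1]))
    return [len(simplices_by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(top + 1)]

V = [(i,) for i in range(4)]
E = list(combinations(range(4), 2))
T = list(combinations(range(4), 3))

skeleton = betti_numbers([V, E, []])   # edges of a tetrahedron only
sphere = betti_numbers([V, E, T])      # hollow tetrahedron, a triangulated 2-sphere
print(skeleton, sphere)
```

The skeleton has six edges and a boundary map of rank three, which gives the three independent tunnels; adding the four triangular faces fills those cycles and leaves a single enclosed void.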
Moreover, since homology is an invariant, we may represent the shape
combinatorially by a simplicial complex which has the same connectivity and
therefore leads to the same result. Of interest to us is the case when a subcomplex of a
triangulation of $S^3$ encloses a void, where the void is the empty space enclosed
by the complex. One may be interested in finding out which holes are long-lasting,
that is, persist over a certain parameter range in the course of time, and which
can be safely ignored as topological noise. It is for detecting and counting
these holes, which are represented by the so-called persistent Betti numbers, that the
algorithms from computational homology are adapted and implemented in the
scope of this PhD research.
Observe that the data which we treat as initial input information are not
spatial coordinates but parametrized post-explosion reflected signals that have
arrived at the sensor devices. Therefore, the Betti numbers which can be extracted from
the data have a complex physical meaning and cannot be directly interpreted as
the above-mentioned multidimensional connectivity information about the
underground geological formation under investigation. Nevertheless, since each signal
carries information about the spatial point of the object surface from which
it was reflected, the available Betti numbers still carry the desired
information and thus are valuable for further cleaning from noise and for distilling
persistent features. How to infer the Betti numbers of the desired underground
shape from those of the noisy input cloud X is partially a geophysical
question and, as we mentioned above, a geophysical interpretation of the topological
information regarding the data is beyond the scope of this work. So, after the
assumptions about the sampling, we concentrate our attention on preliminary
approximation constructions that are also finite combinatorial representations
fit for machine computation.
2.2 Preliminary constructions.
Homology is a topological invariant that is frequently used in practice, since it
is computable by linear algebraic methods in all dimensions. The homological
method of data investigation characterizes the connectivity of a space X through
the structure of its holes (see e.g. [18]) by studying them via equivalence classes
of cycles called homology classes. In order to be able to calculate homology by
computer, we need to deal with structures amenable to finite computation. It
is not possible to carry out direct computations of homology groups from the
definition but, since the homology of simplicial complexes is algorithmically
computable, it is necessary to use special techniques for spaces which are equipped
with a homeomorphism to some structure which captures the topology of the
data. In our case, capturing the topology means an approximation of X
in terms of homology. So any of our calculations first requires solving this
theoretically challenging problem, i.e. the construction of such a simplicial complex.
A huge amount of literature is devoted to the problem of constructing
simplicial complexes that represent or approximate a geometric object in some
finite-dimensional Euclidean space. As mentioned in [14], this is a special case of the
grid generation problem and can be divided into two parts:
1) choose the points or vertices of the grid;
2) connect the vertices using edges, triangles, and higher-dimensional simplices.
Unfortunately, most existing simplicial approximation algorithms are
prohibitively expensive, since they give too many cells in the approximating complex
or are valid only in low-dimensional cases. So, in order to estimate topological
invariants of X, it is necessary to find a simplicial complex construction which
uses relatively few cells and can be efficiently computed in an arbitrary metric
space.
One natural operation for the approximation of the original surface is the
construction of a triangulated surface using the data points as vertices. Since the
topological invariants which we want to measure are rather coarse features of
the data, it is much more efficient and entirely sufficient to construct
less detailed approximations; e.g. a triangle is topologically equivalent to a circle.
Fig. 2.2: Data sampled from a circle.
In this vein, we can just approximate the intrinsic metric structure of the
data by computing shortest paths in a local connectivity graph and define the
so-called Delaunay complex, a complex which is defined in terms of this intrinsic
path-length geometry instead of the extrinsic Euclidean geometry. On the other hand, this
construction may be interpreted as the dual graph of the Voronoi diagram (see [16]),
where, by definition, the Voronoi cells are required to overlap so that their
intersection structure can be used. We are going to use restricted versions of these complexes
for the estimation of Betti numbers from the point-cloud data. Furthermore, we refer
to [30] for a good survey of different construction algorithms and, in order to
give some context and some comparison, we will start, in due course, with several
well-known simplicial complex constructions.
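The first step, approximating the intrinsic metric by shortest paths in a local connectivity graph, can be sketched in a few lines of Python. The sketch assumes one particular choice of connectivity graph, namely joining two points whenever their Euclidean distance is at most a threshold `eps`; both the threshold and the function name are hypothetical. The Floyd-Warshall recursion used here costs O(n^3) and is meant only for small illustrative inputs.

```python
import math

def graph_geodesics(points, eps):
    """Shortest-path distances in the graph joining points at distance <= eps."""
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if d <= eps:                      # local connectivity: keep short edges only
                dist[i][j] = dist[j][i] = d
    for k in range(n):                        # Floyd-Warshall relaxation
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Eight points sampled from the unit circle (compare Fig. 2.2).
circle = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8)) for k in range(8)]
geo = graph_geodesics(circle, eps=0.8)
chord = 2 * math.sin(math.pi / 8)             # distance between neighbouring samples
```

For two opposite sample points the Euclidean distance is the diameter 2, while the path-length distance follows the circle through four consecutive chords, which is what an intrinsic metric is expected to capture.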
2.3 Nerves of coverings, a geometric realization of the
point cloud data and the similarity theorem.
All definitions relating to algebraic topology are presented in Appendix A
of this work. As it is infeasible to include an entire course of algebraic topology
here (for this see e.g. [7], [18], [20], [28], [29], [32]), we give only the most important
notions in order to make the exposition smooth.
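An affinely independent point set $T \subseteq X \subset \mathbb{R}^d$ defines the k-simplex
$\sigma_T = \operatorname{conv} T$ of dimension $k = \dim \sigma_T = \operatorname{card} T - 1$, whose vertices are the points
of T. The standard k-simplex can be taken to be the convex hull of the standard basis
vectors in $\mathbb{R}^{k+1}$. An abstract simplicial complex, $\mathcal{K}$, is a finite collection of simplices
that satisfies the following two properties:
1) if $\sigma_T \in \mathcal{K}$ and $S \subseteq T$ then $\sigma_S \in \mathcal{K}$;
2) if $\sigma_T, \sigma_V \in \mathcal{K}$ then $\sigma_T \cap \sigma_V = \sigma_{T \cap V}$.
Each k-simplex has k+1 faces which are (k-1)-simplices; each face is obtained
by deleting one of the vertices. If $\sigma_T \in \mathcal{K}$, then all its faces belong to the complex.
$\mathcal{K}$ has the dimension $\dim \mathcal{K} \overset{\text{def}}{=} \max_{\sigma \in \mathcal{K}} \dim \sigma$, the vertex set
$\operatorname{vert} \mathcal{K} \overset{\text{def}}{=} \bigcup_{\sigma_T \in \mathcal{K}} T$, and the underlying space
$|\mathcal{K}| \overset{\text{def}}{=} \bigcup \mathcal{K} = \bigcup_{\sigma \in \mathcal{K}} \sigma$. A subcomplex of $\mathcal{K}$ is a simplicial
complex $\mathcal{L} \subseteq \mathcal{K}$. A triangulation of X is a simplicial complex $\mathcal{K}$ together with a
homeomorphism between X and $|\mathcal{K}|$; X is triangulable if there exists a simplicial
complex $\mathcal{K}$ such that $|\mathcal{K}|$ is homeomorphic to X (see [29], [31]).
Footnote (to the Voronoi diagram mentioned in Section 2.2): Let us call a subset $L \subseteq X$ the
set of landmark points. For $\lambda \in L$ the Voronoi cells
$V_\lambda \overset{\text{def}}{=} \{x \in X \mid d(x, \lambda) \le d(x, \lambda') \text{ for all } \lambda' \in L\}$
form a covering of X with the Delaunay complex attached to L as the nerve. The Voronoi
diagram of X is the decomposition of X into Voronoi cells. The Delaunay triangulation is the
abstract simplicial complex whose vertex set is X, and where a family $\{x_0, x_1, \dots, x_k\}$
spans a k-simplex if and only if $V_{x_0} \cap V_{x_1} \cap \dots \cap V_{x_k} \neq \emptyset$ for all
$k \ge 0$; it is geometrically realised as a triangulation of the convex hull of X.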
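The first defining property of an abstract simplicial complex, closure under taking faces, is purely combinatorial and easy to check by machine. The following Python sketch represents each simplex by the sorted tuple of its vertex labels; this representation and the function names are assumptions made for the illustration only.

```python
from itertools import combinations

def downward_closure(maximal_simplices):
    """All non-empty faces of the given simplices, as sorted vertex tuples."""
    closed = set()
    for s in maximal_simplices:
        verts = tuple(sorted(s))
        for k in range(1, len(verts) + 1):
            closed.update(combinations(verts, k))
    return closed

def is_closed_under_faces(simplices):
    """True if every non-empty proper face of every simplex is also present."""
    simplices = {tuple(sorted(s)) for s in simplices}
    return all(face in simplices
               for s in simplices
               for k in range(1, len(s))
               for face in combinations(s, k))

def dimension(simplices):
    """dim K = max over simplices of (number of vertices - 1)."""
    return max(len(s) for s in simplices) - 1

# A filled triangle {0, 1, 2} with an extra edge {2, 3} attached.
K = downward_closure([(0, 1, 2), (2, 3)])
vertex_set = {v for s in K for v in s}
```

Here `K` contains four vertices, four edges and one triangle, whereas a lone edge without its endpoints, such as `{(0, 1)}`, fails the closure test.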
Points in d-dimensional space are in non-degenerate position if no d+2 of them
are cospherical; in other words, if there are no four points on
one circle, five points on one sphere, and so forth. The non-degenerate position
can easily be simulated computationally by a slight symbolic perturbation. So,
since a joggled input is at our service, we assume from now on that the points of X are in
non-degenerate position. Without this assumption we get cells that are not
simplices.
We assume we are given a point set X embedded in $\mathbb{R}^d$. As was mentioned,
the point set does not have any interesting topology by itself, and so we begin
by approximating the underlying space by pieces of the embedding space. An
approach that lets us decouple geometry from topology is a covering of X, a
collection
$$\mathcal{U} \overset{\text{def}}{=} \{U_i\}_{i \in I}, \quad U_i \subseteq \mathbb{R}^d, \quad X \subseteq \bigcup_{i \in I} U_i,$$
where I is an indexing set. It is an open or closed covering if each $U_i \in \mathcal{U}$ is open
or closed, and it is a finite covering if the indexing set I is finite. Topological
attributes can be localized by attaching them to elements of the covering.
The nerve of a finite covering $\mathcal{U} = \{U_i\}_{i \in I}$ is the set of collections of cover elements with
non-empty common intersection:
$$\mathcal{N} = \mathcal{N}(\mathcal{U}) \overset{\text{def}}{=} \operatorname{nerve} \mathcal{U} = \Big\{ \{U_j\}_{j \in J} \;\Big|\; \bigcap_{j \in J} U_j \neq \emptyset,\ J \subseteq I \Big\}.$$
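For a finite covering whose elements are given as finite sets, the nerve can be enumerated directly from this definition. The Python sketch below does so up to a chosen dimension; the dictionary representation of the covering, the `max_dim` cut-off and the example (twelve sample points on a circle, indexed 0 to 11 and covered by three overlapping arcs) are assumptions made for illustration.

```python
from itertools import combinations

def nerve(cover, max_dim=2):
    """Nerve of a finite covering.

    cover maps an index i to the set U_i; a tuple J of indices belongs to
    the nerve exactly when the sets U_j, j in J, have a common element.
    """
    names = sorted(cover)
    simplices = set()
    for size in range(1, max_dim + 2):
        for J in combinations(names, size):
            if set.intersection(*(cover[j] for j in J)):
                simplices.add(J)
    return simplices

# Three overlapping arcs covering twelve points sampled around a circle.
arcs = {
    "A": set(range(0, 6)),          # points 0..5
    "B": set(range(4, 10)),         # points 4..9
    "C": {8, 9, 10, 11, 0, 1},      # wraps around through point 0
}
N = nerve(arcs)
```

The arcs overlap pairwise but have no common point, so the nerve consists of three vertices and three edges without a 2-simplex: a hollow triangle, which has the same connectivity as the circle.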
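A geometric realization of $\mathcal{N}(\mathcal{U})$ is a simplicial complex, $\mathcal{G}$, together with a
bijection r between $\mathcal{U}$ and $\operatorname{vert} \mathcal{G}$, so that $\{U_j\}_{j \in J},\ J \subseteq I$, belongs to $\mathcal{N}$ if and only if
the simplex spanned by $r(\{U_j\}_{j \in J})$ is in $\mathcal{G}$.
Since $\mathcal{V} \subseteq \mathcal{U} \in \mathcal{N}$ implies $\mathcal{V} \in \mathcal{N}$ (here $\mathcal{U}$ and $\mathcal{V}$ denote elements of the nerve,
i.e. subcollections of the covering), the nerve is closed under taking subsets, and we need to
construct an embedding in order to obtain a geometric simplicial complex from it. Let us assume that
each element $\mathcal{N}_i \overset{\text{def}}{=} \{U_i \mid U_i \in \mathcal{N}(\mathcal{U})\}$ of $\mathcal{N}(\mathcal{U})$ is represented by a point of $\mathbb{R}^d$
and that any subset of $\mathcal{N}$ is then represented by the convex hull, conv, of the
corresponding points. Now find an injection
$$r : \mathcal{N} \to \mathbb{R}^d$$
so that
$$\operatorname{conv} r(\mathcal{U}) \cap \operatorname{conv} r(\mathcal{V}) = \operatorname{conv} r(\mathcal{U} \cap \mathcal{V}) \quad \text{for all } \mathcal{U}, \mathcal{V} \in \mathcal{N}.$$
The simplicial complex $\mathcal{G} = \operatorname{conv} r(\mathcal{N}(\mathcal{U}))$ is the geometric realization of
$\mathcal{N}$, and the underlying space of $\mathcal{G}$ is the part of $\mathbb{R}^d$ covered by its simplices,
$|\mathcal{G}| = \bigcup_{\sigma \in \mathcal{G}} \sigma$.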
Now we are ready to formulate the crucial point of the complex approximation.
This is a famous result of algebraic topology, the so-called nerve theorem
of combinatorial topology, also known as the similarity theorem or Leray's theorem
(see e.g. [25]).
Theorem 2.1. Let $\mathcal{U} = \{U_i\}_{i \in I}$ be a finite closed covering of a triangulable space
$X \subseteq \mathbb{R}^d$ such that, for every $J \subseteq I$, the intersection $\bigcap_{j \in J} U_j$ is either empty or
contractible. Let $\mathcal{G}$ be a geometric realization of $\mathcal{N}(\mathcal{U})$. Then X is homotopy equivalent
to the underlying space $|\mathcal{G}|$, and therefore has homology isomorphic to that of $|\mathcal{G}|$.
Loosely, the nerve of a good cover is homotopy equivalent to the space that is covered.
This theorem is the basis of most methods for point set representations. In
each case we search for a good covering whose nerve will be our representation.
So it is natural now to define the approximation/similarity simplicial complex
of X, associated with the covering $\mathcal{U}$, as the underlying space of the geometric
realization of the nerve of the covering: $|\operatorname{conv} r(\mathcal{N}(\mathcal{U}))|$. Let us give a more
extensive definition.
An approximation simplicial complex of $X \subseteq \mathbb{R}^d$ associated with the covering
$\mathcal{U}$ is the abstract simplicial complex $\mathcal{K}(X) = |\mathcal{G}(\mathcal{N}(\mathcal{U}))|$, whose vertex
set is the indexing set $R = r(\mathcal{N}(\mathcal{U}))$, where r is the injection $\mathcal{N}_i \mapsto r_i$
which defines the geometric realization, and where a family $\{r_0, r_1, \dots, r_k\}$
spans a k-simplex if and only if the corresponding elements of $\operatorname{conv} r(\mathcal{N}(\mathcal{U}))$
have a non-empty common intersection:
$$\mathcal{G}(\mathcal{N}_{i_0}) \cap \mathcal{G}(\mathcal{N}_{i_1}) \cap \dots \cap \mathcal{G}(\mathcal{N}_{i_k}) \neq \emptyset.$$
This work is devoted to the strictly applied problem of estimating the
topological structure of the underground geological formation via the homology groups,
or Betti numbers, of the similarity complex. These coarse topological structures
are invariant under homotopy equivalence, which is therefore an appropriate and
sufficient notion of equivalence for our purposes. The simplicial complex
approximation involves the following aspects.
1. The construction of a simplicial complex, $\mathcal{K}(X)$, which depends on X and
possibly on additional parameters, but not on the underlying geometric object from
which X was sampled. The similarity theorem asserts that such an approximation
complex captures topological features of the set.
2. A similarity simplicial complex reflects the homology of the underlying object if
the object is homotopy equivalent, or even homeomorphic, to $\mathcal{K}(X)$.
These relations require reasonable conditions on X as a
sample of the object, and some choice of values for the additional parameters.
The important point here is that when we construct some covering of X, we
thereby construct a covering of the initial space, and now, in order to be
able to switch from one to the other, we need the sampling to be good enough.
In our case, obviously, such goodness could be provided by a sufficient number
of sensor devices receiving the signals, a fortunate location of these geophones or
hydrophones, and also a sufficient number of explosions located in proper places.
Assessing the extent of such sufficiency is, however, a geophysical problem. Since
our research is devoted to the mathematical side of the problem, it is not within
the scope of this work.
So from now on, we assume that we have received a sufficiently fine
sampling from professional geophysicists and concentrate our attention instead on
the construction of the similarity complex. This complex $\mathcal{K}(X)$ ensures that the
relations finally imply an approximation of the topological structure of the underlying
object. It is possible to construct several such simplicial complexes, each with its own
advantages and disadvantages. We must analyze these complexes to compute topological
invariants attached to the geometric object around which our data is concentrated.
2.4 Čech and Rips complexes.
Simplicial complex approximations are well understood if they can be interpreted
as the nerve of a covering of a space (see [36]). Below we will use non-empty
intersections of elements of the covering | =U
i

iI
of X R
d
in building an
another complex from X.
The Čech complex of the covering $\mathcal{U}$ is the abstract simplicial complex, $\check{C}$,
whose vertex set is the indexing set I, and where a family $\{i_0, i_1, \dots, i_k\}$
spans a k-simplex if and only if $U_{i_0} \cap U_{i_1} \cap \dots \cap U_{i_k} \neq \emptyset$.
That is, $\check{C}$ is precisely the nerve $\mathcal{N}(\mathcal{U})$. We treat the corresponding elements of
$\{U_i\}_{i \in I}$ as vertices in our complex whenever $U_{i_l} \cap U_{i_m} \neq \emptyset$. And then, whenever
$U_{i_0}, U_{i_1}, \dots, U_{i_k}$ are overlapping, we add a k-simplex $\sigma = [i_0, i_1, \dots, i_k]$ to the
Čech complex.
We need some additional definitions to establish a correspondence between X
and $\mathcal{N}(\mathcal{U})$ by a kind of partial coordinatization of X with values in $\mathcal{N}(\mathcal{U})$.
A partition of unity subordinate to the finite open covering $\mathcal{U}$ is a family of real
valued functions $\{\varphi_i\}_{i \in I}$ with the following properties:
1) $0 \le \varphi_i(x) \le 1$ for all $i \in I$ and $x \in X$;
2) $\sum_{i \in I} \varphi_i(x) = 1$ for all $x \in X$;
3) the closure of the set $\{x \in X \mid \varphi_i(x) > 0\}$ is contained in the open
set $U_i$.
The barycentric coordinatization is a bijection between the points p of a k-simplex
$\sigma_T$, $T = [p_0, p_1, \dots, p_k]$, and the set of ordered (k+1)-tuples of real numbers
$(\lambda_0, \lambda_1, \dots, \lambda_k)$ so that:
1) $0 \le \lambda_j \le 1$, $j \in \{0, 1, \dots, k\}$;
2) $\sum_{j=0}^{k} \lambda_j = 1$;
3) $\sum_{j=0}^{k} \lambda_j p_j = p$.
The numbers $(\lambda_0, \lambda_1, \dots, \lambda_k)$ are the barycentric coordinates of the point p
with respect to the simplex $\sigma_T$, which are unique and non-negative for all
$p \in \sigma_T$.

The barycenter of $\sigma_T$ is $b_T = \frac{1}{k+1} \sum_{i=0}^{k} p_i$.
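The three conditions on barycentric coordinates can be checked numerically. The following is only a minimal sketch for the special case of a triangle (k = 2) in the plane; the function name barycentric_coordinates and the use of Cramer's rule are illustrative assumptions, not part of the original construction.

```python
def barycentric_coordinates(p, triangle):
    """Barycentric coordinates (l0, l1, l2) of p with respect to a planar triangle.

    Solves p = p0 + l1*(p1 - p0) + l2*(p2 - p0) by Cramer's rule, so that
    l0 + l1 + l2 = 1 and l0*p0 + l1*p1 + l2*p2 = p.
    """
    (x0, y0), (x1, y1), (x2, y2) = triangle
    px, py = p
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    l1 = ((px - x0) * (y2 - y0) - (x2 - x0) * (py - y0)) / det
    l2 = ((x1 - x0) * (py - y0) - (px - x0) * (y1 - y0)) / det
    return (1.0 - l1 - l2, l1, l2)


def inside_simplex(coords, tol=1e-12):
    """A point lies in the closed simplex iff all coordinates are non-negative."""
    return all(c >= -tol for c in coords)


if __name__ == "__main__":
    tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    barycenter = (1.0 / 3.0, 1.0 / 3.0)
    print(barycentric_coordinates(barycenter, tri))
    print(inside_simplex(barycentric_coordinates((1.0, 1.0), tri)))
```

For the barycenter all three coordinates coincide, in agreement with the formula for $b_T$; a point outside the triangle produces at least one negative coordinate.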
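Now for any point $x \in X$ let $\tau(x) \overset{\mathrm{def}}{=} \{i \in I \mid x \in U_i\} = (i_0, i_1, \dots, i_l) \subseteq I$. Then
we define the desired correspondence map $\rho: X \to \mathcal{N}(\mathcal{U})$ as the map $x \mapsto \rho(x)$,
where $\rho(x) \in \mathcal{N}(\mathcal{U})$ is the point in the simplex with the vertex set $\tau(x)$, whose
barycentric coordinates are
$(\lambda_0, \lambda_1, \dots, \lambda_l) = (\varphi_{i_0}(x), \varphi_{i_1}(x), \dots, \varphi_{i_l}(x)) = \{\varphi_i(x) \mid i \in \tau(x)\}$.
The map $\rho$ is continuous, provides a partial coordinatization of X and
takes values in the simplicial complex $\mathcal{N}(\mathcal{U})$.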
For this case, the nerve theorem can be formulated as follows: if, for all
non-empty $J \subseteq I$, the intersection $\bigcap_{j \in J} U_j$ is either contractible or empty, then
$\check{C}(X)$ is homotopically (and therefore homologically) equivalent to X.


It is convenient to represent each element $U_i$ of $\mathcal{U}$ as a closed Euclidean ball
$B_\varepsilon = B(x, \varepsilon) = \{y \in \mathbb{R}^d \mid d(x, y) \le \varepsilon\}$, $x \in X$, $\varepsilon \in \mathbb{R}$.
Since balls are convex, the nerve theorem implies that $\check{C}(X, \varepsilon)$ is homotopy equivalent
to the union of these balls, and therefore has a straightforward geometrical

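Let K and L be two simplicial complexes with a map $\psi: \mathrm{vert}\,K \to \mathrm{vert}\,L$ which takes the vertices
of any simplex in K to the vertices of a simplex in L. The simplicial map implied by $\psi$ is
$\hat{\psi}: K \to L$, which maps a point $p \in \sigma_T$, $T = [p_0, p_1, \dots, p_k]$, with barycentric coordinates
$\lambda_i$ to $\hat{\psi}(p) = \sum_{i=0}^{k} \lambda_i \psi(p_i)$. If $\psi$ is a bijection then $\hat{\psi}$ is a homeomorphism.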
There is a standard realization for a k-simplex as follows. The standard k-simplex $\Delta^k$ is
the convex hull of $\{e_i\}_{i \in \{0, 1, \dots, k\}}$, where $e_i = (0, \dots, 1, \dots, 0)$, with 1 in the ith position,
$i \in I = \{0, 1, \dots, k\}$, is the i-th standard basis vector for $\mathbb{R}^{k+1}$. For any indexing set $J \subseteq I$,
$\Delta^J$ is the face of $\Delta^k = \Delta^I$ spanned by $\{e_j\}_{j \in J}$. The standard simplex may be subdivided
using the barycenters of its faces to produce the simplicial complex $K^k$ with $|K^k| = \Delta^k$.
Each non-empty face $\Delta^J$ of $\Delta^k$ has an associated vertex in $K^k$. $\Delta^J$ is triangulated by the
subcomplex $K^J \subseteq K^k$ with $|K^J| = \Delta^J$.
interpretation. Here the radius $\varepsilon$ is a typical additional parameter which can
serve as a feature scale: it directly defines the scale of the geometrical features
that should be captured by $\check{C}(X, \varepsilon)$. This is a nested parameter in the sense that
$\check{C}(X, \varepsilon_1) \subseteq \check{C}(X, \varepsilon_2)$ whenever $\varepsilon_1 \le \varepsilon_2$. Later, this property will be propagated to
the induced maps between the homology groups of the corresponding complexes.
However, we should not be tempted to use $\check{C}(X, \varepsilon)$ as the approximation complex,
because it requires storage of an inappropriately large number of simplices of various
dimensions even when the underlying topological information is simple. The
complex may have dimension much higher than the original space, and such a
cumbersome construction is prohibitively expensive computationally. Much less
computationally awkward is the construction of a simplicial complex which can be
recovered solely from the edge information. So we will relax the Čech condition
for simplex inclusion by admitting every simplex whose vertices are pairwise
within some distance $\varepsilon$.
The Rips complex for the set $X \subset \mathbb{R}^d$, attached to the fixed parameter $\varepsilon$, is
the abstract simplicial complex $\mathcal{R}_\varepsilon(X)$ whose vertex set is X and whose
k-simplexes are spanned by (k+1)-tuples $\{x_0, x_1, \dots, x_k\}$ if and only if
$d(x_i, x_j) \le \varepsilon$ for all $0 \le i, j \le k$.

$\mathcal{R}_\varepsilon(X)$ is the variant of $\check{C}$ which is easier to calculate. It is also the largest
simplicial complex having the same 1-skeleton as the corresponding Čech complex
$\mathcal{N}_{\varepsilon/2}(X)$. The definition makes sense for an arbitrary metric structure on
X, and it avoids the calculations needed to determine whether a set of Euclidean
balls has a non-empty common intersection. There are obvious inclusions
$\check{C}_\varepsilon(X) \subseteq \mathcal{R}_{2\varepsilon}(X) \subseteq \check{C}_{2\varepsilon}(X)$.
The disadvantage of the Rips complex is that it is not the nerve of any covering,
and it is therefore not amenable to the nerve theorem. This means that
$\mathcal{R}_\varepsilon(X)$ is actually not always homotopy equivalent to X. Despite this, the Rips
complex is widely used for approximations in cases where it is homology isomorphic
to X, and after some optimal factorisation where this is not the case (for
more detail see [9]).
The picture below shows Čech and Rips complexes constructed
from point cloud data for a particular $\varepsilon$.
Another drawback of the Vietoris-Rips complex is its wastefulness from a computational
point of view. One way to circumvent this problem is to use the Voronoi
diagram to define nested families of special complexes which make frugal
use of simplices, but are nevertheless easily computed.
Fig. 2.3: A fixed set of points can be completed to $\check{C}_\varepsilon(X)$ or to $\mathcal{R}_\varepsilon(X)$. The Čech complex
has the homotopy type of the $\varepsilon/2$ cover, $S^1 \vee S^1 \vee S^1$, while the Rips complex has homotopy
type $S^1 \vee S^2$.
2.5 Witness complexes.
The strong witness complex $\mathcal{W}(X, L, \varepsilon)$ for X is the abstract simplicial
complex whose vertex set is a finite $L \subset X$, and where a family $\sigma =
\{\ell_0, \ell_1, \dots, \ell_k\} \subset X$ spans a k-simplex if and only if there is a point $x \in X$
such that $d(x, \ell_i) \le \min_{l \in L} d(x, l) + \varepsilon$ for all i.
A point $x \in X$ is an $\varepsilon$-weak witness for $\sigma$ if $d(x, l) + \varepsilon \ge d(x, \ell_i)$ for all i and
all $l \notin \sigma$.
The weak witness complex $\widetilde{\mathcal{W}}(X, L, \varepsilon)$ for X with vertex set L is the
abstract simplicial complex where $\sigma$ spans a k-simplex if and only if $\sigma \subseteq L$
and $\sigma$ and all its faces admit $\varepsilon$-weak witnesses.
Here the landmark points $L \subset X$ are chosen to be treated as the vertex set
and are assumed to be well-distributed over the data.
Witness complexes are based on the idea that the non-landmark data points
$X \setminus L$ can be used to determine the edges and higher-dimensional cells of the
complex: the edge $[\ell_i, \ell_j]$ between two landmark points is included in the complex
if there exists a data point whose two nearest neighbors in the landmark set are
$\ell_i$ and $\ell_j$.
It is very convenient to consider the versions of $\mathcal{W}(X, L, \varepsilon)$ and $\widetilde{\mathcal{W}}(X, L, \varepsilon)$
in which $\sigma$ spans a k-simplex if and only if all the pairs $(\ell_i, \ell_j)$ are 1-simplices
(see [4]).
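The edge rule described above (a data point witnesses the edge between its two nearest landmarks) is easy to state in code. The following sketch covers only this special case with $\varepsilon = 0$ and only the 1-skeleton; the function name witness_edges and the representation of landmarks by their indices are assumptions made for illustration.

```python
from math import dist


def witness_edges(points, landmark_idx):
    """Edges of the witness complex determined by the two-nearest-landmarks rule.

    points: list of coordinate tuples (the sample X).
    landmark_idx: indices into points selecting the landmark set L.
    Every non-landmark point acts as a witness for the edge joining its two
    nearest landmarks.
    """
    landmarks = list(landmark_idx)
    landmark_set = set(landmarks)
    edges = set()
    if len(landmarks) < 2:
        return edges
    for w, x in enumerate(points):
        if w in landmark_set:
            continue  # only the points of X \ L are used as witnesses
        ranked = sorted(landmarks, key=lambda i: dist(x, points[i]))
        edges.add(tuple(sorted(ranked[:2])))
    return edges


if __name__ == "__main__":
    pts = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (4.0, 0.0), (16.0, 0.0)]
    print(sorted(witness_edges(pts, [0, 1, 2])))
```

In the usage example the three landmarks lie on a line and the two remaining points witness the two consecutive edges, while the long edge between the outer landmarks has no witness and is left out.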
There is an important result which implies that, instead of looking for a single
strong witness, it is possible to consider the entire aggregate of weak witnesses
(for more detail see [8]).
Theorem 2.2. Consider points $\ell_0, \ell_1, \dots, \ell_k$ from a finite $L \subset X$. Then
$\sigma = [\ell_0, \ell_1, \dots, \ell_k]$ has a strong witness with respect to L if and only if $\sigma$ and
all its cells have weak witnesses with respect to L.
Below we consider a construction in the framework of witness complexes,
which has the following defining characteristics:
1) landmarking: the complex $\mathcal{K}$ uses a vertex set $L \subset X$ considerably smaller
than the sample X itself;
2) homotopy approximation: under favourable circumstances, there is a homotopy
equivalence $\mathcal{K} \simeq X$, and possibly a homeomorphism $\mathcal{K} \cong X$; however, it
need not be a close geometric approximation;
3) intrinsic geometry: the construction estimates and works with the intrinsic
geometry of X, which lets us avoid problems related to the embedding
dimension d;
4) non-redundancy: as opposed to the Čech complex, $\mathcal{K}$ is not predisposed to
an accumulation of redundant high-dimensional cells.
2.6 Voronoi diagrams and Delaunay triangulations.
The Voronoi cell of $p \in X \subset \mathbb{R}^d$ is the set of points in the ambient space whose
Euclidean distance from p is less than or equal to the distance from any
other point in X. That is,
$\mathcal{V}_p \overset{\mathrm{def}}{=} \{x \in \mathbb{R}^d \mid d(x, p) \le d(x, q),\ \forall q \in X\}$.
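The defining inequality of a Voronoi cell translates into a direct membership test. The sketch below is only an illustration of the definition; the function names in_voronoi_cell and voronoi_owners are hypothetical, and the tolerance parameter is an assumption introduced to make boundary points robust against rounding.

```python
from math import dist


def in_voronoi_cell(x, p, sample, tol=1e-12):
    """True iff x lies in the closed Voronoi cell of p with respect to sample."""
    d_p = dist(x, p)
    return all(d_p <= dist(x, q) + tol for q in sample)


def voronoi_owners(x, sample, tol=1e-12):
    """All sample points whose closed Voronoi cell contains x.

    More than one owner means that x lies on a common boundary face.
    """
    return [p for p in sample if in_voronoi_cell(x, p, sample, tol)]


if __name__ == "__main__":
    X = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
    print(voronoi_owners((0.5, 0.5), X))   # interior of a single cell
    print(voronoi_owners((1.0, 0.0), X))   # on the face shared by two cells
    print(voronoi_owners((1.0, 1.0), X))   # equidistant from all three points
```

A point with several owners is exactly a point in the common intersection of the corresponding cells, which is the situation that generates the simplices of the nerve.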
Each Voronoi cell is a closed and, in case p lies on the boundary of
conv X, unbounded convex polyhedron, and distinct cells have disjoint interiors.
The Voronoi cells meet at most along common boundary faces; the collection of
Voronoi cells forms the Voronoi diagram
$\mathcal{V}_X \overset{\mathrm{def}}{=} \{\mathcal{V}_p \mid p \in X\}$,
which decomposes $\mathbb{R}^d$ into Voronoi cells and forms a good covering of the entire $\mathbb{R}^d$,
since all the cells are convex.
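The Voronoi cell restricted to $\mathbb{X}$ is $\mathcal{V}_{p,\mathbb{X}} \overset{\mathrm{def}}{=} \mathcal{V}_p \cap \mathbb{X}$, $p \in X$, and the collection
of restricted Voronoi cells is the restricted Voronoi diagram, which is the
decomposition of $\mathbb{X}$ as a union of cells
$\mathcal{V}_{\mathbb{X}} = \mathcal{V}_{X,\mathbb{X}} \overset{\mathrm{def}}{=} \{\mathcal{V}_{p,\mathbb{X}} \mid p \in X\}$.
If $\mathbb{X}$ is a union of $\varepsilon$-balls then the nerve, $\mathcal{N}(\mathcal{V}_{B_\varepsilon})$, of such a restricted Voronoi
diagram is the so called alpha complex.
The collection of Voronoi cells restricted to $\mathbb{X}$ is a finite closed covering of $\mathbb{X}$.
For a subset $T \subseteq X$, the corresponding subsets are
$\mathcal{V}_T = \{\mathcal{V}_p \mid p \in T\}$ and $\mathcal{V}_{T,\mathbb{X}} = \{\mathcal{V}_{p,\mathbb{X}} \mid p \in T\} \subseteq \mathcal{V}_{\mathbb{X}}$.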
Since we assume non-degenerate position, the common intersection of any k
Voronoi cells is either empty or a convex polyhedron of dimension $d+1-k$ and,
for any $\mathcal{V} \in \mathcal{N}(\mathcal{V}_X)$, $\mathrm{card}\,\mathcal{V} \le d+1$.
Fig. 2.4: Decomposition of the plane by Voronoi cells of a finite set.
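The Delaunay complex of X, $\mathcal{D} = \mathcal{D}_X$, is the geometric realization of $\mathcal{N}(\mathcal{V}_X)$
defined by the injection
$r: \mathcal{V}_X \to \mathbb{R}^d$, $r(\mathcal{V}_p) = p$, mapping every cell of $\mathcal{V}_X$ to its generator.
That is,
$\mathcal{D}_X = \{\mathrm{conv}\, r(\mathcal{V}) \mid \mathcal{V} \in \mathcal{N}(\mathcal{V}_X)\}$.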
In other words, if two Voronoi cells share a common $(d-1)$-face then their
generating points are connected by an edge, if three cells share a common $(d-2)$-face
then their generators are connected by a triangle, etc.
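Therefore, we have
$\mathcal{D}_{\mathbb{X}} = \{\sigma_T \mid T \subseteq X,\ \bigcap \mathcal{V}_{T,\mathbb{X}} \neq \emptyset\}$,
i.e. the convex hull of k points is a cell in the Delaunay complex iff the corresponding
k Voronoi cells have a non-empty common intersection not contained
in any other Voronoi cell.
For given X, we have that $\mathcal{D}$ is unique,
$\dim \mathcal{D} = \min\{d, \mathrm{card}\,X - 1\}$ and $|\mathcal{D}| = \mathrm{conv}\,X$.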
The Delaunay complex decomposes the convex hull of X by connecting the
points with simplices of all possible dimensions. In computational geometry,
$\mathcal{D}$ is referred to as the Delaunay triangulation (see [11],[31]), which is dual to the
Voronoi diagram of the points. Advantages of these complexes are that they
are small, geometrically realizable, and their highest-dimensional simplexes have
the same dimension as the ambient space. In other words, the Delaunay triangulation
is a simplicial complex whose vertex set is X and it contains the cell
$\sigma = [x_0, x_1, \dots, x_k]$ whenever $\mathcal{V}_{x_0} \cap \mathcal{V}_{x_1} \cap \dots \cap \mathcal{V}_{x_k} \neq \emptyset$. Equivalently, $\sigma \in \mathcal{D}$ if
there is a point p which is equidistant from $x_0, x_1, \dots, x_k$ and which has no
nearer neighbour in X. Then p is a witness to the cell $\sigma$, and the assumed
non-degenerate position of the points means that each witness is equidistant from no more
than d+1 nearest neighbors in X.
By itself, the Delaunay triangulation $\mathcal{D}$ is a contractible simplicial complex,
and so its topological invariants carry no information. However, we can define
restricted complexes whose structure does reflect the topology of X.
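The restricted Delaunay triangulation (or complex) is the geometric realization
in $\mathbb{R}^d$ of the nerve of the restricted Voronoi diagram, that is
$\mathcal{D}_{\mathbb{X}} = \mathcal{D}_{X,\mathbb{X}} \overset{\mathrm{def}}{=} \{\mathrm{conv}\, r(\mathcal{V}) \mid \mathcal{V} \in \mathcal{N}(\mathcal{V}_{\mathbb{X}})\}$, $r(\mathcal{V}_{p,\mathbb{X}}) = p$, $p \in X$.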
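So the restricted Delaunay simplicial complex $\mathcal{D}_{\mathbb{X}}$ is the dual of $\mathcal{V}_{X,\mathbb{X}}$, and
it contains the cell $\sigma = [x_0, x_1, \dots, x_k]$ whenever $\mathcal{V}_{x_0} \cap \mathcal{V}_{x_1} \cap \dots \cap \mathcal{V}_{x_k} \cap \mathbb{X} \neq \emptyset$.
In other words, we demand a witness for $\sigma$ which lies on $\mathbb{X}$ itself.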
Let us repeat once again: we can talk about an approximation of the surface
$\mathbb{X} \subset \mathbb{R}^3$ by the simplicial complex $\mathcal{D}_{X,\mathbb{X}}$ only if the sampling $X \subset \mathbb{R}^d$ is
sufficiently fine. In our applied case, the quality of the approximation depends
on a sufficient density and an even distribution of the sensor devices by
which we obtain X.
Fig. 2.5: Delaunay complex corresponding to the decomposition by Voronoi cells shown in
Figure 2.4.
After the partitioning of X into subsets that correspond to elements of the
covering, we can use the interaction of the subsets formed in this way with
each other for an approximate representation of the exploration data.
Note that $\mathcal{D}_{\mathbb{X}}$ is a subcomplex of the Delaunay simplicial complex, $\mathcal{D} = \mathcal{D}_{\mathbb{R}^d}$,
of X and the relationship between $\mathbb{X}$, $\mathcal{V}_{\mathbb{X}}$ and $\mathcal{D}_{\mathbb{X}}$ is elucidated by the nerve theorem.
The nerve theorem of Leray [25] implies that, if all restricted Voronoi cells
are contractible, then the underlying space of the restricted Delaunay complex,
$|\mathcal{D}_{\mathbb{X}}| = \bigcup \mathcal{D}_{\mathbb{X}}$, is homotopy equivalent to $\mathbb{X}$. This means that the two topological
spaces can be geometrically different but, at the same time, have the same kind
of arrangement of holes, i.e. be topologically equivalent.
Here is one of the crucial points of the work, so let us give a brief summary. As mentioned
before, we refer to the underground geometric object under investigation
as a topological space $\mathbb{X}$ and a subspace of $\mathbb{R}^3$. We have only a representation of
$\mathbb{X}$ by a finite set $X \subset \mathbb{R}^d$, where d is the number of parameters in the representation
of the reflected signals. Nevertheless, since each signal carries information about the
spatial point of the object surface from which it was reflected back, the available
Betti numbers still carry the desired information, and are therefore valuable
for further cleaning from noise and distillation of persistent features. In order to
represent the geological formation and to be able to do any calculations, we need
to construct on the basis of X a simplicial complex which captures the topology of
$\mathbb{X}$. The Voronoi cells of X decompose $\mathbb{X}$ into closed convex regions, and the Delaunay
simplicial complex restricted by $\mathbb{X}$ is defined as the geometric realization
of the nerve of these regions by the map $\mathcal{V} \xrightarrow{r} p$, for all $\mathcal{V} \in \mathcal{N}(\mathcal{V}_{\mathbb{X}})$. The complex
$\mathcal{D}_{\mathbb{X}}$ is dual to the restricted Voronoi diagram, its simplices are spanned by

This construction does not even require a non-degenerate position of the points of X.
Fig. 2.6: Delaunay based triangulation of a complicated shape.
subsets $T \subseteq X$, and, if the common intersection of any subset of these restricted
Voronoi regions, $\bigcap \mathcal{V}_{T,\mathbb{X}}$, is empty or is convex and, therefore, contractible for
every $T \subseteq X$, then, by the nerve theorem, $\mathbb{X}$ and $|\mathcal{D}_{\mathbb{X}}|$ are homotopy equivalent
and so have the same topological type.

So topological properties of a simplicial complex representing the underground
space, such as whether its domain is homotopy equivalent or homeomorphic to $\mathbb{X}$,
are based on local interactions between $\mathbb{X}$ and the Voronoi
neighborhoods of the sampled points X. This leads us to the idea that additional
points can be chosen so as to improve the local interaction patterns; this can be done
by using the landmark points, and is related to our special case of the grid
generation problem.
2.6.1 The dual complex.
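A good example here is the so called dual complex of a union of balls
in $\mathbb{R}^d$, as demonstrated in [12]. Let $X \subset \mathbb{R}^d$ and define
$A \overset{\mathrm{def}}{=} \{x \in \mathbb{R}^d \mid \min_{p \in X} d(x, p) \le \varepsilon\}$, $\varepsilon \in \mathbb{R}$, $\varepsilon \ge 0$.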

Moreover, $|\mathcal{D}_{\mathbb{X}}|$ and $\mathbb{X}$ are homeomorphic if the sets can be further subdivided in a certain way
so that they form a so called regular CW complex. A closed ball is called a cell, or a k-cell if its
dimension is k. A finite collection of non-empty cells, R, is a regular CW complex if the cells have
pairwise disjoint interiors, and the boundary of each cell is the union of other cells in R (see e.g.
[26]).
The Voronoi cells $\mathcal{V}_X$ decompose A into closed convex regions, and the dual
complex is defined as the nerve of these regions, $\mathcal{N}(\mathcal{V}_X)$, geometrically realized
by the map $\mathcal{V}_{p,X} \xrightarrow{r} p$ for all $p \in X$. It is the same as the restricted Delaunay
simplicial complex $\mathcal{D}_X$. The common intersection of any subset of these regions
is convex and therefore contractible, and so the nerve theorem implies that the
underlying space of the dual complex, $|\mathcal{D}_X|$, is homotopy equivalent to A.
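Note that, in terms of open balls, the Delaunay complex, $\mathcal{D} = \mathcal{D}_X$, can be
defined as the simplicial complex determined by $X \subset \mathbb{R}^d$ that consists of all simplices
$\sigma_T$, $T \subseteq X$, for which there exists an open ball
$B = B(x, \varepsilon) = \{y \in \mathbb{R}^d \mid d(x, y) < \varepsilon\}$, $x \in \mathbb{R}^d$, $\varepsilon \in \mathbb{R}$,
with $X \cap \mathrm{cl}\,B = T$ and $X \cap B = \emptyset$.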
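For a triangle in the plane the open ball in this criterion is the circumscribing disc, so the condition reduces to the familiar empty-circumcircle test. The sketch below covers only this planar case; the function names circumcircle and is_delaunay_triangle and the tolerance parameter are assumptions made for illustration.

```python
from math import dist


def circumcircle(a, b, c):
    """Centre and radius of the circle through three non-collinear points."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        raise ValueError("collinear points have no circumcircle")
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    centre = (ux, uy)
    return centre, dist(centre, a)


def is_delaunay_triangle(triangle, sample, tol=1e-12):
    """True iff no other sample point lies in the closed circumscribing disc.

    This mirrors the conditions X ∩ cl B = T and X ∩ B = ∅ for T = triangle.
    """
    centre, radius = circumcircle(*triangle)
    others = [q for q in sample if q not in triangle]
    return all(dist(centre, q) > radius + tol for q in others)


if __name__ == "__main__":
    tri = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
    print(circumcircle(*tri))
    print(is_delaunay_triangle(tri, list(tri) + [(5.0, 5.0)]))
    print(is_delaunay_triangle(tri, list(tri) + [(3.0, 3.0)]))
```

In the usage example the circumcircle is centred at (2, 2); the point (5, 5) lies outside the disc and leaves the triangle in the Delaunay complex, while (3, 3) lies inside and removes it.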
There is an interesting result from [14], stating that, if $\mathbb{X}$ is a k-manifold with
boundary, then $\mathbb{X}$ and $|\mathcal{D}_{X,\mathbb{X}}|$ are homeomorphic if $\mathcal{V}_{X,\mathbb{X}}$ satisfies the following
closed ball property:
1) the common intersection of $\mathbb{X}$ and any $k+1-l$ Voronoi cells is either empty
or a closed l-ball;
2) the common intersection of the boundary of $\mathbb{X}$ and any $k+1-l$ Voronoi
cells is either empty or a closed $(l-1)$-ball.
The closed ball property generalizes to a sufficient condition that implies
homeomorphic reconstruction for a general triangulable space $\mathbb{X}$.
2.6.2 The α-shape complex.
The α-shape complex for the set $X \subset \mathbb{R}^d$, attached to the fixed parameter $\varepsilon$,
is the abstract simplicial complex $\mathcal{A}_\varepsilon(X)$ whose vertex set is X, and where
a family $\{x_0, x_1, \dots, x_k\}$ spans a k-simplex if and only if the convex sets
$\nu(x_i, \varepsilon) \overset{\mathrm{def}}{=} B(x_i, \varepsilon) \cap \mathcal{V}_{x_i}$, $x_i \in X$, $i = 0, 1, \dots, k$,
have a non-empty common intersection.
This complex is homotopy equivalent to the Čech complex. The α-shapes are
also nerves of different coverings of a union of the balls. However, α-complexes
include far fewer elements and have dimension at most that of the ambient space.
So, by the nerve theorem, there is a homotopy equivalence $\mathcal{A}(X, \varepsilon) \simeq B(X, \varepsilon)$,
and the advantage of the α-complex is that it uses considerably fewer cells. Observe
that, by construction, the alpha complex is always a subcomplex of the Delaunay
complex, $\mathcal{A}_\varepsilon(X) \subseteq \mathcal{D}_X$, and therefore the use of the α-shape complexes can
significantly reduce calculations, i.e. we may compute the former by computing the
latter.
The paradigm of α-shape complexes is defined for $\varepsilon \in [0, \infty]$. The smallest
complex $\mathcal{A}_0(X)$ is a discrete collection of points, and the largest complex $\mathcal{A}_\infty(X)$
is the whole Delaunay complex on the vertex set X. The appearance, survival and
disappearance of homology classes, as $\varepsilon$ varies through intermediate values, provide
detailed topological information that is statistically more robust than the Betti
numbers of the complex for any single value of $\varepsilon$. Thus we seek similar nested
families based on restricted Delaunay complexes.
Chapter 3
Multiresolution and Persistence
3.1 Levels of resolution.
We want to explore point clouds at various levels of resolution, and so to be
able to compare the outcomes at different levels. Whenever
intervals at two different resolutions have a non-empty intersection, there exists a
natural map from one set of intervals to the other. This construction produces a
multiresolution or multiscale image of the data set.
One can actually construct a family of simplicial complexes which are viewed
as images at varying levels of coarseness, and maps between them moving from
a complex at one resolution to one of coarser resolution.
Let us assume that we have two coverings $\mathcal{U} = \{U_i\}_{i \in I}$ and $\tilde{\mathcal{U}} = \{\tilde{U}_j\}_{j \in J}$ of X.
A map of coverings from $\mathcal{U}$ to $\tilde{\mathcal{U}}$ is a set map $\theta: I \to J$ such that, for all $i \in I$,
we have $U_i \subseteq \tilde{U}_{\theta(i)}$. This map allows us to discern a multiresolution structure of the
clustered point cloud, and implies a so called functorial clustering algorithm.
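Then $\mathcal{U} \subseteq \tilde{\mathcal{U}}$ implies an inclusion $\mathcal{K}(U_i) \subseteq \mathcal{K}(\tilde{U}_i)$ for all $i \in I$. So, if we apply a
functorial clustering scheme to both coverings, it is clear that each connected
component of $\mathcal{K}(\mathcal{U})$ is included in exactly one connected component of $\mathcal{K}(\tilde{\mathcal{U}})$. So
we get a map between the sets of clusters of $\mathcal{U}$ and $\tilde{\mathcal{U}}$, and therefore a map from the
vertex set of $\mathcal{K}(\mathcal{U})$ to the vertex set of $\mathcal{K}(\tilde{\mathcal{U}})$. Finally, we obtain an associated
induced simplicial map $\hat{\theta}: \mathcal{K}(\mathcal{U}) \to \mathcal{K}(\tilde{\mathcal{U}})$ of complexes given on the corresponding
clusters.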
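The map between sets of clusters at two resolutions can be illustrated with single-linkage clustering, which is a standard example of a functorial scheme. The sketch below is an assumption-laden illustration rather than the clustering used in this work: the names single_linkage and cluster_map are hypothetical, and the clusters are taken to be the connected components of the graph joining points at distance at most eps.

```python
from math import dist


def single_linkage(points, eps):
    """Label each point by a representative of its connected component at scale eps."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


def cluster_map(fine_labels, coarse_labels):
    """Map each fine cluster to the unique coarse cluster that contains it."""
    mapping = {}
    for fine, coarse in zip(fine_labels, coarse_labels):
        if mapping.setdefault(fine, coarse) != coarse:
            raise ValueError("a fine cluster is split between coarse clusters")
    return mapping


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0), (20.0, 0.0)]
    fine = single_linkage(pts, 1.5)
    coarse = single_linkage(pts, 5.0)
    print(len(set(fine)), len(set(coarse)))
    print(cluster_map(fine, coarse))
```

In the usage example three clusters at the fine scale are mapped onto two clusters at the coarse scale; the map is well defined precisely because every fine cluster is contained in exactly one coarse cluster.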
Similarly, if we have a family of different coverings
$X_i = \{\mathcal{U}_i \mid X \subseteq \mathcal{U}_i\}$, $i = 0, 1, \dots, n$, with inclusions $\mathcal{U}_i \subseteq \mathcal{U}_{i+1}$.

A clustering algorithm is functorial if any inclusion $X \subseteq Y$ of point clouds maps each single
cluster in X into a unique cluster in Y.
If we represent X with a simplicial complex, then we may also represent its
growth with a filtered complex, i.e. we can obtain the corresponding diagram of
complexes and simplicial maps
$\mathcal{K}(X_0) \xrightarrow{\theta_0} \mathcal{K}(X_1) \xrightarrow{\theta_1} \dots \xrightarrow{\theta_{n-1}} \mathcal{K}(X_n)$.
With the data encoded into simplicial complexes, we are interested in topological
features which persist over a sequence of simplicial complexes of different
sizes. This sequence reflects the changes when new attributes are introduced or removed.
So the above structure clearly demonstrates that exploring the behavior
of intrinsic geometric features of X under such maps allows us to distinguish
actual features, which hold out at multiple scales, from artifacts, which appear
less often or even at a single level only.
The filtration presupposes inclusions as a certain parameter $\varepsilon$ increases
for all involved complexes, i.e. there is an inclusion of the upper complex into the
lower one. If $\varepsilon$ is too small, the complex is the discrete set X itself, and, if $\varepsilon$
is too large, $\mathcal{K}_\varepsilon(X)$ is a single simplex. In this context, the golden mean may
not exist. Algebraic topology suggests a functional approach based on the idea that
the topology of a given space is framed in the mappings to or from that space.
3.2 Persistence homology.
Suppose that we have two topological spaces X and Y and two maps $g, h: X \to Y$
between them. Observe first that the homology groups $H_k(X)$ are a family of
Abelian groups for non-negative integers k with the following properties (see [18]).
Functoriality: each $H_k$ is a functor, that is, for any continuous g, there is
the induced homomorphism
$H_k(g): H_k(X) \to H_k(Y)$, such that, for composable maps,
$H_k(g \circ h) = H_k(g) \circ H_k(h)$
and $H_k(1_X) = 1_{H_k(X)}$.
Homotopy invariance: if g and h are homotopic, then $H_k(g) = H_k(h)$. If g
is a homotopy equivalence, then $H_k(g)$ is an isomorphism.
The elements of homology groups are classes of cycles, i.e. chains with vanishing boundary,
and two k-cycles are considered homologous if their difference is the boundary
of a (k+1)-chain. In more general terms, $H_k(X)$ determines the number
of k-dimensional subspaces of X which have no boundary in X and are themselves
not a boundary of any (k+1)-dimensional subspace. Homology groups are
computable and provide an insight into topological spaces and the maps between
them. Our interest is in discerning which topological features are essential
and which can be safely ignored. This is quite similar to a signal processing
procedure in which a signal is cleaned of noise.
For any field F, there is a version of homology with coefficients in F, which
takes values in F-vector spaces. Throughout the work, we always compute
over field coefficients. The dimension of the vector space is the kth Betti number
$\beta_k(X)$ of the space.
Ideally, the complex $\mathcal{K}(X)$ has the same homotopy type, and therefore
the same homology, as X. Unfortunately, this is rather an exception and, despite
the above theory, in practice it is rather unusual for the complex to capture the
homology of the underlying space. The root of the problem is the noise, which
implies an inadequate sampling from X, and there are also other faults in the
data recovery. We cannot distinguish between features of the original space and
the noise introduced by the representation.
To make the exposition more visual, let us consider the following graphic
example. We have a point cloud sampled from an annulus, which has the homotopy
type of a circle. We assume that the similarity complex, $\mathcal{K}_\varepsilon = \mathcal{K}(X, \varepsilon)$,
constructed on such X is the Čech complex, and, as shown in the picture below,
$\check{C}_\varepsilon$ has two additional smaller holes which create new generators in homology.
Therefore, for this value of $\varepsilon$, a computation of the homology of the
complex yields a first Betti number equal to three instead of the desired $\beta_1 = 1$.
Fig. 3.1: A Čech complex $\check{C}_\varepsilon$ constructed on a finite collection of points in the Euclidean
plane.
It seems that increasing the parameter value will give rise to a complex
with the correct topological characteristics. Indeed, some voids will be enclosed and
filled in, but at the same time new components will appear and connect to the
old ones. Therefore, in reality the thickened complex which corresponds to some
$\varepsilon' > \varepsilon$ will lead to another error. As illustrated in the next picture, whereas
the two holes have been closed, a new hole has arisen. Finally, if we compute
the homology of this complex, we will get another incorrect result: $\beta_1 = 2$.
Fig. 3.2: The Čech complex after increasing the parameter value to $\varepsilon' > \varepsilon$.
Therefore, in the similarity complex filtration, clusters with the related
k-dimensional topological attributes appear and vanish with the lapse of time,
and we need some measure of significance which would enable us to differentiate
meaningful information. In other words, we need some technique for segregating
the captured attributes which have a relatively long lifetime within the growth
history of the filtration.
The filtration $\emptyset = X_0 \subseteq X_1 \subseteq \dots \subseteq X$ presupposes inclusions as
the filtration parameter increases for all the involved complexes, i.e. each complex
is included in the next one, which corresponds to a bigger filtration index.
This yields a directed space
$\emptyset = \mathcal{K}(X_0) \hookrightarrow \dots \hookrightarrow \mathcal{K}(X_j) \hookrightarrow \dots \hookrightarrow \mathcal{K}(X)$,
where the maps are the respective inclusions. Applying the kth dimensional
homology functor $H_k$ to both the spaces and the maps, we get another directed
space
$\emptyset = H_k(X_0) \xrightarrow{\eta_k} \dots \xrightarrow{\eta_k} H_k(X_j) \xrightarrow{\eta_k} \dots \xrightarrow{\eta_k} H_k(X)$,
where $\eta_k$ are the respective induced homology maps. Since in this context the
golden mean may not exist, algebraic topology suggests a functional approach
based on the idea that the topology of a given space is framed in the mappings
to or from that space. This leads to extremely powerful tools for studying real
data, where a single homology group by itself is likely to be highly unstable with
respect to parameter settings and noise.
The trick that will let us obtain the desired topological information is
that, instead of computing homology for one concrete approximation complex,
the jth element of the filtration of $\mathcal{K}$, $\mathcal{K}_j = \mathcal{K}(X_j)$, we will compute homology for a
sufficient number of values of the filtration parameter, where sufficiency is defined by
the desired level of resolution coarseness. Now the multiscale approach lets us
consider the so called
persistence complex: a filtered simplicial complex, along with its associated
chain and boundary maps, so one considers the ordered sequence of spaces
$\{\mathcal{K}_j\}_{j \in J}$, stitched together in a nested family of injections
$\mathcal{K}_j \xrightarrow{\eta^{j,p}} \mathcal{K}_{j+p}$.
In turn, the persistence complex leads to the homomorphism
$H_k(\mathcal{K}_j) \xrightarrow{\eta^{j,p}_k} H_k(\mathcal{K}_{j+p})$,
that maps a homology class into the one that contains it. Now we are able to
define the crucial concept of this research.
The persistent homology of X is the image of the above homomorphism,
$\mathrm{Im}\, \eta^{j,p}_k$.
This is a quite famous algebraic invariant which derives its popularity
from its computability. Persistent homology enables us to capture the connectivity
of the space, and it picks out those homology classes already existing in $\mathcal{K}_j$
which persist when we map the complex to the lower one. Here the homological
history is modeled by the complex filtration, where simplexes are always added
but never removed, implying a partial order on the simplexes. Persistent homology
is an algebraic invariant that identifies the birth and death of each topological
attribute in this evolution. Other names for persistence are space-time analysis
and historical analysis, with the filtration as the history of topological and
geometrical changes.
Algebraically, the p-persistent kth homology group of the jth complex $\mathcal{K}_j$ in
a filtration can be defined as the quotient of its kth cycle group, $Z^j_k$, by the kth
boundary group, $B^{j+p}_k$, of $\mathcal{K}_{j+p}$, p complexes later in the filtration:
$H^{j,p}_k \overset{\mathrm{def}}{=} Z^j_k / (B^{j+p}_k \cap Z^j_k)$,

It was first introduced by Edelsbrunner, Letscher and Zomorodian ([13]), and then studied in detail
by Carlsson and Zomorodian ([44]).
which is well-defined since the denominator is the intersection of two subgroups
of $C^{j+p}_k$ and thus is a group itself, a subgroup of the numerator. Here we extract
the cycles which are not turned into boundaries for p steps in time since the
moment j. If two cycles are homologous in $\mathcal{K}_j$, then they
also exist and are homologous in $\mathcal{K}_{j+p}$, and therefore we have the isomorphism
$\mathrm{Im}\, \eta^{j,p}_k \cong H^{j,p}_k$.
In each dimension the homology of the complex becomes a vector space over
a field, and it is fully described by its Betti numbers. Each topological attribute in
the similarity complex filtration has a lifetime during which it contributes to some
Betti number. We are mostly interested in those attributes with longer lifetimes, as they
persist in being features of the shape of the object. We may represent these life-spans
as intervals, and therefore persistent homology can describe the connectivity of
the object under investigation via a multiset of intervals in each dimension. For
an interval
$[a_i, b_i)$, $a_i \in \mathbb{Z}^+$, $b_i \in \mathbb{Z}^+$,
let $F[a_i, b_i) = \{F_n\}_{n \ge 0}$ be a directed vector space over the field F, which is
equal to F within the interval and is zero elsewhere. Under suitable
finiteness hypotheses that are satisfied for all our spaces, the homology directed
space may be written as a direct sum
$\bigoplus_{i=0}^{l} F[a_i, b_i)$,
where the description is unique up to a reordering of the summands. Therefore we
can track topological attributes and measure their lifetimes as the finite multiset
of these so called T-intervals.
Consider some non-bounding k-cycle z, which arises at time i as a consequence
of the appearance of the simplex $\sigma^+$ in the complex $\mathcal{K}$, such that the
corresponding homology class $[z] \in H^i_k$. We mark the simplex $\sigma^+$ as a creator.
At some later moment of time, p steps later in the filtration, another newly
arrived simplex $\sigma^-$ turns a k-cycle $z'$ in [z], homologous to z, into a boundary.
The simplex $\sigma^-$ is labeled as a destroyer, because it eliminates both $z'$ and
the previously created element of $H^i_k$, and thereby decreases the rank of the kth
homology group.
The persistence of the k-cycle z and the corresponding homology class [z] in
$\mathcal{K}$ is the difference between the endpoints of its life-span interval $[i, i+p)$ in the
complex. A non-bounded interval $[i, \infty)$ implies an infinite persistence.
By varying the parameter p it is possible to regulate the amount of
topological noise we wish to get rid of.

Here the superscripts indicate the filtration index, so the groups associated with $K^j$ are
$C^j_k$, $Z^j_k$, $B^j_k$, $H^j_k$, and the boundary operators are $\partial^j_k$ for all $j, k \ge 0$.
So persistent (or persistence) homology is a correct and effective tool for
capturing topological invariants from real data. The main idea is to assess the
extent to which features are genuine and represented by large holes, as opposed
to artifacts that may be attributed to inadequate sampling or noise and that
relate to little holes which collapse almost as soon as they are formed. Lifetime
intervals provide such a measure, since features which persist over a
range of values of the coarseness, and remain in the complex longer than a certain
threshold time, would be viewed as being less likely to be artifacts.
Chapter 4
Persistence Structures
4.1 Persistence Betti numbers of different dimensions and
barcode.
Individual Betti numbers by themselves are highly unstable, and it is natural to define the so-called p-persistent kth Betti numbers as the dimensions of the corresponding p-persistent kth homology groups:
\[
\beta_k^{i,p} \overset{\mathrm{def}}{=} \operatorname{rank} H_k^{i,p} .
\]
The kth Betti numbers mean that somewhere in the complex a k-dimensional subcomplex is missing through all the considered stages of the growth or reduction of the complex. In other words, an object formed by simplices of dimension at most k is absent from the complex, and this k-dimensional hole remains open as we thicken the complex up to a certain filtration index. A sufficient increase of the parameter p is the way to clean the topological signal of the complex from the topological noise, which is related to the features whose lifespan does not last beyond the chosen threshold.
Persistence computes compatible homology bases across this growth history, i.e. it represents an algebraic invariant that detects the birth and death of each topological feature as the complex evolves in time. Therefore, it is advantageous to encode the persistent homology in the form of a parameterized version of the rank of the homology groups, the Betti numbers, by representing the persistence of the generators of the kth homology groups as a multiset of intervals called a persistence barcode [6]. Time is the parameter for this purpose, since it encompasses both the birth and, perhaps, the death of each single simplex in the complex. During its temporal existence, each topological attribute plays a part in the formation of some Betti number, and our interest lies in those properties with long life-spans. The parameter intervals represent lifetimes over the various stages of the filtration, and they may be drawn along the horizontal axis, while the arbitrarily ordered generators of the homology group $H_k$ may be placed along the vertical axis. E.g. a barcode for the filtration of a tetrahedron in Figure 4.1 is presented in Figure 4.2.

Fig. 4.1: Filtration of a simple simplicial complex with topological characteristics. The vertex or face just added at the current step is shown in red.
The rank of the persistent homology group $H_k^{i,p}$ is equal to the number of $\mathcal{P}$-intervals in the barcode of the homology group $H_k$ that contain the whole range from i to i+p, i.e. the number of lifetimes that begin no later than i and end after i+p, where p is within the limits of the chosen resolution threshold. A barcode reflects the persistent properties of Betti numbers and serves as a filter that enables a clear distinction between topological noise and topological signal. So persistent homology can be defined as the homology of a growing space that captures the lifetimes of topological attributes in a barcode.

Fig. 4.2: Barcode for the filtration of the simplicial complex presented in Figure 4.1.
Fig. 4.3: A filtered simplicial complex and its barcode (the multiset of persistence intervals) in each dimension. Each persistence interval shown is the lifetime of a topological attribute, created and destroyed by the simplices at the low and high endpoints, respectively.
So here our interest lies in the detection of long-lived homology classes of a simplicial complex during the course of its history, which includes both addition and removal of simplices. The method relies on a visual approach to recognizing persistent features in the form of a barcode, which may be regarded as the persistence analog of Betti numbers. E.g. for the filtration in Figure 4.3, the $\beta_0$-barcode is $\{[0, \infty), [0, 1), [1, 2)\}$, with the intervals describing the lifetimes of the components created by the simplices a, b and d, respectively. The $\beta_1$-barcode is $\{[2, 5), [3, 4)\}$ for the two 1-cycles created by the edges ad and ac, respectively, provided ad enters the complex after cd at time 2.
For a simplicial complex, there is a standard algorithm for computing Betti numbers in each dimension [18]. In order to present this reduction algorithm, and following the terminology of Appendix A, let us consider the k-dimensional boundary operator
\[
\partial_k : C_k \to C_{k-1}
\]
for the similarity simplicial complex $\mathcal{K}$. Here
\[
Z_k \overset{\mathrm{def}}{=} \ker \partial_k , \qquad B_k \overset{\mathrm{def}}{=} \operatorname{Im} \partial_{k+1} \qquad \text{and} \qquad H_k \overset{\mathrm{def}}{=} Z_k / B_k
\]
are the kth cycle group, the kth boundary group and the kth homology group, respectively.
Fig. 4.4: A chain complex with chain, cycle, boundary groups and their images under the
boundary operators.
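Observe that the chain complex
\[
\cdots \xrightarrow{\partial_{k+2}} C_{k+1} \xrightarrow{\partial_{k+1}} C_k \xrightarrow{\partial_k} C_{k-1} \longrightarrow \cdots
\]
splits into a direct sum of subcomplexes, each with at most two nonzero terms. The chain complex arises from a finite simplicial complex, so the groups $C_k$ are finitely generated. Since each $C_k$ splits as $C_k = Z_k \oplus B_k$ (in this decomposition $B_k$ stands for a complement of $Z_k$ in $C_k$, which $\partial_k$ maps injectively into $Z_{k-1}$), we have the diagram
\[
\cdots \xrightarrow{\partial_{k+2}} C_{k+1} \xrightarrow{\partial_{k+1}} C_k \xrightarrow{\partial_k} C_{k-1} \xrightarrow{\partial_{k-1}} C_{k-2} \xrightarrow{\partial_{k-2}} \cdots
\]
\[
\begin{array}{c}
\vdots \\
0 \to B_k \to Z_{k-1} \to 0 \\
\oplus \\
0 \to B_{k+1} \to Z_k \to 0 \\
\oplus \\
0 \to B_{k+2} \to Z_{k+1} \to 0
\end{array}
\]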
As the chain group $C_k$ is free, the oriented k-simplices form the standard basis for it. We represent the boundary operator $\partial_k$, relative to the standard bases of the chain groups, as an integer matrix $M_k$ with entries in $\{-1, 0, 1\}$. The matrix $M_k$ is called the standard matrix representation of $\partial_k$; it has $m_k$ columns and $m_{k-1}$ rows, which are the numbers of k- and (k-1)-simplices, respectively. As shown in Figure 4.4, the null-space of $M_k$ corresponds to $Z_k$, and its range-space corresponds to $B_{k-1}$. In order to reduce $M_k$ to a more manageable form, we use the following elementary row operations and the similarly defined elementary column operations:
1) exchange row i and row j;
2) multiply row i by -1;
3) replace row i by (row i) + q(row j), where q is an integer and $i \ne j$.
Using the reduction, we can derive alternative bases for the chain groups, relative to which the matrix for $\partial_k$ is diagonal. Each column (row) operation corresponds to a change of basis for $C_k$ ($C_{k-1}$), and, finally, we can reduce $M_k$ to its (Smith) normal form $\tilde{M}_k$, where all entries are zero except, possibly, a block in the top left corner, which may contain nonzero entries down the diagonal:
\[
\tilde{M}_k =
\left(
\begin{array}{c|c}
\begin{matrix} b_1 & & 0 \\ & \ddots & \\ 0 & & b_{l_k} \end{matrix} & 0 \\ \hline
0 & 0
\end{array}
\right),
\quad \text{where } l_k = \operatorname{rank} \tilde{M}_k , \ b_i \ge 1 \text{ and } b_i \mid b_{i+1} \text{ for all } 1 \le i < l_k .
\]
In this matrix, the columns with non-zero entries correspond to a basis for the image, and the ith such column gives a summand of the form $0 \to \mathbb{Z} \xrightarrow{\,b_i\,} \mathbb{Z} \to 0$. The zero columns of the matrix correspond to a basis for $Z_k$, and each one gives a summand of the form $0 \to \mathbb{Z} \to 0$.
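Computing the normal form for the boundary operators in all dimensions, we get a full characterization of $H_k$ as follows:
a) the torsion coefficients of $H_{k-1}$ (the $d_i$ in the structure Theorem 4.1 considered below) are the diagonal entries $b_i$ of $\tilde{M}_k$ that are greater than one;
b) since $\{ e_i \mid l_k + 1 \le i \le m_k \}$ is a basis for $Z_k$, we have $\operatorname{rank} Z_k = m_k - l_k$;
c) as $\{ b_i \hat{e}_i \mid 1 \le i \le l_k \}$ is a basis for $B_{k-1}$, we have $\operatorname{rank} B_k = \operatorname{rank} M_{k+1} = l_{k+1}$.
Here $\{e_i\}$ and $\{\hat{e}_i\}$ denote the bases of $C_k$ and $C_{k-1}$ obtained by the reduction. Combining b) and c), we obtain an elegant expression for the Betti numbers:
\[
\beta_k = \operatorname{rank} Z_k - \operatorname{rank} B_k = m_k - l_k - l_{k+1} .
\]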
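For instance, for the filtered simplicial complex in Figure 4.3, the standard matrix representation of $\partial_1$ is
\[
M_1 =
\begin{array}{c|rrrrr}
  & ab & bc & cd & ad & ac \\ \hline
a & -1 &  0 &  0 & -1 & -1 \\
b &  1 & -1 &  0 &  0 &  0 \\
c &  0 &  1 & -1 &  0 &  1 \\
d &  0 &  0 &  1 &  1 &  0
\end{array}
\]
where the bases are shown within the matrix. Reducing the matrix, we obtain the normal form
\[
\tilde{M}_1 =
\begin{array}{c|ccccc}
      & cd & bc & ab & z_1 & z_2 \\ \hline
d - c & 1 & 0 & 0 & 0 & 0 \\
c - b & 0 & 1 & 0 & 0 & 0 \\
b - a & 0 & 0 & 1 & 0 & 0 \\
a     & 0 & 0 & 0 & 0 & 0
\end{array}
\]
where $z_1 = ad - bc - cd - ab$ and $z_2 = ac - bc - ab$ form a basis for $Z_1$, and $\{d - c,\ c - b,\ b - a\}$ is a basis for $B_0$.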
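The formula $\beta_k = m_k - l_k - l_{k+1}$ and the two cycles of this example can be checked numerically. The following Python sketch is an illustration only, under stated assumptions: it takes the 1-skeleton of the complex (vertices a, b, c, d and edges ab, bc, cd, ad, ac, no triangles), uses the sign convention $\partial(xy) = y - x$, and computes ranks over the rationals with a hand-written elimination; the helper names `rank` and `apply` are hypothetical.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of an integer matrix over the rationals (Gaussian elimination)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def apply(matrix, vec):
    """Matrix-vector product, used to test whether a chain is a cycle."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

# Columns: ab, bc, cd, ad, ac; rows: a, b, c, d (boundary of xy is y - x).
M1 = [
    [-1,  0,  0, -1, -1],  # a
    [ 1, -1,  0,  0,  0],  # b
    [ 0,  1, -1,  0,  1],  # c
    [ 0,  0,  1,  1,  0],  # d
]
m0, m1 = 4, 5            # numbers of vertices and edges
l0, l1, l2 = 0, rank(M1), 0   # l0 = 0 since the boundary of a vertex is zero; no triangles assumed

beta0 = m0 - l0 - l1
beta1 = m1 - l1 - l2
print(beta0, beta1)      # 1 2

z1 = [-1, -1, -1, 1, 0]  # ad - bc - cd - ab
z2 = [-1, -1, 0, 0, 1]   # ac - bc - ab
print(apply(M1, z1), apply(M1, z2))
```

Both products are zero vectors, so $z_1$ and $z_2$ indeed lie in the null-space of $M_1$, and the two independent cycles agree with $\beta_1 = 2$ for this graph.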
4.2 The persistence module.
It is now time to place persistence homology within the classical framework of algebraic topology, which will allow us to use the standard structure theorem to establish the existence of a simple description of persistent homology groups over arbitrary fields as a set of intervals. We are going to use the classification of modules over a polynomial ring with coefficients in the field of rational numbers to compute persistent homology by correlating it with the birth and death of topological features in the data. Such a set of intervals for a filtered complex, where positive cycle-creating simplices are paired with negative cycle-destroying simplices, allows the correct computation of the ranks of persistent homology groups.
Let us first combine the homology of all the complexes in the filtration into a single algebraic structure. As was defined, a persistence complex is a filtered simplicial complex, along with its associated chain and boundary inclusion maps. I.e. we have a family of chain complexes $\{\mathcal{K}_*^i\}_{i \ge 0}$, say over a commutative ring R with unity, together with chain maps $f^i : \mathcal{K}_*^i \to \mathcal{K}_*^{i+1}$. In general, it is a family of chain complexes $\{\mathcal{K}_*^i\}_{i \ge 0}$ and inclusion chain maps $f^i$:
\[
\mathcal{K}_*^0 \xrightarrow{f^0} \mathcal{K}_*^1 \xrightarrow{f^1} \mathcal{K}_*^2 \xrightarrow{f^2} \cdots
\]

All the necessary algebraic background can be found e.g. in the well-written [7].
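More broadly, along with the boundary maps, we have the following diagram, where the filtration index increases horizontally under the chain maps and the dimension decreases vertically under the boundary operators:
\[
\begin{array}{ccccccc}
\vdots & & \vdots & & \vdots & & \\
\downarrow {\scriptstyle \partial_3} & & \downarrow {\scriptstyle \partial_3} & & \downarrow {\scriptstyle \partial_3} & & \\
\mathcal{K}_2^0 & \xrightarrow{f^0} & \mathcal{K}_2^1 & \xrightarrow{f^1} & \mathcal{K}_2^2 & \xrightarrow{f^2} & \cdots \\
\downarrow {\scriptstyle \partial_2} & & \downarrow {\scriptstyle \partial_2} & & \downarrow {\scriptstyle \partial_2} & & \\
\mathcal{K}_1^0 & \xrightarrow{f^0} & \mathcal{K}_1^1 & \xrightarrow{f^1} & \mathcal{K}_1^2 & \xrightarrow{f^2} & \cdots \\
\downarrow {\scriptstyle \partial_1} & & \downarrow {\scriptstyle \partial_1} & & \downarrow {\scriptstyle \partial_1} & & \\
\mathcal{K}_0^0 & \xrightarrow{f^0} & \mathcal{K}_0^1 & \xrightarrow{f^1} & \mathcal{K}_0^2 & \xrightarrow{f^2} & \cdots
\end{array}
\]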
For instance, in the filtered simplicial complex shown in Figure 4.5, at step 0 there are two contractible connected components, which become two circles at step 1. At step 2, the two components join to form a single component. Finally, one of the circles is filled, killing off a 1-cycle class in homology.
Fig. 4.5: A filtered simplicial complex.
It determines an inductive system of homology groups, i.e. a family of Abelian groups $\{H_*^i\}_{i \ge 0}$ together with homomorphisms $H_*^i \to H_*^{i+1}$. Since the homology is computed with field coefficients, we obtain an inductive system of vector spaces over the field. Each vector space is determined up to isomorphism by its dimension. In order to obtain a simple classification of an inductive system of vector spaces in terms of a set of intervals, we need an additional structure that represents the homology of a persistence complex.
A persistence module $\mathcal{M}$ over a ring R is a family of R-modules $M^i$, together with homomorphisms $\varphi^i : M^i \to M^{i+1}$. It is written as $\mathcal{M} = \{M^i, \varphi^i\}_{i \ge 0}$. The persistence module is a structure that represents the homology of a filtered complex, where $\varphi^i$ merely maps a homology class to the one that contains it.
A persistence complex $\{\mathcal{K}_*^i, f^i\}$ is of finite type if each component complex is a finitely generated R-module and if, for sufficiently large i, the corresponding maps $f^i$ become R-module isomorphisms. Similarly, a persistence module $\{M^i, \varphi^i\}$ is of finite type if each component module is a finitely generated R-module and if the maps $\varphi^i$ are isomorphisms for all i greater than or equal to some integer. Since our complex $\mathcal{K}$ is finite, it generates a persistence complex $\{\mathcal{K}_*^i, f^i\}$ of finite type, whose homology is a persistence module $\mathcal{M}$ of finite type.
The setup is now as follows. Suppose we are given a persistence module $\mathcal{M} = \{M^i, \varphi^i\}_{i \ge 0}$ over a ring R. We need a simple classification of these modules and a way to identify elements with the corresponding elements at other time steps of the filtration. In other words, to compute persistent homology we have to choose bases that are compatible across the filtration. The classification comes in two main steps (see [44]).
4.2.1 The Artin-Rees correspondence.
As assumed, the ring R is commutative with unity. A polynomial f(t) with coefficients in R is a formal sum
\[
f(t) = \sum_{i=0}^{\infty} a_i t^i , \qquad a_i \in R ,
\]
where t is the indeterminate. The set of all polynomials f(t) over R forms a commutative ring, R[t], with unity. If R has no divisors of zero and all its ideals are principal, it is a principal ideal domain, PID.
A graded ring is a ring $\langle R, +, \cdot \rangle$ equipped with a direct sum decomposition of Abelian groups
\[
R \cong \bigoplus_{i} R_i , \qquad i \in \mathbb{Z} ,
\]
so that the multiplication is defined by the bilinear pairings $R_n \otimes R_m \to R_{n+m}$. Elements in a single $R_i$ are called homogeneous, and the degree of these elements is i. Let us grade R[t] non-negatively with the standard grading
\[
(t^n) \overset{\mathrm{def}}{=} t^n \cdot R[t] , \qquad n \ge 0 .
\]

For our purposes, a PID is simply a ring in which we may compute the greatest common divisor, gcd, of a pair of elements. This is the key operation needed by the structure theorem below. PIDs include the familiar rings $\mathbb{Z}$, $\mathbb{Q}$ and $\mathbb{R}$. Finite fields $\mathbb{Z}_p$ for p a prime, as well as the polynomial rings F[t] with coefficients from a field F, are also PIDs and have effective algorithms for computing the gcd [7].
Now we are going to combine all of the complexes in the filtration in order to get a single structure and to encode the time step at which an element is born by a polynomial coefficient. So define a graded module, $\alpha(\mathcal{M})$, over the graded ring R[t] as a module equipped with a direct sum decomposition $M \cong \bigoplus_i M_i$, $i \in \mathbb{Z}$, so that the action of R on M is defined by the bilinear pairings $R_n \otimes M_m \to M_{n+m}$. A graded ring or module is non-negatively graded if, respectively, $R_i = 0$ or $M_i = 0$ for all $i < 0$.
Explicitly, we have
\[
\alpha(\mathcal{M}) \overset{\mathrm{def}}{=} \bigoplus_{i=0}^{\infty} M^i ,
\]
so the R[t]-module structure combines the R-module structures of the individual components with the action of t given by
\[
t \cdot ( m^0, m^1, m^2, \ldots ) = ( 0, \varphi^0(m^0), \varphi^1(m^1), \varphi^2(m^2), \ldots ) ,
\]
that is, t simply shifts elements of the module up in the gradation.
We start by computing a direct sum of the complexes, arriving at a much larger space that is graded according to the filtration ordering. Then we record the time at which each simplex enters by means of a polynomial coefficient. For example, if a simplex $\sigma$ exists at time 0 and remains in the complex until some time p, we write $t^p \cdot \sigma$ at this time. The key idea is that the filtration ordering is encoded in the coefficient polynomial ring.
The main correspondence is given by the Artin-Rees theory in commutative algebra (see [15]). In the next section we will demonstrate how $\alpha$ defines an equivalence of categories between the category of persistence modules of finite type over R and the category of finitely generated non-negatively graded modules over R[t].
4.2.2 A structure theorem for graded modules over a graded PID.
The structure of a persistence module is described by the structure theorem below.
Theorem 4.1. If D is a PID, then every finitely generated D-module is isomorphic to a direct sum of cyclic D-modules. That is, it decomposes uniquely into the form
\[
D^{\beta} \oplus \left( \bigoplus_{i=1}^{n} D / d_i D \right) , \quad \text{where } \beta \in \mathbb{Z} \text{ and } d_i \in D \text{ are such that } d_i \mid d_{i+1} .
\]

The theorem decomposes the structure into two parts: the free portion on the left and the
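Similarly, every graded module $\mathcal{M}$ over a graded PID D decomposes uniquely into the form
\[
\left( \bigoplus_{i=1}^{n} \Sigma^{\alpha_i} D \right) \oplus \left( \bigoplus_{j=1}^{m} \Sigma^{\gamma_j} D / d_j D \right) ,
\]
where $\Sigma^{\alpha}$ denotes a shift upward in grading by $\alpha$, $\alpha_i, \gamma_j \in \mathbb{Z}$, and $d_j \in D$ are homogeneous elements such that $d_j \mid d_{j+1}$.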
Since the ground ring R is assumed to be a field F, the graded ring F[t] is a PID whose only graded ideals are the homogeneous ones of the form $(t^n)$. So, by the above theorem, the structure of graded F[t]-modules can be represented as the following direct sum:
\[
\left( \bigoplus_{i=1}^{n} \Sigma^{\alpha_i} F[t] \right) \oplus \left( \bigoplus_{j=1}^{m} \Sigma^{\gamma_j} F[t] / (t^{n_j}) \right) .
\]
Via the correspondence with finite-type persistence modules given above, the coefficients in this module decomposition can be made meaningful: $\alpha_i$ and $\gamma_j$ describe when a basis element is created along the filtration. The element then either persists along the filtration through time $\gamma_j + n_j - 1$ and dies at time $\gamma_j + n_j$, or it remains in the filtration forever if it lives in the free left summand. As before, we express the lifespan of a basis element in the filtration by pairing its creation and elimination times:
1) $[\alpha_i, \infty)$ from the left summand corresponds to a topological attribute that is created at time $\alpha_i$ and exists in the final structure;
2) $[\gamma_j, \gamma_j + n_j)$ from the right summand corresponds to an attribute that is created at time $\gamma_j$, lives for time $n_j$, and is destroyed.
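Rules 1) and 2) translate directly into a small conversion routine. The Python sketch below is illustrative only: the function name `intervals_from_decomposition` and the input encoding (a list of shifts for the free summands and a list of (shift, exponent) pairs for the torsion summands) are assumptions made for this example, not part of the implementation in Appendix B.

```python
import math

def intervals_from_decomposition(free_shifts, torsion_terms):
    """Turn the summands of a graded F[t]-module into life-span intervals.

    free_shifts   -- shifts alpha_i of the free summands
    torsion_terms -- pairs (gamma_j, n_j) of the summands F[t]/(t^{n_j}) shifted by gamma_j
    """
    free = [(alpha, math.inf) for alpha in free_shifts]
    torsion = [(gamma, gamma + n) for gamma, n in torsion_terms]
    return free + torsion

# One free summand born at time 0 and two torsion summands (assumed sample data).
print(intervals_from_decomposition([0], [(0, 1), (1, 1)]))
# Two torsion summands only (assumed sample data).
print(intervals_from_decomposition([], [(2, 3), (3, 1)]))
```

With these sample inputs the routine returns the intervals $[0, \infty), [0, 1), [1, 2)$ and $[2, 5), [3, 4)$, respectively.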
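Finally, we can complete the theoretical part of this work with a parametrization of the isomorphism classes of finitely generated graded F[t]-modules by a finite set of combinatorial invariants. As was mentioned above, a $\mathcal{P}$-interval is an ordered pair
\[
[i, j) , \qquad 0 \le i < j , \quad i \in \mathbb{Z} , \ j \in \mathbb{Z}^{\infty} = \mathbb{Z} \cup \{ +\infty \} .
\]
For a $\mathcal{P}$-interval $[i, j)$ we define a map
\[
Q(i, j) \overset{\mathrm{def}}{=}
\begin{cases}
\Sigma^{i} F[t] / (t^{\,j-i}) & \text{if } j < \infty , \\
\Sigma^{i} F[t] & \text{if } j = \infty ,
\end{cases}
\]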
torsional portion on the right. If the ring is a PID, D, the kth homology group $H_k$ is a D-module, and the theorem implies that the rank $\beta$ of the free submodule is the Betti number of the module and that the $d_i$ are its torsion coefficients. When the ground ring is $\mathbb{Z}$, the theorem describes the structure of finitely generated Abelian groups. Over a field, such as $\mathbb{R}$, $\mathbb{Q}$ or $\mathbb{Z}_p$ for p a prime, the torsion submodule disappears. The module is then a vector space that is fully described by a single integer, its rank $\beta$, which depends on the chosen field.
Fig. 4.6: Diagram of $\mathcal{P}$-intervals corresponding to the filtered simplicial complex from Figure 4.5. $[0, \infty)$ and $[0, 2)$ are 0-intervals; $[1, \infty)$ and $[1, 3)$ are 1-intervals.
and consider a finite set of $\mathcal{P}$-intervals
\[
\mathcal{S} \overset{\mathrm{def}}{=} \{ [i_1, j_1), [i_2, j_2), \ldots , [i_m, j_m) \} .
\]
We associate $\mathcal{S}$ with the finitely generated graded modules over the graded ring F[t] via the correspondence $\mathcal{S} \mapsto Q(\mathcal{S})$, which defines a bijection
\[
Q(\mathcal{S}) = \bigoplus_{l=1}^{m} Q(i_l, j_l) .
\]
Thus we have constructed a correspondence showing that the isomorphism classes of persistence modules of finite type over a field F are in bijection with the finite sets of $\mathcal{P}$-intervals. We refer to this multiset of intervals as the barcode considered above. Like a homology group, a barcode is a homotopy invariant.
Since we are working over a field, in each dimension the homology groups $H_k(\mathcal{K}^i)$ are in fact vector spaces, completely described by their ranks $\beta_k(\mathcal{K}^i)$, which count the number of topological attributes in the corresponding dimension. By the structure theorem, there exists a basis for a persistence module that is a compatible basis for all these filtered vector spaces glued together by direct sums, which makes it possible to track the life-spans of attributes through the filtration history, i.e. to compute persistent homology for the whole filtration.
Each $\mathcal{P}$-interval $[i, j)$ describes a basis element for the homology vector spaces from time i until time j-1. I.e. this element is a k-cycle, e, that is completed at time i, forms a new homology class, and remains non-bounding until time j, when it joins the boundary group $B_k^j$. Our point of interest is when the class $e + B_k^l$ is a basis element for the persistent homology groups $H_k^{l,p}$. Here we have three obvious inequalities, which define the triangular region in the index-persistence plane shown in Figure 4.7:
\[
\begin{cases}
p \ge 0 & \text{according to the filtration,} \\
l \ge i & \text{since } i \text{ is the time of birth of the considered homology class,} \\
l + p < j & \text{since } e \in B_k^{l+p} \text{ otherwise.}
\end{cases}
\]
Fig. 4.7: The triangular region in the index-persistence plane that defines when the cycle is a basis element for the homology vector space.
Lemma 4.2. Let $\mathcal{T}$ be the set of such triangles defined by the $\mathcal{P}$-intervals for the k-dimensional persistence module. Then $\beta_k^{l,p}$ is the number of triangles in $\mathcal{T}$ containing the point $(l, p)$ (see [13]).
The above lemma asserts that computing persistent homology over a field is equivalent to finding the corresponding barcode. Observe also that, while the component homology groups are torsionless, persistence appears as the torsional and free elements of the persistence module.
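The counting rule of Lemma 4.2 can be sketched as follows. The Python fragment is an illustration under stated assumptions: a barcode is a list of (birth, death) pairs with `math.inf` for unbounded intervals, the function name `persistent_betti` is hypothetical, and the sample barcodes are the $\beta_0$- and $\beta_1$-barcodes quoted in Section 4.1 for the filtration of Figure 4.3.

```python
import math

def persistent_betti(barcode, l, p):
    """Number of intervals [i, j) whose triangle contains the point (l, p),
    i.e. intervals with i <= l, p >= 0 and l + p < j."""
    if p < 0:
        return 0
    return sum(1 for i, j in barcode if i <= l and l + p < j)

h0 = [(0, math.inf), (0, 1), (1, 2)]   # 0-dimensional barcode
h1 = [(2, 5), (3, 4)]                  # 1-dimensional barcode

print(persistent_betti(h0, 1, 0))   # components alive at time 1
print(persistent_betti(h1, 3, 0))   # 1-cycles alive at time 3
print(persistent_betti(h1, 3, 1))   # 1-cycles alive at time 3 that survive one more step
```

For these inputs the three calls return 2, 2 and 1: the short interval $[3, 4)$ stops contributing as soon as a persistence of one step is required.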
It was shown that the persistent homology of a filtered k-dimensional simplicial complex is merely the standard homology of a particular graded module over a polynomial ring.
Chapter 5
The Realized Singular Software
5.1 The program structure.
I have implemented the persistence algorithm; the code is presented in Appendix B. The implementation is in the computer algebra system Singular, which is well suited to our purposes, and uses a library from Qhull, free open source software [39]. The goal was to take a finite cloud of points, X, as input and to compute persistent Betti numbers as a function of a resolution parameter. So the program requires as initial input point cloud data, from which the approximation complex is constructed. The similarity complex triangulates the point cloud sampled from the underlying geological formation and is equipped with a filtration that explains how the complex might be built in steps. As output the program produces a barcode of a persistence module over a field F, which is sufficient for obtaining all the desired persistence information. Observe that we can simulate the algorithm over the considered field itself, without the need to compute the F[t]-module. The software consists of three independent blocks, which are finally merged into one integral program. These three parts perform the following functions.
1. An approximation of the input point cloud, X, by the restricted Voronoi diagram complex (or by its dual, the restricted Delaunay complex). The similarity complex is represented in the so-called Object File Format (OFF), a very common data format that represents the geometry of a model by specifying the polygons of the model's surface. So the result of executing this block is the transformation of the initial input data file into the file ComplexInput, which includes a lot of information and statistics but is still not suitable for the main Singular program of the third block. That is why we transform the data once again in the second step. This initial block is a C++ procedure from the free open source software Qhull, which is compiled and embedded in Singular.
2. A transformation of the OFF file ComplexInput into the input of the main procedure. The result of this second transformation of the initial input data is the essential information about the approximation complex. Specifically, we obtain a representation of all facets of the complex in a form suitable for the main Singular procedure: each facet is described by the collection of vertices that form it. A convenient way to represent the simplicial similarity complex is via the so-called incidence matrix, whose columns are labeled by its vertices and whose rows are labeled by its simplices, as shown in the example in Figure 5.1. This block is a Singular program.
Fig. 5.1: An example of a simplicial complex and its incidence matrix representation. Columns
are labeled by its vertices and rows are labeled by its simplices.
3. Computation of the barcodes and persistent Betti numbers from the data transformed in the previous steps, which reflect the topological structure of X and, thereby, of the underground geological object under investigation. This block is the main Singular program.
All three blocks are encapsulated in a program that requires as input the name of an input file with its full directory path.
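The incidence matrix mentioned in the second block can be illustrated by a short Python sketch. It is not the Singular code of Appendix B: the function names `all_simplices` and `incidence_matrix`, the representation of facets as tuples of vertex labels, and the sample facets are assumptions made for this example only.

```python
from itertools import combinations

def all_simplices(facets):
    """All non-empty faces of the given facets, sorted by dimension, then by label."""
    simplices = set()
    for facet in facets:
        verts = sorted(facet)
        for k in range(1, len(verts) + 1):
            simplices.update(combinations(verts, k))
    return sorted(simplices, key=lambda s: (len(s), s))

def incidence_matrix(facets):
    """Rows are labeled by simplices, columns by vertices; entry 1 if the vertex lies in the simplex."""
    simplices = all_simplices(facets)
    vertices = sorted({v for s in simplices for v in s})
    matrix = [[1 if v in s else 0 for v in vertices] for s in simplices]
    return vertices, simplices, matrix

# Hypothetical complex: a filled triangle abc with an extra edge cd.
vertices, simplices, matrix = incidence_matrix([("a", "b", "c"), ("c", "d")])
for simplex, row in zip(simplices, matrix):
    print("".join(simplex).ljust(4), row)
```

Each facet description (a collection of vertices) is expanded into all of its faces, so the list of facets alone is enough to recover the whole complex.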
The initial input is a text file that should include the following information:
1) the first line is the dimension, d, of the data;
2) the second line is the number, N, of points in X;
3) each of the next N lines contains the coordinates of one point, separated by space characters.
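A reader for this input format might look as follows. This Python sketch only mirrors the three rules above; the function name `parse_point_cloud` and the sample data are hypothetical, and the actual program reads the file inside Singular and Qhull rather than in Python.

```python
def parse_point_cloud(text):
    """Parse the input format: dimension d, number of points N, then N coordinate lines."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    d = int(lines[0])
    n = int(lines[1])
    points = [tuple(float(x) for x in ln.split()) for ln in lines[2:2 + n]]
    if len(points) != n or any(len(p) != d for p in points):
        raise ValueError("point cloud does not match the declared d and N")
    return d, points

sample = "3\n2\n0 0 0\n1.5 2 3\n"   # assumed sample input
print(parse_point_cloud(sample))
```

The consistency check rejects files in which the number of coordinate lines or the number of coordinates per line disagrees with the declared values.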
5.2 The persistence algorithm.
The persistence algorithm implemented in the third block of the program requires a preprocessing step, which generates a list of simplices up to dimension k+1 for the k-dimensional homology. A filtration implies a partial order on the cells of the finite simplicial complex $\mathcal{K}$. We start by sorting the cells within each time snapshot by dimension, breaking other ties arbitrarily, and thus obtain a full order. The algorithm takes as input a full order of the cells of the filtered approximation complex. For each simplex $\sigma \in \mathcal{K}$, one needs to identify its faces and to determine its times of appearance and disappearance. The algorithm generates a persistence barcode, i.e. a set of $\mathcal{P}$-intervals that pairs creators and destroyers for each homology class of the filtered complex, where the positive cycle-creating simplices are paired with the negative cycle-destroying ones. These intervals allow the correct computation of the ranks of persistent homology groups.
The algorithm is presented in various works by the authors of the persistence idea (see e.g. [13]). The persistence algorithm derived from the Smith normal form reduction scheme, for computation over arbitrary fields and non-fields and for complexes in arbitrary dimensions, is presented in [44]; a revised version of the algorithm is presented in [43], [45].
5.2.1 Matrix representations.
Let us observe once again that, in each dimension, the homology of the complex $\mathcal{K}^i$ becomes a vector space over a field, fully described by its rank $\beta^i$. We need to choose compatible bases across the filtration in order to compute persistent homology for the entire filtration. So we form the persistence module corresponding to $\mathcal{K}$, a direct sum of these vector spaces. The structure theorem states that there is a basis for this module that provides compatible bases for all the vector spaces. The main purpose of the algorithm is to find a description of such a structure. We will trace the work of the algorithm on the example of the simple filtered simplicial complex that was already considered before and is shown again in Figure 5.2.
A homogeneous basis is a basis of homogeneous elements. The first step in the derivation of the algorithm for computing persistence homology over a field is to represent the boundary operator $\partial_k : C_k \to C_{k-1}$ relative to the standard basis of $C_k$ and a homogeneous basis for $Z_{k-1}$. Reducing to the normal form, we read off the description provided by the direct sum from the structure Theorem 4.1, using the new basis $\{\hat{e}_j\}$ for $Z_{k-1}$, by the following rules:
1) a zero row i contributes a free term with shift $\alpha_i = \deg \hat{e}_i$;
2) a row with diagonal term $b_i$ contributes a torsional term with homogeneous $d_j = b_j$ and shift $\gamma_j = \deg \hat{e}_j$.

Fig. 5.2: A simple filtration with newly added simplices highlighted and listed.
We are going to simplify the standard reduction algorithm considered above by using the persistence module. As output, we get a finite set of $\mathcal{P}$-intervals for a filtered complex directly over the field F, which, up to isomorphism, characterizes the persistence module (i.e. the homology of the filtered complex) without any need to construct it.
Note that, relative to homogeneous bases, a matrix representation $M_k$ of $\partial_k$ has the following property:
\[
\deg \hat{e}_i + \deg M_k(i, j) = \deg e_j ,
\]
where $\{e_j\}$ and $\{\hat{e}_i\}$ are homogeneous bases of $C_k$ and $C_{k-1}$, respectively, and $M_k(i, j)$ denotes the element at location $(i, j)$.
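Let us return to the simple filtered complex given in Figure 5.2. For $\partial_1$ with coefficients in $\mathbb{Z}_2$ we get the following matrix:
\[
M_1 =
\begin{array}{c|ccccc}
  & ab & bc & cd & ad & ac \\ \hline
d & 0 & 0 & t & t & 0 \\
c & 0 & 1 & t & 0 & t^2 \\
b & t & t & 0 & 0 & 0 \\
a & t & 0 & 0 & t^2 & t^3
\end{array}
\]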
The table below lists the degrees of the simplices of this filtration as homogeneous elements of the persistence $\mathbb{Z}_2$-module.

Simplex   a   b   c   d   ab   bc   cd   ad   ac   abc   acd
Degree    0   0   1   1   1    1    2    2    3    4     5

Table 5.1: Degrees of the simplices of the filtration in Figure 5.2.
For example, it is possible to verify the basic property of the matrix representation of the boundary operator for the entry in row 4 (the vertex a) and column 5 (the edge ac):
\[
\deg M_1(4, 5) = \deg t^3 = \deg ac - \deg a = 3 - 0 = 3 .
\]
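The degree property determines every non-zero entry of these matrices from the table of degrees alone. The following Python sketch, which is only an illustration, stores a non-zero entry $t^e$ as its exponent e; the dictionary `deg` copies Table 5.1, while the function name `boundary_matrix` and the encoding of simplices as strings of vertex labels are assumptions of this example.

```python
# Degrees of the simplices as listed in Table 5.1.
deg = {"a": 0, "b": 0, "c": 1, "d": 1, "ab": 1, "bc": 1,
       "cd": 2, "ad": 2, "ac": 3, "abc": 4, "acd": 5}

def boundary_matrix(rows, cols):
    """Exponents e of the non-zero entries t^e: row must be a codimension-one face of col."""
    entries = {}
    for col in cols:
        for row in rows:
            if set(row) <= set(col) and len(row) == len(col) - 1:
                entries[(row, col)] = deg[col] - deg[row]
    return entries

M1 = boundary_matrix(["d", "c", "b", "a"], ["ab", "bc", "cd", "ad", "ac"])
M2 = boundary_matrix(["ac", "ad", "cd", "bc", "ab"], ["abc", "acd"])

print(M1[("a", "ac")])    # exponent 3, i.e. the entry t^3
print(M1[("c", "bc")])    # exponent 0, i.e. the entry 1
print(M2[("ac", "abc")])  # exponent 1, i.e. the entry t
```

Pairs that are absent from the dictionary correspond to zero entries, for instance the pair (d, ab).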
To arrive at the desired representation of the boundary operator, we proceed inductively in dimension. The base case is simple: as $\partial_0 \equiv 0$, we have $Z_0 = C_0$, and the standard basis may be used for representing $\partial_1$. In the inductive step, we assume that we are given: 1) a matrix $M_k$ for $\partial_k$ relative to the standard basis $\{e_j\}$ for the chain group $C_k$ (which is, clearly, homogeneous); 2) a homogeneous basis $\{\hat{e}_i\}$ for $Z_{k-1}$. For the induction, we need to compute a homogeneous basis for $Z_k$ and a matrix representation $M_{k+1}$ of $\partial_{k+1}$ relative to the standard basis of $C_{k+1}$ and the computed basis.
The motivation for this step is the reduction algorithm for homology presented in Section 4.1, where one computes $M_k$ with respect to a nicer basis. One key distinction, though, is that we will be able to accomplish our goal via column operations only. That is, instead of reducing the matrix completely to its (Smith) normal form using both row and column operations, we can get halfway there by using Gaussian elimination on the columns, with the elementary column operations of types 1 and 3 only. So we can reduce $M_k$ to its so-called column-echelon form $\tilde{M}_k$, which is a lower staircase, where the steps have variable height, all landings have a width equal to one, and all non-zero elements must lie beneath the staircase (schematically, with $*$ standing for an arbitrary entry and the boxed entries marking the steps):
\[
\tilde{M}_k =
\begin{pmatrix}
\boxed{*} & 0 & 0 & 0 & \cdots & 0 \\
* & \boxed{*} & 0 & 0 & \cdots & 0 \\
* & * & 0 & 0 & \cdots & 0 \\
* & * & \boxed{*} & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
* & * & * & * & \cdots & 0
\end{pmatrix} .
\]
The marked elements in the example are pivots, and a row or column with a pivot is called a pivot row or pivot column. Starting with the leftmost column, we eliminate non-zero entries occurring in pivot rows in order of increasing row. To eliminate an entry, we use an elementary column operation of type 3 that maintains the homogeneity of the basis and matrix elements. We continue until we either arrive at a zero column or find a new pivot. If needed, we then perform a column exchange by an operation of type 1 to reorder the columns appropriately.
The pivot elements in the column-echelon form are exactly the same as the diagonal elements of the (Smith) normal form, which indicate the presence of torsion and free summands in homology. Moreover, the degrees of the basis elements on pivot rows are the same in both forms. The only difference here is that now these elements do not correspond to the component homology groups; rather, they give the torsion and free summands of the persistence module, which are all that we need for the desired pairing.
Observe that the number of pivots in an echelon form is $\operatorname{rank} M_k = \operatorname{rank} B_{k-1}$, and that the basis elements corresponding to the non-pivot columns of the column-echelon form comprise the desired basis for $Z_k$. For the complex in Figure 5.2, we continue:
\[
\tilde{M}_1 =
\begin{array}{c|ccccc}
  & cd & bc & ab & z_1 & z_2 \\ \hline
d & t & 0 & 0 & 0 & 0 \\
c & t & 1 & 0 & 0 & 0 \\
b & 0 & t & t & 0 & 0 \\
a & 0 & 0 & t & 0 & 0
\end{array}
\]
where $z_1 = ad - cd - t \cdot bc - t \cdot ab$ and $z_2 = ac - t^2 \cdot bc - t^2 \cdot ab$ form a homogeneous basis for $Z_1$.
So, if we are only interested in the degrees of the basis elements, we may read them off from the echelon form directly, and we may use the following corollary of the standard structure Theorem 4.1 to obtain the description.
Corollary 5.2.1.1. Let $\tilde{M}_k$ be the column-echelon form for $\partial_k$ relative to the bases $\{e_j\}$ and $\{\hat{e}_i\}$ for $C_k$ and $Z_{k-1}$, respectively. If row i has the pivot $\tilde{M}_k(i, j) = t^n$, it contributes the summand $\Sigma^{\deg \hat{e}_i} F[t]/(t^n)$ to the description of $H_{k-1}$. Otherwise, it contributes the summand $\Sigma^{\deg \hat{e}_i} F[t]$. In the language of $\mathcal{P}$-intervals for $H_{k-1}$, we get the pairs $(\deg \hat{e}_i, \deg \hat{e}_i + n)$ and $(\deg \hat{e}_i, \infty)$, respectively.
In our example, $\tilde{M}_1(1, 1) = t$. As $\deg d = 1$, the element contributes $\Sigma^{1} \mathbb{Z}_2[t]/(t)$, or the $\mathcal{P}$-interval $(1, 2)$, to the description of $H_0$.
From here, it is easy to obtain a matrix representation for $\partial_{k+1}$ with respect to the computed basis for $Z_k$ (see [44]).
Lemma 5.2.1.2. To represent $\partial_{k+1}$ relative to the standard basis for $C_{k+1}$ and the basis computed for $Z_k$, simply delete the rows of $M_{k+1}$ corresponding to the pivot columns of $\tilde{M}_k$.
Therefore, we have no need for row operations and can simply eliminate the rows corresponding to the pivot columns one dimension lower. In this way, we obtain the desired representation for $\partial_{k+1}$ in terms of the basis for $Z_k$, which completes the induction. In the considered example, the standard matrix representation for $\partial_2$ is
\[
M_2 =
\begin{array}{c|cc}
   & abc & acd \\ \hline
ac & t   & t^2 \\
ad & 0   & t^3 \\
cd & 0   & t^3 \\
bc & t^3 & 0 \\
ab & t^3 & 0
\end{array}
\]
To get a representation in terms of $C_2$ and the basis $(z_1, z_2)$ computed earlier for $Z_1$, we simply eliminate the bottom three rows. These rows are associated with the pivots of $\tilde{M}_1$ shown above, and we obtain
\[
\check{M}_2 = \left[
\begin{array}{c|cc}
    & abc & acd \\ \hline
z_2 & t & t^2 \\
z_1 & 0 & t^3
\end{array}
\right],
\]
where we have also replaced $ad$ and $ac$ with the corresponding basis elements $z_1 = ad - bc - cd - ab$ and $z_2 = ac - bc - ab$ (written here with the powers of $t$ suppressed).
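The row deletion of Lemma 5.2.1.2 can be sketched in Python as follows. This is only an illustration, not the Singular implementation: the dictionary encoding (an entry $t^n$ stored as the exponent $n$, a zero entry as `None`) and the function name `restrict_to_cycle_basis` are assumptions made for the example.

```python
# Standard matrix of the boundary operator in dimension 2 for the example:
# each entry t^n is stored as its exponent n, None stands for a zero entry.
M2 = {
    "ac": {"abc": 1, "acd": 2},
    "ad": {"abc": None, "acd": 3},
    "cd": {"abc": None, "acd": 3},
    "bc": {"abc": 3, "acd": None},
    "ab": {"abc": 3, "acd": None},
}

def restrict_to_cycle_basis(matrix, pivot_columns, relabel):
    """Delete the rows indexed by pivot columns of the echelon form one
    dimension lower and rename the surviving rows to the cycle basis."""
    reduced = {}
    for row, entries in matrix.items():
        if row in pivot_columns:
            continue  # this row corresponds to a pivot column: drop it
        reduced[relabel[row]] = dict(entries)
    return reduced

# Pivot columns of the echelon form in dimension 1 are cd, bc, ab;
# the remaining rows ac and ad are represented by z2 and z1.
M2_reduced = restrict_to_cycle_basis(
    M2, pivot_columns={"cd", "bc", "ab"}, relabel={"ac": "z2", "ad": "z1"}
)
print(M2_reduced)
```

No arithmetic is performed on the entries, which reflects the point of the lemma: the change of basis costs only a row selection.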
The persistence algorithm is based on these two results, which show that a full reduction to the normal form is unnecessary and that only column operations are needed. The algorithm has the same running time as Gaussian elimination over fields, so it takes $O(m^3)$ in the worst case, where $m$ is the number of simplices in the filtration.
For the example filtration in Figure 5.2, the marked 0-simplices $a, b, c, d$ and 1-simplices $ad, ac$ generate the T-intervals $L_0 = \{[0, \infty), [0, 1), [1, 1), [1, 2)\}$ and $L_1 = \{[2, 5), [3, 4)\}$, respectively.
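As a cross-check, these T-intervals can be read off from a creator/destroyer pairing with a few lines of Python. The sketch below is an illustration only: the pairing is the one listed in Table 5.2, while the filtration degrees are assumptions inferred from the intervals above, and the function name `barcode` is hypothetical. Simplices are named by their vertex letters, so the dimension of a simplex is the length of its name minus one.

```python
import math

# Degree at which each simplex enters the filtration (assumed values,
# inferred from the T-intervals of the example).
degree = {"a": 0, "b": 0, "c": 1, "d": 1, "ab": 1, "bc": 1,
          "cd": 2, "ad": 2, "ac": 3, "abc": 4, "acd": 5}

# Pairing of creators and destroyers; None marks a simplex without partner.
partner = {"a": None, "b": "ab", "c": "bc", "d": "cd", "ab": "b", "bc": "c",
           "cd": "d", "ad": "acd", "ac": "abc", "abc": "ac", "acd": "ad"}

def barcode(degree, partner, dimension):
    """Collect the intervals [birth, death) created by simplices of the
    given dimension."""
    intervals = []
    for simplex, mate in partner.items():
        if len(simplex) - 1 != dimension:
            continue
        if mate is None:
            intervals.append((degree[simplex], math.inf))
        elif len(mate) > len(simplex):
            # the partner has higher dimension, so this simplex is a creator
            intervals.append((degree[simplex], degree[mate]))
    return intervals

L0 = barcode(degree, partner, 0)
L1 = barcode(degree, partner, 1)
print(L0, L1)
```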

5.2.2 A pseudo-code of the revised version of the persistence algorithm.
We only need to measure the lifetimes of certain topological properties of a filtered simplicial complex, which appear and disappear as simplices are added to the complex. The incremental algorithm, which adds one cell at a time, computes a generator for each homology class and pairs the corresponding cell with its partner, the cell which eliminates the class. Once we have this pairing of creators and destroyers, we can read off the barcode and, with it, the Betti numbers themselves from the filtration. Information about the representative of the homology class for each cell $\sigma$ is stored in a $k$-chain, the so-called cascade, which is initially $\sigma$ itself. Observe that if a destroyer does not exist, then the homology class persists until the final simplex of the filtration, and the persistence of the corresponding Betti number $\beta_i$ is $\infty$. (The example filtration is also illustrated by Table 5.2 below.)
For instance, Table 5.2 below lists all required topological attributes for the filtration in Figure 5.2. It is easy to see that, e.g., the vertex $a$ has no partner, the vertex $b$ is paired with the edge $ab$, and the vertex $d$ is paired with the edge $cd$. Therefore, as illustrated in Figure 4.3, we get the intervals $[0, \infty)$, $[0, 1)$ and $[1, 2)$ for the $\beta_0$ barcode, respectively.

The algorithm determines the impact of the entry of $\sigma$ on the topology by checking whether the boundary $\partial\sigma$ of the regular cell is already a boundary in the complex $\mathcal{K}$. For this it sweeps $\partial(\mathrm{cascade}[\sigma])$ through the while loop of the pseudo-code given below. After this loop, there are two possibilities:
1. $\partial(\mathrm{cascade}[\sigma]) = 0$ and we can write $\partial\sigma$ as a sum of the boundary basis elements, so $\partial\sigma$ is already a $(k-1)$-boundary. Therefore $\mathrm{cascade}[\sigma]$ is a new $k$-cycle that $\sigma$ completed. In this case $\sigma$ is a creator of a new homology cycle and its cascade is a representative of the homology class it created.
2. $\partial(\mathrm{cascade}[\sigma]) \neq 0$ and $\partial\sigma$ becomes a boundary after we add $\sigma$. In this case $\sigma$ is a destroyer of the homology class of its boundary and its cascade is a chain whose boundary is a representative of the homology class it destroyed. So we pair $\sigma$ with the youngest cell $\tau$ in $\partial(\mathrm{cascade}[\sigma])$, i.e. the one that entered the filtration most recently.
σ            a   b   c   d   ab  bc  cd  ad   ac   abc  acd
partner[σ]   ∅   ab  bc  cd  b   c   d   acd  abc  ac   ad
cascade[σ]   a   b   c   d   ab  bc  cd  ad   ac   abc  acd
                                         cd   bc        abc
                                         bc   ab
                                         ab
Tab. 5.2: Data structure after running the persistence algorithm on the filtration in Figure 5.2. The cascade of a cell is the sum of the entries in its column. The simplices without partners, or with partners that come after them in the full order, are creators. The others are destroyers.
To summarize: at each step the algorithm identifies the $i$-th cell as a creator or as a destroyer and then computes its cascade; in the first case, the cascade is a generator for the homology class it creates; in the second case, the boundary of the cascade is a generator for the boundary class. (In the example, the vertex $c$ dies immediately after its birth, which corresponds to the interval $[1, 1)$.)
Below is the pseudo-code of the persistence algorithm:
for σ ← σ_1 to σ_n ∈ K
{
    partner[σ] ← ∅;
    cascade[σ] ← σ;
    while ∂(cascade[σ]) ≠ 0
    {
        τ ← Yngst(∂(cascade[σ]));
        if partner[τ] ≠ ∅
            cascade[σ] ← cascade[σ] + cascade[partner[τ]];
        else exit while;
    }
    if ∂(cascade[σ]) ≠ 0
    {
        τ ← Yngst(∂(cascade[σ]));
        partner[σ] ← τ;
        partner[τ] ← σ;
    }
}

The while loop corresponds to the processing of one row/column in Gaussian elimination. Here we repeatedly check whether the youngest cell $\tau$ in $\partial(\mathrm{cascade}[\sigma])$ has a partner. If not, we leave the loop. Otherwise, the cycle that $\tau$ created was destroyed by its partner, and we add the cascade of the partner of $\tau$ to the cascade of $\sigma$. Of course, addition of boundaries does not change homology classes.
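The loop described above can be made concrete with a minimal Python sketch over $\mathbb{Z}_2$. It is an illustration under stated assumptions, not a transcription of the Singular program of Appendix B: simplices are encoded as strings of sorted vertex letters, chains as Python sets (so that addition modulo 2 is the symmetric difference), and the names `boundary`, `chain_boundary` and `persistence_pairs` are hypothetical.

```python
def boundary(simplex):
    """Codimension-one faces of a simplex named by its vertex letters."""
    if len(simplex) == 1:
        return set()  # a vertex has empty boundary
    return {simplex[:i] + simplex[i + 1:] for i in range(len(simplex))}

def chain_boundary(chain):
    """Boundary of a chain over Z_2: symmetric difference of the
    boundaries of its simplices."""
    result = set()
    for simplex in chain:
        result ^= boundary(simplex)
    return result

def persistence_pairs(filtration):
    """Pair creators with destroyers for simplices listed in filtration order."""
    order = {simplex: i for i, simplex in enumerate(filtration)}
    partner, cascade = {}, {}
    for sigma in filtration:
        partner[sigma] = None
        cascade[sigma] = {sigma}
        while True:
            bd = chain_boundary(cascade[sigma])
            if not bd:
                break  # sigma completed a cycle: it is a creator
            tau = max(bd, key=order.__getitem__)  # youngest cell of the boundary
            if partner[tau] is None:
                # sigma destroys the class created by tau
                partner[sigma], partner[tau] = tau, sigma
                break
            # tau is already paired: add the cascade of its destroyer
            cascade[sigma] ^= cascade[partner[tau]]
    return partner, cascade

filtration = ["a", "b", "c", "d", "ab", "bc", "cd", "ad", "ac", "abc", "acd"]
partner, cascade = persistence_pairs(filtration)
print(partner["ad"], sorted(cascade["ad"]))
```

For the example filtration this sketch pairs $ad$ with $acd$ and accumulates the cascade $ad + cd + bc + ab$, in agreement with Table 5.2.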
For example, Table 5.3 lets us trace the iterations as the simplices $cd$ and then $ad$ are swept through the while loop.
cascade[cd]   τ   partner[τ]    cascade[ad]    τ   partner[τ]
cd            d   ∅             ad             d   cd
                                ad+cd          c   bc
                                ad+cd+bc       b   ab
                                ad+cd+bc+ab    -   -
Tab. 5.3: Tracing the successive simplices cd and ad of the considered complex through the iterations of the while loop.
The algorithm returns exhaustive persistence information and works for a large class of cell complexes. In order to formalize this assertion, we need the following definition. Let us assume that we have a filtered cell complex with a partial order on the cells.
A based persistence complex is a persistence complex equipped with a choice of basis in every dimension and persistence level, such that the basis in one fixed dimension and level maps to a subset of the bases in the same dimension and higher levels under inclusion.
Since we only use basis elements and the boundary operator for each given complex, and nothing specific to the geometry of the underlying complex, the algorithm computes persistence information for any based persistence complex (see [44]).
5.3 Examples of data processing.
In order to demonstrate that the implementation of the persistence algorithm over fields performs well for data sets sampled from geometrical objects, I provide here simple graphical examples of the work of the realized software.
Example 5.1. Let us consider a collection of points distributed uniformly around the boundaries of two regions lying far from each other, e.g. around the circles $x^2 + y^2 = 1$ and $(x - 10)^2 + y^2 = 1$. The input data file is shown in the left column of Table 5.4 below. The Voronoi diagram of the corresponding point set is shown in Figure 5.3.
Running the program yields the following information. We get ten bars which correspond to the zero homology group, and the lengths of the bars, i.e. the persistence of the zero Betti numbers, are:
97, 112, 104, 88, 82, 86, 93, 67, 75, 46.
Now we have to read the data with a filter parameter that fits the situation. In the considered case, the correct interpretation is possible if we take e.g. 100 as the minimal threshold for the persistence of the Betti numbers; we then obtain $\beta_0 = 2$, two simply connected areas, which is the true result.
Example 5.2. Now we add a third circle, $(x - 5)^2 + (y + 5)^2 = 1$, and proceed similarly with the three areas. The augmented input data file and the corresponding Voronoi diagram are shown in the right column of Table 5.4 and in Figure 5.4, respectively.
2 % 2 d input sample 2
56 % number of points 84
0 1 0 1
0.26 0.97 0.26 0.97
0.51 0.86 0.51 0.86
0.7 0.72 0.7 0.72
0.81 0.58 0.81 0.58
0.87 0.49 0.87 0.49
0.97 0.24 0.97 0.24
1 0 1 0
0.97 -0.25 0.97 -0.25
0.86 -0.52 0.86 -0.52
0.73 -0.68 0.73 -0.68
0.65 -0.76 0.65 -0.76
0.49 -0.87 0.49 -0.87
0.22 -0.97 0.22 -0.97
0 -1 0 -1
-0.24 -0.97 -0.24 -0.97
-0.5 -0.87 -0.5 -0.87
-0.67 -0.74 -0.67 -0.74
-0.78 -0.62 -0.78 -0.62
-0.87 -0.48 -0.87 -0.48
-0.96 -0.28 -0.96 -0.28
-1 0 -1 0
-0.98 0.21 -0.98 0.21
-0.87 0.5 -0.87 0.5
-0.76 0.64 -0.76 0.64
-0.67 0.74 -0.67 0.74
-0.51 0.86 -0.51 0.86
-0.27 0.96 -0.27 0.96
10 1 10 1
10.25 0.97 10.25 0.97
10.5 0.87 10.5 0.87
10.67 0.74 10.67 0.74
10.75 0.66 10.75 0.66
10.87 0.49 10.87 0.49
10.97 0.22 10.97 0.22
11 0 11 0
10.97 -0.25 10.97 -0.25
10.86 -0.5 10.86 -0.5
10.75 -0.66 10.75 -0.66
10.63 -0.77 10.63 -0.77
10.48 -0.88 10.48 -0.88
10.24 -0.97 10.24 -0.97
10 -1 10 -1
9.75 -0.97 9.75 -0.97
9.52 -0.88 9.52 -0.88
9.36 -0.77 9.36 -0.77
9.29 -0.71 9.29 -0.71
9.16 -0.54 9.16 -0.54
9.04 -0.28 9.04 -0.28
9 0 9 0
9.03 0.24 9.03 0.24
9.12 0.47 9.12 0.47
9.23 0.64 9.23 0.64
9.32 0.73 9.32 0.73
9.53 0.88 9.53 0.88
9.74 0.97 9.74 0.97
5 -4
5.23 -4.03
5.49 -4.13
5.69 -4.28
5.78 -4.38
5.88 -4.52
5.97 -4.76
6 -5
5.96 -5.28
5.86 -5.51
5.77 -5.64
5.64 -5.77
5.49 -5.87
5.25 -5.97
5 -6
4.75 -5.97
4.52 -5.88
4.37 -5.77
4.25 -5.66
4.15 -5.53
4.04 -5.26
4 -5
4.02 -4.79
4.12 -4.53
4.23 -4.36
4.34 -4.25
4.49 -4.14
4.72 -4.04
Tab. 5.4: The Singular input data file representing the collection of points sampled from two (left) and three (right) circles lying far from each other.
Fig. 5.3: The Voronoi diagram of the points from the left column of Table 5.4.
In this case, we obtain the following persistence values for the zero homology group:
162, 117, 112, 150, 116, 145, 98, 81, 81, 108.
As we can see, the picture becomes even clearer than in the previous example, and the separation of true features from topological noise has increased. If we set the minimal threshold for the persistence of the zero Betti numbers to e.g. 140, we get $\beta_0 = 3$, three simply connected areas, i.e. a correct interpretation of the obtained result.
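The thresholding step used in Examples 5.1 and 5.2 amounts to counting the bars whose persistence reaches the chosen threshold. A minimal Python sketch, with the hypothetical helper name `count_persistent` and the persistence values and thresholds quoted in the two examples:

```python
def count_persistent(lifetimes, threshold):
    """Number of bars whose persistence is at least the threshold."""
    return sum(1 for value in lifetimes if value >= threshold)

# Persistence values of the zero homology group from Examples 5.1 and 5.2.
example_5_1 = [97, 112, 104, 88, 82, 86, 93, 67, 75, 46]
example_5_2 = [162, 117, 112, 150, 116, 145, 98, 81, 81, 108]

beta0_two_circles = count_persistent(example_5_1, 100)
beta0_three_circles = count_persistent(example_5_2, 140)
print(beta0_two_circles, beta0_three_circles)
```

The result depends strongly on the threshold: with 100 instead of 140 in the second example, seven bars would survive, which is why the choice of this parameter is left to the interpretation step.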
Establishing the threshold value of this parameter is a key moment of the data interpretation, and it is exactly the boundary which separates the mathematical and geophysical parts of the considered project.
Naturally, when we have a huge number of points, data processing is time consuming. So, when we switch to other dimensions and/or to much bigger point clouds, the program requires more than just the several seconds needed in the considered examples; nevertheless, the running time remains within reasonable limits.
Fig. 5.4: The Voronoi diagram of the points from the right column of Table 5.4.
5.4 Summary and concluding remarks.
The main aim of this work was to give a topological description of an underground geological formation on the basis of exploration data, which is extremely desirable in oil and gas field prospecting. In the context of this PhD project, this oil-containing geological formation plays the role of a geometrical object which may have any shape. The data obtained during geological exploration experiments may be viewed as a cloud of points, and contain both noise and missing information. All the input information at our disposal corresponds to reflected post-explosion signals but is, nevertheless, strictly related to the geological formation surveyed in the exploration experiment. Since the sampled data are received on the Earth's surface, in the 3D problem the seismic data contain only two of the three spatial coordinates of the object under investigation. This is the difference from, for example, medical tomography, where it is possible to scan the geometrical object from all directions. In order to reconstruct the missing spatial coordinate, we needed to involve techniques based on a special kind of algebraic topology formalism. This implies, in particular, that the experimental data should be considered in terms of methods from computational topology, and that useful information can be extracted directly from the experimental, unprocessed exploration data by applying topological methods, notably methods from computational homology.
Construction of an approximation by simplicial complexes creates a topological setting which offers flexible tools for gauging various topological attributes. Here our interest lies in the detection of long-lived homology groups of a constructed similarity simplicial complex during the course of its history, which includes both addition and removal of simplices. An obvious consequence of such resilience is that it gives important information about the robustness of the considered topological constructions. The persistence of certain topological attributes also implies a prolonged deficiency in certain topological forms in simplicial complexes, corresponding to the deficiency of certain relations in the geological object, which is indicated by Betti numbers. This dynamical connectivity information could not be inferred by conventional methods. As the result, the created Singular software outputs barcodes and persistent Betti numbers of the previously created approximation/similarity complex.
This PhD research is devoted to the mathematical part of an ambitious project. I do believe that the combination of this work with the geophysical theory represents an exciting avenue of research and will be a great step towards the interpretation of geological exploration data.
Appendices
Appendix A
Basic Notions and Concepts
Following the expositions in the classic algebraic topology books mentioned earlier, we give here some basic notions and concepts, so that a reader without this mathematical background can follow the main statements of the work.
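A topology on a set $X$ is a system of subsets $\mathcal{A} \subseteq 2^X$ with the following properties:
1) $\emptyset, X \in \mathcal{A}$;
2) if $\{U_i \mid i \in I\} \subseteq \mathcal{A}$, then $\bigcup_{i \in I} U_i \in \mathcal{A}$;
3) if $\{U_i \mid i \in I,\ I \text{ finite}\} \subseteq \mathcal{A}$, then $\bigcap_{i \in I} U_i \in \mathcal{A}$.
The pair $(X, \mathcal{A})$ is called a topological space. The sets in $\mathcal{A}$ are the open sets and the complements of the open sets are the closed sets of $X$. A neighborhood of a point $x \in X$ is an open set that contains $x$. A cover is a collection of sets whose union is $X$. $X$ is compact if every cover of $X$ with open sets has a finite subcover. $X$ is connected if the only subsets of $X$ that are both open and closed are $\emptyset$ and $X$. The subspace topology of $Y \subseteq X$ is the system $\mathcal{A}_Y = \{Y \cap U \mid U \in \mathcal{A}\}$. The pair $(Y, \mathcal{A}_Y)$ is called a subspace of the topological space $(X, \mathcal{A})$.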
Suppose that we have topological spaces $X$ and $Y$. A function $f: X \to Y$ is continuous if the preimage of every open set in $Y$ is open in $X$. A map is a continuous function. The closure $\bar{A}$ of $A$ is the intersection of all closed sets containing $A$. The interior $\mathring{A}$ of $A$ is the union of all open sets contained in $A$. The boundary of $A$ is $\partial A = \bar{A} - \mathring{A}$. A homeomorphism is a bijective map whose inverse is also continuous. $X$ and $Y$ are homeomorphic or topologically equivalent or have the same topological type, written $X \approx Y$, if there is a homeomorphism between them. This is the most restrictive notion of equivalence in topology.
A homotopy between two maps $f_0, f_1: X \to Y$ is a continuous map $F: X \times [0, 1] \to Y$ such that $F(x, 0) = f_0(x)$ and $F(x, 1) = f_1(x)$ for all $x \in X$; then $f_0$ and $f_1$ are said to be homotopic, denoted $f_0 \simeq f_1$, via the homotopy $F$. Given two continuous maps $g: X \to Y$ and $h: Y \to X$ such that $g \circ h$ and $h \circ g$ are homotopic to $1_Y$ and $1_X$ respectively, $X$ and $Y$ are homotopy equivalent and have the same homotopy type: $X \simeq Y$. A space with the homotopy type of a point is contractible or null-homotopic. Homotopy type is a topological invariant since $X \approx Y \Rightarrow X \simeq Y$.
A covering space of $X$ is a topological space $Y$ together with a projection $p: Y \to X$ which satisfies the following property: for every $x \in X$ there is a path-connected neighborhood $U$ such that, for each path-connected component $V$ of $p^{-1}(U)$, the restriction $p|_V$ is a homeomorphism. If $Y$ is simply connected then the covering space is universal. Any two universal covering spaces of $X$ are topologically equivalent.
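The $d$-dimensional Euclidean space is the set of real $d$-tuples,
\[
\mathbb{R}^d = \{\, x = (x_1, x_2, \ldots, x_d) \mid x_i \in \mathbb{R} \,\}.
\]
The norm of $x \in \mathbb{R}^d$ is $\|x\| = \left(\sum_{i=1}^{d} x_i^2\right)^{1/2}$. The distance between points $x, y \in \mathbb{R}^d$ is $d(x, y) = \|x - y\|$. The affine hull of a set of points $T = \{p_0, p_1, \ldots, p_n\}$ in $\mathbb{R}^d$ is
\[
\operatorname{aff}(T) \overset{\mathrm{def}}{=} \Big\{ \sum_{i=0}^{n} \lambda_i p_i \;\Big|\; \sum_{i=0}^{n} \lambda_i = 1 \Big\}.
\]
$T$ is affinely independent if $\operatorname{aff}(T)$ is different from the affine hull of every proper subset of $T$. The convex hull of $T$ is
\[
\operatorname{conv}(T) \overset{\mathrm{def}}{=} \{\, x \in \operatorname{aff}(T) \mid \lambda_i \geq 0 \,\}.
\]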
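Let $T = \{p_0, p_1, \ldots, p_k\}$ be affinely independent. Then $\sigma = \operatorname{conv}(T)$ is a $k$-simplex with vertices $T$ and dimension $\dim(\sigma) = k = \operatorname{card}(T) - 1$, $k \leq d$. A face of $\sigma$ is a simplex $\tau = \operatorname{conv}(U)$ with $U \subseteq T$, i.e. $\tau \leq \sigma$; it is a proper face if $U$ is a proper subset of $T$. The barycentric coordinates of a point $x \in \sigma$ are the real numbers $\lambda_i$ with
\[
\sum_{i=0}^{k} \lambda_i p_i = x \quad \text{and} \quad \sum_{i=0}^{k} \lambda_i = 1.
\]
The barycenter of $\sigma$ is the point $b(\sigma)$ with barycentric coordinates $\lambda_i = \frac{1}{k+1}$.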
A path is a continuous map $\gamma: [0, 1] \to X$; it joins the initial point, $\gamma(0)$, to the terminal point, $\gamma(1)$. $X$ is path-connected if every pair of points in $X$ can be joined by a path. Two paths are equivalent if they are connected by a homotopy which leaves the common initial and terminal points fixed. The inverse of $\gamma$ is $\gamma^{-1}(t) = \gamma(1 - t)$. The product of two paths $\gamma$ and $\delta$ is defined if $\gamma(1) = \delta(0)$:
\[
(\gamma\delta)(t) =
\begin{cases}
\gamma(2t) & \text{if } 0 \leq t \leq \tfrac{1}{2}, \\
\delta(2t - 1) & \text{if } \tfrac{1}{2} \leq t \leq 1.
\end{cases}
\]
A path is a loop if $\gamma(0) = \gamma(1) = x_0$, where $x_0$ is called a basepoint. A trivial loop is a loop equivalent to the constant map $[0, 1] \to x_0$. The fundamental group of $X$ at the basepoint $x_0$, denoted $\pi(X, x_0)$, is the set of equivalence classes of loops based at $x_0$ together with the product operation. For a path-connected space $X$ any two groups $\pi(X, y_0)$ and $\pi(X, z_0)$ are isomorphic; therefore, we have a unique $\pi(X)$ for the entire space. The fundamental group is invariant over homotopy equivalent spaces. If $X$ is contractible then $\pi(X)$ is trivial; the converse does not hold, and a good example of this is the $d$-sphere with $d \geq 2$.
A topological space may be viewed as an abstraction of a metric space. Similarly, manifolds generalize the connectivity of $d$-dimensional Euclidean spaces, $\mathbb{R}^d$, by being locally similar, but globally different. A $d$-dimensional chart at some point $p \in X$ is a homeomorphism $\varphi: U \to \mathbb{R}^d$ onto an open subset of $\mathbb{R}^d$, where $U$ is a neighborhood of $p$. Loosely, a $d$-manifold is $X$ with such a chart at every point. So every point of a $d$-manifold has a neighborhood homeomorphic to $\mathbb{R}^d$. If it exists, the boundary of a $d$-manifold with boundary is always a $(d-1)$-manifold without boundary, and is the set of points of $X$ with neighborhoods homeomorphic to $\{x = (x_1, x_2, \ldots, x_d) \in \mathbb{R}^d \mid x_1 \geq 0\}$. Here we are interested in compact $d$-manifolds with boundary. A closed surface is a compact 2-manifold. All manifolds of dimension $d \leq 3$ are triangulable.
An ordered $k$-simplex, $\sigma = [p_0, p_1, \ldots, p_k]$, is a $k$-simplex together with a permutation of its vertices. Two orderings have the same orientation if they differ by an even permutation. All simplices of dimension 1 and higher have two orientations. The orientation of a $(k-1)$-face $\tau$ induced by $\sigma$ is
\[
\tau = (-1)^i [p_0, p_1, \ldots, p_{i-1}, p_{i+1}, \ldots, p_k],
\]
where a leading minus reverses the orientation. An orientation of a $k$-simplex $\sigma = [p_0, p_1, \ldots, p_k]$ is an equivalence class of orderings of the vertices of $\sigma$, where $(p_0, p_1, \ldots, p_k) \sim (p_{\rho(0)}, p_{\rho(1)}, \ldots, p_{\rho(k)})$ are equivalent if the parity of the permutation $\rho$ is even, that is, the sign of $\rho$ is 1; we denote an oriented simplex by $[\sigma]$. Two $k$-simplices sharing a $(k-1)$-face $\tau$ are consistently oriented if they induce different orientations of $\tau$. A triangulable $d$-manifold is orientable if all $d$-simplices in any of its triangulations can be oriented consistently, i.e. so that all adjacent pairs are consistently oriented. Otherwise, the $d$-manifold is non-orientable. Non-orientable closed surfaces cannot be embedded in $\mathbb{R}^3$.
A simplicial complex, $\mathcal{K}$, is a finite collection of simplices such that:
1) if $\sigma \in \mathcal{K}$ and $\tau$ is a face of $\sigma$ then $\tau \in \mathcal{K}$;
2) if $\sigma, \sigma' \in \mathcal{K}$ then $\sigma \cap \sigma'$ is empty or a face of both.
The multifaceted (algebraic, topological and combinatorial) nature of simplicial complexes makes them particularly convenient for modeling complex structures and the connectedness between different substructures. A subcomplex of $\mathcal{K}$ is a subset of $\mathcal{K}$ which is also a simplicial complex. The $k$-skeleton $\mathcal{K}^{(k)}$ of $\mathcal{K}$ is the subcomplex containing the simplices of dimension less than or equal to $k$. The vertex set is $\operatorname{vert} \mathcal{K} = \{\sigma \in \mathcal{K} \mid \dim \sigma = 0\}$. The underlying space of $\mathcal{K}$ is the part of space covered by the simplices in $\mathcal{K}$: $|\mathcal{K}| = \bigcup_{\sigma \in \mathcal{K}} \sigma$. The dimension is $\dim \mathcal{K} = \max\{\dim \sigma \mid \sigma \in \mathcal{K}\}$. A triangulation of a topological space $X$ is a simplicial complex, $\mathcal{K}$, such that $|\mathcal{K}| \approx X$. Triangulations enable us to represent topological spaces compactly as simplicial complexes, and $X$ is triangulable if it has a triangulation. Two simplicial complexes $\mathcal{K}$ and $\mathcal{L}$ are isomorphic, $\mathcal{K} \cong \mathcal{L}$, if $|\mathcal{K}| \approx |\mathcal{L}|$. Usually, the evolution of the complex is considered as its creation starting from the empty set; hence, the assumption is that simplices are added to the complex in increasing order. A filtration of a complex $\mathcal{K}$ is a nested sequence of subcomplexes
\[
\emptyset = \mathcal{K}^0 \subseteq \mathcal{K}^1 \subseteq \ldots \subseteq \mathcal{K}^d = \mathcal{K},
\]
where superscripts are ranks in the filtration sequence.
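Let $\mathcal{K}$ and $\mathcal{L}$ be two simplicial complexes with a map $\varphi: \operatorname{vert} \mathcal{K} \to \operatorname{vert} \mathcal{L}$ which takes the vertices of any simplex in $\mathcal{K}$ to the vertices of a simplex in $\mathcal{L}$. So, if $\sigma = \operatorname{conv} T$ with $T = \{p_0, p_1, \ldots, p_k\}$ is a simplex in $\mathcal{K}$, then $\operatorname{conv} \varphi(T)$ is a simplex in $\mathcal{L}$. A simplicial map $f: |\mathcal{K}| \to |\mathcal{L}|$ is the linear extension of a vertex map $\varphi$:
\[
f(x) = \sum_{i=0}^{k} \lambda_i \varphi(p_i),
\]
where $p_i \in T$, $x \in \sigma$ and $\lambda_i$ is the barycentric coordinate of $x$ that corresponds to $p_i \in T$.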
$\mathcal{K}$ and $\mathcal{L}$ are isomorphic or simplicially equivalent if they permit a bijective vertex map $\varphi$. In this case, $f$ is a homeomorphism between $|\mathcal{K}|$ and $|\mathcal{L}|$. There is a standard realization for a $k$-simplex as follows. The standard $k$-simplex, $\Delta^k$, is the convex hull of $\{e_i\}_{i \in \{0, 1, \ldots, k\}}$, where
\[
e_i = (0, \ldots, 1, \ldots, 0), \quad \text{with 1 in the $i$th position},\ i \in I = \{0, 1, \ldots, k\},
\]
is the $i$th standard basis vector for $\mathbb{R}^{k+1}$. For any indexing set $J \subseteq I$, $\Delta^J$ is the face of $\Delta^k = \Delta^I$ spanned by $\{e_j\}_{j \in J}$. The standard simplex may be subdivided using the barycenters of its faces to produce the simplicial complex $\mathcal{K}_k$ with $|\mathcal{K}_k| = \Delta^k$. Each non-empty face $\Delta^J$ of $\Delta^k$ has an associated vertex in $\mathcal{K}_k$. $\Delta^J$ is triangulated by the subcomplex $\mathcal{K}_J \subseteq \mathcal{K}_k$ with $|\mathcal{K}_J| = \Delta^J$.
It is possible to define simplicial complexes as purely combinatorial objects, which is crucial from a computational point of view. An abstract simplicial complex is a pair $(\mathcal{V}, \Sigma)$, where $\mathcal{V}$ is a finite set whose elements are referred to as vertices, and where $\Sigma$ is a family of non-empty subsets of $\mathcal{V}$ such that $\sigma \in \Sigma$ and $\tau \subseteq \sigma$ implies $\tau \in \Sigma$; the elements of $\Sigma$ are referred to as faces. The sets in $\Sigma$ are called abstract simplices. If a face $\sigma$ consists of $k + 1$ elements of $\mathcal{V}$, then $\sigma = \{p_0, p_1, \ldots, p_k\}$ is a $k$-simplex of $\Sigma$ with the 0-simplices $p_0, p_1, \ldots, p_k$ as vertices. The dimensions of $\sigma$ and $\Sigma$ are $\dim(\sigma) = \operatorname{card}(\sigma) - 1 = k$ and $\dim(\Sigma) \overset{\mathrm{def}}{=} \max\{\dim(\sigma) \mid \sigma \in \Sigma\}$. Intuitively, a simplicial complex structure on a space is an expression of the space as a union of points, intervals, triangles, and higher dimensional analogues. Abstract simplicial complexes are purely combinatorial objects, which enables computations of topological invariants. A graph is a 1-dimensional abstract simplicial complex. The nerve of a finite collection of sets $\mathcal{K}$ is the abstract simplicial complex
\[
\mathcal{N}(\mathcal{K}) \overset{\mathrm{def}}{=} \Big\{ A \subseteq \mathcal{K} \;\Big|\; \bigcap A \neq \emptyset \Big\}.
\]
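The closure condition in the definition of an abstract simplicial complex is easy to test by machine. The following Python sketch is an illustration only; the function names `is_abstract_simplicial_complex` and `dimension` are hypothetical, and faces are encoded as tuples of vertex labels.

```python
from itertools import combinations

def is_abstract_simplicial_complex(faces):
    """Check that a family of faces is closed under taking non-empty subsets."""
    family = {frozenset(face) for face in faces}
    if frozenset() in family:
        return False  # faces are required to be non-empty
    for face in family:
        for size in range(1, len(face)):
            for subset in combinations(face, size):
                if frozenset(subset) not in family:
                    return False
    return True

def dimension(faces):
    """Dimension of the family: the largest cardinality of a face minus one."""
    return max(len(face) for face in faces) - 1

# A full triangle on the vertices a, b, c with all of its faces.
triangle = [("a",), ("b",), ("c",),
            ("a", "b"), ("a", "c"), ("b", "c"),
            ("a", "b", "c")]
# The same family with the edge bc removed is no longer closed.
broken = [face for face in triangle if face != ("b", "c")]

print(is_abstract_simplicial_complex(triangle), dimension(triangle))
print(is_abstract_simplicial_complex(broken))
```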
A geometric realization of an abstract simplicial complex $(\mathcal{V}, \Sigma)$ is a map $r: \mathcal{V} \to \mathbb{R}^d$ for which $\mathcal{K} = \{\operatorname{conv} r(\sigma) \mid \sigma \in \Sigma\}$ is a simplicial complex, e.g. $r$ is given by $\bigcup_{\sigma \in \Sigma} \mathcal{K}(\sigma)$, where $\mathcal{K}(\sigma) = \operatorname{conv}\{e_{r(s)}\}_{s \in \sigma}$ is a simplicial complex, and $e_i$ denotes the $i$th standard basis vector. A realization gives us the familiar low-dimensional $k$-simplices: vertices, edges, triangles, tetrahedra etc. Every abstract simplicial complex of dimension $k$ has a geometric realization in $\mathbb{R}^d$ for some large enough $d$.
There is a strong relationship between the geometric and abstract definitions: every abstract simplicial complex, $(\mathcal{V}, \Sigma)$, is isomorphic to the geometric realization of some simplicial complex $\mathcal{K}$. An approximation of a topological space by simplicial complexes is a combinatorial way to describe it, and the homology can be computed using only linear algebra of finitely generated $\mathbb{Z}$-modules.
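The smallest subcomplex of $\mathcal{K}$ which contains a subset $\mathcal{L} \subseteq \mathcal{K}$ is the closure, $\overline{\mathcal{L}}$, of $\mathcal{L}$. The star of $\mathcal{L}$ contains all of the cofaces of $\mathcal{L}$, and
\[
\operatorname{link} \mathcal{L} \overset{\mathrm{def}}{=} \overline{\operatorname{star} \mathcal{L}} - \operatorname{star}(\overline{\mathcal{L}})
\]
is the boundary of $\operatorname{star} \mathcal{L}$. Stars and links correspond to open sets and boundaries in topological spaces.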
Assign to each simplex an arbitrary but fixed ordering of its vertices, i.e. impose a total order on the vertex set $\mathcal{V}$. Denote by
\[
\Sigma_k \overset{\mathrm{def}}{=} \{\sigma \in \Sigma \mid \operatorname{card}(\sigma) = k + 1\}
\]
the subset of $\Sigma$ with the ordered $k$-simplices as elements. A chain is a collection of abstract simplices which can be ordered so that $\sigma_0 \subset \sigma_1 \subset \ldots \subset \sigma_k$. A $k$-chain is a function $c = c_k: \Sigma_k \to \mathbb{Z}$, and can be written as a formal sum:
\[
c_k \overset{\mathrm{def}}{=} \sum_{i=1}^{N_k} n_i [\sigma_i],
\]
where $\sigma_i \in \Sigma_k$, $n_i \in \mathbb{Z}$ and $N_k$ is the cardinality of $\Sigma_k$ in $\mathcal{K}$.
Define the group of $k$-chains $C_k = C_k(X)$ in $X$ as the free abelian group on the set of oriented $k$-simplices $\Sigma_k$. That is, the group $C_k$ is formed by the set of all $k$-chains together with the operation of addition. The collection of $(k-1)$-dimensional faces of a $k$-simplex $\sigma$ is a $(k-1)$-chain itself and is the boundary, $\partial_k \sigma$, of $\sigma$. The boundary of a $k$-chain is the sum of the boundaries of the simplices in the chain, i.e. $\partial_k c = \sum_{i=1}^{N_k} n_i\, \partial_k \sigma_i$. For a $k$-chain $c$ we always have $\partial_{k-1} \partial_k c = 0$. The face operator $d_i: \Sigma_k \to \Sigma_{k-1}$ (for $0 \leq i \leq k$) maps an ordered $k$-simplex to its $i$th face:
\[
d_i \sigma \overset{\mathrm{def}}{=} [p_0, p_1, \ldots, p_{i-1}, p_{i+1}, \ldots, p_k].
\]
The boundary homomorphism $\partial_k: C_k \to C_{k-1}$ is defined linearly on a chain $c$ by its action on any simplex $\sigma = [p_0, p_1, \ldots, p_k] \in c$:
\[
\partial_k \sigma \overset{\mathrm{def}}{=} \sum_i (-1)^i d_i \sigma = \sum_i (-1)^i [p_0, p_1, \ldots, p_{i-1}, p_{i+1}, \ldots, p_k].
\]
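The chain complex is the sequence of chain groups connected by boundary homomorphisms:
\[
\ldots \xrightarrow{\partial_{k+2}} C_{k+1} \xrightarrow{\partial_{k+1}} C_k \xrightarrow{\partial_k} C_{k-1} \to \ldots \to C_1 \xrightarrow{\partial_1} C_0 \xrightarrow{\partial_0} 0,
\]
with $\partial_k \partial_{k+1} = 0$ for all $k$. The image and the kernel of a boundary homomorphism are
\[
\operatorname{Im} \partial_k = \{\partial_k c \in C_{k-1} \mid c \in C_k\} \quad \text{and} \quad \operatorname{Ker} \partial_k = \{c \in C_k \mid \partial_k c = 0\},
\]
respectively.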
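A $k$-chain $c$ is a $k$-cycle if it has no boundary, i.e. if $c \in \operatorname{Ker} \partial_k$. Since the $k$-cycles constitute the kernel of $\partial_k$, they form a subgroup of $C_k$, the $k$th cycle group
\[
Z_k \overset{\mathrm{def}}{=} \operatorname{Ker} \partial_k = \{c \in C_k \mid \partial_k c = 0\}.
\]
A $k$-chain $c$ is a $k$-boundary if it is the boundary of a $(k+1)$-chain, i.e. if $c \in \operatorname{Im} \partial_{k+1}$. Another name for a $k$-boundary is a null-homologous $k$-cycle. Since the $k$-boundaries lie in the image of $\partial_{k+1}$, they form another subgroup of $C_k$, the $k$th boundary group
\[
B_k \overset{\mathrm{def}}{=} \operatorname{Im} \partial_{k+1} = \{c \in C_k \mid c = \partial_{k+1} c' \text{ for some } c' \in C_{k+1}\}.
\]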
Since the boundary of a boundary is always empty, $\partial_{k-1} \partial_k c = 0$ for all $k$ and for every $c \in C_k$. Thus, the defined subgroups are nested: $B_k \subseteq Z_k \subseteq C_k$. The $k$-cycles are the basic topological objects that define the presence of $k$-dimensional holes in the simplicial complex. Also, many $k$-cycles may characterize the same hole, and cycles whose difference is a boundary are said to be homologous. The $k$th homology group is an algebraic invariant that is expressed as the quotient of the cycle group over the boundary group:
\[
H_k \overset{\mathrm{def}}{=} Z_k / B_k.
\]
If $z_1, z_2 \in Z_k$ are in the same homology class, then they are homologous, denoted $z_1 \sim z_2$, and $z_1 = z_2 + b$ for some $b \in B_k$. Since the groups $C_k(X)$ are equipped with the bases $\Sigma_k$, the homomorphisms $\partial_k$ can be expressed as matrices $D(k)$, which we describe next. The columns of $D(k)$ are parametrized by $\Sigma_k$, the rows are parametrized by $\Sigma_{k-1}$, and, for $\sigma \in \Sigma_k$ and $\tau \in \Sigma_{k-1}$, the entry $D(k)_{\tau\sigma}$ is 0 if $\tau \not\subset \sigma$, and is $(-1)^i$ if $\tau \subset \sigma$ and $\tau$ is obtained by removing the $i$th member of $\sigma$. So homology is algorithmically computable for simplicial complexes. These calculations can be performed by putting matrices constructed out of the $D(k)$'s into Smith normal form (for details see [10]).
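The construction of the matrices $D(k)$ can be checked on a small case. The Python sketch below is an illustration with hypothetical helper names (`simplices`, `boundary_matrix`, `matmul`): it builds $D(1)$ and $D(2)$ for the full simplex on four vertices, with simplices encoded as increasing tuples and members counted from $i = 0$, and verifies that the product $D(1) D(2)$ vanishes, i.e. that the boundary of a boundary is zero.

```python
from itertools import combinations

def simplices(vertices, k):
    """All k-simplices on the given ordered vertices, as increasing tuples."""
    return list(combinations(vertices, k + 1))

def boundary_matrix(rows, cols):
    """Matrix D(k): rows are (k-1)-simplices, columns are k-simplices.
    The entry is (-1)^i if the row simplex is obtained from the column
    simplex by removing its i-th member, and 0 otherwise."""
    index = {tau: r for r, tau in enumerate(rows)}
    D = [[0] * len(cols) for _ in rows]
    for c, sigma in enumerate(cols):
        for i in range(len(sigma)):
            tau = sigma[:i] + sigma[i + 1:]
            D[index[tau]][c] = (-1) ** i
    return D

def matmul(A, B):
    """Plain integer matrix product."""
    return [[sum(A[r][m] * B[m][c] for m in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

vertices = (0, 1, 2, 3)
D1 = boundary_matrix(simplices(vertices, 0), simplices(vertices, 1))
D2 = boundary_matrix(simplices(vertices, 1), simplices(vertices, 2))
product = matmul(D1, D2)
print(product)
```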
The homology groups are finitely generated and abelian, and can be computed using only linear algebra of finitely generated $\mathbb{Z}$-modules. The fundamental theorem on such groups implies that
\[
H_k \cong \mathbb{Z}^{\beta_k} \oplus T,
\]
where $\beta_k = \operatorname{rank} H_k = \operatorname{rank} Z_k - \operatorname{rank} B_k$ is the $k$th Betti number of a simplicial complex $\mathcal{K}$; it counts the number of $k$-dimensional holes in $\mathcal{K}$. $T$ is the torsion subgroup of $H_k$, and can be written as the direct sum of finitely many finite cyclic groups. This construction is justified by the fact that $H_k$ is an invariant over all simplicial complexes triangulating the same topological space $X \approx |\mathcal{K}|$. $H_k = H_k(\mathcal{K}) = H_k(X)$ is functorial, i.e. every continuous map $f: X \to Y$ induces a linear transformation $H_k(f): H_k(X) \to H_k(Y)$. The homology groups are invariants for $|\mathcal{K}|$ and for homotopy equivalent spaces. Formally,
\[
X \simeq Y \;\Rightarrow\; H_k(X) \cong H_k(Y) \quad \text{for all } k.
\]
In particular, $\beta_k(X) = \beta_k(Y)$ for homotopy equivalent spaces, and $\beta_k(X) = \beta_k(\mathcal{K}) = \beta_k$ is invariant over all triangulations of $X$.
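The formula $\beta_k = \operatorname{rank} Z_k - \operatorname{rank} B_k$ can be evaluated with elementary linear algebra. The following Python sketch works over the field $\mathbb{Z}_2$, so it agrees with the integral Betti numbers only under the assumption that the complex has no 2-torsion; the helper names are hypothetical. Here $\operatorname{rank} Z_k = \dim C_k - \operatorname{rank} \partial_k$ and $\operatorname{rank} B_k = \operatorname{rank} \partial_{k+1}$.

```python
def rank_mod2(matrix):
    """Rank of a 0/1 matrix over Z_2 by Gaussian elimination."""
    rows = [row[:] for row in matrix]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def boundary_matrix_mod2(faces, cells):
    """Boundary matrix over Z_2: rows are faces, columns are cells."""
    index = {face: r for r, face in enumerate(faces)}
    D = [[0] * len(cells) for _ in faces]
    for c, sigma in enumerate(cells):
        for i in range(len(sigma)):
            D[index[sigma[:i] + sigma[i + 1:]]][c] = 1
    return D

def betti_numbers(simplices_by_dim):
    """simplices_by_dim[k] lists the k-simplices as sorted tuples."""
    top = len(simplices_by_dim) - 1
    # rank_boundary[k] is the rank of the k-th boundary map; the maps in
    # dimensions 0 and top + 1 are zero.
    rank_boundary = [0] * (top + 2)
    for k in range(1, top + 1):
        if simplices_by_dim[k]:
            rank_boundary[k] = rank_mod2(
                boundary_matrix_mod2(simplices_by_dim[k - 1], simplices_by_dim[k]))
    return [len(simplices_by_dim[k]) - rank_boundary[k] - rank_boundary[k + 1]
            for k in range(top + 1)]

# Hollow triangle: one component and one 1-dimensional hole.
hollow_triangle = [[(0,), (1,), (2,)], [(0, 1), (0, 2), (1, 2)]]
# Final complex of the filtration example: two triangles glued along ac.
example_complex = [
    [("a",), ("b",), ("c",), ("d",)],
    [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d"), ("a", "c")],
    [("a", "b", "c"), ("a", "c", "d")],
]
betti_triangle = betti_numbers(hollow_triangle)
betti_example = betti_numbers(example_complex)
print(betti_triangle, betti_example)
```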
Since $H_k$ is a finitely generated group, the standard structure theorem states that it decomposes uniquely into a direct sum
\[
\bigoplus_{i=1}^{\beta_k} \mathbb{Z} \;\oplus\; \bigoplus_{j=1}^{l} \mathbb{Z}_{t_j},
\]
where $\beta_k, t_j \in \mathbb{Z}$, $t_j \mid t_{j+1}$, and $\mathbb{Z}_{t_j} = \mathbb{Z}/t_j\mathbb{Z}$. The left sum captures the free subgroup and its rank is the $k$th Betti number, $\beta_k$, of $\mathcal{K}$; the right sum captures the torsion subgroup, and the integers $t_j$ are the torsion coefficients for the homology group. Over a field $F$, a module becomes a vector space and is fully characterized by its dimension, the Betti number, and we get a full characterization for torsion-free spaces in this case. For torsion-free spaces in three dimensions, the Betti numbers have an intuitive meaning as a consequence of the Alexander Duality: $\beta_0$ counts the number of connected components of the space, $\beta_1$ is the dimension of any basis for the tunnels, and $\beta_2$ counts the number of enclosed spaces or voids. For instance, the torus is one connected component, has two tunnels, and encloses one void; correspondingly, $\beta_0 = 1$, $\beta_1 = 2$, and $\beta_2 = 1$.
Appendix B
The Singular Code
This appendix presents the code of three Singular programs which were created in the scope of the project. The first one is devoted to the computation of persistence Betti numbers via the corresponding barcode. This is the main result of this work, and the structure of this program was described in detail above.
The two other programs are dedicated to the calculation of Gröbner bases, and are realizations of the exact and approximate versions of the Buchberger-Möller algorithm. This was the initial direction of the research, but it was postponed for the sake of the main direction. As explained at the end of this chapter, this software can be used for a further development of the project.
B.1 Computation of persistence Betti numbers of noisy point cloud data.
After the program is launched, the main procedure Betti requires as input the name of the input file with the full directory path. For instance:
Betti(UsersOlegPhDGreatProgramInputData.txt);
The program returns the barcode and, via it, the persistence Betti numbers which correspond to the input point cloud data $X$.
//******************* The main procedure (Betti) ******************
proc Betti(string path)
{
string qhull="qvoronoi <"+path+" o TO ComplexInput";
int d=system("sh",qhull);
string s=read("./ComplexInput");
list input=inputData(s);
int N= input[1];
list FST=input[2];
list Bett=input[3];
int i,m;
list CD,PN,V,BV,E;
for(i=1;i<=N;i++)
{
V[i]=i;
PN[i]=0;
CD[i]=list(i);
BV[i]=list();
}
for(i=1;i<=size(FST);i++)
{
V=V+list(FST[i]);
PN[N+i]=0;
CD[N+i]=list(N+i);
// BV[N+i]=BS(N+i,V);
}
for(i=1;i<=size(FST);i++)
{
BV[N+i]=BS(N+i,V);
}
for(i=N+1;i<=N+size(FST);i++)
{
E=EBD(i,CD,PN,BV);
m=E[1];
CD=E[2];
if(m!=0)
{
PN[i]=m;
PN[m]=i;
}
}
list Pers=Output(PN, Bett);
write("outputPN",PN);write("outputBett",Bett);
return("Barcodes-",PN,"Pesistence-",Pers);
}
//*************** Eliminate-Boundaries (EBD) *******************
proc EBD(int i,list CD,list PN,list BV)
{
int m;
list A,Q,E;
A=YBD(CD[i],BV,Q);
m=A[1];
Q=A[2];
if(m==0)
{
E[1]=m;
E[2]=CD;
}
while(m!=0)
{
if(PN[m]==0)
{
E[1]=m;
E[2]=CD;
return(E);
}
else
{
CD[i]=ADD(CD[i],CD[PN[m]]);
A=YBD(CD[PN[m]],BV,Q);
m=A[1];
Q=A[2];
}
E[1]=m;
E[2]=CD;
}
return(E);
}
//**************** Youngest-Boundary of Cascade (YBD) ****************
proc YBD(list X,list BV,list Q)
{
int i,m;
list A;
for(i=1;i<=size(X);i++)
{
if(size(BV[X[i]])>0)
{
Q=ADD(Q,BV[X[i]]);
}
else
{
A[1]=m;
A[2]=Q;
return(A);
}
}
for(i=1;i<=size(Q);i++)
{
if(m<Q[i])
{
m=Q[i];
}
}
A[1]=m;
A[2]=Q;
return(A);
}
//********************** Boundary of Simplex (BS) *******************
proc BS(int i,list V)
{
int j,k;
list BV,P;
for(j=size(V[i]);j>=1;j--)
{
P=delete(V[i],j);
k=ID(P,V);
if(k>0)
{
BV=ADD(BV,k);
}
}
return(BV);
}
//*********************** Identification (ID) ************************
proc ID(list P,list V)
{
int i,e,k;
int j=1;
int s=size(P);
for(i=1;i<=s;i++)
{
while(j<size(V))
{
if(size(V[j])==s)
{
for(e=1;e<=s;e++)
{
if(P[e]!=V[j][e])
{
k=1;
break;
}
}
if(k==0)
{
return(j);
}
else
{
k=0;
j++;
}
}
else{j++;}
}
}
return(0);
}
//********************* Addition of Lists (ADD) **********************
proc ADD(list A,list B)
{
int i,j,k;
for(i=1;i<=size(B);i++)
{
for(j=1;j<=size(A);j++)
{
if(B[i]==A[j])
{
k=j;
break;
}
}
if(k==0)
{
A[size(A)+1]=B[i];
}
else
{
A=delete(A,k);
k=0;
}
}
return(A);
}
//********* Read OFF Format Data of "Qhull" (inputData) ************
proc inputData(string s)
{
int i,j,q,l;
list A,L,FST,Bett;
string V,s1,s2;
V=s[1];
execute("int Dim="+V+";");
A=NLE(s);
int N=A[1];
s1=A[2];
A=NEL(s1);
int F=A[1];
s2=A[2];
for(i=1;i<=N+1;i++)
{
s1=s2[find(s2,newline)+1,size(s2)];
s2=s1;
}
while(j<F)
{
V=s1[1,find(s1," ")-1];
execute("q="+V+";");
for(i=1;i<=q-1;i++)
{
A=NEL(s1);
l=A[1];
s1=A[2];
L=L+list(l+1);
s2=s1;
}
s2=s1[find(s1," ")+1,size(s1)];
V=s2[1,find(s2,newline)-1];
s1=s2[find(s2,newline)+1,size(s2)];
execute("l="+V+";");
L=L+list(l+1);
j=j+1;
FST[j]=L;
L=list();
Bett[N+j]=q-1;
}
for(i=1;i<=N;i++)
{
Bett[i]=0;
}
return(list(N,FST,Bett));
}
//*************** New element in a line (NEL) ******************
proc NEL(string s1)
{
int Val;
string s2,V;
list A;
s2=s1[find(s1," ")+1,size(s1)];
V=s2[1,find(s2," ")-1];
execute("Val="+V+";");
A[1]=Val;
A[2]=s2;
return(A);
}
//***************** New line element (NLE) ********************
proc NLE(string s2)
{
int f,Val;
string s1,V;
list A;
s1=s2[find(s2,newline)+1,size(s2)];
f=find(s1," ");
V=s1[1,f-1];
execute("Val="+V+";");
A[1]=Val;
A[2]=s1;
return(A);
}
//********* Output Betti numbers persistence (Output) ************
proc Output (list PN, list Bett)
{
int i;
list Pers;
for(i=1;i<=size(PN);i++)
{
if(PN[i]!=0)
{
if(PN[i]>=i)
{
Pers[i]=string("Betti(",Bett[i],") has persistence ",PN[i]-i);
}
if(PN[i]<i)
{
Pers[i]=string("Betti(",Bett[i],") has negative persistence ",
PN[i]-i, ", just marks a destroyer for the homology class");
}
}
}
return(Pers);
}
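The pairing list PN returned by Betti can be read as a set of barcode intervals.
The following Python sketch is not part of the Singular program; the function
name barcode_intervals is hypothetical. It assumes the convention visible in the
Output procedure: indices are 1-based, PN[i]=0 marks an unpaired simplex, and a
simplex i with PN[i]>i creates a class that is destroyed by simplex PN[i].

```python
def barcode_intervals(PN):
    """Return (birth, death) index pairs from a 1-based pairing list PN.

    PN[i] == 0 means simplex i is unpaired; PN[i] > i means simplex i
    creates a homology class that simplex PN[i] destroys.
    """
    intervals = []
    for i, partner in enumerate(PN, start=1):
        if partner > i:  # i is the creator, partner is the destroyer
            intervals.append((i, partner))
    return intervals


# Hypothetical pairing: simplex 2 is paired with simplex 3, the rest are unpaired.
PN = [0, 3, 2, 0]
intervals = barcode_intervals(PN)
persistence = [death - birth for birth, death in intervals]
```

Each destroyer appears only as the right end of an interval, which matches the
"negative persistence" remark printed by Output for the destroying simplex.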
B.2 Computation of the Gröbner basis by a realization of
the Buchberger-Möller algorithm.
There is a huge amount of literature devoted to Gröbner bases and to the
Buchberger-Möller algorithm for their computation, see for instance [1], [17], [19].
For an introduction to this area see the well-written [23], [24], and [7].
The main procedure BMA takes as input a matrix whose rows are the coordinates
of the points. For example:
matrix P[7][2]=1,2,3,4,5,6,7,8,9,10,11,12,13,14;
BMA(P);
The program returns the Gröbner basis which corresponds to the input points.
The generators are also part of the output.
//******************* The main procedure (BMA) ******************
ring K=(real,30),(x,y),(c,dp);
proc BMA(matrix P)
{
int n=nvars(basering);
int s=nrows(P);
int i,j,k,e,sizem,sizes;
list L=1;
list G,Q,AA;
poly t,h;
vector V;
ideal S,A,GG,LL;
module M;
while(size(L)!=0)
{
t=L[size(L)];
L=delete(L,size(L));
V=Plugin(P,t);
AA=Calc(V,M,s);
V=AA[1];
A=AA[2];
if(V==0)
{
for(i=1;i<=ncols(S);i++)
{
h=h+A[i]*S[i];
}
G=insert(G,(t-h));h=0;
}
else
{
M[size(M)+1+sizem]=V;
if(V==0)
{
sizem=1;
}
else
{
sizem=0;
}
for(i=1;i<=ncols(S);i++)
{
h=h+(A[i]*S[i]);
}
S[size(S)+1+sizes]=t-h;
if(t-h==0)
{
sizes=1;
}
else
{
sizes=0;
}
h=0;
Q=insert(Q,t);
for(j=1;j<=size(G);j++)
{
GG[j]=lead(G[j]);
}
for(j=1;j<=size(L);j++)
{
LL[j]=L[j];
}
attrib(GG,"isSB",1);attrib(LL,"isSB",1);
for(j=1;j<=n;j++)
{
for(i=1;i<=size(L);i++)
{
if((NF(var(j)*t,LL)!=0)&&(NF(var(j)*t,GG)!=0))
{
for(k=1;k<=size(L);k++)
{
if(L[k]<=(var(j)*t))
{
L=insert(L,var(j)*t,k-1);
e=1;
break;
}
}
if(e<>1)
{
L=insert(L,var(j)*t);
}
e=0;
break;
}
}
}
if(size(L)==0)
{
for(j=1;j<=n;j++)
{
if(NF(var(j)*t,GG)!=0)
{
e=e+1;
L[e]=var(j)*t;
}
}
e=0;
}
GG=0;LL=0;
}
}
return(G,Q);
}
//******************* (Plugin) ******************
proc Plugin(matrix P,poly t)
{
int n=nvars(basering);
int s=nrows(P);
int i,j;
poly f;
vector V;
for(i=1;i<=s;i++)
{
f=t;
for(j=1;j<=n;j++)
{
f=subst(f,var(j),P[i,j]);
}
V=V+f*gen(i);
}
return(V);
}
//******************* (Calc) ******************
proc Calc(vector V,module M,int s)
{
int r=size(M);
int i;
list W;
ideal B;
module N;
for(i=1;i<=r;i++)
{
N[i]=M[i]+gen(s+i);
}
attrib(N,"isSB",1);
option(redSB);
vector A=reduce(V,N);
option(noredSB);
W[1]=A[1..s];
for(i=1;i<=r;i++)
{
B[i]=-A[s+i];
}
W[2]=B;
return (W);
}
B.3 Computation of the Gröbner basis by a realization of
the approximative version of the Buchberger-Möller
algorithm.
In contrast to the exact Buchberger-Möller algorithm, the approximative variant
has a different structure, uses the so-called singular value decomposition, and,
in addition, takes as input an approximation parameter which defines the extent
of the approximation. For the sake of simplicity, in the program presented below
this parameter is set inside the body of the main procedure as the number eps
with the value 0.000000007. Of course, the parameter can easily be moved out
to become a preliminary input, and, in this case, the input for the main procedure
ABM looks like the following example:
matrix P[7][2]=14,13,12,11,10,9,8,7,6,5,4,3,2,1;
number eps=0.000000007;
ABM(P,eps);
The program returns the approximative Gröbner basis which corresponds to
the input points and to the value of the approximation parameter. The generators
are also part of the output.
//******************* The main procedure (ABM) ******************
ring K=(real,30),(x,y),(c,dp);
LIB "matrix.lib";

// Footnote: the procedure SVD used in the program is compiled and installed
// inside Singular. SVD is a C++ procedure from free open source software; it
// will be included in the next version of Singular.
LIB "aksaka.lib";
proc ABM(matrix P)
{
int n=nvars(basering);
int s=nrows(P);
int i,j,k,e,sizes;
number eps=0.000000007;
number f=eps+1;
list L=1;
list G,Q,AA;
poly t,h;
ideal S,A,GG,LL;
vector V;
matrix M[s][0];
while(size(L)!=0)
{
t=L[size(L)];
L=delete(L,size(L));
V=Plugin(P,t);
if(ncols(M)!=0)
{
AA=TLS(V,M,s);
}
V=AA[1];
A=AA[2];
f=AA[3];
if(f<=eps)
{
for(i=1;i<=ncols(S);i++)
{
h=h+A[i]*S[i];
}
G=insert(G,(t-h));
h=0;
i=0;
while(i<size(L))
{
i++;
if(L[i]/t!=0)
{
L=delete(L,i);
i--;
}
}
}
else
{
if(ncols(M)==0)
{
matrix M[s][1]=V;
}
else
{
M=concat(M,V);
}
for(i=1;i<=ncols(S);i++)
{
h=h+(A[i]*S[i]);
}
S[size(S)+1+sizes]=t-h;
if(t-h==0)
{
sizes=1;
}
else
{
sizes=0;
}
h=0;
Q=insert(Q,t);
for(j=1;j<=size(G);j++)
{
GG[j]=lead(G[j]);
}
for(j=1;j<=size(L);j++)
{
LL[j]=L[j];
}
attrib(GG,"isSB",1);attrib(LL,"isSB",1);
for(j=1;j<=n;j++)
{
for(i=1;i<=size(L);i++)
{
if((NF(var(j)*t,LL)!=0)&&(NF(var(j)*t,GG)!=0))
{
for(k=1;k<=size(L);k++)
{
if(L[k]<=(var(j)*t))
{
L=insert(L,var(j)*t,k-1);
e=1;
break;
}
}
if(e<>1)
{
L=insert(L,var(j)*t);
}
e=0;
break;
}
}
}
if(size(L)==0)
{
for(j=1;j<=n;j++)
{
if(NF(var(j)*t,GG)!=0)
{
e=e+1;
L[e]=var(j)*t;
}
}
e=0;
}
GG=0;
LL=0;
}
}
return(G,Q);
}
//************************ Auxiliary Procedures ******************
proc Plugin(matrix P,poly t)
{
int n=nvars(basering);
int s=nrows(P);
int i,j;
poly f;
vector V;
for(i=1;i<=s;i++)
{
f=t;
for(j=1;j<=n;j++)
{
f=subst(f,var(j),P[i,j]);
}
V=V+f*gen(i);
}
return(V);
}
//...................................................................
proc TLS(vector V,matrix M,int s)
{
list W;
int i,j,k;
int r=ncols(M);
number f,f1,f2;
number epsilon=1/1e2147483647;
matrix B[s][r+1]=concat(M,-V);
matrix v[s][1]=V;
matrix b[r][1];
matrix u[s][1];
matrix uu[s][1];
list L=system("svd",B);
for(k=1;k<=3;k++)
{
for(i=1;i<=nrows(L[k]);i++)
{
for(j=1;j<=ncols(L[k]);j++)
{
if(absValue(leadcoef(L[k][i,j]))<epsilon)
{
L[k][i,j]=0;
}
}
}
}
matrix l[r+1][1]=(transpose(L[3]))[r+1];
int rr=r+1;
while(absValue(leadcoef(l[r+1,1]))<=epsilon)
{
rr=rr-1;
l=(transpose(L[3]))[rr];
}
matrix n[s][1]=(L[1]*L[2])[rr];
for(i=1;i<=r;i++)
{
b[i,1]=l[i,1]/l[r+1,1];
}
for(i=1;i<=s;i++)
{
u[i,1]=-n[i,1]/l[r+1,1];
}
for(i=1;i<=s;i++)
{
uu[i,1]=v[i,1]-(M*b)[i,1];
}
for(i=1;i<=s;i++)
{
if(leadcoef(u[i,1])>=epsilon)
{
k=7;
break;
}
}
if(k==7)
{
for(i=1;i<=s;i++)
{
f1=f1+leadcoef(u[i,1]^2);
f2=f2+leadcoef(V[i]^2);
}
f=(wurzel(f1))/(wurzel(f2));
}
else
{
f=0;
}
k=0;
f1=0;
f2=0;
V=0;
for(i=1;i<=size(u);i++)
{
V=V+u[i,1]*gen(i);
}
W[1]=V;
W[2]=b;
W[3]=f;
return(W);
}
The Gröbner basis can be considered as one of the prospects for the further
development of the project; it represents a direction which gives a fresh
perspective as well as a new arsenal of computational tools to attack an old and
significant problem in data analysis. The Buchberger-Möller algorithm can be used
to compute the so-called multidimensional persistence; the Gröbner basis makes it
possible to reconstruct the entire multidimensional persistence vector space, and
provides a convenient way to compute the rank invariant. Since these matters are
beyond the scope of this work, we refer to [5] for the details (see also
[4], page 293).
Appendix C
A Reference Mapping Way and
a Representative Graph
I would like to present here an alternative way of constructing a similarity
complex, and a method for the visual depiction of the complex. These techniques
are not reflected in the software created within the scope of this work, but they
are nevertheless interesting in themselves.
C.1 Filtering.
We have to supply some additional input information ourselves. First of all, we
need to define a so-called reference map, or filter, which is chosen for a
partition of the given cloud of points X. This is a real-valued continuous
function $f : X \to Z$ to a fixed reference space, Z, whose dimension will be an
upper bound for the dimension of the similarity simplicial complex. This f can be
a well-known function which reflects geometric properties of the data set, or a
user-defined function which is chosen in order to understand how these properties
interact with it.
A polynomial function: it is natural to choose a function from the polynomial
ring $\mathbb{R}[x_1, x_2, \ldots, x_n]$. The advantages of this choice are the
possibility of using a well-developed theory and the comparative simplicity of
working with such functions.
Footnote: In order to avoid a dependence on the signs of the radius vectors in
the case of more complicated parametrizations, it is reasonable to take the ring
$\mathbb{R}[x_1^2, x_2^2, \ldots, x_n^2]$ instead.
Below we present a few functions which carry interesting geometric information
about X in general. All of these functions rely on the ability to compute
distances between points, so it is important to generate filters directly from the
metric (see [35]).
Density estimator: a non-negative function on X which reflects useful geometrical
information about the given data; it can be produced by any density
estimator applied to X.
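Gaussian kernel:
\[ f_\varepsilon(x) = C_\varepsilon \sum_{y} \exp\left( -\frac{d^2(x, y)}{\varepsilon} \right), \]
where $x, y \in X$, $C_\varepsilon$ is a constant such that
$\int f_\varepsilon \, dx = 1$, and $\varepsilon > 0$ controls the smoothness of
the estimation of the density function on X.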
Eccentricity:
\[ E_p(x) = \left( \frac{\sum_{y \in X} d(x, y)^p}{N} \right)^{1/p}, \]
where $x, y \in X$ and $1 \le p < +\infty$. Also
$E_\infty(x) = \max_{x' \in X} d(x, x')$ for $p = +\infty$. The idea is to
identify points which are far from the center without identifying an actual
center point; this is related to the notion of data depth.
A (normalized) graph Laplacian matrix:
\[ L(x, y) = \frac{w(x, y)}{\sqrt{\sum_z w(x, z)} \, \sqrt{\sum_z w(y, z)}}, \]
whose eigenvectors give us a set of orthogonal vectors on the vertex set
of the graph, which encode interesting geometric information and can be
used as filter functions on the data. The vertex set of the graph Laplacian
is the set of all points in X, and the weight of the edge between points
$x, y \in X$ is $w(x, y) = k(d(x, y))$, where k is some smoothing kernel such
as a Gaussian kernel (for more detail see [40]).
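The first two filters above can be sketched in a few lines of Python. This is an
illustration only and not part of the software of this work: the function names
are hypothetical, the points are assumed to be tuples in Euclidean space, and the
normalizing constant of the Gaussian kernel is omitted, since it rescales all
filter values by the same factor.

```python
import math


def gaussian_density(points, eps):
    """Unnormalized Gaussian kernel filter: sum over y of exp(-d(x, y)^2 / eps)."""
    return [sum(math.exp(-math.dist(x, y) ** 2 / eps) for y in points)
            for x in points]


def eccentricity(points, p=1.0):
    """E_p(x) = (sum_y d(x, y)^p / N)^(1/p); p = math.inf gives max_y d(x, y)."""
    n = len(points)
    values = []
    for x in points:
        dists = [math.dist(x, y) for y in points]
        if math.isinf(p):
            values.append(max(dists))
        else:
            values.append((sum(d ** p for d in dists) / n) ** (1.0 / p))
    return values


# Three points close together and one far away from them.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
density = gaussian_density(points, eps=1.0)
ecc_1 = eccentricity(points, p=1.0)
ecc_inf = eccentricity(points, p=math.inf)
```

In this toy sample the isolated point has the lowest density value and the highest
eccentricity, which is the behavior the two filters are meant to expose.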
Filters determine the reference spaces to which we produce a map. In the
simplest case, $Z = \mathbb{R}$, but it can be $\mathbb{R}^2$, the unit circle
$S^1$ in the plane, or any other space where a covering can be constructed
relatively easily. Then we have to establish some parameters for the
construction method of such a covering.
We start by finding the range of the function restricted to the given
points. For the sake of simplicity, let us consider $Z = \mathbb{R}$. So suppose
that we are given a space equipped with a continuous map $f : X \to \mathbb{R}$
and a covering
\[ \mathcal{U} = \{U_i\}_{i \in I} \ \Big| \ F \subseteq \bigcup_{i \in I} U_i, \quad F = \mathrm{Im}(f) \]
with some integer indexing set I. In order to use this construction, one must
develop methods for creating coverings. For instance, the simplest cover can be
constructed by dividing F into smaller overlapping intervals, which also makes it
possible to parameterize the covering by two parameters that can be used for
resolution control: the length of the smaller intervals
and the percentage of overlap between successive intervals.
It is natural to represent the covering method as a covering of F by open balls
\[ B_\varepsilon(x) = \{ y \in \mathbb{R}^d \mid d(x, y) < \varepsilon \} \]
with a positive real radius $\varepsilon$. Below we consider two modes of such
constructions for two different input values: the distance between the centers of
the balls, R, and a positive real radius, $\varepsilon$. These parameters can be
interpreted as the amount of blurring applied to Z.
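The covering $\mathcal{U}[R, \varepsilon]$ of $Z = \mathbb{R}$ consists of all
intervals of the form
\[ U_i = [\, iR - \varepsilon, \ (i+1)R + \varepsilon \,]. \]
Here we have two parameters for resolution control, and the covering
dimension will be 1 while $\varepsilon < R/2$, since there will be no non-empty
threefold overlaps in this case. It is easy to obtain a corresponding covering
of $\mathbb{R}^n$ by taking products of the intervals.
Let $N \ge 2$ be an integer. The covering
$\mathcal{U}[N, \varepsilon] = \{U_j\}_{0 \le j \le N}$ of $Z = S^1$ is
defined by setting
\[ U_j = \left\{ (\cos x, \sin x) \ \Big| \ x \in \left[ \frac{2\pi j}{N} - \varepsilon, \ \frac{2\pi j}{N} + \varepsilon \right] \right\}, \quad \text{whenever } \varepsilon > \frac{\pi}{N}. \]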
In order to use this predefined covering of the reference space for our purposes,
we pull it back to a covering of X by the sets
$f^{-1}(\mathcal{U}[N, \varepsilon])$. Since f is a continuous
function, the fibers that form its domain,
\[ X_i = f^{-1}(U_i) = \{ x \in X \mid f(x) \in U_i \}, \]
also form the corresponding open covering $\{X_i\}_{i \in I}$ of X.
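The interval covering of $Z = \mathbb{R}$ and its pull-back to the point cloud can
be sketched as follows. This Python fragment is a minimal illustration under
stated assumptions, not the implementation used in this work: the function names
interval_cover and pull_back are hypothetical, the filter values are given as a
plain list, and only the intervals that meet the range of f are generated.

```python
import math


def interval_cover(lo, hi, R, eps):
    """Intervals U_i = [i*R - eps, (i+1)*R + eps] that meet [lo, hi], keyed by i."""
    i_min = math.ceil((lo - eps) / R) - 1
    i_max = math.floor((hi + eps) / R)
    return {i: (i * R - eps, (i + 1) * R + eps) for i in range(i_min, i_max + 1)}


def pull_back(values, cover):
    """X_i = f^{-1}(U_i): indices of the points whose filter value lies in U_i."""
    preimages = {}
    for i, (a, b) in cover.items():
        members = [idx for idx, v in enumerate(values) if a <= v <= b]
        if members:  # keep only the non-empty preimages
            preimages[i] = members
    return preimages


# Filter values f(x) of four sample points; eps < R/2, so overlaps are at most twofold.
values = [0.1, 0.9, 1.05, 2.5]
cover = interval_cover(min(values), max(values), R=1.0, eps=0.2)
preimages = pull_back(values, cover)
```

A point whose filter value falls into the overlap of two successive intervals
appears in two preimages; these shared points are what later produce the edges of
the complex.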
C.2 Clustering.
Now it is necessary to describe a method for transporting this construction
from the setting of topological spaces to the setting of point clouds. Since the
elements $X_i$ of the covering of X might consist of several connected
components, we need to treat each such connected component as a separate subset
in $\{X_i\}_{i \in I}$.
We use so-called clustering in order to avoid this kind of difficulty. For all
$i \in I$, let us consider the decomposition
\[ f^{-1}(U_i) = \bigsqcup_{j=1}^{J_i} V_{i,j} \]
of $X_i$ into its path-connected components, where $J_i$ is the number of such
components in $X_i$. Each $X_i$ is now presented as the union of the disjoint sets
$V_{i,j}$, treated as the representative points which make up a vertex set.
Finally, we can represent the covering of X obtained from $\mathcal{U}$ as
\[ \{X_i\}_{i \in I} = \{V_{i,j}\}_{i \in I, \ j \in [1, \ldots, J_i]} \overset{\mathrm{def}}{=} \{Q_\alpha\}_{\alpha \in A}, \quad \text{where } A = \Big\{ 1, \ldots, \sum_{i \in I} J_i \Big\}. \]
Let us call these subsets of sampled points $Q_\alpha$ clusters. This construction
depends on the filter as well as on the values of the parameters of the covering
of the parameter space.
Clustering refers to the process of partitioning a data set into a number
of parts which are recognizably distinguishable from each other. So the goal of
clustering is to identify high-density regions which are separated by low-density
regions. Finding a good clustering is a challenging problem and a fundamental
issue in computing the similarity simplicial complex. There are
many different principles for constructing a clustering (e.g. see [21] or the
diffuse speculations about this in [41]).
Usually, algorithms require parameters to be set before an output is received.
Such parameters are often designated arbitrarily, but the arbitrariness of the
various threshold choices does not go with a lack of robustness. Some work in
clustering theory has been done in trying to determine the optimal choice of
$\varepsilon$, but it is much more informative to consider the so-called
hierarchical clustering (see [22]). This kind of clustering combines data objects
into clusters, those clusters into larger clusters, and so forth, creating a
hierarchy. A tree representing this hierarchy of clusters is a dendrogram, which
provides a summary of the behavior of the clustering under all possible values of
the parameter at once.
For instance, one can construct data sets which have been thresholded at two
different values, and the behavior of clusters under the inclusion of the set with
the tighter threshold into the one with the looser threshold is informative about
what is happening in the data set. Of course, we can play an even more complicated
game when we have more than one threshold parameter or when we associate
many functions with each data point instead of just one (see the footnote at the
end of this paragraph). Individual data objects are the leaves of the tree, and
the interior nodes are nonempty clusters. This allows us to explore the data at
different levels of granularity.
Footnote: As an example of $Z = S^1$, consider a parameter space defined by two
functions f and g which are related such that $f^2 + g^2 = 1$. A very simple
covering for such a space is generated by considering overlapping intervals of
equal size. If we used M functions, let $\mathbb{R}^M$ be our parameter space.
We would then have to find a covering of an M-dimensional hypercube which is
defined by the ranges of the M functions.
Hierarchical clustering methods based on linkage metrics result in clusters
of proper shapes, and are categorized into so-called agglomerative (bottom-up)
and divisive (top-down) approaches. Agglomerative clustering starts with
singleton clusters and recursively merges two or more of the most similar clusters.
Divisive clustering starts with a single cluster containing all data points and
recursively splits that cluster into appropriate subclusters. The process
continues until a stopping criterion is met, e.g. until the requested number k
of clusters is reached.
The advantages of hierarchical clustering:
    flexibility regarding the level of granularity;
    ease of handling any form of similarity or distance;
    applicability to any attribute type.
The disadvantages of hierarchical clustering:
    the difficulty of choosing the right stopping criteria;
    most hierarchical algorithms do not revisit (intermediate) clusters once
    they are constructed.
There is a truly overwhelming number of different clustering techniques and
algorithms (e.g. birch, agnes, mst, chameleon and others, see the survey in [22]),
and there is even open source clustering software. The classification of
clustering algorithms is neither straightforward nor canonical. In fact, the
different classes of algorithms overlap.
After partitioning X into subsets which correspond to the elements of the
covering $\{X_i\}_{i \in I}$, we can use the interaction of the subsets formed in
this way with each other for an approximate representation of the exploration data.
C.3 The cluster complex.
In order to switch from the topological construction to a point cloud, we apply
standard clustering algorithms to subsets of the given data and then use the
interaction of the partial clusters with each other, just as we did before with
the elements of the covering.
The cluster complex of the covering $\{X_i\}_{i \in I}$ is the nerve of the
covering of X by the sets which are the path-connected components of each $X_i$.
In other words, this is the abstract simplicial complex
$\mathcal{C}(\{X_i\}_{i \in I}) = \mathcal{C}(X, \{X_i\}_{i \in I})$
whose vertex set is the indexing set A defined above, and where a family
$\{\alpha_0, \alpha_1, \ldots, \alpha_k\}$ spans a k-simplex if and only if
\[ Q_{\alpha_0} \cap Q_{\alpha_1} \cap \ldots \cap Q_{\alpha_k} \neq \emptyset. \]
So we should find the corresponding clusters $\{Q_\alpha\}_{\alpha \in A}$, each
of which we treat as a vertex in our complex whenever
$Q_{\alpha_j} \neq \emptyset$. And then, whenever
$Q_{\alpha_0}, Q_{\alpha_1}, \ldots, Q_{\alpha_k}$ are overlapping, we add a
k-simplex to the complex.
The set map $A \to I$ yields the map of simplicial complexes
\[ \mathcal{C}(\{Q_\alpha\}_{\alpha \in A}) \to \mathcal{N}(\{X_i\}_{i \in I}). \]
This is a kind of projection, and $\mathcal{C}(\{X_i\})$ is more sensitive than
$\mathcal{N}(\{X_i\})$. Actually, $\mathcal{C}(\{X_i\})$ is homeomorphic to X,
while $\mathcal{N}(\{X_i\})$ is not.
Footnote (to the mention of open source clustering software in Section C.2): for
example see [38].
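The nerve condition can be checked directly on clusters given as sets of point
indices. The Python sketch below is an illustration under assumptions, not the
software of this work: the function name nerve and the dictionary layout (cluster
label mapped to the indices of its points) are hypothetical, and the enumeration
is brute force, which is adequate only for a small number of clusters.

```python
from itertools import combinations


def nerve(clusters, max_dim=2):
    """Simplices of the nerve, as sorted tuples of cluster labels, up to max_dim."""
    sets = {label: set(members) for label, members in clusters.items() if members}
    labels = sorted(sets)
    simplices = []
    for k in range(max_dim + 1):
        # A family of k+1 clusters spans a k-simplex iff it has a common point.
        for family in combinations(labels, k + 1):
            if set.intersection(*(sets[a] for a in family)):
                simplices.append(family)
    return simplices


# Four clusters, each sharing one point with the next one: a combinatorial circle.
clusters = {0: [0, 1], 1: [1, 2], 2: [2, 3], 3: [3, 0]}
complex_ = nerve(clusters)
```

For this input the result consists of four vertices and four edges and contains no
triangle, so the complex has the shape of a circle.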
As mentioned before, our input set is equipped with the Euclidean metric. Let
us assume that the clusters $Q_\alpha$ are represented by equal balls with a large
enough radius, i.e. that one covering of X is given by the family
\[ B_\varepsilon(X) \overset{\mathrm{def}}{=} \{ B_\varepsilon(x) \mid x \in X, \ \varepsilon > 0, \ \forall Q_\alpha \}. \]
One can construct the nerve
$\mathcal{N}_\varepsilon = \mathcal{N}(B_\varepsilon(X))$. According to the
definition, for $\varepsilon \ge 0$ the cluster complex
$\mathcal{C}_\varepsilon(X) = \mathcal{C}(B_\varepsilon(X))$ includes the
k-simplex $\sigma = [p_0, p_1, \ldots, p_k]$ if and only if the balls
$B_\varepsilon(p_i)$ have a non-empty common intersection.
In our case of Euclidean data, there is the following consequence of the nerve
theorem (see [3]).
Theorem C.1. For a finite set of points in Euclidean space,
$X \subset \mathbb{R}^d$, there is a number $\varepsilon_0 > 0$ such that
$\mathcal{C}_\varepsilon(X)$ is homotopy equivalent to X whenever
$\varepsilon \le \varepsilon_0$.
Moreover, if Y is sampled from X, and $B_\varepsilon(Y)$ covers and is homotopy
equivalent to X, then the subcomplex
$\mathcal{C}_\varepsilon(Y) \subseteq \mathcal{C}_\varepsilon(X)$ on the vertices
in Y is also homotopy equivalent to X, and therefore $\mathcal{C}_\varepsilon(Y)$
has the same homology as X.
We finish this section with three simple examples of constructions which
produce a multiresolution structure.
Example C.1.1. If $\{ B_{\varepsilon_i} \mid X \subseteq B_{\varepsilon_i}, \ i \in I \}$
is a representation of the covering by balls with equal radii, then we get a
diagram
\[ \mathcal{C}_{\varepsilon_0}(X) \to \mathcal{C}_{\varepsilon_1}(X) \to \ldots \to \mathcal{C}_{\varepsilon_n}(X) \]
of inclusions of each complex into the next one, since each complex corresponds
to a smaller parameter value than its successor.
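Example C.1.2. Let us consider the covering
$\{ B^R_{\varepsilon_i} \mid i \in I \}$ of $Z = \mathbb{R}$ with the
integer indexing set, where R is the second parameter, with the meaning of the
distance between the centers of the balls. The identity map on Z for some
$\varepsilon_0 \le \varepsilon_1 \le \ldots \le \varepsilon_n$
provides a map of coverings
\[ B^R_{\varepsilon_0} \to B^R_{\varepsilon_1} \to \ldots \to B^R_{\varepsilon_n} \]
which consists of inclusions of intervals into the intervals with the same center
but with a larger diameter. Finally, we get the diagram
\[ \mathcal{C}(f^{-1}(B^R_{\varepsilon_0})) \to \mathcal{C}(f^{-1}(B^R_{\varepsilon_1})) \to \ldots \to \mathcal{C}(f^{-1}(B^R_{\varepsilon_n})). \]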
Example C.1.3. For the equal-radius ball covering
$\{ B^R_{i, \varepsilon} \mid i \in I \}$ of $Z = \mathbb{R}$,
consider a map of coverings $B^R_\varepsilon \to B^{2R}_\varepsilon$ induced by
the map of integers $k \mapsto \lfloor k/2 \rfloor$. This gives us a diagram of
simplicial complexes
\[ \ldots \to \mathcal{C}(f^{-1}(B^{R/4}_\varepsilon)) \to \mathcal{C}(f^{-1}(B^{R/2}_\varepsilon)) \to \mathcal{C}(f^{-1}(B^{R}_\varepsilon)). \]
The farther to the left one moves here, the more refined the coverings of X, and
therefore the resolution of the picture of the object under investigation,
become.
The complex $\mathcal{C}(Z, R, \varepsilon)$ captures large-scale topological
features and ignores small-scale ones. The extent of scale is defined in terms of
the parameters, which are nested. So the natural inclusion maps
\[ \mathcal{C}(f^{-1}(B^R_\varepsilon)) \hookrightarrow \mathcal{C}(f^{-1}(B^{R'}_{\varepsilon'})), \]
whenever $R \le R'$ and/or $\varepsilon \le \varepsilon'$, induce corresponding
maps between homology groups. Finally, we have a similarity theorem or heuristic
relating the homology $H_k$ of the underlying object to the
persistence homology group
$\mathrm{Im} [\, H_k(Z, R, \varepsilon) \to H_k(Z, R', \varepsilon') \,]$ under
reasonable geophysical sampling and for some choice of the parameters.
Persistence homology here studies the full system of the homology groups
$H_k(Z, R, \varepsilon)$ together with the induced maps between them by varying
the nested parameters over a large range. In this way, we decant the features
which are no longer visible at scale $R'$ and/or $\varepsilon'$.
C.4 The Mayer-Vietoris blowup.
Let $K_I$ denote the abstract simplicial complex with vertex set I. For any
non-empty $J \subseteq I$, the subcomplex $K_J \subseteq K_I$ is the face spanned
by J. Here we can define the so-called Mayer-Vietoris blowup of X associated to
the covering as the subspace
\[ M(X, \{X_i\}_{i \in I}) \overset{\mathrm{def}}{=} \bigcup_{\emptyset \neq J \subseteq I} K_J \times \bigcap_{j \in J} X_j \ \subseteq \ K_I \times X. \]
Here we use the fact that the map $M(X, \{X_i\}_{i \in I}) \to X$ is a homotopy
equivalence when X has the homotopy type of a finite complex and when all
$X_i$ are open sets ([18],[34]).
In the example pictured below, a graph containing three cycles is covered
by two sets and is blown up into two pieces, each with two 1-cycles. Since
the middle cycle of the original space is contained in the intersection of the
cover sets, it exists in both local pieces. To recover the global topology, we
equate the two copies of the middle cycle by gluing a cylinder to them. The
resulting construction, the so-called Mayer-Vietoris blowup complex, has the same
number of cycles as the original space but also incorporates the geometric cover
information within its structure.
Fig. C.1: Given a space equipped with a cover (a), we first blow up the space into
local pieces (b) and then glue the pieces back to get the blowup complex (c),
giving us a filtration consisting of two complexes at times t = 0 and t = 1,
respectively. The persistence barcode (d) localizes the topology of the original
space with respect to the cover.
In constructing the blowup complex there are no tearing or gluing manipulations,
but only stretching of certain pieces. Therefore, the blowup complex has
the same topology as the original space. This fact is reflected in the lemma below
(see [34]).
Lemma C.2. The projection $\pi_X : M(X, \{X_i\}) \to X$ is a homotopy equivalence
in the following cases:
    $\{X_i\}$ is an open covering of a normal space, e.g. any subspace of $\mathbb{R}^d$;
    $\{X_i\}$ is a covering of a simplicial complex by subcomplexes.
Therefore, $\pi_X$ induces an isomorphism at the homology level. That is,
$M(X, \{X_i\}) \simeq X$, and $H_*(M(X, \{X_i\})) \cong H_*(X)$.
So the geometry which is contained within the cover can be incorporated into
homology by building the blowup complex and computing its persistent homology.
The so-called localization obtained in this way reflects the quality of the given
cover and gives a better description of the cover, one which portrays the geometry
of the space via the location of the attributes [45].
C.5 The similarity graph.
We start with a simple definition.
A graph, G, is a subset of $\mathbb{R}^3$ which is made up of a finite collection
of points $\{v_0, v_1, \ldots, v_n\}$, called vertices, together with
straight-line segments $\{e_0, e_1, \ldots, e_m\}$ joining these vertices, called
edges, which satisfy the following intersection conditions:
1) the intersection of distinct edges either is empty or consists of exactly
one vertex;
2) if an edge and a vertex intersect, then the vertex is an endpoint of the
edge.
More explicitly, an edge $[v_i, v_j]$ joining vertices $v_i$ and $v_j$ is the set
of points
\[ \{ x \in \mathbb{R}^3 \mid x = t v_i + (1 - t) v_j, \ 0 \le t \le 1 \}. \]
A path in G is an ordered sequence of edges of the form
\[ \{ [v_0, v_1], [v_1, v_2], \ldots, [v_{l-1}, v_l] \}. \]
Since any k-dimensional simplicial complex can be embedded in
(2k+1)-dimensional Euclidean space, any 1-dimensional abstract simplicial complex
can be represented as a graph.
One method of clustering a point cloud is the so-called single-linkage
clustering, where a graph is constructed whose vertex set is the set of points in
the cloud, and where two such points are connected by an edge if their distance
is less than or equal to some $\varepsilon$. The parameter $\varepsilon$ can be
used to control the resolution. Here shorter edges are required to connect points
within each cluster, but relatively longer edges are required to merge the
clusters. So the number of clusters is obtained automatically, and it is not
necessary to specify it beforehand. For instance, the algorithm implemented in
[35] returns a vector in $\mathbb{R}^{N-1}$ which holds the length of the edge
which was added to reduce the number of clusters by one at each step of the
algorithm. Of course, it is also possible to fix the number of clusters first and
then to obtain the corresponding parameter.
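The vector of merge lengths described above can be computed with a union-find
structure over the edges sorted by length. The following Python sketch is an
illustration only and is not the algorithm of [35]: the function names are
hypothetical, the points are assumed to be Euclidean tuples, and all pairwise
distances are formed, so the cost is quadratic in the number of points.

```python
import math
from itertools import combinations


def single_linkage_lengths(points):
    """Edge lengths at which single linkage merges two clusters (N-1 values, ascending)."""
    parent = list(range(len(points)))

    def find(i):
        # Find the representative of the cluster of i, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    lengths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge reduces the number of clusters by one
            parent[ri] = rj
            lengths.append(d)
    return lengths


def clusters_at(points, eps):
    """Number of clusters when points at distance <= eps are joined by an edge."""
    return len(points) - sum(1 for d in single_linkage_lengths(points) if d <= eps)


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
lengths = single_linkage_lengths(points)
```

A large jump in the sequence of merge lengths suggests a natural threshold: here
three merges happen at length 1 and the last one only at a much larger length, so
any threshold in between yields two clusters.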
Since the complexes presented in this work contain information about the
multiresolution structure of the input data, they serve as a source for the
construction of the similarity graph. A visualization leads to a better
qualitative understanding of the noisy data set, and the graph representation of
the higher-dimensional approximating simplicial complex is one of the ways to
obtain such a qualitative representation in $\mathbb{R}^3$. The vertices of this
graph, G, are the nodes of the simplicial complex; each of them corresponds to a
cluster and is labeled by color and size. The size of each node is proportional to
the cardinality of the corresponding element $Q_\alpha$ of the cluster complex.
The color indicates the value of the reference map, f, at a representative point
in the corresponding set of the covering $\{X_i\}_{i \in I}$. For example, the
representative point could be a barycenter or, perhaps, a suitable average taken
over the set of the points belonging to each cluster.
Fig. C.2: The similarity graph created on the basis of a simple cover of a circle.
It is quite common to model the data points and their distances by a
neighborhood graph, simply because of its visual clarity. It is also possible
to use the graph to reflect changes in the shape of an object over the course of
time. Clustering can be reduced to standard graph algorithms: in the easiest
case, one can simply define clusters as connected components of the graph, and,
alternatively, one can try to construct minimal graph cuts which separate the
clusters from each other. In any case, constructing the similarity graph even for
a finite sample from some larger underlying space is not a trivial task (for
several popular constructions see e.g. [40], [27]).
In order to be able to merge or split subsets of points rather than individual
points, the distance between individual points has to be generalized to a
distance between subsets. Such a derived proximity measure is called a linkage
metric. Since each node of the graph corresponds to a cluster, and since the
original set of points comes from a metric space, we can define a new metric
space, $(\{G_1, G_2, \ldots, G_n\}, D_H)$, on the graph by computing distances
between clusters. Let us say that the vertices of the graph are $G_i$, that each
$G_i$ corresponds to a cluster $Q_{\alpha_i}$ with cardinality
$\mathrm{card}\, Q_{\alpha_i}$, and let us define the metric as
\[ D_H(G_i, G_j) \overset{\mathrm{def}}{=} \max \left\{ \frac{\sum_{y} \min_{x} d(x, y)}{\mathrm{card}\, Q_{\alpha_j}}, \ \frac{\sum_{x} \min_{y} d(x, y)}{\mathrm{card}\, Q_{\alpha_i}} \ \Big| \ x \in Q_{\alpha_i}, \ y \in Q_{\alpha_j} \right\}. \]
Fig. C.3: Similarity graph representations of the same objects at two different
moments of time.
Footnotes (to the description of the similarity graph above): quite interesting
graph visualization software is available at [37]; the node colors run, e.g.,
from red for high values to blue for low values.
This dissimilarity measure is defined in terms of pairs of nodes, one in each
respective cluster. The measure calculates inter-cluster distances and is
naturally related to the similarity graph. Here every data partition directly
corresponds to a graph partition.
This intrinsic graph metric is an alternative to the Euclidean metric since,
in some situations, it represents the intrinsic geometry of the data much better.
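The inter-cluster distance defined above can be sketched in Python as follows.
This is an illustration under assumptions rather than part of the software of
this work: the function name linkage_distance is hypothetical, and the clusters
are assumed to be non-empty lists of Euclidean points.

```python
import math


def linkage_distance(cluster_i, cluster_j):
    """D_H: the larger of the two averaged nearest-neighbor distances between clusters."""
    # Average over x in Q_i of the distance from x to its nearest point of Q_j.
    avg_i = sum(min(math.dist(x, y) for y in cluster_j)
                for x in cluster_i) / len(cluster_i)
    # Average over y in Q_j of the distance from y to its nearest point of Q_i.
    avg_j = sum(min(math.dist(x, y) for x in cluster_i)
                for y in cluster_j) / len(cluster_j)
    return max(avg_i, avg_j)


Q_i = [(0.0, 0.0), (1.0, 0.0)]
Q_j = [(0.0, 3.0)]
d = linkage_distance(Q_i, Q_j)
```

Taking the maximum of the two averages makes the value symmetric in the two
clusters, and the averaging makes it less sensitive to a single outlying point
than a plain maximum of nearest-neighbor distances would be.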
Appendix D
Brief Description of the Project
The theme of my PhD work is "Topological Methods for the Representation and
Analysis of Exploration Data in Oil Industry". The main purpose of the research
is to apply methods of algebraic topology to the description of the shapes of the
underground reservoirs where oil and gas accumulate. The motivation of the
research is clear: since drilling a single oil well is extremely expensive, it is
crucial to have an idea of what the reservoir roughly looks like. This information
is as important as information about the location of the oil fields.
Briefly, the idea behind this is as follows. Imagine a volume of oil and gas in
some reservoir. This volume can be considered as an algebraic surface in
three-dimensional space. The surface is embedded in the reservoir rock and also
captures faults as well as impermeable layers in the reservoir rock, where, as a
result, there can be no oil or gas. In this context, these anomalies can be
interpreted as holes of the algebraic surface. To detect and count these holes,
algorithms from computational algebraic topology, together with some auxiliary
algorithms, are implemented in the computer algebra system Singular within the
scope of this project.
This kind of knowledge requires processing a huge amount of experimental
exploration data, which always contain a lot of noise and also have missing
information. The data are obtained from a series of explosions and correspond to
the arrival times of the reflected post-explosion waves at a network of special
sensor detectors.
What is proposed in this work is a distillation of persistent topological
features from the noisy, changeable input data. Since Betti numbers enable us to
capture the three types of holes characterizing the connectivity of a shape (the
gaps which separate components, the tunnels which pass through the shape, and the
voids which are components of the complement space inaccessible from the outside),
the persistent Betti numbers carry significant geometric information about the
input point cloud and, via it, about the underground geological formation. The
persistent Betti numbers can be represented by their geometric analogue, the
so-called barcodes. Of course, these calculations require the construction of some
simplicial complex which approximates the data in the topological sense.
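To make the three kinds of holes concrete, the following sketch computes the Betti numbers of a small simplicial complex from the ranks of its boundary matrices, using $\beta_k = \dim C_k - \operatorname{rank} \partial_k - \operatorname{rank} \partial_{k+1}$. It is an illustration in Python over the field Z/2 (an assumption made here to avoid orientation signs), not the Singular implementation of this project; `rank_gf2` and `betti_numbers` are hypothetical names.

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are encoded as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        # Clear that bit from every remaining row that has it.
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def betti_numbers(simplices):
    """Betti numbers over Z/2 of a complex given as tuples of vertices.

    The input is assumed to be closed under taking faces.
    """
    by_dim = {}
    for s in set(tuple(sorted(s)) for s in simplices):
        by_dim.setdefault(len(s) - 1, []).append(s)
    top = max(by_dim)
    rank = {0: 0, top + 1: 0}  # the maps d_0 and d_{top+1} are zero
    for k in range(1, top + 1):
        index = {face: n for n, face in enumerate(by_dim[k - 1])}
        rows = []
        for s in by_dim[k]:
            mask = 0
            for face in combinations(s, k):  # the (k-1)-dimensional faces of s
                mask |= 1 << index[face]
            rows.append(mask)
        rank[k] = rank_gf2(rows)
    return [len(by_dim[k]) - rank[k] - rank[k + 1] for k in range(top + 1)]


# Boundary of a tetrahedron: 4 vertices, 6 edges, 4 triangles, no solid cell.
sphere = [c for k in (1, 2, 3) for c in combinations(range(4), k)]
# Boundary of a triangle: 3 vertices and 3 edges.
circle = [(0,), (1,), (2,), (0, 1), (1, 2), (0, 2)]

print(betti_numbers(sphere))  # [1, 0, 1]
print(betti_numbers(circle))  # [1, 1]
```

The hollow tetrahedron has one component, no tunnel and one void, while the triangle boundary has one component and one tunnel. Persistent Betti numbers refine this by tracking such ranks along a filtration instead of for a single complex.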
From the point of view of exploration, i.e. finding oil/gas reservoirs, this
approach is new and, therefore, will be assessed against traditional techniques.
References
[1] J. Abbott, A. Bigatti, M. Kreuzer, and L. Robbiano. Computing ideals of points.
Journal of Symbolic Computation, 30(4):341–356, October 2000. [cited at p. 79]
[2] Biondo L. Biondi. 3D Seismic Imaging. Number 14 in Investigations in Geophysics.
Society of Exploration Geophysicists, 2006. [cited at p. 5]
[3] E. Carlsson, G. Carlsson, and V. de Silva. An algebraic topological method for
feature identification. International Journal of Computational Geometry and
Applications, 16(4):291–314, 2006. [cited at p. 96]
[4] G. Carlsson. Topology and data. Bulletin of the American Mathematical Society,
46(2):255–308, April 2009. [cited at p. 17, 89]
[5] G. Carlsson, G. Singh, and A. Zomorodian. Computing Multidimensional
Persistence, volume 5878/2009 of Lecture Notes in Computer Science. Springer,
Berlin / Heidelberg, December 2009. Pages 730–739, more complete version available on
http://arxiv1.library.cornell.edu/abs/0907.2423. [cited at p. 89]
[6] G. Carlsson, A. Zomorodian, A. Collins, and L. Guibas. Persistence barcodes for
shapes. International Journal of Shape Modeling, 11(2):149–187, 2005. [cited at p. 33]
[7] D. Cox, J. Little, and D. O'Shea. Ideals, Varieties, and Algorithms. Undergraduate
Texts in Mathematics. Springer, New York, third edition, 2007. [cited at p. 11, 38, 40,
79]
[8] V. de Silva. A weak definition of Delaunay triangulation. Submitted, October 16,
2003. [cited at p. 18]
[9] V. de Silva and R. Ghrist. Coverage in sensor networks via persistent
homology. Algebraic & Geometric Topology, 7:339–358, April 2007. DOI:
10.2140/agt.2007.7.339. [cited at p. 16]
[10] J.G. Dumas, F. Heckenbach, B.D. Saunders, and V. Welker. Computing simplicial
homology based on efficient Smith normal form algorithms. Algebra, Geometry, and
Software Systems, pages 177–207, 2003. [cited at p. 69]
[11] H. Edelsbrunner. Algorithms in Combinatorial Geometry. Springer-Verlag, New
York, 1987. [cited at p. 20]
[12] H. Edelsbrunner. The union of balls and its dual shape. Discrete and Computational
Geometry, 13:415–440, 1995. [cited at p. 22]
[13] H. Edelsbrunner, D. Letscher, and A. Zomorodian. Topological persistence and
simplification. Discrete and Computational Geometry, 28:511–533, 2000. [cited at p. 29,
44, 49]
[14] H. Edelsbrunner and N. R. Shah. Triangulating topological spaces.
International Journal of Computational Geometry and Applications, 7:365–378, 1997.
[cited at p. 10, 23]
[15] D. Eisenbud. Commutative Algebra with a View Toward Algebraic Geometry.
Graduate Texts in Mathematics. Springer, third printing edition, 1999. [cited at p. 41]
[16] M. Erwig. The graph Voronoi diagram with applications. Networks, 36(3):156–163,
2000. [cited at p. 11]
[17] Claudia Fassino. An approximation of the Gröbner basis of ideals of perturbed
points, part I. Available at http://arxiv.org/abs/math/0703154v1, March 2007.
[cited at p. 79]
[18] Allen Hatcher. Algebraic Topology. Cambridge University Press, third edition, 2002.
[cited at p. 2, 8, 10, 11, 26, 36, 98]
[19] D. Heldt, M. Kreuzer, S. Pokutta, and H. Poulisse. Approximate computation of
zero-dimensional polynomial ideals. Journal of Symbolic Computation, 44(11):1566–1591,
November 2009. In Memoriam Karin Gatermann. [cited at p. 79]
[20] Tomasz Kaczynski, Konstantin Mischaikow, and Marian Mrozek. Computational
Homology, volume 157 of Applied Mathematical Sciences. Springer, 2004.
[cited at p. 11]
[21] J. Kogan. Introduction to Clustering Large and High-Dimensional Data. Cambridge
University Press, Cambridge, 2007. [cited at p. 94]
[22] J. Kogan, Ch. Nicholas, M. Teboulle, et al. Grouping Multidimensional Data.
Springer-Verlag, Berlin Heidelberg, 2006. [cited at p. 94, 95]
[23] M. Kreuzer and L. Robbiano. Computational Commutative Algebra 1. Springer,
Heidelberg, 2000. [cited at p. 79]
[24] M. Kreuzer and L. Robbiano. Computational Commutative Algebra 2. Springer,
Heidelberg, 2005. [cited at p. 79]
[25] J. Leray. Sur la forme des espaces topologiques et sur les points fixes des
représentations. J. Math. Pures Appl., 24(9):95–167, 1945. [cited at p. 13, 21]
[26] A. T. Lundell and S. Weingram. The Topology of CW Complexes. Van Nostrand
Reinhold Company, New York, 1969. [cited at p. 22]
[27] M. Maier, M. Hein, and U. von Luxburg. Optimal construction of k-nearest neighbor
graphs for identifying noisy clusters. Theoretical Computer Science, 410:1749–1764,
2009. [cited at p. 100]
[28] J. R. Munkres. Topology: A First Course. Prentice Hall, Englewood Cliffs, New
Jersey, 1975. [cited at p. 11]
[29] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley, Menlo Park,
California, 1984. [cited at p. 11, 12]
[30] Atsuyuki Okabe, Barry Boots, Kokichi Sugihara, and Sung Nok Chiu. Spatial
Tessellations: Concepts and Applications of Voronoi Diagrams. Wiley Series in
Probability and Statistics. John Wiley & Sons Ltd, Chichester, second edition, 2000.
[cited at p. 11]
[31] F. P. Preparata and M. I. Shamos. Computational Geometry: An Introduction.
Springer-Verlag, New York, 1985. [cited at p. 12, 20]
[32] J. J. Rotman. An Introduction to Algebraic Topology, volume 119 of Graduate Texts
in Mathematics. Springer-Verlag, New York, 1988. [cited at p. 11]
[33] John A. Scales. Theory of Seismic Imaging. Springer-Verlag, Berlin, New York,
1995. [cited at p. 5]
[34] G. Segal. Classifying spaces and spectral sequences. Publications Mathématiques
de l'Institut des Hautes Études Scientifiques, 34:105–112, 1968. [cited at p. 98]
[35] G. Singh, F. Mémoli, and G. Carlsson. Topological methods for the analysis of high
dimensional data sets and 3D object recognition. Point Based Graphics, September
2007. Prague. [cited at p. 92, 99]
[36] E. H. Spanier. Algebraic Topology. McGraw-Hill Book Co., 1966. [cited at p. 14]
[37] Graphviz. Graph visualization software. http://www.graphviz.org. [cited at p. 100]
[38] Open Source Clustering Software.
http://bonsai.ims.u-tokyo.ac.jp/mdehoon/software/cluster/index.html.
[cited at p. 95]
[39] Qhull. http://www.qhull.org. [cited at p. 47]
[40] U. von Luxburg. A tutorial on spectral clustering. Statistics and Computing,
17(4):395–416, 2007. [cited at p. 92, 100]
[41] U. von Luxburg and S. Ben-David. Towards a statistical theory of clustering. In
PASCAL Workshop on Statistics and Optimization of Clustering, London, July 2005.
[cited at p. 94]
[42] A. Zomorodian. Topology for Computing, volume 14 of Cambridge Monographs on
Applied and Computational Mathematics. Cambridge University Press, New York,
2005. [cited at p. 8]
[43] A. Zomorodian. Computational Topology, volume 2. Algorithms and Theory of
Computation Handbook, second edition, 2009. Chapter 3. [cited at p. 49]
[44] A. Zomorodian and G. Carlsson. Computing persistent homology. Discrete and
Computational Geometry, 33:249–274, 2005. [cited at p. 29, 40, 49, 52, 56]
[45] A. Zomorodian and G. Carlsson. Localized homology. Computational Geometry:
Theory and Applications, 41:126–148, November 2008. [cited at p. 49, 99]
List of Symbols and Abbreviations
$X$  topological space
algebraic surface of a geological formation
$A_i$  amplitude of the $i$-th reflected signal
$X$  point cloud data sampled from the algebraic surface
$\mathcal{U}$  covering (of $X$)
$\mathcal{N}$  nerve (of $\mathcal{U}$)
$\mathcal{G}$  geometric realization (of $\mathcal{N}$)
similarity (homologous) relation
homotopy relation
homeomorphism relation
$\cong$  isomorphism relation
$\pi(X)$  fundamental group
$k$-simplex
$\mathcal{K}(X)$  simplicial complex
$|\mathcal{K}|$  underlying space of $\mathcal{K}$
$\check{\mathcal{C}}(X)$  Čech complex
$\varepsilon$  some fixed parameter, $\varepsilon > 0$
$B_\varepsilon$  open ball of radius $\varepsilon$
$R_\varepsilon(X)$  Rips complex
$L$  landmark points
$\mathcal{W}(X, L, \varepsilon)$  strong witness complex
$\mathcal{W}(X, L, \varepsilon)$  weak witness complex
$\mathcal{V}_p$  Voronoi cell of $p \in X$
$\mathcal{V}_X$  Voronoi diagram
restricted Voronoi diagram
$\mathcal{D}_X$  Delaunay complex (triangulation)
restricted Delaunay triangulation
$\alpha$-shape complex
$\mathcal{C}(X)$  cluster complex
$C_k$  group of $k$-chains
$\partial_k$  $k$-dimensional boundary operator
$Z_k$  $k$th chain (cycle) group
$B_k$  $k$th boundary group
$H_k$  $k$th homology group
$\beta_k$  $k$th Betti number
$Z^j_k$  $k$th cycle group
$B^i_k$  $k$th boundary group
$H^{j,p}_k$  $p$-persistent $k$th homology group
$\beta^{i,p}_k$  $p$-persistent $k$th Betti numbers
$M_k$  standard matrix representation of $\partial_k$
$\tilde{M}_k$  (Smith) normal form of $M_k$
$\mathcal{M}$  persistence module
$\alpha(\mathcal{M})$  graded module
shift upward in grading
$D$  graded PID
$b$  barycenter
$M$  Mayer-Vietoris blowup (complex)
$G$  graph
$R$  ring
$F$  field
PID  principal ideal domain
gcd  greatest common divisor
List of Figures
1.1 Seismic acquisition on land using a dynamite source and a cable of geophones.
1.2 3D marine seismic acquisition, with multiple streamers towed behind a vessel.
1.3 A syncline reflector (left) yields a bow-tie shape in the zero-offset section (right).
1.4 Reflections in time (a) and in depth (b).
1.5 (a) Transmission response of the noise sources in the subsurface observed at the surface. (b) Synthesized reflection response, obtained by seismic interferometry. (c) Synthesized reflection depth image from reflection responses as in (b).
2.1 Four simple subspaces of $S^3$.
2.2 Data sampled from a circle.
2.3 A fixed set of points can be completed to $\check{C}_\varepsilon(X)$ or to $R_\varepsilon(X)$. The Čech complex has the homotopy type of the $\varepsilon/2$ cover, $S^1 \vee S^1 \vee S^1$, while the Rips complex has homotopy type $S^1 \vee S^2$.
2.4 Decomposition of the plane by Voronoi cells of a finite set.
2.5 Delaunay complex corresponding to the decomposition by Voronoi cells shown in Figure 2.3.
2.6 Delaunay-based triangulation of a complicated shape.
3.1 A Čech complex $\check{C}_\varepsilon$ constructed on a finite collection of points in the Euclidean plane.
3.2 The Čech complex after increasing the parameter value to $\varepsilon' > \varepsilon$.
4.1 Filtration of a simple simplicial complex with topological characteristics. The vertex or face added at the current step is shown in red.
4.2 Barcode for the filtration of the simplicial complex presented in Figure 4.1.
4.3 A filtered simplicial complex and its barcode (persistence interval multiset) in each dimension. Each persistent interval shown is the lifetime of a topological attribute, created and destroyed by the simplices at the low and high endpoints, respectively.
4.4 A chain complex with chain, cycle, boundary groups and their images under the boundary operators.
4.5 A filtered simplicial complex.
4.6 Diagram of P-intervals corresponding to the filtered simplicial complex from Figure 4.5. $[0, \infty)$ and $[0, 2)$ are 0-intervals; $[1, \infty)$ and $[1, 3)$ are 1-intervals.
4.7 The triangular region in the index-persistence plane that defines when the cycle is a basis element for the homology vector space.
5.1 An example of a simplicial complex and its incidence matrix representation. Columns are labeled by its vertices and rows are labeled by its simplices.
5.2 A simple filtration with newly added simplices highlighted and listed.
5.3 The Voronoi diagram of the points from the left column of Table 5.4.
5.4 The Voronoi diagram of the points from the right column of Table 5.4.
C.1 Given a space equipped with a cover (a), we first blow up the space into local pieces (b) and then glue back the pieces to get the blowup complex (c), giving us a filtration consisting of two complexes at times $t = 0$ and $t = 1$, respectively. The persistence barcode (d) localizes the topology of the original space with respect to the cover.
C.2 The similarity graph created on the basis of a simple cover of a circle.
C.3 Similarity graph representations of the same objects at two different moments of time.
List of Tables
2.1 Betti numbers $\beta_0$, $\beta_1$ and $\beta_2$ of the geometrical objects from Figure 2.1.
5.1 Degree of simplices of the filtration in Figure 5.2.
5.2 Data structure after running the persistence algorithm on the filtration in Figure 5.2. The simplices without partners, or with partners that come after them in the full order, are creators. The others are destroyers.
5.3 Tracing the successive simplexes cd and ad of the considered complex through the iterations of the while loop.
5.4 The Singular input data file which represents the collection of points sampled from two (left) and three (right) circles at a distance from each other.
Index
$\alpha$-shape complex, 23
Čech complex, 14
P-interval, 42
P-intervals, 30, 34
i-th standard basis vector, 15
k-cell, 22
d-dimensional Euclidean space, 64
d-dimensional chart, 65
k-boundary, 68
k-chain, 67
k-cycle, 68
k-dimensional boundary operator, 36
k-dimensional homology groups, 8
k-simplex, 11, 64, 67
k-skeleton, 66
kth Betti number, 69
kth boundary group, 29, 36, 68
kth chain group, 36, 68
kth cycle group, 29, 36
kth homology group, 36, 69
p-persistent kth Betti numbers, 33
p-persistent kth homology group, 29
-weak witness, 17
d-manifold, 65
abstract simplexes, 67
abstract simplicial complex, 11, 67
affine hull, 64
affinely independent set of points, 64
agglomerative clustering, 95
alpha complex, 19
approximation complex, 7
approximation simplicial complex, 13
barcode, 33, 35, 43, 49
barycenter, 15, 64
barycentric coordinates, 15, 64
barycentric coordinatization, 15
based persistence complex, 55
basepoint of a loop, 65
Betti number, 69
Betti numbers, 8
boundary group, 36, 68
boundary homomorphism, 68
boundary of a manifold, 65
boundary of a set, 63
boundary of a simplex, 68
boundary operator, 36, 68
Buchberger-Möller algorithm, 71, 79
cascade of a cell, 54
cell, 22
chain, 67
chain complex, 36, 68
chain complexes diagram, 39
chain group, 36, 68
chart, 65
closed covering, 12
closed sets, 63
closed surface, 65
closure of a complex, 67
closure of a set, 63
cluster, 94
cluster complex, 95
clustering, 94
column-echelon form, 51
compact space, 63
connected space, 63
consistently oriented simplexes, 65
continuous function, 63
contractible space, 64
convex hull, 64
cover, 63
covering, 12
covering space, 64
creator of a homology class, 53
creator of a homology group, 30
creator of a new homology cycle, 54
cycle group, 36
data depth, 92
Delaunay complex, 11, 19
Delaunay triangulation, 11, 20
dendrogram, 94
density estimator, 92
destroyer of a homology class, 53, 54
destroyer of a homology group, 30
dimension of a complex, 66
dimension of a face of a complex, 67
dimension of a simplex, 64, 67
dimension of abstract simplicial complex,
12
dimension of an abstract simplex, 67
distance, 64
divisive clustering, 95
dual complex, 22
eccentricity, 92
edges of a graph, 99
elementary column operations, 37
elementary row operations, 37
equivalent paths, 65
face of a simplex, 64
faces of an abstract complex, 67
filter, 91
filtration of a complex, 66
finite covering, 12
finite type of a persistence complex, 40
finite type of a persistence module, 40
free portion of a module decomposition, 41
full order of cells, 49
functor, 26
functorial clustering algorithm, 25
functorial homology group, 69
functoriality, 26
fundamental group, 65
fundamental theorem on homology groups,
69
Gaussian kernel, 92
geometric realization, 12, 67
Gröbner bases, 71
Gröbner basis, 79
graded module, 41
graded ring, 40
graph, 67, 99
graph Laplacian matrix, 92
graph representation, 100
graph Voronoi diagram, 11
greatest common divisor, 40
group of k-chains, 68
hierarchical clustering, 94
historical analysis, 29
homeomorphic spaces, 63
homeomorphism, 63
homogeneous basis, 49
homogeneous elements, 40
homologous cycles, 69
homology, 10
homology classes, 10
homology group, 36, 69
homology groups, 8
homology groups inductive system, 39
homotopic functions, 64
homotopy, 64
homotopy equivalent spaces, 64
homotopy groups, 7
homotopy invariance, 26
homotopy type, 64
image of a boundary homomorphism, 68
incidence matrix, 48
induced orientation, 65
initial point of a path, 64
interior of a set, 63
intrinsic graph metric, 101
inverse of a path, 65
isomorphic complexes, 66
kernel of a boundary homomorphism, 68
landmark points, 11
Leray's theorem, 13
link of a complex, 67
linkage metric, 100
localization, 99
loop, 65
lower complex, 26, 28
manifold, 65
map, 63
map of coverings, 25
Mayer-Vietoris blowup, 98
Mayer-Vietoris blowup complex, 98
Mayer-Vietoris lemma, 98
multidimensional persistence, 89
multiresolution, 25
multiscale image, 25
neighborhood of a point, 63
nerve, 12, 67
nerve theorem, 13
non-degenerate position of points, 12
non-homologous k-cycle, 68
non-negatively graded module, 41
non-negatively graded ring, 41
non-orientable manifold, 65
norm, 64
normal matrix form, 37
null-homotopic space, 64
open covering, 12
open sets, 63
ordered k-simplex, 65
orientable manifold, 65
orientation, 65
orientation of a k-simplex, 65
oriented simplex, 65
partition of unity, 14
partner of a cell, 53
path, 64
path in a graph, 99
path-connected space, 64
persistence, 30
persistence algorithm, 49
persistence barcode, 33
persistence barcodes, 49
persistence Betti numbers lemma, 44
persistence complex, 29, 38
persistence homology, 34
persistence homology group, 34
persistence module, 39
persistence of Betti numbers, 56
persistent kth homology group, 29
persistent Betti numbers, 9, 33
persistent homology, 29
pivot column, 51
pivot element, 51
pivot row, 51
polynomial function, 91
polynomial with ring coefficients, 40
principal ideal domain, 40
product of paths, 65
program description, 48
projection, 64
proper face of a simplex, 64
pseudo-code of the persistence algorithm,
55
reference map, 91
reference space, 91
regular CW complex, 22
resolution level, 25
restricted Delaunay complex, 20
restricted Delaunay triangulation, 20
restricted Voronoi cell, 19
restricted Voronoi diagram, 19
Rips complex, 16
similarity graph, 100
similarity simplicial complex, 13
similarity theorem, 13
simplexes, 8
simplicial complex, 2, 65
simplicial map, 15, 66
simplicially equivalent complexes, 66
single-linkage clustering, 99
Singular software, 47
singular value decomposition, 83
Smith matrix form, 37
software on Singular, 47
space-time analysis, 29
standard k-simplex, 15, 66
standard basis, 37
standard basis vector, 66
standard grading, 40
standard matrix representation of $\partial_k$, 37
standard realization, 15
standard realization for a simplex, 66
standard structure theorem, 69
star of a complex, 67
strong witness complex, 17
structure theorem, 41
subcomplex, 12, 66
subspace of a topological space, 63
subspace topology, 63
terminal point of a path, 64
topological space, 63
topological type, 63
topologically equivalent spaces, 63
topology, 63
torsion subgroup, 69
torsion coefficients, 70
torsion portion of a module decomposition,
42
triangulable space, 12, 66
triangulation, 12, 66
trivial loop, 65
underlying space, 12
underlying space of a complex, 66
universal space, 64
upper complex, 26, 28
vertex set, 12
vertex set of a complex, 66
vertexes of a simplex, 11
vertices of a graph, 99
vertices of an abstract complex, 67
vertices of a simplex, 64
Voronoi cell, 11, 18
Voronoi cells, 11
Voronoi diagram, 11, 16, 18
weak witness complex, 17
weight of an edge, 92
witness, 20
witness theorem, 18
youngest cell in a filtration, 54
Scientific Career
1999 Diploma, St. Petersburg State University
1999 -- 2001 Teacher, Company Informatization of Education
2001 -- 2004 Teacher, Electrical Engineering Saint-Petersburg State University
2005 Research Fellow, A.N. Krylov Research Institute
2006 Research Assistant, Hong Kong University of Science and Technology
2008 -- 2009 Research Mathematician, EP-Research, Royal Dutch Shell, Rijswijk, The Netherlands
2006 -- 2010 Ph.D. Candidate in Mathematics, Technische Universität Kaiserslautern