Lithogeochemical Prospecting: Signals Processing Applied To Segmentation of Geochemical Borehole Profiles

This document summarizes a paper about using signal processing and artificial intelligence techniques to interpret geochemical data from borehole profiles. It discusses challenges in converting raw borehole log data into meaningful higher-level entities for geological reasoning. These include managing ambiguity introduced by analyzing signals at different scales. The paper presents a solution specifically for signals from chemical analysis of borehole samples. It discusses formal aspects of representing borehole profiles as information objects in a domain of geological inference and reasoning. Key aspects include establishing attribute status and deciding whether to explicitly store secondary attribute values.

SMYTH, C.P. Lithogeochemical prospecting: signals processing applied to segmentation of geochemical borehole profiles. APCOM 87. Proceedings of the Twentieth International Symposium on the Application of Computers and Mathematics in the Mineral Industries. Volume 3: Geostatistics. Johannesburg, SAIMM, 1987. pp. 327 - 343.

Lithogeochemical Prospecting: Signals Processing Applied to Segmentation of Geochemical Borehole Profiles
C.P. SMYTH
Anglo American Corporation of South Africa, Johannesburg

Large volumes of quantitative geochemical data, presenting a formidable interpretation task to geologists, are generated in the course of exploration drilling for new ore deposits. The incorporation of geological expertise into computer programs developed for the interpretation of this data would improve the speed and accuracy with which it is interpreted.

A detailed analysis of the geochemical log interpretation problem, together with a survey of relevant research in Artificial Intelligence and petroleum well-logging techniques, confirms that artificial intelligence, particularly as developed in the fields of expert systems and pattern recognition, provides techniques for implementing such data interpretation programs. In addition, it provides data analysis techniques significantly more powerful than those currently applied to geochemical data, and very appropriate to the 'hypothesise and test' character of mineral exploration problem-solving.

The conversion of raw borehole log data to meaningful higher level geochemical entities, in terms of which geological reasoning may take place, is a critical step in applying these techniques to log data analysis. A method of achieving this conversion as an exercise in signal-to-symbol transformation is presented.¹

Introduction
The interpretation of geochemical borehole profiles is an exercise in signal understanding, and hardly any signal understanding task can be performed using the raw numerical signal values directly: some description of the signal must first be obtained (Nii²). Such descriptions should be as compact as possible, and their elements should correspond as closely as possible to meaningful objects or events in the signal-forming entity.

A discussion of the abstract aspects of computing these descriptions and their importance to interpretation of geochemical profiles is presented below. Andrew P. Witkin³ has described the practical problem of computing these descriptions as follows:

'A great deal of effort has been expended to obtain this kind of primitive qualitative description, and the problem has proved extremely difficult. The problem of scale has emerged consistently as a fundamental source of difficulty, because the events we perceive and find meaningful vary enormously in size and extent. The problem is not so much to eliminate fine-scale noise, as to separate events at different scales arising from distinct physical processes. It is possible to introduce a parameter of scale by smoothing the signal with a mask of variable size, but with the introduction of scale-dependence comes ambiguity: every setting of the scale parameter yields a different description; new extremal points may appear,

LITHOGEOCHEMICAL PROSPECTING 327


and existing ones may move or disappear. How can we decide which, if any, of this continuum of descriptions is 'right'?

There is rarely a sound basis for setting the scale parameter. In fact, it has become apparent that for many tasks, no one scale of description is categorically correct: the physical processes that generate signals (such as images) act at a variety of scales, none intrinsically more interesting or important than another. Thus the ambiguity introduced by scale is inherent and inescapable, so the goal of scale-dependent description cannot be to eliminate this ambiguity, but rather to manage it effectively, and reduce it where possible.

This line of thinking has led to considerable interest in multi-scale descriptions. However, merely computing descriptions at multiple scales does not solve the problem; if anything, it exacerbates it by increasing the volume of data. Some means must be found to organise or simplify the description, by relating one scale to another.'

Every point raised by Witkin is wholly relevant to geochemical 'signals': his problem of scale is isomorphic to the classic geochemical background/anomaly distinction problem.

This paper presents one solution to these problems, specifically developed for signals resulting from the chemical analysis of borehole samples.

Formal and abstract aspects of borehole profiles

In the domain of geological inference and reasoning (by geologist or computer) one can distinguish between physical objects (minerals, boreholes), information objects (borehole logs, data files) and executive objects (the computer, the geologist). Executive objects are active, while physical and information objects are passive. They are active because they can manipulate either, or both, of physical objects (sample them, name them, destroy them) and information objects (create them, define attributes for them).

It is useful, in the discussion of data, to be explicit about the status of any particular attribute value, with respect to the object it qualifies, and the relevant operative executive object, because of the bearing this status has on whether or not the value must be stored.

The status of an attribute value is primary with respect to its qualified object and a specific executive object, if, once the object has been identified, the value cannot be computed by the executive object from other primary attribute values of the identified object (ie: it must be stored somewhere).

The status of an attribute value is secondary with respect to its qualified object and a specific executive object, if, once the object's primary attributes have been located, the executive object is capable of computing that value from the primary attributes, or fetching it from the attribute values of constituent sub-objects or super-objects of the qualified object. Whether attribute values are primary or secondary, then, depends both on how objects are identified (eg: fourth sample from top, or, sample ABC-24), and what knowledge is available to the executive object regarding generation of attribute values.

A borehole segment is an information object representing any sampled section of a borehole.

For the geologist, provided the name of the borehole from which the segment originates is known, and provided the borehole data base is available, the most obvious primary attributes of a segment are its starting depth, ending depth and name, if it has one. Any other attributes are secondary, as they can be computed or

328 GEOSTATISTICS: DATA ANALYSIS


fetched from the data base.

Establishing the status of segment attributes with respect to the computer, and deciding whether secondary attribute values should be stored explicitly, are both subjects of major importance to signal-to-symbol transformation, because borehole segments are the symbol structures essential to pattern recognition in borehole profiles.

A borehole interval is an information object representing any continuous sequence of segments or intervals on a borehole. (The reason for this recursive definition of an interval will become clear below.) As such, the relationships between intervals (successions of segments) and their segments, and segments (successions of samples) and their samples, are very similar.

The similarity goes further than their both being sequences: with significant exceptions, for any set of primary attributes measured (or calculated) for any particular interval of borehole segments, those attributes are measured for all the segments of the interval (in the most general case, the interval of segments is the entire borehole). This means that, as for storage or representation of primary sample attribute values, an appropriate construct for primary attribute values of segments is also the Third Normal Form of the Relational Data Base model.

In fact, on close inspection, it becomes clear that a sample may be regarded as a degenerate (single-sample) segment, about which certain information, such as slope and correlation co-efficient, as a result of its degeneracy, is missing.

Exploitation of this similarity contributes to standardisation of data structures and data structure manipulation primitives.

While these relationships between intervals and their components and segments and their components are similar structurally (syntactically), their semantics are quite different. An interval has meaning only in terms of its individual component segments, and the relations, such as discontinuities, between them.

In contrast, a segment has meaning with respect to the attribute values used to derive it, only on the explicit assumption that no discontinuity, at the segment's defined scale, exists between its component samples.

Consequently, intervals and segments have entirely different representations. Intervals are represented by lists of their component segments, while segments are represented by 14-tuples holding segment attribute values.

In general, there are two methods of referencing an attribute value of an information object - by its name or by its value. (In this context, we understand 'name' to be name or address - which is acceptable in the context of our distinction between primary and secondary status of objects or attributes.) If it is to be referenced by name, with the ultimate objective of using its value, that name must hold sufficient information to allow the active executive object to locate the value, if it is stored, or perhaps to recompute it, if it is a secondary attribute value.

Ensuring that newly generated information objects and their attributes are given names that hold this information is the responsibility of the executive objects which create them. At the same time, however, and particularly when large numbers of information objects are involved, this responsibility can be burdensome and error-prone for the
geologist. It should, therefore, when possible, be the responsibility of the system to generate these names, capitalising, in so doing, on its strengths in inheritance administration and search control.

The system defined below assumes responsibility for the naming of segments and intervals only. Segments are represented by the same constructs in memory and storage, and retain the same names in both contexts. Intervals, however, have different representations in memory and in storage, and require more sophisticated naming conventions.

Quantitative geochemical attribute values as signals: Consequences for segmentation

Attributes of geochemical borehole profile signals

Providing the entire sample is analysed, and ignoring analytical error, a 'whole rock' chemical analysis value of a sample which is a continuous section of drill-core, or drill-chips, represents the mean composition of that sample along the section.

Mathematically we may think of such a value as being

\[ \frac{\int_a^b f(x)\,dx}{b-a} \]

where:
x = depth
a = depth at which sample starts
b = depth at which sample ends
f(x) = chemical attribute value at depth x

We shall call such a signal value an 'integrated signal value'. Signal values of this type are in a different class from signals measured discontinuously, at a succession of regularly (or irregularly) spaced points, which we shall call 'spot signal values'. Most digitally acquired borehole magnetometer logs are examples of the latter category. The difference may, or may not, be significant, and its significance is a function of the nature and scale of the phenomena sought in the signals.

Importantly, the two categories differ in the relationship existing for each between adjacent signal values. For integrated signal values, the adjacency relationship may be conjunction or disjunction, while for spot signal values, the relationship is always one of disjunction. Sequences of conjunctionally related integrated signal values carry more information about signal discontinuities than do spot signal values - information which may be used both to assist segmentation, and during pattern definition, particularly if discontinuities are important pattern components, as they are in geochemical borehole logs.

How this information is used during segmentation is detailed below, and revolves primarily around administering segment-edge signal values.

The role of signal discontinuities as pattern components makes it important to be able to refer to them explicitly during pattern definition (ie: during interval parsing). It proves expedient, however, to express them only implicitly (as relationships between signal values or segments) in segmented representations of raw signal data, as is made clear below.

The above characterisation of borehole signal values helps develop methods of transforming and describing the signals in a manner most sensitive to meaningful entities they may reflect, by highlighting signal attributes which may assist segmentation.

Use of signal attributes for segmentation

One interesting attribute, in terms of
the mathematical representation of the signal value given above, is the signal value numerator (viz: $\int_a^b f(x)\,dx$), which may be calculated by multiplying the signal value by its sample length.

Since geologists, and particularly mining geologists evaluating ore reserves, think naturally, if unconsciously, of the 'area under the curve' when interpreting borehole profiles, this seemed a potentially useful metric to involve in the segmentation procedure. It seemed even more attractive in view of the problems surrounding the application of conventional segmentation techniques (eg: piecewise linear approximation) to signal values integrated over different interval lengths (ie: different sample lengths) - and in view of the recognition that segments themselves could be regarded as conjunctional integrated signal values.

There is another significant characteristic of geochemical attribute values as signals, namely that, in the majority of cases, when there is a change in the direction of change of the value, it is in response to a real change in the sampled world, rather than being 'introduced noise'. Since establishing whether or not such real changes are significant can reasonably be left to higher levels of the signal-understanding task, this too may be incorporated into the segmentation activity.

An important qualifier to this application of 'first derivative slope change' as a segmentation criterion is that, in the context of borehole signals, its effect must be independent of the direction of evaluation along the profile. This is necessary to ensure the same representation of features penetrated in different directions by a borehole. Its primary effect is that, in certain situations one sample must be included in two segments. It has more subtle effects on the administration of missing signals.

Segmentation overview

The Segmentation Method presented below provides a variable-scale signal-to-symbol transformation technique which is very sensitive to the information content of its input signals, together with symbol representation constructs which efficiently manage multi-scale signal descriptions.

Signal-to-symbol (ie: signal-to-description) transformation is implemented by three scale-independent preliminary phase operations, designed to provide the most primitive level of signal description. These are followed by 'fourth phase transformation', which is scale dependent, and may involve a series of transformations, producing successively more generalised descriptions of the original signals.

Signal descriptions are represented by nested lists of segments, with finer scale descriptions nested within more general representations. Different nesting structures are used during preliminary phase and fourth phase transformations, largely as a result of the predominance of splitting in the former, and merging in the latter. It is likely that with streamlining of the preliminary transformation phases, use of the 'splitting' structure may be discontinued.

Segments are the primitives used for signal description, and constitute the elements of the nested lists which describe signals. They are represented by 14-tuples, which hold the following segment attribute values, which are primary attribute values in this context:

(1) The segment name (identifier).
(2) The name of the first sample in the segment.
(3) The name of the last sample in the segment.
(4) The depth of the top of the first sample.
(5) The depth of the bottom of the last sample.
(6) The estimated value of the quantitative attribute (X) used to define the segment at the top of the segment.
(7) The estimated value of X at the bottom of the segment.
(8) The maximum value of X in the segment.
(9) The minimum value of X in the segment.
(10) The (weighted) mean value of X in the segment.
(11) The slope of the linear regression of X on depth.
(12) The intercept of the linear regression of X on depth.
(13) The X:depth correlation coefficient.
(14) The number of samples in the segment.

For the geologist, only the first three items above are truly primary segment attributes.

Segments are named uniquely within each signal description. A signal description (in fact an interval) has only one name pertaining to the multiple-scale nested list which describes it. Different levels of that description are accessed procedurally, not by name.

Segmentation: Phase one

During Phase One of its operation, the Segmentation Process applies the knowledge that, in the majority of cases, along a sequence of borehole samples, when there is a change in the direction of change of an attribute value, that change reflects a change in the sampled world, and may be useful as a segment boundary. For simplicity of application, therefore, it sets segment boundaries at all changes in direction of signal value change.

Specifically, Phase One Segmentation operates under the following conditions:

(i) Knowing that the objects upon the attributes of which segmentation is based are samples, and that the attribute values are conjunctional integrated values.

(ii) Requiring that the segmentation be direction independent.

(iii) Observing that three significantly different patterns may surround changes in the direction of change of an attribute value.

(iv) Knowing that segmentation responses to these different patterns should be sensitive to the information signal values will carry (being conjunctional integrated signal values), regarding the location of discontinuities in the sampled object. Specifically, this information carried by a signal value marginal to a segment edge may:

    a: include information related to signal values on either side of the marginal signal value (ie: the discontinuity occurs within the sample). This cannot arise for spot signal values.

    b: include information related only to one of its adjacent signal values (ie: a single discontinuity occurs at one margin of the sample).

    c: include information related to neither of its adjacent signal values (ie: there are discontinuities on both margins of the sample).

In practice, application of this knowledge is distributed between Phase One and Phase Two segmentation, because it can require consideration of relative signal value magnitudes (considered in Phase Two), rather than simply their directions of change.
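Condition (i) treats the attribute values as conjunctional integrated values, and the text earlier observes that a segment can itself be regarded as one. A minimal Python sketch of that idea follows, assuming each sample is held as a hypothetical (top depth, bottom depth, value) triple; the names `numerator` and `merged_value` and the sample figures are illustrative and do not come from the paper.

```python
from typing import List, Tuple

# Hypothetical sample layout: (top_depth, bottom_depth, integrated_value).
Sample = Tuple[float, float, float]


def numerator(sample: Sample) -> float:
    """Signal value numerator: the integrated value times the sample length."""
    top, bottom, value = sample
    return value * (bottom - top)


def merged_value(samples: List[Sample]) -> float:
    """Integrated value of a run of adjoining samples.

    The numerators ('areas under the curve') are summed and divided by the
    total length, which gives the length-weighted mean of the sample values.
    """
    total_length = sum(bottom - top for top, bottom, _ in samples)
    if total_length <= 0:
        raise ValueError("samples must have positive total length")
    return sum(numerator(s) for s in samples) / total_length


# Illustrative values only: three adjoining samples of unequal length.
samples = [(10.0, 11.0, 200.0), (11.0, 13.0, 500.0), (13.0, 13.5, 100.0)]
segment_value = merged_value(samples)
```

Because the samples have different lengths, the result differs from the plain arithmetic mean of the three values, which is the practical reason for carrying the numerator.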

FIGURE 1(a). Type (1) direction-change pattern. Finite automaton searches for XXYY pattern in L to R direction, where X = I and Y = D, or X = D and Y = I. (Plot of signal magnitude against depth, with potential discontinuity locations marked and increasing/decreasing traces shown for both the L to R and R to L directions. The sample at the direction change is assigned to the discontinuity and included in both segments.)

FIGURE 1(b). Type (2) direction-change pattern. Finite automaton searches for XXYXX pattern in L to R direction, where X = I and Y = D, or X = D and Y = I. (Plot of signal magnitude against depth, with potential discontinuity locations marked and increasing/decreasing traces shown for both directions. The junction between samples is assigned to the discontinuity.)

FIGURE 1(c). Type (3) direction-change pattern. (Plots of signal magnitude against depth for alternating sequences, with increasing/decreasing traces shown for both directions.)
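The increasing/decreasing traces annotated in Figure 1 can be illustrated with a short Python sketch. It is not the finite automaton of the paper: it only builds the I/D trace and lists the samples at which the direction of change reverses, then checks that the same samples are found when the profile is read in the opposite direction. The function names, the '=' symbol for equal neighbours and the example values are assumptions made for illustration.

```python
from typing import List, Sequence


def direction_trace(values: Sequence[float]) -> str:
    """'I' where a value exceeds the one before it, 'D' where it is lower.

    Equal neighbours are written '='; how ties were treated in the original
    program is not stated, so this choice is an assumption.
    """
    symbols = []
    for previous, current in zip(values, values[1:]):
        if current > previous:
            symbols.append("I")
        elif current < previous:
            symbols.append("D")
        else:
            symbols.append("=")
    return "".join(symbols)


def turning_samples(values: Sequence[float]) -> List[int]:
    """Indices of samples at which the trace switches between I and D."""
    trace = direction_trace(values)
    return [i + 1 for i in range(len(trace) - 1)
            if {trace[i], trace[i + 1]} == {"I", "D"}]


# Illustrative profile (not data from the paper).
profile = [1.0, 3.0, 6.0, 4.0, 2.0, 5.0]
left_to_right = turning_samples(profile)

# Reading the profile in the other direction and mapping the indices back.
n = len(profile)
right_to_left = sorted(n - 1 - i for i in turning_samples(profile[::-1]))
```

Both readings pick out the same samples, which is the direction independence required by condition (ii); assigning those samples to one or both neighbouring segments is the part handled by the pattern rules of Figure 1.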

The three different patterns surrounding direction changes are presented and annotated in Figures 1(a), 1(b) and 1(c), and described below.

Pattern (1): In patterns of Type (1) in Figure 1, on the basis of direction of signal value change alone, there is no evidence for the exact location of a discontinuity, if, indeed, one exists at all. Further, depending on the direction in which the value sequence is evaluated, different values become the first values of a new sequence. Consequently, because conditions (ii) and (iv)a above apply, the marginal signal value is included in segments to either side of it - as the first in one, and the last in the other.

Pattern (2): In patterns of Type (2) in Figure 1, on the basis of direction of signal value change alone, there is evidence for a discontinuity, and for its unequivocal location - although this becomes evident only after the direction of change of the second sample after the discontinuity has been checked. The pattern recognition is not direction-sensitive, and condition (iv)a or (iv)b above applies, allowing segment-marginal samples to be included in only one segment each. (The discontinuity itself may occur in either of the two samples, or exactly between them.)

Pattern (3): Patterns of Type (3) may be regarded as special cases of Type (1) or Type (2), and may be managed as either, depending on whether condition (iv)a, (iv)b or (iv)c is thought to apply. If condition (iv)c applies, management as for Type (2) suffices. A fourth alternative is to regard the entire extent of such sequences as one segment. This policy was adopted initially, because such patterns arise commonly in geochemistry, when the only difference between successive values is signal noise, or because of sampling effects resulting from under-sized samples. It has the effect of minimising the generation of many single sample, or two sample segments. However, resolution of signal structure can be severely compromised, and this policy was dropped in favour of the Type (2) approach. This causes Type (3) patterns to produce a succession of single sample segments. Clearly, during data interpretation, condition (iv)a, (iv)b and (iv)c may need to be considered, and management of this task at higher levels in the system is facilitated by single sample segments, rather than by the storing of two-sample segments, which would result from the Type (1) approach.

The above theoretically developed approach to segmentation is implemented in three steps:

(i) A dichotomized trace (in one direction) of the relevant borehole profile is produced, reflecting each signal value's decreasing or increasing relationship to the sample before it.

(ii) This trace serves as input to a finite automaton subroutine, which uses it to recognise and output Phase One segments (as defined by their first and last samples).

(iii) Segment attributes are calculated, and stored in segment 14-tuples, and accumulated into a list of segments which constitutes the interval that represents the entire length of the borehole.

In practice, all these functions are carried out by a Pascal program (PROLSEG4), which is called from a Core program. The only operations for which the Core program itself is responsible are the following:

(i) Receipt of the user's command to initiate the activity, and his selection of attribute (element) and data set (borehole).

(ii) Initiation of the PROLSEG4 program.

(iii) Receipt from the PROLSEG4 program of all of the output segments, their structuring into a nested list, and naming of the resulting interval.

Segments resulting from execution of Phase One Segmentation of the Barium profile from Borehole 46gr are shown in Figure 2, alongside a plot of the original data.

Figure 3 explains the structure of the nested lists used to represent intervals during this and the following stage of
segmentation. This representation was originally designed to manage both the splitting (Phases One and Two) and merging (Phases Three and Four) phases of segmentation. It has since proven sub-optimal for the representation of merged segments, and might eventually be discontinued altogether, should the first three phases of the Segmentation system be combined into one process. It should be noted that discontinuities have no explicit expression in these structures, but have to be derived from the marginal values of abutting segments.

FIGURE 2. Barium profiles for borehole b46gr: L.H.S.: Raw Ba data, with two missing samples; middle: Profile output by phase one segmentation; R.H.S.: Profile output by phase two segmentation. (Each panel's Ba axis runs from 985 to 4735.)

Segmentation: Phase two

Phase Two segmentation detects signal features which are potentially significant with respect to characteristics of the sampled rock, but which are not detected by Phase One Segmentation because they are not accompanied by a change in direction of change. It looks for these features by examining, one segment at a time, the signal values from samples within a segment - looking for an 'abnormally' large signal value change between adjacent signal values. If it finds such a change, it splits the segment into two segments, across the abnormal change. Examples of signal value sequences showing such features are presented in Figure 4.

Currently, this activity looks only for one feature around which to split within each segment. A more sophisticated 'splitter' would be sensitive to a number of such features, particularly if segments are constrained to linear representation, since uni-directional curvilinear changes (which do occur in borehole profiles) could then be approximated by a number of relatively short linear segments.

Phase Two Segmentation is implemented in the Core program in PROLOG, with calls made to Data Base Read and Statistics Facilities, which are implemented in Pascal, when necessary.

Though implemented in PROLOG, the program is procedural in character. It processes the signal values grouped with each Phase One segment by evaluating a splitting metric for each position (between signal values) at which a segment may be split, and then checking the highest valued metric for the segment against a pre-set threshold. If the metric is higher than the threshold, the segment is split, otherwise attention moves on to the next segment. Evaluation of the metric is explained in Figure 5. Output from conducting Phase Two Segmentation on Phase One segments from the Barium profile of Borehole 46gr is shown in Figure 2.
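The control flow of Phase Two described above (score every candidate split position, compare the best score with a pre-set threshold, split at most once) can be sketched in Python as follows. The paper's own metric is the one defined in Figure 5; the ratio used here, the change across the candidate position divided by the mean absolute change at the other positions in the segment, is a stand-in assumption, as are the function names, the threshold and the example values.

```python
from typing import List, Sequence


def split_metric(values: Sequence[float], k: int) -> float:
    """Stand-in metric for the split between values[k] and values[k + 1].

    Assumption: the change across position k is compared with the mean
    absolute change at the remaining positions of the same segment.
    """
    steps = [abs(b - a) for a, b in zip(values, values[1:])]
    others = steps[:k] + steps[k + 1:]
    if not others:
        return 0.0  # a two-sample segment offers nothing to compare against
    baseline = sum(others) / len(others)
    if baseline == 0.0:
        return float("inf") if steps[k] > 0.0 else 0.0
    return steps[k] / baseline


def phase_two_split(values: Sequence[float], threshold: float) -> List[List[float]]:
    """Split a segment once, at its highest-scoring position, if warranted."""
    if len(values) < 2:
        return [list(values)]
    best = max(range(len(values) - 1), key=lambda k: split_metric(values, k))
    if split_metric(values, best) > threshold:
        return [list(values[:best + 1]), list(values[best + 1:])]
    return [list(values)]


# Illustrative values only: a steady rise containing one abrupt step.
stepped = [100.0, 110.0, 120.0, 400.0, 410.0, 420.0]
steady = [100.0, 110.0, 120.0, 130.0]
stepped_result = phase_two_split(stepped, threshold=5.0)
steady_result = phase_two_split(steady, threshold=5.0)
```

As in the text, only one feature per segment is considered; a splitter sensitive to several features could apply the same test again to each resulting part.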

Diagrammatic graph representation: tree of segments with levels labelled Phase 1, Phase 2, Phase 3 and Phase 4.

Nested List Representation of the Top of the above Tree:

    [<segment 00 attributes> (empty) ([<seg 10 ats> (00) (LIST A)],
                                      [<seg 11 ats> (00) (empty)],
                                      [etc])]

    where LIST A = [(<seg 20 ats> (10) (<seg 50 ats> (20,30) (empty))),
                    (<seg 21 ats> (10) (<seg 30 ats> (21,11) (empty)))]

General Structure:

    interval := [<tuple of seg atts> (list of names of parent interval or intervals) (list of subsegs or a single enclosing segment)]

ie:

    interval := list of A's

    where A := [tuple X, list Y, list Z];
    tuple X := a tuple of segment attributes;
    list Y := either: name of parent seg (if seg arose from split)
              or: names of parent segs (if seg arose from merge)
    list Z := either: list of subsegments into which current level segment is split
              or: super-segment into which current level seg has become the 'left-marginal' merged seg;
              (a super-seg is represented only within its left-marginal sub-segment)

FIGURE 3. Nested list structure designed to represent both split and merge components of an interval. (It has major problems in representing overlap.)
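As a rough modern rendering of the structure in Figure 3, the Python sketch below models only the splitting case: each node carries the 14 segment attributes (tuple X), the names of its parents (list Y) and its sub-segments (list Z). The merge case, in which list Z holds a super-segment, is deliberately left out. All class, field and function names are hypothetical, and the attribute values in the example are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, NamedTuple, Optional


class SegmentAttributes(NamedTuple):
    """The 14 segment attributes, in the order listed in the text."""
    name: str
    first_sample: str
    last_sample: str
    top_depth: float
    bottom_depth: float
    x_top: float
    x_bottom: float
    x_max: float
    x_min: float
    x_mean: float
    slope: Optional[float]        # missing for single-sample segments
    intercept: Optional[float]
    correlation: Optional[float]  # missing for single-sample segments
    n_samples: int


@dataclass
class Node:
    attrs: SegmentAttributes                              # tuple X
    parents: List[str] = field(default_factory=list)      # list Y
    children: List["Node"] = field(default_factory=list)  # list Z (split case)


def finest_level(node: Node) -> List[Node]:
    """Segments at the finest scale reachable below this node."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in finest_level(child)]


def placeholder(name: str) -> SegmentAttributes:
    """Attributes with dummy values; only the name matters in this example."""
    return SegmentAttributes(name, "s", "s", 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
                             0.0, None, None, None, 1)


# Top of the tree in Figure 3, split relations only.
seg20 = Node(placeholder("20"), parents=["10"])
seg21 = Node(placeholder("21"), parents=["10"])
seg10 = Node(placeholder("10"), parents=["00"], children=[seg20, seg21])
seg11 = Node(placeholder("11"), parents=["00"])
root = Node(placeholder("00"), children=[seg10, seg11])
finest_names = [n.attrs.name for n in finest_level(root)]
```

Walking the tree like this corresponds to accessing different levels of a description procedurally rather than by name.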

FIGURE 4. Signal value sequences showing significant discontinuous change. (Three panels of signal magnitude against depth: one discontinuity; two discontinuities; curvilinear.)

FIGURE 5. Evaluation of the 'segment splitting metric'. (Three panels of signal magnitude against depth, each marked 'checking metric for split here'; the quantities x, a, b, c, d and e are defined graphically in the original figure.) The metric given beneath each panel is, respectively:

\[ \text{METRIC} = \frac{x}{(a+b)/2} \]

\[ \text{METRIC} = \frac{x}{\max\big((a+b)/2;\ d\big)} \]

\[ \text{METRIC} = \frac{x}{\max\big((a+b)/2;\ (c+d+e)/3\big)} \]

Segmentation: Phase three

It is the principle of the first two phases of segmentation to give expression to as much structure in the original signal as possible, although this may incur the cost of generating and storing more segments than are necessary. The task of 'overlooking' these unnecessary segments is left to higher levels of signal transformation. Nevertheless, the overall objective of segmentation is to express the signal in as concise a form as possible.

The sequential combination of Phase One and Phase Two Segmentation can generate erroneously redundant single-sample segments, which are completely overlapped
by non-degenerate segments accounting for the same subsection of a profile. How
such a situation can develop is illustrated in Figure 6.
Further, the heuristics employed in the first two phases of segmentation may
generate segments whose presence may be entirely accounted for by noise in the
original signal. (Some of the important knowledge a geochemist uses in
interpreting a profile is his awareness of the (highly variable) noise level
inherent in a geochemical signal value. The noise level is a function of the
attribute measured, the method used to measure it, its measured magnitude, and
sometimes of the values of other sample attributes.)
Phase Three Segmentation has, as its primary objectives, the elimination of
redundant degenerate segments, and of degenerate segments whose isolated
existence is most likely to result from signal noise. At the same time, it
transforms the internal representation of an interval from that designed for
splitting operations, to that designed for the merging phases of segmentation.
The structure of this representation is explained in Figure 7. As in Figure 3,
it should be noted that discontinuities between segments are not explicitly
represented.
Phase Three Segmentation is implemented in the Core program in PROLOG, with calls
to a Statistics Facility for computation of new segment attribute values.

Segmentation: Phase four
The first three phases of the Segmentation Facility effect signal-to-symbol
transformation at the finest practicable scale, given the nature of the signal,
and the nature of the features that are to be recognised in the signals.
However, as explained in the introduction, multi-scale

[Figure 6 diagram: Phase One Segmentation + Phase Two Segmentation, with the resulting redundant segment labelled.]

FIGURE 6. The development of redundant degenerate segments, by a combination of phase one and phase
two segmentation
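
The overlap case that Phase Three Segmentation removes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's PROLOG implementation: segments are represented as hypothetical inclusive (first sample index, last sample index) pairs, and the names `is_degenerate` and `drop_redundant_degenerates` are invented for the example. The noise-based elimination of isolated degenerate segments is not modelled.

```python
from typing import List, Tuple

# Assumed representation: (first sample index, last sample index), inclusive.
Segment = Tuple[int, int]

def is_degenerate(seg: Segment) -> bool:
    """A degenerate segment covers exactly one sample."""
    return seg[0] == seg[1]

def drop_redundant_degenerates(segments: List[Segment]) -> List[Segment]:
    """Remove single-sample segments that lie inside a multi-sample segment."""
    proper = [s for s in segments if not is_degenerate(s)]
    kept = []
    for seg in segments:
        covered = is_degenerate(seg) and any(
            lo <= seg[0] <= hi for lo, hi in proper
        )
        if not covered:
            kept.append(seg)
    return kept

# (3, 3) is overlapped by (0, 3); (8, 8) is isolated and therefore kept.
segments = [(0, 3), (3, 3), (4, 7), (8, 8)]
cleaned = drop_redundant_degenerates(segments)
```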



DIAGRAMMATIC GRAPH REPRESENTATION:

[Figure 7 diagram: graph representation of an interval, with levels labelled: Phase Three Segmentation; Phase Four Segmentation (Pass 1); Phase Four Segmentation (Pass 2); calculated during Phase One Segmentation.]

General Interval Representation:

[< tuple of 'super-segment' attributes> (list of its sub-intervals) ]

Nested List Representation of the above interval:

before Phase Three Segmentation:

[<seg 00 ats> ([<seg 20 ats> (empty)]
               [<seg 21 ats> (empty)]
               [<seg 11 ats> (empty)]
               [<seg 22 ats> (empty)] etc)]

after Phase Three Segmentation:

[<seg 00 ats> ([<seg 20 ats> (empty)]
               [<seg 30 ats> ([<seg 21 ats> (empty)] [<seg 11 ats> (empty)])]
               [<seg 22 ats> (empty)] etc)]

FIGURE 7. Nested list structure designed to represent successively merged components of an interval during
segmentation
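
The nested list structure of Figure 7 maps naturally onto nested Python lists. The sketch below is illustrative only: `make_interval` and `merge_adjacent` are hypothetical names, and the attribute tuples are replaced by plain name strings, whereas in the system described the new attribute values come from the Statistics Facility.

```python
def make_interval(attrs, subintervals=None):
    """Build [attributes, sub-intervals]; a leaf has an empty sub-interval list."""
    return [attrs, list(subintervals or [])]

def merge_adjacent(parent, i, new_attrs):
    """Replace children i and i+1 of parent by one super-segment holding both."""
    children = parent[1]
    if not 0 <= i < len(children) - 1:
        raise IndexError("need two adjacent sub-intervals to merge")
    merged = make_interval(new_attrs, children[i:i + 2])
    parent[1] = children[:i] + [merged] + children[i + 2:]
    return merged

# The 'before' structure of Figure 7 (attribute tuples shown as names only).
root = make_interval("seg 00", [
    make_interval("seg 20"),
    make_interval("seg 21"),
    make_interval("seg 11"),
    make_interval("seg 22"),
])

# Merging seg 21 and seg 11 into seg 30 gives the 'after' structure.
merge_adjacent(root, 1, "seg 30")
```

Because the merged sub-intervals stay nested inside the new super-segment, the finer scale detail remains reachable from any coarser level.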

representations of the original signal are needed for optimal feature
recognition. Phase Four Segmentation transforms relatively finer scale symbolic
representations of geochemical profile signals to relatively coarser scale
representations. As an activity, it may be executed repeatedly (with different
parameters) on the same interval, to give ever more coarse-scale representations
of the same signal. Coarser scale representations carry with them all their
finer scale detail, which may readily be accessed when necessary.
Transformation of finer scale signal representations to coarser scale
representations is essentially the controlled merging of segments.
Phase Four Segmentation uses an 'Adjacent Segment Area Difference' metric
(ASAD) to control segment merging. If this metric, as evaluated for both
directions across a boundary between two



[Figure 8: panels plotted against depth.]

FIGURE 8. Graphical illustration of the evaluation of the 'adjacent segment area difference' metric,
emphasising its direction-dependency

segments, remains below a user-selectable threshold, then those two segments may
be merged, provided that a second, discontinuity-related, constraint is also
satisfied. Evaluation of the ASAD metric is illustrated in Figure 8.
The user-selectable discontinuity-preserving constraint on segmentation is
provided because of the importance that 'short-term' discontinuities (which
therefore have relatively small ASADs) assume in certain geological contexts.
This importance requires their explicit representation at coarser scales than
exclusively ASAD-controlled segment-merging would permit.
Examples of the operation of ASAD/DP-controlled segment merging are shown in
Figure 9, together with an example of merging controlled by the ASAD metric
alone.
Phase Four Segmentation is implemented in the Core program in PROLOG, with calls
to a Pascal Facility for the calculation of the ASAD metric, and to the
Statistics Facility for the computation of new segment attribute values.
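
The merging rule of Phase Four Segmentation (both directional ASAD values below the threshold, and the discontinuity-preserving constraint satisfied) can be sketched in Python as below. Only the control logic is taken from the text. The functions `toy_asad`, `toy_dp_ok` and `toy_combine`, and the (length, mean value) segment representation, are stand-in assumptions, since the actual ASAD and DP calculations are defined by Figure 8 and the Pascal Facility, neither of which is reproduced here.

```python
def may_merge(asad_forward, asad_backward, threshold, dp_satisfied):
    """Merge only if ASAD is below threshold in both directions and DP holds."""
    return asad_forward < threshold and asad_backward < threshold and dp_satisfied

def merge_pass(segments, asad, dp_ok, threshold, combine):
    """One left-to-right pass merging adjacent segments that pass both tests."""
    if not segments:
        return []
    merged = [segments[0]]
    for seg in segments[1:]:
        left = merged[-1]
        if may_merge(asad(left, seg), asad(seg, left), threshold, dp_ok(left, seg)):
            merged[-1] = combine(left, seg)
        else:
            merged.append(seg)
    return merged

# Toy stand-ins (assumptions): a segment is (length, mean value).
def toy_asad(a, b):
    # Area between a's level extended across b and b's own level;
    # the result depends on direction because it scales with b's length.
    return abs(a[1] - b[1]) * b[0]

def toy_dp_ok(a, b, dp=0.35):
    # Placeholder constraint: block merging across large relative jumps.
    return abs(a[1] - b[1]) / max(a[1], b[1]) < dp

def toy_combine(a, b):
    length = a[0] + b[0]
    return (length, (a[0] * a[1] + b[0] * b[1]) / length)

profile = [(2, 100.0), (3, 110.0), (1, 400.0), (4, 105.0)]
coarse = merge_pass(profile, toy_asad, toy_dp_ok, threshold=1000, combine=toy_combine)
```

Running `merge_pass` again on its own output with a larger threshold gives the successively coarser representations described above.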



[Figure 9: Ba profiles plotted against depth, panels (a) and (b).]

FIGURE 9(a). Phase four segmentation without the discontinuity preserving (DP) metric (ASAD thresholds
as for 9(b))

FIGURE 9(b). Phase four segmentation with the discontinuity preserving (DP) metric set to 0,35.
Profile xba1 is derived from profile b46gr ba 2 of Figure 2 with an ASAD threshold of 1000,
xba2 from xba1 with a threshold of 2000; likewise with thresholds of 4000, 6000, and 900
up to profile xbz9

Segmentation: Variable sample lengths
In its current implementation, the system has to provide especially for
samples of variable length only during grouping of samples in Phase One and Phase
Three Segmentation, and whenever calculating the attributes of segments including
variable length samples.
The general principle applied during grouping of samples into a segment is
that, unless samples have achieved full 'segment' status (ie: not until the first
three phases of segmentation are complete), they may not be grouped together



[Figure 10: Zn and Ba profiles plotted against depth.]
FIGURE 10. Segmentation of data from samples of variable length.


Unsegmented Zn and Ba profiles are presented to illustrate the dynamic range over which
segmentation algorithms have to perform, and to illustrate the importance of scale in data
interpretation. Profiles (3) and (4) illustrate the utility of varying horizontal and vertical scales
respectively, to better appreciate the structure of the data. Profile (5) illustrates output from
phase one segmentation on the data of profile (4), with obvious examples of sample length
considerations overruling the direction-change segmentation heuristic (e.g.: the isolated, short,
high-valued sample one third the way down the profile)

[Figure 11: Ba profiles plotted against depth.]

FIGURE 11. Segmentation phase one, two and four as applied to samples of variable length over the full
extent of a borehole. Profile (2) is the output from phase one segmentation, profile (3) the
output from phase two segmentation (the effect of which is well illustrated in the top three
samples of the profile). Profile (4) and (5) result from phase four segmentation at ASAD
thresholds of 2000 and 5000 respectively (DP metric was held constant at 0,35). Although
the linear regression approximation to the long segment near the centre of profile (5) is
obviously inappropriate (and produces a negative result at one extremity), the technique
provides a good approximation to most of the profile. The profile plotted in position (6)
is the same as that plotted in position (5), but is plotted at a different scale. It should be
compared with the second profile of Figure 10
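
The rule used in this section for computing segment attributes from samples of unequal length replaces each sample by the closest integral number of equal-length 'derived' values, each half as long as the shortest sample in the segment. A minimal Python sketch of that substitution follows. The name `derive_equal_length_values`, the (top depth, length, value) sample layout, and the placement of derived values at the mid-points of equal sub-lengths are assumptions made for illustration; the text says only that the locations are 'appropriately spaced'.

```python
def derive_equal_length_values(samples):
    """Substitute each sample by equal-length derived values.

    samples: list of (top_depth, length, value) tuples for one segment.
    Returns a list of (depth, value) points.
    """
    if not samples:
        return []
    # Derived values are half as long as the shortest sample in the segment.
    unit = min(length for _, length, _ in samples) / 2.0
    derived = []
    for top, length, value in samples:
        # Closest integral number of derived values; Python's round()
        # sends exact halves to the nearest even integer.
        n = max(1, round(length / unit))
        step = length / n
        for k in range(n):
            # Assumed placement: mid-point of each equal sub-length.
            derived.append((top + (k + 0.5) * step, value))
    return derived

samples = [(0.0, 1.0, 200.0), (1.0, 2.0, 500.0), (3.0, 0.5, 4000.0)]
points = derive_equal_length_values(samples)
# A segment attribute computed from all derived values, here the mean.
segment_mean = sum(v for _, v in points) / len(points)
```

The effect is a length-weighted attribute: the short, high-valued sample contributes 2 of the 14 derived points rather than one third of the segment mean.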



with other samples of substantially different lengths. Factors delimiting the
maximum acceptable length-differences between samples of one segment during low
level segmentation are user-adjustable, by variation of 'current-maxlength' and
'current-minlength' factors.
Currently a rather unsophisticated, but functional, method has been adopted for
calculating the attributes of segments including variable sample lengths. It
requires the substitution of each signal value within a segment by the closest
integral number of equivalent integrated signal values equal in length to half
the length of the shortest signal value in the segment, at appropriately spaced
locations along the sample length. Segment attributes are then calculated from
all such 'derived' integrated signal values of equal length for the segment.
It is recognised that a more sophisticated technique is required for this
purpose. Implementation of such a technique was delayed for consideration
together with more sophisticated segment representation (ie: curvilinear
segments).
Examples of segmentation output from signal profiles with variable sample
lengths are presented in Figures 10 and 11.

Segmentation: Refinements and conclusions
Among the many refinements that should be added to the Segmentation Facility, as
presented, the following are some of the most obvious:

(i) Discrimination of more subtle discontinuities, and more than one of these
per segment, during Phase Two Segmentation.
(ii) Incorporation of both segment slopes into ASAD metric calculation (by use
of extrapolated segment lines to demarcate area margins).
(iii) Representation of segments as curves.
(iv) Better statistical manipulation of variable length samples.
(v) Automatic Phase Four thresholding, based on assessment of ASAD metrics
obtaining after Phase Three Segmentation, and after each cycle of Phase Four
Segmentation.

Nevertheless, the Segmentation Method described above is fully functional. It
transforms geochemical profiles into multi-scale representations, appropriate
for input to parsing routines designed to recognise geologically significant
features of the original profiles. It thus provides a technique for direct
interfacing between large sets of drill data, and Expert Systems designed to
interpret such data.

Acknowledgements
This research was undertaken at the Imperial College of Science and Technology
in London, with financial assistance from the Anglo American Corporation of
South Africa.

References
1. SMYTH, C.P. Expert System assisted interpretation of geochemical borehole
logs. MSc Thesis, Imperial College (London), 1985.
2. NII, H.P. and FEIGENBAUM, E.A. Signal to symbol transformation: HASP/SIAP
Case Study. The AI Magazine vol. 3, pt 2, 1983 pp. 23-35.
3. WITKIN, A.P. Scale-Space filtering. Proc. 8th Int. J. Conf. on AI 1983
pp. 1019-1022.

