Frontiers in Artificial Intelligence
and Applications
Series Editors: J. Breuker, R. Lopez de Mantaras, M. Mohammadian, S. Ohsuga and
W. Swartout
Volume 85
Vol. 84, H. Fujita and P. Johannesson (Eds.), New Trends in Software Methodologies, Tools and Techniques
Vol. 83, V. Loia (Ed.), Soft Computing Agents
Vol. 82, E. Damiani et al. (Eds.), Knowledge-Based Intelligent Information Engineering Systems and Allied Technologies
Vol. 81, In production
Vol. 80, T. Welzer et al. (Eds.), Knowledge-based Software Engineering
Vol. 79, H. Motoda (Ed.), Active Mining
Vol. 78, T. Vidal and P. Liberatore (Eds.), STAIRS 2002
Vol. 77, F. van Harmelen (Ed.), ECAI 2002
Vol. 76, P. Sincak et al. (Eds.), Intelligent Technologies - Theory and Applications
Vol. 75, I.F. Cruz et al. (Eds.), The Emerging Semantic Web
Vol. 74, M. Blay-Fornarino et al. (Eds.), Cooperative Systems Design
Vol. 73, H. Kangassalo et al. (Eds.), Information Modelling and Knowledge Bases XIII
Vol. 72, A. Namatame et al. (Eds.), Agent-Based Approaches in Economic and Social Complex Systems
Vol. 71, J.M. Abe and J.I. da Silva Filho (Eds.), Logic, Artificial Intelligence and Robotics
Vol. 70, B. Verheij et al. (Eds.), Legal Knowledge and Information Systems
Vol. 69, N. Baba et al. (Eds.), Knowledge-Based Intelligent Information Engineering Systems & Allied Technologies
Vol. 68, J.D. Moore et al. (Eds.), Artificial Intelligence in Education
Vol. 67, H. Jaakkola et al. (Eds.), Information Modelling and Knowledge Bases XII
Vol. 66, H.H. Lund et al. (Eds.), Seventh Scandinavian Conference on Artificial Intelligence
Vol. 65, In production
Vol. 64, J. Breuker et al. (Eds.), Legal Knowledge and Information Systems
Vol. 63, I. Gent et al. (Eds.), SAT2000
Vol. 62, T. Hruska and M. Hashimoto (Eds.), Knowledge-Based Software Engineering
Vol. 61, E. Kawaguchi et al. (Eds.), Information Modelling and Knowledge Bases XI
Vol. 60, P. Hoffman and D. Lemke (Eds.), Teaching and Learning in a Network World
Vol. 59, M. Mohammadian (Ed.), Advances in Intelligent Systems: Theory and Applications
Vol. 58, R. Dieng et al. (Eds.), Designing Cooperative Systems
Vol. 57, M. Mohammadian (Ed.), New Frontiers in Computational Intelligence and its Applications
Vol. 56, M.I. Torres and A. Sanfeliu (Eds.), Pattern Recognition and Applications
Vol. 55, G. Cumming et al. (Eds.), Advanced Research in Computers and Communications in Education
Vol. 54, W. Horn (Ed.), ECAI 2000
Vol. 53, E. Motta, Reusable Components for Knowledge Modelling
Vol. 52, In production
Vol. 51, H. Jaakkola et al. (Eds.), Information Modelling and Knowledge Bases X
Vol. 50, S.P. Lajoie and M. Vivet (Eds.), Artificial Intelligence in Education
ISSN: 0922-6389
Advances in Logic, Artificial
Intelligence and Robotics
LAPTEC 2002
Edited by
Jair Minoro Abe
SENAC - College of Computer Science and Technology, Sao Paulo, Brazil
and
IOS Press
Ohmsha
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted,
in any form or by any means, without prior written permission from the publisher.
Publisher
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
fax: +31 20 620 3419
e-mail: [email protected]
LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.
General Chairs:
Organizing Committee:
Alexandre Scalzitti (Germany), Claudio Rodrigo Torres (Brazil - Chair), Flavio Shigeo
Yamamoto (Brazil), Marcos Roberto Bombacini (Brazil - Vice-Chair), and Neli Regina S.
Ortega (Brazil).
On behalf of the Organizing Committee, we would like to express our gratitude to the
members of the following committees:
Honorary Chairs:
Scientific Committee:
Atsuyuki Suzuki (Japan), Braulio Coelho Avila (Brazil), Daniel Dubois (Belgium), Dietrich Kuske (Germany), Edgar G. Lopez-Escobar (U.S.A.), Eduardo Massad (Brazil), Germano Lambert-Torres (Brazil), Germano Resconi (Italy), Guo-Qiang Zhang (U.S.A.), Heinz-Dieter Ebbinghaus (Germany), Helmut Thiele (Germany), Hiroakira Ono (Japan), Joao Inacio da Silva Filho (Brazil), John Meech (Canada), Kazumi Nakamatsu (Japan), Kiyoshi Iseki (Japan), Lotfi A. Zadeh (U.S.A.), Manfred Droste (Germany), Marcel Guillaume (France), Maria Carolina Monard (Brazil), Mineichi Kudo (Japan), Miyagi Kiyohiro (Japan), Nelson Favilla Ebecken (Brazil), Newton C.A. da Costa (Brazil), Paulo Veloso (Brazil), Patrick Suppes (U.S.A.), Seiki Akama (Japan), Setsuo Arikawa (Japan), Sheila Veloso (Brazil), Ricardo Bianconi (Brazil), Tadashi Shibata (Japan), Tetsuya Murai (Japan), and Tsutomu Date (Japan).
Also our special gratitude to the following additional scholars who helped us in refereeing
papers: Joao Pereira da Silva - DCC UFRJ (Brazil), Rosa Esteves - UERJ (Brazil), Maria das Graças Volpe Nunes - USP (Brazil), Alneu de Andrade Lopes - USP (Brazil), Gustavo Batista - USP (Brazil), Claudia Martins - USP (Brazil), Jose Augusto Baranauskas - USP (Brazil), Claudia Milare - USP (Brazil), Roseli Francelin - USP (Brazil), Joao Batista Neto - USP (Brazil), Adriano Donizete Pila - USP (Brazil), Alexandre Rasi Aoki - UNIFEI (Brazil), Claudio Inacio de Almeida Costa - UNIFEI (Brazil), and Luiz Eduardo Borges da Silva - UNIFEI (Brazil).
We would like to thank the numerous sponsors, particularly the SENAC - College of Computer Science and Technology, Sao Paulo, Brazil. We would also like to acknowledge the following entities: FAPESP, IEEE, CNPq, Institute for Advanced Studies - University of Sao Paulo, University of Sao Paulo - Campus Sao Carlos, Himeji Institute of Technology - Japan, Shizuoka University - Japan, Teikyo Heisei University - Japan, UNIFEI - Universidade Federal de Itajuba - Brazil, Federal University of Rio de Janeiro - Brazil, Sociedade Brasileira de Computacao, Sociedade Brasileira para o Progresso da Ciencia, ABJICA - Brazil, Hokkaido University - Japan, The University of British Columbia - Canada, Universitat Dortmund - Germany, University of Liege - Belgium, Stanford University - U.S.A., University of the Ryukyus - Japan, Centrais Eletricas do Norte do Brasil S.A. - Eletronorte, FURNAS Centrais Eletricas S.A., and IOS Press, the publisher of these Proceedings.
Our undying gratitude again to all these gifted people, responsible for the success of
LAPTEC 2002.
Retriever Prototype of a Case Based Reasoning: A Study Case, H.G. Martins and
G. Lambert-Torres 1
Dynamic Compaction Process of Metal Powder Media within Dies, K. Miyagi, Y. Sano,
T. Sueyoshi and Z. Nakao 9
Automated Theorem Proving for Many-sorted Free Description Theory Based on
Logic Translation, K. Nakamatsu and A. Suzuki 17
Annotated Logic and Negation as Failure, K. Nakamatsu and A. Suzuki 29
Multi-agent System for Distribution System Operation, A.R. Aoki, A.A.A. Esmin and
G. Lambert-Torres 38
Arbitrariness: Putting Computer Creativity to Work in Aesthetic Domains,
A. Moroni, J. Manzolli and F.J. Von Zuben 46
An Overview of Fuzzy Numbers and Fuzzy Arithmetic, F. Gomide 55
The Brain and Arithmetic Calculation, F.T. Rocha and A.F. Rocha 62
Evolving Arithmetical Knowledge in a Distributed Intelligent Processing System,
A.F. Rocha and E. Massad 68
Meme-Gene Coevolution and Cognitive Mathematics, E. Massad and A.F. Rocha 75
Neuronal Plasticity: How Memes Control Genes, A. Pereira Jr. 82
The Influence of Heterogeneity in the Control of Diseases, L.C. de Barros,
R.C. Bassanezi and R.Z.G. de Oliveira 88
Paraconsistent Logics viewed as a Foundation of Data Warehouses, S. Akama and
J.M. Abe 96
Visualization of Class Structures using Piecewise Linear Classifiers, H. Tenmoto,
Y. Mori and M. Kudo 104
Design of Tree Classifiers using Interactive Data Exploration, Y. Mori and M. Kudo 112
Clustering Based on Gap and Structure, M. Kudo 120
Tables in Relational Databases from a Point of View of Possible-Worlds-Restriction,
T. Murai, M. Nakata and Y. Sato 126
On Some Different Interpretations of the Generalized Modus Ponens using Type-2
Fuzzy Sets, H. Thiele 134
Paraconsistent Knowledge for Misspelling Noise Reduction in Documents,
E.L. das Santos, P.M. Hasegawa, B.C. Avila and C.A.A. Kaestner 144
Automata with Concurrency Relations — A Survey, M. Droste and D. Kuske 152
Learning with Skewed Class Distributions, M.C. Monard and G.E.A.P.A. Batista 173
An Enlargement of Theorems for Sentential Calculus, S. Tanaka 181
A Real-time Specification Language, F.N. do Amaral, E.H. Haeusler and M. Endler 194
Defuzzification in Medical Diagnosis, J.C.R. Pereira, P.A. Tonelli, L.C. de Barros and
N.R.S. Ortega 202
Fuzzy Rules in Asymptomatic HIV Virus Infected Individuals Model,
R.S. da Motta Jafelice, L.C. de Barros, R.C. Bassanezi and F. Gomide 208
Categorical Limits and Reuse of Algebraic Specifications, I. Cafezeiro and
E.H. Haeusler 216
Constructive Program Synthesis using Intuitionist Logic and Natural Deduction,
G.M.H. Silva, E.H. Haeusler and P.A.S. Veloso 224
An Agent-oriented Inference Engine Applied for Supervisory Control of Automated
Manufacturing Systems, J.M. Simão and P.C. Stadzisz 234
LTLAS: a Language Based on Temporal Logic for Agents Systems Specification,
N.F. Mendoza and F.F. Ramos Corchado 242
Fuzzy Identification of a pH Neutralization Process, R.A. Jeronimo, L.A. Souza and
E.P. Maldonado 250
A Fuzzy Reed-Frost Model for Epidemic Spreading, T.E.P. Sacchetta, N.R.S. Ortega,
R.X. Menezes and E. Massad 258
Data Mining in Large Base using Rough Set Techniques, G. Lambert-Torres 267
An Automaton Model for Concurrent Processes, M. Droste 268
It is a Fundamental Limitation to Base Probability Theory on Bivalent Logic,
L.A. Zadeh 269
Polysynthetic Class Theory for the 21st Century, E.G.K. Lopez-Escobar 272
Development of a Fuzzy Genetic System for Data Classification, N. Ebecken 273
Why the Problem P = NP is so Difficult, N.C.A. da Costa 274
The Importance of Evaluating Learning Algorithms during the Data Mining Process,
M.C. Monard and G.E.A.P.A. Batista 275
Abstract - This paper presents the results of applying Case-Based Reasoning (CBR)
to support diagnostic decisions, presenting a retriever prototype used to recover
cases from a specialist domain. Its aim is to retrieve, from a memory of cases, the
case most adequate to a new situation and to suggest a solution for the new case.
1 - Introduction
The Case-Based Reasoning (CBR) paradigm presumes the existence of a memory where
already solved cases are stored; it uses these cases, through retrieval, to help in the
resolution or interpretation of new problems; and it promotes learning, allowing new
cases (newly solved or newly spelled out) to be added to the memory [1].
A CBR system uses previous cases both for the evaluation, justification or interpretation
of proposed solutions (interpretive CBR) and for proposing solutions to new problems
(problem-solving CBR) [2].
Case-based reasoning has been opening new fields in computer support for
ill-structured decision problems.
This technique has been used for decision support in many knowledge domains, such
as planning, design and diagnosis, with better performance than other reasoning systems;
as examples we mention PROTOS and CASEY as referred to in [1], AIRQUAP in [3], a
wastewater treatment system developed by [4], and a direct-current motor repair system
named TRAAC [5].
The purpose of the CBR system is to recover from its memory the case most similar
to the new one, and to suggest its solution, or an adaptation of it, as a solution for the new
situation.
The usefulness of the old cases is assessed through the similarity of the new case to
each old one.
The central methodology of the retriever prototype is the determination of the similarity
of a new case to all the previous cases. The similarities are established through combination
(matching) functions over the characteristics of the new case and those of the previous
ones. In the next sections we show how similarities and the matching function are
handled, as well as the architecture of the retriever prototype.
2 - Similarities
A case may be considered as a scheme comprising a set of attribute-value pairs,
that is to say, descriptions [1]. For example, in a decision scenario for credit evaluation,
a loans manager assesses several attribute-value pairs; for instance, the attribute
"candidate type" has a "medium" value. He then sets up a similarity combination of the
new case scheme with a previous case scheme. This combination is made in two steps:
1) To find the similarity of the new case scheme with a previous case scheme along the
descriptions;
2) To find the global similarity through combination functions [5].
The similarity between the new case and a previous case along the descriptions is
found using domain knowledge made up of heuristic combination rules and domain-specific
rules [6]. As an example, a combination rule may state that a description
"oven ember color" with value orange is more similar to the value red than to
other values.
The global similarity of a new case with previous cases is found through Combination
Functions that aggregate the similarities along the descriptions; the function used in this
work is the Modified Cosine Function [5], shown as follows:

Gcas = Σ_j (wn_j · wp_j · sim_j) / (√(Σ_j wn_j²) · √(Σ_j wp_j²))

The modified cosine combination determines the global similarity, named the "matching
degree" Gcas, between two cases by comparing term frequencies, that is to say, the
description weights wn_j of the new case and the weights wp_j of the previous case.
The function measures the cosine of the angle between the weight vectors of the new
and previous cases, where the cosine is weighed by the similarity degree sim_j along
the m-dimensional space of descriptions.
The denominator terms of the equation above normalize the weight vectors through the
determination of their Euclidean lengths.
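As a concrete illustration, the modified cosine combination can be sketched as follows (a minimal reimplementation over plain Python lists; the function name and signature are ours, not the authors'):

```python
import math

def gcas(w_new, w_prev, sim):
    """Matching degree Gcas: the cosine of the angle between the weight
    vectors of the new and previous cases, with each product term weighed
    by the per-description similarity sim[j]."""
    num = sum(wn * wp * s for wn, wp, s in zip(w_new, w_prev, sim))
    den = math.sqrt(sum(w * w for w in w_new)) * \
          math.sqrt(sum(w * w for w in w_prev))
    return num / den if den else 0.0
```

For identical weight vectors with all similarities equal to 1 the matching degree is 1, and for orthogonal weight vectors it is 0.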
The similarity function is based on the pertinence (weight) of description values for the
diagnostic. The similarity between the value of a description present in the new case
and the value of the same description in a previous case in memory is taken as one
minus the ratio between the difference of the weights that each of these values
has for the diagnostic of the case in memory and the extension of the description
scale.
The importance of a description is determined on a scale, as per Figure 1:
For example, presume that a description of one specific system is the temperature,
and that the determination of its importance passes through a scale whose extension is
defined as in Figure 2.
The similarity along the description may then be computed, for example, for a value of
Very High Temperature (2) in combination with a value of Medium Temperature (1.25).
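The rule described above can be sketched as follows (our naming; the extension value used for the temperature example is an assumption, since the scale of Figure 2 is not reproduced here):

```python
def description_similarity(w_new, w_prev, extension):
    """Similarity along one description: one minus the difference between
    the two weights divided by the extension of the description scale."""
    return 1.0 - abs(w_new - w_prev) / extension
```

With the temperature example, Very High (2) against Medium (1.25) on a scale of assumed extension 2 gives 1 - 0.75/2 = 0.625.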
4 - Retrieve Process
The recovery process in Case-Based Reasoning draws on the experience of past
solutions stored in a memory as cases. This technique aims to recover the previous
cases most useful for the solution of the new decision-making problem and to ignore
the irrelevant previous cases.
The case recovery works in the following way, as set up in Figure 3: based upon the
description of the new decision-making problem (the new case), the base of previous
cases is searched as decision support. The search is made based upon similarities
[7]. The previous cases are passed through the combination function (matching degree)
and are ordered in decreasing order of matching degree. The combination function
determines the degree of similarity, and hence the potential usefulness, of the previous
cases with respect to the new case.
The retriever prototype requires the whole body of evidence supplied by the memory,
that is to say, it requires that the matching degree of the entry case be computed against
all the cases in memory.
In Figure 4 we see the Retriever Prototype architecture. For each memory case, a
matching function is defined between the new case and that case. This function computes
the Belief Function in favor of the diagnostic of that case for the entry case.
The previous cases found by the search are combined through Gcas and arranged in
decreasing order of global similarity.
In the diagnostic domain, it is usual that cases with the same diagnostic present
different groups of symptoms or characteristics.
"The most adequate diagnostic" proposed for the entry situation is the one that
presents the most evidence in its favor, in other words, the one that has the largest Belief
Degree, computed through the Matching Degree, Gcas.
"The most adequate case" suggested is the one, among all the cases of the class of the
selected diagnostic, that has the largest Matching Degree Gcas. The cases that belong to
that diagnostic class may help in the solution of the new problem.
The advantage of recovering the case is that it may contain information that was
useful for the solution of previous problems and that may help in solving the new case.
Our CBR is used for diagnostic determination in the operation of a cement kiln [8].
The kiln operator aims at the production of high-quality cement, which is achieved
basically by controlling the cement conditions.
The operator's target is the control of the values of two kiln parameters, namely Rotation
Velocity (RV) and Temperature (T). These parameter values are set by the operator
considering four other parameters: Granulation (G), Viscosity (V), Color (C), and
pH Level. A decision table where G, V, C and pH are "condition attributes" and RV and
T are "decision attributes" may describe these actions. The condition attribute values, when
combined, correspond to a specific quality of the cement produced in the kiln, and for each
combination of these attributes adequate actions are expected to provide high quality. All
attributes and their values are listed in Table 1.
Table 1 - Specification of Descriptions and their Extensions
Attributes Descriptions Extension of Importance Scale
Condition a - Granulation 0-3
Condition b - Viscosity 0-3
Condition c - Color 0-3
Condition d - pH Level 0-3
Decision e - Rotational Velocity 0-3
Decision f - Temperature 0-3
Table 2 shows the possible diagnostics and their identification according to the
descriptions rotational velocity (e) and temperature (f).
Table 2 - Diagnostics and their extensions
Diagnostic e f
D1 1 3
D2 0 3
D3 1 2
D4 1 1
The knowledge base for CBR evaluation comprises 13 cases, each related to its
descriptions and diagnostic, as shown in Table 3:
Table 3 - Knowledge Base
Cases a b c d Diagnostic
CASE 01 2 1 1 1 D1
CASE 02 2 1 1 0 D1
CASE 03 2 2 1 1 D1
CASE 04 1 1 1 0 D2
CASE 05 1 1 1 1 D2
CASE 06 2 1 1 2 D3
CASE 07 2 2 1 2 D3
CASE 08 3 2 1 2 D3
CASE 09 3 2 2 2 D4
CASE 10 3 3 2 2 D4
CASE 11 3 3 2 1 D4
CASE 12 3 2 2 1 D4
CASE 13 3 0 2 1 D4
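The retrieval step over this knowledge base can be sketched as follows (an illustrative reimplementation, not the authors' code: the attribute values stand in for description weights, the extension 3 comes from the scales of Table 1, and the belief per diagnostic class is taken here simply as the largest matching degree among the cases of that class):

```python
import math

# Table 3 knowledge base: attribute values (a, b, c, d) and diagnostic.
CASES = [
    ((2, 1, 1, 1), "D1"), ((2, 1, 1, 0), "D1"), ((2, 2, 1, 1), "D1"),
    ((1, 1, 1, 0), "D2"), ((1, 1, 1, 1), "D2"),
    ((2, 1, 1, 2), "D3"), ((2, 2, 1, 2), "D3"), ((3, 2, 1, 2), "D3"),
    ((3, 2, 2, 2), "D4"), ((3, 3, 2, 2), "D4"), ((3, 3, 2, 1), "D4"),
    ((3, 2, 2, 1), "D4"), ((3, 0, 2, 1), "D4"),
]

def gcas(new, prev, extension=3.0):
    """Matching degree: cosine of the two weight vectors, each product
    term weighed by the similarity along that description."""
    sims = [1.0 - abs(n - p) / extension for n, p in zip(new, prev)]
    num = sum(n * p * s for n, p, s in zip(new, prev, sims))
    den = math.sqrt(sum(n * n for n in new)) * \
          math.sqrt(sum(p * p for p in prev))
    return num / den if den else 0.0

def retrieve(new_case):
    """Belief degree per diagnostic class and the most adequate diagnostic."""
    belief = {}
    for attrs, diag in CASES:
        belief[diag] = max(belief.get(diag, 0.0), gcas(new_case, attrs))
    return max(belief, key=belief.get), belief
```

For instance, `retrieve((2, 2, 2, 1))` ranks the four diagnostic classes by belief degree for the new case [2 2 2 1].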
The Retriever Prototype was tested with several new cases; the results for the first new
case, [2 2 2 1], will be presented, that is to say, the one using the simplest Knowledge Base.
For the new case [2 1 0 2], the Knowledge Base used already has a larger memory of
previous cases, since its knowledge is more comprehensive.
NEW CASE: [ 2 2 2 1 ]
Case found in Knowledge Base: 0
No Case Found
Columns 1 through 14
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14
2 2 2 1 1 2 2 3 3 3 3 3 3 3
1 1 2 1 1 1 2 2 2 3 3 2 0 1
1 1 1 1 1 1 1 1 2 2 2 2 2 1
1 0 1 0 1 2 2 2 2 2 1 1 1 1
Columns 15 through 20
C15 C16 C17 C18 C19 C20
2 1 1 2 1 2
2 3 3 3 1 1
2 3 3 0 2 0
1 1 0 2 3 2
Cr(D1) = 6.993502e-001
Cr(D2) = 5.350826e-001
Cr(D3) = 8.181133e-001
Cr(D4) = 4.709396e-001
The basic idea of this work is the knowledge retrieval process, in which knowledge is
stored so as to allow us to simulate future actions in diagnostic
determination.
We have presented an architecture for a Retriever Prototype of cases from a CBR.
As additional and immediate work, we intend to incorporate special properties into the
Belief Function that selects the most adequate diagnostic for an entry
situation and, consequently, recovers the most adequate case and all the cases
belonging to that diagnostic class, which may also help in the solution of the new
problem. These special properties are related to the Two-Valued Annotated
Paraconsistent Logic (LPA2v), with the purpose of uniting two major areas: Case-Based
Reasoning and Paraconsistent Logic, according to [9] and [10].
References
[1] Kolodner, J.L., Case-Based Reasoning, San Mateo, CA: Morgan Kaufmann Publishers, Inc., 1993.
[2] Kolodner, J.L. and Leake, D.B., "A tutorial introduction to case-based reasoning", in Case-Based Reasoning: Experiences, Lessons, and Future Directions, D.B. Leake, Ed., Menlo Park, CA: AAAI Press, 1996.
[3] Lekkas, G.P., Avouris, N.M. and Viras, L.G., "Case-based reasoning in environmental monitoring applications", Applied Artificial Intelligence, 8: 359-376, 1994.
[4] Krowidy, S. and Wee, W.G., "Wastewater treatment systems from case-based reasoning", Machine Learning, vol. 10: 341-363, 1993.
[5] Gupta, K.M. and Montazemi, A.R., "Empirical evaluation of retrieval in case-based reasoning systems using modified cosine matching function", IEEE Transactions on Systems, Man and Cybernetics - Part A: Systems and Humans, vol. 27, no. 5, September 1997.
[6] Porter, B.W., Bareiss, R. and Holte, R.C., "Concept learning in weak theory domains", Artificial Intelligence, vol. 45, no. 1-2, pp. 229-264, Sept. 1990.
[7] Bareiss, R. and King, J.A., "Similarity assessment in case-based reasoning", in Proc. DARPA Workshop on Case-Based Reasoning, San Mateo, CA: Morgan Kaufmann, pp. 67-71, 1989.
[8] Sandness, G.D., "A Parallel Learning System", CS890 Class Paper, Carnegie Mellon University, pp. 1-12, 1986.
[9] Gonzaga, H., Costa, C.I.A. and Lambert-Torres, G., "Generalization of Fuzzy and Classic Logic in NPL2v", Advances in Systems Science: Measurement, Circuits and Control - Electrical and Computer Engineering Series, N.E. Mastorakis and L.A. Pecorelli-Peres, Eds., WSES, pp. 79-83, 2001.
[10] Lambert-Torres, G., Costa, C.I.A. and Gonzaga, H., "Decision-Making System based on Fuzzy and Paraconsistent Logics", Logic, Artificial Intelligence and Robotics - Frontiers in Artificial Intelligence and Applications, LAPTEC 2001, J.M. Abe and J.I. da Silva Filho, Eds., IOS Press, pp. 135-146, 2001.
Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002

Dynamic Compaction Process of Metal Powder Media within Dies
K. Miyagi, Y. Sano, T. Sueyoshi and Z. Nakao
Abstract. Dynamic compaction of a copper powder medium was studied through both
theoretical analysis and experiments. Through comparison of the theoretical and experimental
data obtained, it was hoped to provide further useful experimental information.
Introduction
In this study a porous metal powder medium in a die is assumed to be a uniform continuum
consisting of solid powder particles and air [5]. The pressure-density curve of the powder is known
to be convex and a shock wave propagates in the porous metal powder medium when the powder is
subjected to compaction [1]. In general, the shock wave propagating in a powder medium under
jump conditions is treated with the use of the equations for the constitution of the powder, in addition
to the Rankine-Hugoniot relation, also known as the conservation laws of momentum and mass [2].
In this study the static pressure-density relation is used as the constitutive equation for the
metal powder medium within the die, and elastic recovery is neglected in order to simplify the
treatment of the reflected wave. For the experiment, a new measuring method was developed, and
the dynamic behavior of the medium under compaction was examined as a continuum [2,4]. The
powder particles were assumed to move only in the direction of compaction. The medium in the die,
either finitely long or effectively infinitely long, was compacted [2,3,4] by the direct impact of
the punch, and theoretical and experimental studies were made with regard to the data obtained.
In the experiment the copper powder medium in the die was marked on one side with fine
lines of aluminum powder. The change of the interval of the lines was filmed with a high-speed
camera, and the behavior of the powder and propagation of the shock wave in the powder medium
were observed. It was found that in an effectively infinitely long powder medium, the powder
particles show the same behavior as the theory predicts, and the observed densities increase as the die
wall friction increases with time. It was also confirmed that in a finitely long powder medium, the
powder particles act as predicted, and the theoretical results agree fairly well with the experimental
results, and the compact has an approximately uniform density, which is affected somewhat by the
die wall friction [6].
Basic Equation
Effectively Infinitely Long Metal Powder Medium in the Die
Let the x-axis be along the direction of the powder in the die and let it satisfy 0 ≤ x < ∞. In
Fig. 1 a small element along the x-axis between the points I and II, of length dx, is taken, and A
denotes the inner cross-sectional area of the die. We consider that at x = 0 the punch, with mass M,
impacts the powder surface with initial velocity v₀; thus, when the mass impacts the surface of the
powder medium, x = 0 and the velocity is v₀. We assume that the shock wave reaches point I at
some time t and reaches point II at time t + dt, and that the medium pressure, density and
particle velocity are denoted, respectively, by p, ρ and v behind the shock wave front and by p̄,
ρ̄ and v̄ in front of it, and also that the velocity of the shock wave is given by c. We further
assume that at time t + dt, points I and II move, respectively, to points I' and II', and that the
wave front reaches point II'. Then I II' = c dt, I I' = v dt, II II' = v̄ dt, and thus the
infinitesimal element I II moves to I' II'. The state variables of the element are transformed from
p̄, ρ̄ and v̄ to p, ρ and v. Consequently, from the law of conservation of mass,

ρ(c − v) = ρ̄(c − v̄)    (1)

From the laws of conservation of momentum and impulse, we have

p − p̄ = ρ̄(c − v̄)(v − v̄)    (2)
Before the impact, the pressure and particle velocity in front of the wave in this powder satisfy
p̄ = v̄ = 0. By letting ρ̄ = ρ₀ in equations (1) and (2), we get

ρ₀c = ρ(c − v)    (3)

p = ρ₀cv    (4)
Although the powder pressure behind the wave front gradually decreases with time, because
the punch velocity decreases due to powder resistance, the density does not change during unloading
if elastic recovery is ignored. Therefore, the density at each point in the area remains constant,
determined by the pressure at the point immediately after the passing of the shock wave
front. Thus, the element between the shock wave front and the rigid body (i.e. the punch) moves at
the same velocity as the body. In Figure 1, from the law of conservation of momentum of the
medium to the left-hand side of point I together with the punch, we obtain

−Ap̄ dt = (M + m + dm)(v + dv) − (M + m)v − dm v̄
        = (M + m)dv + dm(v − v̄)    (5)

where A is the die cross-sectional area as before, m is the powder mass between the shock wave front and
the punch, and dm is the increment of m during dt, given by dm = ρ̄A(c − v̄)dt.
Thus, using equation (2), equation (5) becomes

(M + m)dv/dt + Ap = 0    (6)

Since p̄ = v̄ = 0, from equation (5) we get d{(M + m)v} = 0, and hence

(M + m)v = constant    (7)

Since m = 0 and v = v₀ immediately after the impact, we have

(M + m)v = Mv₀    (8)
Up to the passing of the wave front, the medium mass at x before deformation satisfies

m = ρ₀Ax    (9)

Consequently, the particle velocity at that moment is given by

v = Mv₀ / (M + ρ₀Ax)    (10)
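Equation (10) can be sketched numerically as follows (the argument names and the numbers used in the test are purely illustrative; any consistent unit system works):

```python
def particle_velocity(x, M, v0, rho0, A):
    """Particle velocity just behind the shock front, equation (10):
    the impact momentum M*v0 is shared between the punch and the
    swept powder mass rho0*A*x."""
    return M * v0 / (M + rho0 * A * x)
```

At x = 0 this returns the impact velocity v₀, and it decreases monotonically as the front advances and the compacted mass grows.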
We will use a static pressure-density relation. Although there are many equations proposed
for this relation, the equation due to Kawakita [7] is adopted, because it is given in a simple algebraic
form, which is easier to apply. Let p and ρ be, respectively, the powder medium pressure and
density at an arbitrary point in the die, and let a and b be coefficients determined by the medium.
Then the relation between p and ρ is given by

ρ = ρ₀(1 + bp) / (1 + (b − ab)p)    (11)
Eliminating c from equations (3) and (4), we obtain

p = ρ₀ρv² / (ρ − ρ₀)    (12)

Elimination of ρ from equations (11) and (12) yields abp² − bρ₀v²p − ρ₀v² = 0.
Putting α = ab, β = bρ₀v², γ = ρ₀v², we get

p = (β + √(β² + 4αγ)) / (2α)    (13)
From equation (4), we obtain

c = p / (ρ₀v)    (14)
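Equations (13) and (14) can be sketched together as follows (the Kawakita coefficients a and b used in the test are arbitrary illustrative values, not fitted to copper powder):

```python
import math

def shock_state(v, rho0, a, b):
    """Pressure p behind the front from the quadratic
    a*b*p**2 - b*rho0*v**2*p - rho0*v**2 = 0 (equation (13), positive root),
    and the shock front velocity c = p / (rho0 * v) (equation (14))."""
    alpha = a * b
    beta = b * rho0 * v * v
    gamma = rho0 * v * v
    p = (beta + math.sqrt(beta * beta + 4.0 * alpha * gamma)) / (2.0 * alpha)
    c = p / (rho0 * v)
    return p, c
```

The positive root is taken because the pressure behind a compaction shock is positive.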
The time t required for the shock wave front to reach the point x is given by

t = ∫₀ˣ (1/c) dx    (15)

By integrating equation (5) with respect to v, using equations (10) and (14), an analytical
expression (16) for t is obtained, involving a logarithmic term and a constant k. Thus, from
equations (10), (11) and (13), the density distribution in the medium is obtained (equation (20)).
The pressure-density relation of Kawakita is given by

ρ = ρ₀(1 + bp) / (1 + (b − ab)p)    (21)

From equations (19), (20) and (21) we obtain

p = (β + √(β² + 4αγ)) / (2α)    (22)

where the coefficients α, β and γ are given by equation (23).
When the wave front travels toward the plug end after reflection of the wave front at the punch
surface, the particle velocity v̄ in front of the wave front always remains zero. Thus,

ρ(c − v) = ρ̄c    (24)

Equations (5') and (6) remain valid for the reflected wave, from the equation of momentum behind
the wave front.
For the reflected wave moving toward the punch, the powder medium mass is given by

m = m₀ + A ∫ ρ̄c dt    (i = 1, 3, 5, …)    (27)

Here p̄ and ρ̄ are known functions; p, c, ρ, v and m can be determined from equations (19),
(20), (5'), (22) and (27), where tᵢ is the moment when the wave front is at one of the ends of the
medium. For instance, t₀ is the moment of impact, t₁ the moment the wave front reaches the plug
end for the first time, t₂ the moment the reflected wave reaches the punch surface for the first
time, etc. The medium mass m in front of the wave front when the shock wave is propagating
toward the plug end is given by

m = A ∫ ρ̄c dt    (i = 0, 2, 4, 6, …)    (28)

The pressure p̄ is known: p̄ = 0 before the wave front reaches the plug end for the first time, and
afterwards it is the pressure behind the reflected wave; ρ̄ is also known. Thus the unknown values
p, c, v, ρ and m can be determined from equations (19), (20), (5'), (13) and (28). The necessary
calculation can be done by use of the difference method.
The particle velocity at x = 95 [cm] was theoretically about one half of the impact velocity.
However, the theoretical data [2] agree with the experimental data [4] in that the powder particles
move approximately as fast as the punch when the wave front arrives, and, although it is not clear
from this figure, the powder medium element gradually becomes longer, i.e., less dense, closer to
the plug. Although there is a quantitative discrepancy with the theoretical data as noted, the
experimental results agree well with the qualitative prediction of the theory.
Conclusions
For the "effectively infinitely" long powder medium an analytical solution was obtained in
which the front position x (expressed by the Lagrangean coordinate) was used as the variable, and for
the finitely long medium a numerical solution was obtained. Also with the use of the Lagrangean
coordinate, from the result for the infinitely long medium it has been confirmed that in all cases the
element immediately behind tiie shock wave front lies where the pressure decrease occurs
approximately as theoretically predicted and the compact of approximately uniform density is
obtained by the high velocity compaction if the die wall friction is neglected. The behavior of the
medium was filmed with the high-speed camera and the high velocity compaction process was thus
investigated experimentally. Results obtained were compared with the theoretical results for the
parameters used in the experiment.
As the shock wave propagates through the medium, in the "effectively infinite" case, particles
14 K. Miyagi et ai / Dynamic Compaction Processes oj Metal Powder Media
just move at the punch velocity as the shock wave front reaches them, but in the finite length case, particles come to rest again when the reflected wave front arrives. These results agree with the theoretical prediction. In the "effectively infinite" case, the friction force increases with the advance of the front, and the compaction process data obtained differ considerably from the theoretical predictions. In the finitely long medium case, there is excellent agreement with the theoretical prediction, possibly because of the high number of reflected passes of the shock wave between the punch and the plug, so that the influence of die wall friction can be neglected. In this case a uniform density compact is obtained. Higher densities and more uniform density distributions are obtained when the powder length is reduced and the mass and impact velocity of the punch are increased.
References
[1] Richtmyer,R.D. and Morton,K.W.: Difference Methods for Initial Value Problems. Interscience Pub., 1957.
[2] Sano,Y.: Journal of the Japan Society of Powder and Powder Metallurgy, 21-1, p.1, 1974.
[3] Sano,Y. and Sugita,I.: Journal of the Japan Society of Powder and Powder Metallurgy, 22-2, p.47, 1975.
[4] Sano,Y., Hagiwara,T. and Miyagi,K.: Journal of the Japan Society of Powder and Powder Metallurgy, 21-1, p.9, 1974.
[5] Sano,Y. and Miyagi,K.: The International Journal of Powder Metallurgy & Powder Technology, 20-2, p.115, 1984.
[6] Sano,Y., Miyagi,K. and Hirose,T.: The International Journal of Powder Metallurgy & Powder Technology, 14-4, p.291, 1978.
[7] Kawakita,K.: Journal of the Japan Society of Powder and Powder Metallurgy, 10-2, p.31, 1963.
Fig. 2 Time variations of the rigid punch mass, shock wave front and powder particles
Fig. 3 Time variations of the rigid punch mass, shock wave front and powder particles
Fig. 4 Time variations of the rigid punch mass, shock wave front and powder particles
Fig. 6 Mean green density vs. initial kinetic energy
Fig. 7 Mean green density vs. punch velocity
Advances in Logic, Artificial Intelligence and Robotics 17
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
1 Introduction
Several languages for knowledge representation, e.g., KRL (Knowledge Representation Language) [1] and the D-script language proposed by Moore [6], have been studied in artificial intelligence, in particular in the fields of knowledge representation and natural language understanding. If we wish to implement reasoning systems for those languages in a programming language such as PROLOG, it may be necessary to translate the languages into ordinary predicate logics so that they can be implemented in PROLOG, or to consider automated deduction systems (theorem proving systems). It is known that KRL can be translated into a many-sorted free description theory with equality, except for reflexive reasoning [5].
Free logics are logics that can deal with undefined objects (non-existing objects) [13, 15], and are indispensable for dealing with descriptions. In this paper we give an algorithm for automated theorem proving in this many-sorted free description theory with equality, called FDn, and show that the theorem prover is complete with respect to the translation from FDn into a first order predicate logic. FDn is an extension of Scott [13]. The automated theorem proving method proposed in this paper is based on a translation between the free description theory and a standard first order theory. Based on the translation, an automated deduction method for the standard many-sorted theory is proposed to implement the many-sorted description theorem prover. The translation-based automated deduction method was originally
18 K. Nakamatsu and A. Suzuki / Automated Theorem Proving
proposed in Nakamatsu and Suzuki [7, 14] as theorem provers for modal logics and has
been developed by Nakamatsu et al. [8, 9, 10, 11, 12] as the annotated semantics for
some non-classical logics.
This paper is organized as follows. First, we show that any formula of FDn can be translated into a formula of a standard many-sorted theory with equality called SEn, and that the translation preserves provability, i.e., P is a theorem of FDn iff the translation of P is a theorem of SEn. Next, we describe the theorem prover for the many-sorted theory with equality and how to implement it. We use an automated deduction method that is an extension of the RUE-NRF deduction proposed by Digricoli [3].
The reader is assumed to have basic knowledge of theorem proving and logic [2].
2 Formal Preliminaries
In this section, we define the many-sorted first order theory with equality SEn and the
many-sorted free description theory with equality FDn.
2.1 Languages
We define the many-sorted (n-sorted) languages LS and LF used for the theories SEn and FDn, respectively. The language LS is defined as follows:
(1) variables x1, y1, z1, …; x2, y2, z2, …; …; xn, yn, zn, … (subscripts indicate the sorts);
(6) if each xi (i = 1, …, n) is a variable of the i-th sort and A is a formula, then ιxiA is a term.
We note that each Ei (i = 1, …, n) is a particular unary predicate symbol in LS not occurring in LF.
2.2 Syntax
We describe the axiom schemata and the inference rules of SEn and FDn.
Axioms of SEn: let A, B be any formulas, each xi (i = 1, …, n) be any variable of the i-th sort, and t, s be any terms of the same sort.
A1. A, if A is a tautology.
A2. ∀xi(A → B) → (∀xiA → ∀xiB).
A3. t = t.
A4. t = s ∧ P(…, t, …) → P(…, s, …), where P is a predicate symbol.
A5. t = s → fi(…, t, …) = fi(…, s, …), where fi is a function symbol of the i-th sort (i = 1, …, n).
Inference Rules of SEn: let A, B be any formulas and each xi (i = 1, …, n) be a variable of the i-th sort.
R1. If A and A → B, then B (Modus Ponens). We write ⊢ A to mean that A is provable.
R2. If A → B and a variable xi of the i-th sort (i = 1, …, n) is not free in A, then A → ∀xiB.
The axiom schemata of FDn can be obtained by adding the following axiom schemata, and the inference rules of FDn are the same as those of SEn.
A6. ∀yi∃xi(xi = yi).
A7. ∀yi(yi = ιxiA(xi) ≡ ∀xi(xi = yi ≡ A(xi))).
A8. ~∃yi(yi = ιxiA(xi)) → ιxiA(xi) = ιxi(xi ≠ xi).
xi and yi (i = 1, …, n) are variables of the i-th sort.
2.3 Semantics
A model of SEn is an ordered (n+1)-tuple M = ⟨D1, …, Dn, V⟩, where each Di (i = 1, …, n) is a non-empty set called a domain and V is a value assignment satisfying the following conditions:
(1) for each variable xi, V(xi) ∈ Di (i = 1, …, n);
(2) for each constant (0-ary function symbol) ci, V(ci) ∈ Di (i = 1, …, n);
(3) for each m-ary function symbol fi (m > 0), V(fi) is a unique mapping from Dj1 × … × Djm to Di satisfying the condition
V(fi(t1, …, tm)) = V(fi)(V(t1), …, V(tm));
(4) for any formula A and variable xi of the i-th sort (i = 1, …, n), V(∀xiA) = 1 iff V′(A) = 1 for every assignment V′ such that V′(xi) ∈ Di and V′(yj) = V(yj) for every variable yj distinct from xi.
A model of FDn is an ordered (2n+1)-tuple ⟨G1, …, Gn, Ḡ1, …, Ḡn, U⟩, where each Gi is the domain of properly existing individuals of the i-th sort (i = 1, …, n), each Ḡi is the domain of improperly existing individuals of the i-th sort (Ḡi is non-empty), and U is a value assignment satisfying the following conditions:
[1] for each variable xi of the i-th sort, U(xi) ∈ (Gi ∪ Ḡi) (i = 1, …, n);
[2] for each constant ci of the i-th sort, U(ci) ∈ (Gi ∪ Ḡi) (i = 1, …, n);
[3] for each m-ary function symbol fi (m > 0), U(fi) is a unique mapping from (Gj1 ∪ Ḡj1) × … × (Gjm ∪ Ḡjm) to (Gi ∪ Ḡi) satisfying the analogous condition, where each jk (1 ≤ k ≤ m) is one of the integers 1, …, n and i = 1, …, n;
[4] for each m-ary predicate symbol P, U(P) is some set of ordered m-tuples, each of the form ⟨W1, …, Wm⟩, where each Wj ∈ (Gij ∪ Ḡij) (j = 1, …, m) and ij is one of the integers 1, …, n;
[5] for any formula A and variable xi of the i-th sort (i = 1, …, n), the assignment is defined below.
The assignments for formulas in FDn are defined in the same way as those for SEn, except for (4). The assignment U(∀xiA) is defined by modifying item (4) to item (4′) as follows:
(4′) for any formula A and variable xi of the i-th sort (i = 1, …, n), U(∀xiA) = 1 iff U′(A) = 1 for every assignment U′ such that U′(xi) ∈ Gi and U′(yj) = U(yj) for every variable yj distinct from xi (i = 1, …, n).
[Definition 2]
A formula A in FDn is valid iff U(A) = 1 for every assignment U in every FDn-model.
Note: the system FDn can be regarded as an extension and a modification of Scott's system [13].
Some methods to translate modal predicate logics to two-sorted first order logics have
been proposed by Nakamatsu and Suzuki [7, 14]. In this section, we propose translation
rules from FDn into SEn and show that the translation preserves provability between
FDn and SEn.
[Definition 3] (*-translation)
Given any term t and any formula A in FDn, the *-translation is defined as follows:
1. for any variable xi of the i-th sort, (xi)* = xi (i = 1, …, n);
2. for any constant ci of the i-th sort, (ci)* = ci (i = 1, …, n);
3. for any function symbol fi of the i-th sort, (fi(t1, …, tm))* = fi(t1*, …, tm*);
4. for a description ιxiA(xi) (i = 1, …, n),
(i) let P(ιxi1A1(xi1), …, ιximAm(xim)) stand for P(t1, …, tm), where ιxi1A1(xi1), …, ιximAm(xim) are all the outermost descriptions occurring in t1, …, tm, left to right; then
6. let A and B be any formulas; (~A)* = ~A*, (A → B)* = A* → B*, (∀xiA)* = ∀xi(Ei(xi) → A*) (i = 1, …, n).
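The compositional clauses of the *-translation can be sketched in code. Below is a minimal sketch, assuming formulas are represented as nested tuples; descriptions (clause 4) are omitted, and the names (`star`, the `'E'` tag for the existence predicates) are illustrative, not from the paper.

```python
# A minimal sketch of the *-translation (clause 6): negation and implication
# translate compositionally, and each universal quantifier over sort i is
# relativized by the existence predicate E_i.  Descriptions are omitted.

def star(formula):
    op = formula[0]
    if op == 'atom':                      # ('atom', P, args) is left as-is
        return formula
    if op == 'not':                       # (~A)* = ~A*
        return ('not', star(formula[1]))
    if op == 'imp':                       # (A -> B)* = A* -> B*
        return ('imp', star(formula[1]), star(formula[2]))
    if op == 'forall':                    # (forall x_i A)* = forall x_i (E_i(x_i) -> A*)
        var, sort, body = formula[1], formula[2], formula[3]
        return ('forall', var, sort,
                ('imp', ('atom', ('E', sort), (var,)), star(body)))
    raise ValueError(op)

# forall x_1 P(x_1) becomes forall x_1 (E_1(x_1) -> P(x_1)).
f = ('forall', 'x', 1, ('atom', 'P', ('x',)))
translated = star(f)
```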
- Di = Gi ∪ Ḡi;
- for any variable xi of the i-th sort, V(xi) = U(xi);
- for a particular constant ai of the i-th sort, V(ai) ∈ Ḡi;
Hence,
Induction Step. If the formula A has the form ~B or B → C, the result is clear.
where U′ is the assignment such that U′(xi) = ei and U′(yj) = U(yj) for every yj distinct from xi,
iff
if gi is the unique element of Gi such that U(B(xi)) = 1, then U(ιxiB(xi)) = U(xi) = gi and U(P(t1, …, xi, …, tm)) = 1, or if there is no such element, then U(ιxiB(xi)) = U(xi) = U(ai) = ai ∈ Ḡi and U(P(t1, …, xi, …, tm)) = 1,
iff
V(∃xi(((Ei(xi) ∧ ∀yi(Ei(yi) → (B*(yi) ≡ xi = yi))) ∨ (~(Ei(xi) ∧ ∀yi(Ei(yi) → (B*(yi) ≡ xi = yi))) ∧ xi = ai)) ∧ P(t1, …, xi, …, tm))) = 1
iff
if ei is the unique element of Gi such that V(B*(yi)) = 1 and V(xi) = ei, or there is no such element and V(xi) = V(ai) ∈ Ḡi, then V(P(t1, …, xi, …, tm)) = 1.
Since B(xi) contains no description, from [Case 1] and by induction,
U(B(xi)) = 1 iff V(B*(yi)) = 1.
Hence,
U(P(t1, …, ιxiB(xi), …, tm)) = 1 iff V((P(t1, …, ιxiB(xi), …, tm))*) = 1.
In [Case 2], for i = 1, …, n, the induction steps are the same as in [Case 1].
Q.E.D.
Given an SEn-model M* = ⟨D1, …, Dn, V⟩, an FDn-model M = ⟨G1, …, Gn, Ḡ1, …, Ḡn, U⟩ can be defined. For each i = 1, …, n:
- for a particular constant ai of the i-th sort, U(ai) = V(ai) = ai ∈ Ḡi, therefore each Ḡi is non-empty;
- for the other constants ci of the i-th sort, U(ci) = V(ci);
- for any variable xi of the i-th sort, U(xi) = V(xi);
- for any function symbol fi of the i-th sort, U(fi) = V(fi);
- for any predicate symbol P, U(P) = V(P).
Thus, we have another lemma, which can be proved similarly to [Lemma 1].
[Lemma 2] For any formula A and its *-translation A*,
V(A*) = 1 in the SEn-model M* iff U(A) = 1 in the FDn-model M.
Considering the completeness of the *-translation, we have the following theorem.
[Theorem 1] For any formula A in FDn,
A is provable in FDn iff A* is provable in SEn.
[Proof] We assume the completeness of FDn and SEn, which can be proved in a way similar to Scott [13] and Wang [16].
Proof of if-part. Suppose A is provable in FDn; then A is valid in FDn. Moreover, A is valid in FDn iff U(A) = 1 in every FDn-model M = ⟨G1, …, Gn, Ḡ1, …, Ḡn, U⟩. Then, from [Lemma 1], V(A*) = 1 in the corresponding SEn-model M* = ⟨D1, …, Dn, V⟩. Hence, A* is valid and provable in SEn.
Proof of only if-part. Suppose A* is provable in SEn; then A* is valid in SEn. Moreover, A* is valid in SEn iff V(A*) = 1 in every SEn-model M* = ⟨D1, …, Dn, V⟩. Then, from [Lemma 2], U(A) = 1 in the corresponding FDn-model M = ⟨G1, …, Gn, Ḡ1, …, Ḡn, U⟩. Hence, A is valid and provable in FDn.
Q.E.D.
We show that any formula F is a theorem of FDn iff there is a deduction of the empty clause □ from the set of clauses constructed in [Step 4].
[Definition 4] (unsatisfiability)
If a formula F in SEn is evaluated to 1 in a model M = ⟨D1, …, Dn, V⟩, we say that M satisfies the formula F. A formula F is unsatisfiable (inconsistent) iff there exists no model that satisfies F.
[Definition 5] (E-satisfiability)
A formula F is called E-satisfiable iff there is a model that satisfies the formula F
and the equality axioms that are stated below. Otherwise, F is called E-unsatisfiable.
[Equality Axiom]
Since a model of SEn satisfies the equality axioms, any formula F in SEn is unsatisfiable iff the formula F is E-unsatisfiable. Since it is straightforward from [Theorem 1] that a formula F in FDn is a theorem iff ~F* in SEn is E-unsatisfiable, we show that
(i) ~F* is E-unsatisfiable iff the clause set S constructed from ~F* is E-unsatisfiable;
(ii) the clause set S is E-unsatisfiable iff there is an RUE-NRF deduction of the empty clause □ from the clause set S. This completeness can be proved similarly to Digricoli [3], because our theorem prover is an extension of the RUE-NRF to a many-sorted theory.
It is obvious that [Step 1] and [Step 2] preserve E-unsatisfiability. Thus, we show it for [Step 3] and [Step 4].
[Theorem 2] Let S be a set of clauses that represents the standard form of the
formula F in SEn. Then,
the formula F is E-unsatisfiable iff the set S is E-unsatisfiable.
[Proof] It can be proved similarly to the proof of Theorem 6 in Suzuki and Nakamatsu [14], where we have shown this theorem for a two-sorted first order theory.
Q.E.D.
Here we give an outline of the RUE-NRF for the many-sorted theory. We consider 4 and 5 of the [Equality Axiom]. From 4, we can deduce formula (1), where s and t are terms of the same sort. From (1), we can infer s ≠ t. This rule of inference is called the Negative Reflexive Function rule (NRF for short). Next, from 5, we can deduce P(s) and ~P(t). This rule of inference is called Resolution by Unification and Equality (RUE for short). In order to give rigorous definitions of these rules later, we introduce some definitions of terminology in theorem proving.
[Definition 6] (disagreement set)
Given two terms s, t of the same sort, a disagreement set is defined as follows:
- if s, t are not identical, the set whose single element is the pair (s, t) is the origin disagreement set of s, t;
A disagreement set of a pair of complementary literals P(s1, …, sm) and ~P(t1, …, tm) is the union D = D1 ∪ … ∪ Dm, where Dj is a disagreement set of (sj, tj).
The MGU (most general unifier) is used in standard resolution; on the other hand, the MGPU (most general partial unifier) is used in the RUE-NRF. The MGPU is the substitution used when two terms s, t are not completely unifiable.
[Definition 7] (difference set)
Let W be a non-empty set of expressions. The first difference set is obtained by locating the first position (counting from the left) at which not all the expressions in W have exactly the same symbol and then extracting from each expression the subexpression that begins with the symbol occupying that position. We then resume the comparison in each expression of W at the first position after the subexpressions used to define the first difference set, find the next point of disagreement, and again extract the corresponding subexpressions, which comprise the second difference set of W. If the elements of W are not identical, k difference sets d1, d2, …, dk (k ≥ 1) are constructed in this fashion. Then, the MGPU Algorithm is defined to compute a MGPU substitution for W.
[Definition 8] (MGPU, MGPU Algorithm)
MGPU Algorithm
Let W be a set of expressions and each dj be a difference set as defined in [Definition 7].
(1) Set j = 1, σ = {}, where σ is a substitution.
(2) Find dj for W as previously described; if it does not exist, terminate with σ a MGPU of the set W.
(3) If dj contains a variable vi of the i-th sort as a member and a term ti of the i-th sort that does not contain vi, then let σ = σ ∪ {ti/vi} and W = W{ti/vi}. Go to (2).
(4) If dj does not contain such a variable, then let j = j + 1 and go to (2).
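The algorithm above can be sketched as follows. This is a minimal sketch, assuming terms are nested tuples with tagged variables and constants; the helper names and the term representation are illustrative, and the sort-agreement check of step (3) is left implicit (the example uses a single sort).

```python
# A minimal sketch of the MGPU algorithm of [Definition 8], with terms as
# ('var', name, sort), ('const', name, sort) or (fname, arg1, ..., argk).

def subst(term, sigma):
    """Apply a substitution (dict: variable tuple -> term) to a term."""
    if term[0] == 'var':
        return sigma.get(term, term)
    if term[0] == 'const':
        return term
    return (term[0],) + tuple(subst(arg, sigma) for arg in term[1:])

def occurs(var, term):
    """Occurs check: does var occur in term?"""
    if term == var:
        return True
    if term[0] in ('var', 'const'):
        return False
    return any(occurs(var, arg) for arg in term[1:])

def difference_sets(s, t):
    """All points of disagreement between s and t, left to right
    (cf. [Definition 7]); the list is empty iff the terms are identical."""
    if s == t:
        return []
    if s[0] in ('var', 'const') or t[0] in ('var', 'const') or s[0] != t[0]:
        return [(s, t)]                   # heads differ: whole subterms disagree
    out = []
    for a, b in zip(s[1:], t[1:]):        # same head: descend argument-wise
        out.extend(difference_sets(a, b))
    return out

def mgpu(s, t):
    """Repeatedly bind a variable from a difference set whose other member
    does not contain it (step (3)); unresolvable sets are skipped (step (4)).
    Returns (sigma, remaining disagreement pairs)."""
    sigma = {}
    while True:
        ds = difference_sets(s, t)
        for a, b in ds:
            if a[0] == 'var' and not occurs(a, b):
                bind = {a: b}
                break
            if b[0] == 'var' and not occurs(b, a):
                bind = {b: a}
                break
        else:
            return sigma, ds              # no resolvable set left: sigma is a MGPU
        sigma = {v: subst(tm, bind) for v, tm in sigma.items()}
        sigma.update(bind)
        s, t = subst(s, bind), subst(t, bind)

# f(x, a) and f(b, y): both disagreements bind a variable, so here the
# MGPU is a full unifier and no disagreement remains.
x, y = ('var', 'x', 1), ('var', 'y', 1)
a, b = ('const', 'a', 1), ('const', 'b', 1)
sigma, leftover = mgpu(('f', x, a), ('f', b, y))
```

When a difference set contains no bindable variable (e.g. two distinct constants), the pair is returned unresolved; in the RUE-NRF deduction such pairs become the inequality literals of the resolvent.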
We show that the following formula is a theorem of FDn as an example of the theorem proving:
∀y(y = ιxA(x) → ∀x(x = y ≡ A(x))), (2)
where x, y are variables of the i-th sort (i = 1, …, n). Let us describe the process of the RUE-NRF deduction of the formula (2).
The *-translation of the formula (2) is
∀y(E(y) → (∀xP(x, y) → ∀xQ(x, y))), (3)
where
P(x, y) = E(x) → ((E(x) → ∀zQ(z, x)) ∨ (~(E(x) ∧ ∀zQ(z, x)) ∧ x = a) → y = x),
Q(x, y) = E(x) → (x = y ≡ A*(x)), E(x) = Ei(x),
and z is a variable of the i-th sort. The negation of the formula (3) is
∃y(E(y) ∧ P(x, y) ∧ ∃x ~Q(x, y)).
Skolemization gives
E(c) ∧ P(x, c) ∧ ~Q(d, c),
where c, d are Skolem functions, and
P(x, c) = (c = x ∨ ~E(x) ∨ ~Q(f(x), x)) ∧ (c = x ∨ x ≠ a) ∧
(~E(x) ∨ Q(z, x) ∨ x ≠ a) ∧ (~E(x) ∨ Q(z, x) ∨ x ≠ a).
Then, the set
S = { E(c), (4)
c = x ∨ ~E(x) ∨ ~Q(f(x), x), (5)
c = x ∨ x ≠ a, (6)
~E(x) ∨ Q(z, x) ∨ x ≠ a, (7)
~Q(d, c) } (8)
of clauses is produced, and by the RUE-NRF deduction:
Q(z, c) ∨ c ≠ a, by RUE, from (4) and (7), (9)
c ≠ a, by RUE, from (8) and (9), (10)
a ≠ a, by RUE, from (6) and (10), (11)
□ (the empty clause), by NRF, from (11).
5 Conclusion
In this paper, we have given an algorithm for automated theorem proving in the many-sorted free description theory with equality FDn. In the algorithm, the free description theory FDn is translated into the standard many-sorted theory SEn without descriptions, and the translation is somewhat complicated. Thus, we need some strategies to improve the efficiency of the RUE-NRF deduction.
Considering the efficiency of theorem proving for many-sorted logics, translating many-sorted logics into standard ones may not be adequate. Therefore, we give an automated theorem proving method for the many-sorted theory SEn as it is.
References
1 Introduction
Nonmonotonic reasoning such as default reasoning is widely utilized, and the treatment of inconsistency has become more important in AI fields. For example, default reasoning is used for belief revision techniques in knowledge representation systems, and the treatment of contradictions between agents in a multi-agent system is one of the hot topics. More intelligent knowledge systems would require more than two kinds of nonmonotonic reasoning as well as the capability to deal with inconsistency, such as NOGOOD in a nonmonotonic ATMS. However, it is difficult to deal with these kinds of nonmonotonic reasoning and inconsistency uniformly, since they have different semantics. Thus, we try to represent uniformly the declarative semantics for such nonmonotonic reasoning and inconsistency based on annotated logic. We have already characterized some kinds of nonmonotonic reasoning with inconsistency by annotated logic programs [8, 9, 10, 11, 12, 13]. In this paper, we characterize Negation as Failure by an annotated semantics called the annotated completion and prove its soundness and completeness.
Annotated logics are a family of paraconsistent logics that were proposed initially by
Subrahmanian [14] and da Costa [3]. They have been applied to develop the declarative
semantics for inheritance networks and object-oriented databases [15].
Among the derivation rules for negative information in logic programming systems and knowledge bases, Clark's Negation as Failure (NF for short) [2] is a nonmonotonic reasoning rule. Many researchers have given logical interpretations to such rules. Fitting [4] and Kunen [6] provided explanations of logic programs with negation based on logical models. Balbiani [1] and Gabbay [5] showed that SLDNF-provability has a modal meaning and can be expressed in a modal logic. However, their semantics for these rules cannot deal with inconsistency such as contradictions between agents in multi-agent systems. Therefore, we give a different declarative semantics for NF based on annotated logic to deal with inconsistency. With respect to this semantics, we show the soundness and completeness of NF.
30 K. Nakamatsu and A. Suzuki /Annotated Logic and Negation as Failure
The NF rule is an inference rule in logic programming which gives the value false to a ground atom if the logic program cannot give a proof of that ground atom. If A is a ground atom, the NF rule can be interpreted informally as follows: if every attempt to prove A fails finitely, then infer ~A.
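The informal reading above can be sketched over a ground program. This is a minimal sketch, assuming the program is represented as a dict from atoms to lists of clause bodies; a depth bound stands in for loop handling, and all names are illustrative.

```python
# A minimal sketch of the NF rule over a ground logic program.

def succeeds(program, atom, depth=50):
    """Top-down proof search: atom succeeds if the body of some clause
    for it succeeds literal by literal."""
    if depth == 0:
        raise RecursionError('possible loop')
    return any(all(succeeds(program, b, depth - 1) for b in body)
               for body in program.get(atom, []))

def nf(program, atom):
    """Negation as Failure: infer ~atom when every attempt to prove atom
    fails finitely."""
    return not succeeds(program, atom)

P = {'p': [['q']],   # p <- q
     'q': [[]],      # q.
     'r': [['s']]}   # r <- s  (there is no clause for s, so s finitely fails)
# nf(P, 'r') holds, since r finitely fails; nf(P, 'p') does not, since p succeeds.
```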
We give the semantics for the NF by means of the annotated completion for logic programs, which differs from Clark's completion [2] in that the underlying logic is annotated logic rather than classical logic. An annotated logic AL is a paraconsistent and multi-valued logic based on a complete lattice of truth values. However, AL provides a simple formalism that is closer to classical logic than ordinary multi-valued logics. For example, each atom in AL has a truth value as an annotation and the formula itself can be interpreted in the classical fashion. Generally a goal (ground atom) A either succeeds, finitely fails or loops in logic programming. We consider the complete lattice T of the truth values {⊥, f, s, ⊤} shown in Figure 1: ⊤ (inconsistent) is the top element, f (failure) and s (success) are incomparable, and ⊥ (loop) is the bottom element. The language of AL is similar to that
of an ordinary logic. The only syntactic difference is that atomic formulas in AL are constructed from those in the ordinary logic by appending to them annotations that are drawn from the lattice T. If A is a ground atom, then (A : μ) is called an annotated atom, where μ ∈ T; μ is called an annotation of A. AL has two kinds of negations; one of them is an epistemic negation (¬), a unary function from an annotation to an annotation such that ¬(⊥) = ⊥, ¬(f) = s, ¬(s) = f, ¬(⊤) = ⊤.
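The lattice T and the epistemic negation can be sketched as follows. The string encodings (`'bot'` for the loop value ⊥, `'top'` for the inconsistent value ⊤) and the helper names are illustrative assumptions.

```python
# A minimal sketch of the lattice T = {bot, f, s, top} of Figure 1 and the
# epistemic negation, which swaps f and s while fixing bot and top.

ELEMS = ('bot', 'f', 's', 'top')
ORDER = {('bot', 'bot'), ('bot', 'f'), ('bot', 's'), ('bot', 'top'),
         ('f', 'f'), ('f', 'top'), ('s', 's'), ('s', 'top'),
         ('top', 'top')}                 # all pairs mu <= nu; f, s incomparable

def leq(mu, nu):
    return (mu, nu) in ORDER

def lub(mu, nu):
    """Least upper bound in T, e.g. lub(f, s) = top."""
    ubs = [k for k in ELEMS if leq(mu, k) and leq(nu, k)]
    for k in ubs:
        if all(leq(k, j) for j in ubs):
            return k

# Epistemic negation as a table.
NEG = {'bot': 'bot', 'f': 's', 's': 'f', 'top': 'top'}
```

The join of f and s being ⊤ is exactly what lets an interpretation record that an atom is reported both to succeed and to finitely fail, e.g. by two contradicting agents.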
We now address the semantics for AL. Let I be an interpretation and BA be a set of ground atoms. An interpretation I of AL over T may be considered to be a mapping I : BA → T. Usually I is denoted by a set { (p : ⊔μi) | I ⊨ (p : μ1) ∧ … ∧ (p : μn) }, where ⊔μi is the least upper bound of {μ1, …, μn}.
[Definition 1] (satisfaction)
An interpretation I is said to satisfy
The satisfaction of the other formulas is defined as in ordinary logic. The details of the syntax and semantics of AL can be found in [3].
The reasons motivating our choice of AL to study the problem of the semantics for NF are explained as follows. An annotated atom (A : μ) can represent the derivation of the atom A by its annotation μ:
(A : ⊥) the ground atom A is known to loop, i.e., A is known to neither succeed nor finitely fail;
(A : f) the ground atom A is known to finitely fail;
(A : s) the ground atom A is known to succeed;
(A : ⊤) the ground atom A is known to both succeed and finitely fail.
Usually the annotation ⊤ does not appear when characterizing ordinary logic programs with NF. However, it is necessary for expressing inconsistency in cases where multi-agent systems include contradictions between their agents.
We have the following equivalences based on the property of epistemic negation.
3 Annotated Completion
The most widely accepted declarative semantics for NF uses the "completed database" introduced by Clark [2]. These formulas are usually called the completion or Clark's completion of a program P and are denoted by Comp(P). The aim of the completion is to logically characterize the SLD finite failure set of the program P.
We propose the annotated completion formulas and prove the soundness and completeness theorems for NF with respect to the annotated completion. First, we review Clark's completion. The idea of Clark's completion is to consider that a predicate is totally defined by the clauses whose heads have the same predicate symbol. This property is syntactically denoted in Comp(P) as an equivalence between a predicate symbol and the disjunction of clause bodies. In the general case of Comp(P), each clause
L1 ∧ … ∧ Lm → R(t1, …, tn)
in the program P in which the predicate symbol R appears in the head is taken and rewritten in a general form:
∃y1 … ∃ym(x1 = t1 ∧ … ∧ xn = tn ∧ L1 ∧ … ∧ Lm) → R(x1, …, xn),
where x1, …, xn are new variables (i.e., not already occurring in any of these clauses) and y1, …, ym are the variables of the original clause. If the general forms of all these clauses are E1 → R(x1, …, xn), …, Ek → R(x1, …, xn), the completed definition of R is ∀x1 … ∀xn(R(x1, …, xn) ≡ E1 ∨ … ∨ Ek).
The completion Comp(P) of the predicate R is defined to be the set of completed definitions of each predicate symbol in P together with the equality and freeness axioms called CET (Clark's Equational Theory) [7]. We define the annotated completion formulas AComp(P) and prove that NF is sound and complete for the annotated completion AComp(P). Let P be a program whose completed definition of R is split into
∀x1 … ∀xn(E1 ∨ … ∨ Em → R(x1, …, xn)),
∀x1 … ∀xn(~E1 ∧ … ∧ ~Em → ~R(x1, …, xn)).
Note: We assume that the axiomatic system of AL includes CET and that the interpretation of equality is given as usual.
Let us replace each literal Li (1 ≤ i ≤ m) and R(t1, …, tn) by the corresponding annotated literals (Li : s) (1 ≤ i ≤ m) and (R(t1, …, tn) : s), respectively.
[Definition 2] (Annotated Completion AComp(P))
We obtain the formulas that constitute the annotated completed definition of a predicate symbol R: the positive completed definition (3) and the negative completed definition (4). We denote by P+ the set of positive completed definitions (3) and by P− the set of negative completed definitions (4) of the predicate R, and define
AComp(P) = P+ ∪ P−.
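For a ground program, the construction of P+ and P− can be sketched by replacing each body literal B with (B:s) in the positive definitions and (B:f) in the negative ones. The dict representation and the rendering of formulas as plain strings below are illustrative assumptions, not the paper's notation.

```python
# A minimal sketch of building the annotated completed definitions for a
# ground program (a dict mapping each atom to the list of its clause bodies).

def acomp(program):
    """For each predicate R: the positive completed definition says that if
    some clause body holds with annotation s, then (R:s); the negative one
    says that if every clause body contains a literal with annotation f,
    then (R:f)."""
    pos, neg = {}, {}
    for head, bodies in program.items():
        e_s = ['(' + ' & '.join(f'({b}:s)' for b in body) + ')' if body
               else 'true' for body in bodies]
        e_f = ['(' + ' v '.join(f'({b}:f)' for b in body) + ')' if body
               else 'false' for body in bodies]
        pos[head] = (' v '.join(e_s) or 'false') + f' -> ({head}:s)'
        neg[head] = (' & '.join(e_f) or 'true') + f' -> ({head}:f)'
    return pos, neg

P = {'p': [['q']],   # p <- q
     'q': [[]],      # q.
     'r': []}        # no clause defines r
pos, neg = acomp(P)
# pos['p'] is '((q:s)) -> (p:s)'; neg['r'] is 'true -> (r:f)'.
```

Note how a fact such as q blocks its own negative definition ('false -> (q:f)'), while an undefined atom such as r gets 'true -> (r:f)': exactly the NF behavior.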
In AComp(P), the annotated atom (A : s), which has s as its annotation, should be interpreted as "A succeeds in P", that is to say, "there is an SLD-refutation of A in the program P", and the annotated atom (A : f), which has f as its annotation, should be interpreted as "A fails finitely in the program P", that is to say, "there is an SLD finitely-failed tree of A in P". Generally, the SLD-derivation of A either succeeds, finitely fails or loops. In the annotated completion of the program P, the annotations s and f can be regarded as representing "succeeds" and "finitely fails", respectively. If every attempt to refute A in the program P loops on A, then A neither succeeds nor finitely fails in the program P.
We show, by a simple example, how the annotated completion AComp(P) describes the meaning of logic programs with NF in which inconsistency is included.
[Example 1] Let
We show the soundness and completeness theorems for NF with respect to the annotated completion. We recapitulate some concepts of ordinary logic programming before giving the theorems.
UL denotes a Herbrand universe and BL denotes a Herbrand base. We identify as usual a Herbrand interpretation IH with a subset of BL. ground(P) denotes the set of all instantiations of clauses in a logic program P. A mapping TP from the set of Herbrand interpretations to itself is defined as follows: for every Herbrand interpretation IH,
TP(IH) = { A | there is a clause L1 ∧ … ∧ Lm → A in ground(P) such that Li ∈ IH (1 ≤ i ≤ m) }.
Then the finite failure set FF of P and the upward iteration of TP are defined recursively.
[Definition 3] Let FFd be the set of ground atoms in BL which finitely fail in P at depth d [1, 7].
1. FF0 = BL \ TP(BL),
2. A ∈ FFd+1 if for every clause L1 ∧ … ∧ Lm → A in ground(P) there is an integer i (1 ≤ i ≤ m) such that Li ∈ FFd.
TP ↑ 0 = ∅,
TP ↑ α = TP(TP ↑ (α − 1)), where α is a successor ordinal,
TP ↑ λ = ⊔{ TP ↑ α | α < λ }, where λ is a limit ordinal and ⊔ denotes the least upper bound.
The set of Herbrand interpretations can be ordered by the usual set inclusion relation and forms a complete lattice. The least fixpoint of TP is the set TP ↑ ω, and it is equivalent to the least Herbrand model of P.
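For a finite ground program, TP, its upward iteration and the sets FFd can be sketched directly. The dict representation and the function names below are illustrative assumptions.

```python
# A minimal sketch of the T_P operator, its upward iteration, and the
# finite failure sets FF_d over a ground program (a dict from atoms to
# lists of clause bodies).

def t_p(program, interp):
    """One bottom-up step: derive every head one of whose bodies holds in interp."""
    return {head for head, bodies in program.items()
            if any(all(b in interp for b in body) for body in bodies)}

def lfp(program):
    """Iterate T_P upward from the empty interpretation; for a finite ground
    program this reaches T_P up to omega, the least Herbrand model."""
    interp = set()
    while True:
        nxt = t_p(program, interp)
        if nxt == interp:
            return interp
        interp = nxt

def ff(program, base, depth):
    """FF_d of [Definition 3]: FF_0 = base minus T_P(base); afterwards an
    atom fails when every clause for it has some body literal that failed."""
    failed = base - t_p(program, base)
    for _ in range(depth):
        failed = {a for a in base
                  if all(any(b in failed for b in body)
                         for body in program.get(a, []))}
    return failed

P = {'p': [['q']],   # p <- q
     'q': [[]],      # q.
     'r': [['r']]}   # r <- r  (r loops: it neither succeeds nor finitely fails)
# lfp(P) is {'p', 'q'}; ff(P, {'p','q','r','s'}, 3) is {'s'}.
```

The looping atom r ends up neither in the least model nor in any FFd, which is exactly the ⊥ case of the annotation lattice.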
Now we define a different mapping TN that maps an interpretation of annotated
formulas into itself.
[Definition 4] Let JH be a Herbrand interpretation of the annotated logic. Then
TN ↑ 0 = Δ,
where Δ is an initial interpretation of the annotated logic.
Induction Hypothesis. There is an integer α ≥ 1 such that, for any ground atom A, if A is a logical consequence of P, i.e., A ∈ TP ↑ α, any interpretation that satisfies P+ is a model of (A : s), i.e., P+ ⊨ (A : s).
Induction Step. d = α + 1. In this case, there is a clause
B1 ∧ … ∧ Bm → A
in ground(P) such that, for any integer i (1 ≤ i ≤ m),
Bi ∈ TP ↑ α.
Thus, by the induction hypothesis, any interpretation that satisfies P+ is a model of
(B1 : s) ∧ … ∧ (Bm : s).
Then, there is a clause
B1 ∧ … ∧ Bm → R(t1′, …, tn′)
in P and
P+ ⊨ ∃y1 … ∃yk(t1 = t1′ ∧ … ∧ tn = tn′ ∧ (B1 : s) ∧ … ∧ (Bm : s)).
Hence,
P+ ⊨ (R(t1, …, tn) : s).
Proof of ⇐.
The proof is by induction on the least integer d such that a Herbrand interpretation TN ↑ d ⊨ (A : s) whenever TN ↑ d is a model of P+.
Basis d = 1. In this case, we can assume that P+ ⊨ (A : s) and TN ↑ 1 is a model of P+ and (A : s). Then, there is a formula
∀x1 … ∀xn(E1 ∨ … ∨ EL → (R(x1, …, xn) : s))
in P+ and there is an integer i (1 ≤ i ≤ L) such that Ei is x1 = t1 ∧ … ∧ xn = tn. Then, there is a unit clause B in P such that its head can be unified with A, and
A ∈ TP ↑ 1.
Induction Hypothesis. There is an integer α ≥ 1 such that, for any ground atom A, if P+ ⊨ (A : s), then A is a logical consequence of P, i.e., A ∈ TP ↑ α.
Induction Step. d = α + 1. In this case, there is a clause
B1 ∧ … ∧ Bm → A
in ground(P) and a Herbrand model TN ↑ α of (B1 : s) ∧ … ∧ (Bm : s) such that it satisfies P+. Thus, by the induction hypothesis, there is a clause
B1 ∧ … ∧ Bm → A
in ground(P) such that, for any integer i (1 ≤ i ≤ m), Bi ∈ TP ↑ α, and hence A ∈ TP ↑ (α + 1).
in P− and, if P− is valid, (R(t1, …, tn) : f) is satisfied in the model of the annotated logic.
Induction Hypothesis. There is an integer α ≥ 1 such that, for any ground atom A, if A ∈ FFα, then P− ⊨ (A : f).
Induction Step. d = α + 1. In this case, for any clause
B1 ∧ … ∧ Bm → A
in ground(P), there is an integer i (1 ≤ i ≤ m) such that Bi ∈ FFα. Thus, by the induction hypothesis,
P− ⊨ (B1 : f) ∨ … ∨ (Bm : f).
Then, for any clause B1 ∧ … ∧ Bm → R(t1′, …, tn′) in P,
P− ⊨ ∀y1 … ∀yk(t1 = t1′ ∧ … ∧ tn = tn′ → (B1 : f) ∨ … ∨ (Bm : f)).
Consequently,
P− ⊨ (A : f).
Proof of ⇐.
The proof is by induction on the least integer d such that a Herbrand interpretation TN ↑ d satisfies (A : f) whenever TN ↑ d is a model of P−.
Basis d = 1. In this case, we can assume that P− ⊨ (A : f), and TN ↑ 1 can be a model of P− and (A : f). Then, there is a clause
in ground(P), there is an integer i (1 ≤ i ≤ m) such that if TN ↑ α satisfies P−, then TN ↑ α satisfies (Bi : f). By the induction hypothesis, for any clause
B1 ∧ … ∧ Bm → A
in ground(P), there is an integer i (1 ≤ i ≤ m) such that Bi ∈ FFα−1 and consequently
A ∈ FFα.
Q.E.D.
5 Remarks
In this paper, we have shown that an annotated logic is appropriate for providing the declarative semantics for NF, although many researchers have already given logical interpretations of it based on other logics. The main difference between the annotated semantics and the others is the treatment of inconsistency. For example, in the case of a multi-agent system having a contradiction between the beliefs or knowledge of its agents, we can formalize such a situation by the annotated completion, although we have not treated the handling of inconsistency in detail.
References
[1] Balbiani,P.: Modal Logic and Negation as Failure. J. Logic and Computation, 1 (1991) 331-356
[2] Clark,K.L.: Negation as Failure. In: H.Gallaire and J.Minker (Eds.), Logic and Databases, Plenum Press (1978) 293-322
[3] da Costa,N.C.A., Subrahmanian,V.S. and Vago,C.: The Paraconsistent Logic PT. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 37 (1989) 137-148
[4] Fitting,M.: A Kripke-Kleene Semantics for Logic Programs. J. Logic Programming, 5 (1985) 295-312
[5] Gabbay,D.M.: Modal Provability Foundations for Negation by Failure. Proc. Int'l Workshop on Extensions of Logic Programming, LNAI 475 (1989) 179-222
[6] Kunen,K.: Negation in Logic Programming. J. Logic Programming, 7 (1987) 91-116
[7] Lloyd,J.W.: Foundations of Logic Programming (2nd edition). Springer-Verlag (1987)
[8] Nakamatsu,K. and Suzuki,A.: Automatic Theorem Proving for Modal Predicate Logic. Trans. IECE Japan, E67 (1984) 203-210
[9] Nakamatsu,K. and Suzuki,A.: Annotated Semantics for Default Reasoning. Proc. PRICAI'94 (1994) 180-186
[10] Nakamatsu,K. and Suzuki,A.: A Non-monotonic ATMS Based on Annotated Logic Programs with Strong Negation. In: Agents and Multi-Agent Systems, LNAI 1441 (1998) 79-93
[11] Nakamatsu,K., Abe,J.M. and Suzuki,A.: A Defeasible Deontic Reasoning System Based on Annotated Logic Programming. Proc. 4th Int'l Conf. Computing Anticipatory Systems, AIP Conf. Proc. 573 (2000) 609-620
[12] Nakamatsu,K., Abe,J.M. and Suzuki,A.: Annotated Semantics for Defeasible Deontic Reasoning. Rough Sets and Current Trends in Computing, LNAI 2005 (2001) 470-478
[13] Nakamatsu,K.: On the Relation Between Vector Annotated Logic Programs and Defeasible Theories. Logic and Logical Philosophy, 8 (2002) 181-205
[14] Subrahmanian,V.S.: On the Semantics of Quantitative Logic Programs. Proc. 4th IEEE Symp. Logic Programming (1987) 178-182
[15] Thirunarayan,K. and Kifer,M.: A Theory of Nonmonotonic Inheritance Based on Annotated Logic. Artificial Intelligence, 60 (1993) 23-50
Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract. There is a permanent demand for new application and simulation software for power systems. The main purpose of this work is the development of a computational tool to help operators during the restoration of power substations and distribution systems. This tool was developed using an object-oriented approach integrated with intelligent agent technology. Object-oriented modeling has proved appropriate for power systems because of its reuse capability and abstraction. Furthermore, Multi-Agent Systems (MAS) are being applied to power system problems; good results have been obtained, and interest in this area has grown over the last two years.
1. Introduction
Due to the current Brazilian power system scenario, an overloaded system in an energy crisis caused by lack of investment and of rainfall, many actions have been taken to overcome this situation. The most prominent was a plan to control the energy supply, which lasted for 8 months and helped the water storage reservoirs recover to safe levels.

Nevertheless, there is a permanent demand for new application and simulation software, required for different purposes such as research, planning and power system operation. This software becomes larger and increasingly complex; as a consequence, it is more difficult to complete in time and within budget constraints, and very difficult to create with traditional software development technology. When finally finished, such applications are difficult to understand, to maintain, to integrate with old applications and to modify for new requirements.
Studies in computer science have shown that reuse may improve software development productivity and quality. Productivity increases as previously developed assets can be used in current applications, which saves development time. Quality may increase because frequently reused assets have been tested and corrected in different case studies.
The main purpose of this work is the development of a computational tool to help operators during the restoration of power substations and distribution systems. The restoration of the power system's normal configuration after a fault, or even a blackout, is performed by the intervention of a human operator. Considering the growing complexity of the arrangement of substations and power distribution systems, and the probability of human failure, the time spent executing the restoration actions grows and has to be optimized.
This tool was developed using an object-oriented approach integrated with intelligent agent technology. Object-oriented modeling has proved appropriate for power systems because of its reuse capability and abstraction [1-3].
A.R. Aoki et al. / Multi-agent System for Distribution System Operation
Furthermore, Multi-Agent Systems (MAS) are being used to solve power system problems; good results have been obtained, and interest in this area has grown over the last two years [4-6].
The implementation was done using a Java-based framework [7, 8], which provides a generic methodology for developing MAS architectures and a set of classes to support the implementation.
MAS are a way to reproduce real-life systems artificially through a model made of autonomous, interacting objects called agents [9, 10]. The main advantage of multi-agent simulation is that it allows the modeling of individual behavior, making it easy to obtain more realistic simulations. The behavior of the power system components can be simulated by agents that act in the same way.
The development methodology follows five stages: (i) identifying the agents, (ii) identifying the agent conversations, (iii) identifying the conversation rules, (iv) analyzing the conversation model, and (v) implementing the MAS. The system developed provides communication, linguistic and coordination support through Java classes. Communication support is provided for both directed communication and subject-based broadcast communication. This feature enables the development of scalable, fault-tolerant, self-configurable and flexible MAS.
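As a rough illustration of the directed and subject-based broadcast communication just described, the following Java sketch routes a message either to one named agent or to every subscriber of a subject. The class names (`MessageBus`, `Agent`) are ours for illustration; the actual classes of the framework in [7, 8] are not reproduced here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of directed and subject-based broadcast communication.
class MessageBus {
    private final Map<String, Agent> byName = new HashMap<>();
    private final Map<String, List<Agent>> bySubject = new HashMap<>();

    void register(Agent agent) { byName.put(agent.name, agent); }

    void subscribe(String subject, Agent agent) {
        bySubject.computeIfAbsent(subject, s -> new ArrayList<>()).add(agent);
    }

    // Directed communication: exactly one named recipient.
    void send(String recipient, String message) {
        Agent a = byName.get(recipient);
        if (a != null) a.receive(message);
    }

    // Subject-based broadcast: every subscriber of the subject gets a copy.
    void publish(String subject, String message) {
        for (Agent a : bySubject.getOrDefault(subject, List.of())) a.receive(message);
    }
}

class Agent {
    final String name;
    final List<String> inbox = new ArrayList<>();
    Agent(String name) { this.name = name; }
    void receive(String message) { inbox.add(message); }
}
```

Because subscribers are looked up per subject rather than hard-wired to senders, new agents can be added without changing the existing ones, which is what makes this style scalable and flexible.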
2. Multi-Agent Systems
The use of agent technology has increased over the last decade, and several different applications have appeared, aiming to understand, model and develop complex distributed systems by viewing them as computational organizations consisting of various interacting components [4-6, 9, 10]. These applications are found in different areas, such as financial markets, the Internet, robotics, power systems, and educational and simulation problems.

Intelligent agents represent interactive, autonomous, independent entities that reproduce real-life phenomena artificially. An agent is a computational object with the following properties: autonomy, social ability, reactivity and pro-activeness [9]; it processes its inputs according to its own intelligence and generates outputs, as shown in Fig. 1.
The agent has the feature of temporal continuity, because it monitors the environment awaiting occurrences that call for action. By analyzing the recorded sequence of events, the agent makes the right decision based on its own knowledge. Its autonomy comes from its decision-making, its control over its own actions in pursuit of its goals, and its behavior based on acquired knowledge.
Fig. 1 - The agent and its environment.
The use of MAS in power systems research has been increasing, with the development of several studies in power system operation [4], markets [6], diagnosis [11] and protection [12]. Intelligent agents had a session for the first time at the International Conference on Intelligent System Application to Power Systems in 2001, which shows researchers' interest in this area.
The multi-agent model architecture is shown in Fig. 2. The model is composed of two main packages, a Distribution System Package and a Power Substation Package, plus five support agents responsible for integrating the model with the real world and the packages with each other. The Interface Agent accesses the SCADA database, so it is responsible for the temporal continuity of the MAS, providing timely and consistent information to the other agents. This agent specifies what information is given to each agent, filtering out all irrelevant data and minimizing communication and pre-processing tasks.
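The filtering role of the Interface Agent can be sketched as below: each agent declares the SCADA points it needs, and everything else is dropped before forwarding. The point names and the map-based SCADA snapshot are illustrative assumptions, not the paper's data model.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the Interface Agent's filtering step: forward to each agent only
// the SCADA points it declared, dropping everything else.
class InterfaceAgentFilter {
    static Map<String, Double> filterFor(List<String> needed, Map<String, Double> scadaSnapshot) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (String point : needed) {
            Double value = scadaSnapshot.get(point);
            if (value != null) out.put(point, value);   // forward only relevant points
        }
        return out;
    }
}
```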
The Power Substation Package performs operation tasks in substations and contains five agents: the Alarm Processing Agent, Switching Agent, Measurements Agent, Equipment Agent and Integration Agent I [5].
The Alarm Processing Agent provides expertise about the problem that may have occurred. When a disturbance occurs in the system, many alarms are raised by the SCADA system. Usually, most of these alarms are redundant, triggered by secondary problems caused by the primary problem.

The main idea of this agent is to detect the primary problem and to send two kinds of alarms to the MAS. The first kind comprises the alarms of the primary problem, while the second comprises the main alarms for secondary problems. The first kind is very useful for the Event Identification Agent to identify the problem and provide a solution. The second kind (sometimes more important than the first) is useful for deciding the degree of the contingency.
This agent therefore contains two main parts, one for the detection of problems and another for the evaluation of the disturbance. The first part is composed of production rules, which read information from the relays and other sensors to determine where the disturbance started. The second part makes an evaluation using the data from the files and some rules based on operation conditions.
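A minimal sketch of the detection part, under a hypothetical rule set: production rules inspect the raised alarms, name the primary problem, and classify the remaining alarms as secondary. The alarm and problem names below are invented for illustration and do not come from the paper.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative sketch of the Alarm Processing Agent's detection part:
// production rules pick out the primary problem among redundant alarms.
class AlarmProcessor {
    // Rule: a feeder overcurrent relay trip together with its breaker opening
    // indicates a feeder fault; a transformer differential trip indicates a
    // transformer fault. (Hypothetical rules.)
    static String primaryProblem(Set<String> alarms) {
        if (alarms.contains("RELAY_51_FEEDER_TRIP") && alarms.contains("CB_FEEDER_OPEN"))
            return "FEEDER_FAULT";
        if (alarms.contains("RELAY_87_TRANSFORMER_TRIP"))
            return "TRANSFORMER_FAULT";
        return "UNKNOWN";
    }

    // Secondary alarms: everything raised that is not part of the primary cause.
    static Set<String> secondaryAlarms(Set<String> alarms) {
        Set<String> secondary = new LinkedHashSet<>(alarms);
        secondary.remove("RELAY_51_FEEDER_TRIP");
        secondary.remove("CB_FEEDER_OPEN");
        return secondary;
    }
}
```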
The Equipment Agent acts as a function of the affected equipment, according to a defined procedure. It checks the condition of transformers, buses and capacitors and, in the absence of internal equipment defects, signals that the restoration action may proceed.
The Measurements Agent continuously monitors the analog signals of interest, such as voltage, current, frequency and the angle between voltage and current, through inputs linked via transducers to CTs and PTs. When an event is identified, it stores the pre- and post-fault values. This monitoring detects possible defects or oscillations that affect the restoration; according to the values found, the process can be validated, interrupted or modified, preventing voltage outages and improper energization caused by strategy mistakes or damaged equipment.
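The validation step of the Measurements Agent might look like the following sketch, in which monitored values are checked against limits before the restoration is validated, modified or interrupted. The per-unit voltage and frequency limits are our assumptions for a 60 Hz system, not values from the paper.

```java
// Illustrative sketch of the Measurements Agent's validation step: compare
// monitored quantities against limits before liberating a restoration action.
// The limits and the three-way outcome are assumptions.
class MeasurementsCheck {
    enum Verdict { VALIDATED, MODIFIED, INTERRUPTED }

    // Voltage in per-unit, frequency in Hz.
    static Verdict check(double voltagePu, double frequencyHz) {
        boolean voltageOk = voltagePu >= 0.95 && voltagePu <= 1.05;
        boolean frequencyOk = frequencyHz >= 59.5 && frequencyHz <= 60.5;
        if (voltageOk && frequencyOk) return Verdict.VALIDATED;
        // Mild voltage deviation: adjust the switching strategy instead of aborting.
        if (frequencyOk && voltagePu >= 0.90 && voltagePu <= 1.10) return Verdict.MODIFIED;
        return Verdict.INTERRUPTED;      // severe deviation: stop the restoration
    }
}
```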
The Switching Agent continuously monitors the state of the operated switches, circuit breakers, grounding switches and by-pass switches. According to its knowledge base, it defines the actions to be taken on the system switches.
The Integration Agent I is very important for providing integration between packages. It informs the Distribution System Package which feeders or buses are operating, and may even request help from an agent of the Distribution System Package, such as a load flow analysis for reconfiguration.
The Distribution System Package performs the operation tasks in distribution systems, and
contains five agents: Restoration Agent, Switching Agent, Load Flow Agent, Load
Shedding Agent and Integration Agent II.
The Restoration Agent contains advice about the best switching strategy. Two main ideas guide the problem solving. The first is to attempt restoration through a previously energized feeder; the second is to find parallel circuits to provide the restoration. The first idea applies to radial systems or to temporarily unavailable circuits (e.g. temporary faults). When a partial blackout occurs, the system's strategy is to feed the boundary buses of the blacked-out area first, reducing the affected area step by step.
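The "boundary buses first" strategy can be sketched as a breadth-first expansion from the energized buses on the border of the blacked-out area: buses are re-energized in the order they are reached, shrinking the affected region step by step. The graph representation below is ours, for illustration only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the boundary-first restoration order: breadth-first search from
// the energized boundary buses into the blacked-out area.
class RestorationOrder {
    static List<String> order(Map<String, List<String>> adjacency,
                              Set<String> blackedOut, Set<String> boundary) {
        List<String> sequence = new ArrayList<>();
        Set<String> visited = new LinkedHashSet<>(boundary);
        Deque<String> queue = new ArrayDeque<>(boundary);
        while (!queue.isEmpty()) {
            String bus = queue.poll();
            for (String next : adjacency.getOrDefault(bus, List.of())) {
                if (blackedOut.contains(next) && visited.add(next)) {
                    sequence.add(next);   // energize this bus at the next step
                    queue.add(next);
                }
            }
        }
        return sequence;
    }
}
```

Breadth-first order guarantees that each bus is energized only after a neighbor closer to the boundary, which is exactly the step-by-step shrinking of the affected area described above.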
The Load Flow Agent is a numerical application inside an agent. It provides the voltage drop on each feeder branch, the voltage on each bus, and the projected power flow through the distribution system. This information is used by the Planning Agent to find, among the possible solutions, the one with the best chance of being executed.
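A magnitude-only sketch of the quantities the Load Flow Agent provides for a radial feeder: the current through each branch is the sum of the downstream loads, and the voltage at each bus is the upstream voltage minus the I·Z drop. Real distribution load flow is complex-valued and iterative; this simplified series model is an assumption for illustration.

```java
// Magnitude-only sketch of a radial feeder voltage-drop calculation.
class RadialFeeder {
    // branchOhms[i]: impedance of the branch into bus i+1;
    // loadAmps[i]: current drawn at bus i+1; bus 0 is the source.
    static double[] busVoltages(double sourceVolts, double[] branchOhms, double[] loadAmps) {
        int n = branchOhms.length;
        double[] v = new double[n + 1];
        v[0] = sourceVolts;
        for (int i = 0; i < n; i++) {
            // Current through branch i is the sum of the loads at buses i+1..n.
            double current = 0;
            for (int j = i; j < n; j++) current += loadAmps[j];
            v[i + 1] = v[i] - current * branchOhms[i];   // drop = I * Z
        }
        return v;
    }
}
```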
The Load Shedding Agent is designed to avoid a frequency or voltage collapse of the power system. This agent contains a knowledge base about the load shedding strategies of the system under analysis, and comprises a standard numerical program.

The Integration Agent II is responsible for providing integration between packages; for example, it is sometimes necessary to exchange data for protection analysis of the distribution system.
The computer packages are under development: the first performs the distribution system package tasks, and the second the power substation package tasks. A Java platform was used for the power substation problem, as presented in Fig. 4. The interface has a diagram of the substation, a dialog box, a floating window where operation plans are presented, and a status bar where all the system's measurements are shown. The user can visualize all protections by activating a menu command, and it is also possible to visualize all the messages exchanged by the agents (Fig. 5).

For the distribution system program a Visual Basic platform was used, as presented in Fig. 6; this program is still under development.
5. Conclusions
The object-oriented model developed has a good degree of abstraction, which allows its reuse in other topologies. The MAS proved adequate for working with the object-oriented model, given the distributed character of both technologies. The main benefits given by the object-oriented model were:
• Distributed representation, allowing the virtual model to reproduce the
distributed properties of the system;
• Flexibility, because of the virtual model adaptability to the alterations that
may occur representing changes in the real world;
• Reuse capability, using the system developed for modeling a new substation,
through the redefinition of the devices and addition or change of rules;
• Open Architecture, allowing addition of components and system expansion,
as part of a bigger system.
The MAS architecture is divided into packages of agents that interact to solve the same problem. This packaging is organized so as to facilitate the development process, which can be carried out by different developer groups. In addition, it opens the possibility of including hierarchical analysis of power systems, and gives flexibility to extend the system through the addition of new agents. The main benefits obtained with the multi-agent model were:
• Cooperative behavior, characterizing the dynamic exchange of information between entities and making the segmentation of tasks possible, besides allowing a reduction of the hierarchical character of the decision process;
• Competitive processing, making possible a better use of hardware resources and a reduction of the computational load, improving execution speed;
• Distributed topology, making possible, in the abstraction process, the division of the system into a smaller number of parts, limited by functional characteristics;
• Open architecture, allowing the addition of components and the expansion of the system, as part of a bigger system, with the creation of new environment levels.
The power substation program was developed and implemented in modules; in this way it is possible to guarantee flexibility in the expansion and implementation of the multi-agent model. The modular character of multi-agent systems allows the inclusion of new modules and the expansion of the MAS for power system operation, keeping the distributed character of the technique.

Beyond restoration tasks, these programs can be used to perform operation tasks such as maintenance planning, reconfiguration, etc. This is made possible by the general knowledge base developed in each agent.
References
[1] D. Becker, H. Falk, J. Gillerman, et al., "Standards-Based Approach Integrates Utility Applications,"
IEEE Computer Applications in Power, Volume 13(4) October 2000, pp. 13-20.
[2] S. Pandit, S.A. Soman and S.A. Khaparde, "Object-Oriented Network Topology Processor," IEEE
Computer Applications in Power, Volume 14(2) April 2001, pp. 42-46.
[3] S.K. Abidi and A.K. David, "An Object-Oriented Intelligent Approach to System Restoration," in
Proc. ISAP'99, G. Lambert-Torres and A.P. Alves da Silva, Eds., 1999, pp. 61-65.
[4] M. Amin, "Toward Self-Healing Energy Infrastructure Systems," IEEE Computer Applications in
Power, Volume 14(1) January 2001, pp. 20-28.
[5] C.R. Lopes Jr., A.R. Aoki, A.A.A. Esmin and G. Lambert-Torres, "Multi-Agent Model for Power Substation Restoration," in Proc. IASTED PES 2001, 2001.
[6] F.-R. Monclair and R. Quatrain, "Simulation of Electricity Markets: A Multi-Agent Approach," in Proc. ISAP 2001, P. Kadar and G. Tarnai, Eds., 2001, pp. 207-212.
[7] Sun Microsystems, "Implementing Java Computing Solutions White Paper," http://www.sun.com/nc/whitepapers/.
[8] J.P. Bigus and J. Bigus, "Constructing intelligent agents with Java: a programmer's guide to smarter
applications," Wiley Computer Publishing, 1997.
[9] M. Wooldridge and N.R. Jennings, "Intelligent Agents: Theory and Practice," Berlin, Germany:
Springer-Verlag, 1994.
[10] J. Ferber, "Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence," Addison-Wesley, 1999.
[11] M.A. Sanz-Bobi, J. Villar, et al., "DIAMOND: Multi-Agent Environment for Intelligent Diagnosis in Power Systems," in Proc. ISAP 2001, P. Kadar and G. Tarnai, Eds., 2001, pp. 61-66.
[12] C.-K. Chang, S.-J. Lee, et al., "Application of Multi-Agent System for Overcurrent Protection System of Industrial Power System," in Proc. ISAP 2001, P. Kadar and G. Tarnai, Eds., 2001, pp. 73-77.
Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
1. Introduction
One of the great powers of computer programming is the ability to define new compound
operations in terms of old ones, and to do this over and over again, thus building up a vast
repertoire of ever more complex operations. This ability is quite reminiscent of evolution, in
which more complex molecules evolve out of less complex ones, in an ever-upward spiral of
complexity and creativity. At each stage, the products get more flexible and more intricate,
more "intelligent" and yet, more vulnerable to delicate "bugs" or breakdowns [1].
Evolution is now considered useful in simulation to create algorithms and structures of
higher levels of complexity. Living things are too improbable and too beautifully designed to
have come into existence only by chance. How, then, did they come into existence? The
answer, Darwin's answer, is by gradual, step-by-step transformations from simple beginnings,
from primordial entities sufficiently simple to have come into existence by chance [2]. Each
successive change in the gradual evolutionary process was simple enough, relative to its
predecessor, to have arisen by chance. But the whole sequence of cumulative steps constitutes
anything but a chance process. When you consider the complexity of the final end-product
relative to the original starting point, the cumulative process is directed by non-random
survival.
Complexity may be defined as the situation in which, given the properties of the parts
and the laws of their interaction, it is not an easy matter to infer the properties of the whole.
Also, the complexity of a system may be described not only in terms of the number of
interacting parts, but in terms of their differences in structure and function [3]. But there are
special problems when examining systems of high complexity: a system may have so many
different aspects that a complete description is quite impossible and a prediction of its behavior
is unattainable. The analysis of such a system may require the use of approximation models,
and there is no efficient procedure to estimate the confidence to be attributed to the final
results. There is an enormous complexity in living systems, complexity which does not persist
for its own sake, but is maintained as entities at one level, which are compounded into new
entities at a next higher level, and so on.
A. Moroni et al. / ArTbitrariness

Evolution can be used as a method for creating and exploring complexity that does not require human understanding of the intrinsic processes involved. Interactive evolution normally depends on a user-machine interface that helps the user with creative explorations, or it may be considered a system attempting to "learn" about human aesthetics from the user. In both situations, it allows the user and the computer to work together, interactively, to produce results that neither could produce alone. Often, creativity in art, literature or science can be understood as a certain richness of association with different, perhaps seemingly unrelated disciplines [4]. In the same way, a more creative algorithm would go beyond innovation by transferring useful information from other domains.
In what follows, section 2 discusses problem-solvers and computer creativity, describing strong and weak problem-solvers. In section 3, perceptual selection is associated with interactive genetic algorithms and applied to visual and acoustic domains. Next, the term ArTbitrariness is defined as an interactive iterative optimization process applied to aesthetic domains. Finally, conclusions about the consequences of adopting this kind of approach are presented.
2. Problem-solvers and computer creativity
One of the main attributes of the human mind, and certainly a challenge for computational intelligence, is creativity, associated here with the procedures of generating new, unanticipated and useful knowledge concerning the formal objectives to be achieved in a computational environment. Some problems under investigation are too complex to be properly described and solved using a precise mathematical formalism. The absence of a formal description and the violation of some important restrictions prevent the application of powerful methodologies of solution, denoted here strong methods. Strong methods are dedicated problem-solvers that require high-quality, structured knowledge about the problem to be solved, and impose restrictive assumptions about the nature of the problem. For example, the best available algorithms for iterative optimization require continuity, convexity, and the availability of second-order information at each point in the search space. The optimal solution may not be obtained if any of these conditions fails to hold. So, though they are very efficient problem-solvers, strong methods have an undesirably restricted field of application, and the more challenging practical problems of our day cannot be solved by them.
The alternatives are less efficient, generic problem-solvers, denoted here weak methods. These methods search for the solution based on a minimal amount of information and restrictions. For example, some algorithms for iterative optimization require neither continuity nor second-order information. So, the field of application of every weak method is much broader than that of its strong counterpart, if one is available. However, the performance may be completely unsatisfactory, because the lack of specification associated with the feasible solutions makes the space of candidate solutions prohibitively large, and the absence of high-quality information prevents the search from being effective.
In rather general terms, there is a compromise between the power of the problem-solver and the required quality of the available information. Nowadays, the most successful tools for dealing with this compromise are those inspired by mechanisms and modes of behavior present in intelligent systems [5]. Problem-solvers derived from one (or a hybrid association) of the computational intelligence methodologies lead to something in between strong and weak methods, and apply creativity in three different manners [6]:
• combinational creativity: definition of an original and improbable combination of already available ideas.
The model is strictly a model of artificial selection, not natural selection: the criterion for 'success' is not the direct criterion of survival. In natural selection, by contrast, the genes that survive tend automatically to be those that confer on bodies the qualities that assist them to survive.
Latham applied in Form Synth the concept of accumulating small changes to help generate computer sculptures made with constructive solid geometry techniques [8]. Sculptors have a number of practical techniques at their disposal to make 3-D forms, for example welding, chiseling, adding small pieces of clay, wood carving, and construction in plastics. The main concept behind Form Synth, however, did not grow out of any existing art style but from another area altogether: the rule-based construction of complex polyhedra from geometric primitives. In Form Synth the form's evolution depends entirely on the intuitive choices of commands that the user makes; the artist applies the rules successively to geometric primitives and irregular complex forms to produce large "evolutionary tree" drawings of irregular complex forms.
Techniques introduced by Sims [9] contribute to the solution of these problems by enabling the "evolution" of procedural models using interactive "perceptual selection". Evolutionary mechanisms of variation and selection were used to "evolve" complex equations used in procedural models for computer graphics and animation. An interactive process between the user and the computer allows the user to guide evolving equations by observing results and providing aesthetic information at each step of the process. The computer automatically generates random mutations of equations and combinations between equations to create new generations of results. This repeated interaction between user and computer allows the user to search hyperspaces of possible equations without being required to design the equations by hand or even understand them. Sims [7] also successfully applied genetic algorithms to generate autonomous three-dimensional virtual creatures that can crawl, walk or even run, without requiring cumbersome user specifications, design efforts or knowledge of algorithmic details. The user sacrifices some control when using these methods, especially when the fitness is procedurally defined. However, the potential gain in automating the creation of complexity can often compensate for this loss of control, and a higher level of user influence can be preserved through the specification of the fitness criteria.
Figure 1 presents a very simple example of the effect of user-machine interaction in a visual domain. There is no direct intention of including aesthetic matters here. We implemented an evolutionary algorithm in which each individual in the population is a picture composed of 50 lines, each characterized by color, the position of one end point, angle and length. The list of attributes of the 50 lines corresponds to the genetic code of an individual. Twelve pictures are arranged in a uniform grid on the screen, representing all the individuals of the current generation. The task of the user is to attribute a grade to each of the twelve pictures. After that, the evolutionary algorithm is activated and produces the next generation of twelve pictures. This interactive iterative process goes on until a generation limit or a target grade level is reached. The attribution of grades, or fitness, should be based on some fitness criterion. In this experiment, the objective is simply to force the lines to concentrate at the right-bottom corner of the frame. Notice that no explicit instruction is presented to the computer; better grades are simply attributed to pictures characterized by a higher concentration of lines at the right-bottom corner. Of course, this rather simplistic criterion may be replaced by much more complex and meaningful purposes, including the consideration of aesthetic matters.
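A non-interactive version of this experiment can be sketched as follows, with the user's grades replaced by an automatic fitness that rewards line endpoints near the right-bottom corner (1, 1) of a unit frame. Population size, elitist selection and the mutation scheme are our choices, not the paper's.

```java
import java.util.Arrays;
import java.util.Random;

// Non-interactive sketch of the experiment above. Each genome stores the
// x, y coordinates of one endpoint per line; the automatic fitness rewards
// endpoints close to the right-bottom corner (1, 1) of a unit frame.
class LinePictureGA {
    static final int LINES = 50, POP = 12;
    static final Random RNG = new Random(42);

    // Average closeness of all endpoints to (1, 1); 1.0 = all lines in the corner.
    static double fitness(double[] genome) {
        double sum = 0;
        for (int i = 0; i < genome.length; i += 2) {
            double dx = 1 - genome[i], dy = 1 - genome[i + 1];
            sum += 1 - Math.sqrt(dx * dx + dy * dy) / Math.sqrt(2);
        }
        return sum / (genome.length / 2);
    }

    static double best(double[][] pop) {
        double b = Double.NEGATIVE_INFINITY;
        for (double[] g : pop) b = Math.max(b, fitness(g));
        return b;
    }

    // Returns {best fitness of the random initial population, best after evolving}.
    static double[] run(int generations) {
        double[][] pop = new double[POP][2 * LINES];
        for (double[] g : pop) for (int i = 0; i < g.length; i++) g[i] = RNG.nextDouble();
        double initialBest = best(pop);
        for (int gen = 0; gen < generations; gen++) {
            // Rank by fitness; the best half survives (elitism).
            Arrays.sort(pop, (a, b) -> Double.compare(fitness(b), fitness(a)));
            // Refill the worst half with mutated copies of the survivors.
            for (int k = POP / 2; k < POP; k++) {
                double[] child = pop[k - POP / 2].clone();
                int i = RNG.nextInt(child.length);
                child[i] = Math.min(1, Math.max(0, child[i] + 0.2 * (RNG.nextDouble() - 0.5)));
                pop[k] = child;
            }
        }
        return new double[]{initialBest, best(pop)};
    }
}
```

Because the best individual is always kept, the best fitness never decreases from one generation to the next; in the interactive version, the user's grades simply replace the `fitness` function.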
In essence, after a number of generations, representing steps of user-machine interaction, the implicit purpose emerges, without the machine being explicitly programmed to converge to such a proposition. The question is the following: besides guiding lines to the right-bottom corner of a frame, what else can be done? Without taking additional practical aspects into account, the door is now open to include fitness criteria that cannot be described in mathematical terms or even in a formal language.
Fig. 1 - Populations of 50 lines were evolved to keep their distance from the left and top edges of the frame. From left to right, the pictures correspond to the individuals with the best fitness at generations 0, 4, 8 and 16.
fitness evaluation and to modify the duration of the genetic cycles, interfering directly in the rhythm of the composition. The pad control allows the composer to conduct the music through drawings, suggesting the metaphorical "conductor gestures" used when conducting an orchestra. Through different drawings, the composer can experience the generated music and conduct it, trying different trajectories or sound orbits. The trajectory affects the reproduction cycle and the musical fitness evaluation. Figure 2 presents two different drawings and the musical notes generated by using them as fitness functions in Vox Populi.
Fig. 2 - On the left, a simple drawing and its corresponding musical output generated by the Vox Populi system; on the right, a more complex drawing and its corresponding musical output.
4. ArTbitrariness
In an interactive genetic algorithm (IGA), human judgment is used to provide fitness through a user-machine interface. The cycle typically begins with the presentation of the individuals in the current population for the human mentor to experience. In visual domains, where each individual typically decodes to an image, all the individuals are usually presented at once, often in reduced size so that the entire population can be contrasted or compared. The mentor can then determine the fitness of each individual in relation to all the others. A well-formalized fitness criterion gives rise to a fitness surface, and a global maximum of this surface in the search space corresponds to the optimal solution. However, identifying the criteria used by the mentor in his evaluation is hard enough; justifying or even explaining his reliance on those criteria is more difficult still.
So, the previously described dilemma between strong and weak methods may become even more intricate when the human being is allowed to contribute decisively to one or more steps followed by the problem-solver in an exploratory creative domain. For example, applications of evolutionary computation in artistic domains are often hampered by the lack of a procedure to determine fitness [13]. In such domains, fitness typically reflects an aesthetic judgment determining which individuals in a population are relatively better or worse, based on subjective and often ill-defined artistic or personal criteria.
When human judgment replaces a formal fitness criterion, we still have a fitness
surface, but this surface cannot be expressed in mathematical terms unless we are able to
produce a precise model of the human judgment. Besides, if the human judgment changes with
time, then the fitness surface is time-varying. It stands to reason that strong methods have no
applicability under these conditions. On the other hand, evolutionary computation techniques
remain powerful methods to implement what we call an interactive iterative optimization. In
particular, they perform an interactive (fitness criterion defined by human judgment) parallel
search for a better solution in a space of candidates (the search space), in order to discover
promising regions (characterized by the presence of candidates with higher fitness) in the
search space and to provide, on average, better solutions at each iterative step (generation), even
in the presence of a time-varying fitness surface [14].
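The interactive parallel search described above can be caricatured in a few lines. This is a minimal sketch, not the authors' implementation: the bit-string genome, the parameter values, and the `judge` callable (a stand-in for the human mentor, which nothing prevents from changing its opinion between generations) are all illustrative assumptions.

```python
import random

random.seed(0)  # reproducible run for this sketch

def interactive_ga(judge, genome_len=8, pop_size=6, generations=10):
    """Minimal interactive GA loop: `judge` plays the role of the human
    mentor, scoring each candidate; a time-varying judge gives a
    time-varying fitness surface."""
    pop = [[random.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=judge, reverse=True)  # mentor ranks everyone
        parents = ranked[: pop_size // 2]              # keep the judged-best half
        pop = parents[:]
        while len(pop) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, genome_len)
            child = a[:cut] + b[cut:]                  # one-point crossover
            child[random.randrange(genome_len)] ^= 1   # bit-flip mutation
            pop.append(child)
    return max(pop, key=judge)

# A stand-in "mentor" that simply prefers genomes with more 1-bits:
best = interactive_ga(judge=sum)
```

Because the judge is called once per candidate per generation, the loop also makes the fitness bottleneck discussed below concrete: a slow human judge directly caps population size and generation count.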
When evolutionary computation and other computational intelligence methodologies are
involved, every attempt to improve aesthetic judgment will be denoted ArTbitrariness and is
interpreted as an interactive iterative optimization process. ArTbitrariness is suggested in this
paper as an effective way to produce art based on an efficient manipulation of information and
a proper use of computational creativity to incrementally increase the complexity of the results
without neglecting aesthetic aspects.
However, the potential of ArTbitrariness is not enough to guarantee high performance in
every context of application. Roughly speaking, the user should not be queried at the same
frequency as the machine. For example, in musical domains not characterized by
hybrids like Vox Populi, the temporal evolution of musical events prevents the compressed,
parallel presentation of individuals that is possible in computer graphics. Most applications of
genetic algorithms to music found in the literature present the population as an evolving
trajectory of musical material such as chords, motives and phrases represented as events. The
net result for music, then, is that each member of a population must be presented individually
and in real time. This leads to a severe fitness bottleneck, which often limits the population size
and the number of generations that can realistically be bred in a musical IGA. These limits are
necessary not only to cut down the time it takes to run a musical IGA, but also to help reduce
the unreliability of human mentors as they attempt to sort through the individuals in a
population, listening to only one sample at a time. Moreover, mentors often tire of an overused
lick and start punishing individuals in later generations that had been rewarded heavily in
earlier ones.
5. Conclusion
Most evolutionary algorithms only explore a pre-given space, seeking the "optimal" location
within it, but some also transform their generative mechanism in a more or less fundamental
way. For example, evolutionary algorithms in graphics may enable superficial tweaking of the
conceptual space, resulting in images which, although novel, clearly belong to the same family
as those which went before; or the space may be so "complexified" that the novel images bear no
resemblance even to their parents, still less to their ancestors. One should not assume that
transformation is always creative, or that artificially intelligent (AI) systems that can
transform their rules are superior to those which cannot. Significantly, some AI systems
deliberately avoid giving their programs the capacity to change the heart of the code; that is,
they prevent fundamental transformations in the conceptual space.
One reason for avoiding rampant transformation in AI models of creativity is that humans
may be more interested, at least for a time, in exploring a given space than in transforming it in
unpredictable ways. A professional sculptor such as Latham, for instance, may wish to explore
the potential and limits of one particular family of 3D structures before considering others.
Another reason is the difficulty of automating evaluation. Vox Populi presents the composer
with the opportunity to make subjective judgements on all mutations and recombinations while
moving to the next iteration. From the composer's point of view it captures the idea of
judgement, and from the system's point of view it changes the search of the musical space, since
it allows the composer to steer the navigation. The choices interactively made by a composer in
response to the evolving music can be stored as a parametric control file and recorded as a
musical signal as well. The data obtained can be used to train neural networks, which in turn
may be used as fitness functions, imposing a personal style.
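As a rough sketch of this surrogate-fitness idea: the logged choices below are a hypothetical stand-in for the stored control file, and a tiny perceptron stands in for the neural networks the authors mention — only the general pattern (learn the mentor's preferences, then reuse them as an automatic fitness function) is taken from the text.

```python
def train_surrogate(choices, epochs=50, lr=0.1):
    """Fit a tiny perceptron on logged mentor choices so it can later serve
    as an automatic fitness function. `choices` is a list of
    (feature_vector, liked) pairs."""
    n = len(choices[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, liked in choices:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = (1 if liked else 0) - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    def fitness(x):  # surrogate fitness encoding the mentor's "style"
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return fitness

# hypothetical log: the mentor liked candidates whose first feature is set
log = [([1, 0], True), ([0, 1], False), ([1, 1], True), ([0, 0], False)]
f = train_surrogate(log)
```

Once trained, `f` can replace the human in the GA loop, relieving the fitness bottleneck at the price of freezing one moment of the mentor's (time-varying) taste.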
Interactivity, or even real-time performance, is a desired feature in this kind of
evolutionary system. Interactivity applied to genetic algorithms means that the user is able to
interrupt the evolutionary process at, ideally, any stage and manipulate, ideally, any desired
parameter. After such a manipulation, the user must be able to judge the result within a
response time that allows interactive working.
Finally, ArTbitrariness is a powerful way to control the complexity of artistic
material in the flow [13]. The relevance of this approach goes beyond the applications per se.
Thinking informally, processes that go beyond the innovative are creative. Evolutionary
systems built on the basis of a solid theory can be coherently embedded into other
domains. The richness of associations among the domains is likely to initiate new thinking and
ideas, contributing to areas such as knowledge representation and the design of visual and
acoustic formalisms.
Acknowledgments: Part of this project was made possible by the support of FAPESP to the Gesture
Interface Laboratory, in which Vox Populi was developed. Fernando J. Von Zuben is supported
by CNPq grant 300910/96-7. Artemis Moroni is supported by CenPRA.
References
[1] D. R. HOFSTADTER. Metamagical Themas. New York: Basic Books, 1985.
[2] R. DAWKINS. The Blind Watchmaker. London, England: Penguin Books, 1991.
[3] G.A. COWAN, D. PINES AND D. MELTZER (eds.) Complexity - Metaphors, Models, and Reality, Santa Fe Institute Studies
in the Sciences of Complexity: Perseus Books, Proceedings Volume XIX, 1994.
[4] D. GOLDBERG. "The Race, the Hurdle, and the Sweet Spot", in Bentley, P. (ed.) Evolutionary Design by Computers. San
Francisco, USA: Morgan Kaufmann, pp. 105-118, 1999.
[5] D.B. FOGEL. Evolutionary Computation - Toward a New Philosophy of Machine Intelligence, 2nd edition. New York:
The IEEE Press, 1999.
[6] M. A. BODEN. "Creativity and Artificial Intelligence", Artificial Intelligence 103, pp. 347 - 356, 1998.
[7] K. SIMS. "Evolving Three-Dimensional Morphology and Behavior", in Bentley, P. (ed.) Evolutionary Design by
Computers. San Francisco, USA: Morgan Kaufmann, pp. 297-321, 1999.
[8] W. LATHAM. Form Synth: The Rule-based Evolution of Complex Forms from Geometric Primitives, in Computers in Art,
Design and Animation eds. J. Lansdown & R. A. Earnshaw. New York, USA: Springer-Verlag, 1989.
[9] K. SIMS. "Interactive Evolution of Equations for Procedural Models", The Visual Computer Vol. 9, No. 9, pp. 466-476,
1993.
[10] D. HOROWITZ. "Generating rhythms with genetic algorithms", Proceedings of the 1994 International Computer Music
Conference, pp. 142-143, 1994.
[11] A. MORONI, J. MANZOLLI, F. VON ZUBEN & R. GUDWIN. "Vox Populi: Evolutionary Computation for Music
Evolution", in Bentley, P. (ed.) Creative Evolutionary Systems. San Francisco, USA: Morgan Kaufmann, pp. 205-221,
2002.
[12] A. MORONI, J. MANZOLLI, F. VON ZUBEN & RICARDO GUDWIN. "Vox Populi: An Interactive Evolutionary System for
Algorithmic Music Composition", Leonardo Music Journal, Vol. 10, pp. 49-54, 2000.
[13] A. MORONI, F.J. VON ZUBEN & J. MANZOLLI. "ArTbitration: Human-Machine Interaction in Artistic Domains",
Leonardo, MIT Press, Vol. 35, No. 2, 2002.
[14] M. WINEBERG. Improving the Behavior of the Genetic Algorithm in a Dynamic Environment. Ph.D. Thesis. Ottawa,
Ontario: Ottawa-Carleton Institute for Computer Science, 2000.
Advances in Logic, Artificial Intelligence and Robotics 55
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract
The aim of this paper is to overview the field of fuzzy numbers and the arithmetic
that has been developed to perform operations with fuzzy numbers. Issues
concerning overestimation, shape preservation, properties, and the expected intuitive
characteristics of the operations are particularly emphasized. Recent revisions
suggesting how to avoid discrepancies and pitfalls often observed in fuzzy
arithmetic are discussed. These issues are of utmost relevance in many engineering,
data processing and biological systems modeling applications because fuzzy
numbers are the values of fuzzy variables. Moreover, when they are connected with
linguistic concepts they provide meaning for the values of linguistic variables, a key
concept in the theory of fuzzy sets and fuzzy logic.
1. Introduction
In practice, exact values of model parameters are rare in many engineering, data
processing, and biological systems modeling and applications. Normally, uncertainties
arise due to incomplete or imprecise information reflected in uncertain model parameters,
inputs and boundary conditions. This is often the case, for instance, in price bidding in
market oriented power system operation and planning, in internet search engines, with the
transfer rates in dynamic epidemiological models, and with the amount of carbohydrates,
proteins and fat in ingested meals and the gastroparesis factor in human glucose metabolic
models. A fruitful approach to handle parameter uncertainties is the use of fuzzy numbers
and arithmetic. Fuzzy numbers capture our intuitive conceptions of approximate numbers
and imprecise quantities such as about five and around three and five, and play a
significant role in applications, e.g. prediction, classification, decision-making,
optimization and control. In these cases, fuzzy numbers represent uncertain parameters and
system inputs, with their support and shape derived from either experimental data or expert
knowledge.
Fuzzy quantities fit well many real-world circumstances, but standard
fuzzy arithmetic shows some undesirable effects and may become unsuitable for the analysis
of uncertain models, especially from the application point of view. For instance, the
accumulation of fuzziness, which causes the phenomenon of overestimation, skews the
membership functions as a result of fuzzy operations. This effect is similar to
error accumulation in conventional numerical computations. Fuzzy
multiplication and division do not guarantee shape preservation. For example,
products of triangular fuzzy numbers are not necessarily triangular. Arithmetic operations
with fuzzy quantities do, however, satisfy some useful properties found in conventional
operations: they are commutative and associative, but in general they are only subdistributive.
Overestimation and lack of shape preservation are particularly problematic because in most
cases they produce non-intuitive results. Except in purely mathematical and formal contexts,
overestimation and lack of shape preservation make system and model analysis, validation
and interpretation difficult.
56 F. Gomide / An Overview of Fuzzy Numbers and Fuzzy Arithmetic
Fuzzy quantities are intended to model our intuitive notion of approximate intervals
and numbers. They are defined in the universe of real numbers R and have membership
functions of the form

A: R → [0, 1]

A fuzzy number is a fuzzy subset A of R with the following characteristics [1, 2]:
1. A(x) = 1 for exactly one x,
2. The support {x: A(x) > 0} of A is bounded,
3. The α-cuts of A are closed intervals.
It is clear that real numbers are fuzzy numbers. In addition, it can be shown [see e.g. 1, 2]
that a fuzzy number is convex and upper semi-continuous, and that if A is a fuzzy number with
A(p) = 1, then A is monotone increasing on (−∞, p] and monotone decreasing on [p, ∞). If the
first condition holds for more than one point, then we have a fuzzy interval (figure 1).
Basically, there exist two classic methods to perform fuzzy arithmetic operations.
The first method is based on interval arithmetic, whereas the second employs the extension
principle. The extension principle provides a mechanism to extend operations on real
numbers to operations with fuzzy numbers.
Let A and B be fuzzy numbers and let * be any of the four basic arithmetic
operations. The fuzzy set A*B is defined via the α-cuts Aα and Bα as (A*B)α = Aα * Bα
for any α ∈ (0, 1]. When * = / (division), we must require that 0 ∉ Bα ∀α ∈ (0, 1].
Therefore, from the representation theorem [1] we have

A*B = ∪α∈[0,1] (A*B)α
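The α-cut route can be illustrated with triangular fuzzy numbers, for which each α-cut is an interval computed directly from the three parameters (left base, mode, right base). This is an illustrative sketch with hypothetical helper names, not code from any cited source; taking the min/max over the four endpoint combinations is exact for addition, subtraction and multiplication of intervals.

```python
def alpha_cut(tri, a):
    """α-cut [left, right] of a triangular fuzzy number tri = (l, m, r)."""
    l, m, r = tri
    return (l + a * (m - l), r - a * (r - m))

def interval_op(A, B, op, a):
    """(A*B)_α as the interval {x op y : x in A_α, y in B_α}, obtained from
    the four endpoint combinations (exact for +, - and ×)."""
    xl, xr = alpha_cut(A, a)
    yl, yr = alpha_cut(B, a)
    vals = [op(x, y) for x in (xl, xr) for y in (yl, yr)]
    return (min(vals), max(vals))

A, B = (1, 2, 4), (0, 1, 2)
```

For example, at α = 1 both cuts collapse to the modal values, so the sum's cut collapses to the sum of the modes; at α = 0 the cut of the sum is the sum of the supports.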
Thus, the first method to perform fuzzy arithmetic is a generalization of interval
arithmetic. The second method uses the extension principle to extend standard operations
on real numbers to fuzzy numbers. In this case the fuzzy set A*B of R is defined by

(A*B)(z) = sup{z=x*y} min[A(x), B(y)], ∀z ∈ R
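A brute-force, discretized version of this sup-min construction can be written directly; the grid resolution and the helper names below are assumptions of this sketch, and a real implementation would need a much finer grid (or the nonlinear-programming formulation mentioned later).

```python
def tri(l, m, r):
    """Membership function of the triangular fuzzy number (l, m, r)."""
    def mu(x):
        if l < x <= m:
            return (x - l) / (m - l)
        if m < x < r:
            return (r - x) / (r - m)
        return 0.0
    return mu

def extend(A, B, op, xs, ys):
    """Discretized extension principle: for each reachable z = x op y,
    keep the sup over all (x, y) pairs of min(A(x), B(y))."""
    out = {}
    for x in xs:
        for y in ys:
            z = op(x, y)
            out[z] = max(out.get(z, 0.0), min(A(x), B(y)))
    return out

A = tri(1, 2, 3)
grid = [1 + 0.5 * k for k in range(5)]            # 1.0, 1.5, ..., 3.0
C = extend(A, A, lambda x, y: x + y, grid, grid)  # membership of A + A
```

The sum peaks (membership 1) at z = 4, the sum of the modes, and vanishes at the edge of the support.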
As surveyed in [3] and more recently in [4], operations with fuzzy numbers do not
fulfill some of the properties of the operations with real numbers. A problem with
processing fuzzy numbers is related to the validity of group properties and distributivity:
A+(−A) is not equal to 0 and A(1/A) is not equal to 1. In addition, if a, b ∈ R,
then (a+b)A is generally not equal to (aA)+(bA). Thus A+A need not be 2A, a rather
counterintuitive result. Therefore, fuzzy quantities form neither an additive nor a
multiplicative group if strict equality is demanded and 0 and 1 are considered as the zero and
unit elements. This complicates some theoretical considerations and practical procedures.
See [3, 4] for further details on these issues.
Direct implementation of fuzzy arithmetic via interval arithmetic is generally
computationally complex, and implementation of the extension principle is equivalent to
solving a nonlinear programming problem. To overcome this difficulty, we often limit
ourselves to simple membership functions such as the triangular or trapezoidal ones depicted
in figure 1. In these cases, fuzzy operations become simpler to compute because they can be
done via the parameters that define the fuzzy numbers. Unfortunately, the shape of triangular
numbers is not preserved under certain operations. More specifically, triangular fuzzy numbers
are not closed under multiplication (see figure 3) and division, because the result of these
operations has a polynomial membership function, and triangular approximations can be
quite poor and produce incorrect results [5]. This is particularly crucial in engineering and
biological systems modeling and applications. Next we discuss the main efforts that have
been made to overcome some of the problems inherent to classic fuzzy arithmetic.
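A quick numeric check of this non-closure, assuming the simple case of squaring a positive triangular number via its α-cuts (all values below are illustrative):

```python
# α = 0.5 cut of A·A for A = (1, 2, 3): square the endpoints of A's cut
l, m, r = 1, 2, 3
a = 0.5
cut = (l + a * (m - l), r - a * (r - m))         # A_0.5 = (1.5, 2.5)
actual = (cut[0] ** 2, cut[1] ** 2)              # what interval arithmetic gives
# A triangular result with the same support [1, 9] and mode 4 would give:
approx = (1 + a * (4 - 1), 9 - a * (9 - 4))
mismatch = actual != approx  # the product's sides are quadratic, not linear
```

The actual cut (2.25, 6.25) differs from the triangular approximation (2.5, 6.5): the membership function of the product bends, so forcing it back into triangular form loses information.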
In what follows, we review new methods to perform fuzzy operations and produce
more intuitively meaningful results. In general, the methods assume, either explicitly or
implicitly, some form of constraint in the operation procedure. Properties such as associativity
may be lost, and further computational difficulties may arise. Context-dependent heuristics
seem to provide a powerful and pragmatic approach to making fuzzy arithmetic consistent
with human intuition and practicalities.
When arithmetic operations are performed with real numbers, we implicitly assume
they are independent of the objects they represent. This principle is also assumed in the
usual interval and fuzzy arithmetic. Usually, however, the numbers are not independent, as they
are tacitly tied to a theoretical or application context. For example, let A = (1, 2, 4) be
a triangular fuzzy number with modal value 2 and left and right base points 1 and 4,
respectively. Its α-cut is Aα = [1+α, 4−2α], and (A−A)α = [−3+3α, 3−3α]. This result does
not fit our intuition and may lack practical meaning. The point is that the fuzzy set operation
(subtraction in the example) neglects the fact that the operands are equal. Requisite
constraints is a key concept recently introduced by Klir [6] to address this issue.
Clearly, as shown in the example above, it is essential to include the equality constraint,
when applicable, in the general definition of the basic arithmetic operations. Otherwise, we
may get results that are less precise and counterintuitive. In general, as suggested in [6],
an arithmetic operation constrained by a relation R becomes, using the generalization of
interval arithmetic and the extension principle,

(A*B)α = {x * y : (x, y) ∈ Rα ∩ (Aα × Bα)}

(A*B)R(z) = sup{z=x*y} min[A(x), B(y), R(x, y)]

respectively. Fuzzy arithmetic with requisite constraints is a fruitful area, with mathematical and
computational challenges yet to be resolved before it becomes fully operational.
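To see what the equality constraint changes, the sketch below (a discretized sup-min with hypothetical helper names, not Klir's implementation) computes A − A for the triangular number A = (1, 2, 4) of the example above, with and without the constraint x = y:

```python
def tri(l, m, r):
    """Membership function of the triangular fuzzy number (l, m, r)."""
    def mu(x):
        if l < x <= m:
            return (x - l) / (m - l)
        if m < x < r:
            return (r - x) / (r - m)
        return 0.0
    return mu

def ext_sub(A, xs, relation=lambda x, y: True):
    """Extension-principle subtraction A - A on the grid xs, optionally
    restricted by a requisite constraint relation(x, y)."""
    out = {}
    for x in xs:
        for y in xs:
            if relation(x, y):
                z = x - y
                out[z] = max(out.get(z, 0.0), min(A(x), A(y)))
    return out

A = tri(1, 2, 4)
grid = [1 + 0.5 * k for k in range(7)]            # 1.0, 1.5, ..., 4.0
free = ext_sub(A, grid)                            # plain fuzzy subtraction
tied = ext_sub(A, grid, lambda x, y: x == y)       # operands declared equal
```

The unconstrained result spreads over [−3, 3], while the constrained one collapses to the crisp 0 our intuition expects.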
A natural way to model approximately known values, such as numbers that are
around a given real number, is fuzzy arithmetic. From the common sense point of view,
around hundred plus one is still around hundred (≈100 + 1 ≈ 100), although in fuzzy
arithmetic this is not the case. A modification of fuzzy arithmetic that does retain the
common sense view has recently been proposed by Kreinovich and Pedrycz [9]. Intuitively,
their method assumes that, given two fuzzy numbers A and B, we should compute the
membership function of C = A+B, find the interval of possible values of C (e.g. as the values
x such that C(x) ≥ p0 for some value p0), pick the simplest value c in this interval, and then
return "around c" as the result of adding A and B. The key here is what simplest means.
The formalism introduced in [9] suggests, as a measure of complexity D(x) of a real number x,
the shortest length of a formula F(y) in a language L which defines the particular number x
(i.e. which is true for y = x and false otherwise).
Current knowledge of brain and biological information processing indicates
that counting and operating with cardinal quantities is an ability encountered in many
animals. This ability seems to be essential in their struggle to survive and evolve. Being a
complex distributed information-processing system, the brain's dynamics suggests
mechanisms to perform arithmetic operations in distributed organizations. Here, fuzzy
numbers play an essential role, since they are part of the effort to process approximate
quantities as a means to gain competitiveness. In this vein, Massad and Rocha [10] have
suggested the definition of K-fuzzy numbers, numbers whose precision depends on the
number itself. The reader is referred to [10] for details.
Most, if not all, of the current efforts to bring fuzzy arithmetic closer to meaningful
and intuitively understandable results, as required by models of real-world systems, still
lack formal and computational analysis. The requisite constraint approach [6] does provide
an appropriate framework to verify the properties and characteristics of the arithmetic
operations; however, it is not yet clear what these properties are and which algorithms are
relevant for efficient computational implementations. The discrete fuzzy arithmetic
approach [8] shows indications of performing well computationally and of being semantically
consistent with the expected results, but the properties of its arithmetic operations are still to
be characterized. The common sense view of fuzzy arithmetic has been shown to lack
associativity and to be difficult to compute [9]. Shape preservation is not, in general,
guaranteed by any method. Recent research on fuzzy operations with t-norms has provided
hints on which classes of t-norms are attuned to shape preservation for some operations.
The discrete fuzzy arithmetic approach claims to provide a solution to the overestimation
problem. In fact, the examples discussed in [7, 8] do suggest that this is the case, but
generally speaking, the formalization of overestimation is still an open issue.
5. Conclusions
Acknowledgements
The author acknowledges CNPq, the Brazilian National Research Council, for its
support.
References
[1] G. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall, Upper Saddle
River, 1995.
[2] H. Nguyen and E. Walker, A First Course in Fuzzy Logic. CRC Press, Boca Raton, 1997.
[3] D. Dubois and H. Prade, Fuzzy Numbers: An Overview. In: J. Bezdek (ed.) Analysis of Fuzzy
Information, CRC Press, Boca Raton, 1988, vol. 2, pp. 3-39.
[4] M. Mares, Weak Arithmetics of Fuzzy Numbers. Fuzzy Sets and Systems, 91 (1997) 143-153.
[5] R. Giachetti and R. Young, A Parametric Representation of Fuzzy Numbers and their Arithmetic
Operators. Fuzzy Sets and Systems, 91 (1997) 185-202.
[6] G. Klir, Fuzzy Arithmetic with Requisite Constraints. Fuzzy Sets and Systems, 91 (1997) 165-175.
[7] M. Hanss, On Implementation of Fuzzy Arithmetical Operations for Engineering Problems. Proceedings
of NAFIPS 1999 (1999) 462-466, New York.
[8] M. Hanss, A Nearly Strict Fuzzy Arithmetic for Solving Problems with Uncertainties. Proceedings of
NAFIPS 2000 (2000) 439-443, Atlanta.
[9] V. Kreinovich and W. Pedrycz, How to Make Sure that ≈100 + 1 is ≈100 in Fuzzy Arithmetic: Solution
and its (Inevitable) Drawbacks. Proceedings of FUZZ-IEEE (2001) 1653-1658, Melbourne.
[10] E. Massad and A. Rocha, Implementing Mathematical Knowledge in a Distributed Intelligent Processing
System. Proceedings of IPMU'2002, Annecy, July 2002 (to be published).
[11] D. Filev and R. Yager, Operations on Fuzzy Numbers via Fuzzy Reasoning. Fuzzy Sets and Systems, 91
(1997) 137-142.
62 Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract
This paper describes the results of an experimental study of the cerebral activity
associated with, and the time required for, arithmetic operation solving in adults and
children. The main findings show that males are faster than females in solving
arithmetic calculations, independently of age and degree of education. This difference
in performance is supported by different neuronal recruitment in men and women.
1. Introduction
2. Methodology
Experimental group A was composed of 8 girls and 8 boys attending the second
year of elementary school, mean age around 8 years. Experimental group B was
composed of 10 males and 10 females enrolled in a technology master's course, mean age
around 30 years.
F. T. Rocha and A.F. Rocha / The Brain and Arithmetic Calculation 63
The recorded EEG was analyzed according to the technique described by Foz et al. [12].
Numerical data were analyzed according to group, sex and mathematical
operation. Non-parametric statistics were used to evaluate mean sex differences. The
following index was used to quantify sex differences [7]:

I = (RTf − RTm) / SDg

where RTf and RTm stand for the mean response times for females and males in group g, and
SDg stands for the standard deviation calculated for each experimental group. Factor
analysis was used to disclose the principal cerebral activity components associated with each
arithmetic calculation and sex.
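In code the index is a one-liner. Note that the paper leaves the exact definition of SDg slightly open, so pooling both groups' times for the standard deviation is an assumption of this sketch, and the response times below are invented for illustration.

```python
from statistics import mean, stdev

def sex_index(rt_f, rt_m):
    """I = (RTf - RTm) / SDg: difference of mean female and male response
    times, scaled by the group's standard deviation (here computed over the
    pooled times of both sexes -- an assumption)."""
    return (mean(rt_f) - mean(rt_m)) / stdev(rt_f + rt_m)

# hypothetical response times (seconds) for one arithmetic operation
I = sex_index([5.0, 6.0, 7.0], [4.0, 5.0, 6.0])
```

A positive I means females were slower on average, in units of the group's spread, making operations comparable across groups with different overall speeds.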
3. Results
The statistical analysis of the data showed a clear difference between the sexes for the
group of adults (B in Fig. 2). Males were faster than females in solving all arithmetic
operations. These differences were greatest for addition and division, and smallest
for product.
Fig. 1 legend — AD: Addition; SU: Subtraction; PR: Product; DI: Division; FM1, FM2, FM3: the three
principal components identified by factor analysis.
Factor analysis disclosed three principal components of cerebral activity associated
with arithmetic calculations in both experimental groups (Fig. 1), which accounted for
around 30%, 30% and 20% of the data variance, respectively. The brain areas associated with
each factor varied according to the calculation being performed, sex and experimental
group. Despite this, some general patterns of brain activation may be assumed to exist in
each experimental group.
In the case of male adults, FM1 tended to involve left-hemisphere areas in all types
of calculation, while it tended to group bilateral frontal regions in the case of female
adults. FM3 tended to involve right-hemisphere areas in both groups, but also the left
hemisphere in females. FM2 tended to be much more similar across the different
arithmetic calculations for females than for males. Women tended to use neurons in
central and parietal areas more frequently than men, who in turn tended to recruit the
right hemisphere more. Sex differences were less pronounced for product than for the
other types of calculation.
In the case of children, FM1 involved a huge number of areas in both hemispheres,
such that sex differences may appear more blurred than for the other two factors. FM2 tended
to group central, parietal and left occipital areas in both boys and girls, but it also recruited
right-hemisphere neurons in girls. FM3 seems to be the factor that shows the greatest sex
differences in the case of children. It also exhibits greater variety when the different
types of operations are considered.
4. Conclusion
The present results seem to confirm those assumptions, since factor analysis
disclosed correlated activity mostly among areas that are classically associated with non-
verbal processing. Most of the areas shown to participate in arithmetic calculations are
distributed over the right hemisphere and the occipital lobe, which are regions assumed to
be mostly involved with visual processing [10, 14, 15]. Other frontal and parietal areas
appearing here to be associated with calculation solving have been proposed to be involved
with hand and eye movement control [16, 17].
The present results also seem to show that people may use different strategies to
solve the same arithmetic problem. The variability of the brain patterns disclosed by the
factor analysis when the type of calculation, sex and age are considered may reflect the use
of different neural circuits to achieve the same result. For instance, adult males may have
relied more upon visual operations supporting block counting, as evidenced by the substantial
use of the right hemisphere shown by FM2 and FM3, while females used more frontal and
parietal neurons, as shown by FM1 and FM2, to simulate a mental sequential (finger)
counting.
The performance improvement from infancy to adulthood may be associated with the
dramatic change in the style of brain activation shown by factor analysis when children
and adults are compared. Children seem to recruit many neurons widely distributed over
the brain to solve all types of calculations, as shown by FM1, whereas no such pattern is
exhibited by any of the factors in the case of adults. There also seems to be greater
variability of FM2 and FM3 across all arithmetic calculations for children compared to
adults.
The present results and those published by Rocha and Rocha [13] seem to support
the conclusion that the different strategies used to solve arithmetic problems are supported
by a counting process that does not depend on language, but relies on a distributed processing
system specialized for quantification and calculation, as proposed by Massad and Rocha
[3, 4].
References
[15] K. Sathian, T.J. Simon, S. Peterson, G.A. Patel, J.M. Hoffman and S.T. Grafton, Neural evidence
linking visual object enumeration and attention. J. Cog. Neurosci. 11, 1999, pp. 36-51.
[16] R.A. Anderson, Multimodal integration for the representation of space in the posterior parietal cortex. In
The Hippocampal and Parietal Foundations of Spatial Cognition, N. Burgess, K.J. Jeffery and J. O'Keefe
(eds), Oxford University Press, 1999, pp. 90-101.
[17] C.R. Olson, S.N. Gettner and L. Tremblay, Representation of allocentric space in the monkey frontal
lobe. In The Hippocampal and Parietal Foundations of Spatial Cognition, N. Burgess, K.J. Jeffery and
J. O'Keefe (eds), Oxford University Press, 1999, pp. 357-380.
68 Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract
1. Introduction
a) a finite set of agents, each of them instrumented to solve a given class of problems, and
b) a finite set of resources to support communication transactions among these agents,
such that the solution of a complex task becomes a job for a group of agents recruited
according to the suitability of their tools to handle the given problem.
DIPS knowledge ought to be spatially distributed rather than being the privilege of one
or a few agents. The resulting redundancy makes this knowledge resistant to corruption, and
it is mainly implemented by a set of agents having similar but not identical tools to handle
the same piece of information.
The solution of any (new) task different from those already solved by the system is to
be pursued by attempting either novel agent recruitment or new agent specialization.
Therefore, DIPS intelligence is determined both by the plasticity of communication
transactions and by the evolution of agent specialization.
The brain is the most complex DIPS known, and special attention has recently been
paid to understanding its mathematical capabilities [4, 5]. Basic to the issue is the process of
quantifying the numerosity, or cardinality, of a set. Human beings call this process
counting and claim it as their privilege. However, the neurosciences have
demonstrated that many animals are also able to quantify the cardinality of those sets
(of food or predators) important to their survival, and arithmetic calculation has been
proposed to evolve from cardinality quantification [6, 7].
A.F. Rocha and E. Massad/Evolving Arithmetical Knowledge 69
The purpose of this paper is to discuss how these counting and arithmetic capabilities
may be implemented in a DIPS and how they may evolve from simple systems, such as those
used by animals, to our complex mathematical knowledge.
The paper is organized as follows. Sections 2 and 3 describe the DIPS model
introduced by Massad and Rocha [8, 9] to account for cardinality quantification as proposed
by the neurosciences. Section 4 proposes how changes in the strategies used by some DIPS
agents may result in the evolution of counting from animals to modern humans. Section 5
makes some comments on this evolution.
a) a finite set C of control agents in charge of moving S over U, in order to cover the
subspace V containing oi, that is

V ⊆ ∪ Ft, t = 1 to u   (1)

where Ft is the subspace of U sensed by S at step t, whose size is F, and u is the
number of steps required to cover V;
c) a finite set Q of agents that are able to recognize the quantities accumulated by A by
classifying their readings of σ. Here, each qj ∈ Q performs computations of the type

uj = dj if α < σ < β, otherwise uj = η   (3)

where uj is the output of qj; α, β > 0; dj is a label in a dictionary D, and η is the empty label
in D.
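Agents of types (a)-(c) can be caricatured in a few lines. This is an illustrative sketch only: the unit-pulse readings and the threshold values are assumptions, and the linear accumulation rule (eq. 2, not reproduced in the text above) is stood in for by a plain running sum.

```python
def accumulate(pulses):
    """Accumulator agent A: linearly sums the unit pulses produced while
    the sensory system S sweeps the subspace V (a stand-in for eq. 2)."""
    sigma = 0.0
    for p in pulses:
        sigma += p
    return sigma

def quantify(sigma, alpha, beta, label, empty="~"):
    """Quantifier agent q_j (eq. 3): outputs its dictionary label d_j when
    its reading sigma falls inside (alpha, beta), else the empty label."""
    return label if alpha < sigma < beta else empty

sigma = accumulate([1, 1, 1])  # three objects encountered during the sweep
# a bank of quantifiers, each tuned to one cardinality
labels = [quantify(sigma, k - 0.5, k + 0.5, str(k)) for k in range(1, 6)]
```

Only the quantifier whose window brackets the accumulated reading emits its label; all others emit the empty label, which is the distributed flavor of counting the model relies on.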
3. K Fuzzy Number
In other words:

di+1 = di ⊕ d1,  di = di ⊕ η   (9)

and

di+1 = di ⊗ d1,  di = di ⊗ η   (10)

otherwise

(12)

dg = ⊕ dn   (13)

Yg = K(dg),  Yg ∈ D   (14)

Dc = [d1, ..., dj]   (17)
a) changing from the linear accumulation defined by eq. 2 to a non-linear function such as

Pt = Σt ωt · pt   (20)

as depicted in fig. 3;
b) creating a hierarchy of subsets of accumulators (A1, A2, ...) to monitor the
discontinuities in the lower-level accumulators;
c) creating a hierarchy of sequential quantifiers (B1, B2, ...) associated with that of the
accumulators (see fig. 3), and
d) associating the monotonic increasing accumulation with the crisp subset

Dc = [d1, ..., dj]

in 17.
This type of DIPS is able to handle a set of Crisp Base Numbers (CBN) whose block size
is defined by the crisp set in 17. Our decimal numbers are the special case of CBN with

Dc = [1, ..., 9, 0]   (22)
5. Conclusion
The model proposed here incorporates many of the basic properties of the neural
systems involved in quantifying the cardinality of sets [4, 5] and defines a special class of
fuzzy numbers, called here K Fuzzy Numbers (KFN), for which the base of the membership
function is a function of the number's magnitude. The properties of KFNs map nicely onto
the experimental data obtained for both animals and humans in set-cardinality estimation.
More important is the fact that KFN circuits may evolve into other DIPS
having the capability of handling crisp numbers, or Crisp Base Numbers (CBN). This
evolution may be attained by
a) changing the linear accumulation function used by some of the DIPS agents into a
non-linear estimation;
74 A.F. Rocha and E. Massad / Evolving Arithmetical Knowledge
The biological evolution from other animals to man may be understood as
consequent to gene modifications allowing:
a) first, the change from linear to periodical accumulation functions used by some cells
located at multi-sensorial areas, such as the parietal lobe, and
b) second, an increase in the number of available accumulators and quantifiers.
The genetic evolution may allow the two types of accumulators to exist in the brain,
with the most primitive normally predominating if learning of a culturally transmitted
numerical system does not occur [9]. This could explain the use of KFN by children and
many indigenous cultures.
The discovery of CBN by humans was certainly favored by the genetic evolution
creating non-linear accumulators. However, the transmission of this discovery must have
been guaranteed by memetic inheritance [9]. Such inheritance may be proposed to be
supported by learning through imitation or teaching. Both mechanisms may recruit the
biochemical transactions required to activate the reading of the genes for non-linear
accumulation and, if necessary, to block the expression of the genes for linear estimation
[10].
It may be concluded, therefore, that the model presented here to implement arithmetical
knowledge in a DIPS not only incorporates many of the basic properties of the neural
systems involved in quantifying the cardinality of sets, but also may explain the evolution
from KFN to CBN that has occurred in the human species.
References
[1] W. Pedrycz, P.C.F. Lam and A.F. Rocha, Distributed Fuzzy System Modeling. IEEE Trans. Sys. Man
Cyber. 25, 1995, pp. 769-780.
[2] A.F. Rocha, The brain as a symbol processing machine, Progress in Neurobiology 53, 1997, pp. 121-198.
[3] A.F. Rocha, A. Pereira Jr. and F.A.B. Coutinho, NMDA Channel and Consciousness: from Signal
Coincidence Detection to Quantum Computing, Progress in Neurobiology 64, 2001, pp. 555-573.
[4] B. Butterworth, The mathematical brain, Macmillan, London, 1999.
[5] S. Dehaene, The number sense, Penguin Books, London, 1997.
[6] M. Fayol, L'enfant et le nombre: du comptage à la résolution de problèmes, Delachaux & Niestlé, Paris,
1990.
[7] R.S. Siegler, Emerging minds, Oxford University Press, London, 1996.
[8] E. Massad and A.F. Rocha, Implementing arithmetical knowledge in a distributed intelligent processing
system. To appear in Proceedings of IPMU, 2002a.
[9] E. Massad and A.F. Rocha, Meme-gene Coevolution and Cognitive Mathematics. To appear in this
Proceedings, 2002b.
[10] A. Pereira Jr., Neuronal Plasticity: How Memes Control Genes. To appear in this volume, 2002.
Advances in Logic, Artificial Intelligence and Robotics 75
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Meme-Gene Coevolution and Cognitive Mathematics
E. Massad and A.F. Rocha
Abstract
After presenting the basic aspects of the new science of memetics and the basic
mechanism of meme-gene coevolution, we discuss the role of memes in the
evolution of brain size and complexity. We also address the issue of cognitive
mathematics, that is, how the human brain deals with numbers, and present some
hypotheses on the evolution of our mathematical abilities. We end the paper with a
brief discussion of the future perspectives of memetics and cognitive mathematics,
arguing that this is a very rich field of investigation, still in its infancy.
1. Introduction
Uniquely among animals, humans are capable of imitation and so can copy from
one another ideas, habits, skills, behaviors, inventions, songs and stories. These are all
memes, a term which first appeared in Richard Dawkins' book The Selfish Gene [1]. In that
book, Dawkins dealt with the problem of biological (or Darwinian) evolution as
differential survival of replicating entities. By replicating entities Dawkins meant,
obviously, genes. Then, in the final part of his book, Dawkins asked the question 'are
there any other replicators on our planet?', to which he answered 'yes'. He was referring
to cultural transmission and fancied another replicator - a unit of imitation [2].
Dawkins first thought of 'mimeme', which had a suitable Greek root (Dawkins' words), but
he wanted a monosyllabic word which would sound like 'gene', and hence the abbreviation
of mimeme - meme. A revolutionary new concept (actually, a truly Kuhnian paradigm
shift) was born. Like genes, memes are replicators, competing to get into as many brains as
possible.
One of the most important memes created by humans is the concept of numbers. As
mentioned by Dehaene [3], numbers are cultural inventions only comparable in importance
to agriculture or to the wheel. Counting, however, is a process observable in a great
number of non-human species. Millions of years before the first written numeral was
engraved on bones or painted on cave walls by our Neolithic ancestors, animals belonging
to several species were already registering numbers and entering them into simple mental
computations [3]. We, modern humans, differ from the other species by being able to deal
with numbers in a highly sophisticated manner, rather than the simple block-counting
process characteristic of lower species, which is typically limited to 3 or 4 units.
The purpose of this paper is to introduce the logic and basic assumptions of
memetics, presenting its definitions and proposing memetics as a potential answer to the
problem of brain evolution and the human capacity for dealing with complex mathematical
processes.
76 E. Massad and A.F. Rocha / Meme-Gene Coevolution
Memetic and genetic evolution can interact in rich and complex ways, a
phenomenon described as 'meme-gene coevolution' [2]. Cultural evolution is a branch of
theoretical population genetics and applies population-genetics models to investigate the
evolution and dynamics of cultural traits equivalent to memes [4]. Gene-culture coevolution
employs the same methods to explore the coevolution of genes and cultural traits. Meme
evolution, in turn, may occur either exclusively at the cultural level or through meme-gene
interaction, both mechanisms having important consequences. In the following subsections
I intend to present examples of those consequences of meme dynamics.
One important concept for the following arguments is that of meme-gene
coevolution. According to this theory, social learning and memetic drive are the cause
behind the evolution of increased brain size [4]. According to Blackmore [2], this process
has three stages. In the first, individuals with a genetic predisposition for imitating would
succeed over those that only learn directly from the environment. This argument assumes
that the object or the behaviour imitated increases the performer's fitness. In the second
stage, in a population of imitators, those with a genetic predisposition to imitate the
best imitators (selective imitators) would be selected. Thirdly, assortative mating between
selective imitators would produce the most successful offspring, who would inherit the
genetic predisposition to imitate selectively, and the imitation-gene frequency would
increase in this population. As selective imitation requires a greater than average cognitive
capacity, there is, therefore, a selective pressure for an increase in the average brain size. In
addition, a minimally sophisticated language system is required for meme spreading,
which also contributes to the increase in brain size. Hence, meme-gene coevolution
acted synergistically in the evolution of our sophisticated language and enormous brain.
Since, unlike genes, memes do not come packaged with instructions for their
replication, our brains do it for them, strategically, guided by a fitness landscape that
reflects both internal drives and a world view that is continually updated through meme
assimilation [5].
The increase in brain size began with the macroevolutionary events that culminated
in the first species of the Homo genus, about two and a half million years ago. By about
100,000 years ago H. sapiens had brains as large as ours, and the other sapiens sub-species,
the Neanderthals, had brains with a volumetric capacity larger than ours. They controlled
fire, had cultures and probably had some form of language as well.
The increase in brain size, however, has a price [2]. Oversized brains are expensive
to run, and ours consumes 20% of the body's energy for a size corresponding to only 2% of its
weight. In addition, brains are expensive to build. The amount of protein and fat necessary
for the development of the human brain forced the first members of the Homo genus to
increase their meat consumption, which implied better hunting strategies, which in turn
fed back into increased brain size. Finally, big brains are dangerous to produce. The
increase in brain size, along with the bipedal gait of Homo specimens, implied severe
birth risks. Big-brained human babies have enormous difficulty passing through the
narrowed birth canal. This implies, in addition to higher maternal and foetal mortality,
that the human baby is born prematurely, as compared with other primates. On one hand this
has the beneficial consequence that our brain has greater neuronal plasticity, which
increases its learning capacity. On the other hand, the complicated twisting manoeuvres
the human foetus has to perform in order to pass through the birth canal imply that the human
female is rarely able to deliver without assistance. This also contributes to socialisation
and additional selection for brain increase.
Our brains have changed in many ways other than just size. The modern human
prefrontal cortex, oversized when compared with that of other hominids, is fed by neurons
coming from practically all other parts of the brain. Its role in the complex cognitive
abilities of modern humans is still to be deciphered, but we already know that when it is
damaged by accident or surgical removal (a common practice some decades ago) the
victim is severely limited in calculation performance (in addition to many other
personality changes). We will return to the prefrontal cortex later in this text.
So why did the human brain increase so much? Several theories have been
proposed, but in this paper we will adopt the meme theory proposed by Blackmore [2].
According to this author, the turning point in our evolutionary history was when we began
to imitate each other. This implies that the selection pressures which produced the increase
in our brain size were initiated and driven by memes, according to the following rationale.
Suppose that your neighbour has learned (or invented) some really useful trick or tool.
Then, by imitating him, you can avoid the slow, costly and potentially dangerous process of
trying out new solutions by yourself. The competitive edge imitation can confer is obvious.
Imitation, however, requires 3 skills [6]:
In summary, big brains are a far from expected natural event in our evolutionary path,
due to the heavy constraints they impose. Therefore, the gene complex responsible for the
increase in brain size must provide a selective advantage to its possessors. The memetic
theory of brain size assumes that imitation was the driving force behind the selective
pressures - genes have been forced into creating big brains capable of spreading memes.
We are born with a capacity to enumerate objects which is strictly limited to very
few items, and we share this genetically determined characteristic with more 'primitive'
(that is, less-brained) animals. Several lines of empirical evidence have already demonstrated that
the counting capacity is innate in rats, pigeons and monkeys [3]. There is even an
observable and remarkable competence of human babies in simple arithmetic. This
genetically determined arithmetical competence has been studied by several authors since the
beginning of the last century. Some interesting experiments by early psychologists were
summarised by Dehaene [3], in which human subjects are asked to enumerate objects. The
results are that enumerating a collection of items is fast when there are one, two or three
items, but starts slowing drastically beyond four. In addition, errors begin to accumulate at
the same point. To circumvent this genetic upper bound on our counting capacity we
invented a clever strategy: counting in blocks, a process called in the specialised literature
'subitizing'. It takes about five- or six-tenths of a second to identify a set of three dots, or
about the time it takes to read a word aloud or to identify a familiar face, and this time
slowly increases from 1 to 3 items [3]. Therefore, subitizing requires a series of visual operations,
all the more complex the greater the number to be recognised. But we are endowed with a
highly sophisticated thinking machine whose sheer size and tremendously complex wiring
allow us an outstanding cognitive performance without parallel in the phylogenetic scale.
Even Neanderthals, whose brains were equal to or slightly bigger than ours, but whose
neuronal composition and configuration were probably strikingly different from ours,
could not have imagined reaching the complexity of our cerebral products, like modern
mathematics. How and why did these cognitive abilities evolve? It is tempting to answer this
question by the obvious assumption that the larger the brain, the greater its owner's capacity to
adapt and survive in aggressive and/or rapidly changing environments. However, as
mentioned above, large brains are expensive if overdimensioned, and the role of memes in
the evolution of brain size was already explained in the previous section. One could then
argue that, in addition to the size of the brain, it is its configuration which is responsible
for the differential mathematical competence of modern humans. But is there a cerebral
region responsible for mathematical thinking? The first experiments, carried out in the
early eighties, demonstrated a higher cerebral activity during numerical performance, in
particular in the inferior parietal cortex as well as in multiple regions of the prefrontal cortex [7].
Recent experiments with functional magnetic resonance imaging, however, demonstrated that
several other cerebral areas are activated during mental calculations [3]. It is now accepted
that the inferior parietal region is important for the transformation of numerical symbols
into quantities, and for the representation of relative number magnitudes. The extended
prefrontal cortex, in turn, is responsible for the sequential ordering of successive operations,
control over their execution, error correction, inhibition of verbal responses, and, above all,
working memory. It is impossible to dissociate memory from calculation!
Let us imagine the African environment of about 150,000 years ago. The first wave
of Homo erectus emigration out of Africa had already occurred. Groups of Neanderthals
wandered throughout Europe and Asia, endowed with a cerebral capacity of the same
order of magnitude as modern humans but, as mentioned above, with a neuronal
configuration certainly different from ours. At that time, the first modern humans were
organising themselves into tribes, and the cognitive threshold of meme spreading had
already been surpassed. These humans certainly had a cultural structure more
sophisticated than that of the European and Asiatic Neanderthals and, after a second emigration
wave out of Africa, they encountered, clashed against and even mated with their European
cousins, some 100,000 years later. It is now widely accepted that, in spite of the fact that
both modern humans and Neanderthals had the same cranial capacity, the organisational
competence of the former displaced, and even caused the extinction of, the latter. It is even
possible to speculate that a superior mathematical competence of modern humans
contributed to the decisive events that culminated in the extinction of Neanderthals.
Strategic thinking involves, among other things, a mathematical sophistication that was
probably inferior in Neanderthals when compared with modern humans. In addition, as
explained in the previous section, the prefrontal cortex, a well-developed area in modern
humans and very small in Neanderthals, is one of the predominant cerebral areas in
mathematical processing.
Well before these events, somewhere between 1 and 2 million years ago, Homo ergaster
started the evolutionary line which culminated in the modern human [8]. Actually, two
other species of the genus Homo preceded H. ergaster, namely Homo habilis and Homo
rudolfensis. However, the sophistication of the brain which resulted in our cognitive
capacity to deal with numbers probably started with Homo ergaster. What happened at that
time? It is possible that a particular individual happened to be endowed, by a set of more
or less complex mutations, with a capacity for numerical processing well above
simple block counting. This individual was able to count objects, potential weapons and
enemies. His edge over his competitors was obvious. Suppose now another individual,
minimally endowed with the cerebral capacity and with a gene complex for imitation. If
this second individual is able to imitate the counting process adopted by the first one, his
edge over competitors increases as well. If the memes for counting are appropriately
imitated by those individuals with a higher cerebral capacity, their differential
reproductive success would guarantee that their genes for increased cerebral capacity would
start to spread in this population. Hence the meme-gene coevolution of mathematical
abilities.
8. Final conclusions
changing the world, and our world vision, has certainly helped in the evolution of our
cognitive capacity. Since the first hominids surpassed the imitation threshold, individual
mathematical memes more complex than our innate subitizing, in addition to all the other
memeplexes that characterise human culture, have been transmitted by imitation and, later
on, by teaching, creating an autocatalytic virtuous circle that culminated in the human brain.
References
[1] Dawkins, R., The Selfish Gene. Oxford University Press, 1976.
[2] Blackmore, S., The Meme Machine. Oxford University Press, 1999.
[3] Dehaene, S., The Number Sense: How the Mind Creates Mathematics. Penguin Books, 1997.
[4] Kendal, J.R. and Laland, K.N., Mathematical Models for Memetics. Journal of Memetics - Evolutionary
Models of Information Transmission, 4, http://jom-emit.cfpm.org/2000/vol4/kendal_jr&laland_kn.html,
2000.
[5] Gabora, L., The Origin and Evolution of Culture and Creativity. Journal of Memetics - Evolutionary
Models of Information Transmission, 1, http://jom-emit.cfpm.org/vol1/gabora_l.html, 1997.
[6] Blackmore, S.J., Probability misjudgement and belief in the paranormal: a newspaper survey. British
Journal of Psychology, 88, 1997, pp. 683-689.
[7] Roland, P.E. and Friberg, L., Localization of cortical areas activated by thinking. Journal of
Neurophysiology, 53, 1985, pp. 1219-1243.
[8] Conroy, G.C., Reconstructing Human Origins: A Modern Synthesis. W.W. Norton & Company, 1997.
[9] Rocha, A.F. and Massad, E., Evolving Arithmetical Knowledge in a Distributed Intelligent Processing
System, 2002. This volume.
Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Neuronal Plasticity: How Memes Control Genes
A. Pereira Jr.
Abstract
This paper gives an outline of the operational steps by which perceived patterns
of behavior are transduced from peripheral sensors to the central nervous
system, and then to synaptic and intra-neuronal mechanisms called Signal
Transduction Pathways (STPs). STPs control cyclic AMP molecules that
activate transcription factors, leading to differential disinhibition of genes. One
of the consequences of the expression of such genes, besides the production of
proteins present in the previously activated STPs, is the activation of neural
trophic factors (NTFs) that control dendritic growth, and therefore contribute to
shaping the morphology and patterns of neural connectivity in the brain. As this
morphology and these dynamical connections influence the behavior of the individual
in the social context, a dynamic rapport between cultural patterns ("memes") and
his/her genetic system can be established in epigenesis.
1. Introduction
The concept of "memes" comes from a Darwinian tradition [1] and has
contributed to a broader understanding of primate evolution, in terms of a co-
evolutionary process involving an interplay of cultural and biological patterns (see
Deacon [2]). In this paper I will consider the possibility of studying this interplay from
the perspective of cognitive neurobiology.
A meme translates into a complex, usually multimodal sequence of stimuli,
generated from a repetitive pattern of behavior in a social context. The perception of a
meme involves not only single objects, their features and temporal processes involving
them, but also associations between such objects, features and processes. For instance,
perception of the use of a stone tool involves recognition of different objects and their
respective features (tool, hand, substrate modified by the tool, product of the
transformation, social consequences of generating this product) and the conception of
the process as a whole, including multimodal associations between proprioceptive,
tactile and visual information.
The concept of meme also refers to the possibility of cultural reproduction of
such behavioral patterns, i.e., the patterns that qualify as memes are those
susceptible to being imitated, thus providing adaptive advantages for the imitators in
the group. Therefore, the concept of meme in cognitive neurobiology should also
account for a loop between complex perceptual patterns and behavioral schemes
("action schemes", as discussed by Pereira Jr. [3]). Therefore, a meme is a perceived
behavioral scheme that can be translated by the brain (with the involvement of the pre-
motor cortex, as argued by Gallese et al. [4]) into an action scheme, making the
imitation of the perceived behavior possible.
A. Pereira Jr. / Neuronal Plasticity: How Memes Control Genes 83
The possibility of the brain being shaped by experiences has been clearly
defined in recent studies of regulatory processes in the cell. Although all somatic cells
in a biological individual share practically the same genes, the patterns of expression of
these genes in different tissues are different. In other words, the pool of proteins present
in each tissue is different from the others, although all are produced from the
same genes. The solution to this apparent paradox is to conceive the genome as a
combinatorial system like natural languages, where a practically infinite number of
different sentences can be formed from a finite number of words, which are formed
from a relatively small number of letters.
In the genome, the combination of a small number of "letters" (A, T, C, G, U,
corresponding to the nucleotides adenine, thymine, cytosine, guanine and uracil) is sufficient
to generate different "words" (genes), which can be combined to generate different
"sentences" (linear sequences of amino acids that fold into proteins).
Early in ontogeny, dynamical processes in the cytoplasm, beginning with
differential factors already present in the ovule [5], as well as signaling proteins derived
from "notch" genes (see Russo [6]), are believed to contribute to embryo differentiation
into specialized tissues. Each tissue has a different pool of proteins, and this
difference is maintained since each pool of proteins activates the genome to produce the
same proteins (this idea is close to the idea of an "autopoietic" process proposed by
Maturana and Varela [7], although these authors didn't focus on the mechanism of tissue
differentiation). Such an auto-regulation of protein production is possible because the
genome is a combinatorial system susceptible to a large number of different "readings".
Based on this reasoning, Rocha et al. [8] presented a formal grammar (G)
composed of a set of symbols corresponding to different nucleotides, and a syntax
defined by a set of combinatorial rules:
G = {Vs, Vn, Vt, P, n}, (1)
P: a Si b —> a Sj b, (2)
where Si is a string of symbols pertaining to the union of all symbol sets, Sj is a string
of symbols pertaining to the union of Vs and Vn, and a and b are contextual parameters.
The genetic grammar L is a sub-set of the strings generated by G. Each string
accepted as belonging to L is a well-formed formula (wff) obtained as the derivation
chain (d) required to transform an initial symbol into a terminal one:
wff(S0, Sm) = d(S0, Sm) = a S1 b —> ... —> a Si b —> Sm, (3)
i.e., a series of transformations beginning with an initial symbol and ending with a terminal
one.
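The rewriting scheme of eqs. (1)-(3) can be sketched as a tiny derivation engine. The symbols and productions below are invented for illustration; Rocha et al. [8] define the actual genetic grammar over nucleotide symbols:

```python
# Toy sketch of context-sensitive rewriting: productions of the form
# a Si b -> a Sj b applied repeatedly until a terminal symbol is reached.
# The derivation chain d(S0, Sm) is then a well-formed formula (wff).

# Each production maps (context a, symbol Si, context b) to replacement Sj.
PRODUCTIONS = {
    ("a", "S0", "b"): "S1",
    ("a", "S1", "b"): "S2",
    ("a", "S2", "b"): "Sm",   # Sm plays the role of the terminal symbol
}

def derive(start, terminal="Sm", max_steps=10):
    """Return the derivation chain from an initial to a terminal symbol."""
    chain = [start]
    sym = start
    for _ in range(max_steps):
        if sym == terminal:
            return chain
        sym = PRODUCTIONS[("a", sym, "b")]  # apply the matching production
        chain.append(sym)
    raise ValueError("no derivation within max_steps")

print(" —> ".join(derive("S0")))  # S0 —> S1 —> S2 —> Sm
```

A string belongs to the grammar exactly when such a chain exists, which is the sense in which eq. (3) defines a wff.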
a) the temporal structure of the evoked response in a cell assembly carries information
about the stimulus;
b) the relative timing of spikes in central neuronal assemblies is reliable and also
carries information;
c) individual neurons participating in an assembly function as oscillators, with less
specialization than previously supposed, being able to selectively "resonate" to a range
of temporal patterns;
d) the possibility of combination of different temporal patterns across different cell
assemblies.
Therefore, the logical grammar of the electrical neural code should account for
three kinds of factors:
When electrical pulses reach the synaptic terminal and open vesicles, thus
releasing transmitters into the synaptic cleft, the patterns of information encoded in
electromagnetic form are transduced to a chemical/molecular code. This code is
composed of three kinds of elements: transmitters, receptors and neuromodulators.
Such elements may or may not have affinity for each other. When they combine (in
combinations of two, three or more elements) they originate specialized agents. These
agents can be classified into five categories:
where Vt is the set of initial symbols belonging to T, Vr the set of intermediary symbols
belonging to R, Vm the set of intermediary symbols belonging to M, Vf the set of
terminal symbols belonging to F, and P and n are defined as in (2).
Strings generated by the STP grammar are represented as a well-formed
formula (wff) obtained as the derivation chain (d) of symbols S required to transform an
initial symbol into a terminal one:
wff(S0, Sr, Sm, Sf) = d(S0, Sf) = a S1 b —> ... —> a Si b —> Sf (5)
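The category structure of the STP grammar can be sketched as a well-formedness check over derivation chains: a chain must start at an initial symbol (set Vt), move through intermediary symbols (Vr, Vm), and end at a terminal symbol (Vf). The concrete molecular names below are hypothetical placeholders, not taken from the paper:

```python
# Hedged sketch of the STP grammar's well-formedness condition (eq. 5).
# The element names are illustrative assumptions only.

VT = {"glutamate"}         # initial symbols (transmitters, set T)
VR = {"NMDA_receptor"}     # intermediary symbols (receptors, set R)
VM = {"cAMP"}              # intermediary symbols (modulator pathway, set M)
VF = {"CREB_activation"}   # terminal symbols (final effects, set F)

def is_wff(chain):
    """A derivation chain d(S0, Sf) is well formed when it starts in Vt,
    ends in Vf, and passes only through intermediary symbols in between."""
    return (len(chain) >= 2
            and chain[0] in VT
            and chain[-1] in VF
            and all(s in VR | VM for s in chain[1:-1]))

assert is_wff(["glutamate", "NMDA_receptor", "cAMP", "CREB_activation"])
assert not is_wff(["cAMP", "CREB_activation"])  # must start at a transmitter
```

The check mirrors the idea that only complete transduction chains, from transmitter binding to a terminal gene-level effect, count as valid "sentences" of the pathway.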
6. Acknowledgements
I would like to express my gratitude to Drs. Armando F. Rocha and Eduardo Massad
for their encouragement to write this paper, and to CNPQ/Brasil for financial support.
References
[1] DAWKINS, R., The Selfish Gene. Oxford University Press, 1990.
[2] DEACON, T.W., The Symbolic Species: The Co-Evolution of Language and the Brain. W.W. Norton and
Co., 1997.
[3] PEREIRA JR., A., A Possible Role for Action Schemes in Mirror Self-Recognition. Revista de Etologia
2, 1999, pp. 127-139.
[4] GALLESE, V., FADIGA, L., FOGASSI, L. and RIZZOLATTI, G., Action Recognition in the Premotor
Cortex. Brain 119, 1996, pp. 593-609.
[5] PEREIRA JR., A., GUIMARÃES, R. and CHAVES JR., J.C., Auto-Organização na Biologia: Nível
Ontogenético. In: Debrun, M., Gonzales, M.E. and Pessoa, O. (Eds.) Auto-Organização - Estudos
Interdisciplinares. Centro de Lógica e Epistemologia/UNICAMP, 1996.
[6] RUSSO, E., Interpreting the Signaling of Notch: Delineation of Biochemical Pathway Points to Possible
Alzheimer's Therapy. The Scientist 15 (4), 2001, pp. 19-22.
[7] MATURANA, H. and VARELA, F., Autopoiesis and Cognition: the Realization of the Living. Boston
Studies in the Philosophy of Science 42/Reidel, 1980.
[8] ROCHA, A.F., REBELLO, M.P. and MIURA, K., Toward a Theory of Molecular Computing.
Information Sciences 106 (1/2), 1998, pp. 123-157.
[9] FOTHERINGHAME, D.K. and YOUNG, M.P., Neural Coding Schemes for Sensory Representation:
Theoretical Proposals and Empirical Evidence. In Rugg, M.D. (Ed.) Cognitive Neuroscience. Cambridge:
MIT Press, 1997.
[10] CONNOR, C.E., HSIAO, S.S., PHILLIPS, J.R. and JOHNSON, K.O., Tactile Roughness: Neural
Codes That Account for Psychophysical Magnitude Estimates. The Journal of Neuroscience 10 (12),
1990, pp. 3823-3836.
[11] CONNOR, C.E. and JOHNSON, K.O., Neural Coding of Tactile Texture: Comparison of Spatial and
Temporal Mechanisms for Roughness Perception. The Journal of Neuroscience 12 (9), 1992, pp. 3414-
3426.
[12] HUBEL, D.H. & WIESEL, T.N., Receptive Fields, Binocular Interaction and Functional Architecture in
the Cat's Visual Cortex. Journal of Physiology 160, 1962, pp. 106-154.
[13] HUBEL, D.H. & WIESEL, T.N., Receptive Fields and Functional Architecture of Monkey Striate
Cortex. Journal of Physiology 195, 1968, pp. 215-243.
[14] PHILLIPS, W.A., Theories of Cortical Computation. In Rugg, M.D. (Ed.) Cognitive Neuroscience.
Cambridge: MIT Press, 1997.
[15] CARIANI, P., As If Time Really Mattered: Temporal Strategies for Neural Coding of Sensory
Information. In Pribram, K. (Ed.) Origins: Brain and Self-Organization. Hillsdale: Erlbaum, 1994.
[16] HOPFIELD, J.J., Pattern Recognition Computation Using Action Potential Timing for Stimulus
Representation. Nature 376, 1995, pp. 33-36.
[17] RIEKE, F., WARLAND, D., STEVENINCK, R.R. and BIALEK, W., Spikes. Cambridge: MIT Press,
1997.
[18] CHANGEUX, J.P. and EDELSTEIN, S.J., Allosteric Mechanisms in Normal and Pathological
Nicotinic Acetylcholine Receptors. Current Opinion in Neurobiology 11 (3), 2001, pp. 369-377.
[19] MERIGHI, A., Costorage and Coexistence of Neuropeptides in the Mammalian CNS. Progress in
Neurobiology 66 (3), 2002, pp. 161-190.
[20] ROCHA, A.F., The Brain as a Symbol-Processing Machine. Progress in Neurobiology 53, 1997, pp.
121-198.
[21] BHALLA, U.S. and IYENGAR, R., Emergent Properties of Networks of Biological Signaling
Pathways. Science 283, 1999, pp. 381-387.
Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
1 DMA - IMECC - UNICAMP, 13.081-970, Campinas/SP, Brazil
2 IGCE - UNESP, 13.506-700, Rio Claro/SP, Brazil
1 Introduction
The models of Kermack and McKendrick were the first mathematical models in Epidemiology.
These models consider that all infected individuals have the same chance of infecting a
susceptible in each meeting. More recent models consider different factors that influence the
occurrence of a new infection, among them the viral charge of infected individuals. Aiming
to obtain information about disease control and to carry out more precise analyses, it is common to
subdivide the infected class into n different stages, according to the infectivity
degree of the individuals. In this case, the complexity of the model increases, making the
analysis of some epidemiological parameters, like the Basic Reproduction Value R0 [4], difficult.
The mathematical treatment of gradual uncertainties, like the one used to differentiate the
individuals within a population, has increasingly utilized techniques of Fuzzy Theory.
Sadegh-Zadeh [7] distinguishes the individuals according to their health state, considering
intermediary stages between health and disease.
In [2] a model is proposed in which the heterogeneity of the infected population is taken into
account, and in which individuals infect differently according to their viral charge. For
this, the contact rate was considered as a fuzzy set, and it was then possible to obtain a fuzzy
Basic Reproduction Value which is different from that of the deterministic model, R0.
In this paper we present more information about disease control based on this fuzzy value.
In addition, a comparative study between the equilibrium points of the disease for the classical and
fuzzy SIS models is carried out.
L.C. de Barros et al. / The Influence of Heterogeneity
2 Preliminaries
A fuzzy subset $F$ of the universe set $U$ is symbolically represented by the membership function $u : U \to [0,1]$, where $u(x)$ indicates the degree of membership of $x$ in the fuzzy set $F$.
In order to use a defuzzification method, we need the concept of fuzzy measure. Let $\Omega$ be a non-empty set and $\mathcal{P}(\Omega)$ the power set of $\Omega$.
The function $\mu : \mathcal{P}(\Omega) \to [0,1]$ is a fuzzy measure if
a) $\mu(\emptyset) = 0$ and $\mu(\Omega) = 1$;  b) $\mu(A) \le \mu(B)$ if $A \subseteq B$.
As a defuzzifier, we are going to use the fuzzy integral, or Fuzzy Expected Value, of a fuzzy set, which is given by
$$FEV[u] = \sup_{0 \le \alpha \le 1} \inf[\alpha, H(\alpha)],$$
where $H(\alpha) = \mu\{x \in \Omega : u(x) \ge \alpha\}$. Thus, $H(\alpha)$ is a decreasing function and $FEV[u]$ is the fixed point of $H(\alpha)$.
Sugeno [6] has proved that $|FEV[u] - E[u]| \le 0.25$ if $\mu$ is a probability measure, $u$ is both a fuzzy set and a random variable, and $E[u]$ is the classical expectation of $u$.
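To make the definition concrete, here is a small numerical sketch of the FEV on a finite sample; the weighted counting probability measure and the discrete α-grid are illustrative assumptions, not part of the paper's construction.

```python
import numpy as np

def fev(u, weights=None, n_alpha=1001):
    """Fuzzy Expected Value: sup over alpha of min(alpha, H(alpha)),
    where H(alpha) = mu{x : u(x) >= alpha} and mu is taken here as a
    (weighted) counting probability measure on the sample points."""
    u = np.asarray(u, dtype=float)
    if weights is None:
        weights = np.full(len(u), 1.0 / len(u))
    best = 0.0
    for a in np.linspace(0.0, 1.0, n_alpha):
        H = weights[u >= a].sum()        # measure of the alpha-cut
        best = max(best, min(a, H))
    return best
```

For membership degrees split evenly between 0 and 1 this returns 0.5, and Sugeno's bound $|FEV[u] - E[u]| \le 0.25$ can be checked numerically on any sample.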
3 The Model
The simplest mathematical model describing the dynamics of directly transmitted diseases, with interaction between susceptible and infected individuals and no conferred immunity, is the SIS model without vital dynamics. Mathematically, it is described by a non-linear system of differential equations:
$$\frac{dS}{dt} = -\beta S I + \gamma I, \qquad \frac{dI}{dt} = \beta S I - \gamma I. \tag{1}$$
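For intuition about the SIS dynamics, a forward-Euler sketch (normalizing $S + I = 1$ so only $I$ is tracked; the step size and horizon are arbitrary choices, not from the paper) shows the solution settling at the endemic level $1 - \gamma/\beta$ when $\beta > \gamma$:

```python
def simulate_sis(beta, gamma, i0=0.01, dt=0.01, steps=20000):
    """Forward-Euler integration of dI/dt = beta*S*I - gamma*I
    with S = 1 - I (proportions, so S + I = 1)."""
    i = i0
    for _ in range(steps):
        i += dt * (beta * (1.0 - i) * i - gamma * i)
    return i

# beta > gamma: the infection settles at I* = 1 - gamma/beta
print(simulate_sis(2.0, 1.0))   # close to 0.5
```

With $\beta < \gamma$ the same iteration drives $I$ to zero, matching the threshold behavior of $R_0 = \beta/\gamma$.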
We have chosen for the fuzzy subsets $\beta$ and $\gamma$ the following membership functions ([1]):
$$\beta(v) = \begin{cases} 0 & \text{if } v \notin [v_{\min}, v_{\max}] \\ \dfrac{v - v_{\min}}{v_M - v_{\min}} & \text{if } v \in [v_{\min}, v_M] \\ 1 & \text{if } v \in [v_M, v_{\max}] \end{cases} \qquad \text{and} \qquad \gamma(v) = \frac{\gamma_0 - 1}{v_{\max}}\, v + 1,$$
with $0 < \gamma_0 \le 1$.
Traditionally, a disease control program is based on the Basic Reproduction Value $R_0$, which gives the number of secondary cases caused by an infected individual introduced into a wholly susceptible population.
For the classical SIS model we have $R_0 = \beta/\gamma$, and the disease will be extinct if $R_0 < 1$ and will become established in the population if $R_0 > 1$ [3].
The use of $R_0(v)$ to control the disease presupposes knowledge of the viral charge of the whole population. To make the model more realistic, we consider that the viral charge $v \in [0, v_{\max}]$ has different chances of occurrence in the population, i.e., each individual contributes differently to the disease propagation, with membership function given by $\rho(v)$:
$$\rho(v) = \begin{cases} 0 & \text{if } v \notin [\bar v - \delta, \bar v + \delta] \\ \dfrac{1}{\delta}(v - \bar v + \delta) & \text{if } v \in [\bar v - \delta, \bar v] \\ -\dfrac{1}{\delta}(v - \bar v - \delta) & \text{if } v \in (\bar v, \bar v + \delta] \end{cases}$$
The parameter $\bar v$ is the median viral charge (in this case, the median viral charge of the infected individuals), and $\delta$ is the dispersion. Observe that $\rho(v)$ is typically a membership function of a triangular fuzzy number [5].
The Fuzzy Basic Reproduction Value $R_0^F$ will be defined using the concept of Fuzzy Expected Value (FEV). This new parameter is a kind of average of $R_0(v)$, taking into account the distribution of the viral charge $v$ according to the membership function $\rho(v)$.
Note that $R_0(v)$ is not a fuzzy set, since it can be bigger than one. However, it is easy to see that the maximum value of $R_0(v)$ is $1/\gamma_0$. Then $\gamma_0 R_0(v) \le 1$, indicating that $\gamma_0 R_0$ is a fuzzy set, and in this case $FEV[\gamma_0 R_0]$ is well-defined.
We define the Fuzzy Basic Reproduction Value for the whole population by:
$$R_0^F = \frac{1}{\gamma_0}\, FEV[\gamma_0 R_0]. \tag{2}$$
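Definition (2) can be evaluated numerically. The sketch below assumes, for illustration, the conservative possibility measure $\mu(A) = \sup_{v \in A} \rho(v)$; the grid resolution and the test functions are likewise illustrative, not the authors' construction.

```python
import numpy as np

def r0_fuzzy(beta, gamma, rho, gamma0, v_grid, n_alpha=1001):
    """R0F = (1/gamma0) * FEV[gamma0 * R0(v)], with R0(v) = beta(v)/gamma(v)
    and the possibility measure mu(A) = sup over v in A of rho(v)."""
    g = gamma0 * beta(v_grid) / gamma(v_grid)   # gamma0 * R0(v), values in [0, 1]
    r = rho(v_grid)
    best = 0.0
    for a in np.linspace(0.0, 1.0, n_alpha):
        cut = g >= a
        H = r[cut].max() if cut.any() else 0.0  # sup of rho on the alpha-cut
        best = max(best, min(a, H))
    return best / gamma0
```

As a sanity check, if $R_0(v)$ is constant at $c$ and $\rho$ is normal (reaches 1 somewhere), the computation returns $c$, agreeing with the classical value.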
As seen in the preliminary section, the FEV requires a fuzzy measure $\mu$; here we adopt the possibility measure
$$\mu(A) = \sup_{v \in A} \rho(v).$$
This measure can be considered a very reasonable one to be adopted by experts who do not want to take risks in their evaluations and possible treatments, because it is quite conservative, in the sense that the infectivity degree of a group is represented by the individual with the highest degree of infectivity.
Returning to the calculation of $FEV[\gamma_0 R_0]$, observe that $\beta(v)/\gamma(v)$ is non-decreasing with $v$. Therefore, the set $X = \{v : \gamma_0 R_0(v) \ge \alpha\}$ is an interval of the form $[v', v_{\max}]$, where $v'$ is the solution to the equation
$$\gamma_0\, \frac{\beta(v')}{\gamma(v')} = \alpha.$$
Then,
$$H(\alpha) = \mu([v', v_{\max}]) = \sup_{v \in [v', v_{\max}]} \rho(v).$$
Observe first that $H(0) = 1$ and $H(1) = \rho(v_{\max})$. On the other hand, if $\alpha > 0$, since $\beta(v)/\gamma(v)$ is increasing we have $H(\alpha) = \sup_{v \ge v'} \rho(v)$.
In order to exemplify the evaluation of $FEV[\gamma_0 R_0]$, we will assume that the viral charge $V$ of a group of individuals is a linguistic variable that assumes the classifications: weak ($V_-$), strong ($V_+$) or medium ($\bar V$). Each classification is a fuzzy number based on the values $v_{\min}$, $v_M$ and $v_{\max}$ that appear in the definition of $\beta$ (see Figure 2).
• Case a) Weak viral charge ($V_-$), which is characterized by $\bar v + \delta < v_{\min}$. Since $\bar v + \delta < v'$, then $H(\alpha) = 0$ for all $\alpha > 0$, and
$$FEV[\gamma_0 R_0] = 0, \qquad \text{so that } R_0^F < 1,$$
indicating that the disease will be extinct in the studied group.
• Case b) Strong viral charge ($V_+$), which is characterized by $\bar v - \delta > v_M$. In this case $\beta(v) = 1$ on the whole support of $\rho$; therefore, as $1/\gamma(v) > 1$, it follows that $R_0^F > 1$, which indicates that the disease will become established in the studied group.
• Case c) Medium viral charge ($\bar V$), which is characterized by $\bar v - \delta > v_{\min}$ and $\bar v + \delta < v_M$.
Considering the viral charge present in the population, from the equation system (1) we have the following equilibrium points:
$$P_1 = (1, 0) \qquad \text{and} \qquad P_2(v) = \left(\frac{\gamma(v)}{\beta(v)},\; 1 - \frac{\gamma(v)}{\beta(v)}\right).$$
A stability analysis of the classical model (1) shows that $P_1$ is unstable while $P_2$ (with $\beta > \gamma$) is asymptotically stable. Observe that the $v^*$ for which $\beta(v^*) = \gamma(v^*)$ is the bifurcation value.
Since the viral charge has different possibilities of occurrence, we can say the same for the equilibrium points $P_2(v)$. Then, we calculate the average of these points ($\bar P_2$) and compare it with the deterministic case $P_2(\bar v)$ (see Figure 3).
Consequently, $\bar I_2 < I_2(\bar v)$; that is, the proportion of infected individuals, considering all the viral charges, is less than the proportion of infected individuals when only the median viral charge is considered (assuming population homogeneity).
For the classification of viral charge used earlier, it is possible to obtain precise values of $FEV[\gamma_0 R_0]$:
• for the strong viral charge, using the expression for $v'$ in this case, $FEV[\gamma_0 R_0]$ is the only positive solution of a second-degree equation;
• for the medium viral charge, we have $v_{\min} < v' < v_M$; then $FEV[\gamma_0 R_0]$ is again the only positive solution of a second-degree equation;
that is, there exists only one viral charge $v$ such that $R_0$ (classical) and $R_0^F$ (fuzzy) coincide. In addition, the average number of secondary cases ($R_0^F$) is higher than the number of secondary cases ($R_0(\bar v)$) due to the median viral charge.
To reinforce the use of system (1) with the parameter $v$, in the sense of describing the dynamics of the disease in the whole population, we will analyze the control of the evolution of the disease in the population using $R_0(v) = R_0^F$:
• for the case in which the viral charge is weak, we have $v \le \bar v + \delta < v_{\min}$. Then $R_0(v) = 0$ and the disease will not be established in the population.
• for the case in which the viral charge is strong, we have $v \ge \bar v - \delta > v_M$. Thus, $R_0(v) = \dfrac{1}{\gamma(v)} > 1$, indicating that the disease will become established.
• for the case in which the viral charge is medium, we have:
  - if $v^* > v$ then $R_0(v) = \dfrac{\beta(v)}{\gamma(v)} < \dfrac{\beta(v^*)}{\gamma(v^*)} = 1$, indicating the disease will be extinct;
  - if $v^* < v$ then $R_0(v) = \dfrac{\beta(v)}{\gamma(v)} > \dfrac{\beta(v^*)}{\gamma(v^*)} = 1$, indicating the disease will be present in the population, where $v^*$ is the bifurcation value, as seen in Section 5.
Now, the adoption of $R_0^F$ agrees with the policy of disease control used by public health experts:
1. Since $R_0^F$ is obtained as a positive solution of a second-degree equation, it is not difficult to see that its reduction can be obtained by increasing $v_{\min}$ (consequently increasing $v^*$), or in other words, by decreasing the susceptibility of the group. This can be accomplished by improving the quality of life of the studied population.
2. Since $v \in (\bar v - \delta, \bar v + \delta)$, $R_0^F$ can be diminished by decreasing the median viral charge, for example through the use of medicine, or by isolation of infected individuals (decreasing $\delta$).
In this study we proposed a possible way to use Fuzzy Theory to model the uncertainty present in Epidemiology. Our main conclusion is that, if the uncertainty is excluded before modelling (e.g. in a deterministic model), important parameters for the management and control of the phenomenon are not well evaluated. Moreover, it can occur that each such parameter is an underestimate of the real one, as was seen above: $R_0(\bar v) < R_0^F$.
Acknowledgments: This research was supported by CNPq and FUNDUNESP.
References
[1] L.C. Barros, R.C. Bassanezi and M.B.F. Leite, The epidemiological models SI with fuzzy parameter of transmission, (submitted).
[2] L.C. Barros, R.C. Bassanezi, R.Z.G. Oliveira and M.B.F. Leite, A disease evolution model with uncertain parameters. Proceedings of the Joint 9th IFSA World Congress and 20th NAFIPS International Conference, Vancouver, (2001).
[3] L. Edelstein-Keshet, Mathematical Models in Biology. Birkhäuser Mathematics Series, McGraw-Hill Inc., New York, (1988).
[4] M.B.F. Leite, R.C. Bassanezi and H.M. Yang, The basic reproduction ratio for a model of directly transmitted infections considering the virus charge and the immunological response. IMA Journal of Mathematics Applied in Medicine and Biology 17 (2000) 15-31.
[5] H.T. Nguyen and E.A. Walker, A First Course in Fuzzy Logic, CRC Press, Inc. (1997).
[6] M. Sugeno, Theory of Fuzzy Integrals and Its Applications. Doctoral Thesis, Tokyo Institute of Technology (1974).
[7] K. Sadegh-Zadeh, Fundamentals of clinical methodology: 3. Nosology. Artificial Intelligence in Medicine 17 (1999) 87-108.
Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract
A data warehouse can be seen as a large database to which we can apply several techniques from data mining and knowledge discovery in databases (KDD). Data warehouses store large amounts of data, which may have features related to incompleteness and inconsistency. To provide a foundation for data warehouses, paraconsistent logics are useful. In this paper, we sketch an outline of data warehouses using existing systems of paraconsistent logics. We compare these systems in several respects and claim that most systems are suitable for KDD. We also argue that some extensions of paraconsistent logics can be developed to show the mechanism of data mining.
Keywords
Paraconsistent logics, data warehouse, database, data mining, knowledge discovery in databases (KDD).
1 Introduction
Databases have become one of the necessary software systems in our daily life. As a consequence of the development of memory technology, it is possible to store a large amount of data in a database. Database theory can serve as a basis for efficient storage and retrieval of data from large databases, in particular relational databases. Recently, large databases have attracted attention under the name of data warehouses, since the proposal of Inmon (1992). We now expect to see
S. Akama and J.M. Abe / Paraconsistent Logics
2 Data Warehouses
By a data warehouse, we mean a large database that serves as a basis for extracting "hidden" information from data. Such information is made available by methods of data mining, and many techniques of data mining are gathered and discussed in the area of knowledge discovery in databases (KDD). However, the idea of data warehouses is not new and was rediscovered in the computer science literature. One such topic is undoubtedly found in the work on decision support systems (DSS) and management information systems (MIS); see Date (2000). Roughly speaking, these systems were sketched as intelligent systems to help our decision making, and for this purpose we need some kind of "database".
Below we use the term "data warehouse" to mean a class of such intelligent
systems. In this regard a data warehouse is a database with the following
mechanisms:
• data input
• information retrieval
• data manipulation
• modelling
• analysis
• reporting
Data input is the process of constructing the database. Information retrieval extracts the relevant information from the data; for instance, in relational databases it can be done with SQL. Data manipulation transforms the data into a form suitable for analysis. Modelling specifies a model of decision support. Based on the given model, analysis gives the data an interpretation by some method, mainly from statistics. After the analysis, reporting outputs the result in a document. One can see that recent database systems also have similar mechanisms. In this sense, the point is that data warehouses should be interpreted as "intelligent" databases from which the relevant information can be extracted.
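To make the retrieval step concrete, here is a toy example using Python's built-in sqlite3 module; the table and the query are hypothetical, chosen only to show SQL doing the extraction:

```python
import sqlite3

# An in-memory table standing in for a (tiny) data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 50.0)],
)

# Information retrieval: SQL extracts the relevant summary from the data.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('north', 170.0), ('south', 80.0)]
```

The same pattern (load, then query) is what the data input and information retrieval mechanisms above describe, only at a much larger scale.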
Investigators in the area agree that data warehouses should have the following features. First, they are subject-oriented, in that data depend on a certain subject. Second, they are integrated, in that data in data warehouses should also satisfy integrity constraints. Third, they are time-variant, in that the input time of data is crucial. Fourth, they are non-volatile, in that no data is updated.
These features lead us to consider a foundation of data warehouses different from that of databases. Here are some important points. A database is usually consistent in the sense of some data model. On the other hand, data warehouses may store large amounts of data which are mutually inconsistent, following the above-mentioned features. By non-volatility, new data may contradict old data. But, by integrity, we need constraints to hold even in inconsistent data warehouses. In addition, subject-orientedness means that data belong to some domain of discourse. And we should integrate the notion of time, due to time-variance.
If we consider a logic-based formalization of data warehouses, it must satisfy these features. But, in this paper, we start with a foundation satisfying integrity and non-volatility. The remaining features will be reconsidered in possible extensions of the foundation suggested in section 5.
3 Paraconsistent Logics
If we model a database in logic, several notions like data model, integrity constraint and query should be logically formalized. This is the starting point of logic databases; see Gallaire and Minker (1976). One might pursue a foundation of data warehouses along this line. However, this line of extension is problematic. The reason is that the "logic" used in logic databases is classical logic. The difficulty lies in classical logic's inability to handle inconsistency. Since classical logic is consistent, in the sense that A and ¬A cannot both hold, it cannot satisfy non-volatility. If we have a contradiction in classical logic, we can deduce arbitrary conclusions. This means that we cannot extract relevant information from an inconsistent database. Therefore, we can conclude that a logic database based on classical logic has no extensions relevant to data warehouses.
Fortunately, there are logical systems capable of formalizing reasoning with inconsistency. In the literature, such systems are known as paraconsistent logics.
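As a sketch of how such systems avoid explosion, consider Belnap's four-valued logic (listed in the references below); the encoding of truth values as sets of classical values is one standard presentation, used here only for illustration:

```python
# Belnap's four-valued logic (FDE): a truth value is the SET of classical
# values a database has been told about a statement.
T = frozenset({True})          # told only true
F = frozenset({False})         # told only false
B = frozenset({True, False})   # told both (a contradiction)
N = frozenset()                # told neither

def neg(a):
    return frozenset(not x for x in a)

def conj(a, b):
    return frozenset(x and y for x in a for y in b)

def designated(a):
    return True in a           # "at least told true"

p, q = B, F                    # p is contradictory, q an unrelated falsehood
assert designated(conj(p, neg(p)))   # p AND not-p still holds (it is B)
assert not designated(q)             # ...yet q is not thereby entailed
print("contradiction tolerated without explosion")
```

This is exactly the property a paraconsistent logic database needs: one contradictory record stays local instead of trivializing every query.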
In paraconsistent logics, the classical principle that a contradiction entails everything fails. The integrity constraint is nothing but the law of non-contradiction in the standard logical form. It is well known that there are cases in which the Clark completion gives rise to inconsistency in (classical) logic databases. In such
cases, classical logic has no semantics. On the other hand, in paraconsistent logic databases, inconsistency is in general local. Namely, one contradiction is harmless, in the sense that the other data can still be consistently captured. It is also possible to deduce some conclusions from a paraconsistent logic database in such cases.
But we still need to resolve inconsistency in paraconsistent logic, and this perhaps requires a non-logical mechanism. The paraconsistent logic database thus guarantees non-volatility, in the sense that we need not update data. It also has integrity, since the integrity constraint in paraconsistent logic plays that role. In addition, we can formalize a query language for paraconsistent logic databases. The
query in this context is a data mining process, which can be expressed as logical inference. Therefore, data mining can be logically expressed. In fact, we know of some implementations of paraconsistent logic programming languages, like Paralog.
There are several extensions of logic databases. For instance, disjunctive logic databases allow disjunction in the head of a rule, i.e. rules written as non-Horn clauses. By this extension, we can specify descriptions such as null values, related to incompleteness, in disjunctive logic databases. It would be possible to extend the present framework with disjunction for paraconsistent logic databases.
In this way, if we employ some system of paraconsistent logic as the underlying logic of a database, the database becomes a data warehouse. It should also be noted that if inconsistency does not arise in a paraconsistent logic database, it behaves like a (classical) logic database. In this regard, paraconsistent logic databases are an extension of classical logic databases. But this is not the end of the story.
logic can also be applied to the foundations of databases. We can reformulate the logic of belief revision in a paraconsistent setting.
6 Conclusions
We proposed a foundation of data warehouses in paraconsistent logics. The attempt may be original and the first of its kind in print. The basic idea is that data warehouses can be seen as paraconsistent logic databases, and that the restriction to (classical) logic databases induces several stages related to data mining. This implies that paraconsistent logics can be regarded as one of the sound descriptions of data warehouses incorporating several notions from data mining.
Future work includes the formulation of the present ideas in the context of relational databases. We need to develop a version of paraconsistent relational algebra in the style of Codd (1972). If this is successful, the paraconsistent versions of both the domain and tuple calculi could be similarly reformulated.
Another topic is to design a language for data warehouses capable of data mining. One of the candidates is obviously the annotated logics of da Costa et al. (1991). The reason is that annotated logics have a smooth basis and a computational proof theory.
It would also be interesting to fuse the given foundation with non-logical data mining techniques from statistics, learning, and so on. This line of work would clarify the usefulness of known techniques in logic-based approaches to data warehouses. Because the present paper addresses the conceptual basis of data warehouses from a logical point of view, we should elaborate on the ideas technically. We hope to report on these topics in forthcoming papers.
References
Adriaans, P. and Zantinge, D.: Data Mining, Addison-Wesley, Reading, Mass., 1996.
Anderson, A. and Belnap, N.: Entailment, Vol. I, Princeton University Press, Princeton, 1975.
Batens, D., Mortensen, C., Priest, G. and Van Bendegem, J.-P. (eds.): Frontiers of Paraconsistent Logic, Research Studies Press, Baldock, 2000.
Belnap, N.: A useful four-valued logic, in J.M. Dunn and G. Epstein (eds.), Modern Uses of Multiple-Valued Logic, 8-37, Reidel, Dordrecht, 1977.
Codd, E.: A relational model of data for large shared data banks, Communications of the ACM 13, 377-387, 1970.
da Costa, N.: On the theory of inconsistent formal systems, Notre Dame Journal of Formal Logic 15, 497-510, 1974.
da Costa, N., Subrahmanian, V. and Vago, C.: The paraconsistent logic PT, Zeitschrift für mathematische Logik und Grundlagen der Mathematik 37, 139-148, 1991.
1 Department of Information Engineering, Kushiro National College of Technology, Otanoshike Nishi 2-32-1, Kushiro 084-0916, Hokkaido, Japan. Tel: +81-154-57-7351, E-mail: [email protected]
2 Division of Systems and Information Engineering, Graduate School of Engineering, Hokkaido University, Kita 13 Nishi 8, Kita-ku, Sapporo 060-8628, Japan. Tel: +81-11-706-6852, E-mail: {yasu, mine}@main.eng.hokudai.ac.jp
Abstract Piecewise linear classifiers are utilized in order to visualize class structures in high-dimensional feature space. The feature space is split into many polyhedral regions by the component hyperplanes of the classifier, and the class label of each polyhedron is determined by majority rule on the number of samples. The intra- and inter-class relationships of the polyhedra are shown as a graph, drawing each polyhedron as a node and connecting the nodes by edges according to the adjacency of the regions. Compared to another graph representation method, the proposed method is superior in showing the inseparability, or the closeness, between the classes.
1 Introduction
overlapping cluster, onto two-dimensional space as a node instead of the individual samples, and connect the nodes by edges that represent the relationships between the subclasses. By this method, they succeeded in visualizing the spatial relationships among the class regions without a large loss of discriminant information. A comparison between the subclass-based graph-representation method and conventional mapping techniques was discussed in reference [6].
In this paper, we propose another way to visualize spatial class structures, in which the clusters of training samples to be represented as nodes are formed by piecewise linear classifiers. An example of such a classifier is shown in Fig. 1. In this method, the training samples are split into nonoverlapping clusters by the component hyperplanes. Each cluster consists of samples from almost the same class. In this study, we project such clusters in a high-dimensional space onto two-dimensional space in order to visualize the spatial structures of the class regions.
Figure 1: Example of the piecewise linear classifier. The symbols represent training samples and the polygonal solid line represents the classification boundary. The dotted lines represent the component hyperplanes that separate the training samples into clusters.
Such a graph representation had already been made by Sklansky and Michelotti [7]. However, in that paper the nodes were drawn manually, because the authors' main purpose was simplification of the classifier, not visualization.
Therefore, in this paper, we try to determine the locations of the nodes by the two principal components obtained by KL expansion, in order to visualize the graph automatically. In addition, we take the inseparability, or the closeness, of the classes into consideration by representing it as the thickness of the edges.
In this study, we employed Sklansky and Michelotti's method [7] for the construction of piecewise linear classifiers. This method was originally intended for two-class problems, but was later extended to multiclass problems [8]. The algorithm can be described briefly as follows:
1. Form provisional clusters of the training samples for each class. Any conventional clustering method can be used for this purpose; we applied the k-means method in this paper.
H. Tenmoto et al. / Visualization of Class Structures
Keep the resultant assignments of the individual samples to the provisional clusters, and calculate prototypes as the mean vectors of the local samples in the clusters. Let the prototype sets be $M_1$ and $M_2$.
Note that, for fine visualization, the number of prototypes for each class should be appropriately selected according to the complexity of the class structure, which is given by a priori information or the MDL criterion [9].
2. Find the close-opposed pair set $\Pi$ defined as $L_{12} \cap L_{21}$. Here, $L_{12}$ is the set of prototype pairs $\{(u, v) \mid u \in M_1, v \in M_2\}$ in which $v$ is the nearest prototype to $u$, and $L_{21}$ is defined in the reverse way. This concept can be extended to $\Pi^{(k)}$ by expanding the nearness up to the $k$th nearest. In this study, for the value of $k$, we adopted the maximum number of prototypes among the two classes.
3. Separate all of the pairs in $\Pi^{(k)}$ by hyperplanes. The hyperplanes have to be placed so as to classify the local training samples correctly. Here, the local samples are the samples associated with the prototype pairs to be cut by the hyperplane.
Any nonparametric linear classifier can be employed for this purpose. In the original paper [7], a probabilistic descending method called the "window training procedure" [10, 11] was used. However, such a probabilistic algorithm produces different results on each run, and such behavior is not preferable for visualization. Therefore, in this paper, we adopted the deterministic "minimum squared error" method [12] for the local training.
4. Determine the class label of each region split by the hyperplanes. Majority rule is applied to the local training samples in each region.
As a result, a piecewise linear decision boundary is formed in the feature space, as in Fig. 1. Such a boundary may have an appropriate complexity as a classifier.
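Step 2 above (finding close-opposed pairs) amounts to mutual-nearest matching between the two prototype sets; the helper below is an illustrative sketch under that reading, not the authors' implementation:

```python
import numpy as np

def close_opposed_pairs(M1, M2):
    """Mutual nearest prototype pairs between prototype sets M1 and M2:
    the set L12 (each u in M1 paired with its nearest v in M2)
    intersected with L21 (defined the other way around)."""
    # Pairwise distances: d[i, j] = ||M1[i] - M2[j]||
    d = np.linalg.norm(M1[:, None, :] - M2[None, :, :], axis=2)
    L12 = {(i, int(d[i].argmin())) for i in range(len(M1))}
    L21 = {(int(d[:, j].argmin()), j) for j in range(len(M2))}
    return L12 & L21
```

Only these mutually nearest pairs are then cut by locally trained hyperplanes, which keeps the number of hyperplanes small.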
From the resultant piecewise linear classifier, we can obtain a graph representation $G = (V, E)$ of the high-dimensional class structures as follows:
1. For each region surrounded by the component hyperplanes, calculate the mean vector of the local samples. Let the mean vectors be the nodes $v \in V$. Here, we calculate the projections of the mean vectors onto two-dimensional space, which is spanned by the two principal basis vectors obtained by KL expansion taken over all of the training samples. The sizes of the nodes are decided proportionally to the standard deviation of the local samples in the corresponding regions.
Here, we introduce a threshold parameter $\theta$ in order to suppress the appearance of numerous very small nodes. Nodes that do not have more than $\theta$ samples are removed from the resultant graph.
2. Connect the nodes by edges $e \in E$ if the two nodes belong to the same class and the distance between the corresponding regions is 1. Here, the distance is calculated by the city block distance, i.e., adjacent regions sharing the same hyperplane have distance 1.
3. Connect the nodes between different classes if the distance between the corresponding regions is 1. Here, the thickness of the edge is determined proportionally to the following entropy:
$$e = -\frac{N^+}{N^+ + N^-}\,\log\frac{N^+}{N^+ + N^-} \;-\; \frac{N^-}{N^+ + N^-}\,\log\frac{N^-}{N^+ + N^-}$$
Here, $N^+$ is the number of training samples belonging to the major class, and $N^-$ is the number of training samples belonging to the minor class (Fig. 2).
Figure 2: Example of counting samples for the entropy calculation. Here, $N^+ = 19$, $N^- = 3$ for region $R_1$, and $N^+ = 18$, $N^- = 4$ for region $R_2$.
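With the counts from Figure 2 the edge entropy can be computed directly; the natural logarithm is an assumption here, since the base is not fixed in the text (it only rescales all thicknesses uniformly):

```python
import math

def edge_entropy(n_major, n_minor):
    """Two-class entropy of the local samples on either side of an
    inter-class edge; higher values mean more mixing (closeness)."""
    n = n_major + n_minor
    e = 0.0
    for k in (n_major, n_minor):
        p = k / n
        if p > 0.0:                 # convention: 0 * log 0 = 0
            e -= p * math.log(p)
    return e

print(edge_entropy(19, 3))   # region R1 of the example
print(edge_entropy(18, 4))   # region R2: slightly more mixed
```

A region containing samples from only one class gets entropy 0, so a pure boundary produces a thin (or absent) inter-class edge.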
Figure 3: Graph representation for the example data. The white circles represent the clusters of class 1, and
the gray circles represent the clusters of class 2. The number of local samples in each cluster is shown in the
corresponding node. The inter-class edges with the values of entropies represent the closeness between the
classes.
Figure 4: Example of subclass method. The symbols and rectangles represent training samples and subclasses
that exclude the negative samples, respectively.
Mori et al. proposed a new method [4] for the visualization of class structures on the basis of this subclass method. They represent the subclasses as a graph $G = (V, E)$, locating the nodes in one of two ways:
• Locate the nodes by the first two principal components of the KL expansion of the mean vectors.
• Locate the nodes on the vertices of a regular polygon.
In order to extract only the important information, noninformative nodes and edges are removed by two thresholds $\theta_1$ and $\theta_2$. Thus, $v_s$ and $e_{st}$ are drawn only if
$$\frac{d(s)}{\max_t d(t)} \ge \theta_1 \qquad \text{and} \qquad \frac{o(s,t)}{\max_{u,v} o(u,v)} \ge \theta_2.$$
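Both graph methods place their nodes via a KL (principal component) expansion. A minimal numpy sketch of projecting cluster means onto the first two principal axes, written for illustration only:

```python
import numpy as np

def kl_project_means(X, labels):
    """Project each cluster's mean vector onto the two principal axes
    (KL expansion) estimated from ALL training samples X."""
    center = X.mean(axis=0)
    # Right singular vectors of the centered data = principal axes.
    _, _, Vt = np.linalg.svd(X - center, full_matrices=False)
    basis = Vt[:2].T                                  # shape (d, 2)
    means = np.array([X[labels == c].mean(axis=0)
                      for c in np.unique(labels)])
    return (means - center) @ basis
```

For already two-dimensional data the projection is just a rotation, so pairwise distances between cluster means are preserved exactly; in higher dimensions it keeps the directions of largest variance.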
4 Experimental Comparisons
We carried out experimental comparisons on two artificial datasets. First, we tried to visualize the Tori dataset shown in Fig. 5(a). This dataset consists of 1000 samples (500 per class) distributed in a three-dimensional artificial feature space. There are two ring structures of different classes, and each class is completely separated from the other.
The results are shown in Fig. 6(a), Fig. 6(b) and Fig. 6(c). For this dataset, we used ten prototypes for each class and 5 for the value of the threshold $\theta$ in the proposed method. From the result of the proposed method, we can observe that both class regions form ring shapes, and that the two classes encounter each other around the center of the dataset. On the other hand, the two results of the subclass-based methods convey the complete separability of the classes, while the ring structure could not be sufficiently represented.
Next, we performed an experiment on a two-class problem in a ten-dimensional artificial feature space, in which the two classes are the inside and outside regions separated by a hypersphere
with a volume of 0.5, surrounded by a hypercube with sides of length 1 (Fig. 5(b)). For this dataset, 1000 samples (500 per class) uniformly distributed in the two regions were used. Each class is completely separated from the other class in this dataset as well.
The results are shown in Fig. 6(d), Fig. 6(e) and Fig. 6(f). For this dataset, we used five prototypes for the inner class, twenty prototypes for the outer class and 13 for the value of the threshold $\theta$. We can observe from both methods that the inner class is surrounded by the outer class and that the inner class is almost unimodal.
For this dataset, we had to use the higher threshold value in order to visualize the inter-class structure clearly. If we take a lower value for the threshold $\theta$, a huge number of tiny nodes appears all over the graph, making it unreadable. This is caused by the numerous hyperplanes needed to separate such a "complex" dataset in high-dimensional space.
5 Concluding Remarks
We proposed an alternative way to represent spatial class structures as graphs via piecewise linear classifiers. In this method, the training samples are split into nonoverlapping clusters by the component hyperplanes, and the clusters in a high-dimensional space are projected onto two-dimensional space instead of projecting the individual samples.
We also tried to determine the locations of the vertices automatically, by the two principal components obtained by KL expansion. In addition, we took the closeness information between classes into consideration by representing it as the thickness of the edges.
From the experimental comparison between the proposed method and the subclass-based graph method, we obtained the following conclusions:
• The proposed method is superior in showing the inseparability, or the closeness, between classes, because the subclass method tends by nature to avoid overlapping of the different classes. The proposed method produces inter-class edges even for completely separated datasets, due to the piecewise linear approximation of class boundaries.
• The proposed method clearly depends on the construction method of the piecewise linear classifier. However, construction methods for optimal piecewise linear classifiers are still being developed. Piecewise linear classifiers with an optimal number of hyperplanes [13] should be tested for the visualization.
Figure 6: (a) Proposed method for the Tori dataset; (b) subclass-based method (KL) for the Tori dataset; (c) subclass-based method (Polygon) for the Tori dataset; (d) proposed method for the Sphere dataset; (e) subclass-based method (KL) for the Sphere dataset; (f) subclass-based method (Polygon) for the Sphere dataset.
• The proposed method may cause a scattering problem, i.e., very small regions may be produced in large numbers by the component hyperplanes. In order to overcome this difficulty, we need to merge such fragmented small regions and represent them as virtual large nodes.
There are various visualization methods for high-dimensional supervised data, and each has its own properties. Therefore, users have to observe and unify the information obtained by several methods for the given problem.
Acknowledgment
This work was partly supported by Grant-in-Aid for Scientific Research (B), No. 60205101,
from Japan Society for the Promotion of Science.
References
[1] K. Fukunaga, Introduction to Statistical Pattern Recognition, Academic Press (1990) 225-257.
[2] M. Aladjem, Parametric and Nonparametric Linear Mappings of Multidimensional Data, Pattern Recognition 24 6 (1991) 543-553.
[3] M. Aladjem, Linear Discriminant Analysis for Two Classes via Removal of Classification Structure, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 2 (1997) 187-192.
[4] Y. Mori, M. Kudo, J. Toyama and M. Shimbo, Visualization of Classes Using a Graph, Proceedings of the 14th International Conference on Pattern Recognition (ICPR '98) Vol. II (1998) 1724-1727.
[5] M. Kudo and M. Shimbo, Optimal Subclasses with Dichotomous Variables for Feature Selection and Discrimination, IEEE Transactions on Systems, Man and Cybernetics 19 (1989) 1194-1199.
[6] Y. Mori, M. Kudo, J. Toyama and M. Shimbo, Analysis of Pattern Data Using Graph Representation, Proceedings of the International ICSC Congress on Computational Intelligence: Methods and Applications 2001 (CIMA 2001) (2001) CD-ROM.
[7] J. Sklansky and L. Michelotti, Locally Trained Piecewise Linear Classifiers, IEEE Transactions on Pattern Analysis and Machine Intelligence 2 2 (1980) 101-111.
[8] Y. Park and J. Sklansky, Automated Design of Multiple-Class Piecewise Linear Classifiers, Journal of Classification 6 (1989) 195-222.
[9] J. Rissanen, A Universal Prior for Integers and Estimation by Minimum Description Length, Annals of Statistics 11 (1983) 416-431.
[10] J. Sklansky and G. N. Wassel, Pattern Classifiers and Trainable Machines, Springer-Verlag, New York (1981).
[11] L. Bobrowski and J. Sklansky, Linear Classifiers by Window Training, IEEE Transactions on Systems, Man and Cybernetics 25 1 (1995) 1-9.
[12] R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification, Second Edition, John Wiley & Sons, Inc. (2001) 236-238.
[13] H. Tenmoto, M. Kudo and M. Shimbo, Piecewise Linear Classifiers with an Appropriate Number of Hyperplanes, Pattern Recognition 31 11 (1998) 1627-1634.
Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract. In pattern recognition, knowledge of the structure of pattern data can help us to know the sufficiency of features and to design classifiers. We have proposed a graphical visualization method in which the structure and separability of classes in the original feature space are almost correctly preserved. This method is especially effective for knowing which groups of classes are close and which groups are not. In this paper, we propose a method to group classes on the basis of this graphical analysis. Such a grouping is exploited to design a decision tree in which a sample is classified into groups of classes at each node with a different feature subset, is further divided into smaller groups, and finally reaches one of the leaves consisting of single classes. The main characteristic of this method is the use of different feature subsets in different nodes. This approach is most effective for solving multi-class problems. An experiment with 30 characters (30 classes) was conducted to demonstrate the effectiveness of the proposed method.
1 Introduction
Exploratory data analysis is a very important technique in pattern recognition. The main goal of pattern recognition is to correctly determine the class to which a given data point belongs, that is, to design a good classifier. A good classifier naturally requires a good feature space. When adequate features are gathered, it is expected that all data belonging to the same class are close to each other and, as a whole, are far from the other classes in the feature space. However, features are usually collected empirically and, thus, it is difficult to collect only informative features. Therefore, it is important to analyze the structure of classes and to investigate the state of scattering of data points in a given feature space. However, since the dimensionality of the feature space is generally very high, we cannot perceive the state of these data directly.
Mapping techniques onto a low dimensional space are, therefore, introduced in order to help
our understanding. A comparative study by the authors showed that a graphical visualiza-
tion of data subsets is most effective to capture faithfully the structure of data of a class and
the separability of classes [1, 2]. In addition, this method allows us to see which groups of
classes are close to each other and which groups are not. Therefore, we can easily group
classes (clustering of classes).
On the other hand, feature selection is known to be very effective to improve the per-
formance of classifiers designed from a finite number of samples. So far many studies have
been devoted to developing the methodology (for example, [3, 4]).
Y. Mori and M. Kudo / Design of Tree Classifiers 113
By removing features that are useless for classification, we can raise the estimation precision of the parameters of parametric classifiers and avoid the misuse of non-informative features even in non-parametric classifiers. As
a result, it is expected that the generalization error is decreased by feature selection. Usually a feature subset is chosen commonly for all classes. However, in multi-class problems, it is natural to think that the most informative feature subsets differ depending on the classes to be classified. For example, in a pair of similar characters such that a short stroke at a certain location exists in one character and not in the other, the existence of the stroke is the key for distinguishing these two characters, but such a key is useless for distinguishing a different pair of characters. So, the authors have been discussing how to choose different feature subsets for different groups of classes [5]. Choosing different feature subsets for different groups of classes naturally results in a decision tree classifier, in which, at the root node, the set of all classes is divided into several groups of classes by the feature subset that carries out this rough classification most effectively, and, in each child node, the classes are further divided into finer groups with a feature subset different from the one used in the root node. This is repeated until division produces single classes.
In such a decision tree with different feature subsets, the main difficulty lies in how to determine the structure of the tree. For example, at the root, when we want to divide C classes into two groups, there are 2^(C-1) - 1 possible candidate groupings. This is infeasible even for moderately large C. Therefore, in [5], we proposed a bottom-up way to construct a decision tree. However, this approach does not necessarily work well. Thus, we consider grouping classes on the basis of the above-mentioned graphical analysis [1]. In this case, a user can help the grouping by observing the graphical results. In what follows, we discuss the effectiveness of this approach.
2 Graphical Visualization
First, we describe how we can group classes in order to construct a decision tree. Once the
structure of the tree is determined, we carry out feature selection in each node of the tree.
The graph visualization method [1] is based on the subclass method [6, 7] which approx-
imates a true class region by a set of convex hulls spanned by subsets of training samples. In
this approach, we find quasi convex hulls, called subclasses, including only training samples
of a class maximally and excluding the samples of the other classes. An example is shown in
Fig. 1. We can know the state of the distributions of the data points from information on these
convex hulls, for example, from the volumes of convex hulls and the volume of the intersec-
tion between two convex hulls. Using this kind of information, we have proposed a method
for visualizing high-dimensional data [2]. Such subclasses can be thought of as a compressed
expression of the training samples. For visualization, the following information is extracted
from the constructed quasi convex hulls.
resolution parameter (for details, see [7]). The volume V(s) is calculated by

V(s) = prod_{l=1}^{D} |M_l(s) - m_l(s)|  in [0, 1],

where m_l(s) and M_l(s) are the minimum and the maximum ends, respectively, of s in the l-th dimension of the D dimensions.
3. o(s, t): the ratio of the volume of the convex region intersected by two subclasses, s and t, to the volume of the union of s and t, that is,

o(s, t) = V(s ∩ t) / V(s ∪ t).

4. n(s): the ratio of the number of samples included in s to the total number of samples of the class.
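As an illustration, the volume and overlap measures above can be sketched by approximating each subclass with its axis-aligned bounding box. This is a simplification of the quasi convex hulls: the box approximation, the [0, 1] feature normalization, and the inclusion-exclusion estimate of V(s ∪ t) are assumptions of this sketch, not the paper's construction.

```python
# Sketch: volume and overlap measures for subclasses approximated by
# axis-aligned bounding boxes (a simplification of the quasi convex hulls).
# Features are assumed to be normalized into [0, 1] in each dimension.

def bounding_box(samples):
    """Per-dimension (min, max) ends of a subclass s."""
    dims = len(samples[0])
    return [(min(x[l] for x in samples), max(x[l] for x in samples))
            for l in range(dims)]

def volume(box):
    """V(s) = prod_l |M_l(s) - m_l(s)|, in [0, 1] for normalized features."""
    v = 1.0
    for m, M in box:
        v *= abs(M - m)
    return v

def overlap(box_s, box_t):
    """o(s, t) = V(s ∩ t) / V(s ∪ t); the union volume is approximated by
    inclusion-exclusion, since a union of two boxes is not a box."""
    inter = []
    for (ms, Ms), (mt, Mt) in zip(box_s, box_t):
        lo, hi = max(ms, mt), min(Ms, Mt)
        if lo >= hi:
            return 0.0          # disjoint in some dimension
        inter.append((lo, hi))
    v_inter = volume(inter)
    v_union = volume(box_s) + volume(box_t) - v_inter
    return v_inter / v_union

s = [(0.1, 0.1), (0.5, 0.5)]
t = [(0.3, 0.3), (0.9, 0.9)]
print(overlap(bounding_box(s), bounding_box(t)))
```

With these toy points the two boxes share a 0.2 x 0.2 corner, giving a small but nonzero overlap ratio.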
Figure 2: Graph representation in PCD when there are two subclasses s and t for class 1 and subclass u for class 2.
Figure 3: Graph representation in PD when there are 3, 5 and 4 subclasses for classes 1, 2 and 3, respectively.
3 Construction of Trees
Now, we can determine the structure of a decision tree in the following manner.
Step 1: With θ_l = 1.0 (no link appears), decreasing the value of θ_v gradually, step by step with 0.01, from θ_v = 1.0, find the maximum value of θ_v such that at least one vertex (subclass) of every class appears in the graph.
Step 2: Keeping the value of θ_v as it was, set the value of θ_l to zero. At this stage, some classes are connected to other classes by links as long as the subclasses of those classes overlap. Then, increase the value of θ_l gradually until a few groups of classes, hopefully two or three, appear due to the vanishing of links that correspond to a small amount of overlap. At this stage, we can find the first grouping, corresponding to the root of the tree.
Step 3: Let us consider each group (a child node of the root node) found in Step 2. Then, increasing the value of θ_l further, we find a finer grouping in which some classes are further separated. This process is applied to all groups (all children of the root).
Step 4: In each of the newly created child nodes, Step 3 is repeated until all classes are separated. At this stage, we reach one of the leaves consisting of single classes.
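The steps above amount to sweeping two thresholds over a weighted graph and reading off connected components. A minimal sketch with toy volumes and overlaps follows; the threshold names echo the text, while the numeric values are invented for illustration.

```python
# Sketch of the grouping procedure: subclass vertices carry volumes V(s),
# links carry overlaps o(s, t). theta_v filters vertices, theta_l filters
# links; groups are connected components of the filtered graph.

def components(vertices, edges):
    """Connected components of an undirected graph via union-find."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for v in vertices:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())

def grouping(volumes, overlaps, theta_v, theta_l):
    """Keep vertices with V(s) >= theta_v and links with o(s, t) > theta_l."""
    vs = {s for s, vol in volumes.items() if vol >= theta_v}
    es = [(s, t) for (s, t), o in overlaps.items()
          if o > theta_l and s in vs and t in vs]
    return components(vs, es)

# Toy example: three subclasses, one strong and one weak overlap.
volumes = {"A": 0.5, "B": 0.4, "C": 0.3}
overlaps = {("A", "B"): 0.2, ("B", "C"): 0.01}

# Step 2: raising theta_l from 0 removes the weak link first,
# so the first grouping is {A, B} versus {C}.
print(grouping(volumes, overlaps, theta_v=0.1, theta_l=0.05))
```

Raising theta_l further (Step 3) removes the remaining link as well and splits every class into its own group.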
Once a tree is constructed, the next task is to find the best feature subset in each node of the tree. For this purpose, we used an approach based on the structural indices of categories [8]. This method works in time linear in the number of features and finds a feature subset that does not depend on a specific classifier. It is therefore appropriate for problems with large feature sets. For the details, see [8].
4 Character Recognition
We performed an experiment in order to evaluate the proposed method. The Japanese character dataset taken from the ETL9B database [9] was used. The number of samples is 200 per class, and we divided them into halves for training and testing. The number of features is 196. The dataset consists of 30 hand-written Japanese characters that form 15 pairs of similar characters (Table 1). Here, the term similar characters is used for a pair of characters that are similar in shape as one can see, not for a pair that are close in the feature space.
Table 1: 15 pairs of similar characters.
The features were measured as follows. First, we divide a character image of size 64x64 into 49 blocks of size 16x16 in which adjacent blocks overlap by a half. Next, in four directions at multiples of 45 degrees, we count the number of pixels along each direction. As a result, 196 (= 4 x 49) features are measured [10]. According to our previous experiments [2], this data was analyzed in several stages with different parameter sets (θ_l, θ_v) (Fig. 4). From Fig. 4, we can see:
1. All classes are represented by single vertices and, thus, are almost separable (Fig. 4(a-
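The block-based directional feature measurement described above can be sketched as follows, assuming a 64x64 binary image, a 7x7 grid of 16x16 blocks with 8-pixel (half-block) overlap, and a simple pixel-pair counting rule; the exact directional counting of [10] is not reproduced here.

```python
# Sketch of the directional feature measurement: a 64x64 binary image is
# split into 7x7 = 49 blocks of size 16x16, adjacent blocks overlapping by
# half a block (stride 8). Per block and per direction (0, 45, 90, 135 deg)
# we count, as a toy stand-in for the rule of [10], black-pixel pairs that
# continue in that direction. Total: 49 * 4 = 196 features.

def directional_features(img):                      # img: 64x64 lists of 0/1
    dirs = [(0, 1), (1, 1), (1, 0), (1, -1)]        # 0, 45, 90, 135 degrees
    feats = []
    for by in range(7):                             # block rows, stride 8
        for bx in range(7):                         # block cols, stride 8
            y0, x0 = by * 8, bx * 8
            for dy, dx in dirs:
                cnt = 0
                for y in range(y0, y0 + 16):
                    for x in range(x0, x0 + 16):
                        ny, nx = y + dy, x + dx
                        if (img[y][x] and 0 <= ny < 64 and 0 <= nx < 64
                                and img[ny][nx]):
                            cnt += 1
                feats.append(cnt)
    return feats

# A horizontal line should excite mainly the 0-degree direction.
img = [[1 if y == 32 else 0 for x in range(64)] for y in range(64)]
f = directional_features(img)
print(len(f))   # 196
```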
5 Conclusion
We proposed a method for constructing a decision tree. The structure is determined by a user with the support of a graphical visualization analysis of the data points. The main characteristic of the tree is that in each node a different feature subset is chosen so as to maximize the performance of the local classification at that node. The effectiveness of this approach was demonstrated on a Japanese character recognition problem.
Figure 4: Results for the Japanese character dataset. There are 15 pairs of similar characters. Left figures are PD's and right figures are PCD's. For readability, all links are drawn with the same width.
Figure 6: Decision tree of the Japanese character dataset. The number in each node shows the number of features.
Acknowledgment
This work was partly supported by Grant-in-Aid for Scientific Research (B), No. 60205101, Japan.
References
[1] Y. Mori, M. Kudo, J. Toyama and M. Shimbo, Comparison of Low-Dimensional Mapping Techniques Based on Discriminatory Information, Proceedings of the 2nd International ICSC Symposium on Advances in Intelligent Data Analysis (AIDA'2001), United Kingdom (2001) CD-ROM Paper No. 1724-166.
[2] Y. Mori and M. Kudo, Interactive Data Exploration Using Graph Representation, Proceedings of the 6th World Multiconference on Systems, Cybernetics and Informatics (SCI2002), Florida (2002), to appear.
[3] J. Kittler, Feature set search algorithms, in C. H. Chen (ed.), Pattern Recognition and Signal Processing, Sijthoff and Noordhoff, Alphen aan den Rijn, Netherlands (1978) 41-60.
[4] P. Pudil, J. Novovicova and J. Kittler, Floating Search Methods in Feature Selection, Pattern Recognition Letters 15 (1994) 1119-1125.
[5] K. Aoki and M. Kudo, Decision Tree Using Class-Dependent Feature Selection, Proceedings of the 4th International Workshop on Statistical Techniques in Pattern Recognition (SPR2002), Canada (2002), to appear.
[6] M. Kudo, S. Yanagi and M. Shimbo, Construction of Class Regions by a Randomized Algorithm: A Randomized Subclass Method, Pattern Recognition 29 (1996) 581-588.
[7] M. Kudo, Y. Torii, Y. Mori and M. Shimbo, Approximation of Class Regions by Quasi Convex Hulls, Pattern Recognition Letters 19 (1998) 777-786.
[8] M. Kudo and M. Shimbo, Feature Selection Based on the Structural Indices of Categories, Pattern Recognition 26 (1993) 891-901.
[9] T. Saito, H. Yamada and K. Yamamoto, On the Data Base ETL9 of Handprinted Characters in JIS Chinese Characters and Its Analysis, The Transactions of the Institute of Electronics and Communication Engineers of Japan J68-D (1985) 757-764 (in Japanese).
[10] N. Sun, M. Abe and Y. Nemoto, A Handwritten Character Recognition System by Using Improved Directional Element Feature and Subspace Method, The Transactions of the Institute of Electronics, Information and Communication Engineers J78-D-II (1995) 922-930 (in Japanese).
120 Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract Proposed is a clustering algorithm that takes into consideration the gap
among data points and the structure of data points. In a previous study, we observed
that two different clusters are separated by a gap and that a geometrical structure is
important to form a cluster. Based on this observation, we 1) made the gap appear by
noise points placed around data points and 2) extracted a geometrical structure as a
classification boundary in a problem of distinguishing the data class from the noise
class. However, this method is applicable only to two-dimensional data. In this study,
we present a method that eliminates this limitation. Some experimental results show
the effectiveness of this approach.
1 Introduction
Many clustering algorithms have been proposed [1, 2, 3, 4]. One group of algorithms, including k-means [1] and fuzzy c-means [2] algorithms, is suitable for finding spherical clusters,
while another group, such as the algorithms proposed in Refs. [3, 4], is suitable for handling
arched clusters. However, it is difficult to construct algorithms that meet both requirements.
Any algorithm produces a single result as long as its parameters are fixed. However, according to human clustering of 2-dimensional data, the clustering results do not seem to be unique. This is often observed in the case of small dataset sizes. A possible reason for this is that the way of perceiving gaps between data points differs depending on the observer, or that the structure found in the data points differs depending on the observer. In our previous study, we considered these two factors and proposed an algorithm that reflects them [5]. The proposed algorithm has the following merits: 1) it produces several clustering results, as human observers do, 2) the number of clusters is determined automatically, and 3) it can explain the change in results according to a change in the size of the background space on which the points are presented. However, the algorithm is limited to two-dimensional data. In this paper, we
present an extension of the algorithm that enables higher-dimensional data to be handled.
2 Algorithm
First, we show an outline of the algorithm with an example (Fig. 1). The algorithm consists
of the following three stages.
M. Kudo / Clustering Based on Gap and Structure 121
Figure 1: Outline of the algorithm. (a) Original data; (b) avoidance fields and noise points; (c) classification boundary; (d) clustering result (the number of clusters is 4).
Stage 1 Generate some noise points, avoiding the neighborhoods of the data points (Fig. 1(b)).
Stage 2 Solve a two-class problem in which one class consists of the data points and the other class of the noise points (Fig. 1(c)).
Stage 3 Find the connected regions in the class region of the data points (Fig. 1(d)).
In our previous study [5], such an avoidance field for generation of noise points was given
a physiological interpretation. There is a field called a foveal visual field that is the range that
we can see without eye movement when we focus our eyes on a single point on a sheet, and
the size of this field is considered to be 1°20', corresponding to the size of a fovea. It should be noted that this is an angle, and the absolute range (radius) is determined by the distance between the point and the observer. In other words, a foveal visual field around a point can be seen as the area occupied by that point. In addition, we assume that such a foveal visual field is affected by the density around the point. That is, the foveal visual field becomes smaller as the density increases (Fig. 2). This idea is based on a perception model in the 2-dimensional case. However, the distance between a point and an observer can be replaced by the relationship between the size of the data area and the size of the background area, e.g., the ratio of the diameter of the smallest sphere containing all the data points to the diameter of the background space (a sheet in 2-dimensional cases). In this way, we can apply the idea of foveal visual fields to cases of more than two dimensions.
In Stage 1, an avoidance field (circle) around a data point is determined by two factors:
the size of the background area and the density of the local area around the point. We generate noise points in such a way that 1) they are generated randomly according to a uniform distribution, 2) they do not fall into the avoidance fields, and 3) they keep a certain distance from each other. The radius r of the avoidance field around a point x is determined by

r = min(r_p, r_d),
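Stage 1 can be sketched as rejection sampling. The paper's definitions of r_p and r_d are not given in this excerpt, so the sketch uses stand-ins: a background-scaled constant via alpha and a nearest-neighbor distance shrunk by beta, echoing the parameters mentioned in Section 4. These stand-ins are assumptions, not the paper's formulas.

```python
# Sketch of Stage 1 (noise generation). r_p and r_d below are toy stand-ins
# for the (unstated) background-size and local-density radii; alpha and beta
# are placeholders for the avoidance-field parameters of Section 4.
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def avoidance_radius(x, data, alpha=0.035, beta=1.1):
    r_p = alpha                                   # stand-in: background-scaled radius
    d_nn = min(dist(x, y) for y in data if y is not x)
    r_d = d_nn / beta                             # stand-in: shrinks with density
    return min(r_p, r_d)                          # r = min(r_p, r_d)

def generate_noise(data, n, min_gap=0.01, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    radii = [avoidance_radius(x, data) for x in data]
    noise = []
    while len(noise) < n:
        z = tuple(rng.random() for _ in range(dim))            # uniform in [0,1]^dim
        ok = all(dist(z, x) > r for x, r in zip(data, radii))  # outside all fields
        ok = ok and all(dist(z, w) > min_gap for w in noise)   # mutual gap
        if ok:
            noise.append(z)
    return noise

data = [(0.2, 0.2), (0.25, 0.22), (0.8, 0.8)]
noise = generate_noise(data, n=len(data))   # same count as data points
print(len(noise))
```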
The previously proposed method was limited to two-dimensional data, because the stage for extracting clusters (Stage 3) could not be processed in higher dimensionality. After Stage 2, since the data class region is known, class assignment is easily done for any point. However, finding clusters as connected regions in the data class region requires a special technique.
To cope with this difficulty, we take the following approach.
1. Select any pair of data points and see whether the line segment connecting this pair of points crosses the noise class region or not. If it does not, the pair of points is considered to belong to the same connected region, and they are said to be in a relation R.
2. Clusters are found by taking a transitive closure of R.
In practice, the line segment is replaced by a set of points on that line. To check whether two points are in R or not, we take the following approach. For two points x and y, the middle point z_1 = (x + y)/2 is first checked, and then the two points z_2 = (x + z_1)/2 and z_3 = (z_1 + y)/2 are checked, and so on. This procedure is repeated recursively up to depth d, that is, 2^d - 1 equally spaced points on the line segment are examined. This recursive call is terminated when a point belonging to the noise class is found (Fig. 3). In this way, it is possible that a connected region is judged to be more than one region. Fig. 3 is such an example. This is a limitation of this procedure. The precision, however, will increase by increasing the number of data points and the value of d.
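The midpoint test and the transitive closure of R can be sketched as follows; `in_noise_region` is a toy stand-in for the Stage 2 classifier (an SVM in the experiments), and the sample points are invented for illustration.

```python
# Sketch of cluster extraction: two data points are related (R) when the
# segment between them never enters the noise-class region; clusters are
# the transitive closure of R, computed here with union-find.

def in_noise_region(z):
    # Toy stand-in for the trained classifier: the vertical band
    # 0.4 < x < 0.6 is "noise", splitting the unit square in two.
    return 0.4 < z[0] < 0.6

def related(x, y, depth=6):
    """Recursive midpoint test: examines 2^depth - 1 equally spaced points."""
    if depth == 0:
        return True
    z = tuple((a + b) / 2 for a, b in zip(x, y))    # z1 = (x + y) / 2
    if in_noise_region(z):
        return False                                 # segment crosses noise
    return related(x, z, depth - 1) and related(z, y, depth - 1)

def clusters(points):
    """Connected groups of points under the transitive closure of R."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if related(points[i], points[j]):
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

pts = [(0.1, 0.2), (0.2, 0.8), (0.9, 0.5)]
print(clusters(pts))   # the first two points connect; the third is cut off
```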
4 Experiments
We used the support vector machine (SVM) [7] with radial basis functions as its kernel functions. The SVM has two parameters: c for specifying the degree of soft margin and σ for specifying the degree of sharpness of the kernels. We adopted c = 1000, which means that a hard margin is expected, because, in our setting, the classification problem is separable in nature. The value of σ was chosen so as to give an appropriate result in each experiment. The parameters for the avoidance fields, α and β, were set to α = 0.035 and β = 1.1, respectively. We generated the same number of noise points as the number of data points.
We dealt with three kinds of synthetic datasets: two in a two-dimensional space and one in a three-dimensional space.
(1) Synthetic dataset 1
We used a dataset similar to those in [4]. Many human observers may find four line-shaped clusters in this dataset. The results are shown in Fig. 4. The desired result (Fig. 4(b)) was obtained four times in 20 trials. Various results (4 to 7 clusters) were obtained. This variation in the results is caused by different sets of noise points. All of the results seem reasonable under different interpretations.
Figure 4: Results for dataset 1 (σ = 5.0). Different symbols show different clusters.
Figure 5: Results for dataset 2 (σ = 0.5). Different symbols show different clusters.
Figure 6: Results for dataset 3 (σ = 0.5). Different symbols show different clusters.
5 Discussion
The proposed algorithm differs from previous algorithms in the following points: 1) the algorithm takes into consideration the gap among data points and the structure of the data points, 2) clustering is formulated as classification that distinguishes data points from noise points, and 3) the algorithm produces several different results even with a fixed parameter set. Property 3) is the most remarkable characteristic of this approach. In the case of a small dataset, the various results give several different but meaningful interpretations of the dataset. In the case of a large dataset, there would be less variation in the results. In addition, if every cluster is compact and a large gap exists among clusters, the results have less variation, possibly a unique result. In that case, the results would not be sensitive to the parameters.
A change in the parameters of the classifier results in a different number of clusters. In the case of the SVM, a larger value of σ produces more clusters. If this algorithm is carried out with a sequence of decreasing values of σ, it gives a hierarchical clustering.
6 Conclusion
We have proposed a clustering algorithm that is based on the gap among data points and the structure of data points. This algorithm is an extension of our previously proposed algorithm. Experimental results using the newly proposed algorithm were satisfactory in the sense that the most frequently obtained results were consistent with the clustering results of human observers or were natural for the datasets. The following problems remain to be solved: 1) clarification of the relationship between the number of noise points and the clustering results and 2) improvement in a systematic way of choosing the values of the parameters of a classifier.
References
1 Introduction
2 Possible-Worlds-Restriction
where π_i is exactly one of p_i and ¬p_i. They are nothing but the state-descriptions defined by Carnap [1]. Let W_Φ be the set of all such worlds generated from Φ. We can identify the set of possible worlds with the power set of Φ by the bijection w_k → {p_i | π_i^k = p_i} in the following way: W_Φ = 2^Φ. In crisp cases, one of those 2^n worlds should be exactly the actual one. If, however, we do not have enough information about the p_i, then we cannot specify which of the worlds is the actual one.
T. Murai et al. / Tables in Relational Databases 127
Example 1 Let Φ = {p_1, p_2, p_3}; then we have the following 2^3 (= 8) possible worlds and their corresponding sets:

Possible Worlds | Logical Sentences | Sets
w_1 | p_1 ∧ p_2 ∧ p_3 | {p_1, p_2, p_3}
w_2 | p_1 ∧ p_2 ∧ ¬p_3 | {p_1, p_2}
w_3 | p_1 ∧ ¬p_2 ∧ p_3 | {p_1, p_3}
w_4 | p_1 ∧ ¬p_2 ∧ ¬p_3 | {p_1}
w_5 | ¬p_1 ∧ p_2 ∧ p_3 | {p_2, p_3}
w_6 | ¬p_1 ∧ p_2 ∧ ¬p_3 | {p_2}
w_7 | ¬p_1 ∧ ¬p_2 ∧ p_3 | {p_3}
w_8 | ¬p_1 ∧ ¬p_2 ∧ ¬p_3 | ∅
In the definition, B_Φ is a non-empty set of subsets of W_Φ whose elements are possible candidates for the actual world under the available information. The relation ⊨ can be extended to arbitrary sentences in L(Φ) in the usual way. In particular, we can define a truth-condition for the modal operator by

𝔐, w ⊨ Bp iff for every w′ with w R w′, 𝔐, w′ ⊨ p.

In general, the proposition ||p||^𝔐 of a sentence p in a model 𝔐 is defined as the set of possible worlds where p is true:

||p||^𝔐 = {w ∈ W_Φ | 𝔐, w ⊨ p}.

Thus pwr-models are a standard kind of Kripke models, at least KD45-models, which validate

K. B(p → q) → (Bp → Bq),
D. Bp → ¬B¬p,
4. Bp → BBp,
5. ¬Bp → B¬Bp,

because the relation R satisfies seriality, transitivity, and euclideanness (cf. [2]).
If 𝔐_Φ, w ⊨ p holds at every world w of a model 𝔐_Φ, p is said to be valid in this model, denoted 𝔐_Φ ⊨ p.
Example 3 For the set of possible worlds described in Example 1, assume that we have two pieces of information, f_1: p_1 and f_2: p_1 → p_3. Then they are translated into the following restrictions:

B_1 = ||p_1|| = {w_1, w_2, w_3, w_4},
B_2 = ||p_1 → p_3|| = {w_1, w_3, w_5, w_6, w_7, w_8},

respectively. Then, by taking the intersection of the two pieces of information, we have

B = ∩_{i=1,2} B_i = {w_1, w_3}.
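Examples 1 and 3 can be checked mechanically by enumerating the power set of Φ; the sketch below uses string atoms and represents each world by its corresponding set.

```python
# Sketch of Examples 1 and 3: the 2^3 possible worlds over {p1, p2, p3}
# as subsets, the propositions ||p1|| and ||p1 -> p3||, and their
# intersection as the restricted set of candidates for the actual world.
from itertools import chain, combinations

atoms = ("p1", "p2", "p3")
worlds = [frozenset(c) for c in chain.from_iterable(
    combinations(atoms, k) for k in range(len(atoms) + 1))]
print(len(worlds))   # 2^3 = 8

def prop(worlds, sentence):
    """||p||: the set of worlds where the sentence is true."""
    return {w for w in worlds if sentence(w)}

B1 = prop(worlds, lambda w: "p1" in w)                    # ||p1||
B2 = prop(worlds, lambda w: "p1" not in w or "p3" in w)   # ||p1 -> p3||
B = B1 & B2          # candidate worlds: exactly those containing p1 and p3
print(sorted(sorted(w) for w in B))
```

The intersection contains exactly the two worlds whose sets include both p1 and p3, matching {w_1, w_3} in the world numbering of Example 1.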
Let PRED be a set of predicate symbols. We assume that each predicate symbol p ∈ PRED has its own arity n(p), by which we classify predicate symbols: let

PRED_k =_def {p | n(p) = k},

then

PRED = ∪_{k ∈ N+} PRED_k,

where N+ is the set of positive integers. We also denote a predicate symbol p of arity k by p(x_1, ..., x_k), using variables x_1, ..., x_k.
Further, we assume that each place in each predicate symbol has its own set, whose elements can be applicable.
So, given a frame 𝔉 = <PRED, U>, the set Φ_𝔉 of atomic sentences is defined by

Φ_𝔉 =_def ∪_{p ∈ PRED} Φ_p,

where

Φ_p =_def {p(x_1, ..., x_k) | x_i ∈ U_i^p}

for p ∈ PRED_k.
Example 5 Let PRED = PRED_1 ∪ PRED_2, where PRED_1 = {company, city, part} and PRED_2 = {locate, supply}. Also let
Thus we have a frame 𝔉 = <PRED, U>. Then we have the following set of atomic sentences:

Φ = { company(Smith), company(Jones), company(Adams),
city(London), city(Paris), part(bolt), part(nut),
locate(Smith, London), locate(Smith, Paris), locate(Jones, London),
locate(Jones, Paris), locate(Adams, London), locate(Adams, Paris),
supply(Smith, bolt), supply(Smith, nut), supply(Jones, bolt),
supply(Jones, nut), supply(Adams, bolt), supply(Adams, nut) }. •
DB = { company(Smith), company(Jones), locate(Smith, London),
locate(Jones, Paris), supply(Jones, nut) }. •

DB_p =_def DB ∩ Φ_p,
4 Basic Tables

𝔐_p = <W_p, B_p, ⊨>,

where W_p = 2^{Φ_p}, B_p = {w ∈ 2^{Φ_p} | DB_p ⊆ w}, and 𝔐_p, w ⊨ p (for p ∈ Φ_p) iff p ∈ w. Given a basic schema p and its model 𝔐_p, we can define a table T_p generated in 𝔐_p as the set of data that have the support of belief.
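Under the model 𝔐_p above, an atom is believed (Bq) exactly when it belongs to every candidate world, i.e., to every superset of DB_p within Φ_p. A small sketch using the `company` schema of Example 5 follows; enumerating all worlds is only for illustration, since their number grows as 2^|Φ_p|.

```python
# Sketch of a basic table: candidate worlds are all supersets of DB_p
# inside Phi_p, and an atom is believed (Bq) exactly when it lies in
# every candidate world. Schema and atoms follow Example 5.
from itertools import chain, combinations

phi_p = ["company(Smith)", "company(Jones)", "company(Adams)"]
db_p = {"company(Smith)", "company(Jones)"}

worlds = [set(c) for c in chain.from_iterable(
    combinations(phi_p, k) for k in range(len(phi_p) + 1))]
candidates = [w for w in worlds if db_p <= w]          # B_p

def believed(q):
    return all(q in w for w in candidates)             # M_p, w |= Bq everywhere

table = sorted(q for q in phi_p if believed(q))        # the basic table
print(table)
```

Only the two atoms actually recorded in DB_p are believed; company(Adams) fails because some candidate world omits it.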
Definition 11 Given a frame 𝔉 and a database DB in 𝔉, a basic table with respect to p generated in 𝔐_p is defined by
5 Compound Tables
(1)
(2) p(α_1, ..., α_k), p′(β_1, ..., β_{k′}) ∈ SCH_𝔉 and U_i^p = U_j^{p′} for ...,
where α_i, β_j, and γ_k are meta-variables for the variables x_1, x_2, ..., and

(company∧locate)(x_1, x_2),
(company∧supply)(x_1, x_3),
(locate∧supply)(x_1, x_2, x_3). •

(p∧p′)(x_1, ..., x_k, x′_1, ..., x′_{k′})

from p(x_1, ..., x_k) and p′(x_1, ..., x_{k′}). We often identify p∧p′ with the set {p, p′}.
The set of possible worlds and the valuation are defined in a similar way:

w ⊨ p (for p ∈ Φ_{p∧p′}) iff p ∈ w.
To define the set of accessible worlds B_{p∧p′}, we must notice that a database is given as a set of atomic sentences, while each world is a set of compound sentences that have the form of the compound schema in question. So we must embed the database DB_{p,p′} into the set of compound sentences Φ_{p∧p′}. To accomplish this, we use approximation in rough set theory [5], in the following two ways:

𝒳⁻ =_def {X ⊆ Φ_{p∧p′} | ∪X ⊆ DB_{p,p′}},
DB⁻_{p∧p′} =_def {X ∈ 𝒳⁻ | ∀Y ∈ 𝒳⁻ (X ⊆ Y ⟹ X = Y)},
𝒳⁺ =_def {X ⊆ Φ_{p∧p′} | DB_{p,p′} ⊆ ∪X},
DB⁺_{p∧p′} =_def {X ∈ 𝒳⁺ | ∀Y ∈ 𝒳⁺ (Y ⊆ X ⟹ X = Y)}.
Then we can define the following two kinds of sets of accessible worlds:

B⁻_{p∧p′} =_def { {w ∈ 2^{Φ_{p∧p′}} | X ⊆ w} }_{X ∈ DB⁻_{p∧p′}},
B⁺_{p∧p′} =_def { {w ∈ 2^{Φ_{p∧p′}} | X ⊆ w} }_{X ∈ DB⁺_{p∧p′}},

𝔐_{p∧p′}, w ⊨ Bq ⟺

and we obviously have

T⁻_{p∧p′} ⊆ T⁺_{p∧p′}.

This is a multi-Scott-Montague model.
T_{company∧supply}:
company | part
Smith | bolt

T̄_{company∧supply}:
company | part | Status
Smith | bolt | B
Jones | bolt | B
Jones | nut | B
In the first table T_{company∧supply}, there is a loss of information, i.e., company(Jones). In the second table T̄_{company∧supply}, where we add a new column 'Status' for convenience, we can aggregate the second and third tuples if we define the value '-' (null, unknown) when

{x | 𝔐_{company∧supply} ⊨ B(company∧supply)(Jones, x)} = U_3.

Then we have the following new table representation
In this paper, we illustrated a way of constructing tables from data given as atomic sentences, using approximation in rough set theory. The first author was partially supported by Grant-in-Aid No. 14380171 for Scientific Research (B) of the Japan Society for the Promotion of Science of Japan.
References
[1] R. Carnap, Meaning and Necessity: A Study in Semantics and Modal Logic, Univ. of Chicago Press, 1947.
[2] B.F. Chellas, Modal Logic: An Introduction, Cambridge Univ. Press, 1980.
[3] E.F. Codd, A Relational Model of Data for Large Shared Data Banks, CACM, 13 (1970), 377-387.
[4] T. Murai and M. Shimbo, Fuzzy Sets and Modal Logic Based on Evidence, Proc. 6th IFSA Congress, São Paulo, 1995, 471-474.
[5] Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer, 1991.
134 Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
1 Introduction
The notion of a Type-2 Fuzzy Set was introduced in 1975 by LOTFI A. ZADEH [20]. Since then a lot of papers have been published, for instance [2-5, 9-13, 15-18]. The book [9] gives a comprehensive survey of the state of the art in this field, including a long list of references.
H. Thiele / Different Interpretations of the Generalized Modus Ponens 135
The Generalized Modus Ponens (GMP) introduced in 1973 by ZADEH [19] is a well-known
inference rule in the field of (fuzzy logic based) approximate reasoning. Note that in this
paper we shall not discuss other inference rules used in approximate reasoning.
Let Φ, Ψ, Φ′, and Ψ′ be C^T-Type-2 Fuzzy Sets on U, i.e. Φ, Ψ, Φ′, Ψ′ : U → C^T. The scheme
On the basis of a given semantics SEM² we define how Ψ′ can be computed using the fuzzy sets Φ, Ψ, and Φ′ and using an interpretation of ⇒. To do this we apply the well-known Compositional Rule of Inference introduced by ZADEH in [19]. We underline that meanwhile other methods to interpret the generalized modus ponens have been developed, but due to restricted space we cannot discuss these problems here.
Definition 2.3.
1. The C^T-Type-2 GMP is said to be correct for Φ and Ψ with respect to the semantics SEM² ⟺_def FUNK^{SEM²}(Φ, Ψ, Φ) = Ψ.
2. The C^T-Type-2 GMP is said to be correct with respect to the semantics SEM² ⟺_def for every Φ, Ψ ∈ (C^T)^U, FUNK^{SEM²}(Φ, Ψ, Φ) = Ψ.
Note. Further correctness definitions are formulated in [6, 17], in particular, using a topology defined on (C^T)^U and the concept of continuity.
Now we are going to prove two correctness theorems, as examples. Therefore we assume that $C$ is a lattice with respect to the binary relation $\leq$ on $C$. For $c, d \in C$, by $inf(c, d)$ we denote the infimum of $c$ and $d$ with respect to $\leq$. For $\xi, \eta \in C^T$ we define $\xi \sqsubseteq \eta$ $=_{def}$ $\forall t(t \in T \to \xi(t) \leq \eta(t))$.
Then it is well known that $C^T$ is a lattice with respect to $\sqsubseteq$. The infimum of $\xi$ and $\eta$ with respect to $\sqsubseteq$ is denoted by $Inf(\xi, \eta)$.
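The pointwise order just defined can be sketched for a finite index set $T$ and the illustrative choice $C = [0, 1]$ (both assumptions made only for this example):

```python
# Sketch: the pointwise order and infimum on C^T, assuming a finite T
# and C = [0, 1] with the usual order (illustrative choices).
T = ["t1", "t2", "t3"]

def leq(xi, eta):
    """xi <= eta pointwise, i.e. xi(t) <= eta(t) for every t in T."""
    return all(xi[t] <= eta[t] for t in T)

def inf2(xi, eta):
    """Pointwise infimum Inf(xi, eta) on C^T."""
    return {t: min(xi[t], eta[t]) for t in T}

xi = {"t1": 0.2, "t2": 0.9, "t3": 0.5}
eta = {"t1": 0.4, "t2": 0.6, "t3": 0.5}
m = inf2(xi, eta)
# the infimum is a lower bound of both arguments:
assert leq(m, xi) and leq(m, eta)
```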
Theorem 1. If
1. $\forall \xi \forall \eta(\xi, \eta \in C^T \to \ldots)$
2. $\forall \xi \forall \eta(\xi, \eta \in C^T \to \ldots)$
3. $\forall X(X \subseteq C^T \wedge X \neq \emptyset \wedge \forall \xi(\xi \in X \to \xi \sqsubseteq \eta) \to Q^2 X \sqsubseteq \eta)$,
$\ldots, \Psi(v))$, (1)
furthermore, we obtain
$\ldots Inf(\Phi(u), \Psi(v)) \ldots$ (2)
hence by assumption 3
$\Psi(v) \sqsubseteq Q^2\{\mathcal{K}^2(\Phi(u), \Pi^2(\Phi(u), \Psi(v))) \mid u \in U\}$.
For formulating and proving the next theorem we need the following result and notation.
Assume that the lattice $[C, \leq]$ has the unit element $1$. For $t \in T$ we define $\mathbf{1}(t) =_{def} 1$. Then $\mathbf{1}$ is the unit element of the lattice $[C^T, \sqsubseteq]$.
For $f : C^T \to C^T$ we say that $f$ is $Q^2$-continuous $=_{def}$
$\forall X(X \subseteq C^T \wedge X \neq \emptyset \to f(Q^2 X) \sqsubseteq Q^2\{f(x) \mid x \in X\})$.
Theorem 2. If at least one of the following three assumptions is satisfied
assumption 1: (a) $\ldots$
(b) $\ldots \mid u \in U\}$
(c) $Inf$ is $Q^2$-continuous in its first argument
(d) $\Pi^2 = \mathcal{K}^2 = Inf$
assumption 2: (a) $\forall \eta(\eta \in C^T \to \lambda\xi\,\mathcal{K}^2(\xi, \Pi^2(\xi, \eta))$ is $Q^2$-continuous$)$
(b) $\forall \eta(\eta \in C^T \to \Pi^2(\mathbf{1}, \eta) = \mathcal{K}^2(\mathbf{1}, \eta) = \eta)$
(c) $Q^2\{\Phi(u) \mid u \in U\} = \mathbf{1}$
assumption 3: (a) $\ldots$
(b) $\exists u(u \in U \wedge \Phi(u) = \mathbf{1})$
Proof.
ad assumption 1: By 1a, $v \in U$, and $\Psi(v) \in \{\Psi(u) \mid u \in U\}$, we get
$\Psi(v) \sqsubseteq Q^2\{\Psi(u) \mid u \in U\}$, (4)
ad assumption 2: By 2a we get
$\mathcal{K}^2(Q^2\{\Phi(u) \mid u \in U\}, \Pi^2(Q^2\{\Phi(u) \mid u \in U\}, \Psi(v)))$ (11)
$\sqsubseteq Q^2\{\mathcal{K}^2(\Phi(u), \Pi^2(\Phi(u), \Psi(v))) \mid u \in U\}$.
Applying 2b and 2c we obtain successively
$\mathcal{K}^2(Q^2\{\Phi(u) \mid u \in U\}, \Pi^2(Q^2\{\Phi(u) \mid u \in U\}, \Psi(v))) = \Psi(v)$, (12)
hence by (11)
$\Psi(v) \sqsubseteq Q^2\{\mathcal{K}^2(\Phi(u), \Pi^2(\Phi(u), \Psi(v))) \mid u \in U\}$, (13)
i.e.
$\Psi(v) \sqsubseteq FUNK^{SEM^2}(\Phi, \Psi, \Phi)(v)$. (14)
For getting a suitable terminology, in the following elements $\xi, \eta \in C^T$ are called $C^T$-Type-2 Fuzzy Values.
Consider an $n$-ary operation $OPER^n$ with $C^T$-Type-2 Fuzzy Sets $\Phi_1, \ldots, \Phi_n$ on $U$, i.e. $OPER^n : ((C^T)^U)^n \to (C^T)^U$.
Now, we reduce the operation $OPER^n$ to an $n$-ary operation $oper^n$ with $C^T$-Type-2 Fuzzy Values, i.e. using the operation $oper^n : (C^T)^n \to C^T$ we define the operation $OPER^n$ as follows, where $u \in U$:
$OPER^n(\Phi_1, \ldots, \Phi_n)(u) =_{def} oper^n(\Phi_1(u), \ldots, \Phi_n(u))$.
Obviously, this reduction is used in Definitions 2.1 and 2.2 with respect to the binary mappings $\Pi^2, \mathcal{K}^2 : C^T \times C^T \to C^T$.
In [16] we have described two approaches to operating with context-dependent fuzzy sets, the so-called external and internal operations, respectively. These ideas are adopted here for operating with $C^T$-Type-2 Fuzzy Values.
Definition 3.1 (External Operations with $C^T$-Type-2 Fuzzy Values).
Assume we have an $n$-ary mapping $op_e^n : C^n \to C$ called an external operation. Then we "lift" this operation to an $n$-ary "externally defined" operation $Op_e^n$ on $C^T$, i.e. $Op_e^n : (C^T)^n \to C^T$, as follows. For every $\xi_1, \ldots, \xi_n \in C^T$ and every $t \in T$ we put
$Op_e^n(\xi_1, \ldots, \xi_n)(t) =_{def} Sup\{min(\xi_1(t_1), \ldots, \xi_n(t_n)) \mid t_1, \ldots, t_n \in T \wedge t = op_e^n(t_1, \ldots, t_n)\}$
Obviously this definition only makes sense if $C = [0, 1]$, for instance. Note that this "lifting procedure" is used in the papers [10, 11] in order to define algebraic structures in $[0, 1]^{[0,1]}$ by "internal lifting" of $max(r, s)$, $min(r, s)$, $neg(r)$ ($r, s \in [0, 1]$) and other norms, respectively.
If we assume that $C$ is a complete lattice with the supremum operator $Sup$ and the infimum operator $Inf$, then we can generalize the definition of $Op_e^n$ above as follows:
$Op_e^n(\xi_1, \ldots, \xi_n)(t) =_{def} Sup\{Inf(\xi_1(t_1), \ldots, \xi_n(t_n)) \mid t_1, \ldots, t_n \in T \wedge t = op_e^n(t_1, \ldots, t_n)\}$
For our investigations it is favourable to introduce the following further generalization of the extension principle, where $T$ and $C$ are arbitrary non-empty sets. Assume $\pi^n : C^n \to C$ and $Q : \mathcal{P}C \to C$; then we put
$Op^n(\xi_1, \ldots, \xi_n)(t) =_{def} Q\{\pi^n(\xi_1(t_1), \ldots, \xi_n(t_n)) \mid t_1, \ldots, t_n \in T \wedge t = op^n(t_1, \ldots, t_n)\}$
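The lifting above can be sketched for a finite $T$ and $C = [0, 1]$, with $Sup = \max$ and $Inf = \min$; the operation `op_e` on $T$ is a hypothetical example chosen only for illustration:

```python
# Sketch of the external lifting (extension principle) over a finite T,
# assuming C = [0, 1], Sup = max, Inf = min (illustrative choices).
from itertools import product

T = [0, 1, 2]

def op_e(t1, t2):
    # hypothetical binary external operation on T
    return (t1 + t2) % 3

def lift(op, xi1, xi2):
    """Op_e(xi1, xi2)(t) = Sup{ Inf(xi1(t1), xi2(t2)) | op(t1, t2) = t }."""
    out = {}
    for t in T:
        vals = [min(xi1[t1], xi2[t2])
                for t1, t2 in product(T, T) if op(t1, t2) == t]
        out[t] = max(vals) if vals else 0.0  # Sup of the empty set taken as 0
    return out

xi1 = {0: 1.0, 1: 0.3, 2: 0.0}
xi2 = {0: 0.5, 1: 0.8, 2: 0.2}
lifted = lift(op_e, xi1, xi2)  # a new C^T value indexed by T
```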
In Section 3 we have reduced general operations with $C^T$-Type-2 Fuzzy Sets to operations with their $C^T$-Type-2 Fuzzy Values. Note that this reduction principle is already applied in Section 2 to interpret the $C^T$-Type-2 GMP. Furthermore, in Section 3 we have stated that for operating with $C^T$-Type-2 Fuzzy Values there are two methods, the external and the internal lifting.
We recall that for interpreting the $C^T$-Type-2 GMP we have two mappings $\Pi^2, \mathcal{K}^2$ for operating with $C^T$-Type-2 Fuzzy Values. Then we have defined (see Definition 2.2) two binary $C^T$-Type-2 Fuzzy Relations $\lambda u \lambda v\, \Pi^2(\Phi(u), \Psi(v))$ and $\ldots$
Apart from other possibilities not discussed here, referring to Section 3 each of the functions $\Pi^2$ and $\mathcal{K}^2$ can be constructed by an external or an internal lifting from $C$ to $C^T$ and from $T$ to $C^T$, respectively.
So, using this "lifting philosophy", we obtain the following four cases of interpreting the $C^T$-Type-2 GMP, where in the following we generally assume $u, v \in U$ and $t \in T$.
Case 1. The EE-Interpretation, where "EE" means "external-external".
In this case we assume $\Pi^2$ and $\mathcal{K}^2$ are generated by external lifting of the functions $\pi_e, \kappa_e : C \times C \to C$ from $C$ to $C^T$. Therefore we define
Definition 4.1. $SEM_{ee} = [\pi_e, \kappa_e, Q]$ is said to be an EE-Semantics for the $C^T$-Type-2 GMP $=_{def}$
1. $\pi_e, \kappa_e : C \times C \to C$
2. $Q : \mathcal{P}C \to C$
Using this semantics we define the so-called EE-Interpretation of the $C^T$-Type-2 GMP as follows.
Definition 4.2.
1. $REL_e(\pi_e, \Phi, \Psi)(u, v)(t) =_{def} \pi_e(\Phi(u)(t), \Psi(v)(t))$
2. $\Psi'_{ee}(v)(t) =_{def} Q\{\kappa_e(\Phi'(u)(t), REL_e(\pi_e, \Phi, \Psi)(u, v)(t)) \mid u \in U\}$
3. $FUNK^{SEM_{ee}}(\Phi, \Psi, \Phi') =_{def} \Psi'_{ee}$
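A minimal sketch of the EE-Interpretation over finite $U$ and $T$, assuming $C = [0, 1]$ and the illustrative choices $\pi_e = \kappa_e = \min$ and $Q = \max$ (the paper leaves these parameters open):

```python
# Hedged sketch of the EE-interpretation of the GMP (Definition 4.2),
# with pi_e = kappa_e = min and Q = max chosen only for illustration.
U = ["u1", "u2"]
T = ["t1", "t2"]

pi_e = min
kappa_e = min
Q = max

def rel_e(Phi, Psi, u, v, t):
    """REL_e(pi_e, Phi, Psi)(u, v)(t) = pi_e(Phi(u)(t), Psi(v)(t))."""
    return pi_e(Phi[u][t], Psi[v][t])

def funk_ee(Phi, Psi, Phi1, v, t):
    """Psi'_ee(v)(t) = Q{ kappa_e(Phi'(u)(t), REL_e(...)(u, v)(t)) | u in U }."""
    return Q(kappa_e(Phi1[u][t], rel_e(Phi, Psi, u, v, t)) for u in U)

Phi = {"u1": {"t1": 0.9, "t2": 0.2}, "u2": {"t1": 0.4, "t2": 0.7}}
Psi = {"v1": {"t1": 0.6, "t2": 0.8}}
# compute Psi' for the correctness case Phi' = Phi:
out = {v: {t: funk_ee(Phi, Psi, Phi, v, t) for t in T} for v in Psi}
```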
3. $\kappa_i : C \times C \to C$
4. $Q_i : \mathcal{P}C \to C$
5. $Q : \mathcal{P}C \to C$
Using this semantics we define the so-called EI-Interpretation of the $C^T$-Type-2 GMP as follows.
Definition 4.4.
2.
3. $Q_i : \mathcal{P}C \to C$
4. $\kappa_e : C \times C \to C$
5. $Q : \mathcal{P}C \to C$
Using this semantics we define the so-called IE-Interpretation of the $C^T$-Type-2 GMP as follows.
Definition 4.6.
2. $\pi_i : C \times C \to C$
3. $Q_i : \mathcal{P}C \to C$
4. $\kappa_i : C \times C \to C$
5. $\kappa : C \times C \to C$
6. $Q_j : \mathcal{P}C \to C$
7. $Q : \mathcal{P}C \to C$
Using this semantics we define the so-called II-Interpretation of the $C^T$-Type-2 GMP as follows.
Definition 4.8.
1. $REL_i(\pi_i, Q_i, \Phi, \Psi)(u, v)(t) =_{def} \ldots$
2.
5 Conclusions
With respect to the concepts developed in Section 4 we are faced with the problem of investigating and applying a Type-2 approximate reasoning based on the four kinds of interpreted $C^T$-Type-2 Generalized Modus Ponens.
Because of the restricted space here, this will be done in a forthcoming paper.
Acknowledgement
The author would like to thank VOLKHER KASCHLUN for his help in preparing the
manuscript.
References
[1] R. E. Bellman and L. A. Zadeh. Local and fuzzy logics. In J. M. Dunn and G. Epstein, editors, Modern
Uses of Multiple-Valued Logic — Invited Papers of 5th ISMVL Symposium 1975, pages 103-165. Reidel,
Dordrecht, 1977.
[2] D. Dubois and H. Prade. Operations in a fuzzy-valued logic. Information and Control, 43(2):224-240,
November 1979.
[3] R.I. John. Type 2 fuzzy sets: an appraisal of theory and applications. International Journal of Uncertainty,
Fuzziness and Knowledge-Based Systems, 6(6):563-576, Dec. 1998.
[4] N. N. Karnik and J. M. Mendel. Operations on type-2 fuzzy sets. Fuzzy Sets and Systems, to appear.
[5] N. N. Karnik, J. M. Mendel, and Q. Liang. Type-2 fuzzy logic systems. IEEE-FS, 7(6):643, December
1999.
[6] St. Lehmke, B. Reusch, K.-H. Temme, and H. Thiele. On interpreting fuzzy if-then rule bases by concepts
of functional analysis. Technical Report CI-19/98, University of Dortmund, Collaborative Research Center
531, February 1998.
[7] St. Lehmke. Logics which allow Degrees of Truth and Degrees of Validity - A way of handling Graded
Truth Assessment and Graded Trust Assessment within a single framework. Ph.D. thesis, University of
Dortmund (Germany), Department of Computer Science I, October 2001.
[8] J.M. Mendel. Computing with words when words can mean different things to different people. In 3rd
Annual Symposium on Fuzzy Logic and Applications, Rochester, New York, USA, June 22-25 1999.
[9] J.M. Mendel. Uncertain Rule-Based Fuzzy Logic Systems: Introduction and New Directions. Prentice
Hall PTR, 2001, 555 pages.
[10] M. Mizumoto and K. Tanaka. Some properties of fuzzy sets of type 2. Information and Control,
31(4):312-340, August 1976.
[11] M. Mizumoto and K. Tanaka. Fuzzy sets under various operations. In 4th International Congress of
Cybernetics and Systems, Amsterdam, The Netherlands, August 21-25, 1978.
[12] J. Nieminen. On the algebraic structure of fuzzy sets of type-2. Kybernetica, 13(4), 1977.
[13] H. Thiele. On logical systems based on fuzzy logical values. In EUFIT '95 — Third European Congress
on Intelligent Techniques and Soft Computing, volume 1, pages 28-33, Aachen, Germany, August 28-31,
1995.
[14] H. Thiele. On the concept of qualitative fuzzy set. In Proceedings of the Twenty-Ninth International
Symposium on Multiple-Valued Logic, pages 282-287, Freiburg, Germany, May 20-22, 1999.
[15] H. Thiele. On approximate reasoning with context-dependent fuzzy sets. In WAC 2000 — Fourth Biannual
World Automation Congress, Wailea, Maui, Hawaii, USA, June 11-16 2000.
[16] H. Thiele. A New Approach to Type-2 Fuzzy Sets. In the Congress of Logic Applied to Technology
(LAPTEC 2001) Sao Paulo, Brazil, November 12-14, 2001. Conference Proceedings, pages 255-262.
[17] H. Thiele. On approximative reasoning with Type-2 Fuzzy Sets. Accepted paper, IPMU 2002, Annecy,
France, July 1-5, 2002.
[18] M. Wagenknecht and K. Hartmann. Application of fuzzy sets of type 2 to the solution of fuzzy equation
systems. Fuzzy Sets and Systems, 25:183-190, 1988.
[19] L. A. Zadeh. Outline of a New Approach to the Analysis of Complex Systems and Decision Processes.
IEEE Trans. on Systems, Man, and Cybernetics, vol. SMC-3, pp. 26-44, 1973.
[20] L. A. Zadeh. The concept of a linguistic variable and its application to approximate reasoning — I-III.
Information Sciences, (I) 8:199-249, (II) 8:301-357 (III) 9:43-80, 1975.
Abstract. There is a large variety of search engines available on the Web. Most of them act as regular Information Retrieval tools where the collection is the whole World Wide Web. Although evaluations are made against human-judged references of which documents in a collection are relevant to a given query, the bulk of existing work consists of variations of the classic logic, probabilistic, and vector models, without taking into account memory issues, which underlie cognitive processes, especially reasoning: neither the models nor their variations use knowledge. Thus, they retrieve pages that contain the words of a query rather than the virtual set of pages covering the subject those words are intended to point to. It seems likely that human-like heuristics and techniques would perform better. In this paper, some questions concerning IR are discussed, and an add-on process for misspelling-noise reduction is presented, using paraconsistent knowledge over automatically acquired behaviours of features. The domain is the retrieval of ontologies represented in the Resource Description Framework (RDF) according to keys provided by the user. This is an extremely short step into the primary improvements likely to be attained by providing search engines with semantic knowledge.
1 Introduction
Search engines are usually built according to three paradigms well defined in the Information Retrieval (IR) field: the boolean, vector, and probabilistic models [2]. Those techniques were developed under mathematical and statistical assumptions. The current state of the art, from the perspective of the results achieved, is not bad, but it is far from what human beings can do and especially far from what they wish.
IR is not being attacked the way it should be. Since the target is a human-like retrieval of information, the mechanisms that determine which documents must be fetched in satisfaction of a query should be closer to cognitive processes, instead of just strict word-checking/word-counting models.
In this work, an approach for the reduction of noise due to misspelling mistakes is introduced. This approach is primarily based on knowledge about the similarity of words. Moreover, in order not to allow mistaken unifications, some heuristics concerning the behaviour of features noticed in collections are automatically acquired and used. All the terms are qualified with
¹ This work was supported by CAPES, Brazil.
E.L. dos Santos et al. / Paraconsistent Knowledge
those features. Terms are then refined through paraconsistent knowledge of the authenticity of words, which is related to those automatically acquired features; this knowledge, however, is provided by experts.
At the end of that process, the noise due to misspelling mistakes is reduced. The motivation is that some collections may present lots of misspelling mistakes, which can jeopardise the performance of a search engine.
Section 2 shows how naive string-approximation methods for misspelling-noise tolerance can be replaced by rules conveying semantic knowledge. Next, knowledge from experts is expressed in the form of paraconsistent rules in order to avoid inadequate unification, discarding erroneous actions. The whole architecture of a search engine using the presented noise-reduction technique is provided in Section 4. Finally, a conclusion is presented. Examples of the additional information used are supplied in the Appendices.
As is common in real-world data, the RDF collection contained several typing errors. If no tolerance were applied, the search engine would not be able to identify that a misspelled variant (e.g., "aproximation") was, in fact, intended to be "approximation".
The tolerant unification was implemented according to some heuristics which try to represent the way human beings reason when they must decide whether two strings are or are not equal. The heuristics were expressed as rules and concern the following ideas:
• Two words are equal if, when compared, a mistake does not occur more than once before it is forgotten;
• A mistake is forgotten if it was not repeated within the same word in the last θ characters (in this paper, the value of θ was 3);
• There are three kinds of mistakes: transposition mistakes, the reversal of two letters; typing mistakes, when a different letter is found instead of the expected one; and missing mistakes, when a letter is missed.¹
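One possible reading of these heuristics can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the treatment of trailing letters and the exact moment a mistake is "forgotten" (here, after θ subsequent matching characters) are assumptions.

```python
# Hedged sketch of the tolerant unification heuristics, assuming theta = 3.
THETA = 3

def tolerant_equal(w1, w2, theta=THETA):
    i = j = 0
    last_mistake = None          # position in w1 of the last unforgotten mistake
    while i < len(w1) and j < len(w2):
        if w1[i] == w2[j]:
            i += 1; j += 1
            if last_mistake is not None and i - last_mistake > theta:
                last_mistake = None      # the mistake is forgotten
            continue
        # a mistake occurs; reject if the previous one is not yet forgotten
        if last_mistake is not None:
            return False
        last_mistake = i
        if (i + 1 < len(w1) and j + 1 < len(w2)
                and w1[i + 1] == w2[j] and w1[i] == w2[j + 1]):
            i += 2; j += 2               # transposition mistake
        elif i + 1 < len(w1) and w1[i + 1] == w2[j]:
            i += 1                       # missing letter in w2
        elif j + 1 < len(w2) and w1[i] == w2[j + 1]:
            j += 1                       # missing letter in w1
        else:
            i += 1; j += 1               # typing mistake (substitution)
    # a single trailing letter counts as one (missing) mistake
    tail = (len(w1) - i) + (len(w2) - j)
    return tail == 0 or (tail == 1 and last_mistake is None)
```

For example, `tolerant_equal("aproximation", "approximation")` accepts the pair through a single missing-letter mistake, while `tolerant_equal("cat", "dog")` is rejected after the second mistake.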
It can be noticed in Table 1 that allowing smooth matching was of little use for recall. Worse, its effects on precision were extremely bad: lots of inadequate unifications were allowed.
Initially, this approach was to be used in the search: every term in the index which matched the terms of the query would be used to retrieve the related documents. However, according to the results provided later in this section, this technique proved not to be suitable because of the trouble already exposed, namely inadequate unifications. Actually, the way the problem was being dealt with was completely incorrect. Consequently, a new strategy turned out to be required: it is not the possible matchings that should be recognized in the search; the search must look for exact words. Instead, the index must be properly constructed in order to incorporate misspelling issues. This is explained in the next section.
¹ Additional letters are treated as missing letters too, as none of the words is considered to be right a priori.
The trouble with smooth search arises when an arbitrary word is compared with similar authentic words². The tolerant unification was designed not to care about some differences when comparing two terms, in order to allow partial matchings. Nevertheless, the tolerant unification cannot realize when two similar words are both authentic, because it does not have any knowledge concerning this issue. This cannot be disregarded, because it is a deficiency introduced by the tolerant unification itself.
Normally, a misspelled word is not found in a dictionary; however, sometimes people can transform the intended word into another word which is not invalid: the new word might have been misspelled in such a way that it still exists in a thesaurus. Plainly, the authenticity of a word is not just the same as its existence in a dictionary.
The terms of the indices and their respective frequencies in the documents are all the available information to decide whether a term $t_2$ is either just the authentic term $t_1$ misspelled or, indeed, another authentic term. If it is a misspelling, its frequency should be added to $t_1$'s and its entry removed from the index.
$\tau = \langle |\tau|, \leq \rangle$
$|\tau| = [0, 1] \times [0, 1]$
$\leq\, = \{((\mu_1, \rho_1), (\mu_2, \rho_2)) \in ([0, 1] \times [0, 1])^2 \mid \mu_1 \leq \mu_2 \wedge \rho_1 \leq \rho_2\}$
$G = \mu - \rho$ (1)
Let $p_1$ be a proposition whose annotation $[\mu_1, \rho_1]$ yields a certainty degree $G_1$. Then, with a certainty-degree threshold $\theta = 0.5$, for example, the assertions in Table 2 hold. According to Table 2, if the confidence of a proposition $p$ is uncertain, some feature can be chosen to remove this uncertainty.
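The certainty degree of equation (1) and the threshold θ = 0.5 can be sketched as follows; the exact labels assigned by Table 2 are an assumption:

```python
# Sketch of the certainty degree G = mu - rho over the lattice
# tau = [0,1] x [0,1], with the threshold theta = 0.5 from the text.
def certainty(mu, rho):
    """mu: belief (evidence for), rho: disbelief (evidence against)."""
    return mu - rho

THETA = 0.5

def confidence(mu, rho, theta=THETA):
    g = certainty(mu, rho)
    if g >= theta:
        return "authentic"
    if g <= -theta:
        return "not authentic"
    return "uncertain"   # some further feature must be consulted
```

For instance, a term annotated [0.9, 0] is classified as authentic, [0, 0.7] as not authentic, and [0.6, 0.4] remains uncertain.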
Some raw heuristics related to terms, based on relations between the confidence in the authenticity of an arbitrary word and the behaviours of some features noticed in collections, were used to qualify each term.
The behaviours of the features were converted from continuous to discrete for simplicity. That allowed the use of facts with annotations corresponding only to values in the subset {false, true} of the lattice $\tau$. The resulting heuristics are presented in Table 3. In this table, the opponents of a term $t$ are all the terms in the collection which are similar to $t$ according to the smooth term unification.
The thresholds for a decision between the elements of the set {false, true} (which actually denote {¬high, high}) are domain dependent. In the tests presented in this paper, dynamic thresholds $\lambda_k$ were obtained as indicated in (2) for each of the $k$ features evaluated.
In (2), $\bar{f}_k$ means the global average of the values of the feature $k$.³
In this approach, each term $t_i$ is evaluated according to each of the features $k$. Thereby, a vector $V = (v_1^1, \ldots, v_1^k, v_2^1, \ldots, v_2^k, \ldots, v_n^1, \ldots, v_n^k)$ is obtained, where $v_i^k$ is internally represented as a fact:
Thus far, every term is represented by a set of facts concerning the presence of the features observed in the collection, as defined in the previous subsection. In other words, there is more explicit information about the terms of the collection than before.
Rules based on domain knowledge are implemented as clauses named authentic, annotated with any element of the lattice $\tau$, and formed by conjunctions of queries concerning the information in Table 3, annotated only with elements of the subset {false, true} of the lattice $\tau$. The high number of possible combinations permits paraconsistent results. Therefore, ParaLog is used as a query module to retrieve the evidences related to the authenticity of a given term.
Finally, both evidences, belief and disbelief, are used to get the certainty degree of the authenticity of that term as shown in (1). Consequently, the confidence of a term can be obtained as in Table 2.
With knowledge of the authenticity of terms available, it is possible to create a new, corrected index of absolute frequency. Let $I$ be the index of the absolute frequency of all terms in the collection, where rows are terms and columns are documents. Each element $e_{i,j}$ contains the frequency of the term $t_i$ in the document $d_j$. The confidence of each term $t_i$ is evaluated: if the term $t_i$ is authentic, then its entry in the index $I$ is conserved; otherwise, each of its frequencies is equally divided among all its authentic opponents, increasing their frequencies, and its entry is removed from the index $I$.
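The correction step above can be sketched as follows; the helper names (`is_authentic`, `opponents`) are hypothetical, and the equal division of frequencies follows the description in the text:

```python
# Hedged sketch of the index-correction step: frequencies of a term judged
# not authentic are split equally among its authentic opponents, and its
# row is removed from the index.
def correct_index(index, is_authentic, opponents):
    """index: {term: {doc: freq}}; opponents(t): terms similar to t."""
    corrected = {t: dict(row) for t, row in index.items() if is_authentic(t)}
    for t, row in index.items():
        if is_authentic(t):
            continue
        auth_opps = [o for o in opponents(t) if o in corrected]
        if not auth_opps:
            continue                     # nothing to redistribute to
        for doc, freq in row.items():
            share = freq / len(auth_opps)
            for o in auth_opps:
                corrected[o][doc] = corrected[o].get(doc, 0) + share
    return corrected

index = {"approximation": {"d1": 4}, "aproximation": {"d1": 2}}
fixed = correct_index(index,
                      is_authentic=lambda t: t == "approximation",
                      opponents=lambda t: ["approximation"])
# the misspelling's frequency is absorbed by its authentic opponent
```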
³ Notice there is exactly one value of each feature for each term.
                 Real Class
                   +      -
Predicted   +    4165     25
Class       -      59     27
The results obtained with the automatic authentication of terms were compared with the results expected according to a human-made reference. Table 4 contrasts those results.
The high number of real positives mistakenly predicted as negatives is due to the high number of acronyms. Acronyms are difficult to identify without context. Thus, they are likely to be wrongly classified as not authentic if they have a low frequency.
Words misspelled many times may come to be considered authentic. On the other hand, if a term is misspelled and there are not enough occurrences of its correct form to allow the identification of the authentic form, it is almost impossible to figure out the right form if it does not appear in WordNet.
More than one mistake inside the interval delimited by the smooth-unification threshold forbids the identification of misspellings. To cope with that, a more robust method is needed.
Table 5 compares the accuracy discriminated by class. Although the noise is small, the classes are not balanced, so the total accuracy rate needs to take that information into account.
This section presents the architecture of the program which implements the issues discussed throughout the paper. The architecture is depicted in Figure 2, and the system is explained step by step throughout the section.
(Figure 2: architecture diagram, including a Word Authentication module.)
The collection utilized in the whole set of experiments was obtained from the DAML [8] ontologies repository at http://www.daml.org/ontologies. Although RDF [10, 5] has many tags representing properties, only two are interesting for this work: label, which conveys the name of a resource in natural language, and comment, which provides a description of a resource.
Once the collection had been downloaded, the first step was to generate indices. A list of stop words helped to select the words to include in the index. An absolute-frequency index was then generated.
Secondly, using the paraconsistent knowledge rules about the authenticity of words, like the examples provided in Appendix A, a new index was obtained according to the procedures described in Subsection 3.3, reducing the noise due to misspelled words. Example facts concerning the extra information automatically acquired about terms are provided in Appendix B.
Morphological WordNet [9, 7] operations were then used to reduce this new index to a normalized form, where only primitive words exist.
Finally, the last index generated so far is used to create the ultimate index: a tf-idf index. This is the index upon which the search will be carried out. Recall that this index is normalized (only primitive forms) and that noise due to misspelling errors is inhibited.
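The final step can be sketched with a standard tf-idf weighting; since the paper does not specify the exact variant, the common logarithmic idf is assumed here:

```python
# Sketch: building a tf-idf index from the corrected, normalized
# absolute-frequency index (standard log-idf variant, an assumption).
import math

def tf_idf(index):
    """index: {term: {doc: absolute frequency}} -> {term: {doc: weight}}."""
    docs = {d for row in index.values() for d in row}
    n = len(docs)
    out = {}
    for term, row in index.items():
        idf = math.log(n / len(row))       # rarer terms weigh more
        out[term] = {d: f * idf for d, f in row.items()}
    return out

index = {"ontology": {"d1": 3, "d2": 1}, "person": {"d1": 2}}
weights = tf_idf(index)
# "ontology" occurs in every document, so its idf (and weight) is zero.
```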
5 Conclusion
It seems fairly clear that knowledge is required in IR. The targets of this community are intimately related to cognitive processes because, in essence, they want to do something people use knowledge to do; memory is the basis for every cognitive process in human beings.
Disambiguation, of course, is the main point. That is where knowledge will surely surprise the crowd of researchers who persist in knowledge-free techniques. The argument is clear: people can only disambiguate because they use their knowledge structures to guide the process.
A very important point to discuss is the use of Data Mining techniques in the IR field. In Data Mining, there is usually little or no knowledge about the database; the aim is to mine the database in order to discover that knowledge. It seems odd to use Data Mining techniques for understanding in IR: the goal is not to discover knowledge; the goal is to understand texts. Experts have that knowledge. Thereby, expert knowledge should be used to understand text. In short, there is no need to look for knowledge, because it is already known.
Memory is the mine to be explored. Cognitive processes are the tools which must be pol-
ished. Natural language will only be perfectly tractable when computers have the knowledge
and competences human beings do.
6 Acknowledgements
We are thankful to Marcio Roberto Starke for his contribution in earlier studies of this work.
This research has been partially funded by CAPES, Brazil.
References
[1] B.C. Avila. Uma Abordagem Paraconsistente Baseada em Lógica Evidencial para Tratar Exceções em
Sistemas de Frames com Múltipla Herança. PhD thesis, Escola Politécnica da Universidade de São Paulo,
São Paulo, 1996.
[2] R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. ACM Press, New York, 1999.
[3] H.A. Blair and V.S. Subrahmanian. Paraconsistent logic programming. In Proc. 7th Conference on Foun-
dations of Software Technology and Theoretical Computer Science, Lecture Notes on Computer Science,
volume 287, pages 340-360. Springer-Verlag, 1987.
[4] H.A. Blair and V.S. Subrahmanian. Paraconsistent foundations of logic programming. Journal of Non-
Classical Logic, 5(2):45-73,1988.
[5] D. Brickley and R.V. Guha. Resource Description Framework (RDF) Schema Specification. W3C World
Wide Web Consortium, March 2000. W3C Candidate Recommendation (Work in progress).
[6] N.C.A. da Costa et al. Lógica Paraconsistente Aplicada. Atlas, São Paulo, 1999.
[7] G.A. Miller et al. Five Papers on WordNet. CSL Report 43, Cognitive Science Laboratory, Princeton
University, 1990.
[8] F. van Harmelen, P.F. Patel-Schneider, and I. Horrocks. Reference Description of the
DAML+OIL (March 2001) Ontology Markup Language. Available on the Internet at
http://www.daml.org/2001/03/reference.html on Feb 19th, 2002, March 2001. Work in
progress.
[9] C. Fellbaum, editor. WordNet: an electronic lexical database. The MIT Press, Cambridge, Massachusetts,
1998.
[10] O. Lassila and R.R. Swick. Resource Description Framework (RDF) Model and Syntax. W3C World
Wide Web Consortium, February 1999. W3C Recommendation.
[11] V.S. Subrahmanian. Towards a theory of evidential reasoning in logic programming. In Logic Colloquium
'87, The European Summer Meeting of the Association for Symbolic Logic, Granada, Spain, July 1987.
authentic(T):[0.9, 0] <--
    wordnet(T):[1, 0] &
    high_df(T):[1, 0].

authentic(T):[0, 0.7] <--
    high_tf_opponent(T):[1, 0] &
    high_freq_high_tf_opponents(T):[1, 0].
Some facts representing the information automatically acquired about terms of the collection
are supplied below as examples. There is a clause for each pair term-feature.
wordnet(ontology):[1, 0].
high_tf(person):[0, 1].
high_df(person):[1, 0].
1 Introduction
possibly different states of an underlying system. The automata with concurrency relations presented here are a more refined model, taking into account the states of the system, the transitions, and a collection of local independence relations.
An automaton with concurrency relations is a quadruple $A = (Q, \Sigma, T, \|)$. Here, $Q$ is the set of states or situations, $\Sigma$ is the set of actions, $T \subseteq Q \times \Sigma \times Q$ is the set of transitions, and $\| = (\|_q)_{q \in Q}$ is a collection of concurrency relations $\|_q$ for $\Sigma$, indexed by the possible states $q \in Q$. The relation $a \|_q b$ captures that the actions $a$ and $b$ can occur independently of each other in state or situation $q$. Observe that possibly $\|_p \neq \|_q$ if $p \neq q$. Hence such automata with concurrency relations generalize the asynchronous transition systems and trace automata which were investigated by various authors, e.g. [2, 65, 67, 60, 5, 72], and used to provide a semantics for CCS [56, 7], to model properties of computations in term rewriting systems and in the lambda calculus [1, 6, 3, 40, 52], and in dataflow networks [60]. Structures with varying independence relations have also been investigated in [42]. Note that trace alphabets arise simply by considering automata with concurrency relations $A$ with just a single state. In general, let $CS(A)$ comprise all finite computation sequences of $A$, with concatenation as (partially defined) monoid operation. Similarly as in trace theory, we declare two computation sequences of the form $(p, a, q)(q, b, r)$ and $(p, b, q')(q', a, r)$ equivalent if $a \|_p b$. This generates a congruence $\sim$ on $CS(A)$, and the quotient $M(A) := CS(A)/{\sim} \cup \{0\}$ (formally supplemented with $0$ to obtain an everywhere defined monoid operation) is called the concurrency monoid associated with $A$.
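For a small finite automaton, the congruence ~ can be generated by repeatedly swapping adjacent independent transitions. The following sketch (with an assumed deterministic transition function `delta` and the concurrency relations encoded as a symmetric set of triples) computes the ~-class of a finite computation sequence:

```python
# Illustrative sketch: generating the ~-class of a computation sequence by
# swapping adjacent steps (p,a,q)(q,b,r) ~ (p,b,q')(q',a,r) when a ||_p b.
def equivalence_class(seq, delta, conc):
    """seq: tuple of transitions (p, a, q); delta[(p, a)] = p.a;
    conc: set of triples (q, a, b) with a ||_q b (assumed symmetric)."""
    seen, todo = {seq}, [seq]
    while todo:
        u = todo.pop()
        for i in range(len(u) - 1):
            (p, a, q), (_, b, r) = u[i], u[i + 1]
            if (p, a, b) in conc:            # a ||_p b: swap the two steps
                q2 = delta[(p, b)]
                v = u[:i] + ((p, b, q2), (q2, a, r)) + u[i + 2:]
                if v not in seen:
                    seen.add(v)
                    todo.append(v)
    return seen

# Single-state automaton, i.e. a trace alphabet: a ||_0 b
delta = {(0, "a"): 0, (0, "b"): 0}
conc = {(0, "a", "b"), (0, "b", "a")}
cls = equivalence_class(((0, "a", 0), (0, "b", 0)), delta, conc)
# the class of "ab" contains exactly "ab" and "ba"
```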
Deterministic automata with concurrency relations were introduced in [19, 20], where their associated domains of computations were investigated. These arise as follows. The prefix order for computation sequences induces a preorder $\leq$ and an equivalence relation $\sim$ on the set $CS^\infty(A)$ of all (finite or infinite) computation sequences of $A$, and hence a poset $M^\infty(A) := (CS^\infty(A), \leq)/{\sim}$. Let $* \in Q$ be a fixed state; then the poset $(D(A), \leq)$ comprising all elements of $M^\infty(A)$ whose representing computation sequences start in $*$ is called the domain of computations associated with $A$. These can also be defined if $A$ is nondeterministic, and their order structure has been completely described in [19, 20, 44]. Under suitable assumptions on $A$, they are closely related with various classes of domains arising in the denotational semantics of programming languages, like L-domains and event domains. In particular, stably concurrent automata generate precisely the class of all dI-domains.
In place/transition nets, where places may carry several tokens and transitions may fire with multiplicities, the independence of two actions depends on the underlying marking, i.e. the existence of sufficiently many tokens (resources of the modelled system). Hence these nets provide natural structures whose dynamic behaviour can be modelled by automata with concurrency relations. In [29, 30], this relationship was described by an adjunction between a category of automata with concurrency relations and a category of Petri nets, similar to the results of [58, 72, 57].
In trace theory, an important tool used in many difficult applications is the representation of traces by labeled graphs, or even labeled partially ordered sets. Here we present such a representation for the computations in $M(A)$ [9]. This assumes that $A$ is stably concurrent, i.e. that the concurrency relations of $A$ depend locally on each other in the form of the cube and the inverse cube axiom. Recall that stably concurrent automata generate precisely the dI-domains; these distributivity properties are crucial in the proof. We also note that, somewhat surprisingly, they are closely related with Levi's lemma in trace theory. Very recently, we learned that similar properties occurred in the area of Gaussian and Garside monoids [14, 15, 62].
M. Droste and D. Kuske / Automata with Concurrency Relations
In the study of concurrent processes, labeled transition systems have been used frequently as a model for an operational semantics [39, 56, 72]. A labeled transition system may be defined to be a triple $T = (Q, \Sigma, T)$ where $Q$ is a set of states, $\Sigma$ is a
The angle at $p$ indicates that $a \|_p b$ holds. A computation sequence is either empty (denoted by $\epsilon$) or a finite or infinite sequence of transitions $u = t_1 t_2 \ldots t_n$ or $u = t_1 t_2 \ldots$ of the form $t_i = (q_{i-1}, a_i, q_i)$ for $i = 1, 2, \ldots, n$ ($i \in \mathbb{N}$, respectively). The state $q_0$ is called the domain of $u$ (denoted by $\mathrm{dom}\,u$), and $a_1 a_2 \ldots$ is the action sequence of $u$ (denoted $\mathrm{acseq}\,u$). If $u$ is finite, the state $q_n$ is the codomain of $u$ (denoted $\mathrm{cod}\,u$) and $n$ is the length of $u$ (denoted $|u|$). For infinite $u$, we write $|u| = \omega$. We let $CS(A)$ denote the set of all finite computation sequences, $CS^\omega(A)$ denotes the set of all infinite computation sequences, and $CS^\infty(A) = CS(A) \cup CS^\omega(A)$. Let $u \in CS(A)$ and $v \in CS^\infty(A)$. Then the composition $uv$ is defined in the natural way by concatenating $u$ and $v$ if $\mathrm{cod}\,u = \mathrm{dom}\,v$. Formally, we put $u\epsilon = u$ and $\epsilon v = v$. A finite computation sequence $u$ is a prefix of the computation sequence $w$ if there exists $v \in CS^\infty(A)$ such that $w = uv$.
156 M. Droste ami D. Kuske /' Automata with Concurrency Relations
for any transitions s, t, u, t', u' 6 T (here, we let e — s = {e} for any s G T).
We remark that a deterministic automaton with concurrency relations A satisfies
the cube axiom as just defined iff it satisfies the following implication:
a ||_q b, b ||_q c and a ||_{q.b} c ⟹ a ||_q c, b ||_{q.a} c and a ||_{q.c} b. (*)
(where q.b is the unique state satisfying (q, b, q.b) ∈ T; in the same spirit, we write q.w
for the unique state reached from q by executing w ∈ Σ*, if such a state exists.)
The proof of this is simple but tedious. It rests on the definition of the residuum
operation through the concurrency relations described above.
Suppose the assumptions of the implication above hold. Then, by the requirement 3
for a nondeterministic automaton with concurrency relations, the transitions marked
by solid lines in the following picture exist. The conclusion together with requirement 3
then implies the existence of the transitions depicted by dotted lines. This picture
suggests the name "cube axiom".
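The implication (*) can be tested mechanically on a finite deterministic automaton. The sketch below is ours (the dictionary/set representation and all names are illustrative; actions are single characters so that words can be written as strings):

```python
# Checking implication (*) on a finite deterministic automaton with
# concurrency relations. delta maps (state, action) to the state q.a;
# conc contains triples (q, a, b) meaning a ||_q b.

def succ(delta, q, w):
    """q.w: the state reached from q by the word w, or None if undefined."""
    for a in w:
        if (q, a) not in delta:
            return None
        q = delta[(q, a)]
    return q

def satisfies_star(states, actions, delta, conc):
    """Check implication (*) for all states q and all actions a, b, c."""
    for q in states:
        for a in actions:
            for b in actions:
                for c in actions:
                    qa, qb = succ(delta, q, a), succ(delta, q, b)
                    # premises: a ||_q b, b ||_q c and a ||_{q.b} c
                    if ((q, a, b) in conc and (q, b, c) in conc
                            and qb is not None and (qb, a, c) in conc):
                        qc = succ(delta, q, c)
                        # conclusion: a ||_q c, b ||_{q.a} c and a ||_{q.c} b
                        if not ((q, a, c) in conc
                                and qa is not None and (qa, b, c) in conc
                                and qc is not None and (qc, a, b) in conc):
                            return False
    return True

# A one-state automaton induced by a trace alphabet with pairwise
# independent letters a, b, c; it satisfies (*).
states, actions = {0}, {"a", "b", "c"}
delta = {(0, x): 0 for x in actions}
conc = {(0, x, y) for x in actions for y in actions if x != y}
```

For one-state automata with a symmetric independence relation the check always succeeds, reflecting the fact that trace alphabets satisfy the cube axiom.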
It turns out that for nondeterministic concurrent automata, we can extend the
residuum operation to one on CS(A) such that it respects equivalence, i.e. u, v, u′, v′ ∈
CS(A) and u ∼ u′, v ∼ v′ imply that u − v ∼ u′ − v′ (meaning that for each w ∈ u − v
there is w′ ∈ u′ − v′ with w ∼ w′ and vice versa), see [44, pp. 122-135]. This makes
algebraic calculations with the elements of CS(A) and M(A) possible, which is crucial
e.g. for the proof of Thm. 3.1 below (see also [14] for a similar, but more general,
calculus in semigroups).
Next we consider an axiom which for deterministic automata is precisely the converse
of condition (*).
a ||_q b, b ||_q c and a ||_{q.b} c ⟸ a ||_q c, b ||_{q.a} c and a ||_{q.c} b (*⁻¹)
for any a, b, c ∈ Σ and q ∈ Q. This converse implication, as well as the following picture
(where we make the same conventions as above on solid and dotted lines) may justify
the name "inverse" cube axiom.
The importance of this axiom will be clear from Thm. 3.4. A similar axiom has
been introduced by Panangaden, Shanbhogue and Stark in [60] for the more specific
situation of input/output automata.
A (non-)deterministic automaton with concurrency relations that satisfies the cube
and the inverse cube axiom is called a (non-)deterministic stably concurrent automaton.
Since any automaton induced by a trace alphabet has precisely one state, it is deter-
ministic and satisfies the cube and the inverse cube axiom. Hence, deterministic stably
concurrent automata still generalize trace alphabets.
For later purposes, we now extend the concurrency relations ||_q inductively to words
over Σ. Let A be a deterministic automaton with concurrency relations. Let a ∈ Σ and
v, w ∈ Σ⁺. Then av ||_q w iff a ||_q w and v ||_{q.a} w. Furthermore, a ||_q w iff w ||_q a.
For u, v ∈ CS(A) we say u and v commute (denoted u || v) iff dom u = dom v =: q and
(acseq u) ||_q (acseq v). One can show that then also v and u commute (cf. [21]).
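The inductive clauses just given translate directly into a recursive predicate. The following Python sketch is ours (actions are single characters, words are strings, and the base relation ||_q on letters is given as a set of triples):

```python
# Recursive extension of the concurrency relations ||_q to non-empty
# words: av ||_q w iff a ||_q w and v ||_{q.a} w, and a ||_q w iff
# w ||_q a. delta maps (state, action) to q.a; conc holds the base
# triples (q, a, b) meaning a ||_q b. Toy representation, names ours.

def conc_words(delta, conc, q, u, w):
    """Decide u ||_q w for non-empty words u, w."""
    if len(u) == 1 and len(w) == 1:
        return (q, u[0], w[0]) in conc          # base case on letters
    if len(u) > 1:
        a, v = u[0], u[1:]
        qa = delta.get((q, a))                   # the state q.a, if any
        return (conc_words(delta, conc, q, a, w)
                and qa is not None
                and conc_words(delta, conc, qa, v, w))
    return conc_words(delta, conc, q, w, u)      # symmetry for |u| = 1 < |w|

# One-state trace automaton with pairwise independent letters a, b, c:
delta = {(0, x): 0 for x in "abc"}
conc = {(0, x, y) for x in "abc" for y in "abc" if x != y}
```

For example, conc_words(delta, conc, 0, "ab", "c") holds because both a ||_0 c and b ||_{0.a} c hold.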
Nielsen, Rozenberg and Thiagarajan [58] established a close relationship between a
class of transition systems and elementary Petri nets. In [72], Winskel and Nielsen es-
tablished an adjunction between the category of Petri nets (essentially, condition/event
nets) and the category of asynchronous transition systems. For a particular subclass of
2. π(∗) = ∗′.
3. a ||_p b in A implies η(a) ||′_{π(p)} η(b) in A′.
Theorem 2.4 ([30]) There exist functors an : A → P and na : P → A that form an
adjunction where an is the left adjoint of na.
In [30], the authors describe a full subcategory A_0 of A and a functor na_0 : P → A_0
such that the restriction of an to A_0 and na_0 form a coreflection.
Moreover, in [28], similar results were established for nets with capacities and mark-
ings which could even take values in partially ordered abelian groups. In [29] a continu-
ous version of P/T-nets and of continuous transition systems with situation-dependent
concurrency is investigated.
3 Domains
For a PT-net N, usually the set of all initial firing sequences is considered as the
language of the net. Similarly, the set of initial computations D(A) can be regarded as
the "language" of a nondeterministic automaton with concurrency relations A. This
set carries a nontrivial partial order ≤ as defined in Sect. 2.
In this section, we summarize the results on (D(A), ≤) for nondeterministic au-
tomata with concurrency relations. Therefore, we first introduce some order-theoretic
notations.
Let (D, ≤) be a partially ordered set. Let x, y ∈ D. Then y covers x (denoted
x ⋖ y) if x < y and x < z ≤ y implies y = z. We put x↓ := {d ∈ D | d ≤ x}. A subset
A ⊆ D is directed if for any x, y ∈ A there is z ∈ A such that x, y ≤ z. (D, ≤) is a
complete partial order or cpo if it has a least element (usually denoted by ⊥) and any
directed subset of D has a supremum. An element x of a cpo (D, ≤) is compact if any
directed subset A of D with sup A ≥ x contains some a ∈ A with a ≥ x. The set D^0
comprises all compact elements of D. The cpo (D, ≤) is algebraic if for any x ∈ D, the
set X = x↓ ∩ D^0 is directed and sup X = x. A domain is an algebraic cpo with at most
countably many compact elements. A domain (D, ≤) is finitary if x↓ is finite for any
x ∈ D^0.
A partially ordered set (D, ≤) is a lattice if any finite subset has a supremum and
an infimum. It has finite height if any totally ordered subset of D is finite. A lattice
of finite height is semimodular if whenever x, y_1, y_2 ∈ D with x ⋖ y_1, y_2 and y_1 ≠ y_2,
then y_1 ⋖ y_1 ∨ y_2. (In domain theory, semimodularity is known as axiom (C).) It is
modular if moreover y_1 ∧ y_2 ⋖ y_1 for any y_1, y_2, z ∈ D with y_1, y_2 ⋖ z and y_1 ≠ y_2. A
lattice is distributive if x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z) for any x, y, z ∈ D. Recall that
any distributive lattice of finite height is modular and finite.
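The distributive law can be checked exhaustively on small finite lattices. The sketch below is our own illustration: it verifies the law for the divisor lattice of 12 (order = divisibility, join = lcm, meet = gcd), which is distributive, and refutes it for the diamond M3, the standard non-distributive (but modular) example.

```python
from math import gcd

# Exhaustive check of x v (y ^ z) = (x v y) ^ (x v z) on finite lattices,
# given as a carrier set plus explicit join and meet operations.

def lcm(x, y):
    return x * y // gcd(x, y)

def is_distributive(elements, join, meet):
    """Test the distributive law for all triples of elements."""
    return all(join(x, meet(y, z)) == meet(join(x, y), join(x, z))
               for x in elements for y in elements for z in elements)

# The divisor lattice of 12: join = lcm, meet = gcd. Distributive.
divisors_of_12 = [1, 2, 3, 4, 6, 12]

# The diamond M3: bottom "0", top "1", three incomparable atoms.
M3 = ["0", "a", "b", "c", "1"]

def join3(x, y):
    if x == y:
        return x
    if "0" in (x, y):                 # 0 is neutral for join
        return x if y == "0" else y
    return "1"                        # two distinct atoms join to the top

def meet3(x, y):
    if x == y:
        return x
    if "1" in (x, y):                 # 1 is neutral for meet
        return x if y == "1" else y
    return "0"                        # two distinct atoms meet in the bottom
```

In M3, a ∨ (b ∧ c) = a while (a ∨ b) ∧ (a ∨ c) = 1, so the law fails.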
Let A be a nondeterministic automaton with concurrency relations and initial state.
Then (D(A), ≤) is a domain. Its compact elements are the finite initial computations,
i.e. D(A)^0 = D(A) ∩ M(A). Hence, any compact element dominates compact elements
only. If any set of the form {q ∈ Q | (p, a, q) ∈ T} for p ∈ Q and a ∈ Σ is finite, (D(A), ≤)
is moreover finitary. Obviously, this is the case for deterministic automata. In [19,
20] and [44] complete characterizations of the domains (D(A), ≤) generated by (non-)
deterministic automata with concurrency relations and initial state are given. Here we
discuss the impact of the cube axiom, the inverse cube axiom and the determinism of
A on (D(A), ≤).
Theorem 3.2 ([67, 19, 9]) Let A be a deterministic concurrent automaton. Then
M(A) is a cancellative monoid.
For nondeterministic concurrent automata A, in general M(A) is just left-cancellative,
but not necessarily right-cancellative, as examples show. Right-cancellation holds when-
ever A is stably concurrent.
Now we turn to deterministic automata with concurrency relations. A domain
(D, ≤) is a Scott-domain if any two elements bounded above have a supremum. Let
(D, ≤) be a domain and x, y ∈ D^0. If x ⋖ y, we call [x, y] a prime interval. Let
>—< denote the smallest equivalence relation on the set of all prime intervals such
that [x, y] >—< [x′, y′] for all prime intervals [x, y] and [x′, y′] with x ⋖ x′, y ⋖ y′ and
y ≠ x′. A domain (D, ≤) satisfies axiom (R) if [x, y] >—< [x, z] implies y = z for any
x, y, z ∈ D. A finitary Scott-domain satisfies axiom (C) if the lattice (x↓, ≤) for any
compact element x is semimodular. A concurrency domain is a finitary Scott-domain
(D, ≤) satisfying axioms (R) and (C).
Theorem 3.3 ([19, 20]) Let (D, ≤) be a partially ordered set. Then (D, ≤) is a con-
currency domain iff there exists a deterministic concurrent automaton with initial state
A such that (D, ≤) ≅ (D(A), ≤).
It has been shown that the domain of configurations of an event structure is a
concurrency domain (cf. [71]). Hence for any event structure there exists a deterministic
concurrent automaton with initial state A that generates an isomorphic domain (but
not vice versa). In this sense, deterministic concurrent automata are more general than
event structures. An event structure that generates an infinite domain is always infinite.
On the other hand, an automaton with concurrency relations and initial state generates
an infinite domain as soon as it has a loop. Hence the representation of domains by
automata is much more compact than that by event structures.
In trace theory, Levi's Lemma plays a central role. In the theory of deterministic
concurrent automata, it can be reformulated as follows:
A deterministic concurrent automaton A satisfies Levi's Lemma if for any x, y, z, t ∈
M(A) with xy = zt ≠ 0 there exist uniquely determined elements r, [a], [b], [c], [d], u ∈
M(A) such that x = r[a], y = [b]u, z = r[c], t = [d]u, a || c, acseq a = acseq d and
acseq c = acseq b.
This definition can be depicted by the following diagram:
The following theorem establishes the close relation between Levi's Lemma, the
distributivity of (D(A), <) and, as a direct consequence of Thm. 3.1, the inverse cube
axiom. A finitary Scott-domain where any element dominates a distributive lattice is
a dl-domain.
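In the special case of a one-state automaton with empty concurrency relations, M(A) is a free monoid and the statement above reduces to the classical Levi lemma for words, which can be sketched directly (a toy version of ours, not the automaton-level statement):

```python
def levi_words(x, y, z, t):
    """Classical Levi lemma for words: if xy = zt, there is an overlap
    word w with either x = z.w and t = w.y (when |x| >= |z|), or
    z = x.w and y = w.t. Returns the applicable case and the overlap w."""
    assert x + y == z + t, "Levi's lemma needs xy = zt"
    if len(x) >= len(z):
        # z is a prefix of x; the overlap is the rest of x
        return ("x = z.w, t = w.y", x[len(z):])
    # x is a prefix of z; the overlap is the rest of z
    return ("z = x.w, y = w.t", z[len(x):])
```

In the automaton version, the single overlap word is replaced by the pair [a], [c] of commuting computations together with the common prefix r and suffix u.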
Theorem 3.4 ([44, 21]) Let A be a deterministic concurrent automaton with initial
state. Then the following are equivalent:
1. A is stably concurrent.
2. (D(A),<) is a dl-domain.
3. A satisfies Levi's Lemma.
Together with Thm. 3.3, this theorem implies that the class of all domains generated
by deterministic stably concurrent automata with initial states and the class of all dl-
domains coincide. Hence we obtain a new representation result for the well-studied
class of dl-domains (cf. [13]). This result generalizes e.g. [18, Thm. 11.3.11]. Note
that there are dl-domains that cannot be generated by a trace alphabet: in [5], a
universal dl-domain is constructed that arises from a trace alphabet. While there
exists a universal homogeneous dl-domain (which can, as any dl-domain, be generated
by a stably concurrent automaton), it cannot be generated by a trace alphabet [46].
Next we consider the class of all deterministic automata with concurrency relations
and initial state that generate up to isomorphism a given domain. This class may
contain finite and infinite automata. The question is if we can find minimal automata
in this class and if they are unique.
Let A_i = (Q_i, Σ, T_i, ||^i, ∗_i) for i = 1, 2 be deterministic automata with concurrency
relations and initial state; note that A_1 and A_2 are required here to have the same set
of actions Σ. A surjective function f : Q_1 → Q_2 is a reduction if
1. f(∗_1) = ∗_2.
2. ∀(p, a, q) ∈ T_1 : (f(p), a, f(q)) ∈ T_2.
3. ∀a, b ∈ Σ ∀q ∈ Q_1 : a ||^1_q b in A_1 ⟺ a ||^2_{f(q)} b in A_2.
4. ∀p ∈ Q_1 ∀(f(p), a, q′) ∈ T_2 ∃q ∈ Q_1 : (p, a, q) ∈ T_1.
Let (D, ≤) be a concurrency domain and S be a nice ideal in (D, ≤). Then there
exist deterministic concurrent automata with initial states A_1 ⊆ A_2 such that (D, ≤) ≅
Domains and traces have also been considered from a topological point of view.
Historically the first topology for traces was introduced by Kwiatkowska [51] who gen-
eralized the prefix metric of words. While the infinite traces are the metric completion
of the finite ones, the drawback of this topology is that the multiplication is not contin-
uous. This observation led Diekert to the introduction of complex and α-complex traces
[17]. In [50], we extend his investigation of α-complex traces to automata with concur-
rency relations. Our main result is the construction of a metric on finite computations
that makes the concatenation uniformly continuous. We also describe the elements of
the metric completion as pairs consisting of a (possibly infinite) computation and a
finite set of transitions of the underlying automaton.
4 Dependence orders
From now on, we consider only deterministic automata with concurrency relations. To
simplify notation, they are called automata with concurrency relations, as mentioned
before.
One of the foundational results of trace theory [54], used in many difficult appli-
cations, is that each element of the (algebraically defined) trace monoid has a graph-
theoretical representation. Moreover, infinite traces are usually defined in terms of
dependence graphs, which is in contrast (but equivalent) to our approach to define
them as equivalence classes of infinite computation sequences. We will show that these
representation results can be generalized to monoids associated with stably concurrent
automata. Therefore, for all of this section, let A be a fixed stably concurrent
automaton.
We first define a labeled partial order DO(u) for any u ∈ CS^∞(A). Let u ∈ CS^∞(A) \
{ε}. Analogously to trace theory, we define the dependence order on those events that
appear in u. This order should reflect when an earlier event has to appear before a later
one, i.e. the earlier event is a necessary condition for the later one. Since an action a
can appear several times in u, we have to distinguish between the first, the second …
appearance of a. For a finite or infinite word w over Σ and a ∈ Σ let |w|_a denote the
number of a's in w. Then define |u|_a := |acseq u|_a. We abbreviate a^i = (a, i) for a ∈ Σ
and i ∈ ℕ. The precise definition of the dependence order of u can now be given as
follows. Let O_a^u = {a^i | i ∈ ℕ, 1 ≤ i ≤ |u|_a} for a ∈ Σ and O(u) = ⋃{O_a^u | a ∈ Σ}.
Then, obviously, |O(u)| = |u|. For a^i, b^j ∈ O(u) let a^i ⊑_u b^j iff the i-th appearance of
a in u precedes the j-th appearance of b, i.e., formally, there are words x, y ∈ Σ* and
a possibly infinite word z over Σ with acseq u = xaybz, |x|_a = i − 1 and |xay|_b = j − 1.
Then ⊑_u is a linear order on O(u). Since for equivalent computation sequences u and
v we always have O(u) = O(v), a partial order ⪯_u on O(u) can be defined by
⪯_u := ⋂ {⊑_v | v ∼ u}.
Hence, a^i ⪯_u b^j if and only if the i-th a precedes the j-th b in any computation sequence
equivalent with u. Now DO(u) = (O(u), ⪯_u, (O_a^u)_{a∈Σ}, dom u) is a relational structure
with one constant from Q. We call DO(u) the dependence order associated with u.
Since u ∼ v implies DO(u) = DO(v), the dependence order DO(u) can be considered
as the dependence order of the computation [u] ∈ M^∞(A) \ {0, 1}. To include 0 and
1, formally we put dom 0 = ⊥ and dom 1 = ⊤ where ⊥ and ⊤ are additional symbols
not belonging to Q, and, using this, define DO(0) and DO(1) similarly as before (with
O(0) = O(1) = ∅).
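The construction of O(u), ⊑_u and ⪯_u can be sketched for finite action sequences as follows. This is a toy Python version of ours: the equivalence class of u is supplied explicitly as a list of equivalent action sequences rather than being computed from A.

```python
# Events O(u), the linear order on them induced by one action sequence,
# and the dependence order as the intersection over all equivalent
# sequences. Words are strings; an event is a pair (a, i) for the i-th
# occurrence of the action a.

def events(word):
    """O(u): the i-th occurrence of a becomes the event (a, i), i >= 1."""
    seen, evs = {}, []
    for a in word:
        seen[a] = seen.get(a, 0) + 1
        evs.append((a, seen[a]))
    return evs

def linear_order(word):
    """The strict linear order on events: (e, f) iff e occurs before f."""
    evs = events(word)
    return {(e, f) for i, e in enumerate(evs) for f in evs[i + 1:]}

def dependence_order(equivalent_words):
    """Strict part of the dependence order: pairs ordered the same way
    in every equivalent action sequence."""
    return set.intersection(*[linear_order(w) for w in equivalent_words])
```

For instance, if a and b are independent in the initial state, the class of "abc" contains "bac", and only the pairs below c survive the intersection.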
A second labeled partial order PR(u) can be extracted from the distributive lattice
([u]↓, ≤). A finite computation x ∈ M(A) is a complete prime if there exists precisely
one finite computation y and a transition t such that x = y · [t]. Let Pr(u) comprise
all complete primes x ≤ [u]. We only note that these are precisely the join-irreducible
elements of the lattice ([u]↓, ≤). By a fundamental result in lattice theory [4], they
completely determine the structure of this lattice since it is distributive. For a ∈ Σ let
P_a^u comprise all elements x of Pr(u) such that x = y · [t] for some transition t = (p, a, q)
and some computation y. Furthermore, define PR(u) = (Pr(u), ≤, (P_a^u)_{a∈Σ}, dom u). A
foundational result on dependence orders is that, although defined quite differently,
the labeled partial orders DO(u) and PR(u) are isomorphic for u ∈ CS^∞(A). This
isomorphism is also used to prove the following result on order-preserving enumerations
of DO(u): a (possibly infinite) sequence λ = (x_i)_{i<n} for n ∈ ℕ ∪ {ω} is an order-
preserving enumeration of DO(u) if it is an enumeration of O(u) and x_i ⪯_u x_j implies
i ≤ j. Then a computation sequence v is the linearisation of DO(u) induced by λ
if dom u = dom v and acseq v = a_1 a_2 … a_{n−1} (acseq v = a_1 a_2 …, respectively) with
x_i ∈ O_{a_i}^u for i < n. We call v a linearisation of DO(u) if it is the linearisation
induced by some order-preserving enumeration. Since A is deterministic, any order-
preserving enumeration induces at most one linearisation. Let Lin DO(u) comprise all
linearisations of DO(u). Then it is easy to see that [u] ⊆ Lin DO(u) for any u ∈ CS^∞(A).
(Actually, this holds even for arbitrary automata with concurrency relations A.) If A
is stably concurrent, we even have
Theorem 4.1 ([9, 26]) Let A be a stably concurrent automaton and let u, v ∈ CS^∞(A).
Then DO(u) ≅ PR(u) and the following are equivalent.
1. u ∼ v.
2. DO(u) = DO(v).
3. v ∈ Lin DO(u).
Furthermore, any order-preserving enumeration of D0(u) induces a linearisation.
Hence, in this case [u] = Lin DO(u). The importance of such a description of
linearisations of the partial order for concurrent programs is discussed, e.g. in [42, 43].
This result enables us to represent computation sequences by their dependence orders.
In [9], all those labeled partial orders are characterized that are isomorphic to the
dependence order DO(u) of a finite computation sequence u. Moreover, a multiplication
on the set of (isomorphism classes of) finite dependence orders is defined and shown
to yield a monoid isomorphic to the monoid M(A). This generalizes classical results of
Mazurkiewicz ([54, 16]).
Since dependence orders are relational structures, we can define logical languages
to reason about these dependence orders. The corresponding first-order language FO has
variables x, y, … for elements of O(u). The atomic formulas are x ≤ y, Σ_a(x) for a ∈ Σ,
and constants c_q for q ∈ Q ∪ {⊥, ⊤}. Then formulas are built up from atomic formulas
by the connectives ¬ and ∨ and the quantifier ∃. In the monadic second order language
MSO, also set variables X, Y, …, quantification over them and atomic formulas X(x) are
admitted. A sentence of MSO is a formula without free variables. The satisfaction
relation DO(u) ⊨ φ between dependence orders and sentences is defined as usual:
x ≤ y is satisfied iff x ⪯_u y, Σ_a(x) iff x ∈ O_a^u, c_q iff dom(u) = q and X(x) iff x ∈ X.
Now let φ be a sentence of MSO. Then L^∞(φ) denotes the set of all [u] ∈ M^∞(A)
such that DO(u) ⊨ φ. Furthermore, L(φ) = L^∞(φ) ∩ M(A). Since u ∼ v implies
DO(u) = DO(v), the set L^∞(φ) is well defined. The logical languages FO and MSO
will be used in the following sections.
Now the relationship between co-rational and recognizable languages in M(A) can
be described as follows:
Theorem 5.1 ([21, 27, 48, 47]) Let A be a finite automaton with concurrency rela-
tions.
1. If A forwardly preserves concurrency, then any recognizable language in M(A) is
co-rational.
Hence, M^∞(A) is the natural infinitary extension of M(A) together with an operation
· : M(A) × M^∞(A) → M^∞(A).
As usual, if U ⊆ M(A) and V ⊆ M^∞(A), we put U · V = {u · v | u ∈ U, v ∈ V} and
let U^ω be the set of all infinite products x_1 · x_2 · x_3 ⋯ with x_i ∈ U for i ∈ ℕ.
A language L ⊆ M^∞(A) is recognizable if there exists a finite monoid (S, ·) and a
homomorphism η : M(A) → S such that for any sequence (x_i)_{i∈ℕ} of finite (possibly empty)
computations with x_1 · x_2 · x_3 ⋯ ∈ L we have η⁻¹η(x_1) · η⁻¹η(x_2) · η⁻¹η(x_3) ⋯ ⊆ L. It
can be checked that a language L ⊆ M(A) is recognizable in M(A) iff it is recognizable
in M^∞(A). The recognizable languages in M^∞(A) are completely described by the
following result:
Theorem 5.4 ([26]) Let A be a finite stably concurrent automaton and L ⊆ M^∞(A).
Then the following are equivalent.
1. L is recognizable.
2. L is a Boolean combination of languages U · V^ω where U, V ⊆ M(A) are recogniz-
able and V · V ⊆ V.
3. There exists a sentence φ ∈ MSO such that L = L^∞(φ).
The equivalence of 1 and 3 in this theorem generalizes the result of Ebinger and
Muscholl mentioned above. It also contains Thm. 5.2 (which, however, is used for its
proof). The result of Gastin, Petit and Zielonka does not follow in its whole strength
from the equivalence of 1 and 2, but at least we obtained a Kleene-type characterization
of recognizable languages in M^∞(A), since the recognizable languages of M(A) are
characterized in this way by Thm. 5.1. In this sense Thm. 5.4 extends the result of [34].
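The recognizability condition used above can be illustrated in the simplest case of a one-state automaton, where M(A) is a free monoid of words. The following Python sketch is our own toy example, not one from the paper: the homomorphism η counts a's modulo 2 and recognizes the language of words with an even number of a's.

```python
# Recognizability via a homomorphism into a finite monoid, in the free
# monoid of words over {a, b}. eta maps a word to the number of a's
# modulo 2, an element of the two-element monoid Z/2; the recognized
# language is L = eta^{-1}(0).

def eta(word):
    """A monoid homomorphism: eta(uv) = (eta(u) + eta(v)) mod 2."""
    return word.count("a") % 2

def in_L(word):
    """Membership in L depends only on the image under eta,
    as recognizability requires."""
    return eta(word) == 0
```

Because membership depends only on η-images, replacing any factor of a word by another word with the same image never changes membership, which is exactly the condition η⁻¹η(x_1) · η⁻¹η(x_2) ⋯ ⊆ L in the finite case.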
Theorem 6.1 ([22]) Let A be a finite stably concurrent automaton without commuting
counters. Then L ⊆ M(A) is aperiodic iff it is star-free.
Observe that an automaton induced by a trace alphabet has no commuting counters
(since q.w = q for any w ∈ Σ*). Hence this result generalizes the result of Guaiana,
Restivo and Salemi.
We now turn to the definability of aperiodic languages by logical means similar
to Thm. 5.2. McNaughton and Papert [55] showed that the aperiodic and the first-
order definable languages in a finitely generated free monoid coincide. This result has
been extended to trace monoids by Thomas [69] and by Ebinger and Muscholl [32]. For
finite stably concurrent automata A, the classes of aperiodic and of first-order definable
languages in M(A) are incomparable (see [24] for an example). Therefore, we again need
additional assumptions on A.
A stably concurrent automaton A is counter-free if q.w^n = q implies q.w = q for
any q ∈ Q and w ∈ Σ*. A has no commuting loops if a ||_q w implies q.w ≠ q for any
a ∈ Σ, w ∈ Σ⁺ and q ∈ Q. It is an automaton with global independence if whenever
a ||_p b and q.ab is defined then a ||_q b for any a, b ∈ Σ and p, q ∈ Q.
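Counter-freeness can be tested effectively for a finite deterministic automaton by inspecting its transition monoid: the condition q.w^n = q ⟹ q.w = q says that no transformation induced by a word may move a state around a cycle of length greater than one. The following Python sketch is our own representation (states are assumed to be numbered 0, …, |Q|−1):

```python
# Counter-freeness check via the transition monoid of a finite
# deterministic automaton. A transformation is a tuple t with
# t[q] = image of state q.

def transition_monoid(states, actions, delta):
    """All state transformations induced by non-empty words, obtained by
    closing the letter transformations under composition."""
    gens = {tuple(delta[(q, a)] for q in states) for a in actions}
    monoid, frontier = set(gens), set(gens)
    while frontier:
        new = {tuple(t[s[q]] for q in states)     # s (a word) followed by t
               for s in frontier for t in gens} - monoid
        monoid |= new
        frontier = new
    return monoid

def is_counter_free(states, monoid):
    """q.w^n = q must imply q.w = q: no transformation may move a state
    around a cycle of length > 1."""
    for t in monoid:
        for q in states:
            orbit, p = [q], t[q]
            while p not in orbit:                 # follow q, t(q), t^2(q), ...
                orbit.append(p)
                p = t[p]
            if p == q and t[q] != q:              # cycle through q of length > 1
                return False
    return True

# Two toy one-letter automata: "flip" swaps its two states (a counter),
# "sink" sends everything to state 1 (counter-free).
flip = {(0, "a"): 1, (1, "a"): 0}
sink = {(0, "a"): 1, (1, "a"): 1}
```

The flip automaton fails the test because 0.aa = 0 while 0.a = 1; the sink automaton passes it.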
Again, observe that the automata induced by trace alphabets satisfy all these con-
ditions. Hence the following results generalize corresponding results on trace monoids.
Theorem 6.2 ([24]) 1. Let A be a finite counter-free stably concurrent automaton.
Let L ⊆ M(A) be aperiodic. Then there exists a sentence φ ∈ FO such that
L = L(φ).
2. Let A be a finite stably concurrent automaton without commuting loops or with
global independence. Let φ be a sentence of FO. Then L(φ) is aperiodic.
Several temporal logics have been considered for traces. The first expressively com-
plete one was Ebinger's TLPO [31] (see [25] for its generalization to stably concurrent
automata). This logic is expressively complete for finite computations and uses both
past and future modalities. Thiagarajan and Walukiewicz defined the temporal logic
LTL for traces [68]; it is expressively complete for finite and infinite computations and
it is a pure future logic. Unfortunately, its satisfiability problem is nonelementary [70].
The logic LTL can also be extended to stably concurrent automata along the lines
of [68]. Since concurrent automata generalize traces, this nonelementary lower bound
holds in this case as well - but we can show that under suitable restrictions on the
automaton, the problem becomes elementary:
Theorem 6.3 Let A be a finite stably concurrent automaton and n e N such that for
any q ∈ Q and u, v ∈ Σ⁺ we have
u ||_q v, |u| > n ⟹ |v| < n.
Then the satisfiability problem "L(φ) = ∅?" is solvable in EXPTIME for formulas of
the temporal logic LTL.
References
[1] P. Bachmann and Phan Minh Dung. Nondeterministic computations - structure and
axioms. Elektron. Inform.-verarb. Kybern. EIK, 22:243-261, 1986.
[2] M.A. Bednarczyk. Categories of asynchronous systems. PhD thesis, University of Sussex,
1987.
[3] G. Berry and J.-J. Levy. Minimal and optimal computations of recursive programs. J.
ACM, 26:148-175, 1979.
[4] G. Birkhoff. Lattice Theory. Colloquium Publications vol. 25. American Mathematical
Society, Providence, 1940. Page numbers refer to the third edition, seventh printing from
1993.
[5] P. Boldi, F. Cardone, and N. Sabadini. Concurrent automata, prime event structures and
universal domains. In M. Droste and Y. Gurevich, editors, Semantics of Programming
Languages and Model Theory, pages 89-108. Gordon and Breach Science Publ., OPA
Amsterdam, 1993.
[6] G. Boudol. Computational semantics of term rewriting. In M. Nivat and J.C. Reynolds,
editors, Algebraic Methods in Semantics, pages 169-236. Cambridge University Press,
1985.
[7] G. Boudol and I. Castellani. A non-interleaving semantics for CCS based on proved
transitions. Fundam. Inform., 11:433-452, 1988.
[8] F. Bracho and M. Droste. Labelled domains and automata with concurrency. Theoretical
Comp. Science, 135:289-318, 1994.
[9] F. Bracho, M. Droste, and D. Kuske. Representation of computations in concurrent
automata by dependence orders. Theoretical Comp. Science, 174:67-96, 1997.
[10] J.R. Büchi. On a decision method in restricted second order arithmetic. In E. Nagel
et al., editors, Proc. Intern. Congress on Logic, Methodology and Philosophy of Science,
pages 1-11. Stanford University Press, Stanford, 1960.
[11] J.R. Büchi. Weak second-order arithmetic and finite automata. Z. Math. Logik Grund-
lagen Math., 6:66-92, 1960.
[12] P. Cartier and D. Foata. Problèmes combinatoires de commutation et réarrangements.
Lecture Notes in Mathematics vol. 85. Springer, Berlin - Heidelberg - New York, 1969.
[13] P.L. Curien. Categorical Combinators, Sequential Algorithms and Functional Program-
ming. Progress in Theor. Comp. Science. Birkhauser Boston, 1993.
[14] P. Dehornoy. On completeness of word reversing. Discrete Mathematics (special issue
on Formal power series and algebraic combinatorics, Toronto, 1998), 225:93-119, 2000.
[15] P. Dehornoy. Complete positive group presentations. Technical Report 2001-13, Univer-
sity of Caen, 2001.
[16] V. Diekert. Combinatorics on Traces. Lecture Notes in Comp. Science vol. 454. Springer,
1990.
[17] V. Diekert. On the concatenation of infinite traces. Theoretical Comp. Science, 113:35-54,
1993.
[18] V. Diekert and G. Rozenberg. The Book of Traces. World Scientific Publ. Co., 1995.
[19] M. Droste. Concurrency, automata and domains. In 17th ICALP, Lecture Notes in
Comp. Science vol. 443, pages 195-208. Springer, 1990.
[20] M. Droste. Concurrent automata and domains. Intern. J. of Found. of Comp. Science,
3:389-418, 1992.
[21] M. Droste. Recognizable languages in concurrency monoids. Theoretical Comp. Science,
150:77-109, 1995.
[22] M. Droste. Aperiodic languages in concurrency monoids. Information and Computation,
126:105-113, 1996.
[23] M. Droste, P. Gastin, and D. Kuske. Asynchronous cellular automata for pomsets.
Theoretical Comp. Science, 247:1-38, 2000. (Fundamental study).
[24] M. Droste and D. Kuske. Logical definability of recognizable and aperiodic languages in
concurrency monoids. In Computer Science Logic, Lecture Notes in Comp. Science vol.
1092, pages 467-478. Springer, 1996.
[25] M. Droste and D. Kuske. Temporal logic for computations of concurrent automata.
Technical Report MATH-AL-3-1996, Inst. für Algebra, TU Dresden, 1996.
[26] M. Droste and D. Kuske. Recognizable and logically definable languages of infinite
computations in concurrent automata. International Journal of Foundations of Computer
Science, 9:295-313, 1998.
[52] J.-J. Levy. Optimal reductions in the lambda calculus. In J.P. Seldin and J.R. Hindley,
editors, To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism,
pages 159-191. Academic Press, New York, 1980.
[53] A. Mazurkiewicz. Concurrent program schemes and their interpretation. Technical re-
port, DAIMI Report PB-78, Aarhus University, 1977.
[54] A. Mazurkiewicz. Trace theory. In W. Brauer et al., editors, Petri Nets, Applications
and Relationship to other Models of Concurrency, Lecture Notes in Comp. Science vol.
255, pages 279-324. Springer, 1987.
[55] R. McNaughton and S. Papert. Counter-Free Automata. MIT Press, Cambridge, USA,
1971.
[56] R. Milner. A Calculus of Communicating Systems. Lecture Notes in Comp. Science vol.
92. Springer, 1980.
[57] M. Mukund. Petri nets and step transition systems. International Journal of Foundations
of Computer Science, 3:443-478, 1992.
[58] M. Nielsen, G. Rozenberg, and P.S. Thiagarajan. Elementary transition systems. Theo-
retical Comp. Science, 96:3-33, 1992.
[59] E. Ochmański. Regular behaviour of concurrent systems. Bull. Europ. Assoc. for Theor.
Comp. Science, 27:56-67, 1985.
[60] P. Panangaden, V. Shanbhogue, and E.W. Stark. Stability and sequentiality in dataflow
networks. In 17th ICALP, Lecture Notes in Comp. Science vol. 443, pages 308-321.
Springer, 1990.
[61] P. Panangaden and E.W. Stark. Computations, residuals and the power of indeterminacy.
In Automata, Languages and Programming, Lecture Notes in Comp. Science vol. 317,
pages 439-454. Springer, 1988.
[62] M. Picantin. The center of thin gaussian groups. J. of Algebra, 245:92-122, 2001.
[63] W. Reisig. Petri Nets. Springer, 1985.
[64] M.P. Schützenberger. On finite monoids having only trivial subgroups. Inf. Control,
8:190-194, 1965.
[65] M.W. Shields. Concurrent machines. Comp. J., 28:449-465, 1985.
[66] E.W. Stark. Concurrent transition systems. Theoretical Comp. Science, 64:221-269,
1989.
[67] E.W. Stark. Connections between a concrete and an abstract model of concurrent sys-
tems. In 5th Conf. on the Mathematical Foundations of Programming Semantics, Lecture
Notes in Comp. Science vol. 389, pages 53-79. Springer, 1989.
[68] P.S. Thiagarajan and I. Walukiewicz. An expressively complete linear time temporal logic
for Mazurkiewicz traces. In LICS'97, pages 183-194. IEEE Computer Society Press, 1997.
(full version to appear in Information and Computation).
[69] W. Thomas. On logical definability of trace languages. In V. Diekert, editor, Proceed-
ings of a workshop of the ESPRIT BRA No 3166: Algebraic and Syntactic Methods in
Computer Science (ASMICS) 1989, Report TUM-I9002, Technical University of Munich,
pages 172-182, 1990.
[70] I. Walukiewicz. Difficult configurations - on the complexity of LTrL. In ICALP'98,
Lecture Notes in Comp. Science vol. 1443, pages 140-151. Springer, 1998.
[71] G. Winskel. Event structures. In W. Brauer, W. Reisig, and G. Rozenberg, editors, Petri
nets: Applications and Relationships to Other Models of Concurrency, Lecture Notes in
Comp. Science vol. 255, pages 325-392. Springer, 1987.
[72] G. Winskel and M. Nielsen. Models for concurrency. In S. Abramsky, D.M. Gabbay, and
T.S.E. Maibaum, editors, Handbook of Logic in Computer Science vol. 4, pages 1-148.
Oxford University Press, 1994.
[73] W. Zielonka. Notes on finite asynchronous automata. R.A.I.R.O. - Informatique
Theorique et Applications, 21:99-135, 1987.
Advances in Logic, Artificial Intelligence and Robotics 173
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
1 Introduction
Supervised learning is the process of automatically creating a classification model from
a set of examples, called the training set, which belong to a set of classes. Once a
model is created, it can be used to automatically predict the class of other unclassified
examples.
In other words, in supervised learning, a set of n training examples is given to an
inducer. Each example X is an element of the set F_1 × F_2 × … × F_m where F_j is
the domain of the j-th feature. Training examples are tuples (X, Y) where Y is the
label, output or class. The Y values are typically drawn from a discrete set of classes
{1, …, K} in the case of classification. Given a set of training examples, the learning
algorithm (inducer) outputs a classifier such that, given a new example, it accurately
predicts the label Y.
In this work, we restrict our discussion to concept-learning¹, so Y can assume one
of two mutually exclusive values. We use the general labels positive and negative to
discriminate between the two class values.
For a number of application domains, a huge disproportion in the number of cases
belonging to each class is common. For instance, in detection of fraud in telephone
calls [7] and credit card transactions [15], the number of legitimate transactions is
¹However, some of the methods discussed here can be applied to multi-class problems.
174 M.C. Monard and G.E.A.P.A. Batista / Skewed Class Distributions
much higher than the number of fraudulent transactions. In insurance risk modelling
[12], only a small percentage of the policyholders file one or more claims in any given
time period. Also, in direct marketing [11], it is common to have a small response rate
(about 1%) for most marketing campaigns. Other examples of domains with intrinsic
imbalance can be found in the literature. Thus, learning with skewed class distributions
is an important issue in supervised learning.
Many traditional learning systems are not prepared to induce a classifier that ac-
curately classifies the minority class in such situations. Frequently, the classifier has
good classification accuracy for the majority class, but its accuracy for the minor-
ity class is unacceptable. The problem arises when the misclassification cost for the
minority class is much higher than the misclassification cost for the majority class.
Unfortunately, that is the norm for most applications with imbalanced data sets, since
these applications aim to profile a small set of valuable entities that are spread through
a large group of "uninteresting" entities.
In this work we discuss some of the most used methods that aim to solve the problem
of learning with imbalanced data sets. These methods can be divided into three groups:
1. Assign misclassification costs. In general, misclassifying examples of the
minority class is more costly than misclassifying examples of the majority class. The
use of cost-sensitive learning systems might help solve the problem of learning
from imbalanced data sets;
2. Under-sampling. One very direct way to solve the problem of learning from im-
balanced data sets is to artificially balance the class distributions. Under-sampling
aims to balance a data set by eliminating examples of the majority class;
3. Over-sampling. This method is similar to under-sampling, but it aims to achieve
a more balanced class distribution by replicating examples of the minority class.
This work is organised as follows: Section 2 discusses why accuracy and error rate
are inadequate metrics for measuring the performance of learning systems when data have
asymmetric misclassification costs and/or class imbalance; Section 3 explains the rela-
tionship between imbalanced class distributions and cost-sensitive learning; Section 4
discusses which class distributions are best for learning; Section 5 surveys some meth-
ods proposed by the Machine Learning community to balance the class distributions;
Section 6 presents a brief discussion of some evidence that balancing the class dis-
tributions has little effect on the final classifier; finally, Section 7 presents the
conclusions of this work.
The error rate (E) and the accuracy (Acc) are widely used metrics for measuring the
performance of learning systems. However, when the prior probabilities of the classes
are very different, such metrics might be misleading. For instance, it is straightforward
to create a classifier having 99% accuracy (or 1% error rate) if the data set has a
majority class with 99% of the total number of cases, by simply labelling every new
case as belonging to the majority class.
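This degenerate case is easy to reproduce (a hypothetical illustration with made-up data, not an experiment from the paper):

```python
# A data set with a 99:1 class ratio.
labels = ["neg"] * 990 + ["pos"] * 10  # 99% majority class

# A trivial "classifier" that labels every case as the majority class.
predictions = ["neg"] * len(labels)

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)
print(accuracy)  # 0.99, yet every positive case is missed
```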
Another argument against the use of accuracy (or error rate) is that these metrics
consider different classification errors to be equally important. For instance, a sick
patient diagnosed as healthy might be a fatal error, while a healthy patient diagnosed
as sick is considered a much less serious error, since this mistake can be corrected in
future exams. In domains where misclassification cost is relevant, a cost matrix can be used.
A cost matrix defines the misclassification cost, i.e., a penalty for making a mistake
for each different type of error. In this case, the goal of the classifier is to minimize
classification cost instead of error rate. Section 3 discusses more about the relationship
between cost-sensitive learning and imbalanced data sets.
It would be more interesting to use a performance metric that disassociates the
errors (or hits) occurring in each class. From Table 1 it is possible to derive four
performance metrics that directly measure the classification performance on the positive
and negative classes independently: the true positive rate (TP), the false positive rate
(FP), the true negative rate (TN) and the false negative rate (FN).
These four class performance measures have the advantage of being independent of
class costs and prior probabilities. It is obvious that the main objective of a classifier
is to minimize the false positive and negative rates or, similarly, to maximize the true
negative and positive rates. Unfortunately, for most "real world" applications, there is
a tradeoff between FN and FP and, similarly, between TN and TP. The ROC² graphs
[13] can be used to analyse the relationship between FN and FP (or TN and TP) for
a classifier.
² ROC is an acronym for Receiver Operating Characteristic, a term used in signal detection to
characterize the tradeoff between hit rate and false alarm rate over a noisy channel.
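For concreteness, the four per-class rates can be computed directly from the counts of a 2×2 confusion matrix. This is our own sketch; the variable names are illustrative assumptions.

```python
def rates(tp, fp, fn, tn):
    """Per-class performance rates derived from a 2x2 confusion matrix.

    tp, fp, fn, tn are raw counts of true/false positives/negatives.
    """
    return {
        "tp_rate": tp / (tp + fn),  # fraction of positives correctly classified
        "fn_rate": fn / (tp + fn),  # fraction of positives missed
        "tn_rate": tn / (tn + fp),  # fraction of negatives correctly classified
        "fp_rate": fp / (tn + fp),  # fraction of negatives misclassified
    }
```

Note that these rates are computed within each class, so they do not change when the class proportions in the data set change, which is exactly the independence from prior probabilities claimed above.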
Consider that the minority class, whose performance will be analysed, is the positive
class. On a ROC graph, TP (1 - FN) is plotted on the Y axis and FP is plotted on the
X axis. Some classifiers have a parameter for which different settings produce different
ROC points. For instance, a classifier that produces probabilities of an example being
in each class, such as the Naive Bayes classifier, can have a threshold parameter biasing
the final class selection³. Plotting all the ROC points that can be produced by varying
these parameters produces a ROC curve for the classifier. Typically this is a discrete
set of points, including (0,0) and (1,1), which are connected by line segments. Figure 1
illustrates a ROC graph of 3 classifiers: A, B and C. Several points on a ROC graph
should be noted. The lower left point (0,0) represents a strategy that classifies every
example as belonging to the negative class. The upper right point (1,1) represents a
strategy that classifies every example as belonging to the positive class. The point (0,1)
represents perfect classification, and the line x = y represents the strategy of randomly
guessing the class.
From a ROC graph it is possible to calculate an overall measure of quality, the area
under the ROC curve (AUC). The AUC is the fraction of the total area that falls under
the ROC curve. This measure is equivalent to several other statistical measures for
evaluating classification and ranking models [8]. The AUC effectively factors in the
performance of the classifier over all costs and distributions.
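Since the curve is a discrete set of (FP, TP) points joined by line segments, the AUC reduces to the trapezoidal rule. A minimal sketch (our own illustration, not code from the paper):

```python
def auc(points):
    """Area under a ROC curve given (FP rate, TP rate) points.

    The curve is the piecewise-linear interpolation of the points, which
    should include (0, 0) and (1, 1); the area is computed by the
    trapezoidal rule.
    """
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

A random-guessing classifier (the line x = y) gets AUC 0.5, while a perfect classifier passing through (0, 1) gets AUC 1.0.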
will not always yield optimal results, will generally lead to results which are no worse
than, and often superior to, those which use the natural class distributions.
One of the most direct ways of dealing with class imbalance is to alter the class
distributions toward a more balanced distribution. There are two basic methods for
balancing the class distributions:
1. Under-sampling: these methods aim to balance the data set by eliminating exam-
ples of the majority class; and
2. Over-sampling: these methods replicate examples of the minority class in order to
achieve a more balanced distribution.
6 Discussion
Much of the research done to solve the problem of learning from imbalanced data sets
is based on balancing the class distributions. However, recent research has shown that
many learning systems are insensitive to class distributions. Drummond and Holte [5]
showed that there are decision tree splitting criteria that are relatively insensitive to
a data set's class distribution. Elkan [6] makes similar statements for Naive Bayes and
Figure 2: Applying Tomek links to a data set. Original data set (a), Tomek links identified (b), and
Tomek links removed (c).
decision tree learning systems. If a learning system is insensitive to the class distribu-
tion, then changing the class distribution (or balancing a data set) might have
little effect on the induced classifier.
On the other hand, under- and over-sampling have been empirically analysed in
several domains, with good results. In [9] several approaches for dealing with imbal-
anced data sets are compared, and the author concludes that under- and over-sampling
are very effective methods for dealing with imbalanced data sets.
Moreover, Drummond and Holte [5] stated that under- and over-sampling should
be reconsidered in terms of how they affect pruning and leaf labelling. However, in
several experiments performed in [14], classifiers generated from balanced distributions
frequently obtained better results than those obtained from the naturally occurring
distributions. These experiments were conducted with no pruning, and with the leaf
labelling adjusted to account for the changes made in the class distributions.
7 Conclusion
Learning from imbalanced data sets is an important issue in Machine Learning. A direct
method to solve the imbalance problem is to artificially balance the class distributions.
This balance can be obtained by under-sampling the majority class, over-sampling the
minority class, or both. Several works in the literature confirm the effectiveness of
these methods in practice. However, there is some evidence that artificially re-balancing
the class distributions does not have much effect on the performance of the induced
classifier, since some learning systems are not sensitive to differences in class distribu-
tions. It seems that we still need a clearer understanding of how class distributions
affect each phase of the learning process; for instance, in decision trees, how class
distributions affect tree construction, pruning and leaf labelling. A deeper understanding
of these basics will permit us to design better methods for dealing with the problem of
learning with skewed class distributions.
References
[1] G. E. A. P. A. Batista, A. Carvalho, and M. C. Monard. Applying One-sided Selection to Unbal-
anced Datasets. In O. Cairo, L. E. Sucar, and F. J. Cantu, editors, Proceedings of the Mexican
International Conference on Artificial Intelligence - MICAI 2000, pages 315-325. Springer-Verlag,
April 2000. Best Paper Award Winner.
[2] L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Wadsworth
& Books, Pacific Grove, CA, 1984.
[3] Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. SMOTE:
Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16:321-
357, 2002.
[4] Pedro Domingos. MetaCost: A General Method for Making Classifiers Cost-Sensitive. In Knowl-
edge Discovery and Data Mining, pages 155-164, 1999.
[5] Chris Drummond and Robert C. Holte. Exploiting the Cost (In)sensitivity of Decision Tree
Splitting Criteria. In Proceedings of the 17th International Conference on Machine Learning
(ICML'2000), pages 239-246, 2000.
[6] Charles Elkan. The Foundations of Cost-Sensitive Learning. In Proceedings of the Seventeenth
International Joint Conference on Artificial Intelligence, pages 973-978, 2001.
[7] Tom Fawcett and Foster J. Provost. Adaptive Fraud Detection. Data Mining and Knowledge
Discovery, 1(3):291-316, 1997.
[8] David J. Hand. Construction and Assessment of Classification Rules. John Wiley and Sons, 1997.
[9] Nathalie Japkowicz. Learning from Imbalanced Data Sets: a Comparison of Various Strategies.
In AAAI Workshop on Learning from Imbalanced Data Sets, Menlo Park, CA, 2000. AAAI Press.
[10] M. Kubat and S. Matwin. Addressing the Curse of Imbalanced Training Sets: One-Sided Selec-
tion. In Proceedings of the 14th International Conference on Machine Learning, pages 179-186,
San Francisco, CA, 1997. Morgan Kaufmann.
[11] Charles X. Ling and Chenghui Li. Data Mining for Direct Marketing: Problems and Solutions. In
Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining,
pages 73-79, 1998.
[12] Edwin P. D. Pednault, Barry K. Rosen, and Chidanand Apte. Handling Imbalanced Data Sets
in Insurance Risk Modeling. Technical Report RC-21731, IBM Research Report, March 2000.
[13] Foster J. Provost and Tom Fawcett. Analysis and Visualization of Classifier Performance: Com-
parison under Imprecise Class and Cost Distributions. In Knowledge Discovery and Data Mining,
pages 43-48, 1997.
[14] Foster J. Provost and Tom Fawcett. Robust Classification for Imprecise Environments. Machine
Learning, 42(3):203-231, 2001.
[15] S. J. Stolfo, D. W. Fan, W. Lee, A. L. Prodromidis, and P. K. Chan. Credit Card Fraud Detection
Using Meta-Learning: Issues and Initial Results. In AAAI-97 Workshop on AI Methods in Fraud
and Risk Management, 1997.
[16] I. Tomek. Two Modifications of CNN. IEEE Transactions on Systems, Man, and Cybernetics,
SMC-6:769-772, 1976.
[17] Gary M. Weiss and Foster Provost. The Effect of Class Distribution on Classifier Learning: An
Empirical Study. Technical Report ML-TR-44, Rutgers University, Department of Computer
Science, 2001.
[18] D. R. Wilson and T. R. Martinez. Reduction Techniques for Exemplar-Based Learning Algorithms.
Machine Learning, 38(3):257-286, March 2000.
Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract. We shall enlarge the Sentential Calculus of Łukasiewicz. We use his sym-
bolism, and the concept of 'yield' by Rosser. We introduce a deduction theorem, which
we call a conjunctive deduction theorem. Here we shall prove new theorems and theses.
Theorems are expressed with 'yield' (⊢), i.e., they are derived from a set of propositions,
and theses are valid propositions in the system.
1. Two axiom systems
[1]-1 Łukasiewicz's system
Łukasiewicz formulated the well-known symbolisms. These are written in his work
"Elements of Mathematical Logic" (1929). For details of these symbolisms, see the work
[1]. He published an axiom system L3 for his Sentential Calculus in 1930.
(1) Axioms of L3:
1 CpCqp.
2 CCpCqrCCpqCpr.
3 CCNpNqCqp.
Cpq: "if p, then q" (implication or conditional).
Np: "it is not true that p" (negation).
(2) Detachment: If α, Cαβ, then β (modus ponens).
Definition K: Kpq = NCpNq (conjunction).
[1]-2 Rosser's system and some theses in L3
(1) Rosser's system.
We are able to obtain the following theses in L3 by the axiomatic method.
(1) CpKpp. P → P∧P.
(2) CKpqp. P∧Q → P.
(3) CCpqCNKqrNKrp. P→Q .→. ¬(Q∧R)→¬(R∧P).
Detachment: If α, NKαNβ, then β (modus ponens).
S. Tanaka / An Enlargement of Theorems for Sentential Calculus
Definition C: Cpq = NKpNq.
These are the axioms of Rosser's truth-valued axiom system R (1942) [ ].
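Since C, N and K here are the classical two-valued connectives, these theses can be checked mechanically by truth tables. The sketch below is our own illustration (the paper itself works proof-theoretically): it parses Łukasiewicz's prefix notation and tests two-valued validity.

```python
import itertools

def parse(s, i=0):
    """Parse a formula in Lukasiewicz's prefix notation (C, N, K over p, q, r)."""
    op = s[i]
    if op == "N":
        sub, j = parse(s, i + 1)
        return ("N", sub), j
    if op in "CK":
        left, j = parse(s, i + 1)
        right, k = parse(s, j)
        return (op, left, right), k
    return op, i + 1  # a propositional variable

def ev(node, val):
    """Evaluate a parsed formula under a truth-value assignment."""
    if isinstance(node, str):
        return val[node]
    if node[0] == "N":
        return not ev(node[1], val)
    a, b = ev(node[1], val), ev(node[2], val)
    return (not a or b) if node[0] == "C" else (a and b)

def is_thesis(s):
    """Check two-valued validity of a formula by exhausting the truth table."""
    tree, _ = parse(s)
    names = sorted(set(c for c in s if c in "pqr"))
    return all(ev(tree, dict(zip(names, vs)))
               for vs in itertools.product([True, False], repeat=len(names)))
```

For example, Rosser's three theses above and the axioms of L3 all pass this check, while a non-thesis such as Cpq fails it.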
[2] A notation '⊢' and some theorems
We introduce a notation P1, P2, ..., Pn ⊢ Q, read as 'P1, P2, ..., Pn yield Q'.
[2]-1 The notation P1, P2, ..., Pn ⊢ Q indicates that there is a sequence of sentences β1,
β2, ..., βm such that βm is Q and for each βi, either:
(1) βi is an axiom.
(2) βi is a Pj.
(3) βi is the same as some earlier βj.
(4) βi is derived from two earlier β's by modus ponens.
A more precise version of (4) is: there are j and k, each less than i, such that βk is Cβjβi.
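Clauses (1)-(4) amount to a mechanical check on a derivation. The sketch below is our own illustration in Python: formulas are strings in prefix notation, so Cβjβi is literally the string "C" + βj + βi.

```python
def yields(premises, sequence, axioms=()):
    """Verify a derivation in the sense of [2]-1: each sentence must be an
    axiom, a premise, a repeat of an earlier line, or follow from two earlier
    lines by modus ponens (from beta_j and C-beta_j-beta_i, infer beta_i)."""
    seen = []
    for b in sequence:
        ok = (b in axioms or b in premises or b in seen
              # modus ponens: some earlier a and earlier "C" + a + b
              or any(c == "C" + a + b for a in seen for c in seen))
        if not ok:
            return False
        seen.append(b)
    return True
```

For instance, the derivation of r from the premises p, CpCqr and q (Theorem 2-3 style) passes this check.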
Proof. From α1, ..., αn ⊢ β2 and β2, ..., βb ⊢ δ we have α1, ..., αn, β2, ..., βb ⊢ δ
by Theorem 1-2. From γ1, ..., γm ⊢ β2 and α1, α2, ..., αn, β2, ..., βb ⊢ δ we get
α1: p premise
α2: CpCqr premise
α3: Cqr modus ponens (α1, α2)
α4: q premise
α5: r modus ponens (α3, α4)
Theorem 2-4 CpCqr, Cpq, p ⊢ r.
Proof.
α1: p premise
α2: Cpq premise
α3: q modus ponens (α1, α2)
α4: CpCqr premise
α5: r modus ponens (α3, α4)
Theorem 2-5 CpCpq, p ⊢ q.
Proof.
α1: p premise
α2: CpCpq premise
α3: Cpq modus ponens (α1, α2)
α4: q modus ponens (α1, α3)
this.
Applying the deduction theorem to Theorem 3-4 twice, we have the following two theses:
11 CNpCpq.
12 CpCNpq.
Theorem 3-5 CNqNp, Cqr, p ⊢ r.
Proof. Make a sequence β1, β2, β3, β4, β5, β6 ⊢ r: β1: CCNqNpCpq, β2: CNqNp,
β3: Cpq, β4: p, β5: q, β6: Cqr, r: r. By axiom 3 p/q, q/p, we have ⊢ β1. By
modus ponens we derive β3 (β1, β2), β5 (β3, β4) and r (β5, β6). From Theorem 1-5
(m=6) we get this.
Applying the deduction theorem 3 times to Theorem 3-5, we get the following
thesis:
13 CCNqNpCCqrCpr.
Theorem 3-6 CCpqr ⊢ CNpr.
Proof. By Theorem 3-2 p/Np, q/Cpq we have: β1: CNpCpq, β2: CCpqr ⊢ r: CNpr.
By thesis 11 we have ⊢ β1. From Theorem 1-5 (m=2) we get this.
Applying the deduction theorem to Theorem 3-6, we get the following thesis:
14 CCCpqrCNpr.
Theorem 3-7 CCNpqr ⊢ Cpr.
Proof. By Theorem 3-2, q/CNpq we have: β1: CpCNpq, β2: CCNpqr ⊢ r: Cpr. By
thesis 12 we have ⊢ β1. From Theorem 1-5 (m=2) we get this.
Applying the deduction theorem to Theorem 3-7, we get the following thesis:
15 CCCNpqrCpr.
Theorem 3-8 CNqq, Nq ⊢ p.
Proof. By Theorem 2-4: CpCqr, Cpq, p ⊢ r, p/Nq, r/p we have: β1: CNqCqp,
β2: CNqq, β3: Nq ⊢ r: p. As ⊢ β1 by thesis 11 p/q, q/p, from Theorem 1-5 (m=3)
we get this.
Applying the deduction theorem to Theorem 3-8 twice, we have the following
thesis:
16 CCNqqCNqp.
Theorem 3-9 CNqp ⊢ CCqrCpr.
Proof. We make a sequence β1: CCNqNpCpq, β2: CNqNp, β3: Cpq, β4: CCpqCCqrCpr
⊢ r: CCqrCpr. By thesis 11 p/q, q/p, and thesis 10 we get ⊢ β1, ⊢ β4. By
modus ponens we derive β3 (β1, β2) and r (β3, β4). From Theorem 1-5 (m=4) we
get this.
Applying the deduction theorem to Theorem 3-9, we have the following thesis:
17 CCNqpCCqrCpr.
Theorem 3-10 CNqq, Cqr, p ⊢ r.
Proof. We make a sequence β1: CNqq, β2: CNqp, β3: CCqrCpr, β4: Cqr,
β5: Cpr, β6: p, r = β7: r. By modus ponens, we have β2 (β1, thesis 16),
β3 (thesis 17, β2), β5 (β3, β4) and r = β7 (β5, β6). Using Theorem 1-2
(n=1, m=2) twice, we get this.
Applying the deduction theorem to Theorem 3-10, we have the following thesis:
18 CCNqqCCqrCpr.
Theorem 3-11 CNpp ⊢ p.
Proof. By Theorem 3-10 q/p, p/Cpp, r/p we have β1: CNpp, β2: Cpp, β3: Cpp,
β4 = r: p. Clearly ⊢ β2, ⊢ β3. From Theorem 1-5 we get this.
Applying the deduction theorem to Theorem 3-11, we have the following thesis:
19 CCNppp.
Theses 7, 12 and 19 are the axioms of the Ł1924 system.
Applying the deduction theorem to Theorem 2-5: CpCpq, p ⊢ q twice, we get the
following thesis:
20 CCpCpqCpq.
Applying the deduction theorem to Theorem 4-2, we have the following thesis:
22 CNNpp.
We use a proof line. Axiom 3 p/NNp, q/p × C 22 p/Np − 23.
23 CpNNp.
Theorem 4-3 Cpq ⊢ CNqNp.
Proof. ⊢ β1: CNNpp (thesis 22), β2: Cpq, β3: CNNpq (β1, β2), ⊢ β4: CqNNq
(thesis 23 p/q), β5: CNNpNNq (β3, β4), β6 = r: CNqNp (axiom 3 p/Np, q/Nq).
Hence we get this.
Applying the deduction theorem to Theorem 4-3, we have the following thesis:
24 CCpqCNqNp.
Theorem 4-4 Cpq ⊢ CNNCqNrNNCrNp.
Proof. From Theorem 3-2 r/Nr, we have β1: Cpq, CqNr ⊢ CpNr. By Theorem 4-3
q/Nr we get β2: CpNr ⊢ CrNp. β3: Cpq, CqNr ⊢ CrNp (HS, β1, β2), β4:
Cpq ⊢ CCqNrCrNp (β3 and the deduction theorem), β5: CCqNrCrNp ⊢ CNCrNpNCqNr
(Theorem 4-3 p/CqNr, q/CrNp), β6 = r: CNCrNpNCqNr ⊢ CNNCqNrNNCrNp
(Theorem 4-3 p/NCrNp, q/NCqNr). Hence from Theorem 1-5 we get this.
Applying the deduction theorem to Theorem 4-4, we have the following thesis:
25 CCpqCNNCqNrNNCrNp.
Applying Definition Kpq = NCpNq to thesis 25, we have the following thesis:
26 CCpqCNKqrNKrp.
This is Axiom 3 of Rosser (1942).
Theorem 4-5 CNpq ⊢ CNqp.
Proof. β1: CNpq, ⊢ β2: CqNNq (thesis 23 p/q), β3: CNpNNq (Theorem 3-2, β1, β2),
⊢ β4: CCNpNNqCNqp (axiom 3, q/Nq), β5 = r: CNqp (MP, β3, β4). Hence from
Theorem 1-5 we get this.
Applying the deduction theorem to Theorem 4-5, we have the following thesis:
27 CCNpqCNqp.
Theorem 4-6 CpNq ⊢ CqNp.
Proof. β1: CpNq, ⊢ β2: CCpNqCNNqNp (thesis 24 q/Nq), β3: CNNqNp (MP, β1, β2),
⊢ β4: CqNNq (thesis 23, p/q), β5 = r: CqNp (HS, β3, β4). Hence from Theorem 1-5
we get this.
Applying the deduction theorem to Theorem 4-6, we have the following thesis:
28 CCpNqCqNp.
⊢ β1: CNpCpNq (thesis 11 q/Nq), ⊢ β2: CCNpCpNqCNCpNqNNp (thesis 24
p/Np, q/CpNq), β3: CNCpNqNNp (MP, β1, β2), ⊢ β4: CNNpp (thesis 22),
β5 = r: CNCpNqp (HS, β3, β4). Hence from Theorem 1-5 we have the following
thesis:
29 CNCpNqp.
Applying Definition Kpq = NCpNq to thesis 29, we have the following thesis:
30 CKpqp.
This is Axiom 2 of Rosser (1942).
31 CCpCqNrCpCrNq.
⊢ β1: CCCqNrCrNqCCpCqNrCpCrNq (thesis 8 q/CqNr, r/CrNq), ⊢ β2: CCqNrCrNq
(thesis 28 p/q, q/r), β3 = r: CCpCqNrCpCrNq (MP, β1, β2). Hence we get this.
32 CpNCpNp.
Proof. ⊢ β1: CCpCCpNpNpCpCpNCpNp (thesis 31 q/CpNp, r/p), ⊢ β2: CpCCpNpNp
(thesis 5, q/Np), β3: CpCpNCpNp (MP, β1, β2), ⊢ β4: CCpCpNCpNpCpNCpNp
(thesis 20, q/NCpNp), β5 = r: CpNCpNp (MP, β3, β4). Hence we get this.
Applying Definition Kpq = NCpNq to thesis 32, we have the following thesis:
33 CpKpp.
This is Axiom 1 of Rosser (1942). Therefore the set of theses 33, 30 and 26 is the
system of the truth-valued axioms by Rosser.
51 CpCqKpq.
Proof. ⊢ β1: CCKpqKpqCpCqKpq (thesis 50, r/Kpq), ⊢ β2: CKpqKpq (thesis 4,
p/Kpq), ⊢ β3: CpCqKpq (MP, β1, β2).
Theorem 5-15 Cpq ⊢ CpNNq.
Proof. β1: Cpq, ⊢ β2: CqNNq (thesis 23, p/q), β3: CpNNq (HS, β1, β2).
Applying the deduction theorem to Theorem 5-15, we have the following thesis:
52 CCpqCpNNq.
Theorem 5-16 Cpq, Crs ⊢ NKNKqsKpr.
Proof. β1: Crs, ⊢ β2: CCrsCKprKsp (Theorem 5-2, p/r, q/s, r/p), β3: CKprKsp
(MP, β1, β2), β4: Cpq, ⊢ β5: CCpqCKspKqs (Theorem 5-2, r/s), β6: CKspKqs
(MP, β4, β5), ⊢ β7: CKprKsp, CKspKqs ⊢ NKNKqsKpr (Theorem 5-1, p/Kpr, q/Ksp,
r/Kqs), β8: NKNKqsKpr (β3, β6, β7).
53 CKrNNpKpr.
Proof. β1: CNNpp ⊢ CKrNNpKpr (Theorem 5-2, p/NNp, q/p), ⊢ β2: CNNpp
(thesis 22), ⊢ β3: CKrNNpKpr (MP, β1, β2).
54 CKrNNpKrp.
Proof. ⊢ β1: CKrNNpKpr (thesis 53), ⊢ β2: CKprKrp (thesis 38, p/r, r/p),
⊢ β3: CKrNNpKrp (HS, β1, β2).
Theorem 5-17 CKpqr, CKpNqr ⊢ Cpr.
Proof. β1: CKpqr, β2: CKpNqr, ⊢ β3: CKqpKpq (thesis 38, r/q), β4: CKqpr (HS,
β1, β3), ⊢ β5: CKNqpKpNq (thesis 38, r/Nq), β6: CKNqpr (HS, β2, β5),
⊢ β7: CCKqprCqCpr (thesis 50, p/q, q/p), β8: CqCpr (MP, β4, β7), ⊢ β9:
CCKNqprCNqCpr (thesis 50, p/Nq, q/p), β10: CNqCpr (MP, β6, β9), ⊢ β11:
CqCpr, CNqCpr ⊢ Cpr (Theorem 5-2, p/q, q/Cpr), β12: Cpr (β8, β10, β11).
Theorem 5-18 CpNr ⊢ CpNKqr.
Proof. β1: CpNr, ⊢ β2: CpNr ⊢ CpNKrq (Theorem 5-13, q/r, r/q), β3: CpNKrq
(β1, β2), ⊢ β4: CNKrqNKqr (thesis 41, p/r, r/q), β5: CpNKqr (HS, β3, β4).
Applying the deduction theorem to Theorem 5-18, we have the following thesis:
55 CCpNrCpNKqr.
Theorem 5-19 Cpq, CpNr ⊢ CpNCqr.
Proof. β1: Cpq, β2: CpNr. By Theorem 5-7 r/Nr, β1 and β2, we get β3: CpKqNr.
By Theorem 5-15 q/KqNr and β3, we get β4: CpNNKqNr, β5: CpNCqr (Definition C,
p/q, q/r).
END
Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
1 Introduction
The aim of this work is to present RT-Community, a specification language for real-time
reactive systems. Using the language, one is able to specify the computations of the individual
components of a real-time system as well as the way these components interact with each
other. This makes RT-Community suitable as an architecture description language (ADL) in
the sense of [1].
RT-Community is an extension of the specification language Community ([7]), from
which it inherits its characteristics as a coordination language ([4]). Roughly speaking, this
means that it supports the separation between computational concerns (what each component
does) and coordination concerns (how components are put together and how they interact to
present the behavior expected of the system as a whole).
This paper presents the syntax and informal semantics of RT-Community in section 2 (a
formal model-theoretic semantics, not included here for lack of space, is found in [2]). Section 3
shows how composition is done in RT-Community by way of some basic notions of Category
Theory. Finally, section 4 offers some concluding remarks, especially about the potential role
of RT-Community in the search for interoperability of specification formalisms.
2 RT-Community
• V is a finite set of variables, partitioned into input variables, output variables and private
variables. Input variables are read by the component from its environment; they cannot be
modified by the component. Output variables can be modified by the component and read
by the environment; they cannot be modified by the environment. Private variables can
be modified by the component and cannot be seen (i.e. neither read nor modified) by the
environment. We write loc(V) for prv(V) ∪ out(V). We assume given, but do not make
explicit, the specification of the data types over which the variables in V range.
F.N. do Amaral et al. / A Real-time Specification Language
component P
in in(V)
out out(V)
prv prv(V)
clocks C
init I
do
g: [T(g), B(g) → R(g), ||v∈D(g) v := F(g,v)]
prv g: [T(g), B(g) → R(g), ||v∈D(g) v := F(g,v)]
end component
• C is a finite set of clocks. Clocks are like private variables in that they cannot be seen by
the environment; they can be consulted (i.e. their current values may be read) by the com-
ponent; however, the only way a component can modify a clock variable is by resetting it
to zero.
• I is a formula specifying the initial state of the component (i.e., conditions on the initial
values of variables in loc(V)). The initial values of input variables are unconstrained, and
clocks are always initialized to zero.
• Γ is a finite set of action names. Actions may be either shared or private. Shared actions
are available for synchronization with actions of other components, whereas the execution
of private actions is entirely under the control of the component.
• The body of the component consists of a set of instructions resembling guarded com-
mands. For each action g ∈ Γ, we have the time guard T(g), which is a boolean expres-
sion over atomic formulae of the form x ~ n and x − y ~ n, where x, y ∈ C, n ∈ ℝ≥0 and
~ is one of {<, >, =}; the data guard B(g), which is a boolean expression constructed
from the operations and predicates present in the specification of the data types of V; the
reset clocks R(g) ⊆ C, containing the clocks to be reset upon execution of g; and the
parallel assignment of a term F(g, v) to each variable v that can be modified by g. We
denote by D(g) the set of variables that can be modified by g. This set will be called the
write frame (or domain) of g.
The execution of a component proceeds as follows: at each step, an action whose time
and data guards are true (such an action is said to be enabled) may be executed. If more than
one enabled action exists, one may be selected non-deterministically. If an action is selected,
the corresponding assignments are effected and the specified clocks are reset. If no action
is selected, time passes, the component idles and all clocks are updated accordingly (i.e.,
the semantics is based on global time). However, three conditions must be met: (1) a private
action may not be enabled infinitely often without being infinitely often selected (the fairness
condition), (2) time may not pass if there is an enabled private action (the urgency condition),
and (3) a shared action may not be disabled solely by the passage of time (the persistency
condition).
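The step relation just described can be sketched as follows. This is a hypothetical Python rendering for illustration; the dictionary-based representation of actions, clock valuations and states is our own, not part of RT-Community.

```python
def enabled(action, clocks, state):
    """An action may fire when both its time guard T and data guard B hold.

    Guards are represented here as predicates over the clock valuation and
    the variable state, respectively (an assumption of this sketch).
    """
    return action["T"](clocks) and action["B"](state)

def execute(action, clocks, state):
    """Effect the parallel assignments, then reset the action's clocks."""
    new_state = dict(state)
    for v, f in action["F"].items():      # parallel assignment: every right-
        new_state[v] = f(state)           # hand side reads the pre-step state
    new_clocks = {c: (0.0 if c in action["R"] else t) for c, t in clocks.items()}
    return new_clocks, new_state
```

When no action is enabled, a real semantics would instead let time pass, increasing all clocks uniformly; that part, together with the fairness, urgency and persistency conditions, is omitted from this sketch.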
A model-theoretic semantics of RT-Community based on Time-Labelled Transition Sys-
tems can be found in [2].
component snooze
in float initialInterval, float minimum
out bool ringing
prv float interval
clocks c
init ¬ringing ∧ interval = -1
do
[] firstRing: → ringing := true || interval := initialInterval
[] snooze: ringing ∧ interval > minimum →
reset(c) || ringing := false || interval := interval/2
[] off: ringing → ringing := false || interval := -1
[] prv timeout: c = interval → ringing := true
end component
We present a component for the "snooze" feature of an alarm clock. Component snooze is
activated when the timekeeping component of the alarm clock (not shown) reaches the preset
time, as indicated by action firstRing. This action sets the output variable ringing to true, a
change that may be detected by a "bell" component (not shown either). If the user presses the
"off" button at this point, the alarm and the snooze component are turned off, as indicated
by the off action. However, if the user presses the "snooze" button (action snooze), the alarm
stops ringing, only to ring again after a preset time interval. This second ringing of the alarm
is activated by the snooze component upon detecting the timeout (private action timeout).
Now, if the user presses the "snooze" button this time, he will be allowed to sleep for an
additional period with half the duration of the initial interval. This pattern repeats, with the
interval being halved each time the alarm rings and the user presses the "snooze" button, until
either the user presses the "off" button or the interval reaches a certain minimum duration (in
this last case, the alarm will go on ringing until the user presses the "off" button).
The duration of the initial interval and the minimum duration are provided by the environ-
ment of the snooze component, as indicated by the input variables initialInterval and minimum.
The specification of the snooze component is given in Figure 2. There, time guards and data
guards that have the constant truth value true are omitted, and the resetting of a clock c is
indicated by the reset(c) instruction.
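The interval-halving behaviour of repeated "snooze" presses is easy to trace. The sketch below is our own illustration, abstracting away the clock and the user's button presses; it is not part of the specification.

```python
def snooze_trace(initial_interval, minimum):
    """Successive sleep intervals if the user keeps pressing "snooze".

    The interval is halved on each press, and pressing "snooze" is only
    possible while interval > minimum (cf. the data guard of action snooze).
    """
    intervals, interval = [], initial_interval
    while interval > minimum:
        interval = interval / 2
        intervals.append(interval)
    return intervals
```

For example, with an initial interval of 8 time units and a minimum of 1, the alarm rings again after 4, 2 and finally 1 unit, after which "snooze" is disabled and the alarm rings until "off" is pressed.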
We use basic concepts from Category Theory ([6, 10]) to define how composition is done in
RT-Community. The use of Category Theory allows us to describe the interaction of compo-
nents in an abstract fashion, thus establishing the essentials of their coordination in such a
way that RT-Community components may be replaced by specifications in other formalisms
(e.g. a temporal logic with time-bounded operators such as MTL - see [3]), a desirable feature
when our ultimate goal is to promote the interoperability of formalisms.
Formally, an RT-Community component is a pair (Σ, Δ), with Σ the signature and Δ the
body of the component.
Definition 3.1 (Signature of a component). An RT-Community signature is a tuple Σ =
⟨V, C, Γ, tv, ta, D, R⟩, where V is a finite set of variables, C is a finite set of clocks, Γ is
a finite set of action names, tv : V → {in, out, prv} is the typing function for variables,
ta : Γ → {shr, prv} is the typing function for action names, D : Γ → 2^loc(V) is the write
frame function for action names, and R : Γ → 2^C is the reset clock function for action
names.
Definition 3.2 (Body of a component). The body of an RT-Community component with sig-
nature Σ is a tuple Δ = ⟨T, B, F, I⟩, with T : Γ → PROP(C) the time guard func-
tion for action names, B : Γ → PROP(V) the data guard function for action names,
F : Γ → (loc(V) → TERM(V)) the assignment function for action names, and I a for-
mula over loc(V) specifying the initial state(s) of the component. Here, PROP(C) denotes
the set of boolean propositions over clocks, PROP(V) denotes the set of boolean proposi-
tions over variables and TERM(V) is the set of terms of the term algebra of the data types
involved. Function F must respect sorts when assigning terms to variables.
• For all v ∈ V1, sort2(σv(v)) = sort1(v). Variables of the component are mapped to
variables of the system in such a way that sorts are preserved;
• For all o, i, p ∈ V1, o ∈ out(V1) ⇒ σv(o) ∈ out(V2), i ∈ in(V1) ⇒ σv(i) ∈ in(V2) ∪
out(V2), p ∈ prv(V1) ⇒ σv(p) ∈ prv(V2). The nature of each variable is preserved, with
the exception that input variables of the component may become output variables of the
system. This is because, as will be seen below, an input variable of a component may be
"connected" to an output variable of another component; when this happens, the resulting
variable must be considered an output variable of the system: its value can be modified
by the system, and it remains visible to the environment (which, however, cannot modify
it).
σc : C₁ → C₂ is a total, injective function from the clocks of the component to the
clocks of the system. In other words, all clocks of the component must retain their identity in
the system.
σa : Γ₂ → Γ₁ is a partial function from the actions of the system to the actions of
the component. σa is partial because an action of the system may or may not correspond
to an action of the component; i.e., if the component does not participate in a given action
g of the system, then σa(g) is undefined. Furthermore, note the contravariant nature of σa
compared to σv and σc: because each action of the system can involve at most one action of
the component, and each action of the component may participate in more than one action
of the system, the relation must be functional from the system to the component. Besides, σa
must satisfy the following conditions (D(v) for a variable v denotes the set of actions having
v in their domain):
• For all g ∈ Γ₂ for which σa(g) is defined, g ∈ shr(Γ₂) ⇒ σa(g) ∈ shr(Γ₁) and
g ∈ prv(Γ₂) ⇒ σa(g) ∈ prv(Γ₁). An action of the system is of the same nature (shared
or private) as the component action involved in it.
• For all g ∈ Γ₂ for which σa(g) is defined, and for all v ∈ loc(V₁), v ∈ D₁(σa(g)) ⇒
σv(v) ∈ D₂(g) and g ∈ D₂(σv(v)) ⇒ σa(g) ∈ D₁(v). If a component variable is
modified by a component action, then the corresponding system variable is modified by
the corresponding system action. Besides, system actions in which the component does not
participate cannot modify local variables of the component.
The following item states analogous conditions for clocks of the component:
• For all g ∈ Γ₂ for which σa(g) is defined, and for all c ∈ C₁, c ∈ R₁(σa(g)) ⇒ σc(c) ∈
R₂(g) and g ∈ R₂(σc(c)) ⇒ σa(g) ∈ R₁(c).
• For all actions g in Γ₂ with σa(g) defined, we have Φ ⊨ B₂(g) → σ(B₁(σa(g))) and
Φ ⊨ T₂(g) → σ(T₁(σa(g))), where Φ is a suitable axiomatization of the specification of
the data types involved, and σ is the extension of σv to the language of terms and propositions. The behavior of the system (Σ₂, Δ₂) is such that an action g of the system cannot
disrespect the data and time guards of the corresponding action σa(g) of the component;
i.e., the system can only strengthen said guards.
• Φ ⊨ I₂ → σ(I₁), with Φ and σ as above. The initial state of the component must be
implied by the initial state of the system.
• For all actions g in Γ₂ with σa(g) defined, and for all local variables v in D₁(σa(g)),
we have F₂(g)(σv(v)) = σ(F₁(σa(g))(v)), where, as before, σ is the extension of σv
to the language of terms and propositions. Recall that F is the function that assigns to
each action g a mapping from the variables in the action's domain D(g) to terms of the
term algebra of the data types involved. This means that an action g of the system can
only assign to a local variable of the component the "translation" of the value that the
corresponding action σa(g) of the component assigns.
The interaction of two components P₁ and P₂ may be either in the form of the connection
of variables or in the form of the synchronization of actions (or both). This interaction is
specified using a third component, called a channel. Given components P₁ and P₂ and a
channel Pc, the composition of P₁ and P₂ via Pc is represented by the diagram

P₁ ←σ₁— Pc —σ₂→ P₂

When certain conditions are met, such a diagram is called a configuration. In a configuration, morphisms σ₁ and σ₂ specify how P₁ and P₂ should interact, as discussed below:
Variables: The only variables that the channel Pc can contain are of the input kind, so
they can be mapped to input or output variables of P₁ and P₂. Given an input variable v of
Pc, we say that variables σ₁(v) and σ₂(v) of P₁ and P₂, respectively, are connected. If two
input variables are connected, the result will be an input variable of the composite system. If
an input variable and an output variable are connected, the result will be an output variable of
the composite system. We do not allow two output variables to be connected (a diagram where
this happens is not considered a well-formed configuration). Furthermore, private variables
and clocks cannot be connected, so we do not allow the channel Pc to have private variables
or clocks.
Actions: The only actions that the channel Pc can contain are of the shared kind. Furthermore, these actions must be "empty" in the sense that their time and data guards are
the constant value true, they do not reset any clocks and do not modify any variables (i.e.
their write frame is empty). This means that the channel Pc is a "neutral" component with no
behavior of its own, serving only as a kind of "wire" for joining the actions of P₁ and P₂.
Given an action g₁ of P₁ and an action g₂ of P₂ having σ₁(g₁) = σ₂(g₂) = g for g an
action of Pc, we say that g₁ and g₂ are synchronized. When two actions are synchronized, the
result will be a joint action of the composite system. This joint action will be of the shared
kind, its data and time guards being the conjunction of the corresponding guards of g₁ and g₂,
its effect being to reset the union of the sets of clocks reset by g₁ and g₂ and to perform the
assignments in the union of the sets of assignments dictated by g₁ and g₂.
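The effect of synchronization can be sketched as an operation on action records. This is a minimal illustration of the joint-action construction, not the paper's formal definition: the dictionary encoding and the two sample actions (including the clock name d and the ringing assignment, which belong to a snooze component not shown here) are stand-ins of our own.

```python
def join_actions(g1, g2):
    """Joint action of two synchronized actions: guards are
    conjoined, reset sets and assignment maps are united,
    and the resulting action is of the shared kind."""
    return {
        'kind': 'shr',
        'data_guard': lambda s: g1['data_guard'](s) and g2['data_guard'](s),
        'time_guard': lambda s: g1['time_guard'](s) and g2['time_guard'](s),
        'resets': g1['resets'] | g2['resets'],
        'assignments': {**g1['assignments'], **g2['assignments']},
    }

# Illustrative stand-ins for ring (timekeeping) and firstRing (snooze)
ring = {'data_guard': lambda s: s['alarmOn'] and s['now'] == s['alarmTime'],
        'time_guard': lambda s: True, 'resets': set(), 'assignments': {}}
first_ring = {'data_guard': lambda s: True, 'time_guard': lambda s: True,
              'resets': {'d'}, 'assignments': {'ringing': True}}
joint = join_actions(ring, first_ring)
```

The joint action's data guard holds exactly when both original guards hold, and its effect is the union of the two effects.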
Formally, the system that results from the composition of P₁ and P₂ through channel Pc is
the pushout (in the category Comp) of the configuration above, written P₁ +Pc P₂. In general,
for a configuration diagram involving any (finite) number of channels and components, the
resulting system is given by the colimit of the diagram. A configuration is said to be well-
formed when it satisfies the conditions described above. For well-formed configurations, the
colimit will always exist.
component timekeeping
in Time alarmTime, Time currentTime
out float snoozeInterval, float minimum
prv int ticksPerSec, Time now, boolean alarmOn
clock c
init snoozeInterval = 10 ∧ minimum = 1 ∧ ticksPerSec = ... ∧ ¬alarmOn
do
[] setTime: → now := currentTime || reset(c)
[] setAlarm: → alarmOn := true
[] ring: alarmOn ∧ now = alarmTime → skip
[] alarmOff: → alarmOn := false
[] prv keepTime: c = ticksPerSec → now := now + 1 || reset(c)
end component
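A toy operational reading of this component can be sketched in Python. This is only an illustration, not RT-Community semantics: the tick-driven scheduler, the setAlarm signature, and the ticksPerSec value are simplifications of our own.

```python
class Timekeeping:
    """Toy rendering of the timekeeping component: the private
    action keepTime advances `now` each time clock c reaches
    ticksPerSec; ring is enabled when the alarm is on and the
    current time equals the alarm time."""
    def __init__(self, ticks_per_sec=3):
        self.ticks_per_sec = ticks_per_sec  # prv int (value arbitrary)
        self.now = 0                        # prv Time
        self.alarm_time = None              # in Time
        self.alarm_on = False               # prv boolean
        self.c = 0                          # clock c

    def set_time(self, current_time):       # shared action setTime
        self.now, self.c = current_time, 0

    def set_alarm(self, alarm_time):        # shared action setAlarm
        self.alarm_time, self.alarm_on = alarm_time, True

    def ring_enabled(self):                 # guard of shared action ring
        return self.alarm_on and self.now == self.alarm_time

    def alarm_off(self):                    # shared action alarmOff
        self.alarm_on = False

    def tick(self):                         # environment advances clock c
        self.c += 1
        if self.c == self.ticks_per_sec:    # guard of prv keepTime
            self.now, self.c = self.now + 1, 0

tk = Timekeeping()
tk.set_time(0)
tk.set_alarm(2)
for _ in range(6):      # six ticks at 3 ticks/second = two seconds
    tk.tick()
# now == 2, so the guard of ring is enabled
```

After two simulated seconds the data guard of ring holds, mirroring the guard alarmOn ∧ now = alarmTime of the component text.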
When composing the timekeeping and snooze components, we want to identify variables
snoozeInterval and initialInterval, as well as variables minimum (in timekeeping) and minimum (in snooze), so that the input variables in snooze will contain the constant values provided by timekeeping.
Furthermore, we want to synchronize actions ring and firstRing so that snooze will be
activated exactly when timekeeping detects that the current time equals the time the alarm
has been set to ring. We also want to synchronize the alarmOff and off actions, meaning that
when the user presses the "off" button, both the snooze and the alarm mechanisms are turned
off.
Notice that the resulting composite system is still open, in the sense that there are still
unconnected input variables: currentTime and alarmTime receive values given by the user
when he or she wants to set the time or the alarm, operations that are made available by the
shared actions setTime and setAlarm, respectively.
Further interaction with the environment is given by the following features:
• The snooze action of the snooze component remains as a shared action of the system; it
must be synchronized with an action of the environment representing the pressing of the
"snooze" button while the bell is ringing.
• The joint action (alarmOff, off) is a shared action of the system that must be synchronized
with an action of the environment representing the pressing of the "off" button while the
bell is ringing.
• The output variable ringing in the snooze component must be connected to an input vari-
able of a component representing the actual bell mechanism.
4 Concluding Remarks
and upper bounds for action data guards (i.e., safety conditions and progress conditions, re-
spectively), and the partial specification of the effect of an action g on its write frame D(g),
by means of a boolean expression (involving primed variables, as is customary in other for-
malisms) instead of parallel assignment.
When specifying the components of a real-time system, it may be appropriate to use
formalisms of a higher level of abstraction than RT-Community. One example would be the
use of a real-time temporal logic (e.g. MTL - [3]) to specify the behavior of a component. In
fact, we expect that by using mappings between logics such as those described in [5], a wider
range of formalisms may be employed in the specification of a single system, allowing for a
situation of interoperability among logics and specification languages.
The adaptation of RT-Community to systems involving mobility and dynamic reconfigu-
ration is the subject of current study. A similar goal is being pursued in relation to untimed
Community ([9]), where graph rewriting is used to reflect runtime changes in the system.
We are investigating an alternative approach, where channels may be passed between com-
ponents to allow them to engage in new connections at execution time - a strategy similar to
the one used in the π-calculus ([8]).
RT-Community is currently being contemplated for the specification of hypermedia pre-
sentations, a domain where real-time constraints occur naturally.
References
[1] R. Allen and D. Garlan, A Formal Basis for Architectural Connectors, ACM TOSEM, 6(3)(1997) 213-
249.
[2] F.N. Amaral and E.H. Haeusler, A Real-Time Specification Language, Technical Report, Dept. of Infor-
matics, PUC-RJ, Brazil (2002).
[3] E. Chang, Compositional Verification of Reactive and Real-Time Systems, PhD Thesis, Stanford Univer-
sity (1995).
[4] D. Gelernter and N. Carriero, Coordination Languages and their Significance, Comm. ACM 35, 2 (1992)
97-107.
[5] A. Martini, U. Wolter and E.H. Haeusler, Reasons and Ways to Cope with a Spectrum of Logics, in J.
Abe and J.I. da S. Filho (eds.), Logic, Artificial Intelligence and Robotics (proc. LAPTEC 2001), series:
Frontiers in Artificial Intelligence and Applications 71, IOS Press, Amsterdam (2001) 148-155.
[6] B.C. Pierce, Basic Category Theory for Computer Scientists, The MIT Press (1991).
[7] J.L. Fiadeiro and A. Lopes, Semantics of Architectural Connectors, in M. Bidoit and M. Dauchet (eds),
TAPSOFT'97, LNCS 1214, Springer-Verlag (1997) 505-519.
[8] R. Milner, Communicating and Mobile Systems: the π-Calculus, Cambridge University Press (1999).
[9] M. Wermelinger and J.L. Fiadeiro, Algebraic Software Architecture Reconfiguration, in Software
Engineering-ESEC/FSE'99, volume 1687 of LNCS, Springer-Verlag (1999) 393-409.
[10] G. Winskel and M. Nielsen, Categories in Concurrency, in A.M. Pitts and P. Dybjer (eds.), Semantics
and Logics of Computation, Cambridge University Press (1997) 299-354.
202 Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract
1. Introduction
Fuzzy systems concerning medical diagnosis often seek the allocation of patients to
specific nosologic categories through some sort of defuzzification rule. This follows the
rationale of medical practice under which for a patient presenting with a given set of
clinical signs and symptoms a conclusive diagnosis should be produced.
Any such conclusion, nonetheless, will always hinge on the premises of previously derived
composite relational equations which, by analyzing records of past patients' signs,
symptoms, and diagnoses, establish how these are related. Depending on the model used in this
analysis, different sorts of information will be available for the definition of a
defuzzification procedure. In the present paper, the yields of two different methods are
studied.
2. Methods
A data set of 153 children recording their clinical signs and diagnoses was
randomly separated into an analysis sample (75% of cases) and a validation sample (25%
of cases), the former being used to derive a matrix of relations between diagnoses and
signs, and the latter being used to assess performance in allocating patients to a diagnosis
category according to the aforementioned matrix. Diagnoses were pneumonia (d1),
diseased but not pneumonia (d2), and healthy (d3), crisply assessed as either present or
absent. Clinical signs were chest X-ray (s1), dyspnoea (s2), chest auscultation (s3), cardiac
rate (s4), body temperature (s5), toxaemia (s6), respiratory rate (s7), which were originally
assessed through qualitative scales, whose values were normalised to the unit to express
degrees of membership.
J.C.R. Pereira et al. / Defuzzification in Medical Diagnosis 203
Max-min relation: Given S and T, two binary relations on U × V and V × W (e.g. signs ×
patient and patient × diagnosis), the sup-min composition (e.g. signs × diagnoses) is a
fuzzy relation on U × W of the type

(S ∘ T)(u, w) = sup_v min[S(u, v), T(v, w)]   (1)
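The sup-min composition can be sketched in a few lines of Python; the matrices below are illustrative, not the paper's data:

```python
def max_min_compose(S, T):
    """Sup-min (max-min) composition of fuzzy relations given as
    nested lists: S on U x V, T on V x W; the result R on U x W has
    R(u, w) = max over v of min(S(u, v), T(v, w))."""
    n_v = len(T)
    return [[max(min(row[v], T[v][w]) for v in range(n_v))
             for w in range(len(T[0]))]
            for row in S]

# Illustrative data: 2 signs x 3 patients and 3 patients x 2 diagnoses
S = [[0.4, 1.0, 0.2],
     [0.7, 0.3, 0.9]]
T = [[0.5, 0.0],
     [0.8, 0.6],
     [1.0, 0.1]]
R = max_min_compose(S, T)   # 2 signs x 2 diagnoses
```

Each entry of R keeps the strongest "chain" linking a sign to a diagnosis through some patient, which is why the max-min matrices in the Results section are comparatively dense.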
Gödel's implication: Gödel's implication treats Cartesian products in a way that preserves
the truth table. The relational equation that incorporates Gödel's implication is given by

R(u, w) = inf_v [S(u, v) →G T(v, w)],  where a →G b = 1 if a ≤ b, and a →G b = b otherwise.   (2)
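A sketch of this construction, assuming (as in Sanchez-style relational equations) that the derived relation is the infimum, over patients, of the Gödel implication from sign degree to diagnosis degree; the patient records below are illustrative, not the paper's data:

```python
def godel_implication(a, b):
    """Goedel implication: 1 when a <= b, otherwise b (the
    truth-table-preserving operator referred to in the text)."""
    return 1.0 if a <= b else b

def godel_relation(signs, diags):
    """R(sign, diagnosis) as the infimum over patients of
    signs[p][i] ->G diags[p][j]; assumes crisp 0/1 diagnoses."""
    n_p, n_s, n_d = len(signs), len(signs[0]), len(diags[0])
    return [[min(godel_implication(signs[p][i], diags[p][j])
                 for p in range(n_p))
             for j in range(n_d)]
            for i in range(n_s)]

# Illustrative records: 3 patients, 2 signs, 2 crisp diagnoses
signs = [[0.4, 1.0], [0.7, 0.3], [0.0, 0.9]]
diags = [[1, 0], [1, 0], [0, 1]]
R = godel_relation(signs, diags)
# Only the first sign fully implies the first diagnosis, mirroring
# how the Goedel matrix in the Results is much sparser than max-min.
```

A single counterexample patient drives an entry to a low value, which explains the many zeros in the Gödel matrix of the Results section.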
To allocate patients from the validation sample to the diagnosis categories, for each
patient (Pn) allocation to diagnoses (dm) was drawn from the composition of his vector of
clinical signs (si) with the relational matrix of each model according to the function

O = sup_i [min(R(d_m, s_i), P_n(s_i))]   (3)
The defuzzification rule was defined as the maximum points of the resulting
membership functions; in other words, one patient should be allocated to the diagnosis
to which he had the highest membership. As the relation between signs and healthy could
achieve some degree of membership, a patient should be allocated to the healthy category
if his memberships to both pneumonia and other disease were null.
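The rule can be transcribed directly, including a multiple-allocation variant in which ties keep every maximal diagnosis; the membership values below are patient 31's under the max-min model (from table 2):

```python
def allocate(memberships, tol=1e-9):
    """Allocate a patient to the diagnoses of maximal membership.
    Returns ['healthy'] when both disease memberships are null;
    otherwise every diagnosis tying at the maximum is kept
    (multiple allocation)."""
    if memberships['pneumonia'] == 0 and memberships['other disease'] == 0:
        return ['healthy']
    best = max(memberships.values())
    return [d for d, m in memberships.items() if abs(m - best) < tol]

# Patient 31 under the max-min model: a tie between the two diseases
patient31 = allocate({'pneumonia': 0.75, 'other disease': 0.75,
                      'healthy': 0.5})
```

With single allocation a tie such as this one leaves the patient unclassified; keeping the whole tying set is what the paper calls multiple allocation.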
To assess performance of defuzzification under each model, the overall agreement
between results and the known classification of patients in the validation sample was
calculated.
3. Results
Derived from the analysis sample, the matrices of relations between clinical signs
and diagnoses under each model were:
        max-min              Gödel
     d1    d2    d3       d1   d2   d3
s1   0.43  0     0        1    0    0
s2   1     0.25  0        0    0    0
s3   0.67  0     0        1    0    0
s4   1     1     0.5      0    0    0
s5   0.67  1     0        0    0    0
s6   0.75  0.5   0        0    0    0
s7   1     0.75  0        0    0    0
             max-min                              Gödel
Patient ID   pneumonia  other disease  healthy    pneumonia  other disease  healthy
5            0.75       0.5            0.5        0.43       0              0
10           0.25       0.25           0.25       0.14       0              0
23           0.75       0.5            0          0.33       0              0
31           0.75       0.75           0.5        0.67       0              0
37           0.75       0.75           0.5        0.67       0              0
38           0.33       0.25           0.25       0.33       0              0
etc.
Applying the defined defuzzification rule, in the max-min model there were situations
in which the allocation of patients was missed due to ties in the membership functions, e.g. patient
no. 10 under the max-min model. An alternative was to allow multiple allocation, so that a patient could be
simultaneously allocated to more than one category. Agreement between the allocation of
patients through the two models' information and the patients' real status in the validation
sample is shown in tables 3 to 5:
medical belief that inspection of patients has precedence over radiological or laboratory
investigation and reassures that, indeed, if either chest X-ray or auscultation is positive a
diagnosis of pneumonia is very likely, but other clinical signs should not go unattended as
they are as much or even more informative.
Likewise, finding out that the healthy status had some degree of membership to
cardiac rate allowed the identification of a few cases that had been considered healthy
when they would have been better classified as bearers of some disease, either malnutrition
or anemia. Further exploring this table, one can learn how each sign relates to each
diagnosis and thus for which situation it is more important - for instance, toxaemia (s6),
even though not pathognomonic of pneumonia, is more important to this diagnosis than to
the diagnosis of other disease.
Proceeding to table 2, where relations of each patient to each diagnosis are shown,
one can realize that any patient is better described by his vector of relations with diagnoses
than by a single allocation to a specific diagnosis. Table 4 provides evidence of this since it
shows that if multiple allocation is allowed, the level of agreement is enhanced and no
cases are left out of classification.
In conclusion, the results of this study suggest that fuzzy diagnostic
systems should perhaps rule out defuzzification in favor of providing information about
relations and leaving the decision to medical judgement. By shirking defuzzification one
avoids the loss of wisdom that the poet warned against. By giving emphasis to relations
one achieves real knowledge since, according to Poincaré [7], "outside
relations there is no reality knowable".
References
[1] KLIR, G. and YUAN, B. Fuzzy Sets and Fuzzy Logic: Theory and applications. Prentice Hall, USA,
1995.
[2] PEDRYCZ, W. and GOMIDE, F. An Introduction to Fuzzy Sets: Analysis and design. MIT Press,
USA, 1998.
[3] SANCHEZ, E. Medical diagnosis and composite fuzzy relations. Advances in fuzzy set theory and
applications, 1979.
[4] SADEGH-ZADEH, K. Fundamentals of clinical methodology: 1. Differential indication. Artificial
Intelligence in Medicine 1994; 6: 83-102.
[5] NGUYEN, H.T. and WALKER, E.A. A first course in fuzzy logic. 2nd Edition. New York: Chapman &
Hall/CRC, 1999: 321.
[6] SUSSER, M. Causal thinking in the health sciences. New York: Oxford University Press, 1973.
[7] POINCARE, H. Science and hypothesis. New York: Dover Publications, 1952: xxiv.
208 Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract

The purpose of this paper is to study the evolution of an HIV-positive population
toward the manifestation of AIDS (Acquired Immunodeficiency Syndrome). Our main
interest is to model the transference rate in this stage. For this purpose, we use
expert information on the transference rate because it strongly depends on the viral load
and on the CD4+ level of the infected individuals. More specifically, the
transference rate is modeled as a fuzzy set that depends on the knowledge of viral
load and CD4+, respectively. The main difference between this model and the
classic one relies on the biological meaning of the transference rate λ. Medical
science uses linguistic notions to characterize disease stages and to specify anti-retroviral therapy. Fuzzy set theory provides the formal framework to model
linguistic descriptions such as the transference rate λ using expert knowledge.
1. INTRODUCTION
In the last decade, the mathematical literature on imprecision and uncertainty has grown
considerably, especially in system modeling, control theory, and engineering areas. More
recently, several authors have used fuzzy set theory in epidemiology problems [6] and
population dynamics [2].
Since the advent of the HIV infection, several mathematical models have been
developed to describe its dynamics [3] and [5]. In this paper fuzzy set theory, introduced in
the sixties by Lotfi Zadeh [9], is used to deal with imprecision in the time instant or
moment at which AIDS begins its manifestation. Our study considers the transference rate
λ, as used in the classical Anderson model [3], as a fuzzy set.
Fuzzy set theory is a genuine generalization of classical set theory [1], [7] and [9], useful
to model unsharp classes. In this paper, the parameter λ is viewed as a linguistic variable
whose values are fuzzy sets that depend on the viral load v and on the CD4+ level.
CD4+ is the main T lymphocyte attacked by the HIV retrovirus when it reaches the
bloodstream. The notion of the rate λ as a linguistic variable with fuzzy values captures its
biological meaning more faithfully [6] and [7], helping to classify the disease stages and to decide
when antiretroviral therapy should be used. We assume that, initially, the fraction of
R.S. da Motta Jafelice et al. / Fuzzy Rules in Asymptomatic HIV Virus 209
infected asymptomatic individuals x is maximum and equal to 1, and that the fraction of
AIDS symptomatic individuals y is null. The next section introduces the main definitions
needed in this paper.
2. PRELIMINARY DEFINITIONS
[Figure: block diagram of the fuzzy rule-based system: inputs, rule base, inference, output.]
3. CLASSIC MODEL
x(t) = e^(−λt)   (2)
4. FUZZY MODEL
When HIV reaches the bloodstream, it attacks mainly the T lymphocytes of the
CD4+ type. The amount of CD4+ cells in peripheral blood has prognostic implications for the
evolution of the HIV infection. Nowadays, the amount of immunocompetent cells is the most
clinically useful and accepted measure for treating HIV-infected individuals, although it
is not the only one. We can classify the amount of CD4+ cells/ml in the peripheral blood
into four ranges (see: www.aids.gov.br):
1. CD4+ > 0.5 cells/ml: stage of infection by HIV with low risk of developing disease.
2. CD4+ between 0.2 and 0.5 cells/ml: stage characterized by the appearance of signs and
minor symptoms or constitutional alterations. Moderate risk of developing opportunistic
diseases.
3. CD4+ between 0.05 and 0.2 cells/ml: stage with high possibility of developing opportunistic
diseases.
4. CD4+ < 0.05 cells/ml: high risk of getting opportunistic diseases such as Kaposi's sarcoma.
High life risk and low survival chances.
On the other hand, a low HIV viral load is not enough to destroy all the CD4+ lymphocytes
of the organism. Thus, antibodies have a chance to act against opportunistic diseases. A high
viral load destroys large quantities of CD4+ lymphocytes and the immunologic system may
lose its function.
In the beginning of treatment (or when a change of anti-retroviral therapy occurs), the literature
recommends viral load exams within a one- to two-month period to evaluate the treatment.
The results should be interpreted in the following way:
1. Viral load below 10,000 copies of RNA per ml: low risk of disease progression.
2. Viral load between 10,000 and 100,000 copies of RNA per ml: moderate risk of
disease progression.
3. Viral load above 100,000 copies of RNA per ml: high risk of disease progression.
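The two sets of ranges lend themselves to a direct transcription. Where the text leaves boundary values ambiguous, the cut-offs below are an assumption of ours:

```python
def viral_load_risk(copies_per_ml):
    """Risk of disease progression from the viral load, following
    the three ranges above (boundary handling is an assumption)."""
    if copies_per_ml < 10_000:
        return 'low'
    if copies_per_ml <= 100_000:
        return 'moderate'
    return 'high'

def cd4_stage(cells_per_ml):
    """Stage (1-4) of the HIV infection from the CD4+ count ranges
    above: 1 = low risk of disease ... 4 = high life risk."""
    if cells_per_ml > 0.5:
        return 1
    if cells_per_ml > 0.2:
        return 2
    if cells_per_ml > 0.05:
        return 3
    return 4
```

These crisp classifiers are exactly the kind of threshold rules that the fuzzy model of the next sections replaces with gradual membership.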
The identification of the disease stages and their respective treatment is based on the
relation between the viral load and the CD4+ cell level. Control of the viral load and of the CD4+
cell level may interfere in the control of the transference rate λ.
Thus, the conversion from an asymptomatic individual to a symptomatic individual
depends on the individual characteristics, as measured by the viral load v and CD4+.
Therefore, we suggest the following modification of (1):
dx/dt = −λ(v, CD4+) x,   x(0) = 1   (3)
The difference between this and the first model (1) is that now the parameter
λ = λ(v, CD4+) has a clear biological meaning and thus is a more faithful characterization
of λ. From the mathematical point of view we can think of (3) as a parametric family of
systems. In this case, λ is a parameter dependent on v and CD4+. It seems reasonable
that λ, and consequently the population y, can be controlled via v and CD4+. From
(3) we have

x(t) = e^(−λ(v,CD4+)t),   y(t) = 1 − e^(−λ(v,CD4+)t),   t ≥ 0.   (4)
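Expression (4) is easy to evaluate numerically once a value of λ(v, CD4+) is fixed; the sketch below treats λ as a plain number, and the value chosen is arbitrary:

```python
import math

def x(t, lam):
    """Fraction of asymptomatic individuals: x(t) = exp(-lambda t)."""
    return math.exp(-lam * t)

def y(t, lam):
    """Fraction of symptomatic (AIDS) individuals: y = 1 - x."""
    return 1.0 - math.exp(-lam * t)

# With any fixed lambda the two fractions always sum to 1, and x
# decays monotonically from x(0) = 1, as stated in the text.
lam = 0.5   # arbitrary illustrative value of lambda(v, CD4+)
```

Varying lam here corresponds to moving through the parametric family of systems described above.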
From the rule base introduced in the previous section and from the inference method
adopted, we simulate the viral load v and the CD4+ level of an HIV-positive individual
during sixty months and obtain the output λ = λ(v, CD4+) as depicted in figure 5. A
cross-section of the λ surface along a plane parallel to the CD4+ level axis is shown in
figure 6.
According to section 4, CD4+ is the most useful parameter to control and to
diagnose HIV. A more detailed study can be done assuming that the transference rate is
λ = λ(c), where c = CD4+ in the model (4). We assume λ given by the function of figure
6, that is, we assume λ(c) as follows:

λ(c) = 1 if c ≤ c_min;  λ(c) = (c_M − c)/(c_M − c_min) if c_min < c < c_M;  λ(c) = 0 if c_M ≤ c ≤ c_max.
Figure 7: Transference rate λ as a function of CD4+.  Figure 8: Membership function ρ adopted for C.
In figure 7, c_min represents the minimum CD4+ level for which the individual
becomes symptomatic, c_M represents the CD4+ level for which the chance to become
symptomatic is minimum, and c_max is the largest possible quantity of CD4+. In figure 6
we can observe the approximate values of c_min and c_M, that is, c_min is approximately 0.3
cells/ml and c_M is approximately 0.9 cells/ml. These values are compatible with reality:
if CD4+ is less than 0.3 cells/ml the tendency is for the individual to become symptomatic,
and when CD4+ is greater than 0.9 cells/ml the tendency is for the individual to be
asymptomatic. Thus, the number of asymptomatic and symptomatic individuals at the time
instant t is:

x(t, c) = e^(−λ(c)t),   y(t, c) = 1 − e^(−λ(c)t).
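The piecewise transference rate assumed here can be transcribed directly; the default values 0.3 and 0.9 cells/ml are the approximate c_min and c_M read off figure 6 in the text:

```python
def transfer_rate(c, c_min=0.3, c_M=0.9):
    """Transference rate lambda(c): 1 below c_min, linearly
    decreasing on [c_min, c_M], 0 above c_M (shape of figure 7)."""
    if c <= c_min:
        return 1.0
    if c < c_M:
        return (c_M - c) / (c_M - c_min)
    return 0.0
```

Below c_min the individual converts at the maximal rate; above c_M conversion effectively stops, matching the tendencies described in the paragraph above.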
2. μ(A) ≤ μ(B) if A ⊆ B.
The value of the fuzzy expectancy of the asymptomatic individuals of the fuzzy set
x = x(t, c) is given by [8]

FEV[x] = sup_{0≤α≤1} inf[α, μ{x ≥ α}]

where {x ≥ α} = {c : x(c) ≥ α} for each t and μ is a fuzzy measure. Let
H(α) = μ{c : x(c) ≥ α}, for each t ≥ 0. Here we suggest the following fuzzy measure:
μ(A) = sup_{c∈A} ρ(c) if A ≠ ∅;   μ(A) = 0 if A = ∅.
Let A = [a, c_max], where a = c_M + (c_M − c_min)(ln α)/t. Observe that c_min < a < c_M. Note
that μ is an optimistic measure, in the sense that the CD4+ level of a group is evaluated by the
individual with the best CD4+ level.
Besides c ∈ [0, c_max], we assume that the CD4+ level of the HIV-positive group
studied (C) has different possibilities of occurrence. We assume C as a triangular fuzzy set (see
figure 8) given by:

ρ(c) = 0 if c ≤ c̄ − δ;
ρ(c) = (1/δ)(c − c̄ + δ) if c̄ − δ < c ≤ c̄;   (6)
ρ(c) = −(1/δ)(c − c̄ − δ) if c̄ < c ≤ c̄ + δ;
ρ(c) = 0 if c > c̄ + δ.
The parameter c̄ is the modal value and δ is the dispersion of each of the fuzzy sets that
define the values of the linguistic variable. These fuzzy sets are defined from the values c_min,
c_M and c_max of the definition of λ. Thus, we compute the fuzzy expectancy by looking at three
cases:
1st case: CD4+ low (C−). We assume c_min > c̄ + δ. Thus, FEV[x] = e^(−t).
H(α) = 1, if 0 ≤ α ≤ e^(−λ(c̄)t);
H(α) = ρ(a(α)), if e^(−λ(c̄)t) < α < e^(−λ(c̄+δ)t);   (7)
H(α) = 0, if e^(−λ(c̄+δ)t) ≤ α ≤ 1,

where ρ(a(α)) = −(1/δ)[c_M + (c_M − c_min)(ln α)/t − (c̄ + δ)]. As H(α) is continuous and decreasing,
it has a unique fixed point that coincides with FEV[x]. Thus we obtain the following
inequality:

FEV[x] ≤ e^(−λ(c̄+δ)t)   (8)

The inequality above shows that even in the best hypothesis (μ optimistic) it is possible for FEV[x]
to be less than 1, since it is inferior to a possible solution e^(−λ(c̄+δ)t).
In this way, for each t > 0 there exists a unique c(t) ∈ (c̄, c̄ + δ), where
c(t) = c_M + (c_M − c_min)(ln α(t))/t. Thus, FEV[x] = e^(−λ(c(t))t) is an exponential curve and,
because the CD4+ level increases with t, FEV[x] is decreasing. Consequently, the fuzzy
expectancy is not a solution of (3). Actually, at each instant t, FEV[x] coincides with the
unique solution of (3). It is easy to verify that FEV[x] is differentiable and that it satisfies
the following differential equation with the time-dependent parameter λ:

d(FEV[x])/dt = −[λ(c(t)) + t λ′(c(t)) dc/dt] FEV[x]   (9)
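The sup-min definition of the fuzzy expectancy, and the upper bound discussed above, can be checked numerically. The sketch below discretizes α and c; the values c_min = 0.3 and c_M = 0.9 are from the text, while c̄ = 0.6, δ = 0.1 and t = 2 are illustrative assumptions of ours, not the paper's simulation data:

```python
import math

# lambda(c): 1 below c_min, linear on [c_min, c_M], 0 above (figure 7).
# rho(c): triangular membership centered at c_bar with dispersion
# delta (figure 8).  c_bar, delta and t are illustrative assumptions.
c_min, c_M, c_bar, delta, t = 0.3, 0.9, 0.6, 0.1, 2.0

def lam(c):
    if c <= c_min:
        return 1.0
    return (c_M - c) / (c_M - c_min) if c < c_M else 0.0

def rho(c):
    return max(0.0, 1.0 - abs(c - c_bar) / delta)

def fev(t, n_alpha=500, n_c=1000):
    """FEV[x] = sup_alpha min(alpha, H(alpha)) on a grid, where
    H(alpha) = mu{c : x(c) >= alpha} and mu(A) = sup_{c in A} rho(c)
    is the optimistic measure suggested in the text."""
    pairs = [(math.exp(-lam(i / n_c) * t), rho(i / n_c))
             for i in range(n_c + 1)]
    best = 0.0
    for j in range(n_alpha + 1):
        a = j / n_alpha
        h = max((r for xv, r in pairs if xv >= a), default=0.0)
        best = max(best, min(a, h))
    return best

value = fev(t)
# value lies between exp(-lam(c_bar) t) and exp(-lam(c_bar + delta) t)
```

With these parameters the computed expectancy sits strictly between the two exponential bounds, as the analysis predicts.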
Note that for the three cases studied here, we obtain the following inequality:

e^(−λ(c̄)t) ≤ FEV[x] ≤ e^(−λ(c̄+δ)t).
8. CONCLUSION
The main difference between the deterministic and the fuzzy model is that the fuzzy
model allows imprecise parameters and defuzzification at any time. In deterministic
modeling, defuzzification is done at the beginning of the mathematical modeling. We may
state that deterministic models are particular instances of fuzzy models. In our case, the
deterministic model (3) results in the solution x(t) = e^(−λ(c̄)t). The fuzzy model also provides a
unique curve FEV[x] when decisions are necessary. The fuzzy expectancy of asymptomatic
individuals is bounded below by e^(−λ(c̄)t). Therefore, the fuzzy model overestimates the
number of asymptomatic individuals. We emphasize that, despite using an optimistic fuzzy
measure to evaluate the CD4+ level in the population, FEV[x] is less than e^(−λ(c̄+δ)t) and
thus depends on the population dispersion δ of the CD4+ level of the group studied.
From (8), we see that when δ → 0, FEV[x] → e^(−λ(c̄)t), that is, it depends only on CD4+,
indicating a policy for treatment. Clearly, the fuzzy model provides a clearer and
9. REFERENCES
[1] Barros, L.C., R.C. Bassanezi and M.B. Leite. The Epidemiological Models SI with
Fuzzy Parameter of Transmission (submitted).
[2] Krivan, V. and G. Colombo. A Non-Stochastic Approach for Modeling Uncertainty in
Population Dynamics. Bulletin of Mathematical Biology (1998) 60.
[3] Murray, J.D. 1990. Mathematical Biology. Springer-Verlag, Berlin.
[4] Nguyen, H.T. and E.A. Walker. 2000. A First Course in Fuzzy Logic. Chapman &
Hall/CRC.
[5] Nowak, M.A. 1999. The Mathematical Biology of Human Infections. Conservation
Ecology 3.
[6] Ortega, N.S. 2001. Aplicação da Teoria de Lógica Fuzzy a Problemas da Biomedicina.
PhD thesis, University of São Paulo (in Portuguese).
[7] Pedrycz, W. and F. Gomide. 1998. An Introduction to Fuzzy Sets: Analysis and Design.
Massachusetts Institute of Technology.
[8] Sugeno, M. 1974. Theory of Fuzzy Integrals and Its Applications. PhD thesis, Tokyo
Institute of Technology, Japan.
[9] Zadeh, L.A. 1965. Fuzzy Sets. Information and Control 8: 338-353.
216 Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press. 2002
1 Introduction
The use of Category Theory and its constructors is very common in the study of Algebraic
Specifications. In what concerns the construction of modular specifications, the colimit operation plays a special role [3]: it makes it possible to reduce a gamut of traditional ways of
composing specifications to a single operation. Parameterization, renaming of components
and sum of specifications are examples of operations that can be well described by colimits. Limits, however, do not appear very frequently as combinators of specifications. Their role
is usually restricted to combining sorts. For example, in [7] we can find the product of
sorts as one of the basic operations. A possible justification for this fact is that, concerning the modular construction of specifications, we first think about using pre-existing
specifications to build richer ones. For this task the colimit is the appropriate operation. But
the opposite direction also seems to be very fruitful: to build less informative specifications
from pre-existing richer ones. Consider, for example, the case where we have two specifications: equivalence relation and partial order. We want to specify another relation that has a
common intersection with these pre-existing ones: it is also a pre-order. Thus we just need a
way of extracting what these two specifications have in common. We claim that limit is the
appropriate operation to perform this task.
In this paper, we present a definition of limits in the category of signatures. We extend
this definition to the category of specifications showing how limits can be used as another
modular way of combining specifications. Finally, we show by example how inheritance and
reuse can be achieved from limits.
I. Cafezeiro and E.H. Haeusler / Categorical Limits 217
The paper is organised as follows: in section 2 we present the basic definitions of signatures and the category Sign of signatures. In 3 we define limits in Sign. Section 4 presents
definitions of specifications and the corresponding category Spec. In 5 we define limits in
Spec and present an algorithm to ensure that the text of specifications will be finite. In section 6 we present the example to illustrate reuse. Finally, in 7 we conclude and comment on
related work.
2 Algebraic Signatures
We adopt here a notation that slightly differs from that of [5] for algebraic signatures. We
put together ranks of the same size. Signature morphisms, however, are restricted to ranks (as
before) and not to rank sizes.
Definition 1 An algebraic signature Σ is a pair <S, F> where S is a set (of sort names) and F is an indexed family of sets (of operator names). For a given f : u → s ∈ F_n, where u ∈ S* and s ∈ S, |us| (the length of us) is equal to n. We say that f : u → s has rank us, and we denote by F_us the subset of F_n whose members have rank us.
Definition 2 Given Σ = <S, F> and Σ' = <S', F'>, algebraic signatures, a signature morphism σ : Σ → Σ' is a pair <σ_S : S → S', σ_F> where σ_S is a function and σ_F = <σ_{F_n} : F_n → F'_n> is a family of functions, such that:
• if σ_{F_n}(f) = f' then rank(f) = us and rank(f') = σ_S*(u)σ_S(s);
• if F_n ≠ ∅ then σ_F(F_n) ≠ ∅.
The function rank returns the rank of a function, or of a set of functions.
σ_{F_us} refers to the restriction of σ_{F_n} whose domain is the set of functions with rank us.
The collection of algebraic signatures and signature morphisms form a category which
will be called Sign.
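As an illustration, the two definitions above can be rendered in code. This is a hypothetical sketch of ours, not from the paper: a signature is a dictionary with a sort set and a map from ranks (tuples of sort names, the string us) to sets of operator names, and `is_morphism` checks the two conditions of Definition 2.

```python
def is_morphism(sig1, sig2, sort_map, op_map):
    """Check the two conditions of Definition 2 for a candidate morphism:
    (1) an operator of rank us must be mapped to an operator of rank
        sort_map*(u) sort_map(s);
    (2) every non-empty operator set of sig1 must have a non-empty image."""
    for rank, names in sig1['ops'].items():
        target_rank = tuple(sort_map[s] for s in rank)
        images = {op_map[f] for f in names if f in op_map}
        if not images:                                  # condition (2) fails
            return False
        if not images <= sig2['ops'].get(target_rank, set()):
            return False                                # condition (1) fails
    return True

# The counter-example discussed in the text below: no morphism Sigma1 -> Sigma2.
sigma1 = {'sorts': {'a', 'b', 'c'},
          'ops': {('a', 'b', 'c'): {'f_abc'}, ('a', 'a', 'a'): {'f_aaa'}}}
sigma2 = {'sorts': {'p', 'r'},
          'ops': {('p', 'r'): {'g_pr'}, ('p', 'p', 'r'): {'g_ppr'}}}

# Linking a, b to p and c to r sends rank aaa to ppr's missing sibling ppp:
assert not is_morphism(sigma1, sigma2,
                       {'a': 'p', 'b': 'p', 'c': 'r'}, {'f_abc': 'g_ppr'})
# The identity is, of course, a morphism:
assert is_morphism(sigma1, sigma1,
                   {s: s for s in sigma1['sorts']},
                   {'f_abc': 'f_abc', 'f_aaa': 'f_aaa'})
```

The dictionary representation and the function names are ours; they merely make the rank-preservation and non-emptiness conditions executable.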
3 Limits in Sign
In this section we define limits in Sign and exemplify with the particular case of products. In [1] the reader can find descriptions of particular kinds of limits in the category of signatures, together with illustrative examples.
Recall from the definition of morphisms in Sign that, in order to have a morphism between two signatures, it is necessary to have a function σ_{F_k} for each rank length k of the domain signature. Moreover, for each rank us of the domain signature, there must be at least one operation name in the rank σ_S*(u)σ_S(s). For example, there is no morphism between the two signatures Σ_1 = <{a, b, c}, <{F_abc, F_aaa}>> and Σ_2 = <{p, r}, <{F_pr}, {F_ppr}>>. From Σ_1 to Σ_2, an attempt to define σ_S : {a, b, c} → {p, r} could be linking a and b to p and c to r. In this way, it would be possible to define σ_{F_abc} : F_abc → F_ppr. But σ_{F_aaa} : F_aaa → F_ppp would remain undefined, as F_ppp does not exist.
In the opposite direction, the option would be linking both p and r to a, so that we could define σ_{F_ppr} : F_ppr → F_aaa. But, again, the rank pr would not have a corresponding F_aa in Σ_1.
Now, we can define the limit of a diagram in Sign component by component: the limit of sorts and the limit of each rank length that appears in all the signatures that compose the diagram. The operation names whose rank length is in some, but not in all, of the signatures will not be present in the limit signature.
Definition 3 Let Σ = <S, F> and Σ_i = <S_i, F_i>. Given a diagram D with objects Σ_i and morphisms φ_j, a D-cone ρ_i : Σ → Σ_i is a limit of D if and only if:
ρ_{iS} : S → S_i is the limit of sorts, and
F is the family of limits ρ_{iF_k} : F_k → F_{ik} for each k, rank length of all the signatures Σ_i.
To see that the above definition is indeed the limit in Sign we must verify the universal property of limits. Consider another D-cone ρ' : Σ' → Σ_i that could also be a limit of D. By the definition of morphisms in Sign and by the fact that both Σ and Σ' are apexes of cones over D, they do not have a rank length k that is not present in all the signatures Σ_i. Thus Σ and Σ' have at most the rank lengths that are present in all the signatures of D. Then, either Σ and Σ' have the same rank lengths or Σ has some rank lengths that Σ' does not have. In both cases, the unique morphism from Σ' to Σ is ensured by the universal property component by component (sorts and each rank length).
We state the universal property of limits:
Given a diagram D with objects Σ_i and morphisms φ_j, and the cone ρ : Σ → Σ_i, a limit for D. For any other cone ρ' : Σ' → Σ_i there exists a unique morphism τ : Σ' → Σ such that ρ_i ∘ τ = ρ'_i, for each i, index of an object of D.
The universal property restated for a diagram without morphisms and with just two signatures characterises the universal property of a product:
Given a diagram D with Σ_1 and Σ_2 as objects and without morphisms, let Σ be a limit for D (thus, there are morphisms π_1 : Σ → Σ_1 and π_2 : Σ → Σ_2). For any other Σ' with morphisms ρ'_1 : Σ' → Σ_1 and ρ'_2 : Σ' → Σ_2, there exists a unique morphism τ : Σ' → Σ such that π_1 ∘ τ = ρ'_1 and π_2 ∘ τ = ρ'_2.
Note that, to play the role of Σ', a signature must have at most the rank lengths that are present in both Σ_1 and Σ_2. Otherwise, the morphisms ρ'_1 : Σ' → Σ_1 and/or ρ'_2 : Σ' → Σ_2 would not exist.
As an example, the product of the two signatures Σ_1 = <{a, b, c}, <{F_abc, F_aaa}>> and Σ_2 = <{p, r}, <{F_pr}, {F_ppr}>> would be Σ_1 × Σ_2 = <{ap, ar, bp, br, cp, cr}, <{F_abc × F_ppr, F_aaa × F_ppr}>>.
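This component-wise construction can be sketched in code (hypothetical code of ours, using the same dictionary representation of signatures as above; the name `signature_product` is not from the paper):

```python
def signature_product(sig1, sig2):
    """Product of two signatures, computed component by component as in
    Definition 3: sorts are pairs of sorts, and operators are pairs of
    operators whose ranks have a length present in BOTH signatures."""
    sorts = {s1 + s2 for s1 in sig1['sorts'] for s2 in sig2['sorts']}
    common_lengths = ({len(r) for r in sig1['ops']}
                      & {len(r) for r in sig2['ops']})
    ops = {}
    for r1, names1 in sig1['ops'].items():
        for r2, names2 in sig2['ops'].items():
            if len(r1) == len(r2) and len(r1) in common_lengths:
                rank = tuple(a + b for a, b in zip(r1, r2))
                ops.setdefault(rank, set()).update(
                    f + 'x' + g for f in names1 for g in names2)
    return {'sorts': sorts, 'ops': ops}

# The example above: Sigma1 has only rank length 3 while Sigma2 has
# lengths 2 and 3, so the product keeps only rank length 3.
sigma1 = {'sorts': {'a', 'b', 'c'},
          'ops': {('a', 'b', 'c'): {'f_abc'}, ('a', 'a', 'a'): {'f_aaa'}}}
sigma2 = {'sorts': {'p', 'r'},
          'ops': {('p', 'r'): {'g_pr'}, ('p', 'p', 'r'): {'g_ppr'}}}
prod = signature_product(sigma1, sigma2)
assert prod['sorts'] == {'ap', 'ar', 'bp', 'br', 'cp', 'cr'}
assert set(prod['ops']) == {('ap', 'bp', 'cr'), ('ap', 'ap', 'ar')}
```

Note how the rank length 2 of Σ_2 disappears, exactly as stated for the limit signature.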
4 Algebraic Specifications
Definition 5 Given Sp = <Σ, Φ> and Sp' = <Σ', Φ'>, algebraic specifications, a specification morphism from Sp to Sp' is a pair <σ : Σ → Σ', a> such that σ is a signature morphism and a : Φ → Φ' exists if and only if for all e ∈ Φ, Φ' ⊢ Sen(σ)(e).
In the above definition, Sen is a functor from the category Sign to Set that maps a signature to the set of all sentences that can be written in that signature. Sentences (equations) are pairs of terms of the same sort, denoted by t_i = t_j. Terms of a signature are those obtained by the application of the two following items:
For a given signature <S, F>, and a set X_s of variables for each s ∈ S:
(i) A variable x ∈ X_s is a term of sort s;
(ii) If t_1, ..., t_n are terms of sorts s_1, ..., s_n, and f : s_1, ..., s_n → s is an operation symbol of the signature, then f(t_1, ..., t_n) is a term of sort s.
When applied to a morphism σ : Σ → Σ', the functor Sen gives a function Sen(σ) : Sen(Σ) → Sen(Σ') that translates each sentence of the signature Σ to a sentence of the signature Σ'. By Sen(σ)(e) we mean the translation of an equation e of Φ to the signature Σ'. By Sen(σ)(Φ) we mean the translation of all equations of Φ to the signature Σ'. Thus, given two specifications, there is a morphism linking the first to the second if and only if the translation of each equation of the first can be proved in the second.
The collection of algebraic specifications and specification morphisms forms a category which will be called Spec.
5 Limits in Spec
In this section we show that Spec has limits in general. In [2] the reader can find descriptions of particular kinds of limits in the category of specifications, together with illustrative examples.
Limits in Spec will be constructed as pairs of limits: in the signature (as defined in section 3) and in the set of equations.
Let us call Cn(Φ) the set of provable equations from Φ, by the following set of rules:
(r) : x_s = x_s
(s) : t_1 = t_2 → t_2 = t_1
(t) : t_1 = t_2, t_2 = t_3 → t_1 = t_3
(cong) : t_1 = t'_1, ..., t_n = t'_n → f(t_1, ..., t_n) = f(t'_1, ..., t'_n)
(subs) : t_1(y) = t_2(y), t_3 = t_4 → t_1(t_3) = t_2(t_4)
In (r), for each sort s, x_s is a distinguished variable of s used only to state the reflection rule. The usual reflection rule is obtained from this by the application of (subs). In (cong), f is any operation symbol and the sequence of terms is consistent with the rank of f. In (subs), y is a variable of the same sort as t_3 and t_4. The notation t(y) on the left means that y appears at least once in t, and the notation t_i(t_j) means that one instance of y is replaced by t_j in t_i.
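A toy, terminating fragment of Cn can be sketched for ground equations by closing a finite set under (s) and (t) only ((r), (cong) and (subs) would require a term language and can make the closure infinite). The code and names below are ours, not the paper's:

```python
def close_sym_trans(axioms):
    """Close a finite set of ground equations (pairs of term names)
    under the rules (s) symmetry and (t) transitivity -- a terminating
    fragment of the Cn operator defined above."""
    eqs = set(axioms)
    while True:
        new = {(b, a) for (a, b) in eqs}                                # (s)
        new |= {(a, d) for (a, b) in eqs for (c, d) in eqs if b == c}   # (t)
        if new <= eqs:          # nothing new is derivable: fixed point
            return eqs
        eqs |= new

# From t1 = t2 and t2 = t3 we can derive, e.g., t3 = t1:
closure = close_sym_trans({('t1', 't2'), ('t2', 't3')})
assert ('t3', 't1') in closure
```

Because the set of term names is finite, the iteration reaches a fixed point; here the closure is the full equivalence relation on {t1, t2, t3}.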
Definition 6 Let Sp = <Σ, Φ> and Sp_i = <Σ_i, Φ_i>. Given a diagram D with objects Sp_i and morphisms <φ_j, a_j>, a D-cone <ρ_i, β_i : Φ → Φ_i> : Sp → Sp_i is a limit of D if and only if:
ρ_i is the limit of signatures, and
Φ = ∩_i Ψ_i, where Ψ_i = {e | Sen(σ_i)(e) ∈ Cn(Φ_i)}.
The set Φ is intended to be just the intersection of the consequences of the Φ_i, but as the sets Φ_i may not be written in the same signature, we need the translation provided by Sen(σ_i). Recall that σ_i : Σ → Σ_i may not be injective, as is usual for projections. Thus, for a symbol in Σ_i we may have many corresponding symbols in Σ. As a consequence, the number of axioms in the limit specification may grow considerably. In addition, σ_i may not be surjective. In this case, some symbols in Σ_i may not have any corresponding symbol in Σ. This is not a problem, as axioms written with them could not be proved in the other specifications, and thus do not belong to the limit.
We assume that Cn(Φ_i) is decidable; otherwise the specifications would not be of practical computational use. The set of equations of a specification may not be finite, but it must be recursive. The intersection Φ is also decidable, and thus can be recursively described. This is enough for considering Sp a specification. It would be desirable, however, to be able to construct this object. It is a very common situation in the practice of algebraic specifications that objects have both Σ_i and Φ_i finite. In the following we describe the construction of the limit in such a case.
Consider a diagram with objects Sp_i = <Σ_i, Φ_i> and morphisms <φ_j, σ_j>, where Σ_i and Φ_i are finite.
Let us enumerate the set of sentences in Φ in the same way we enumerate strings: by size, and, for strings of the same size, by a lexicographic order based on the enumeration of symbols of Σ ∪ V, where V is a sufficiently big finite set of variables for each sort. For an enumeration e_1, e_2, ... of Φ, consider the following chain of subsets of Φ:
F_n = F_{n-1} ∪ {e_n}, if e_n ∉ Cn(F_{n-1});
F_n = F_{n-1}, otherwise.
To get a finite set of equations that axiomatizes Φ we follow this chain until reaching a set F which has the following property (*):
Note that F is not just the set of equations that belong to the union of the specifications (axioms) and can be proved from all specifications, considering the appropriate translations. It also includes equations that can be proved in all specifications but are not axioms.
In addition, F solves the problem of equivalent, but syntactically different, equations: if e is an axiom of Sp_1, e' is an axiom of Sp_2 and e ≠ e' but e is equivalent to e', then, for i = 1, 2, Cn(∩{e | Sen(σ_i)(e) ∈ Φ_i}) would not give the correct set of consequences. ∩{e | Sen(σ_i)(e) ∈ Cn(Φ_i)} gives the desired set, but is infinite. Syntactical differences are solved by the enumeration of F: among equivalent equations a representative will be present in F (by (*)), and thus all equations equivalent to it will be consequences of F.
Proposition 1 shows that F is indeed the limit.
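The chain F_n described above can be sketched as follows. This is our reconstruction of a partially garbled display: `provable(F, e)` is a parameter standing in for the assumed decidable relation e ∈ Cn(F), not an algorithm given by the paper.

```python
def finite_axiomatization(enumeration, provable):
    """Follow the chain F_0 <= F_1 <= ... over an enumeration
    e_1, e_2, ... of sentences: e_n is kept only if it is not already
    provable from F_{n-1}, so F accumulates no redundant axioms."""
    F = []
    for e in enumeration:
        if not provable(F, e):      # e_n not in Cn(F_{n-1}): keep it
            F.append(e)
    return F

# Toy decidable 'Cn': a sentence is provable only if literally present,
# so the chain merely skips repeated sentences.
axioms = finite_axiomatization(['e1', 'e2', 'e1', 'e3'],
                               lambda F, e: e in F)
assert axioms == ['e1', 'e2', 'e3']
```

With a genuine decision procedure for Cn in place of the toy lambda, the same loop discards equations already derivable from earlier ones, which is how syntactically different but equivalent equations are reduced to a single representative.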
from P_{e_1} and P_{e_2} by (t). By hypothesis, since |P_{e_1}| ≤ k and |P_{e_2}| ≤ k, Γ ⊢ e_1 and Γ ⊢ e_2. Then, by (t), Γ ⊢ e';
(cong) As |P_{e'}| = k + 1, there exists e with |P_e| ≤ k such that P_{e'} is obtained from P_e by (cong). By hypothesis, since |P_e| ≤ k, Γ ⊢ e. Then, by (cong), Γ ⊢ e';
(subs) As |P_{e'}| = k + 1, there exist e_1, e_2 with |P_{e_1}| ≤ k and |P_{e_2}| ≤ k such that P_{e'} is obtained from P_{e_1} and P_{e_2} by (subs). By hypothesis, since |P_{e_1}| ≤ k and |P_{e_2}| ≤ k, Γ ⊢ e_1 and Γ ⊢ e_2. Then, by (subs), Γ ⊢ e'.
6 Example
We turn now to the example cited in the introduction of this text to show how the limit is a promising operation to achieve reuse of specifications.
We transcribe below the specifications of equivalence relation and partial order. The boolean part of these specifications could be specified separately, as shown in [3], but we are avoiding the use of modularization operations other than the limit.
The limit specification will have as sorts the product of the sorts of both signatures, thus {BB, BS, SB, SS}. The operations will also be the product of the operations, respecting the length of ranks. We have, thus, four operations of rank length one: {TT : BB, TF : BB, FT : BB, FF : BB}. Of rank length two, we have just one operation, {¬¬ : BB → BB}. Finally, of rank length three, we have {ff, f∧, f→, f≈, ∧f, ∧∧, ∧→, ∧≈, →f, →∧, →→, →≈}. The sorts and operations whose names are composed of different names are irrelevant for the application and can easily be eliminated by restriction operations. What is relevant for the application is to show that ff, the operation that corresponds to f in both specifications, is still reflexive and transitive but is not symmetric or anti-symmetric.
For each Φ_i, the set Ψ_i is the set of equations written in Σ that, when translated by σ_i, belong to the theory of specification i. When translated by σ_j, for j ≠ i, it gives a set of equations that belong to the theory of specification j. Thus, ∩Ψ_i is the intersection of the theories, expressed in the language of Σ. Then, considering that f(a, a) = T is in both SP_ER and SP_PO, there will be in the limit Sp the axiom ff(aa, aa) = TT. In the same way, the transitive law will be present in Sp. The anti-symmetric law cannot be expressed in the limit Sp, as the symbol ≈ has no corresponding symbol in SP_ER. The equation f(a, b) = f(b, a) of SP_ER cannot have a corresponding ff(aa, bb) = ff(bb, aa) in the limit Sp, as its translation by σ_PO results in f(a, b) = f(b, a), which cannot be proved in SP_PO.
References
[1] I. Cafezeiro and E. H. Haeusler. Limits and Other Categorical Concepts in Algebraic Signatures: Toward Limits of Algebraic Specifications. Technical report, UFF, Rio de Janeiro, 2002.
[2] I. Cafezeiro and E. H. Haeusler. Limits in the Category of Algebraic Specifications. Technical report, UFF, Rio de Janeiro, 2002.
[3] H. Ehrig. Introduction to the Algebraic Theory of Graph Grammars. In V. Claus and G. Rozenberg, editors, 1st Graph Grammars Workshop, pages 1-69. Springer-Verlag, 1979. Lecture Notes in Computer Science 73.
[4] P. F. Blauth Menezes. Reificação de Objetos Concorrentes. PhD thesis, Instituto Superior Técnico, Lisboa, Portugal, 1997.
[5] A. Tarlecki. Algebraic preliminaries. In E. Astesiano, H.-J. Kreowski, and B. Krieg-Brückner, editors, Algebraic Foundations of Systems Specification. Springer, Berlin, 1999.
[6] A. Tarlecki, J. A. Goguen, and R. M. Burstall. Some fundamental algebraic tools for the semantics of computation, part III: indexed categories. Theoretical Computer Science, 91, 1991.
[7] R. Waldinger, Y. V. Srinivas, A. Goldberg, and R. Jullig. Specware Language Manual. Suresoft, Inc., Kestrel, 1996.
[8] G. Winskel and M. Nielsen. Categories in concurrency. In A. M. Pitts and P. Dybjer, editors, Semantics and Logics of Computation. Cambridge University Press, Cambridge, 1997.
Advances in Logic, Artificial Intelligence and Robotics
J. M. Abe and J. I. da Silva Filho (Eds.)
IOS Press, 2002
1 Introduction
Software development has to face two major problems: the cost of non-standard software - caused by long development times and the constant need for maintenance - and the lack of confidence in the reliability of software [13]. Many researchers are interested in providing techniques for developing reliable software, which is guaranteed to be correct and documented in a way that is easy to maintain and adapt. One of these research areas is called program synthesis, which proposes to generate automatically a correct program from a specification ([4,3,9,6,1]).
There are three basic categories of program synthesis: proofs-as-programs, transformational synthesis [14,7] and knowledge-based program synthesis [13]. Some authors [14,7] add another category, called inductive program synthesis.
Here, we deal with the proofs-as-programs paradigm [2], which avoids the double work of designing the implementation of the system and verifying the program - two activities that can be seen as the same programming process at different degrees of formality. This paradigm has thus focused on developing a program and its correctness proof at the same time [9,3].
This idea is based on the following facts: 1- Developing a program and proving that it is correct are just two aspects of the same problem [8]; 2- A proof for an application may be regarded as a (constructive) solution for a problem [17]; 3- A program can be extracted from a (constructive) proof of the existence of a solution for the corresponding problem [3]. Thus, using formal proofs - as a method for reasoning about the specification - and proving that the extraction process of a program preserves the proof semantics, we get an automated way to construct, from a mathematical specification, a program that is correct by construction.
The specification of the problem and the deductive rules provide information about the algorithm structures and the reasoning about their properties. There are many formal logical calculi to represent this properly, e.g., ITT (intuitionistic type theory) and GPT (general problem theory [15,17,18]). As we use GPT, we will give only a brief explanation of it.
The description of a problem in predicate logic can be viewed within the goals of GPT, since it is able to describe the input and output data as well as the relation between them. GPT considers problems as mathematical structures where solutions can be precisely treated, and provides a framework for studying problem-solving situations as well as problem solving itself. However, these pieces of information are not enough to assure the existence of a method that solves the problem.
Besides the specification in predicate logic, and given that the sentence that describes a problem is a theorem of the specification, if we obtain a constructive proof we will be able to understand it not only as a syntactic transcription, but also as a description of a given object - in other words, a description of an algorithm [10].
The Curry-Howard (C.H.) isomorphism associates the inference rules of natural deduction for intuitionistic logic (used in the proof) with the formation rules of λ-terms (commands in a programming language), in such a way that a proof of a formula α can be seen as a λ-term which has the type α. Hence, we can say that a proof has a computational interpretation, that is, it can be seen as a set of commands in a programming language, i.e., a program [11]. This isomorphism is the basis for the process of constructing a program from a proof, generally called "extraction of the computational contents of a proof"². This process extracts a function that relates the inputs to the specific outputs of the program. The inputs and outputs of the program reflect the application of the inference rules used to obtain the formula. The computational contents relate to the semi-computational contents, which describe the relations between the inputs and outputs of the program. The input and output variables of the program are represented, by the C.H. isomorphism, respectively, by the variables quantified by the universal quantifier and by the existential quantifier; so the theorem of the specification must be of the form ∀x∃y P(x, y).
There are many proposals for constructive program synthesis which use a constructive logic - for instance, ITT - to specify the problems. These systems use as deductive system the sequent calculus ([4,3,9]) or a rewriting mechanism [6], and construct programs in logical and functional programming languages.
Based upon those considerations, this work proposes a constructive synthesizer where the program is generated from a proof in natural deduction, avoiding the conversion that is used in the related work found in the literature. In this method, a program is constructed in an imperative programming language (Pascal-like) from a proof in many-sorted intuitionistic predicate logic. Using the concept of the semi-computational contents of a formula, we prove that the generated program is a true representation of the solution to the problem specified by the theorem of any many-sorted theory that describes the data types used by the problem.
In the next section, we present our constructive program synthesis process, which is composed of the labeling of the memory configurations of the program, followed by the association of each inference rule with commands in the imperative language. In section 3, the proof of correctness of the program synthesis is described, i.e., a proof that the generated program achieves the specification. Section 4 has an example of our constructive synthesis mechanism and, finally, in section 5 we present the conclusions of the work.
² For more on this concept, see section 2.
G. M. H. Silva et al. / Constructive Program Synthesis
In the process of program synthesis we start from the existence of a theorem prover for many-sorted intuitionistic predicate logic with arithmetic which, beyond the usual inference rules, has inference rules for equality and induction. The theorem prover constructs a normal proof in natural deduction for a given theorem, which is the input to the program synthesizer.
There are restrictions on the inference rules used by the proof (given as an input to the synthesizer): 1- the proof cannot use the negation introduction rule; 2- the existential elimination rule can only be applied on a formula that represents the inductive hypothesis. The last restriction³ can be weakened if we admit parameterized programs as solutions.
From the proof of the specified problem we extract the computational content. In order to accomplish this, we first map all the memory configurations for each inference rule (the labeling memory configuration process), and then we associate commands, in an imperative programming language, with each inference rule.
The labeling rules below must be analyzed in the bottom-up direction, according to the labeling process⁴.
[Labeling rules: the display is garbled in the source. It presented the labeling of top-formulae (axioms, whose variable set V and term set T⁵ are empty; hypotheses, whose sets V and T contain the program variables associated with the hypothesis), together with the labeling rules for universal quantifier elimination and introduction, existential quantifier introduction, conjunction elimination and introduction, and disjunction elimination and introduction⁶.]
⁴ More details in [19].
⁵ Output terms and input variables set.
⁶ An example of the application of this rule is given in section 4.
A formula has logic content when it is derived from an axiom or hypothesis and describes the nature of the objects used by the program to solve the proposed problem, i.e., it describes the data structures of a program and the set of operations that can be applied to them.
The semi-computational content of a formula is the set of information that expresses the relations between the input and the output data of a program - which is a solution for the problem specified by the formula - where for more than one input we can have one or more outputs.
The computational content of a formula is a function, within the semi-computational contents, which relates the inputs to the specific outputs of the program. It reflects the application of the inference rules used to obtain the formula.
We use the following notation to describe the program generation rules: 1- A : α_Γ, where A is a program that calculates the property described in α; 2- Δ, Ψ, Γ - programs; 3- σ - description of the memory allocation; 4- p - name of the program related to the formula.
Remark: The commands of the language in which the program will be produced have the same semantics as the equivalent commands in the usual imperative programming languages.
[Program generation rules: the display is garbled in the source. It presented the generation rules for top-formulae (axioms, labeled with "logical contents"; hypotheses, labeled with a program symbol p), universal quantifier elimination (distinguishing axioms and non-inductive hypotheses from the inductive hypothesis), conjunction introduction, disjunction elimination (generating the conditional if (α) then (Δ) else (if (β) then (Ψ))) and disjunction introduction.]
Implication elimination: The assertion associated with the implication elimination rule is
Ψ : α_Γ    Δ : (α → β)_Γ
[Δ ← {exec(p, v) = Ψ}] : β_Γ
where [Δ ← {exec(p, v) = Ψ}] denotes the substitution of the supposed procedure call (p) by the real procedure call (Ψ) in the program (Δ). According to the proof restrictions (seen above), the minor premise of the implication elimination rule always has logic contents. Thus, the program to be generated by the application of this rule is the program associated to the major premise. So, when the minor premise of the proof process has only logic content, we have the following assertion:
α_Γ    Δ : (α → β)_Γ
Δ : β_Γ
Induction: This rule is an alternative form of the application of the introduction of the universal quantifier. Consequently, the program generated will have the command related to the application of this rule (read(...)) and a recursive program formed by a conditional command, yielding A : ∀x∃y α(x, y).
Figure 2 - Schema of the program construction
Theorem: Let Π be a proof for a formula of the form ∀x∃y α(x, y), from an axiom set (Δ) and a set of hypotheses that are not axioms (θ), and let A be the program provided by the function GenProg(LabelMemoConfig(Π)); then: θ, Δ ⊨ A : ∀x∃y α(x, y).
Proof: The proof of the theorem is carried out by induction on the length of the proof, through the comparison of the syntactic semantics of the program with the semi-computational contents of the inference rules.
4 Example
In this section we show our program synthesis mechanism through an example, in which a program that calculates the remainder of a division is generated. The proof tree will be presented in blocks. The block of the main proof - which has the theorem to be proved - has a solid frame. The others, with dashed frames, represent the branches, which are connected to the others by the numbers presented in the upper left corner of each frame.
Example:
Proof Tree: Block Representation
[The block representation of the proof tree is not reproducible from the source; the fragment shown derives ∀x(α(x) ∧ β(x)) from α(a) and β(a).]
To make the proof easier to understand, we use infix notation for the addition, subtraction and multiplication operations, as well as for the equality and comparison predicates. The functional s(x) expresses the successor operation.
The program is generated from the proof of the theorem ∀v∀u((v > 0) → ∃r∃k((k * v + r = u) ∧ (r < v))) on the basis of the axioms ∀x(x * 1 = x), ∀x(x + 0 = x), ∀q(0 * q = 0), ∀z∀p((z = p) → (z - p = 0)), ∀z∀p((p > 0) → (z - p < z)), ∀z∀q((z * s(q)) = (z * q + z)) and ∀z∀p∀q((z = p - q) → (z + q = p)), and the hypothesis l = y.
** In the proof, the hypothesis y = l represents a memory restriction, where l has the value of y.
Remark: In our system the communication between the functions (parameter passing) is made by read and write file operations.
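The program this proof yields can be sketched as follows (written here in Python rather than the paper's Pascal-like language; the recursion mirrors the induction on u, with base case u < v):

```python
def remainder(u, v):
    """Witnesses r and k for the theorem
    forall v, u ((v > 0) -> exists r, k ((k*v + r = u) and (r < v))),
    computed by the repeated-subtraction recursion that the induction
    rule produces: if u < v, take r = u and k = 0; otherwise recurse
    on u - v and add one to the quotient k."""
    assert v > 0
    if u < v:
        return u, 0
    r, k = remainder(u - v, v)
    return r, k + 1

r, k = remainder(17, 5)
assert k * 5 + r == 17 and r < 5    # the specified property holds
```

The existential witnesses r and k are exactly the outputs of the extracted program, and the specified property k*v + r = u with r < v can be checked on every result.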
5 Conclusion
In this work we have presented an automatic method of program synthesis operating as follows: it transforms a given proof, based on a specification, into a program (in an imperative language), guaranteeing the correctness of the latter.
The syntactical restrictions imposed on the proof from which we extract the semi-computational contents may cause some loss of expressive power, thus limiting the domain of application of the synthesis procedure.
Among the main contributions of this work, we stress the proposal of a new synthesizer that generates legible programs in an imperative language, along with a correctness proof of this mapping. The other synthesizers found in the existing literature generate programs, in functional or logical programming languages, that are not very legible. Also, our constructive program synthesis procedure receives as input a declarative specification in predicate logic, which allows us to express the problem in a simpler way than the synthesizers using intuitionistic type theory (Nuprl [4], Oyster [3] and NJL [9]) and equational logic (Lemma [6]).
Among the directions in which to extend this work, we expect to investigate the feasibility of relaxing some restrictions on the proofs, so as to extract semi-computational contents from proofs that use the negation introduction rule or have existentially quantified formulas as hypotheses. Also, the usage of more than one proof method seems attractive, and program synthesizers based on such ideas are being investigated.
References
[1] BENL, H., BERGER, U., SCHWICHTENBERG, H., SEISENBERGER, M. and ZUBER, W. - "Proof theory at work: Program development in the Minlog system". In W. Bibel and P. H. Schmitt (eds.), Automated Deduction - A Basis for Applications, Vol. II, Kluwer, 1998.
[2] BATES, J. L. and CONSTABLE, R. L. - "Proofs as Programs". ACM Transactions on Programming Languages and Systems, 7(1):113-136, 1985.
[3] BUNDY, A., SMAILL, A. and WIGGINS, G. A. - "The synthesis of logic programs from inductive proofs". In J. Lloyd (ed.), Computational Logic, pp 135-149. Springer-Verlag, 1990.
[4] CALDWELL, J. L., IAN, P. and UNDERWOOD, J. G. - "Search algorithms in Type Theory". http://meru.cs.uwyo.edu/~jlc/papers.html
[5] CHANG, C. and LEE, R. C. - "Symbolic Logic and Mechanical Theorem Proving". Academic Press, 1973.
[6] CHARARAIN, J. and MULLER, S. - "Automated synthesis of recursive programs from a ∀∃ logical specification". Journal of Automated Reasoning, 21:233-275, 1998.
[7] DEVILLE, Y. and LAU, K. - "Logic Program Synthesis". Journal of Logic Programming, 1993:12:1-199.
[8] FLOYD, R. - "Assigning meaning to programs". Symposia in Applied Mathematics, 19:19-32, 1967.
[9] GOTO, S. - "Program synthesis from natural deduction proofs". International Joint Conference on Artificial Intelligence, 339-341, Tokyo, 1979.
[10] GIRARD, J., LAFONT, Y. and TAYLOR, P. - Proofs and Types. Cambridge University Press, 1989.
[11] HOWARD, W. A. - "The Formulae-as-Types Notion of Construction". In J. R. Hindley and J. P. Seldin (eds.), To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism. Academic Press, 1980.
[12] HOARE, C. A. R. and WIRTH, N. - "An Axiomatic Definition of the Programming Language PASCAL", December 1972. Acta Informatica, 2:335-355, Springer-Verlag, 1973.
[13] KREITZ, C. - "Program synthesis". In Automated Deduction - A Basis for Applications, pp 105-134, Kluwer, 1998.
[14] LAU, K. and WIGGINS, G. - "A tutorial on the Synthesis of Logic Programs from Specifications". In P. Van Hentenryck (ed.), Proceedings of the Eleventh International Conference on Logic Programming, pp 11-14, MIT Press, 1994.
[15] MARTIN-LÖF, P. - "Intuitionistic Type Theory". Edizioni di Filosofia e Scienza, Bibliopolis, 1984.
[16] MANNA, Z. and WALDINGER, R. - "A Deductive Approach to Program Synthesis". ACM Transactions on Programming Languages and Systems, 2(1):90-121, 1980.
[17] VELOSO, P. A. S. - "Outlines of a mathematical theory of general problems". Philosophia Naturalis, 21(2/4):234-362, 1984.
[18] HAEUSLER, E. H. - "Extracting Solutions from Constructive Proofs: Towards a Programming Methodology". Brazilian Electronic Journal on Mathematics of Computation (BEJMC), No. 0, Vol. 0, 1999. (http://gmc.ucpel.tche.br/bejmc)
[19] SILVA, G. M. H. - "Um Estudo em Síntese Construtiva de Programas utilizando Lógica Intuicionista". Dissertação de Mestrado, Departamento de Informática, PUC-Rio, 1999.
Advances in Logic, Artificial Intelligence and Robotics
J. M. Abe and J. I. da Silva Filho (Eds.)
IOS Press, 2002
1. INTRODUCTION
Expert systems (ESs) have been successfully applied in many areas (e.g. medicine, games and engineering). The use of expert human knowledge in computational systems has motivated an increasing amount of research in the ES domain [7][10].
Basically, an ES consists of facts about the observed system, causal relations (rules) which specify the different ways to change the facts, and an inference engine (IE) to effectively change them. The IE has an inference cycle, which is the process of matching rules and facts to obtain the satisfied rules when a new fact happens [7][10].
There are some difficulties in the development of ESs, such as the construction of an efficient IE with acceptable response time. If the goal is to design a small-scale ES, it would be trivial to develop a simple program where, in each inference cycle, all facts are matched against all rules. However, this approach results in a computational complexity that is exponential with respect to the number of rules [7][9][10].
It was needed more advanced policies to build lEs, like the very used Rete networks
proposed by Forgy [5]. The kernel of many commercial expert systems (e.g. ART, CLIPS,
OPS83 and ILOG Rules), designed to handle hundreds or thousands of rules, use this
technique or a variant thereof [9].
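The matching cost can be made concrete with a small sketch. The following Python loop (the facts and rules are invented for a manufacturing-style example; it is not code from any of the systems cited above) re-checks every rule against the whole fact base on every cycle, which is exactly the redundant work that Rete-style networks avoid:

```python
def naive_inference(facts, rules, max_cycles=100):
    """Forward-chain naively: on each cycle, test every rule's premises
    against the entire fact base and fire those that hold."""
    facts = set(facts)
    for _ in range(max_cycles):
        fired = False
        for premises, conclusion in rules:
            # Every premise is re-checked on every cycle: O(rules * facts)
            # work per cycle, combinatorial for multi-premise joins.
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                fired = True
        if not fired:          # quiescence: no new fact was inferred
            break
    return facts

result = naive_inference(
    {"robot_free", "part_in_storage"},
    [({"robot_free", "part_in_storage"}, "move_part"),
     ({"move_part"}, "lathe_loaded")])
```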
J.M. Simao and P.C. Stadzisz / An Agent-oriented Inference Engine 235
This paper proposes a new approach to implementing ESs, in which agents represent both facts and rules and the IE emerges from their relationships. The agents applied in this work are reactive and co-operative, following the relations established by an object-oriented architecture, which can be considered a multi-agent approach [1][8][11][14]. The main improvements are the elimination of search in the inference process, by using a notification mechanism, and the functional independence of the architecture's components, obtained by encapsulating and sharing the rule components and the elements of the fact base.
As an application instance, a Rule-Based System (RBS) derived from the proposed architecture is applied to the problem of Supervisory Control of an Automated Manufacturing System (SCAMS). A SCAMS is a discrete controller responsible for co-ordinating the factory components (e.g. lathes and robots) to carry out predefined production processes [2][4][12]. The design of a SCAMS is a complex activity due to the automation level, the required flexibility and the risks, costs and time involved. It is therefore an appropriate example to demonstrate the robustness of the proposed approach.
This article is organised as follows: Section 2 describes expert systems and production systems, Section 3 presents the proposed architecture, Section 4 presents the architecture for SCAMS and Section 5 presents the conclusions of this work.
the recency of facts in the WM or the simplicity/complexity of the rule), and the rules with the best scores are selected, in order, from the agenda. The execution step carries out the actions of the selected rules. These actions may either assert newly inferred facts into the WM or invoke functions external to the PS. These features are the essence of a PS.
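A sketch of this agenda-selection step follows; the scoring criteria (recency first, then rule complexity) and their ordering are assumptions for illustration, not the scoring used by any particular PS:

```python
def select_from_agenda(agenda):
    """agenda: list of (rule_name, recency, complexity) tuples.
    Prefer rules matched by the most recent facts, then the more
    complex (more specific) rule."""
    return sorted(agenda, key=lambda r: (r[1], r[2]), reverse=True)

order = select_from_agenda([("r1", 3, 1), ("r2", 5, 2), ("r3", 5, 1)])
# r2 and r3 match the most recent facts; r2 is the more complex rule
```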
To present the approach proposed in this paper and related concepts, the following two rules (in Figure 2) will be considered. These rules were developed to control a hypothetical manufacturing cell where the facts are frames.
In Rule 1, the components of the rule are indicated. This work considers that the Condition comprises Premises and the Action comprises Orders. Each Premise that compares the value of an attribute with a constant is called a Simple Premise (SP), and each Premise that compares the values of two attributes is called a Composed Premise (CP). In Rule 2, the composition of Premises and Orders is indicated from both the object-oriented and the frame points of view.
An agent can be defined as a software module with a high degree of cohesion, a well-defined scope and autonomy, taking part in a certain context whose changes are perceived by the agent. These perceptions may change the agent's behaviour, which may in turn promote further changes in the context [1][6][14].
3.2 Rules
A rule is computationally represented by a Rule Agent (RA), composed of a Condition
Agent (CA) and an Action Agent (AA), which have a causal relation. CA and AA
represent, respectively, the Condition and the Action of the rule.
A CA carries out the logical calculus for the RA as the conjunction of the boolean values of the connected Premise Agents (PAs). There are simple and composed PAs, respectively representing the SPs and CPs. A simple PA points to an AT of an FBA (the Reference), has a logical operator (the Operator) and has a comparison value (the Value). This PA obtains its boolean value by comparing the Reference with the Value using the Operator. In a composed PA, the Value may instead be a pointer to another AT. Any PA may be shared by several CAs.
An AA is connected to Order Agents (OAs), which represent the Orders of a rule. The OAs change the states of FBAs (by means of their MAs) referenced in the respective CA. Each AA is linked to a CA and is only prone to execution if the evaluation produced by this CA is true.
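The notification relationship among these agents can be sketched in a few lines. The class names below mirror the paper's PA/CA terminology, but the implementation details are assumptions, not the authors' code:

```python
import operator

class PremiseAgent:
    """Simple Premise: compares a notified attribute value with a constant."""
    def __init__(self, op, value):
        self.op, self.value = op, value
        self.state = False
        self.listeners = []          # Condition Agents sharing this premise

    def notify(self, attribute_value):
        """Called by an attribute (fact) agent when the fact changes."""
        new_state = self.op(attribute_value, self.value)
        if new_state != self.state:  # propagate only on a real change
            self.state = new_state
            for ca in self.listeners:
                ca.reevaluate()

class ConditionAgent:
    """Conjunction of premise states; triggers the action when all hold."""
    def __init__(self, premises, on_true):
        self.premises, self.on_true = premises, on_true
        for p in premises:
            p.listeners.append(self)  # a PA may be shared by several CAs

    def reevaluate(self):
        if all(p.state for p in self.premises):
            self.on_true()            # stands in for the Action Agent

fired = []
pa = PremiseAgent(operator.eq, "free")            # Premise: robot == "free"
ca = ConditionAgent([pa], lambda: fired.append("rule fired"))
pa.notify("free")   # fact change -> premise -> condition -> action
```

Note that no search occurs: the changed fact notifies only the premises that reference it, and a premise notifies only the conditions connected to it.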
A Formation Rule (FR) is more generic than a Rule: its premises only analyse whether an FBA belongs to a given class (e.g. class A or B) without, at first, considering the AT values. An FR filters combinations of agents from the fact base, where each agent belongs to one derived class of the specific architecture application. Each combination creates a Rule following the structure specified by the FR elements. An FR is computationally represented by a Formation Agent (FA).
Each FA has a Condition Formation Agent (CFA) and an Action Formation Agent (AFA). The CFA is connected to Premise Formation Agents (PFAs). The main function of a PFA is to filter agents belonging to a given class. Each PFA specifies a Reference (to an AT of a class), a Value and an Operator. This information is not used at first, but may later serve as a model for the creation of PAs. The role of a CFA is to find the combinations among the elements filtered by the PFAs of its FA. The AFA is a sequence of Order Formation Agents (OFAs) that serve only as models for the creation of OAs.
Each resulting FA combination allows the instantiation of an RA. The associates of the FA (i.e. CFA, PFAs, AFA and OFAs) create the RA's associates (respectively CA, PAs, AA and OAs). After the creation of a CA, the necessary PAs are created and connected. At creation, each PA receives the Reference, the Value and the Operator, in accordance with the FA specifications. After creation, the PA executes its logical calculus and assigns itself a boolean value. Once its PAs are connected, each RA knows its boolean value. As more RAs are created and need already existing PAs, connections between those PAs and the RAs are made, avoiding redundancies and generating a graph of connections. The AA, after being created, is connected to a sequence of OAs, in accordance with the FA's specifications. The information of an AFA is used only to create the AAs.
The PFAs only analyse restrictions on the attributes of the FBAs if this has been explicitly determined. This analysis is an early logical calculus, in which an FBA is filtered only if the boolean result is true. In summary, this resource prunes the combination graph, avoiding the creation of RAs without semantic meaning (i.e. RAs that would never reach a 'true' state).
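The filtering and combination step can be illustrated with a short sketch. It assumes each PFA simply filters the fact base by class, as described above; the class names are hypothetical stand-ins for the paper's FBA specialisations:

```python
from itertools import product

def form_combinations(fact_base, pfa_classes):
    """One class-filtered list per PFA, then every combination across
    them; each combination would yield one Rule Agent instantiation."""
    filtered = [[agent for agent in fact_base if isinstance(agent, cls)]
                for cls in pfa_classes]
    return list(product(*filtered))

class FBARobot: pass
class FBAStorage: pass

base = [FBARobot(), FBARobot(), FBAStorage()]
combos = form_combinations(base, [FBARobot, FBAStorage])
# two robots x one storage -> two candidate Rule instantiations
```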
FRs are means to assist the Rule creation process. The architecture allows the direct creation of Rules, without the use of FRs. Like Rules, FRs need a way (perhaps an agent) to extract the knowledge in the respective FAs and RAs. FAs, like RAs, can share elements (i.e. PFAs and OFAs), avoiding redundancies in the filtering process.
Figure 5 exhibits an FR, in the form of agents, to create control rules for a manufacturing cell. It filters combinations of agents from the fact base, where each agent belongs to one of the following classes: FBAStation, FBARobot and FBAStorage.
changed when discrete events occur. It is integrated with other plant functions, such as Planning, Scheduling and Supervision, to manage the production process [2][4].
The SCAMS is responsible for co-ordinating the factory components (e.g. lathes and robots) to carry out predefined production processes. This co-ordination includes: monitoring the discrete states of each element in the factory (i.e. facts); deciding about these states and the current process (i.e. condition); and actuating over factory components by means of appropriate commands, respecting specific protocols (i.e. action) [2][12].
In this example, the ATs represent and expose the discrete events (e.g. robot is free, storage has a part, lathe is occupied) and the MAs command the components (e.g. moving a robot's end-effector). To better represent, monitor and command, the FBAs are specialised into Active Command Components (ACC) for physical components (e.g. machines and parts), Control Units (CU) for the hierarchy (e.g. station and cell) and Abstract Agents (AB) for abstract elements that come from other decision elements (e.g. process plans and lots of parts from the scheduler).
The Rule structure (i.e. RAs) is essentially the same as in the base architecture, since its function is to analyse information from a standard interface (i.e. ATs and MAs). FRs (i.e. FAs) are also the same, but now work over specialised FBAs [12][13].
The derived architecture was tested on an advanced AMS simulator, called ANALYTICE II, which reproduces the main factory features. The SCAMS does not interact with real equipment but with simulated equipment. Tests demonstrated the ease of creating the SCAMS and the robustness of the resulting control system [13].
The control of discrete event systems has been extensively studied in academia and in industrial research centres. Many approaches and methods for design and implementation have been proposed [3][4]. The SCAMS developed using the notification-based, agent-oriented approach described in this paper is an innovation in this research area and can be viewed as a very flexible and efficient model for implementing supervisory control of industrial plants.
5. CONCLUSIONS
This paper presented a new approach to creating and executing ESs, in which advances are achieved through the use of reactive and co-operative agents. Agents are designed and implemented using an object-oriented paradigm, constituting a software architecture with advantageous characteristics such as functional independence, easy comprehension and derivation, a trade-off between generality and applicability, and robust execution of instances.
In this generic PS, the implementation of a notification mechanism was possible because the WM (facts) and the PM (rules) are self-contained agents, with some autonomy, distributed in a graph of connections. With this mechanism, searches are unnecessary and temporal redundancies are eliminated. Logical calculus only occurs when there is a new fact, which is propagated only over the related elements. The notification mechanism makes incremental inference possible through its straightforward notifications, leading to a quicker response to changes. Another feature of this approach is the sharing of Premises between Rules, eliminating structural redundancies and further improving the inference process.
The architecture also facilitates the conception of Rules, allowing their construction in a generic way by means of class-oriented Premises and agents that generate specific Rules with object-oriented Premises.
This generic approach specifies the fact base in a standard form. To create a derived architecture, the facts must be encapsulated in ATs and the changes of facts in MAs. The architecture was specialised as an RBS for SCAMS and, in fact, the main derivation occurs in the FBAs, in this case to handle the different elements of the factory. The tests over
ANALYTICE II demonstrate that the principles of the architecture are functional and that the derived architecture is robust and applicable to different instances of SCAMS.
The notification principle, carried out by agents, to implement an IE is the main feature of the proposed approach. The architecture may be applied in a number of applications, including supervisory control systems. Future work includes the expansion of the applications and the development of a framework, including distributed processes and tools, to assist the creation of the fact base and the rule module. Other work includes the development of methods for causal relation synthesis using Petri Nets and techniques for mapping these solutions to implementations under the proposed architecture.
References
[1] Avila, B. C. & Prado, J. P. de A. & Abe, J. M., 1998: "Inteligência artificial distribuída: aspectos". Série Lógica e Teoria da Ciência, Instituto de Estudos Avançados - Universidade de São Paulo.
[2] Bongaerts, L., 1998: Integration of Scheduling and Control in Holonic Manufacturing Systems. Ph.D. Thesis, PMA / Katholieke Universiteit Leuven.
[3] Chaar, J. K. & Teichroew, D. & Volz, R. A., 1993: Developing Manufacturing Control Software: A Survey and Critique. The International Journal of Flexible Manufacturing Systems. Kluwer Academic Publishers, The Netherlands, pp. 53-88.
[4] Cury, J. E. R. & de Queiroz, M. H. & Santos, E. A. P., 2001: Síntese Modular do Controle Supervisório em Diagrama Escada para uma Célula de Manufatura. V Simpósio Brasileiro de Automação Inteligente, Canela, RS, Brasil.
[5] Forgy, C. L., 1982: Rete: A Fast Algorithm for the Many Pattern/Many Object Pattern Match Problem. Artificial Intelligence.
[6] Franklin, S. & Graesser, A., 1996: Is it an Agent, or Just a Program? A Taxonomy for Autonomous Agents. Institute for Intelligent Systems - University of Memphis. Proceedings of the Third International Workshop on Agent Theories, Architectures and Languages, Springer-Verlag.
[7] Jackson, P., 1990: Introduction to Expert Systems. Addison-Wesley.
[8] Muller, J. P., 1998: Architectures and Applications of Intelligent Agents: A Survey. Knowledge Engineering Review.
[9] Pan, J. & DeSouza, G. N. & Kak, A. C., 1998: FuzzyShell: A Large-Scale Expert System Shell Using Fuzzy Logic for Uncertainty Reasoning. IEEE Transactions on Fuzzy Systems, Vol. 6, No. 4, November.
[10] Rich, E. & Knight, K., 1991: Artificial Intelligence. McGraw-Hill.
[11] Rumbaugh, J. & Jacobson, I. & Booch, G., 1999: The Unified Modeling Language Reference Manual. Addison Wesley Longman.
[12] Simao, J. M., 2001: Proposta de uma arquitetura para sistemas flexíveis de manufatura baseada em regras e agentes. Master of Science Thesis. CPGEI, CEFET-PR, Brasil.
[13] Simao, J. M. & Silva, P. R. O. da & Stadzisz, P. C. & Kunzle, L. A., 2001: Rule and Agent Oriented Software Architecture for Controlling Automated Manufacturing Systems. P. 224 in Logic, Artificial Intelligence and Robotics - Frontiers in Artificial Intelligence and Applications (Series) - LAPTEC. IOS Press / Ohmsha.
[14] Yufeng, L. & Shuzhen, Y., 1999: Research on the Multi-Agent Model of Autonomous Distributed Control Systems. In 31st International Conference on Technology of Object-Oriented Languages and Systems, IEEE Press, China.
Abstract
The multiagent systems paradigm represents one of the most promising approaches to the development of complex systems. However, multiagent system specification is a hard task, above all when a formal specification is required. In this paper we develop a language, which we call LTLAS, equivalent to the AML language developed by M. Wooldridge. AML takes as its theoretical models the intentional approach to the agent-oriented paradigm and temporal logic, to construct a formal model. Like AML, our language LTLAS has the power to make a formal specification, but it is easier to use and understand because it looks like the programming languages commonly used by developers. Thus, the main contribution of this paper is the creation of an object language based on temporal logic, which lets us give a computational interpretation to AML rules.
Key words: Agents, Temporal Logic, Agent Language, Formal Specification
1. Introduction
The development of distributed applications has become a complex problem. The main cause is that distributed applications require coordination, cooperation and interaction processes among the different distributed entities. Multiagent systems represent one of the most promising approaches to solving these problems. However, as with any other paradigm, this approach requires formal specification techniques to ensure the match between specification and implementation. The main problem in using formal specification to develop systems is the complexity faced by users, who must deal with formulas of temporal logic [], algebraic equations [] or Petri Nets []. This problem motivated us to develop a language with a syntax that is easier to understand but with the power necessary to formally specify agent-based systems.
The works we consider important for this article are those of M. Wooldridge [16], who developed a BDI logic used to establish the agent metalanguage AML; Shoham [12], who proposed an agent-oriented language called AGENT0; R. Thomas [15], who proposed the PLACA agent programming language; and N. Farias [3], who proposed the LCIASA Agent Communication Language.
Today, the predominant approach to expressing MultiAgent Systems consists in treating them as intentional systems, which can be understood by attributing to them mental states such as desires, beliefs and intentions. The most successful way to deal with intentional systems is through temporal logic, as shown by important works such as the theory of intention of Cohen and Levesque [5], the belief-desire-intention model of Rao and Georgeff [10] and the temporal logic of beliefs of Michael Wooldridge [16]. Unfortunately, formally specifying systems using temporal logic can be difficult, mainly because the notation differs from that used in programming languages. The objective of LTLAS is to keep the power of temporal
N.F. Mendoza and F.F.R. Corchado / LTLAS 243
To recall the way in which logic can be used to describe dynamic properties of agents, we briefly review the Agent Logic (AL). AL is the basis of the Agent Meta Language (AML), giving a formal model to study the characteristics, properties and behaviour of Multi-Agent Systems before implementing them as computational systems. The temporal logic used for AML is a kind of non-classical logic [16] in which the time model offers the basis for describing dynamical systems. In this paper, linear discrete time sequences are used as the basic temporal model. Using this temporal model, it is possible to define an agent as the 4-tuple:
A =def ⟨β⁰, Λ, M, η⟩
Where:
β⁰ ∈ BS is the agent's initial belief set.
Λ : BS → Ac is the agent's action selection function.
M : BS → ℘(Mess) is the agent's message generation function.
η : BS × Ac × ℘(Mess) → BS is the agent's next state function.
Using the previous definition it is possible to specify an agent's characteristics and observe its behaviour and state changes through the actions it executes. Starting from its initial beliefs, an agent selects an action to execute using the function Λ and messages to send using the function M. The function η then changes the agent from one state to another, on the basis of the messages received and the action it has just performed.
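A minimal sketch of this 4-tuple as an execution loop follows; the concrete beliefs, action and update rule are invented purely for illustration:

```python
def run_agent(beta0, Lam, M, eta, steps):
    """Iterate the agent: choose an action and messages from the current
    beliefs, then move to the next belief state."""
    beliefs, trace = beta0, []
    for _ in range(steps):
        action = Lam(beliefs)                     # Lambda : BS -> Ac
        messages = M(beliefs)                     # M : BS -> P(Mess)
        beliefs = eta(beliefs, action, messages)  # eta : BS x Ac x P(Mess) -> BS
        trace.append((action, beliefs))
    return trace

trace = run_agent(
    beta0=frozenset({"hungry"}),
    Lam=lambda b: "eat" if "hungry" in b else "idle",
    M=lambda b: set(),                            # no messages in this toy run
    eta=lambda b, a, m: b - {"hungry"} if a == "eat" else b,
    steps=2)
```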
On the basis of the previous agent definition, we give the Multi-Agent definition as an indexed set of such agents:
MAS =def { ⟨β⁰_i, Λ_i, M_i, η_i⟩ : i ∈ Ag }
The linear time temporal belief logic is a branch of modal logic associated with situations that develop over time, that is, dynamic situations in which formulae are interpreted over frames with linear accessibility relations.
AL [16] is a propositional logic and contains three atomic operators: Bel to describe an agent's beliefs, Send to describe the messages that agents send and Do to describe the actions that agents execute. Likewise, AL contains a set of modal temporal operators, which allow the description of an agent's dynamic properties.
3. if φ and ψ are formulae of AL, then the following are formulae of AL:
○φ, ●φ, φ U ψ, φ W ψ
The first four rules deal with the atomic formulae of AL; the formula true is a logical constant for truth and is always satisfied. The formula (Bel i φ) is read: agent i believes φ. The formula (Do i α) is read: agent i performs the action α. The formula Send(i, j, φ) describes the sending of messages and is satisfied if agent i has sent to j a message with content φ.
3. LTLAS Description
The interaction mechanism among agents is carried out by means of the exchange of messages, which are grouped to form a plan; a plan is perceived as a sequence of messages to accomplish a certain task. The syntax of LTLAS is based on this plan concept. Fig. 2 shows, in a general way, the syntax of LTLAS, which is based on the concepts of messages and plans. Each plan has an identifier that characterizes it and a Plan block that contains the attributes of the plan, its pre-conditions, the body of the plan and its post-conditions.
In this paper we use an axiomatic approach to define the syntactic structure of LTLAS. This approach is based on the following general notation for an axiom:
a; b; c; ...; g ⊢ z
which is read as: from the premises a, b, c, ..., g, z can be asserted. Moreover, in the derivation process we use the inference rule modus ponens (⊢φ; ⊢φ⇒ψ / ⊢ψ). In this
approach we use the semantic rules given above in Section 2.2 for the AL language as premises of the axiomatic system, in order to derive a set of rules which guarantees the correctness of specifications.
Fig. 3 displays the syntactic structure and the semantic interpretation of LTLAS, using an axiomatic approach to describe it.
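The closure of a set of asserted formulae under modus ponens can be sketched as a toy fixed-point loop. This is a didactic illustration only, with formulae reduced to opaque tokens rather than structured AL sentences:

```python
def derive(axioms, implications):
    """Close a set of asserted formulae under modus ponens:
    from |- phi and |- phi => psi, assert |- psi."""
    known = set(axioms)
    changed = True
    while changed:
        changed = False
        for phi, psi in implications:   # each pair encodes |- phi => psi
            if phi in known and psi not in known:
                known.add(psi)
                changed = True
    return known

derived = derive({"a"}, [("a", "b"), ("b", "c")])  # chains a => b => c
```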
way a concordant solution. The interaction process is presented by a Dooley graph [1,2], as modified by Parunak. In order to simplify the action sequence we consider only the principal actions. In the graphic representation two kinds of agents are listed: a client agent (Purchaser) denoted by A, and three supplier agents (Sellers) denoted by B, C and D. A_i, B_i, C_i and D_i denote subsequent states of A, B, C and D respectively.
In our application we identify the following four stages. First: find suppliers (steps 1-9 in Fig. 4). In this phase the client explores the product descriptions of a list of suppliers and, as a function of parameters such as price, cost, product quality, delivery time and quality of service, selects the most appropriate supplier.
Second: term agreement (steps 10-11 in Fig. 4). In this step the buyer revises the salesperson's catalogue and makes an order; he generates and fills the pre-purchase form; the salesperson calculates the taxes, shipping and handling and sends the OBI petition to the buyer; the buyer approves the order, carries out the payment and sends the complete order to the salesperson.
Third: supply the order (step 12 in Fig. 4). In this stage the salesperson requests credit authorization from the bank and registers and notifies the order; if he has the credit authorization and the product is available, the salesperson orders the product shipment to the client.
Fourth: support the client (steps 13-14 in Fig. 4). In this stage the order and the client's personal data are registered for reference in future operations.
Next, Fig. 5 presents the Find Suppliers phase, using LTLAS to derive the action sequence associated with this phase of the application.
Plan Find-suppliers
Begin
Variables Catalogue, P, S1, S2, S3, F : Agents
Preconditions: KQML protocol
Begin-Plan
While ((Catalogue = True) ∧ ¬Eos)
Send (P, F, recruit-all(ask-all(x))) ∧ (Bel F Authenticates-Originator(P))
Do (P, recruit(ask-all(x)))
Send (S1, F, Advertise(ask(x))) ∧ (Bel F Authenticates-Originator(S1))
Do (S1, Advertise(ask(x)))
Send (S2, F, Advertise(ask(x))) ∧ (Bel F Authenticates-Originator(S2))
Do (S2, Advertise(ask(x)))
Send (S3, F, Advertise(ask(x))) ∧ (Bel F Authenticates-Originator(S3))
Do (S3, Advertise(ask(x)))
Send (F, S, broadcast(ask(x))) ∧ (Bel S Authenticates-Originator(F))
Do (F, ask(x))
Send (S1, P, Tell(x)) ∧ (Bel P Authenticates-Originator(S1))
Do (S1, Tell(P))
Send (S2, P, Tell(x)) ∧ (Bel P Authenticates-Originator(S2))
Do (S2, Tell(P))
Send (S3, P, Tell(x)) ∧ (Bel P Authenticates-Originator(S3))
Do (S3, Tell(P))
If (Val(S1) > (Val(S2) and Val(S3))) then
Send (P, S1, Tell(x)) ∧ (Bel S1 Authenticates-Originator(P))
Do (P, Tell(x))
Do (Checks-Plan)
End-Plan
End
Fig. 5. Find Suppliers phase specification in LTLAS
In this description, we use the properties of liveness and safety [11,4] to assure the consistency of the sequence of actions corresponding to this phase of the CB protocol. The specification in LTLAS looks like any programming language in use today, yet it keeps all the specification power of AML. In this way LTLAS can be seen as a translation of AML.
5. Conclusion
In this article we presented LTLAS, a language that facilitates the specification of agent-based systems. As shown in the previous sections, LTLAS is suitable for specifying Multi-Agent systems, because its sentences offer a Multi-Agent specification that is expressive, trustworthy, legible and easy to use and understand. LTLAS is suitable for developing cooperative applications such as selling-buying operations, bank transactions, electronic commerce and industrial applications.
The association of LTLAS with AML offers another perspective on Multi-Agent specification, because it makes it possible to check the consistency of an application, the behaviour of the system and the characteristics of a Multi-Agent system specification before implementing it as a computational system.
Our future work is directed mainly towards the verification process, in order to assure specification correctness. We are studying the possibility of using the STeP prover described in [13] or the Rao & Georgeff algorithm [9]. We have also started developing a tool for automatic programming, that is, a tool for translating the specification into an implementation in Java.
6. References
1 Introduction
System identification aims to obtain models that reproduce the dynamic behavior of a process based on input and output data. The resulting models may be linear or nonlinear. Linear models are normally easier to obtain, since linear identification techniques are well developed and the models are simpler. On the other hand, it is more difficult to obtain nonlinear models, due to the variety of alternative structures and the complexity of selecting the relevant terms. Identification of nonlinear systems is a relatively new area of research; the theoretical basis of nonlinear identification is still being established.
In most studies of identification of a process by using its input-output data, it is assumed
that there exists a global functional structure between the input and the output such as a linear
relation.
R.A. Jeronimo et al. /Fuzzy Identification of a pH Neutralization Process 251
It is, however, very difficult to find a global functional structure for a nonlinear process. Fuzzy modeling, on the other hand, is based on the idea of finding a set of local input-output relations describing a process. It is therefore expected that fuzzy modeling can express a nonlinear process better than an ordinary method.
The system identified in this work is the model of a pH neutralization process occurring
inside a CSTR (continuous stirred tank reactor) which has many industrial applications.
This article is organized as follows: Section 2 presents a brief description of the pH neutralization process. Section 3 introduces the description of TSK-type models for dynamic systems. Section 4 presents the identification of the pH neutralization process and the obtained models. Finally, Section 5 presents the conclusions.
The pH neutralization process is nonlinear and has a high gain near the neutrality point (pH = 7), where the slope of the titration curve is very steep. Another characteristic of this process is that the composition of the incoming fluid is normally variable. The process dead time is variable and depends on the volume variation and on the process input flows [2]. Furthermore, small quantities of intruding elements with buffering capacity (such as, for example, carbon or potassium) change the process dynamics.
The pH neutralization plant used in this work is a CSTR unit, in which it is desired to obtain an effluent with a desired pH (normally neutral) by introducing a base into a generic acid. A simplified diagram of this plant is shown in figure 1, which is an extension of the process studied by [3], [1] and [7].
In the process, there are three input flows (q1, q2 and q3), which are mixed in tank TQ4, and an output flow q4, representing the flow of the output fluid. In this work, two measured
inputs and one output variable were considered for the process identification. The input variables considered in the identification process were q1 and q3. The output variable was the pH.
The variables of the pH neutralization process and their nominal values are shown below.
Inputs:
• q3 = base flow (sodium hydroxide, NaOH, mixed with sodium bicarbonate, NaHCO3), (q3 = 0.4686 l/min);
Output:
The equations that describe the process as well as their parameters and the chemical
reactions are here omitted but can be found in [6].
3 Fuzzy Modeling
Takagi-Sugeno fuzzy models are suitable for modeling a large class of nonlinear systems. Fuzzy modeling from measured data is an effective tool for the approximation of uncertain nonlinear systems.
To build fuzzy models from data generated by poorly understood dynamic systems, the input-output representation is often applied. A very common structure is the NARX ("Nonlinear Auto-Regressive with eXogenous inputs") model, which can represent the observable and controllable models of a large class of discrete-time nonlinear systems. It establishes a relationship between the collection of past input-output data and the predicted output:
y(k+1) = f(y(k), ..., y(k-ny+1), u(k), ..., u(k-nu+1))   (1)
where k denotes the discrete time sample and ny and nu are integers related to the system order.
Consider the nonlinear multi-input single-output system y = f(x1, x2, ..., xn), which has m known operating points (x_i1, x_i2, ..., x_in), i = 1, ..., m, around which the system can be decomposed into m linearized subsystems y_i = f_i(x1, x2, ..., xn). The TSK model for the system can then be generically represented as a set of rules whose consequents are local NARX models of the form
y_j(k+1) = Σ(i=1..ny) a_ij y(k-i+1) + Σ(i=1..nu) b_ij u(k-i+1) + c_j
where a_ij, b_ij and c_j are the consequent parameters. The NARX local model can represent MISO systems directly, and MIMO systems in a decomposed form as a set of coupled MISO models. It should be noted that the dimension of the regression problem in input-output modeling is often larger than in state-space models, since the state of the system can usually be represented by a vector of lower dimension than, for instance, in the NARX model given by equation (1).
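A short sketch may clarify the TSK aggregation: rule firing strengths weight the outputs of the local models. All membership functions and rule parameters below are invented for illustration, and the local models are static linear maps rather than the paper's NARX consequents, for brevity:

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def tsk_predict(x, rules):
    """Weighted average of the local linear model outputs.
    Each rule is (membership functions per input, (coefficients, offset))."""
    num = den = 0.0
    for mfs, (coeffs, offset) in rules:
        w = 1.0
        for xi, mf in zip(x, mfs):                 # product t-norm
            w *= mf(xi)
        local = sum(c * xi for c, xi in zip(coeffs, x)) + offset
        num += w * local
        den += w
    return num / den if den else 0.0

rules = [
    ([lambda v: tri(v, -7, 0, 7), lambda v: tri(v, -1, 0, 1)],
     ([0.5, 2.0], 1.0)),
    ([lambda v: tri(v, 7, 10.5, 14), lambda v: tri(v, 1, 1.5, 2)],
     ([0.1, -1.0], 10.0)),
]
y = tsk_predict([3.0, 0.5], rules)   # only the first rule fires here
```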
The pH neutralization process identification was performed in open loop. The identification of the consequent parameters was done "off-line", combining the "Modified Gram-Schmidt" method (described in [6]) with the method presented in [5]. The significant terms of the model can be selected based on the "error reduction ratio" (ERR).
The data were acquired during 1000 minutes with a sampling time Ts = 20/60 minutes (≈ 0.333 min), which corresponds to 2001 samples. The inputs were disturbed with uniformly distributed random signals, applied simultaneously to all the input variables. The measured inputs are the acid flow (q1) and the base flow (q3); the pH value was the output variable. The other variables of the system were kept constant at their nominal values. Another 2001 samples, different from the identification data, were used for model validation.
It is very important to check for the existence of nonlinear interactions in the identification data. There are algorithms that detect non-linearities and indicate whether it is necessary to use nonlinear models to reproduce the system dynamics.
Consider the dynamic system described in the following dynamic equation:
y(t)=fl[y(t-l),y(t-2),...,y(t-ny),u(t-d),u(t-d-I),u(t-d-2),...,
u(t-d-nu + l ) , e(t - 1), e(t - 2), . . . , e(t - ne)} + e(t) (5)
where f is a generic mathematical function. The sampling period Ts was normalized. [4] demonstrated that:

φ_{y'y'²}(τ) = E[y'(t) y'²(t-τ)] = 0, ∀τ, with y'(t) = y(t) - ȳ   (6)
Equation (6) holds if and only if the system given in equation (5) is linear. In practice, the system can be considered linear if the autocorrelation function in equation (6) remains within the 95% probabilistic confidence interval, which delimits the region where the function can be considered null. The limits of the confidence interval are approximately ±1.96/√N, where N is the length of the available data vector.
The 2001 output samples collected for identification were used to detect nonlinear interactions in the data. Figure 2 illustrates the result obtained. The result shown in figure 2 presented expressive non-linearities outside the 95% confidence interval, indicating that nonlinear structures are necessary to represent the pH neutralization process dynamics.
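The correlation test of equation (6) is straightforward to compute. The following is a minimal sketch (function name and lag range are our own; the statistic is the sampled correlation between the output deviations y' and their squares, checked against the ±1.96/√N band):

```python
import numpy as np

def nonlinearity_test(y, max_lag=20):
    """Correlation test between y'(t) and y'^2(t - tau).

    If the sampled correlations stay inside the 95% band
    +/- 1.96/sqrt(N), the record is consistent with a linear model.
    """
    y = np.asarray(y, dtype=float)
    yp = y - y.mean()                 # deviations from the mean
    yp2 = yp**2 - (yp**2).mean()      # centred squared deviations
    n = len(y)
    band = 1.96 / np.sqrt(n)
    den = np.sqrt(np.dot(yp, yp) * np.dot(yp2, yp2))
    corrs = []
    for tau in range(max_lag + 1):
        num = np.dot(yp[tau:], yp2[:n - tau])
        corrs.append(num / den)
    linear = all(abs(c) <= band for c in corrs)
    return np.array(corrs), band, linear
```

Applied to the 2001 identification samples, a `linear=False` verdict corresponds to the correlations leaving the confidence band, as observed in figure 2.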
For the input variable acid flow (q1), the fuzzy linguistic attributes were: very low (MB), low (B), medium (M), high (A), very high (MA). The acid flow varies between 0 and 2 l/min, whereas the base flow varies between 0 and 1.7 l/min. The output variable, pH, varies between 0 and 14, with the following linguistic attributes: pH very acid (pH MA), pH little acid (pH A), pH medium (pH M), pH little basic (pH PB), pH very basic (pH MB).
Figure 3 illustrates the membership functions of the input variables, acid flow (q1) and base flow (q3), respectively.
Figure 3: Membership functions of the input variables, q1 (acid flow) and q3 (base flow)
Figure 4 illustrates the membership function of the output variable for the pH.
The process is represented by the model

pH(k+1) = f(pH(k), q1(k), q3(k))   (7)

where k denotes the sampling instant, f is an unknown relationship approximated by the TSK fuzzy model, q1(k) and q3(k) represent the acid flow and base flow, respectively, and pH(k) represents the output variable, pH.
Figure 5 illustrates the surface plot of the pH neutralization process. The significant terms of the model were selected based on the error reduction ratio, resulting in only 5 rules to describe the process considered in this identification procedure.
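The rule base itself is not reproduced here, but the mechanics of evaluating a TSK model of this kind can be sketched as follows. Everything in this example is illustrative: the triangular membership functions, the min t-norm for the rule firing strength, and the two hypothetical rules are our own choices, not the paper's identified model.

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def tsk_predict(inputs, rules):
    """Evaluate a TSK fuzzy model.

    Each rule is (membership functions, consequent function); the
    output is the firing-strength weighted average of the consequents.
    """
    weights, outputs = [], []
    for mfs, consequent in rules:
        w = min(mf(x) for mf, x in zip(mfs, inputs))  # min t-norm
        weights.append(w)
        outputs.append(consequent(inputs))
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * o for w, o in zip(weights, outputs)) / total
```

For instance, two hypothetical rules on (q1, q3) such as "IF q1 low AND q3 high THEN pH = 12" and "IF q1 high AND q3 low THEN pH = 2" interpolate smoothly between basic and acid pH as the flows vary.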
5 Conclusions
Nonlinear fuzzy models were employed here to identify a pH neutralization process. The application of an advanced orthogonal regression estimator, based on the Modified Gram-Schmidt (MGS) orthogonalization procedure, resulted in nonlinear fuzzy models that satisfactorily represented the pH neutralization process dynamics.
References
[2] O. A. Zanabria, Modeling, Simulation and Control of pH Neutralization Process, Master Dissertation, Polytechnic School of the University of Sao Paulo, Department of Electronic Engineering, Laboratory of Automation and Control, Sao Paulo, SP, Brazil, (1997).
[3] R. C. Hall, Control of pH Based on Reaction Invariants, Master Thesis, University of California, Santa Barbara, (1987).
[4] S. A. Billings and S. F. Voon, Structure Detection and Model Validity Tests in the Identification of Nonlinear Systems, IEE Proceedings 130, Part D, (1983) 193-199.
[5] L. Wang and R. Langari, Building Sugeno-Type Models Using Fuzzy Discretization and Orthogonal Parameter Estimation Techniques, IEEE Transactions on Fuzzy Systems 3 (1995) 454-458.
[7] M. E. Henson, Feedback Linearization Strategies for Nonlinear Process Control, Master Thesis, University of California, Santa Barbara, (1992).
258 Advances in Logic, Artificial Intelligence and Robotics
J.M. Abe and J.I. da Silva Filho (Eds.)
IOS Press, 2002
Abstract
1. Introduction
C_t = 1 - (1 - p)^{I_t}, ∀t > 0
* email: [email protected]
T.E.P. Succhetta et al. / A Fuzzy Reed-Frost Model for Epidemic Spreading 259
Several attempts were made to generalise the Reed-Frost model, considering a non-homogeneous population in terms of infectivity or susceptibility [3-5]. But in all these attempts the solution was given by the stratification of the population into subgroups, considering homogeneous mixing within the subgroups. The groups are closed and the individuals remain in the same subgroup during the whole epidemic. In such situations, the level of infectivity and susceptibility is assumed constant throughout the epidemic course, notwithstanding heterogeneity among the different groups.
In a previous paper, Menezes et al. [6] included in the classical Reed-Frost model the uncertainty involved in the classificatory process, considering infectivity levels as a time-dependent random variable. Each individual presents a series of clinical indicators used to classify them as infected or not, and the classificatory process includes the possibility of errors, such as an infected individual being classified as susceptible. This model is a stochastic generalisation of the classical Reed-Frost model. Those authors [6] modelled an individual's infectivity as a function of the clinical signals, allowing for time-dependent, heterogeneous infectivity. Constant susceptibility levels are assumed, and the probability structure led them to an expression for the Basic Reproduction Ratio of the infection.
In this work we propose a generalisation of the Menezes et al. [6] model, in which the classification of individuals follows a fuzzy decision structure based on clinical signals. We propose two different fuzzy structures and compare the simulation results with those of the Menezes et al. model.
3. Fuzzy Structure
The fuzzy structure associates each individual of the population to the fuzzy set Infected and to the fuzzy set Susceptible through a max-min composition of fuzzy relations.
Consider a set of signals S and the matricial representation of a fuzzy relation. Thus, S = [s_ik] is the array of the k signals of individual i, I = [i_k] is the matrix that associates each signal to the Infected statement, and DI = [d_i] is the membership degree of individual i in the fuzzy set Infected. So, this fuzzy structure allows a fuzzy classification of each individual into an Infected class, providing a degree of infectiousness. In the same way, S̄ = [s̄_ik] is the array of not-signals, i.e., S̄ is the negation of S, which can be computed through the standard complement [7] of the signals fuzzy set S, using:

s̄ = 1 - s
for each signal. For instance, suppose the set of signals is S = [fever, cough], i.e., s1 is fever and s2 is cough, and an individual presents fever degree s1 = 0.7 and cough degree s2 = 0.4. Then the degrees of not-fever, s̄1, and not-cough, s̄2, of this individual are s̄1 = 0.3 and s̄2 = 0.6, respectively. It is possible to find the degree of susceptibility of each individual by computing the fuzzy relation between the not-signal fuzzy set and the associative matrix not-signal/susceptibility through max-min composition.
For example, consider the same two symptoms, fever and cough, and an array which represents the association of these symptoms with infectiousness, RI = [r_fever, r_cough], where r_fever is the relationship degree between the symptom fever and the Infected class and r_cough is the relationship degree between the symptom cough and the Infected class. So, an individual that has a degree of fever, s_fever, and a degree of cough, s_cough, will belong to the Infected fuzzy set with the degree given by:

DI = max[min(s_fever, r_fever), min(s_cough, r_cough)]
The equivalent process is carried out using each not-signal degree and the relationship between the respective not-signal and the Susceptible class to calculate the degree of susceptibility, DS. If DI = DS, then a uniform random number decides the individual's classification. The number of infected individuals at time t is then given by the number of individuals classified as infected.
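The max-min composition above is a one-liner in code. In this sketch the signal degrees come from the fever/cough example in the text, while the relational degrees (0.8, 0.6 for infectiousness; 0.8, 0.7 for susceptibility) are assumed purely for illustration:

```python
def max_min_degree(signals, relation):
    """Membership degree via max-min composition of a signal
    vector with a relational vector."""
    return max(min(s, r) for s, r in zip(signals, relation))

# Fever/cough example from the text: s1 = 0.7, s2 = 0.4.
signals = [0.7, 0.4]
not_signals = [1 - s for s in signals]        # standard complement
DI = max_min_degree(signals, [0.8, 0.6])      # degree in Infected
DS = max_min_degree(not_signals, [0.8, 0.7])  # degree in Susceptible
```

Here DI = max(min(0.7, 0.8), min(0.4, 0.6)) = 0.7 and DS = max(min(0.3, 0.8), min(0.6, 0.7)) = 0.6, so this individual would be classified as infected.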
At time t, the probability P_ij,t that a contact between a susceptible individual i and an infected individual j results in an infective contact is computed in two forms. First, we assume the same probability function proposed by Menezes et al.; thus, the only difference between their model and the present one is the classificatory method. In this case P_ij,t is given by:

P_ij,t = (Σ_{s=1}^{k} s_js) / (z k)

where s_js is the sth signal of the infected individual j, k is the number of signals and z is a parameter that makes P_ij,t a small number, which is necessary for the operations. This form of P_ij,t is comparable to the Menezes et al. model, in which P_ij,t = D_j,t, where D_j,t is an average of the clinical signals.
In the second form of P_ij,t we consider the infectiousness degree of the individual j, given by:

P_ij,t = DI_j / z

where DI_j is the infectiousness degree of the infected individual j.
In both cases the probability that a susceptible individual has, at time t, at least one infectious contact is:

P_i,t = 1 - Π_j (1 - P_ij,t)

In this second case, we do not classify the individuals as infected or not; we just consider the degree of infectiousness of each individual. So the number of infected individuals at time t is found through the fuzzy cardinality of the Infected fuzzy subset, that is, I_t = Σ_i DI_i.
The probability P_ij,t that a contact between a susceptible individual i and an infected individual j results in an infective contact has the form:

P_ij,t = (DI_i + DI_j) / (2z)

where P_ij,t = P_ji,t and all possible contacts are considered. Since we are not using the information about the susceptibility degree of individuals, homogeneity is assumed in the susceptible set.
Now, the probability of a susceptible individual having, at time t, at least one infectious contact is given by

P_i,t = 1 - Π_{all pairs} (1 - P_ij,t)

where no pair is repeated. Clearly this model is very different from the Menezes et al. proposal.
3.3. Simulation
As expected, the preliminary results indicate that the behaviour of the epidemic depends on the studied region of the parameter space for all models. For certain regions of the parameter space, the Menezes et al. model provides extended epidemics; however, this is not a rule: in other regions the epidemics are longer with the fuzzy model.
The capacity of a model to fit an epidemic clearly depends on the disease. Consider an influenza epidemic, in which the infectivity degree varies a great deal depending on the symptoms, and note that we are studying a small population; the fuzzy structure then seems more appropriate, since a symptom summary is not a good indicator of the infective contact in this case. In order to simulate an influenza epidemic inside a small group, such as a family, we consider four clinical signals: fever, cough, sneeze and runny nose. The relational matrices signal/infected, I, and not-signal/susceptible, S, were:
I = [0.8, 0.6, 0.7, 0.6]^T,   S = [0.8, 0.7, 0.6, 0.7]^T
i.e., the relation between fever and being infected is 0.8, cough and being infected is 0.6, sneeze and being infected is 0.7, and runny nose and being infected is 0.6. In the same way, the relation between not-fever and being susceptible is 0.8 (if the individual does not present fever, then the chance that he/she is susceptible to an infection is high), not-cough and being susceptible is 0.7, and so on. In this case, in the fuzzy structure the epidemic spread quickly through an initially entirely susceptible population. The epidemic size depends on the severity of the clinical signals at the beginning of the epidemic course, as expected.
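One generation of the type-2 dynamics with these relational degrees can be sketched as follows. The value z = 10 and the tiny two-person population in the usage example are illustrative assumptions only; the relational vector is the signal/infected matrix given above:

```python
import numpy as np

# Relational vector signal/infected from the text:
# fever, cough, sneeze, runny nose.
R_INF = np.array([0.8, 0.6, 0.7, 0.6])

def infectiousness(signals):
    """Degree in the Infected fuzzy set via max-min composition."""
    return float(np.minimum(signals, R_INF).max())

def type2_step(population, z=10.0):
    """One generation of the type-2 fuzzy Reed-Frost dynamics.

    population: (n, 4) array of clinical-signal degrees.
    P_ij = (DI_i + DI_j) / (2 z); the chance that individual i
    escapes every contact is the product of (1 - P_ij).
    """
    di = np.array([infectiousness(ind) for ind in population])
    n = len(di)
    p_infect = np.empty(n)
    for i in range(n):
        escape = 1.0
        for j in range(n):
            if j != i:
                escape *= 1.0 - (di[i] + di[j]) / (2.0 * z)
        p_infect[i] = 1.0 - escape
    infected = di.sum()  # fuzzy cardinality of the Infected subset
    return p_infect, infected
```

For a two-person population in which one individual has fever 0.7 and cough 0.4 and the other has no symptoms, the infectiousness degrees are 0.7 and 0.0, each pairwise probability is (0.7 + 0.0)/20 = 0.035, and the fuzzy number of infected individuals is 0.7.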
4. Discussion
Several attempts have been made to generalise the Reed-Frost model to consider a non-homogeneous group, either from the susceptibility or from the infectivity viewpoint [2,4,5,8]. In all these, the homogeneity assumption is relaxed by dividing the main group into subgroups and considering homogeneous mixing within each subgroup. Subgroups are closed and individuals remain within the same subgroup for the entire duration of the epidemic, which means that an individual's susceptibility and infectivity levels are taken as constant throughout the epidemic course. To the best of our knowledge, the Menezes et al. work [6] is the first Reed-Frost model that considers the heterogeneity of individuals as related to the clinical signals. The model proposed here provides generalised forms to deal with those heterogeneities, particularly in the fuzzy classification method.
The Menezes et al. model considers the arithmetic average of the symptoms as a signal summary to classify the individual as Infected or Susceptible. In the Reed-Frost Fuzzy model type 1 this classification was performed based on the degrees of infectiousness and susceptibility. In fact, the fuzzy structure does not require the classification of the individuals as infected or not in order to elaborate the dynamics of the epidemic; this is one of the main characteristics of fuzzy systems applied in epidemic studies. In the Reed-Frost Fuzzy model type 2, the epidemic dynamics is computed considering the membership infectiousness degree of all individuals. This approach appears more useful for the majority of epidemiological diseases in reality, since in our model we aim to incorporate the individual heterogeneity according to which a person is infected or susceptible in different degrees, considering the ability of each signal to spread the disease.
The fuzzy model structure presented also allows computing the Basic Reproduction Ratio, R0, defined as the number of secondary infections caused over one generation after the introduction of a single infected individual in an entirely susceptible population [10]. In the type 1 model, R0 would be calculated from the expected value of I_2, since I_1 = 1, and from the probability distribution of the random variable DI, which is the infectivity degree. In the type 2 model it could be calculated through fuzzy clustering structures, as proposed in [11].
Finally, the development of new tools to deal with heterogeneities is becoming increasingly necessary. By including the several heterogeneities in our model we aim to incorporate naturally existing differences among individuals, in order to make it applicable to real epidemic scenarios. On the one hand, there are several infections, like influenza, which, besides being transmitted among small groups of individuals, produce highly heterogeneous clinical pictures. On the other hand, the huge amount of genetic information provided by the emerging fields of genomics (and proteomics) generates clinical information that may sharply distinguish individuals. We are convinced that fuzzy structures can contribute to these new areas.
References
[1] ABBEY, H. An examination of the Reed-Frost theory of epidemics. Human Biology 24 (3): 201-33, 1952.
[2] MAIA, J.O.C. Some mathematical developments on the epidemic theory formulated by Reed and Frost. Human Biology 24 (3): 167-200, 1952.
[3] JACQUEZ, J.A. A note on chain-binomial models of epidemic spread: what is wrong with the Reed-Frost formulation. Mathematical Biosciences 87: 73-82, 1987.
[4] LEFEVRE, C., PICARD, P. On the formulation of discrete-time epidemic models. Mathematical Biosciences 95 (1): 27-35, 1989.
[5] PICARD, P., LEFEVRE, C. The dimension of Reed-Frost epidemic models with randomized susceptibility levels. Mathematical Biosciences 107 (2): 225-33, 1991.
[6] MENEZES, R., ORTEGA, N.R.S. and MASSAD, E. A Reed-Frost Model Taking into Account Uncertainties in the Diagnosis of the Infection. Bulletin of Mathematical Biology, 2002.
[7] KLIR, G. and YUAN, B. Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall, USA, 1995.
[8] SCALIA-TOMBA, G. Asymptotic final size distribution of the multitype Reed-Frost process. Journal of Applied Probability 23: 563-584, 1986.
[10] DIEKMANN, O., HEESTERBEEK, J.A.P. and METZ, J.A.J. On the definition and the computation of the Basic Reproduction Ratio R0 in models for infectious diseases in heterogeneous populations. Journal of Mathematical Biology 28: 365-382, 1990.
[11] MASSAD, E., ORTEGA, N., BURATTINI, M., STRUCHINER, C. Fuzzy R0 and Epidemic Dynamics. In: IMA Conference on Mathematics Modelling and Statistical Analysis of Infectious Diseases, Cardiff, UK, 2001, p. 1-4.
Invited Talks
Abstracts
Abstract:
The operation of a power system is intrinsically complex due to the high degree of uncertainty and the large number of variables involved. The various supervision and control actions require the presence of an operator, who must be capable of efficiently responding to the most diverse requests by handling various types of data and information.
The problem faced by the operator is how to use all available data in his analyses. A huge amount of data and information in a control center database must be manipulated and, mainly, composed to allow the operator to visualize the current state of the system. The manipulation of all this data/information is not an easy task. This paper presents an example of an alternative approach to help operators classify the system into its three possible states. This approach is based on Rough Set theory, proposed by Pawlak in 1982.
We will present an automaton model for such situations with varying independencies of
actions. We will show how finite or infinite concurrent computations can be represented
by labelled partial orders and how different logics (predicate calculus, temporal logics)
can be used for specification of the partial orders and thereby of the computations. We
develop connections to formal language theory, in particular for recognizable, aperiodic
or star-free languages. This generalizes classical results of Kleene, Büchi,
McNaughton and Papert, and recent results from trace theory. If time permits, we also
point out a connection between our automaton model and Petri nets.
Extended Abstract
Professor in the Graduate School and Director, Berkeley Initiative in Soft Computing (BISC), Computer
Science Division and the Electronics Research Laboratory, Department of EECS, University of California,
Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-mail:
[email protected]. Research supported in part by ONR Contract N00014-00-1-0621, ONR Contract
N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-
0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341 and the BISC Program of UC
Berkeley
By a Polysynthetic Class Theory we understand a set theory in which (1) there are
sets and classes, (2) the well formed expressions can be interpreted both as propositions
and classes and (3) there is a well formed expression, V(x), with x as its only free variable such that:
A. P. Morse first developed the well known impredicative set theory of the Appendix to Kelley's General Topology in a polysynthetic format; the polysynthetic version was finally published in the monograph A Theory of Sets, 1965.
Although polysynthetism violates the familiar syntax rules for post-Hilbert formal languages of mathematical logic, in which the well formed expressions correspond either to classes or (in the exclusive interpretation) to propositions, it was used by the founders of the discipline, G. Boole, Jevons, E. Schröder, etc.
In this talk we extend the ideas of Boole, ..., Morse in a formalism suitable for this century.
In this paper we discuss several issues underlying the problem P = NP and its difficulties, and show some advances concerning the theme, including some results obtained by us.
We are overwhelmed with data. The amount of data in the world, in our lives,
seems to go on and on increasing. More powerful computers, inexpensive storage systems, and the World Wide Web directly contribute to making enormous amounts of data available to everyone. We would all testify to the
growing gap between the generation of data and our understanding of it. As
the volume of data increases, inexorably, the proportion of it that people
understand decreases alarmingly. Lying hidden in all this data is information,
potentially useful information, which is rarely made explicit or taken
advantage of.
Evaluation is one of the keys to making real progress in data mining. There are many methods to induce knowledge from data. However, determining which of them performs best requires careful evaluation.
Our talk will concentrate on this latter topic. Several case studies will be treated in order to illustrate the importance of evaluating learning algorithms during the data mining process.
Author Index