(Jorge Cardoso) Semantic Web Services Theory
Acquisitions Editor: Kristin Klinger
Development Editor: Kristin Roth
Senior Managing Editor: Jennifer Neidig
Managing Editor: Sara Reed
Assistant Managing Editor: Sharon Berger
Copy Editor: Julie LeBlanc
Typesetter: Elizabeth Duke
Cover Design: Lisa Tosheff
Printed at: Yurchak Printing Inc.
Semantic Web services : theory, tools and applications / Jorge Cardoso, editor.
p. cm.
Summary: "This book brings together researchers, scientists, and representatives from different communities to study, understand,
and explore the theory, tools, and applications of the semantic Web. It joins the semantic Web, ontologies, knowledge management,
Web services, and Web processes into one fully comprehensive resource, serving as the platform for exchange of both practical
technologies and research"--Provided by publisher.
Includes bibliographical references and index.
ISBN 978-1-59904-045-5 (hardcover) -- ISBN 978-1-59904-047-9 (ebook)
1. Semantic Web. 2. Web services. I. Cardoso, Jorge, 1970-
TK5105.88815.S45 2006
025.04--dc22
2006033762
British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.
All work contributed to this book is new, previously unpublished material. The views expressed in this book are those of the authors, but not necessarily of the publisher.
Table of Contents
Chapter I
The Syntactic and the Semantic Web/ Jorge Cardoso............................................................................. 1
Chapter II
Logics for the Semantic Web/ Jos de Bruijn.......................................................................................... 24
Chapter III
Ontological Engineering: What are Ontologies and How Can We Build Them?
Oscar Corcho, Mariano Fernández-López, and Asunción Gómez-Pérez.............................................. 44
Chapter IV
Editing Tools for Ontology Creation/ Ana Lisete Nunes Escórcio and Jorge Cardoso......................... 71
Chapter V
Web Ontology Languages/ Grigoris Antoniou and Martin Doerr........................................................ 96
Chapter VI
Reasoning on the Semantic Web/ Rogier Brussee and Stanislav Pokraev.......................................... 110
Chapter VII
Introduction to Web Services/ Cary Pennington, Jorge Cardoso, John A. Miller,
Richard Scott Patterson, and Ivan Vasquez ........................................................................................ 134
Chapter VIII
Service-Oriented Processes: An Introduction to BPEL/ Chun Ouyang, Wil M.P. van der Aalst,
Marlon Dumas, Arthur H.M. ter Hofstede, and Marcello La Rosa .................................................... 155
Chapter IX
Semantic Web Services/ Rama Akkiraju.............................................................................................. 191
Chapter X
The Process of Semantic Annotation of Web Services/ Christoph Ringelstein,
Thomas Franz, and Steffen Staab ....................................................................................................... 217
Chapter XI
Semantic Web Service Discovery: Methods, Algorithms and Tools/ Vassileios Tsetsos,
Christos Anagnostopoulos, and Stathes Hadjiefthymiades................................................................. 240
Chapter XII
Semantic Web Service Discovery in the WSMO Framework/ Uwe Keller, Rubén Lara,
Holger Lausen, and Dieter Fensel....................................................................................................... 281
Chapter XIII
Semantic Search Engines Based on Data Integration Systems/ Domenico Beneventano
and Sonia Bergamaschi ...................................................................................................................... 317
About the Authors . ........................................................................................................................... 343
Index.................................................................................................................................................... 350
Foreword . ...........................................................................................................................................viii
Preface..................................................................................................................................................... x
Acknowledgments ............................................................................................................................. xvii
Chapter I
The Syntactic and the Semantic Web/ Jorge Cardoso............................................................................. 1
This chapter gives an overview of the evolution of the Web. Initially, Web pages were specified syntactically and were intended only for human consumption. New Internet business models, such as B2B
and B2C, require information on the Web to be defined semantically in a way that it can be used by
computers, not only for display purposes, but also for interoperability and integration. To achieve this
new type of Web, called Semantic Web, several technologies are being developed, such as the resource
description framework and the Web Ontology Language.
Chapter II
Logics for the Semantic Web/ Jos de Bruijn.......................................................................................... 24
This chapter introduces several formal logical languages which form the backbone of the Semantic
Web. The basis for all these languages is the classical first-order logic. Some of the languages presented
include description logics, frame logic and RuleML.
Chapter III
Ontological Engineering: What are Ontologies and How Can We Build Them?
Oscar Corcho, Mariano Fernández-López, and Asunción Gómez-Pérez.............................................. 44
The term ontological engineering defines the set of activities that concern the ontology development
process, the ontology life cycle, the principles, methods and methodologies for building ontologies, and
the tool suites and languages that support them. In this chapter we provide an overview of ontological
engineering, describing the current trends, issues and problems.
Chapter IV
Editing Tools for Ontology Creation/ Ana Lisete Nunes Escórcio and Jorge Cardoso......................... 71
The activities associated with Ontological Engineering require dedicated tools. One of the first activities
is to find a suitable ontology editor. In this chapter we give an overview of the editing tools we consider
most relevant for ontology construction.
Chapter V
Web Ontology Languages/ Grigoris Antoniou and Martin Doerr........................................................ 96
This chapter gives a general introduction to some of the ontology languages that play an important role
on the Semantic Web. The languages presented include RDFS and OWL.
Chapter VI
Reasoning on the Semantic Web/ Rogier Brussee and Stanislav Pokraev........................................... 110
In this chapter we remind the reader of the fundamentals of description logic and the OWL ontology
language and explain how reasoning can be achieved on the Semantic Web. A real example using routers
is given to explain how ontologies and reasoning can help in determining the location of resources.
Chapter VII
Introduction to Web Services/ Cary Pennington, Jorge Cardoso, John A. Miller,
Richard Scott Patterson, and Ivan Vasquez ........................................................................................ 134
This chapter reviews the history out of which Web services evolved. We will see that Web services are
the result of the evolution of several distributed systems technologies. One of the concepts introduced
along with Web services is service-oriented architecture (SOA). Since SOA is to be used by organizations,
we address important issues such as the specification of policies and security.
Chapter VIII
Service-Oriented Processes: An Introduction to BPEL/ Chun Ouyang,
Wil M.P. van der Aalst, Marlon Dumas, Arthur H.M. ter Hofstede, and Marcello La Rosa .............. 155
The Business Process Execution Language for Web Services (BPEL) is an emerging standard for specifying a business process made of Web services. In this chapter, we review some limitations of BPEL
and discuss solutions to address them. We also consider the possibility of applying formal methods and
Semantic Web technologies to support the development of a next generation of BPEL processes.
Chapter IX
Semantic Web Services/ Rama Akkiraju.............................................................................................. 191
Several researchers have recognized that Web services standards lack semantics. To address this
limitation, the Semantic Web community has introduced the concept of Semantic Web service. When
the requirements and capabilities of Web services are described using semantics it becomes possible
to carry out a considerable number of automated tasks, such as automatic discovery, composition and
integration of software components.
Chapter X
The Process of Semantic Annotation of Web Services/ Christoph Ringelstein,
Thomas Franz, and Steffen Staab ....................................................................................................... 217
This chapter explains how Web services can be annotated and described with semantics. Semantic descriptions allow Web services to be understood and correctly interpreted by machines. The focus lies in
analyzing the process of semantic annotation, i.e., the process of deriving semantic descriptions from
lower level specifications, implementations and contextual descriptions of Web services.
Chapter XI
Semantic Web Service Discovery: Methods, Algorithms, and Tools/ Vassileios Tsetsos,
Christos Anagnostopoulos, and Stathes Hadjiefthymiades................................................................. 240
This chapter surveys existing approaches to Semantic Web service discovery. Semantic discovery will
probably replace existing keyword-based solutions, overcoming several of their limitations.
Chapter XII
Semantic Web Service Discovery in the WSMO Framework/ Uwe Keller, Rubén Lara,
Holger Lausen, and Dieter Fensel....................................................................................................... 281
This chapter presents how the Web service modeling ontology (WSMO) can be applied for service
discovery. WSMO is a specification that provides a conceptual framework for semantically describing
Web services and their specific properties. This chapter is closely related to Chapter XI.
Chapter XIII
Semantic Search Engines Based on Data Integration Systems/ Domenico Beneventano
and Sonia Bergamaschi ...................................................................................................................... 317
Syntactic search engines, such as Google and Yahoo!, are common tools for every user of the Internet.
But since the search is based only on the syntax of keywords, the accuracy of the engines is often poor
and inadequate. One solution to improve these engines is to add semantics to the search process. This
chapter presents the concept of semantic search engines which fundamentally augment and improve
traditional Web search engines by using not just words, but concepts and logical relationships.
About the Authors . ........................................................................................................................... 343
Index.................................................................................................................................................... 350
Foreword
The Semantic Web is here to stay! This is not really a marketing slogan, but a truth that is becoming
more and more relevant every year to the daily life of the business world, industry, and society.
I do not know how it happened, but in recent years, through our activities in the Special Interest Group
on Semantic Web and Information Systems in the Association for Information Systems (http://www.
sigsemis.org), I had the opportunity to contact and collaborate with several people key to the evolution
of the SW, as well as many leaders in different domains, trying to understand their attitude toward the
Semantic Web (Lytras, 2005). I often feel that my background in Informatics and Management Science
helps me to go beyond the traditional exhaustive technical discussions on the Semantic Web and to see
the forest. It is full of fertile ground and fruit for the people willing to put in the tough effort required
to cultivate the fields, and of course much more value for the early adopters.
A couple of years ago I had an interview with Robert Zmud, professor and Michael F. Price Chair in
MIS at the University of Oklahoma. Given his legendary work on the adoption of technologies in business and organizational contexts, I asked him how we can promote the Semantic Web to the business world.
His answer has influenced all of my Semantic Web activities since then. I quote it here:
As with all adoption situations, this is an information and communication problem. One needs to segment the base of potential adopters (both in the IS community and in the business community) and then
develop communication programs to inform each distinct segment of, first, the existence of the innovation (know-what), then the nature of the innovation (know-how), and finally why this innovation would
be useful to them (know-why). These adopter segments are likely to be very different from each other.
Each will have a different likelihood of adoption and will likely require that a somewhat unique communication strategy be devised and directed toward the segment.
So this is why Jorge's current edition, as well as the planned editions, gives an answer to this problem for
many people. The Semantic Web is discussed in the triptych of know-what, know-how, and know-why, and
the editing strategy of the book is boosted by the excellent quality of its well-known contributors. It is really
amazing how Jorge managed it, and how so many academics and practitioners worked collaboratively on this edition.
Robert Zmud concluded his answer with one more statement which is worth mentioning.
My advice thus, is to segment the adopter population, identify those communities with the highest potential
for adoption, develop a targeted communication strategy, and then develop the relationships necessary
to deliver the communication strategy. Hope this helps.
This answer really justifies why you are fortunate to read this book. Semantics are evident everywhere,
in every aspect of business, life, and society (Sheth, Ramakrishnan, & Thomas, 2005). In this sense,
Semantic Web Services: Theory, Tools, and Applications provides a critical step forward in understanding
the state of the art of the Semantic Web.
I am convinced that in the coming years the Semantic Web will drive a new era of real-world applications.
With its transparent capacity to support every business domain, a Semantic Web primer will surely be a
milestone of the knowledge society. Within this context, computer science and information systems
experts have to reconsider their role. They must be able to transform business requirements into systems
and solutions that go beyond traditional analysis and design. This is why a lot of effort must be devoted to
the introduction of the Semantic Web into computer science and information systems curricula. Semantic
Web Services: Theory, Tools, and Applications can be used as an excellent textbook for the relevant themes.
As a concluding remark, I would like to share some thoughts with you. There is always questioning
about the pace of change and the current stage in the evolution of the SW. I do believe that
there is no need to make predictions for the future. The only thing we need is strategy and hard work.
Educating people in the Semantic Web, in computer science departments and in business schools, means
making them realize that semantics, logic, reasoning, and trust are characteristics of humankind that
we must bring to our electronic worlds. If we do not support them, our virtual information world looks
like a giant with glass legs. This is why I like the engineering approach of Jorge in this edition. We must
be able to support the giant with concrete computer engineering in order to make sustainable solutions
for real-world problems. The fine grain of strategy and computer science will lead the Semantic Web to a
maturity level of unforeseen value diffusion.
My invitation is to be part of this exciting new journey, and to keep in mind that the people who
dedicate their lives to the promotion of disciplines for the common wealth need, from time to time, encouragement and support, because their intellectual work is not valued in financial terms. This is why I
want to express my deepest appreciation and respect for Jorge Cardoso as a scientist and a man, and to wish
him to keep rocking in the Semantic Web.
Dear Jorge, you have once again done a great job. And dear readers from all over the world, you have
made the best choice. Let us open the Semantic Web to society together. And why not set together the
new milestones towards a better world for all, through the adoption of leading-edge technologies in
humanistic visions.
Miltiadis D. Lytras, University of Patras, Greece
Endnotes
1. Sheth, A., Ramakrishnan, C., & Thomas, C. (2005). Semantics for the Semantic Web: The implicit,
   the formal and the powerful. International Journal on Semantic Web and Information Systems,
   Inaugural Issue, 1(1), 1-18.
2. Lytras, M. (2005). Semantic Web and information systems: An agenda based on discourse with
   community leaders. International Journal on Semantic Web and Information Systems, Inaugural
   Issue, 1(1), i-xii.
Preface
What is This Book About?
The current World Wide Web is syntactic and the content itself is only readable by humans. The Semantic Web proposes the mark-up of content on the Web using formal ontologies that structure underlying
data for the purpose of comprehensive machine understanding. Currently most Web resources can only
be found and queried by syntactical search engines. One of the goals of the Semantic Web is to enable
reasoning about data entities on different Web pages or Web resources. The Semantic Web is an extension of the current Web in which information is given well-defined meaning, enabling computers and
people to work in co-operation.
Along with the Semantic Web, systems and infrastructures are currently being developed to support
Web services. The main idea is to encapsulate an organization's functionality within an appropriate
interface and advertise it as Web services. While in some cases Web services may be utilized in an
isolated form, it is normal to expect Web services to be integrated as part of Web processes. There is
a growing consensus that Web services alone will not be sufficient to develop valuable Web processes
due to the degree of heterogeneity, autonomy, and distribution of the Web. Several researchers agree that
it is essential for Web services to be machine understandable in order to support all the phases of the
lifecycle of Web processes.
It is therefore indispensable to interrelate and associate two of the hottest R&D and technology areas
currently associated with the Web: Web services and the Semantic Web. The study of the application
of semantics to each of the steps in the Semantic Web process lifecycle can help address critical issues
in reuse, integration and scalability.
design and integration, bio-informatics, education, and so forth. Ontological engineering is defined as the
set of activities that concern the ontology development process, the ontology life cycle, the principles,
methods and methodologies for building ontologies, and the tool suites and languages that support them.
In Chapter III we provide an overview of all these activities, describing the current trends, issues and
problems. More specifically, we cover the following aspects of ontological engineering: (a) Methods
and methodologies for ontology development. We cover both comprehensive methodologies that give
support to a large number of tasks of the ontology development process and methods and techniques
that focus on specific activities of this process, focusing on: ontology learning, ontology alignment and
merge, ontology evolution and versioning, and ontology evaluation; (b) Tools for ontology development. We describe the most relevant ontology development tools, which give support to most of the
ontology development tasks (especially formalization and implementation) and tools that have been
created for specific tasks, such as the ones identified before: learning, alignment and merge, evolution
and versioning and evaluation, and (c) finally, we describe the languages that can be used in the context
of the Semantic Web. This includes W3C recommendations, such as RDF, RDF schema and OWL, and
emerging languages, such as WSML.
Chapter IV gives an overview of editing tools for building ontologies. The construction of an ontology demands the use of specialized software tools. Therefore, we give a synopsis of the tools that we
consider most relevant. The tools we have selected were Protégé, OntoEdit, DOE, IsaViz, Ontolingua,
Altova SemanticWorks, OilEd, WebODE, pOWL and SWOOP. We started by describing each tool and
identifying which tools supported a methodology or other important features for ontology construction.
It is possible to identify some general distinctive features for each software tool. Protégé is used for
domain modeling and for building knowledge-base systems and promotes interoperability. DOE allows
users to build ontologies according to the methodology proposed by Bruno Bachimont. Ontolingua was
built to ease the development of ontologies with a form-based Web interface. Altova SemanticWorks
is a commercial visual editor that has an intuitive visual interface and drag-and-drop functionalities.
OilEd's interface was strongly influenced by Stanford's Protégé toolkit. This editor does not provide
a full ontology development environment. However, it allows users to build ontologies and to check
ontologies for consistency by using the FaCT reasoner. WebODE is a Web application. This editor supports ontology editing, navigation, documentation, merging, reasoning and other activities involved in the
ontology development process. pOWL is capable of supporting parsing, storing, querying, manipulation, versioning and serialization of RDFS and OWL knowledge bases in a collaborative Web enabled
environment. SWOOP is a Web-based OWL ontology editor and browser. SWOOP contains OWL
validation and offers various OWL presentation syntax views. It has reasoning support and provides a
multiple ontology environment.
The aim of Chapter V is to give a general introduction to some of the ontology languages that play
a prominent role on the Semantic Web. In particular, it will explain the role of ontologies on the Web,
review the current standards of RDFS and OWL, and discuss open issues for further developments. In
the context of the Web, ontologies can be used to formulate a shared understanding of a domain in order
to deal with differences in the terminology of users, communities, disciplines and languages as it appears in
texts. One of the goals of the Semantic Web initiative is to advance the state of the current Web through
the use of semantics. More specifically, it proposes to use semantic annotations to describe the meaning
of certain parts of Web information and, increasingly, the meaning of message elements employed by
Web services. For example, the Web site of a hotel could be suitably annotated to distinguish between
the hotel name, location, category, number of rooms, available services, and so forth. Such metadata
could facilitate the automated processing of the information on the Web site, thus making it accessible
to machines and not primarily to human users, as is the case today. The current and most prominent
Web standards for semantic annotations are RDF and RDF Schema, and their extension OWL.
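The hotel scenario above can be sketched in RDF using the Turtle syntax. The vocabulary URI, property names, and hotel data below are illustrative assumptions, not taken from the book:

```turtle
@prefix ex:  <http://example.org/hotel-vocab#> .   # hypothetical vocabulary
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Annotations distinguishing the hotel's name, location, category,
# number of rooms, and available services:
<http://www.hotel-miramar.example/> a ex:Hotel ;
    ex:name          "Hotel Miramar" ;
    ex:location      "Funchal, Madeira" ;
    ex:category      "4-star" ;
    ex:numberOfRooms "120"^^xsd:integer ;
    ex:service       ex:Pool , ex:WirelessInternet .
```

With such triples in place, a machine can answer queries like "find 4-star hotels with wireless Internet" instead of matching keywords in the page text.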
In Chapter VI we describe and explain how reasoning can be carried out on the Semantic Web.
Reasoning is the process needed for using logic. Performing this process efficiently is a prerequisite
for using logic to present information in a declarative way and to construct models of reality. In this
chapter we describe what reasoning over the formal semantics of description logic amounts
to, and illustrate how formal reasoning can (and cannot!) be used for understanding real world
semantics given a good formal model of the situation. We first describe how the formal semantics of
description logic can be understood in terms of completing oriented labeled graphs. In other words we
interpret the formal semantics of description logic as rules for inferring implied arrows in a dots and
arrows diagram. We give an essentially complete graphical overview of OWL that may be used as
an introduction to the semantics of this language. We then touch on the algorithmic complexity of this
graph completion problem, giving a simple version of the tableau algorithm, and give pointers to existing implementations of OWL reasoners. The second part deals with semantics as the relation between
a formal model and reality. We give an extended example building up a small toy ontology of concepts
useful for describing buildings, their physical layout and physical objects such as wireless routers and
printers, in the Turtle notation for OWL. We then describe an (imaginary) building with routers in these
terms. We explain how such a model can help in determining the location of resources given an idealized wireless device that is in or out of range of a router. We emphasize how different assumptions on
the way routers and buildings work are formalized and made explicit in the formal semantics of the
logical model. In particular we explain the sharp distinction between knowing some facts and knowing
all facts (the open versus the closed world assumption). The example also illustrates the fact that reasoning is
no magical substitute for insufficient data. This section should be helpful when using ontologies and
incomplete real world knowledge in applications.
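A toy ontology of the kind the chapter describes might be sketched in Turtle roughly as follows. All class, property, and individual names here are illustrative assumptions, not the authors' actual model:

```turtle
@prefix :     <http://example.org/building#> .   # hypothetical namespace
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:Room    a owl:Class .
:Router  a owl:Class .
:Printer a owl:Class .

:locatedIn a owl:ObjectProperty ; rdfs:range :Room .
:covers    a owl:ObjectProperty ; rdfs:domain :Router ; rdfs:range :Room .

:room101 a :Room .
:room102 a :Room .
:routerA a :Router ;
    :locatedIn :room101 ;
    :covers    :room101 , :room102 .

# Open-world caveat: the absence of a :covers triple does NOT entail
# that a room is out of range -- it is merely unknown.
```

A reasoner can then infer, for instance, that a device in range of :routerA may be in :room101 or :room102, but, as the chapter stresses, it cannot conjure facts the model does not contain.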
Chapter VII gives an introduction to Web service technology. Web services are emerging technologies that allow programmatic access to resources on the Internet. Web services provide a means to create
distributed systems which are loosely coupled, meaning that the interaction between the client and service
is not dependent on one having any knowledge of the other. This type of interaction between components
is defined formally by the service-oriented architecture (SOA). The backbone of Web services is XML.
Extensible Markup Language (XML) is a platform independent data representation which allows the
flexibility that Web services need to fulfill their promise. Simple Object Access Protocol, or SOAP, is
the XML-based protocol that governs the communication between a service and the client. It provides
a platform and programming language independent way for Web services to exchange messages. Web
Service Description Language (WSDL) is an XML-based language for describing a service. It describes
all the information needed to advertise and invoke a Web service. UDDI is a standard for storing WSDL
files in a registry so that they can be discovered by clients. There are other standards for describing
policy, security, reliability, and transactions of Web services that are described in the chapter. With all
this power and flexibility, Web services are fairly easy to build. Standard software engineering practices
are still valid with this new technology though tool support is making some of the steps trivial. Initially,
we design the service as a UML class diagram. This diagram can then be translated (either by hand or by
tools like Poseidon) to a Java interface. This class can become a Web service by adding some annotations
to the Java code that will be used to create the WSDL file for the service. At this point, we need only to
implement the business logic of the service to have a system that is capable of performing the needed
tasks. Next, the service is deployed on an application server, tested for access and logic correctness, and
published to a registry so that it can be discovered by clients.
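To make the SOAP layer concrete, a minimal request envelope might look as follows on the wire. The service operation, namespace, and parameter are invented for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:tns="http://example.org/weather">
  <!-- tns is a hypothetical service namespace -->
  <soap:Body>
    <tns:getTemperature>
      <tns:city>Funchal</tns:city>
    </tns:getTemperature>
  </soap:Body>
</soap:Envelope>
```

The corresponding WSDL document would describe the getTemperature operation, its message types, and the endpoint address, so that a client can generate this envelope automatically.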
In Chapter VIII we introduce and provide an overview of the Business Process Execution Language
for Web services (known as BPEL4WS or BPEL for short), an emerging standard for specifying the
behavior of Web services at different levels of details using business process modeling constructs. BPEL
xiv
represents a convergence between Web services and business process technology. It defines a model and a grammar for describing the behavior of a business process based on interactions between the process and its partners.
Being supported by vendors such as IBM and Microsoft, BPEL is positioned as the process language of the
Internet. The chapter firstly introduces BPEL by illustrating its key concepts and the usage of its constructs to
define service-oriented processes and to model business protocols between interacting Web services. A BPEL
process is composed of activities that can be combined through structured operators and related through control
links. In addition to the main process flow, BPEL provides event handling, fault handling and compensation
capabilities. In the long-running business processes, BPEL applies correlation mechanism to route messages
to the correct process instance. On the other hand, BPEL is layered on top of several XML specifications such
as WSDL, XML schema and XPath. WSDL message types and XML schema type definitions provide the data
model used in BPEL processes, and XPath provides support for data manipulation. All external resources and
partners are represented as WSDL services. Next, to further illustrate the BPEL constructs introduced above, a
comprehensive working example of a BPEL process is given, which covers the process definition, XML schema
definition, WSDL document definition, and the process execution over a popular BPEL-compliant engine. Since
the BPEL specification defines only the kernel of BPEL, extensions can be made in separate documents. The chapter reviews some perceived limitations of BPEL and extensions that have been proposed
by industry vendors to address these limitations. Finally, for an advanced discussion, the chapter considers the
possibility of applying formal methods and Semantic Web technology to support the rigorous development of
service-oriented processes using BPEL.
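The structured activities mentioned above can be illustrated with a minimal receive-invoke-reply process. Process, partner link, and message names below are invented for illustration, not taken from the chapter's working example:

```xml
<process name="QuoteProcess"
         targetNamespace="http://example.org/quote"
         xmlns="http://schemas.xmlsoap.org/ws/2003/03/business-process/"
         xmlns:lns="http://example.org/quote/wsdl">
  <partnerLinks>
    <partnerLink name="client"   partnerLinkType="lns:quoteLT"    myRole="provider"/>
    <partnerLink name="supplier" partnerLinkType="lns:supplierLT" partnerRole="quoter"/>
  </partnerLinks>
  <variables>
    <variable name="request"  messageType="lns:requestMsg"/>
    <variable name="response" messageType="lns:responseMsg"/>
  </variables>
  <sequence>
    <!-- receive a quote request and create a new process instance -->
    <receive partnerLink="client" operation="requestQuote"
             variable="request" createInstance="yes"/>
    <!-- invoke the supplier's WSDL service -->
    <invoke  partnerLink="supplier" operation="getQuote"
             inputVariable="request" outputVariable="response"/>
    <!-- reply to the original caller -->
    <reply   partnerLink="client" operation="requestQuote"
             variable="response"/>
  </sequence>
</process>
```

Note how all external parties appear only as WSDL partner links, and how the data model is carried entirely by WSDL message types, as described above.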
Web services show promise to address the needs of application integration by providing a standards-based
framework for exchanging information dynamically between applications. Industry efforts to standardize Web
service description, discovery and invocation have led to standards such as WSDL, UDDI, and SOAP respectively. These industry standards, in their current form, are designed to represent information about the interfaces
of services, how they are deployed, and how to invoke them, but are limited in their ability to express the capabilities and requirements of services. This lack of semantic representation capabilities leaves the promise of
automatic integration of applications written to Web services standards unfulfilled. To address this, the Semantic
Web community has introduced Semantic Web services. Semantic Web services are the main topic of Chapter
IX. By encoding the requirements and capabilities of Web services in an unambiguous and machine-interpretable
form, semantics make the automatic discovery, composition and integration of software components possible.
This chapter introduces Semantic Web services as a means to achieve this vision. It presents an overview of
Semantic Web services, their representation mechanisms, related work and use cases. Specifically, the chapter
contrasts various Semantic Web service representation mechanisms such as OWL-S, WSMO and WSDL-S and
presents an overview of the research work in the area of Web service discovery, and composition that use these
representation mechanisms.
Web services are software components that are accessible as Web resources in order to be reused by other
Web services or software. Hence, they function as middleware connecting different parties such as companies
or organizations distributed over the Web. In Chapter X, we consider the process of provisioning data about a
Web service to constitute a specification of the Web service. At this point, the question arises how a machine
may attribute machine-understandable meaning to this metadata. Therefore, we argue for the use of ontologies
for giving a formal semantics to Web service annotations, that is, we argue in favor of Semantic Web service
annotations. A Web service ontology defines general concepts such as service or operation as well as relations
that exist between such concepts. The metadata describing a Web service can instantiate concepts of the ontology. This connection helps Web service developers understand and compare the metadata of different
services described by the same or a similar ontology. Consequently, ontology-based Web service annotation
leverages the use, reuse and verification of Web services. The process of Semantic Web service annotation in
general requires input from multiple sources, that is, legacy descriptions, as well as a labor-intensive modeling
effort. Information about a Web service can be gathered for example from the source code of a service
(if annotation is done by a service provider), from the API documentation and description, from the
overall textual documentation of a Web service or from descriptions in WS* standards. Depending on
the structuredness of these sources, semantic annotations may (have to) be provided manually (e.g., if
full text is the input), semi-automatically (e.g. for some WS* descriptions) or fully automatically (e.g.,
if Java interfaces constitute the input). Hence, a semantic description of the signature of a Web service
may be provided by automatic means, while the functionality of Web service operations or pre- and
post-conditions of a Web service operation may only be modeled manually. Benefits of semantic specifications of Web services include a common framework that integrates semantic descriptions of many
relevant Web service properties. It is the purpose of this chapter to explain the conceptual gap between
legacy descriptions and semantic specifications and to indicate how this gap is to be bridged.
Chapter XI deals with methods, algorithms and tools for Semantic Web service discovery. Semantic
Web has revolutionized, among other things, the implementation of the Web service lifecycle. The core
phases of this lifecycle, such as service discovery and composition can be performed more effectively
through the exploitation of the semantics that annotate the service descriptions. This chapter focuses on
the phase of discovery due to its central role in every service-oriented architecture. Hence, it surveys
existing approaches to Semantic Web service (SWS) discovery. Such discovery process is expected to
substitute existing keyword-based solutions (e.g., UDDI) in the near future, in order to overcome their
limitations. First, the architectural components of a SWS discovery ecosystem, along with potential
deployment scenarios, are discussed. Subsequently, a wide range of algorithms and tools that have been
proposed for the realization of SWS discovery are presented. The presentation of the various approaches
aims at outlining the key characteristics of each proposed solution, without delving into technologydependent details (e.g., service description languages). The descriptions of the tools included in this
chapter provide a starting point for further experimentation by the reader. In this respect, a brief tutorial
for a certain tool is provided as an appendix. Finally, key challenges and open issues, not addressed
by current systems, are identified (e.g., evaluation of service retrieval, mediation and interoperability
issues). The ultimate purpose of this chapter is to update the reader on the recent developments in this
area of the distributed systems domain and provide the required background knowledge and stimuli for
further research and experimentation in semantics-based service discovery.
Taking an abstract perspective, Web services can be considered as complex resources on the Web, that
is, resources that might have more complex structure and properties than conventional data that is shared
on the Web. Recently, the Web service modeling ontology (WSMO) has been developed to provide a
conceptual framework for semantically describing Web services and their specific properties in detail.
WSMO represents a promising and rather general framework for Semantic Web service description and is
currently applied in various European projects in the area of Semantic Web services and Grid computing.
In Chapter XII, we discuss how Web service discovery can be achieved within the WSMO Framework.
First, we motivate Semantic Web services and the idea of applying semantics to Web services. We give
a brief high-level overview of the Web service modeling ontology and present the main underlying
principles. We discuss the distinction between two notions that are often intermixed when talking about
Semantic Web services and thus provide a proper conceptual grounding for our framework, namely we
strictly distinguish between services and Web services. Consequently, we distinguish between service
discovery and Web service discovery, where only the latter is considered in detail in the chapter.
Since in open environments like the Web, the assumption of homogeneous vocabularies and descriptions
breaks down, we briefly consider mediation and discuss its role in service and Web service discovery. In doing so,
we try to identify requirements on the discovery process and respective semantic descriptions which allow
facing heterogeneity and scalability at the same time. We then present a layered model of successively
more detailed and precise perspectives on Web services and consider Web service descriptions on each
of them. For the two most fine-grained levels, we then discuss how to detect semantic matches between
requested and provided functionalities. Based on our model, we are able to integrate and extend matching
notions that have been known in the area already. First, we consider Web services essentially as concepts
in an ontology, where required inputs and the condition under which a requested service actually can be
delivered is neglected. Then, we move forward to a more detailed level of description, where inputs and
respective preconditions for service delivery are no longer ignored. We show how to adapt and extend
the simpler model and matching notions from before to adequately address richer semantic descriptions
on this level. The various levels of descriptions are meant to support a wide range of scenarios that can
appear in practical applications, requiring different levels of details in the description of Web services
and client requests, as well as different precision and performance.
Chapter XIII focuses on semantic search engines and data integration systems. As the use of the
World Wide Web has become increasingly widespread, the business of commercial search engines
has become a vital and lucrative part of the Web. Search engines are commonplace tools for virtually
every user of the Internet; and companies, such as Google and Yahoo!, have become household names.
Semantic search engines try to augment and improve traditional Web search engines by using not just
words, but concepts and logical relationships. We believe that data integration systems, domain ontologies and schema based peer-to-peer architectures are good ingredients for developing semantic search
engines with good performance. Data integration is the problem of combining data residing at different
autonomous sources, and providing the user with a unified view of these data; the problem of designing
data integration systems is important in current real world applications, and is characterized by a number
of issues that are interesting from a theoretical point of view. Schema-based peer-to-peer networks are
a new class of peer-to-peer networks, combining approaches from peer-to-peer as well as from the data
integration and Semantic Web research areas. Such networks build upon peers that use metadata (ontologies) to describe their contents and semantic mappings among concepts of different peers' ontologies.
In this chapter, we will provide empirical evidence for our hypothesis. More precisely, we will describe
two projects, SEWASIE and WISDOM, which rely on these architectural features and developed key
semantic search functionalities; they both exploit the MOMIS (www.dbgroup.unimo.it/Momis/) data
integration system. The first, SEWASIE (www.sewasie.org), relies on a two-level ontology architecture: the lower level, called the peer level, contains a data integration system; the upper level, called the super-peer level, integrates peers with semantically related content (i.e., related to the same domain). The second,
WISDOM (www.dbgroup.unimo.it/wisdom/), is based on an overlay network of semantic peers: each
peer contains a data integration system. The cardinal idea of the project is to develop a framework that
supports a flexible yet efficient integration of the semantic content.
Acknowledgments
This book describes the most recent advances in Semantic Web and results of a collaborative effort
towards the development of a comprehensive manuscript that exposes the major issues related to this
new area of research. I wish to express my gratitude to everyone who contributed to making this book a
reality. This project is the accumulation of months of work by many dedicated researchers. Some of the
most well-known researchers in the world have dedicated their precious time to share their experience and
knowledge with you. It would not have been possible for me to produce this work without their help.
Chapter I
Abstract
This chapter gives an overview of the evolution of the Web. Initially, Web pages were intended only for
human consumption and were usually displayed on a Web browser. New Internet business models, such
as B2B and B2C, required organizations to search for solutions to enable a deep interoperability and
integration between their systems and applications. One emergent solution was to define the information on the Web using semantics and ontologies in a way that it could be used by computers not only
for display purposes, but also for interoperability and integration. The research community developed
standards to semantically describe Web information such as the resource description framework and
the Web Ontology Language. Ontologies can assist in communication between human beings, achieve
interoperability among software systems, and improve the design and the quality of software systems.
These evolving Semantic Web technologies are already being used to build semantic Web based systems, ranging from semantic Web services and the semantic integration of tourism information sources to semantic digital libraries and bioinformatics ontologies.
Copyright 2007, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
One restriction of HTML is that it is semantically limited. There is a lack of rich vocabulary
of element types capable of capturing the meaning behind every piece of text. For example,
Google search engine reads a significant number
of the world's Web pages and allows users to
type in keywords to find pages containing those
keywords. There is no meaning associated to the
keywords. Google only carries out simple matches
between the keywords and the words in its database. The metadata of HTML is not considered
when searching for a particular set of keywords.
Even if Google would use HTML metadata to
answer queries, the lack of semantics of HTML
tags would most likely not improve the search.
On the other hand, the Syntactic Web is the
collection of documents in the World Wide Web
that contain data not just meant to be rendered
by Web browsers, but also to be used for data
integration and interoperability purposes. To
be able to understand data, a computer needs
metadata which will be provided by some kind of
markup language. A widespread markup language
is XML. With HTML the set of tags available
to users is predefined and new tags cannot be
added to the language. In contrast, XML is an
extremely versatile markup language that allows users to create new tags to add
syntactic meaning to information.
In order to allow data integration, the meaning of XML document content is determined by
agreements reached between the businesses that
will be exchanging data. Agreements are usually
defined using a standardized document, such
as the document type definition (DTD) (XML,
2005) or the XML schema definition (XSD)
(XMLschema, 2005) that specifies the structure
and data elements of the messages exchanged.
These agreements can then be used by applications to act on the data.
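To sketch how an application might act on such an agreement (the message structure and element names below are hypothetical, not drawn from any particular standard), a consumer can check an incoming XML message against the agreed set of elements before processing it:

```python
import xml.etree.ElementTree as ET

# The structure the two partners agreed on (hypothetical): every order
# message must contain exactly these child elements.
AGREED_ELEMENTS = {"order_id", "product", "quantity"}

def conforms(message: str) -> bool:
    """Check an incoming XML message against the agreed structure."""
    root = ET.fromstring(message)
    return {child.tag for child in root} == AGREED_ELEMENTS

msg = ("<order><order_id>17</order_id>"
       "<product>Book</product><quantity>2</quantity></order>")
print(conforms(msg))                                        # True
print(conforms("<order><product>Book</product></order>"))   # False
```

In practice this kind of check would be expressed declaratively in a DTD or XSD and enforced by a validating parser, rather than coded by hand as above.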
In a typical organization, business data is
stored in many formats and across many systems
and databases throughout the organization and
with partner organizations. To partially solve
integration problems, organizations have been
Unstructured, Semistructured, and Structured Data
Data breaks down into three broad categories
(Figure 3): unstructured, semistructured, and
structured. Highly unstructured data comprises
free-form documents or objects of arbitrary sizes
and types. At the other end of the spectrum, structured data is what is typically found in databases.
Every element of data has an assigned format and
significance.
Unstructured Data
Unstructured data is what we find in text files, video, e-mails, reports, PowerPoint presentations, and other free-form documents.
Figure 3. Examples of semistructured and structured data: an XML document in which <Student> elements (each holding a name, an age, and a degree) are nested inside a <University> element, and a relational table with id, name, age, and degree columns holding the same student records.
Semistructured Data
Semistructured data lie somewhere in between
unstructured and structured data. Semistructured
data are data that have some structure, but are
not rigidly structured. This type of data includes
unstructured components arranged according to
some predetermined structure. Semistructured
data can be specified in such a way that it can be
queried using general-purpose mechanisms.
Semistructured data are organized into entities. Similar entities are grouped together, but
entities in the same group may not have the same
attributes. The order of attributes is not necessarily
important and not all attributes may be required.
The size and type of same attributes in a group
may differ.
An example of semistructured data is a Curriculum Vitae. One person may have a section of
previous employments, another person may have
a section on research experience, and another
may have a section on teaching experience. We
can also find a CV that contains two or more of
these sections.
A very good example of a semistructured
formalism is XML, which is a de facto standard
for describing documents that is becoming the
universal data exchange model on the Web and
is being used for business-to-business transactions. XML supports the development of semistructured documents that contain both metadata
and formatted text. Metadata is specified using
XML tags and defines the structure of documents.
Without metadata, applications would not be able
to understand and parse the content of XML
documents. Compared to HTML, XML provides
explicit data structuring. XML uses DTD or XSD
as schema definitions for the semistructured data
present in XML documents. Figure 3 shows the
(semi) structure of an XML document containing
student records at a university.
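As a small illustration of querying semistructured data with a general-purpose mechanism, the sketch below uses Python's standard xml.etree module on a hypothetical student document in the spirit of Figure 3 (the element names and values are invented for the example):

```python
import xml.etree.ElementTree as ET

# A semistructured document: both records share the same overall shape,
# but the second one omits the optional <Age> element entirely.
doc = """
<University>
  <Student id="1">
    <Name>John</Name><Age>22</Age><Degree>B.Sc.</Degree>
  </Student>
  <Student id="2">
    <Name>David</Name><Degree>Ph.D.</Degree>
  </Student>
</University>
"""

root = ET.fromstring(doc)
for student in root.findall("Student"):   # path-based, schema-tolerant query
    name = student.findtext("Name")
    age = student.findtext("Age", default="unknown")
    print(name, age)
# John 22
# David unknown
```

The query still works even though the records differ, which is exactly what distinguishes semistructured from rigidly structured data.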
Structured Data
In contrast, structured data are very rigid and
describe objects using strongly typed attributes,
which are organized as records or tuples. All records have the same fields. Data are organized in
entities and similar entities are grouped together
using relations or classes. Entities in the same
group have the same attributes. The descriptions
for all the entities in a schema have the same
defined format, predefined length, and follow
the same order.
Structured data have been very popular since
the early days of computing and many organizations rely on relational databases to maintain very
large structured repositories. Recent systems, such
as CRM (customer relationship management),
ERP (enterprise resource planning), and CMS
(content management systems) use structured
data for their underlying data model.
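A minimal sketch of structured data, assuming an illustrative student schema rather than any real system, is a relational table in which every record has the same strongly typed fields:

```python
import sqlite3

# Structured data: the schema fixes the fields and their types before
# any data is stored, and every record conforms to it (illustrative).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, degree TEXT)")
con.executemany("INSERT INTO student VALUES (?, ?, ?)",
                [(1, "John", "B.Sc."), (2, "David", "Ph.D.")])

rows = con.execute("SELECT name, degree FROM student ORDER BY id").fetchall()
print(rows)  # [('John', 'B.Sc.'), ('David', 'Ph.D.')]
```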
Levels of Semantics
As we have seen previously, semantics is the
study of the meaning of signs, such as terms or
words. Depending on the approaches, models, or
methods used to add semantics to terms, different degrees of semantics can be achieved. In this section we identify and describe four representation models with increasing degrees of semantics: controlled vocabularies, taxonomies, thesauri, and ontologies.
Controlled Vocabularies
Controlled vocabularies are at the weaker end of
the semantic spectrum. A controlled vocabulary is
a list of terms (e.g., words, phrases, or notations)
that have been enumerated explicitly. All terms
in a controlled vocabulary should have an unambiguous, non-redundant definition.
Figure. The semantic spectrum: controlled vocabularies provide weak semantics, while taxonomies and, above them, ontologies provide progressively stronger semantics.
Figure. An example of a controlled vocabulary: a flat, explicitly enumerated list of product categories such as those used by an online retailer (Electronics, Travel, Popular Music, Music Downloads, Software, Outlet, Classical Music, Auctions, DVD, Office Products, zShops, VHS, Magazines, Everything Else, Apparel, Scientific Supplies, Yellow Pages, Outdoor Living, Medical Supplies, Restaurants, Kitchen, Indust. Supplies, Movie Showtimes, Automotive, Toys, Beauty, Home Furnishings, Baby, Lifestyle, Computers, Musical Instruments, Pet Toys, Video Games, Health/Personal Care).
Taxonomies
A taxonomy is a subject-based classification that
arranges the terms in a controlled vocabulary into
a hierarchy without doing anything further. The
first users of taxonomies were biologists in the
classification of organisms. They have employed
this method to classify plants and animals according to a set of natural relationships. A taxonomy
classifies terms in the shape of a hierarchy or
tree. It describes a word by making explicit its
relationship with other words. Figure 5 shows a
taxonomy of merchandise that can be bought for
a home.
The hierarchy of a taxonomy contains parent-child relationships, such as "is subclass of" or "is superclass of." A user or computer can comprehend
the semantics of a word by analyzing the existing relationship between the word and the words
around it in the hierarchy.
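The parent-child reasoning described above can be sketched in a few lines; the hierarchy below is a small hypothetical fragment of a merchandise taxonomy, not a real classification:

```python
# A taxonomy represented as explicit is-subclass-of links
# (a hypothetical fragment for illustration).
parent = {
    "Coffee table": "Furnishings",
    "Sofa": "Furnishings",
    "Printer": "Hardware",
    "Hardware": "Computers",
    "Furnishings": "Home",
    "Computers": "Home",
}

def is_subclass_of(term: str, ancestor: str) -> bool:
    """Walk the parent-child links upward to test 'is subclass of'."""
    while term in parent:
        term = parent[term]
        if term == ancestor:
            return True
    return False

print(is_subclass_of("Printer", "Home"))    # True
print(is_subclass_of("Sofa", "Computers"))  # False
```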
Figure 5. A taxonomy of merchandise for a home: Home branches into Furnishings (coffee table, futon, sofa, lavatory, toilet, bathtub) and Computers, which in turn branches into Hardware (printer, scanner, modem, network) and Software (antivirus, OS, editing, spreadsheet, drawing).

Thesaurus

A thesaurus extends a taxonomy by relating terms not only through parent-child links but also through relationships such as "used for," "narrower than," "broader than," and "related to."

Figure. A thesaurus entry for an academic achievement term: its narrower terms include Academic Overachievement, Academic Underachievement, College Academic Achievement, Mathematics Achievement, Reading Achievement, and Science Achievement; its broader term is Achievement; the entry also records "used for" and "related to" terms.
XML
XML is accepted as a standard for data interchange on the Web allowing the structuring of
data on the Web but without communicating the
meaning of the data. It is a language for semistructured data and has been proposed as a solution
for data integration problems, because it allows
a flexible coding and display of data, by using
metadata to describe the structure of data (using
DTD or XSD).
In contrast to HTML, with XML it is possible
to create new markup tags, such as <first_name>,
which carry some semantics. However, from
a computational perspective, a tag like <first_
name> is very similar to the HTML tag <h1>.
While XML is highly helpful for a syntactic
interoperability and integration, it carries as
much semantics as HTML. Nevertheless, XML solved many problems that were previously impossible to solve with HTML, namely data exchange and integration.
A well-formed XML document creates a balanced tree of nested sets of open and closed tags,
each of which can include several attribute-value
pairs. The following structure shows an example
of an XML document identifying a Contact resource. The document includes various metadata
markup tags, such as <first_name>, <last_name>,
and <email>, which provide various details about
a contact.
<Contact contact_id="…">
   <first_name>Jorge</first_name>
   <last_name>Cardoso</last_name>
   <organization>University of Madeira</organization>
   <email>[email protected]</email>
   <phone>…</phone>
</Contact>
RDF
On top of XML, the World Wide Web Consortium (W3C) has developed the Resource Description Framework (RDF) (RDF, 2002) language to
standardize the definition and use of metadata.
Therefore, XML and RDF each have their merits
as a foundation for the semantic Web, but RDF
provides more suitable mechanisms for developing ontology representation languages like OIL
(Connolly, van Harmelen, et al., 2001).
RDF uses XML and it is at the base of the
semantic Web, so that all the other languages
corresponding to the upper layers are built on
top of it. RDF is a formal data model for machine
understandable metadata used to provide standard
descriptions of Web resources. By providing a
standard way of referring to metadata elements,
specific metadata element names, and actual
metadata content, RDF builds standards for XML
applications so that they can interoperate and
intercommunicate more easily, facilitating data
and system integration and interoperability. At
first glance it may seem that RDF is very similar
to XML, but a closer analysis reveals that they
are conceptually different. If we model the information present in an RDF model using XML,
human readers would probably be able to infer
the underlying semantic structure, but general
purpose applications would not.
RDF has a very limited set of syntactic constructs: no constructs other than triples are allowed. Every RDF document is equivalent to
an unordered set of triples. The example from
Figure 7 describes the following statement using
a RDF triple:
Figure 7. An RDF triple: the resource http://dme.uma.pt/jcardoso/ has a property of type Creator whose property value is "Jorge Cardoso."
<RDF xmlns="…" xmlns:DC="…dces#">
   <Description about="http://dme.uma.pt/jcardoso/">
      <DC:Title>Jorge Cardoso Home Page</DC:Title>
      <DC:Creator>Jorge Cardoso</DC:Creator>
      <DC:Date>…</DC:Date>
   </Description>
</RDF>
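The point that every RDF document is equivalent to an unordered set of triples can be sketched as follows; the code is an illustration of the triple model, not an RDF parser, and it reuses the resource and Dublin Core property names from the example above:

```python
# An RDF model reduced to its essence: an unordered set of
# (subject, predicate, object) triples (illustrative only).
triples = {
    ("http://dme.uma.pt/jcardoso/", "DC:Creator", "Jorge Cardoso"),
    ("http://dme.uma.pt/jcardoso/", "DC:Title", "Jorge Cardoso Home Page"),
}

def objects(subject: str, predicate: str) -> set:
    """Query: all values o such that (subject, predicate, o) holds."""
    return {o for s, p, o in triples if s == subject and p == predicate}

print(objects("http://dme.uma.pt/jcardoso/", "DC:Creator"))
# {'Jorge Cardoso'}
```

Because the model is just a set, the order in which statements appear in the serialized document carries no meaning, unlike the element order in a plain XML document.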
RDF Schema
The RDF schema (RDFS, 2004) provides a type
system for RDF. RDFS is a technological advance over RDF, since it provides a
way of building an object model from which the
actual data is referenced and which tells us what
things really mean.
Figure. An RDF graph for the resource "Jorge Cardoso Web Page": its DC:Creator property has the value "Jorge Cardoso" and its DC:Subject property has the value "Home Page."
Box 1.
<?xml version="1.0"?>
<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
   xml:base="http://www.hr.com/humanresources#">

   <!-- class -->
   <rdf:Description rdf:ID="staff">
      <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
   </rdf:Description>

   <!-- class, subclass of staff -->
   <rdf:Description rdf:ID="manager">
      <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
      <rdfs:subClassOf rdf:resource="#staff"/>
   </rdf:Description>
</rdf:RDF>
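A sketch of the kind of inference an RDFS-aware application can draw from such subclass statements is given below; the "employee" class is a hypothetical addition to the staff/manager example, and the code illustrates the transitivity of rdfs:subClassOf rather than implementing a real reasoner:

```python
# rdfs:subClassOf facts from the Box 1 example, extended with one
# hypothetical class ("employee") to make the inference visible.
facts = {("manager", "staff"), ("staff", "employee")}

def closure(pairs):
    """Derive every implied subclass pair (subClassOf is transitive)."""
    derived = set(pairs)
    changed = True
    while changed:
        new = {(a, d) for a, b in derived for c, d in derived if b == c}
        changed = not new <= derived
        derived |= new
    return derived

print(("manager", "employee") in closure(facts))
# True, although this pair was never stated directly
```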
Ontologies
An ontology is an agreed vocabulary that provides a set of well-founded constructs to build
meaningful higher level knowledge for specifying
the semantics of terminology systems in a well
defined and unambiguous manner. For a particular
domain, an ontology represents a richer language
for providing more complex constraints on the
types of resources and their properties. Compared
to a taxonomy, ontologies enhance the semantics
of terms by providing richer relationships between
the terms of a vocabulary. Ontologies are usually
expressed in a logic-based language, so that detailed and meaningful distinctions can be made
among the classes, properties, and relations.
Ontologies can be used to improve communication between humans, between computers, or between humans and computers. The three major uses of ontologies (Jasper & Uschold, 1999) are:

1. To assist in communication between human beings.
2. To achieve interoperability among software systems.
3. To improve the design and the quality of software systems.
Applications of the Semantic Web
Even though the Semantic Web is still in its infancy, there are already applications and tools that
use this conceptual approach to build semantic
Web based systems. The intention of this section
is to present the state of the art of the applications
that use semantics and ontologies. We describe
various applications, ranging from the use of semantic Web services and the semantic integration of tourism information sources to semantic digital libraries and the development of bioinformatics ontologies.
Although the Web Services Description Language (WSDL) does not contain semantic
descriptions, it specifies the structure of message
components using XML schema constructs. One
solution to create semantic Web services is by
mapping concepts in a Web service description
(WSDL specification) to ontological concepts
(LSDIS, 2004). The WSDL elements that can be
marked up with metadata are operations, messages, and preconditions and effects, since all
the elements are explicitly declared in a WSDL
description. Approaches and initiatives whose goal is to specify Web services using semantics
and ontologies include OWL-S (OWL-S, 2004),
SWSI (SWSI, 2004), SWWS (SWWS, 2004),
WSML (WSML, 2004), WSMO (WSMO, 2004),
WSMX (WSMX, 2004), and WSDL-S (Akkiraju,
Farrell, et al., 2006).
As traditional libraries are increasingly converting to digital libraries, a new set of requirements has emerged. One important feature of
digital libraries is the ability to efficiently browse
electronic catalogues. This requires the
use of common metadata to describe the records of
the catalogue (such as author, title, and publisher)
and common controlled vocabularies to allow
subject identifiers to be assigned to publications.
The use of a common controlled vocabulary,
thesauri, and taxonomy (Smrz, Sinopalnikova
et al., 2003) allows search engines to ensure
that the most relevant items of information are
returned. Semantically annotating the contents
of a digital library's database goes beyond the use of a controlled vocabulary, thesaurus, or taxonomy. It allows book records to be retrieved by adding meaningful information to the existing full-text and bibliographic descriptions.
Semantic Web technologies, such as RDF
and OWL, can be used as a common interchange
format for catalogue metadata and shared vocabulary, which can be used by all libraries and search
engines (Shum, Motta et al., 2000) across the Web.
This is important since it is not uncommon to
find library systems based on various metadata
formats and built by different persons for their
special purposes. By publishing ontologies, which
can then be accessed by all users across the Web,
library catalogues can use the same vocabularies for cataloguing, marking up items with the
most relevant terms for the domain of interest.
RDF and OWL provide a single and consistent
encoding system so that implementers of digital
library metadata systems will have their task
simplified when interoperating with other digital
library systems.
Semantic Grid
The concept of Grid (Foster & Kesselman, 1999)
has been proposed as a fundamental computing
infrastructure to support the vision of e-Science.
The Grid is a service for sharing computer power and data storage capacity over the Internet.
Conclusion
Since its creation, the World Wide Web has allowed computers only to understand Web page
layout for display purposes without having access to their intended meaning. The semantic
Web aims to enrich the existing Web with a
layer of machine-understandable metadata to
enable the automatic processing of information
by computer programs. The semantic Web is not
a separate Web but an extension of the current
one, in which information is given well-defined
meaning, better enabling computers and people
to work in cooperation. To make possible the
creation of the semantic Web the W3C (World
Wide Web Consortium) has been actively working
on the definition of open standards, such as RDF and OWL, and to encourage their use by both
industry and academia. These standards are also
important for the integration and interoperability
for intra- and inter-business processes that have
become widespread due to the development of
business-to-business and business-to-customer
infrastructures.
The Semantic Web does not restrict itself to the formal semantic description of Web resources for machine-to-machine exchange and automated processing.
References
Akkiraju, R., Farrell, J., Miller, J., Nagarajan, M., Schmidt, M., Sheth, A., & Verma, K. (2006). Web service semantics: WSDL-S. Retrieved February 20, 2007, from http://www.
w3.org/Submission/WSDL-S
Berners-Lee, T., Hendler, J., & Lassila, O. (2001,
May). The Semantic Web. Scientific American,
34-43.
Bodenreider, O., Aubry, M., & Burgun, A. (2005).
Non-lexical approaches to identifying associative
relations in the gene ontology. Paper presented at
the Pacific Symposium on Biocomputing, Hawaii.
World Scientific.
BPEL4WS. (2002). Business process execution language for Web services. IBM.
Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1-7), 107-117.
LSDIS. (2004). METEOR-S: Semantic Web services and processes. Retrieved February 20, 2007
from http://lsdis.cs.uga.edu/projects/meteor-s/
Endnote
1
http://fwrlibrary.troy.edu/1/dbhelp/dbhelppsychology.htm
Chapter II
Abstract
This chapter introduces a number of formal logical languages which form the backbone of the Semantic Web. They are used for the representation of both ontologies and rules. The basis for all languages
presented in this chapter is the classical first-order logic. Description logics is a family of languages
which represent subsets of first-order logic. Expressive description logic languages form the basis for
popular ontology languages on the Semantic Web. Logic programming is based on a subset of firstorder logic, namely Horn logic, but uses a slightly different semantics and can be extended with nonmonotonic negation. Many Semantic Web reasoners are based on logic programming principles and
rule languages for the Semantic Web based on logic programming are an ongoing discussion. Frame
Logic allows object-oriented style (frame-based) modeling in a logical language. RuleML is an XML-based syntax consisting of different sublanguages for the exchange of specifications in different logical
languages over the Web.
Introduction
An important property of the Semantic Web is
that the information contained in it is specified
using a formal language in order to enable machine-processability and the derivation of new
knowledge from existing knowledge. Logical
languages are such formal languages.
Using logical languages for knowledge representation allows one to derive new knowledge
which is implicit in existing descriptions. Additionally, the use of formal languages allows
one to write unambiguous statements and allows
machines to derive implicit information using
formal rules of deduction associated with the
language. Finally, logical languages have been
extensively studied in the research areas of databases and artificial intelligence. It is for these
reasons that logical languages form the backbone
of the Semantic Web.
Logical languages can be used for the representation of different kinds of knowledge, most
notably ontologies and rules. In this chapter we
describe a number of logical languages which are
being used for the representation of knowledge
on the Semantic Web.
Classical first-order logic (Fitting, 1996) is
the basis for all the languages we survey in this
chapter. Full first-order logic by itself is a very
expressive language. In fact, the language is so
expressive that reasoning with the language is in
general very hard and the most interesting problems are undecidable. The answer to a question
such as Does sentence follow from theory ?
cannot always be found. For these reasons, several
subsets of first-order logic have been investigated
and form the basis for several languages which are
used on the Semantic Web, most notably description logics and logic programming. Nonetheless,
full first-order logic has been proposed as a language for the Semantic Web (Battle, Bernstein,
Boley, Grosof, Gruninger, & Hull, 2005; Horrocks,
Patel-Schneider, Boley, Tabet, Grosof, & Dean,
2004; Patel-Schneider, 2005).
Description logics (Baader, Calvanese, McGuinness, Nardi, & Patel-Schneider, 2003) are a
family of languages which generally represent
strict subsets of first-order logic. Description logics
were originally devised to formalize frame-based
knowledge representation systems. Languages
in this family typically allow the definition of
concepts, concept hierarchies, roles and certain
restrictions on roles. Description logics receive a
lot of attention as a basis for ontology languages
on the Semantic Web; most notably, the W3C
recommendation OWL is based on an expressive
description logic (Horrocks, Patel-Schneider, &
Harmelen, 2003).
Logic programming (Lloyd, 1987) is based on
the Horn logic subset of first-order logic, which
allows one to write rules of the form if A then
B. In order to allow for efficient reasoning, the semantics of logic programming is built around Herbrand interpretations, rather than first-order interpretations.
first-order logic
The basic building blocks of first-order logic (FOL)
are constants, function symbols and predicates.
Constants are interpreted as objects in some abstract domain. Function symbols are interpreted
as functions and predicates are interpreted as relations over the domain. The domain may consist
of objects representing such things as numbers,
persons, cars, and so forth. The relations may be
such things as greater-than, marriage, top
speed, and so forth. Constants, predicates and
function symbols are combined with variables and
logical connectives to obtain formulas.

Of these,

p(a, b), q(f(a), b), r(g(f(a))), a = f(b)

are ground atomic formulas.
Definition 2 (Formulas) Given the atomic formulas of L, we define the set of formulas in L as follows:

∀x.(∃y.(p(x, y) ∧ q(f(a), x) ∧ r(y)))
(p(a, b) ∨ r(f(b))) ∧ ∀z.(q(z, f(z)))
∃x.(p(x, y)) ∨ r(y)
description logics
Description logics (Baader et al., 2003) (formerly
called Terminological Logics) are a family of
knowledge representation languages, which
revolve mainly around concepts, roles (which
denote relationships between concepts), and role
restrictions. Description logics are actually based
on first-order logic. Therefore, concepts can be
seen as unary predicates, whereas roles can be
seen as binary predicates. Although there are also
Description Logic languages which allow n-ary
roles, we will not discuss these here.
In this section we will illustrate description logics through the relatively simple description logic
Attributive Language with Complement (ALC)
which allows concepts, concept hierarchies,
role restrictions and the boolean combination
of concept descriptions. The currently popular
expressive description logics, such as SHIQ and
SHOIQ, are all extensions of ALC.
Box 1.
ALC concept descriptions C, D are formed according to:

    C, D ::=  A         (named class)
           |  ⊤         (universal concept)
           |  ⊥         (bottom concept)
           |  C ⊓ D     (intersection)
           |  C ⊔ D     (union)
           |  ¬C        (negation)
           |  ∃R.C      (existential restriction)
           |  ∀R.C      (universal restriction)

Translation π of concepts to first-order formulas:

    π(A, X)      =  A(X)
    π(⊤, X)      =  true
    π(⊥, X)      =  false
    π(C ⊓ D, X)  =  π(C, X) ∧ π(D, X)
    π(C ⊔ D, X)  =  π(C, X) ∨ π(D, X)
    π(¬C, X)     =  ¬(π(C, X))
    π(∃R.C, X)   =  ∃y.(R(X, y) ∧ π(C, y))
    π(∀R.C, X)   =  ∀y.(R(X, y) → π(C, y))

Translation of TBox axioms:

    π(C ⊑ D)  =  ∀x.(π(C, x) → π(D, x))
    π(C ≡ D)  =  ∀x.((π(C, x) → π(D, x)) ∧ (π(D, x) → π(C, x)))

Translation of ABox assertions:

    π(i ∈ A)         =  A(i)
    π((i1, i2) ∈ R)  =  R(i1, i2)
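The translation π of concepts to first-order formulas can be sketched as a small recursive function. The tuple encoding, the ASCII connectives and the function name below are our own illustrative choices, not part of the chapter:

```python
# A minimal sketch of the pi translation. ALC concepts are nested tuples
# such as ("and", C, D), ("exists", R, C) or ("not", C); named classes
# and roles are plain strings. Returns a first-order formula as a string.

def pi(concept, x, fresh=None):
    if fresh is None:
        fresh = iter("yzuvw")          # supply of fresh variable names
    if isinstance(concept, str):       # named class A  ->  A(X)
        return f"{concept}({x})"
    op = concept[0]
    if op == "and":
        return f"({pi(concept[1], x, fresh)} & {pi(concept[2], x, fresh)})"
    if op == "or":
        return f"({pi(concept[1], x, fresh)} | {pi(concept[2], x, fresh)})"
    if op == "not":
        return f"~({pi(concept[1], x, fresh)})"
    if op == "exists":                 # existential restriction
        y = next(fresh)
        return f"exists {y}.({concept[1]}({x},{y}) & {pi(concept[2], y, fresh)})"
    if op == "forall":                 # universal restriction
        y = next(fresh)
        return f"forall {y}.({concept[1]}({x},{y}) -> {pi(concept[2], y, fresh)})"
    raise ValueError(f"unknown constructor: {op}")

print(pi(("exists", "hasChild", "Doctor"), "x"))
# exists y.(hasChild(x,y) & Doctor(y))
```

Feeding the resulting formula to a first-order prover then decides concept satisfiability, along the lines of the reduction discussed in this section.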
Both reasoning tasks can be reduced to reasoning problems in first-order logic. A concept
C is satisfiable with respect to a TBox T if and
only if π(T) ∪ {C(a)} is satisfiable. Essentially, we
translate the TBox to first-order logic and add an
instance a of the concept C and check whether the
resulting theory is consistent, that is, satisfiable.
Notice that we have already applied this technique
in Example 8.
Subsumption can be reduced to entailment in first-order logic: C ⊑ D if and only if π(T) |= π(C ⊑ D). Thus, we translate the TBox to a first-order theory, translate the subsumption axiom which we want to check to a first-order sentence, and check whether the one entails the other. We have already seen that entailment in first-order logic can be reduced to satisfiability checking. Similarly, subsumption can be reduced directly to satisfiability, namely, C is subsumed by D with respect to TBox T if and only if C ⊓ ¬D is not satisfiable with respect to T.
Reasoning in most description logics is decidable and there exist optimized algorithms for
reasoning with certain description logics.
logic programming
Logic programming is based on a subset of first-order logic, called Horn logic. However, the semantics of logic programming is slightly different from that of first-order logic. The semantics of logic programs is based on minimal Herbrand models (Lloyd, 1987), rather than first-order models.
A logic program consists of rules of the form
if A then B. Intuitively, if A is true, then B must
be true. Logic programming plays two major roles
on the Semantic Web. On the one hand, it is used
to reason with RDF (Klyne & Carroll, 2004),
RDF Schema (Brickley & Guha, 2004) and parts
of OWL (Dean & Schreiber, 2004). On the other hand, it is used to represent knowledge on the Semantic Web in the form of rules.
Euler1 and CWM2 are examples of reasoners
for the Semantic Web, based on logic programming. Euler and CWM both work directly with
RDF data and can be used to derive new RDF
data using rules.
Rules can be seen as a knowledge representation paradigm complementary to Description
logics. Description logics are very convenient
for defining classes, class hierarchies, properties and the relationships between them. More
specifically, compared with logic programming,
description logics have the following expressive
power: existential quantification, disjunction and
classical negation. Logic programs, on the other
hand, have the following additional expressive
power: predicates with arbitrary arities, chaining
variables over predicates (there are no restrictions
on the use of variables), and the use of nonmonotonic negation. An often-quoted example which illustrates the expressive power of rules compared with ontologies is: if x is the brother of a parent of y, then x is an uncle of y. This example cannot be expressed using description logics, because the variables x and y are both used on both sides of the implication, which is not possible in description logics.
logic Programs
Classical logic programming makes use of the Horn logic fragment of first-order logic. A first-order formula is in the Horn fragment if it is a disjunction of literals with at most one positive literal, in which all variables are universally quantified:

(∀) h ∨ ¬b1 ∨ ... ∨ ¬bn

Such a formula is usually written as a rule:

h :- b1, ..., bn.

A rule without a head is a query:

?- b1, ..., bn.

A Herbrand interpretation w satisfies a rule if, whenever b1, ..., bn ∈ w, then also h ∈ w.

A substitution replaces variables in a formula with terms, for example:

p(x)[x/a] = p(a)
(p(x, y) ∨ ¬q(x) ∨ ¬q(z))[x/f(a), y/a] = p(f(a), a) ∨ ¬q(f(a)) ∨ ¬q(z)
(¬q(y) ∨ r(z, f(y)))[y/g(b), z/a] = ¬q(g(b)) ∨ r(a, f(g(b)))

Note that, in the presence of function symbols, query answering need not terminate; for example, evaluating the query ?- p(Y). against the single rule p(X) :- p(f(X)). leads to an infinite chain of resolution steps.
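The effect of applying a substitution can be reproduced with a few lines of Python; the tuple representation of terms below is our own illustrative choice:

```python
# A sketch of substitution application. A term is a variable or constant
# (a string) or a compound term written as a tuple, e.g. ("f", "a") for
# f(a). A substitution is a dict from variable names to terms.

def subst(term, s):
    if isinstance(term, tuple):        # compound term: apply s to each argument
        return (term[0],) + tuple(subst(t, s) for t in term[1:])
    return s.get(term, term)           # variable: replace; constant: keep

# p(x)[x/a] = p(a)
print(subst(("p", "x"), {"x": "a"}))
# ('p', 'a')
```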
mother(mary,jill).
father(john,jill).
parent(jack,john).
parent(X,Y) :- mother(X,Y).
parent(X,Y) :- father(X,Y).
ancestor(X,Y) :- parent(X,Y).
ancestor(X,Z) :- ancestor(X,Y), ancestor(Y,Z).
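The minimal Herbrand model of this program can be computed bottom-up: start from the facts and apply the rules until nothing new is derived. A sketch in Python, with the rules hand-encoded for brevity (the encoding is ours, not the chapter's):

```python
# Naive bottom-up evaluation of the family program: apply all rules to
# the facts derived so far, and repeat until a fixpoint is reached. The
# resulting set is the minimal Herbrand model of the program.

facts = {("mother", "mary", "jill"),
         ("father", "john", "jill"),
         ("parent", "jack", "john")}

def step(facts):
    new = set(facts)
    for (p, x, y) in facts:
        if p in ("mother", "father"):      # parent(X,Y) :- mother(X,Y). / father(X,Y).
            new.add(("parent", x, y))
        if p == "parent":                  # ancestor(X,Y) :- parent(X,Y).
            new.add(("ancestor", x, y))
    for (p, x, y) in facts:                # ancestor(X,Z) :- ancestor(X,Y), ancestor(Y,Z).
        for (q, y2, z) in facts:
            if p == q == "ancestor" and y == y2:
                new.add(("ancestor", x, z))
    return new

while True:
    updated = step(facts)
    if updated == facts:                   # fixpoint reached
        break
    facts = updated

print(("ancestor", "jack", "jill") in facts)   # True
```

Here ancestor(jack, jill) is derived via parent(jack, john) and parent(john, jill), exactly the kind of implicit knowledge the chapter's introduction mentions.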
A rule may additionally contain nonmonotonically negated literals in its body:

h :- b1, ..., bm, not n1, ..., not nl.
In order to check whether a program is stratifiable, we can check the dependency graph. This graph is built in a similar way as discussed above, with the addition that if there is a rule with head q and with a negative body literal not p, then there is an arc between p and q and this arc is marked with not. If there are cycles in the graph which include a negative arc, then the program is not stratifiable.
Example 17 Consider the logic program:

p :- q.
q :- not r, p.
r :- q.

This program is not stratifiable: q depends negatively on r, while r depends positively on q, so the dependency graph contains a cycle which includes a negative arc. Without the rule r :- q, the program would be stratifiable, with strata:

stratum 0: {r}
stratum 1: {p,q}

Now consider the following logic program:

p :- q.
q :- not r.
p :- not s.
r :- s.
t :- not q.

This program is stratifiable, with strata:

stratum 0: {r,s}
stratum 1: {p,q}
stratum 2: {t}
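The stratifiability test can be sketched in Python. Rules are encoded as triples (head, positive body, negative body); a head's stratum must be at least that of every positive body predicate and strictly greater than that of every negated one, and a stratum that grows beyond the number of predicates signals a cycle through negation. The encoding is our own:

```python
# A sketch of the stratification check: compute a stratum for every
# predicate by iteration. If some stratum exceeds the number of
# predicates, the dependency graph has a cycle through a negative arc
# and the program is not stratifiable (we return None).

def stratify(rules):
    preds = set()
    for (h, pos, neg) in rules:
        preds |= {h} | set(pos) | set(neg)
    stratum = {p: 0 for p in preds}
    while True:
        changed = False
        for (h, pos, neg) in rules:
            s = max([stratum[p] for p in pos] +
                    [stratum[p] + 1 for p in neg] + [0])
            if s > stratum[h]:
                if s > len(preds):         # cycle through negation
                    return None
                stratum[h], changed = s, True
        if not changed:
            return stratum

# p :- q.  q :- not r.  p :- not s.  r :- s.  t :- not q.
rules = [("p", ["q"], []), ("q", [], ["r"]), ("p", [], ["s"]),
         ("r", ["s"], []), ("t", [], ["q"])]
print(sorted(stratify(rules).items()))
# [('p', 1), ('q', 1), ('r', 0), ('s', 0), ('t', 2)]

# p :- q.  q :- not r, p.  r :- q.  -- not stratifiable:
print(stratify([("p", ["q"], []), ("q", ["p"], ["r"]), ("r", ["q"], [])]))
# None
```

The computed strata for the second program of Example 17 match the ones listed above.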
frame logic
Frame logic (Kifer et al., 1995) (F-Logic) is an
extension of first-order logic which adds explicit
support for object-oriented modeling. It is possible to explicitly specify methods, as well as
generalization/specialization and instantiation
relationships. The syntax of F-Logic has some
seemingly higher-order features, for example, the
same identifier can be used for both a class and
an instance. However, the semantics of F-Logic
is strictly first-order.
Although F-Logic was originally defined as an
extension to full First-Order Logic, the original
paper (Kifer et al., 1995) already defined a Logic
Programming-style semantics for the subset of
F-Logic based on Horn logic. Intuitively, the
Horn subset is obtained in the usual way with
the addition that, besides predicate symbols with
arguments, F-Logic molecules can also be seen
as atomic formulas. In the remainder, we will
refer to the Horn subset of F-Logic with logic programming semantics as F-Logic programming.
There exist several implementations of F-Logic
programming, most notably (Decker, Erdmann,
Fensel, & Studer, 1999; Yang, Kifer, & Zhao,
2003). Since most of the attention around F-Logic concerns F-Logic programming, we will restrict
ourselves to the logic programming semantics for
F-Logic and disregard the FOL semantics.
f-logic Programs
To simplify matters, we focus only on a subset of F-Logic. We do not consider parametrised methods or functional (single-valued) methods, and we consider only non-inheritable methods. We also do not consider compound molecules.

Formally, an F-Logic theory is a set of formulas constructed from atomic formulas, as defined for first-order logic, and so-called molecules. Let Σ be a first-order signature, as defined before, and let T be the set of terms which can be constructed from Σ.
man::person.
woman::person.
mother::woman.
mother::parent.
father::man.
father::parent.
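Since the subclass relation :: is transitive, the hierarchy above entails, for example, mother::person. A small Python sketch of this closure (the pair encoding is ours):

```python
# Transitive closure of the subclass (::) molecules above, computed by
# a fixpoint loop: keep adding (a, d) whenever (a, b) and (b, d) are
# already in the closure.

edges = {("man", "person"), ("woman", "person"),
         ("mother", "woman"), ("mother", "parent"),
         ("father", "man"), ("father", "parent")}

closure = set(edges)
while True:
    extra = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
    if extra <= closure:
        break
    closure |= extra

print(("mother", "person") in closure)   # True: mother::woman::person
```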
RuleML
RuleML (Boley, Dean, Grosof, Kifer, Tabet,
& Wagner, 2005; Hirtle, Boley, Grosof, Kifer,
Sintek, & Tabet, 2005) provides an XML-based
exchange syntax for different rule languages, as
well as for first-order logic.
RuleML can be seen as an exchange format for most of the languages which have been mentioned in this chapter. In order to capture different logical languages, RuleML defines a number of sublanguages.
conclusion
Logical languages allow one to infer information
which is implicit in descriptions one creates in
these languages. The ability to infer new information from existing information is seen as an
important feature for languages on the Semantic
Web. Therefore, many Semantic Web languages,
described in the other chapters of this book, are
based on formal languages such as the ones we
have seen in this chapter.
As we have seen in this chapter, there are many differences between these formal languages, not only in their formal properties, but also in terms of modeling in the language.
We summarize the most important traits of
the surveyed languages:
First-order logic. A very expressive language.
Reasoning with first-order logic (FOL) is in general undecidable.
references
Angele, J., Boley, H., Bruijn, J. de, Fensel, D.,
Hitzler, P., & Kifer, M. (2005). Web rule language
(WRL). W3C Member Submission 09 September
2005.
Baader, F., Calvanese, D., McGuinness, D.L.,
Nardi, D., & Patel-Schneider, P.F. (Eds.). (2003).
The description logic handbook. Cambridge
University Press.
Battle, S., Bernstein, A., Boley, H., Grosof, B.,
further reading
endnotes

1. http://www.agfa.com/w3c/euler/
2. http://www.w3.org/2000/10/swap/doc/cwm.html
3. Note that a recent version of SQL, namely SQL:99, allows a limited form of recursion in queries.
4. Note that there are extensions of logic programming which deal with classical negation (Gelfond & Lifschitz, 1991), but we will not discuss these here.
Chapter III
Ontological Engineering:
What are Ontologies and
How Can We Build Them?
Oscar Corcho
University of Manchester, UK
Mariano Fernández-López
Universidad San Pablo CEU and Universidad Politécnica de Madrid, Spain
Asunción Gómez-Pérez
Universidad Politécnica de Madrid, Spain
abstract
Ontologies are formal, explicit specifications of shared conceptualizations. There is much literature on what they are, how they can be engineered and where they can be used inside applications. All this literature can be grouped under the term ontological engineering, which is defined as the set of activities that concern the ontology development process, the ontology lifecycle, the principles, methods and methodologies for building ontologies, and the tool suites and languages that support them. In this chapter we provide an overview of ontological engineering, describing the current trends, issues and problems.
introduction
The origin of ontologies in computer science can be traced back to 1991, in the context of
the DARPA Knowledge Sharing Effort (Neches,
Fikes, Finin, Gruber, Senator, & Swartout, 1991).
The aim of this project was to devise new ways of
constructing knowledge-based systems, so that
Binary relations are sometimes used to express concept attributes (i.e., slots). Attributes
are usually distinguished from relations because
their range is a datatype, such as string, number,
and so forth, while the range of relations is a
concept. For example, one can define an attribute flightNumber whose range is a string. We can also express relations of higher arity, such as a road connects two different cities.
resources needed for their completion. This activity is essential for ontologies that use ontologies
stored in ontology libraries or for ontologies that
require a high level of abstraction and generality.
The control activity guarantees that scheduled tasks are performed in the manner intended. Finally, the quality assurance activity
assures that the quality of each and every product
output (ontology, software and documentation)
is satisfactory.
Ontology development oriented activities
are grouped, as presented in Figure 1, into
predevelopment, development and postdevelopment activities. During the predevelopment, an
environment study identifies the problem to be
solved with the ontology, the applications where
the ontology will be integrated, and so forth. Also
during the predevelopment, the feasibility study
answers questions like: is it possible to build the
ontology?; is it suitable to build the ontology?;
and so forth.
Once in the development, the specification
activity3 states why the ontology is being built,
what its intended uses are and who the end-users
are. The conceptualisation activity structures
the domain knowledge as meaningful models
2. None of the approaches covers all the processes involved in ontology building. Most of the methods and methodologies for building ontologies are focused on the development activities, especially on the ontology conceptualisation and ontology implementation, and they do not pay too much attention to other important aspects related to management, learning, merge, integration, evolution and evaluation of ontologies. Therefore, such types of methods should be added to the methodologies for ontology construction from scratch (Fernández-López & Gómez-Pérez, 2002b). Most of the approaches are focused on development activities, especially on the ontology implementation, and they do not pay too much attention to other important aspects related to the management, evolution and evaluation of ontologies. This is due to the fact that the ontological engineering field is relatively new. However, a low compliance with the criteria formerly established does not mean a low quality of the methodology or method. As de Hoog (1998) states, a not very specified method can be very useful for an experienced group.
3. Most of the approaches present some drawbacks in their use. Some of them have not been used by external groups and, in some cases, they have been used in a single domain.
4. Most of the approaches do not have a specific tool that gives them technology support. Besides, none of the available tools covers all the activities necessary in ontology building.
Rewriting rules:

Travel := Traveling
TravelByPlane := C such that subclassOf(C, Traveling) and C.hasTransportMean = Plane
C such that subclassOf(C, Travel) and C.hasTransportMean = Bus := TravelingByBus
Origin := OriginPlace
Destination := DestinationPlace
New York := NY
Date :=
HasTransportMean := HasTransportMean
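The direct (name-to-name) rules in this list amount to a lookup table. A Python sketch (ours) of applying them, leaving aside the conditional rules for TravelByPlane and TravelingByBus, which require matching class descriptions:

```python
# Applying the direct renaming rules from the list above. Terms with no
# rule (such as HasTransportMean, which maps to itself) pass through
# unchanged. The conditional rules are not covered by this sketch.

rules = {"Travel": "Traveling",
         "Origin": "OriginPlace",
         "Destination": "DestinationPlace",
         "New York": "NY"}

def rewrite(term):
    return rules.get(term, term)

print(rewrite("Origin"))            # OriginPlace
print(rewrite("HasTransportMean"))  # HasTransportMean
```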
tion (Daelemans & Reinberger, 2004; Gómez-Pérez, 2004; Guarino, 2004; Noy, 2004). According to their conclusions, although good ideas have been provided in this area, there are still important gaps. Other interesting works are (Guo, Pan, &
Heflin, 2004) and the aforementioned EON2004
experiment.
ontology tools
Ontology tools appeared in the mid-1990s with the objective of supporting the development of ontologies, whether or not following a specific set of methods or a methodology. Taking into
account the characteristics of their knowledge
models, ontology tools can be classified in the
following two groups:
ontology languages
Ontology languages started to be created at the
beginning of the 1990s, normally as the evolution of existing knowledge representation (KR)
languages. Basically, the KR paradigms underlying such ontology languages were based on first-order logic (e.g., KIF (Genesereth & Fikes, 1992)), on frames combined with first-order logic (e.g., Ontolingua (Farquhar et al., 1997) (Gruber, 1992), OCML (Motta, 1999) and F-Logic (Kifer et al., 1995)), and on description logics (e.g., Loom (MacGregor, 1991)). In 1997, OKBC (Chaudhri et al., 1998) was created as a unifying frame-based protocol to access ontologies implemented in different languages (Ontolingua, Loom and CycL, among others). However, it was only used in a small number of applications.
The boom of the Internet led to the creation
of ontology languages for exploiting the characteristics of the Web. Such languages are usually
called Web-based ontology languages or ontology markup languages. Their syntax is based
on existing markup languages such as HTML
(Raggett, Le Hors, & Jacobs, 1999) and XML
(Bray, Paoli, Sperberg-McQueen, & Maler, 2000),
whose purpose is not ontology development but
data presentation and data exchange respectively.
The most important examples of these markup
languages are: SHOE (Luke & Heflin, 2000), XOL
(Karp, Chaudhri, & Thomere, 1999), RDF (Las-
conclusion
In the beginning of the 1990s, ontology development was similar to an art: ontology developers did not have clear guidelines on how to build ontologies, but only some design criteria to be followed. Work on principles, methods and methodologies, together with supporting technology, turned ontology development into an engineering discipline. This migration process was mainly due to the definition of the ontology development process and the ontology lifecycle, which described the steps to be performed in order to build ontologies and the interdependencies among all those steps.
In this chapter we have reviewed existing
ontology principles, methods and methodologies,
tools, and languages. The following is a summary
of the chapter:
Ontology engineers have available methodologies that guide them along the ontology development process. Methontology is the methodology
that provides the most detailed descriptions of the
processes to be performed; On-To-Knowledge
acknowledgments
This work has been partially supported by the
IST project Knowledgeweb (FP6-507482) and
by the Spanish project Semantic Services (TIN
2004-02660).
references
Arpírez, J.C., Corcho, O., Fernández-López, M., & Gómez-Pérez, A. (2003). WebODE in a nutshell.
AI Magazine.
Aussenac-Gilles, N., Bibow, B., & Szulman, S.
(2000a). Revisiting ontology design: A methodology based on corpus analysis. In R. Dieng &
O. Corby (Eds.), Proceedings of the 12th International Conference in Knowledge Engineering and
Knowledge Management (EKAW00), Juan-LesPins, France, (LNAI, 1937, pp. 172-188). Berlin:
Springer-Verlag.
Aussenac-Gilles, N., Bibow, B., & Szulman, S.
(2000b). Corpus analysis for conceptual modelling.
In N. Aussenac-Gilles, B. Bibow & S. Szulman
(Eds.), Proceedings 51 of EKAW00 Workshop
on Ontologies and Texts, Juan-Les-Pins, France
(pp. 1.1-1.8), CEUR Workshop. Amsterdam, The Netherlands.
CRC Press.
Dean, M. & Schreiber, G. (2004). OWL Web ontology language reference. W3C Recommendation.
Retrieved October 23, 2006, from http://www.
w3.org/TR/owl-ref/
Declerck, T. & Uszkoreit, H. (2003). State of the
art on multilinguality for ontologies, annotation
services and user interfaces. Esperonto deliverable D1.5. Retrieved October 22, 2006, from
http://www.esperonto.net
Doan, A., Madhavan, J., Domingos, P., & Halevy,
A. (2002). Learning to map between ontologies
on the Semantic Web. In D. Lassner (Ed.), Proceedings of the 11th International World Wide Web
Conference (WWW 2002), Honolulu, Hawaii. Retrieved October 23, 2006, from http://www2002.
org/refereedtrack.html
Domingue, J. (1998). Tadzebao and webOnto:
Discussing, browsing, and editing ontologies on
the Web. In B.R. Gaines & M.A. Musen (Eds.),
11th International Workshop on Knowledge Acquisition, Modeling and Management (KAW98)
(KM4, pp. 1-20). Banff, Canada.
Ehrig, M., & Staab, S. (2004). QOM: Quick ontology mapping. In S.A. McIlraith & D. Plexousakis
(Eds.), 3rd International Semantic Web Conference (ISWC04), Hiroshima, Japan. (LNCS 3298,
pp. 683-697). Berlin: Springer-Verlag.
Euzenat, J. (2004). An API for ontology alignment. In S.A. McIlraith & D. Plexousakis (Eds.),
3rd International Semantic Web Conference
(ISWC04), Hiroshima, Japan. (LNCS 3298, pp.
698-712). Berlin: Springer-Verlag.
Farquhar, A., Fikes, R., & Rice, J. (1997). The
ontolingua server: A tool for collaborative ontology construction. International Journal of Human
Computer Studies, 46(6), 707-727.
Fernández-López, M., & Gómez-Pérez, A.
(2002a). Overview and analysis of methodologies
KSL-94-65.html
Gómez-Pérez, A. (1996). A framework to verify
knowledge sharing technology. Expert Systems
with Application, 11(4), 519-529.
Gómez-Pérez, A. (2001). Evaluation of ontologies. International Journal of Intelligent Systems,
16(3), 391-409.
Gómez-Pérez, A., Fernández-López, M., & Corcho, O. (2003). Ontological engineering. London:
Springer.
Gómez-Pérez, A., Fernández-López, M., & de
Vicente, A. (1996). Towards a method to conceptualize domain ontologies. In P. van der Vet (Ed.),
ECAI96 Workshop on Ontological Engineering
(pp. 41-52). Budapest, Hungary.
Gómez-Pérez, A., & Manzano, D. (2003). A survey
of ontology learning methods and techniques.
OntoWeb deliverable D.1.5. Retrieved October
23, 2006, from http://www.ontoweb.org
Gómez-Pérez, A. & Rojas, M.D. (1999). Ontological reengineering and reuse. In D. Fensel & R.
Studer (Eds.), 11th European Workshop on Knowledge Acquisition, Modeling and Management
(EKAW99), Dagstuhl Castle, Germany. (LNAI
1621, pp. 139-156). Berlin: Springer-Verlag.
Genesereth, M.R. & Fikes, R.E. (1992). Knowledge interchange format. Version 3.0. Reference
Manual (Tech. Rep. Logic-92-1). Stanford University, Computer Science Department. Retrieved
October 23, 2006, from http://meta2.stanford.
edu/kif/Hypertext/kif-manual.html
Grüninger, M. & Fox, M.S. (1995). Methodology for the design and evaluation of ontologies.
In D. Skuce (Ed.), IJCAI95 Workshop on Basic
Ontological Issues in Knowledge Sharing (pp.
6.1-6.10).
Mena, E., Kashyap, V., Sheth, A.P., & Illarramendi, A. (1996). OBSERVER: An approach for
query processing in global information systems
based on interoperation across pre-existing ontologies. In W. Litwin (Ed.), First IFCIS International
Conference on Cooperative Information Systems
(CoopIS96) (pp. 14-25). Brussels, Belgium.
Welty, C., & Guarino, N. (2001). Supporting ontological analysis of taxonomic relationships. Data
and Knowledge Engineering, 39(1), 51-74.
Wu, S.H., & Hsu, W.L. (2002). SOAT: A semi-automatic domain ontology acquisition tool from Chinese corpus. In W. Lenders (Ed.), 19th International Conference on Computational Linguistics
(COLING02), Taipei, Taiwan.
endnotes

1. Component names depend on the formalism. For example, classes are also known as concepts, entities and sets; relations are also known as roles and properties; and so forth.
2. http://www.omg.org/
3. In (28) specification is considered as a predevelopment activity. However, following
Chapter IV
Editing Tools for Ontology Creation

Ana Lisete Nunes Escórcio and Jorge Cardoso
abstract
This chapter gives an overview of some editing tools for ontology construction. At the present time, a project such as building an ontology demands the use of a software tool; therefore, we give a synopsis of the tools that the authors consider most relevant. If you are starting out on an ontology project, the first step is to find a suitable ontology editor. The authors hope that, after reading this chapter, it will be possible to choose an editing tool for ontology construction according to the project goals. The tools have been described following a list of features that the authors believe are pertinent: collaborative ontology editing, versioning, graphical tree view, OWL editing and many others (see Appendix 2).
introduction
The World Wide Web is mainly composed of
documents written in HTML (Hypertext Markup
Language). This language is useful for visual
presentation since it is a set of markup symbols
contained in a Web page intended for display on
a Web browser. Humans can read Web pages and
understand them, but their inherent meaning is
not shown in a way that allows their interpretation
by computers. The information on the Web can
be defined in a way that can be used by computers not only for display purposes, but also for
interoperability and integration between systems
and applications (Cardoso, 2005).
The Semantic Web is not a separate Web
but an extension of the current one, in which
information is given a well-defined meaning,
better enabling computers and people to work in
cooperation (Berners-Lee, Hendler, & Lassila,
2001). The Semantic Web was made through
incremental changes by bringing machine read-
Protégé
Protégé (Noy, Sintek, Decker, Crubezy, Fergerson, & Musen, 2001) is one of the most widely used ontology development tools; it was developed at Stanford University. Since Protégé is free and open source, it is supported by a large community of active users. It has been used by experts in domains such as medicine and manufacturing for domain modeling and for building knowledge-based systems. Protégé provides an intuitive editor for ontologies and has extensions for ontology visualization, project management, software engineering and other modeling tasks.
In early versions, Protégé only enabled users to build and populate frame-based ontologies in accordance with the Open Knowledge Base Connectivity (OKBC) protocol. In this model, an ontology consisted of a set of classes organized in a subsumption hierarchy, a set of slots associated to classes to describe their properties and relationships, and a set of instances of those classes. The Protégé editor included support for classes and class hierarchies with multiple inheritance; templates and slots; predefined and arbitrary facets for slots, which included permitted values, cardinality restrictions, default values, and inverse slots; and metaclasses and metaclass hierarchies.
While the first architecture of Protégé was based on frames, in 2003 it was extended to support OWL. This extension has attracted many users captivated by the Semantic Web vision. The OWL plug-in extends the Protégé platform into an ontology editor for OWL, enabling users to build ontologies for the Semantic Web. The OWL plug-in allows users to load, save, edit and visualize ontologies in OWL and RDF. It also provides interfaces for description logic reasoners such as RACER.
Protégé ontologies can be exported into a variety of formats including RDF(S), OWL, and Extensible Markup Language (XML) Schema.
The current Protégé version can be used to edit classes and their characteristics, to access reasoning engines, to edit and execute queries and rules, to compare ontology versions, to visualize relationships between concepts, and to acquire instances using a configurable graphical user interface. Protégé is a tool installed locally on a computer and does not allow collaborative editing of ontologies by groups of users.
Protégé can be extended by way of a plug-in architecture and a Java-based Application Programming Interface (API) for building knowledge-base tools and applications. Protégé is based on Java and provides an open-source API to develop Semantic Web and knowledge-base stand-alone applications. External Semantic Web applications can use the API to directly access Protégé knowledge bases without running the Protégé application. An OWL API is also available to provide access to OWL ontologies. Its extensible architecture makes it suitable as a base platform for ontology-based research and development projects. Protégé also includes a Programming Development Kit (PDK), an important resource for programmers that describes how to work directly
OntoEdit
OntoEdit (Sure, Angele, & Staab, 2002) was developed by the Knowledge Management Group of the AIFB Institute at the University of Karlsruhe. It is an ontology engineering environment which allows creating, browsing, maintaining and managing ontologies. The environment supports the collaborative development of ontologies (Sure, Erdmann, Angele, et al., 2002). This is achieved through its client/server architecture, where ontologies are managed on a central server and various clients can access and modify these ontologies. Currently, the successor of OntoEdit is OntoStudio (Figure 2), which is a commercial product based on IBM's development environment Eclipse. It can be downloaded for a three-month free evaluation period.
OntoEdit was developed with two major goals in mind. On the one hand, the editor was designed to be as independent and neutral as possible of a concrete representation language. On the other hand, it was planned to provide a powerful graphical user interface to represent concept hierarchies, relations, domains, ranges, instances and axioms. OntoEdit supports F-Logic (frame logic), RDF Schema and OIL. The tool is multilingual: each concept or relation name can be specified in several languages. This is particularly useful for the collaborative development of ontologies by teams of researchers spread across several countries and speaking different languages. From the technical perspective, this
DOE
DOE is a simple ontology editor and was developed by the INA (Institut National de lAudiovisuel
- France). DOE allows users to build ontologies
according to the methodology proposed by Bruno
Bachimont (Bachimont et al., 2002).
DOE follows a classical formal specification process, but it supports only the specification part of the ontology-structuring process. DOE is best seen as a complement to other editors (DOE, 2006): it is not intended to compete directly with existing environments (such as Protégé, OilEd, OntoEdit, or WebODE), but to coexist with them. The editor offers linguistics-inspired techniques which attach a lexical definition to the concepts and relations used, and justify their hierarchies from a theoretical, human-understandable point of view (DOE, 2006). DOE should therefore be used in combination with another editor that provides advanced formal specification, for example Protégé.
DOE is a simple prototype developed in Java that supports the three steps of the Bruno Bachimont methodology (Isaac et al., 2002); the methodology is also described in the manual.

Figure 4. The differential principles bound to the notion addressed in the DOE tool (differential ontology view)

Figure 5. The referential ontology view with a concept tree of the traveling ontology
On the DOE Web page (http://opales.ina.fr/public/) there is a self-contained installer of the complete product for Windows. The user has to fill out a form in order to access the download page. The installation is fast: after downloading the product, the user runs the Setup-DOE-v1.51.exe file and follows the instructions. To run DOE on another platform, a Java 2 Platform, Standard Edition, version 1.3 or later is required (v1.4.1 recommended).
IsaViz
IsaViz is a visual environment for browsing and authoring RDF models as graphs. The tool is offered by the W3C. IsaViz (2006) was developed by Emmanuel Pietriga. The first version was developed in collaboration with Xerox Research Centre Europe, which also contributed XVTM, the ancestor of ZVTM (Zoomable Visual Transformation Machine), upon which IsaViz is built. As of October 2004, further development is handled by the INRIA Futurs project In Situ. IsaViz also includes software developed by HP Labs.
Ontolingua
OilEd
OilEd is a graphical ontology editor developed by the University of Manchester for description logic ontologies (Gómez-Pérez et al., 2004). The main purpose of OilEd is to provide a tool for ontology construction, with export to several formats.
Figure 7 shows the OilEd editor. The tabs on the main window give access to classes, properties, individuals, and axioms. In each tab, a list on the left side shows all the items. On the right panel there is the editor, used to view or change the information concerning the entry selected in the item list on the left. The editor's reasoner interface was reimplemented so that it could use the DIG (DL Implementation Group) protocol. OilEd includes a DAML+OIL checker: given the location of an ontology, the application checks the syntax of DAML+OIL ontologies and returns a report of the model.
OilEd is provided free of charge, but the user has to provide some information before downloading the editor. Registration is a simple process requiring a valid e-mail address. The user has to register in order to receive a password to revisit the main Web page and download patches and extensions when necessary.
In order to use OWL ontologies, the user must download the sesame-oil.jar library, place it in the /lib directory of the user's installation, and remove the existing oil.jar. Sesame is a repository for OWL ontologies. This will add new options to the menus.
Figure 9. OilEd ontology editor view with the class anomaly detection
There are several possible packages for installation. OilEd comes packaged with the FaCT reasoner, although alternative reasoners can be used. OilEd 3.5.7 (Windows) + reasoner: this archive contains a FaCT reasoner for Windows. OilEd 3.5.7 (Linux) + reasoner: this updated version supports exporting knowledge base models to OWL-RDF, and the archive also contains the FaCT reasoner. OilEd 3.5.7 (no reasoner): this archive does not contain a reasoner; use this distribution only if you are not interested in using a reasoner or wish to use OilEd with an alternative DIG reasoner.
WebODE
WebODE has been developed by the Ontological Engineering Group (OEG) of the Artificial Intelligence Department of the Computer Science Faculty (FI) at the Technical University of Madrid. Its architecture is organized in three tiers:

Presentation Tier;
Middle Tier;
Database Tier.
pOWL
pOWL (Auer, 2005) is a PHP-based open source ontology management tool. pOWL supports parsing, storing, querying, manipulating, versioning, serving, and serializing RDFS and OWL knowledge bases in a collaborative, Web-enabled environment.
pOWL does not follow any specific methodology for developing ontologies. It supports heterogeneous data and its formal description, and tries to follow the W3C Semantic Web standards. pOWL can import and export model data in different serialization formats, such as RDF/XML and N-Triples.
pOWL is designed to work with ontologies of arbitrary size, limited only by disk space: only the parts of the ontology required to display the information requested by the user are loaded into main memory. It offers an RDQL (RDF Data Query Language) query builder, accessible through a dedicated tab. RDQL is an implementation of an SQL-like query language for RDF. It is possible to query the knowledge base as well as a
Figure 12. View of the pOWL ontology editor showing the property elements of the coCondition
SWOOP
SWOOP is a Web-based OWL ontology editor and browser. SWOOP includes OWL validation and offers various OWL presentation syntax views. It has reasoning support and provides a multiple-ontology environment: ontologies can be compared, edited, and merged. Different ontologies can be compared against their description logic-based definitions, associated properties, and instances. SWOOP's interface is hyperlinked, so navigation is simple and easy. SWOOP does not follow a methodology for ontology construction.
Users can reuse external ontological data (Kalyanpur, Parsia, & Hendler, 2005), either by purely linking to the external entity or by importing the entire external ontology. Partial imports of OWL are not possible; there are several ways to approximate them, such as a brute-force syntactic scheme that copies and pastes relevant parts (axioms) of the external ontology, or a more elegant solution that partitions the external ontology while preserving its semantics and then reuses (imports) only the desired partition.
It is possible to search concepts across multiple ontologies. SWOOP makes use of an ontology search algorithm that combines keywords with DL-based reasoning in order to find related concepts. The search covers all the ontologies stored in the SWOOP knowledge base.
With SWOOP it is possible to have collaborative annotation using the Annotea plug-in, which supports useful and efficient Web ontology development. Users may also download annotated changes for a given ontology. The plug-in is used to write and share annotations on any ontological entity, and different SWOOP users can subscribe to the server. Users can maintain different versions of the same ontology, since mechanisms exist to maintain versioning information using a public server.
SWOOP takes the standard Web browser as its user-interface paradigm, so this Web ontology browser has a layout that is familiar to most users. There is a navigation sidebar on the left (Figure 13) containing a multiple-ontology list and a class/property hierarchy of the ontology. The center pane works like an editor, with a range of ontology/entity renderers for presenting the core content.
The editor provides support for ontology partitioning: OWL ontologies can be automatically partitioned into distinct modules, each describing a separate domain. There is also support for ontology debugging and repair. SWOOP can identify the precise axioms that cause errors in an ontology and provide natural language explanations of the errors; automatic generation of repair plans to resolve all errors is also provided. To make the class hierarchy easier to understand, a CropCircles visualization was implemented.
SWOOP is based on the conventional Model-View-Controller (MVC) paradigm. The SWOOP Model component stores all ontology-centric information pertaining to the SWOOP workspace and defines key parameters used by the SWOOP UI objects. A SWOOP ModelListener is used to reflect changes in the UI based on changes in the SWOOP Model (Kalyanpur et al., 2005). Control is managed through a plug-in-based system which loads new renderers and reasoners dynamically, guaranteeing modularity of the code and encouraging external contributions.
Conclusion
Software tools are available to support most of the activities of ontology development. Projects often involve solutions using numerous ontologies from external sources, sometimes combined with existing and newly developed in-house ontologies. For this reason it is important that editing tools for ontology construction promote interoperability. As we have stated, Protégé is used for domain modeling and for building knowledge-based systems. Protégé provides an intuitive editor for ontologies and has extensions for ontology visualization, project management, software engineering, and other modeling tasks. It also provides interfaces for description logic reasoners such as Racer. Protégé ontologies can be exported into a variety of formats, including RDF(S), OWL, and XML Schema. It is a tool installed locally on a computer and does not allow collaborative editing of ontologies by groups of users. There are several plug-ins available for this tool. OntoEdit offers an ontology engineering environment which allows creating, browsing, maintaining, and managing ontologies. This editor supports collaborative development of ontologies. The successor of OntoEdit is OntoStudio, a commercial product.
References
Auer, S. (2005, May 30). pOWL: A Web-based platform for collaborative Semantic Web development. In Proceedings of the Workshop on Scripting for the Semantic Web, Heraklion, Greece.
Bachimont, B., Isaac, A., & Troncy, R. (2002, October 1-4). Semantic commitment for designing ontologies: A proposal. In A. Gómez-Pérez & V.R. Benjamins (Eds.), 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2002), Sigüenza, Spain (LNAI 2473, pp. 114-121). Springer-Verlag.
Bechhofer, S., Horrocks, I., Goble, C., & Stevens, R. (2001). OilEd: A reason-able ontology editor for the Semantic Web. In Proceedings of KI2001, Joint German/Austrian Conference on Artificial Intelligence.
Appendix 1
Table 1. List of some of the representative ontology tools

Apollo: http://apollo.open.ac.uk/index.html
CoGITaNT: http://cogitant.sourceforge.net/
DAMLImp (API): http://codip.grci.com/Tools/Components.html
DOE: http://opales.ina.fr/public/
http://lalab.gmu.edu/
DUET: http://codip.grci.com/Tools/Tools.html
http://www.semagix.com/
http://www.kermanog.com/
ICOM: http://www.cs.man.ac.uk/~franconi/icom/
http://www.ontologyworks.com/
IsaViz: http://www.w3.org/2001/11/IsaViz/
JOE: http://www.cse.sc.edu/research/cit/demos/java/joe/
KAON: http://kaon.semanticweb.org/
http://www.isis.vanderbilt.edu/Projects/micants/Tech/Demos/KBE/
http://www.georeferenceonline.com/
LinKFactory Workbench: http://www.landc.be/
http://www.sandsoft.com/products.html
NeoClassic: http://www-out.bell-labs.com/project/classic/
OilEd: http://oiled.man.ac.uk/
Onto-Builder: http://ontology.univ-savoie.fr/
OntoEdit: http://www.ontoprise.de/com/ontoedit.htm
Ontolingua: http://www.ksl.stanford.edu/software/ontolingua/
http://www.ontopia.net/solutions/products.html
Ontosaurus: http://www.isi.edu/isd/ontosaurus.html
OntoTerm: http://www.ontoterm.com/
OpenCyc: http://www.opencyc.org/
OpenKnoMe: http://www.topthing.com/
PC Pack 4: http://www.epistemics.co.uk/
Protégé-2000: http://protege.stanford.edu/index.html
RDFAuthor: http://rdfweb.org/people/damian/RDFAuthor/
RDFedt: http://www.jan-winkler.de/dev/e_rdfe.htm
SemTalk: http://www.semtalk.com/
Specware: http://www.specware.org/
Taxonomy Builder: http://www.semansys.com/about_composer.html
TOPKAT: http://www.aiai.ed.ac.uk/~jkk/topkat.html
WebKB: http://meganesia.int.gu.edu.au/~phmartin/WebKB/doc/generalDoc.html
WebODE: http://delicias.dia.fi.upm.es/webODE/
WebOnto: http://kmi.open.ac.uk/projects/webonto/
Appendix 2
The following table (Table 2) lists the features that we considered most important when deciding which ontology tool to use: versioning; collaborative ontology edition; graphical class/properties taxonomy; graphical tree view; support for the growth of large-scale ontologies; querying; friendly user interface; consistency check; and OWL editing. The versioning feature keeps track of the version evolution that the ontology undergoes. Collaborative ontology edition is a very useful feature, since it allows users to edit ontologies in a collaborative manner. The graphical class/properties taxonomy provides an interesting view of the classes and properties of the ontology, allowing them to be inspected graphically. The graphical tree view shows a generic graph view of the ontology, making it possible to get an overview of its structure. If the ontology editor supports the growth of large-scale ontologies, the user can be sure that it is possible to build an ontology using only one editor. The querying feature allows querying the knowledge base. It is important that the user interface is friendly and similar to other interfaces, so that the user does not need to spend too much time getting to know the tool. Consistency checking is important in order to check the propagation of arity along the hierarchy and the inheritance of domains.
Table 2. Most representative features (comparing DOE, IsaViz, Ontolingua, SemanticWorks 2006, OilEd, SWOOP, pOWL, and WebODE on: versioning; collaborative ontology edition; graphical class/properties taxonomy; support for the growth of large-scale ontologies; querying; consistency check (syntax); and OWL editor)
Chapter V
Abstract
Web ontology languages will be the main carriers of the information that we will want to share
and integrate. The aim of this chapter is to give a general introduction to some of the ontology
languages that play a prominent role on the Semantic Web. In particular, it will explain the
role of ontologies on the Web and in ICT, review the current standards of RDFS and OWL, and
discuss open issues for further developments.
The Role of Web Ontologies

The Role of Ontologies in ICT
The term ontology originates from philosophy. In that context, it is used as the name of a subfield of philosophy, namely, the study of the nature of existence (the literal translation of the Greek word from which it derives): the branch of metaphysics concerned with identifying, in the most general terms, the kinds of things that actually exist and how to describe them. For example, the observation that the world is made up of specific objects that can be grouped into abstract classes based on shared properties is a typical ontological statement.
In the early 1990s, a series of large-scale experiments took place in order to integrate multiple,
heterogeneous databases (Bayardo et al., 1996;
Chawathe, Garcia-Molina, Hammer, Ireland,
Papakonstantinou, Ullman, & Widom, 1994;
Wiederhold, 1992). These experiments revealed
that database integration must ultimately be based
on explicit, formal knowledge representation of
the underlying common meaning of the involved
data structures rather than on formal schema manipulation only. With the work of Thomas Gruber
(1994) and others (Guarino, 1998; Sowa, 2000;
Uschold & Gruninger, 1996), the extraordinary importance of formal ontology for the design and operation of information systems was widely recognized.
Copyright 2007, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
Typical knowledge expressed in an ontology includes:

Properties (X teaches Y);
Value restrictions (only faculty members can teach courses);
Disjointness statements (faculty and general staff are disjoint);
Specification of logical relationships between objects (every department must include at least ten faculty members).
search. Ontology mappings allow for relating keywords and thus widening the recall of the search.
In addition, Web searches can exploit generalization/specialization information. If a query fails
to find any relevant documents under a certain
term, it may try to find documents pertaining to
specializations of the term. Otherwise, the search
engine may suggest to the user a more general
query. It is even conceivable for the engine to run
such queries proactively to reduce the reaction
time in case the user adopts a suggestion. Or if
too many answers are retrieved, the search engine
may suggest to the user some specializations. In
this way, differences in terminology between Web
pages and the queries can be overcome.
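The fallback-to-specializations strategy just described can be sketched in a few lines of Python; the term hierarchy and the document index below are invented purely for illustration:

```python
# Sketch of ontology-aware query broadening: if a term yields no
# documents, fall back to its specializations (hypothetical toy data).
narrower = {  # term -> direct specializations
    "vehicle": ["car", "bicycle"],
    "car": ["convertible"],
}

index = {  # term -> documents mentioning it
    "car": ["doc1"],
    "convertible": ["doc2"],
    "bicycle": ["doc3"],
}

def search(term):
    """Return documents for term; on failure, try its specializations."""
    hits = list(index.get(term, []))
    if hits:
        return hits
    for sub in narrower.get(term, []):
        hits.extend(search(sub))
    return hits

print(search("vehicle"))  # ['doc1', 'doc3']: found via specializations
```

The converse direction, suggesting a more general term when too many answers are retrieved, would walk the same hierarchy upward.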
Also, ontologies are useful for the organization of content and navigation of Web sites in the
manner of library classification systems. Many
Web sites today expose on the left-hand side of
the page the top levels of a concept hierarchy
of terms. The user may click on one of them to
expand the subcategories. Ontologies may better
represent and relate concepts the user is looking
for than traditional library classification.
Basics of Ontology Languages
Once we have classes we would also like to establish relationships between them. For example,
every professor is an academic staff member.
We say that professor is a subclass of academic
staff member, or equivalently, that academic staff
member is a superclass of professor. The subclass
relationship defines a hierarchy of classes. In general, if A is a subclass of B then every instance
of A is also an instance of B.
A hierarchical organization of classes has a
very important practical significance, which we
outline now. Consider the range restriction
Courses must be taught by academic staff
members only.
Suppose Michael Maher was defined as a professor. Then, according to the restriction above, he
is not allowed to teach courses. The reason is that
there is no statement which specifies that Michael
Maher is also an academic staff member. Obviously, this difficulty could be overcome by adding
that statement to our description. However, this
solution adds redundancy to the representation,
with the usual negative effects on maintainability.
Instead we would like Michael Maher to inherit
the ability to teach from the class of academic
staff members.
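The inheritance argument can be made concrete with a small Python sketch; the single-parent dictionaries below are a simplification of a real class hierarchy, and the names follow the running example:

```python
# Sketch: an individual inherits membership in all superclasses of its
# asserted class, so no redundant "academic staff member" fact is needed.
subclass_of = {"professor": "academicStaffMember",
               "academicStaffMember": "staffMember"}

instance_of = {"MichaelMaher": "professor"}

def classes_of(individual):
    """All classes the individual belongs to, walking up the hierarchy."""
    result = []
    c = instance_of.get(individual)
    while c is not None:
        result.append(c)
        c = subclass_of.get(c)
    return result

print(classes_of("MichaelMaher"))
# ['professor', 'academicStaffMember', 'staffMember']
```

Because Michael Maher is inferred to be an academic staff member, the range restriction on teaching is satisfied without any extra statement.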
Property Hierarchies
We saw that hierarchical relationships between
classes can be defined. The same can be done
for properties. For example, is taught by is a
subproperty of involves. If a course c is taught
by an academic staff member a, then c also involves a. The converse is not necessarily true. For
example, a may be the convenor of the course, or
a tutor who marks student homework, but does
not teach c. In general, if P is a subproperty of Q
then Q(x,y) whenever P(x,y).
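The rule "Q(x,y) whenever P(x,y)" amounts to closing the set of asserted triples under the subproperty hierarchy, which can be sketched as follows (the triples are illustrative):

```python
# Sketch: if P is a subproperty of Q, every asserted P-triple
# entails the corresponding Q-triple.
subproperty_of = {"isTaughtBy": "involves"}

triples = {("CS101", "isTaughtBy", "Maher")}

def entailed(triples, subproperty_of):
    """Close a set of (s, p, o) triples under the subproperty hierarchy."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(closed):
            q = subproperty_of.get(p)
            if q and (s, q, o) not in closed:
                closed.add((s, q, o))
                changed = True
    return closed

print(("CS101", "involves", "Maher") in entailed(triples, subproperty_of))
```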
The requirements for a Web ontology language include:

well-defined syntax;
well-defined semantics;
efficient reasoning support;
sufficient expressive power;
convenience of expression.
RDF Schema

There is an obvious distinction between individual objects and classes, which are collections of objects. The following states that an individual e belongs to (is an instance of) a class c:

Individual(e type(c)) (e is of type c)

Classes and properties are declared with Class(c) and Property(p), and property hierarchies with SubPropertyOf(pi pj). Given a statement Individual(ei value(p ej)), the declared domain and range of the property p allow the types of the related individuals to be inferred: Individual(ei type(ci)) and Individual(ej type(cj)).
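One reading of such statements is as a typing rule: the declared domain and range of a property yield type assertions for the subject and object of any statement using it. A minimal sketch, with illustrative property and class names:

```python
# Sketch of domain/range typing: a property assertion lets us infer
# the classes of the individuals involved (names are hypothetical).
domain = {"teaches": "Professor"}
range_ = {"teaches": "Course"}

def infer_types(statement):
    """Given (subject, property, object), derive type assertions."""
    s, p, o = statement
    inferred = []
    if p in domain:
        inferred.append((s, "type", domain[p]))
    if p in range_:
        inferred.append((o, "type", range_[p]))
    return inferred

print(infer_types(("maher", "teaches", "CS101")))
# [('maher', 'type', 'Professor'), ('CS101', 'type', 'Course')]
```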
OWL Lite
One of the significant limitations of RDF Schema
is the inability to make equality claims between
individuals. Such equality claims are possible in
OWL Lite:
SameIndividual(ei ej)
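Equality statements partition individuals into equivalence classes, which a reasoner can maintain with a union-find structure. A minimal sketch, with invented individual names:

```python
# Sketch: SameIndividual assertions merge names into one equivalence
# class; find() returns a canonical representative for each name.
parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        x = parent[x]
    return x

def same_individual(a, b):
    """Assert that names a and b denote the same individual."""
    parent[find(a)] = find(b)

same_individual("MrSmith", "JohnSmith")
same_individual("JohnSmith", "John")
print(find("MrSmith") == find("John"))  # True: all three are one individual
```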
ObjectProperty(pi inverseOf(pj))
When a property has minCardinality and maxCardinality constraints with the same value, these can be summarised by a single exact cardinality constraint.
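Checking cardinality constraints amounts to counting a property's values; equal lower and upper bounds behave exactly like one exact-cardinality constraint. A sketch with illustrative data:

```python
# Sketch: cardinality checking as counting property values.
def satisfies(values, min_card=0, max_card=None):
    """True if the number of values lies within [min_card, max_card]."""
    n = len(values)
    return n >= min_card and (max_card is None or n <= max_card)

# minCardinality 2 plus maxCardinality 2 acts as exact cardinality 2:
print(satisfies(["mother", "father"], min_card=2, max_card=2))  # True
print(satisfies(["mother"], min_card=2, max_card=2))            # False
```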
OWL DL
With the step from OWL Lite to OWL DL, we
obtain a number of additional language constructs. It is often useful to say that two classes
are disjoint (which is much stronger than saying
they are merely not equal):
DisjointClasses(ci cj)
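A reasoner can use disjointness axioms to flag inconsistencies: no individual may be a member of two disjoint classes. A minimal sketch, with invented class and individual names:

```python
# Sketch: detecting inconsistencies caused by DisjointClasses axioms.
disjoint = {("Faculty", "GeneralStaff")}

def conflicts(memberships):
    """memberships: individual -> set of classes. Return violations."""
    found = []
    for ind, classes in memberships.items():
        for a, b in disjoint:
            if a in classes and b in classes:
                found.append((ind, a, b))
    return found

print(conflicts({"pat": {"Faculty", "GeneralStaff"}, "sam": {"Faculty"}}))
# [('pat', 'Faculty', 'GeneralStaff')]
```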
OWL Full
OWL Lite and OWL DL are based on a strict segmentation of the vocabulary: no term can be both an instance and a class, or a class and a property, and so forth. A somewhat less strict proposal is RDFS(FA) (Pan & Horrocks, 2003), which does allow a class to be an instance of another class, as long as this is done in a stratified fashion. Full RDFS is much more liberal still: a class c1 can have both a type and a subClassOf relation to a class c2, and a class can even be an instance of itself. In fact, the class Class is a member of itself. OWL Full inherits this liberal approach from RDFS.
Schreiber (2002) argues that this is exactly what is needed in many cases of practical ontology integration. When integrating two ontologies, opposite commitments have often been made in the two ontologies on whether something is modeled as a class or an instance. This is less unlikely than it may sound: is 747 an instance of the class of all airplane types made by Boeing, or is 747 a subclass of the class of all airplanes made by Boeing? And are particular jet planes instances of this subclass? Both points of view are defensible. In OWL Full, it is possible to have equality statements between a class and an instance.
In fact, just as in RDF Schema, OWL Full even allows us to apply the constructions of the language to themselves. It is perfectly legal to (say) apply a max-cardinality constraint of 2 on the subclass relationship. Of course, building efficient reasoning tools that support this very liberal use of the language is problematic: OWL Full is undecidable.
Typical database applications assume that individuals with different names are indeed different individuals (the unique name assumption). OWL follows the usual logical paradigm where this is not the case. This design decision is plausible on the WWW. Future extensions may allow one to indicate portions of the ontology for which the assumption does or does not hold.
Defaults
Many practical knowledge representation systems
allow inherited values to be overridden by more
specific classes in the hierarchy. Thus they treat
inherited values as defaults. No consensus has
been reached on the right formalization for the
nonmonotonic behaviour of default values.
Procedural Attachments
A common concept in knowledge representation
is to define the meaning of a term by attaching
a piece of code to be executed for computing the
meaning of the term. Although widely used, this
concept does not lend itself very well to integration in a system with a formal semantics, and it
has not been included in OWL.
Rule-Based Languages
A current trend is to study rule-based languages
for the Semantic Web. These languages have been
thoroughly studied in the knowledge representation community, are generally well understood
and can be implemented efficiently. Moreover,
rules are far better known and used in mainstream
IT, compared to other logical systems; thus they
may be of interest to a broader range of persons,
and may be integrated more easily with other
ICT systems.
One of the simplest kinds of rule systems is Horn logic. It is interesting to note that description logics and Horn logic are orthogonal, in the sense that neither of them is a subset of the other. For example, it is impossible in description logic to assert that persons who study and live in the same city are home students, whereas this is easily expressed as a Horn rule.
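A hedged sketch of such a rule as naive forward chaining in Python; the rule homeStudent(X) :- studiesIn(X, C), livesIn(X, C) joins two properties on a shared value, which is exactly the kind of statement description logic cannot express (the facts are invented):

```python
# Sketch: applying the Horn rule
#   homeStudent(X) :- studiesIn(X, C), livesIn(X, C)
# over a small set of ground facts.
facts = {("studiesIn", "anna", "athens"),
         ("livesIn", "anna", "athens"),
         ("studiesIn", "ben", "berlin"),
         ("livesIn", "ben", "bonn")}

def home_students(facts):
    """People whose study city equals their home city."""
    studies = {(x, c) for (p, x, c) in facts if p == "studiesIn"}
    lives = {(x, c) for (p, x, c) in facts if p == "livesIn"}
    return sorted(x for (x, c) in studies if (x, c) in lives)

print(home_students(facts))  # ['anna']
```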
Conclusion
The aim of this chapter was to give a general introduction to some of the ontology languages that play a prominent role on the Semantic Web. In particular, it explained the role of ontologies on the Web and in ICT, reviewed the current standards of RDFS and OWL, and discussed open issues for further developments. Important open research questions include:

Finding the right balance between expressive power and efficiency, especially when combining description logic based languages with rule-based languages.
Adding nonmonotonic features, such as the closed world assumption, default knowledge, and inconsistency tolerance, to ontology languages.
References
Antoniou, G. & van Harmelen, F. (2003). Web
ontology language: OWL. In S. Staab & R. Studer
(Eds), Handbook on ontologies in information
systems. Springer.
Antoniou, G., & van Harmelen, F. (2004). A Semantic Web primer. MIT Press.
Bayardo, R.J., et al. (1996). InfoSleuth: Agent-based semantic integration of information in open and dynamic environments (MCC Tech. Rep. No. MCC-INSL-088-96).
Chawathe, S., Garcia-Molina, H., Hammer, J., Ireland, K., Papakonstantinou, Y., Ullman, J., & Widom, J. (1994). The TSIMMIS project: Integration of heterogeneous information sources. In Proceedings of the IPSJ Conference (pp. 7-18).
Eiter, T., Lukasiewicz, T., Schindlauer, R., & Tompits, H. (2004). Combining answer set programming with description logics for the Semantic Web.
Kalfoglou, Y. & Schorlemmer, M. (2003). Ontology mapping: The state of the art. The Knowledge
Engineering Review, 18(1), 1-31.
Fensel, D., Horrocks, I., van Harmelen, F., McGuinness, D.L., & Patel-Schneider, P.F. (2001).
OIL: An ontology infrastructure for the Semantic
Web. IEEE Intelligent Systems, 16(2), 38-44.
Sowa, J. (2000). Knowledge representation: Logical, philosophical, and computational foundations. Brooks/Cole.
Uschold, M. & Gruninger, M. (1996). Ontologies:
Principles, methods and applications. The Knowledge Engineering Review, 11(2), 93-136.
Wiederhold, G. (1992). Mediators in the architecture of future information systems. IEEE
Computer, 25(3), 38-49.
Chapter VI
Abstract
We describe reasoning as the process needed for using logic. Performing this process efficiently is a prerequisite for using logic to present information in a declarative way and to construct models of reality. In particular, we describe description logic and the OWL ontology language and explain that in this case reasoning amounts to graph completion operations that can be performed by a computer program. We give an extended example, modeling a building with wireless routers, and explain how such a model can help in determining the location of resources. We emphasize how different assumptions about the way routers and buildings work are formalized and made explicit in our logical modeling, and explain the sharp distinction between knowing some facts and knowing all facts (open vs. closed world assumption). This should be helpful when using ontologies in applications that must cope with incomplete real-world knowledge.
What Do We Mean by Reasoning, and What Is It Good For?
Any reader of this text is equipped with the most
capable reasoner found on this planet: the human
brain. Thus, it is not surprising that we have
come to take reasoning for granted. That sound
reasoning follows a restricted set of formal rules
is a relatively recent invention. Long after people
learned to grow food, rule empires and measure
Logical Implications in Description Logic
The reasoning we do here deals with organizing classes, individuals, and properties. The associated logic is called description logic (DL) (Calvanese, McGuinness, Nardi, & Patel-Schneider, 2003), which has become popular as the basis for the Web Ontology Language (OWL) (Dean, Schreiber, Bechhofer, van Harmelen, Hendler, Horrocks, et al., 2004), which for our purposes can be considered an extension of RDF.1 Difficult papers have been written about the subject, but since we want to draw practical conclusions, we give as mundane a description as possible. A similarly practical approach is taken in the Protégé OWL manual (Horridge, Knublauch, Rector, Stevens, & Wroe, 2004). A list of further reading material was collected by Franconi (2002).2
(Figure: an example graph relating the individuals John, James, Amsterdam, London, Germany, the Netherlands, and TopTechCompany through classes such as Person and City and properties such as worksFor.)
to assume incomplete knowledge and cannot assume full control. It also fits with the idea that different sources contain different pieces of a more complete description. However, in many applications the situation is the opposite. For example, if we have a database, then the database records are all the individuals in the database, and the columns define all (the values of) its properties. Moreover, the result of a query is supposed to be an authoritative answer, in that the records returned do satisfy the query and the others do not. These different points of view are complementary. A database often contains incomplete information, and the answer to a query is merely authoritative for whatever people have put in the database. Conversely, we can state that a class consists of exactly a given set of instances and no others. This merely points out that we have to be careful about the distinction; we will give an example that illustrates the difference later in this chapter.
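The contrast can be made explicit in a few lines of Python: under the closed-world assumption, absence from the data means false, while under the open-world assumption it merely means unknown (the facts are illustrative):

```python
# Sketch: the same query under closed- vs. open-world reading.
known = {("teaches", "maher", "CS101")}

def cwa_holds(fact):
    """Closed world: absence from the data means false."""
    return fact in known

def owa_holds(fact):
    """Open world: absence only means 'unknown'."""
    return True if fact in known else None  # None stands for unknown

q = ("teaches", "maher", "CS102")
print(cwa_holds(q))  # False under the closed world
print(owa_holds(q))  # None: unknown, not false, under the open world
```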
More on Classes
Londoner ≡ ∃livesIn.London

John ∈ Londoner ⇔ John livesIn London

Parent ≡ ≥1 hasChild.Human

Note that the existential restriction is the special case of the cardinality restriction "at least one" (≥1).

John ∈ ParentOfGirl ⇔ ∃y (y ∈ Girl) ∧ (John hasChild y)
(Figure: the individual Peter with several hasChild edges, illustrating membership in ParentOfGirl.)
LowBuilding ≡ ≤3 hasFloor.Floor

Exact cardinality restriction (=):

Human ⊑ =2 biologicalParents.Human

Negating the existential condition we are inevitably led to the universal restriction (coming from the universal quantifier ∀, which reads as "for all"). The membership criterion for the universal restriction is that all (possibly zero!) values of a property P are members of a class C. For example,

Sober ≡ ∀drinks.NonAlcoholicDrink

John ∈ Sober ⇔ ∀y (John drinks y ⇒ y ∈ NonAlcoholicDrink)

Vegetarian ≡ ∀eats.VegetarianFood

Note that this does not imply ∃eats.VegetarianFood: an individual that eats nothing at all also satisfies the universal restriction.
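The membership criteria for the two restrictions can be checked directly against an A-box, as in this sketch (the individuals and their drinks are invented):

```python
# Sketch: membership tests for universal and existential restrictions
# over a toy A-box of property values.
drinks = {"john": ["water", "juice"], "pete": ["water", "wine"]}
non_alcoholic = {"water", "juice"}

def in_universal(x, prop, cls):
    """x is in (forall prop.cls) if every prop-value of x is in cls."""
    return all(v in cls for v in prop.get(x, []))

def in_existential(x, prop, cls):
    """x is in (exists prop.cls) if some prop-value of x is in cls."""
    return any(v in cls for v in prop.get(x, []))

print(in_universal("john", drinks, non_alcoholic))   # True: Sober
print(in_universal("pete", drinks, non_alcoholic))   # False: drinks wine
print(in_universal("ghost", drinks, non_alcoholic))  # True: drinks nothing
```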
More on Properties
The only way we will define properties is by introducing them axiomatically. Like classes, properties can be more or less specific, which leads to property hierarchies. A property R is a subproperty of a property P, denoted R ⊑ P, if R implies P. It follows that each asserted triple x R y implies a triple x P y. For example, if John has a daughter, then in particular John has a child (Figure 5).

(hasDaughter ⊑ hasChild) ∧ (John hasDaughter Mary) ⇒ John hasChild Mary
There are other useful ways in which one asserted triple implies other triples, if the properties we consider are special. If R is the inverse of a property P, then it is the same predicate but with the order of subject and object reversed. For example (see Figure 6), if isDaughterOf is the inverse of hasDaughter, then

John hasDaughter Mary ⇒ Mary isDaughterOf John
(Figure labels: John hasChild/hasDaughter Mary; Mary isDaughterOf John; Germany bordersWith Netherlands; sameAs and hasBirthMother links among Mary, Peggy, John, and Mum.)
A property P is transitive if (x P y) ∧ (y P z) ⇒ x P z. For example, if hasAncestor is transitive, then

(Mary hasAncestor John) ∧ (John hasAncestor Peggy) ⇒ Mary hasAncestor Peggy

A property P is reflexive on a domain D if ∀x ∈ D: x P x. For example, ≤ defined on the integers is reflexive.
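Transitivity is another rule that generates new triples from asserted ones; a reasoner can compute the closure by repeatedly applying the implication until nothing changes. A sketch using the hasAncestor example:

```python
# Sketch: closing a transitive property (hasAncestor) under
# (x P y) and (y P z) implies (x P z).
edges = {("mary", "john"), ("john", "peggy")}

def transitive_closure(pairs):
    """Repeatedly apply the transitivity rule until a fixpoint."""
    closed = set(pairs)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closed):
            for (y2, z) in list(closed):
                if y == y2 and (x, z) not in closed:
                    closed.add((x, z))
                    changed = True
    return closed

print(("mary", "peggy") in transitive_closure(edges))  # True
```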
(Figure labels: John sameAs Mr.Smith, with hasEmail John.Smith@toptech.com; hasAncestor links from Mary to John and from John to Peggy.)
Exercises

1. Boy ⊑ Man
(b)
(c)
Show that Man and Woman are disjoint classes if and only if
Human ⊑ IntelligentBeing
(b)
4. ¬(∃p.C) ≡ ∀p.(¬C)
¬(∀p.C) ≡ ∃p.(¬C)
5.
6.
Reasoning as a Process
To find the implications of a set of given logical statements we have to perform a process using an algorithm on a computer. We are thus immediately facing the core computer science problem of finding algorithms with low complexity in space and runtime. It is known that first-order logic is in general undecidable. Description logics have better properties: they are decidable, with known complexity bounds (Horrocks & Patel-Schneider, 1999), and have some working reasoning implementations (see the section on reasoners). This seems to be one of the main reasons why description logic was chosen as a basis for the Semantic Web (Horrocks, Patel-Schneider, & Harmelen, 2003), although this choice is certainly not uncontroversial, even among its designers (Harmelen, 2002). One should not forget that while a provable complexity class is a good thing, it comes at the cost of expressivity and conceptual complexity.
Others have argued that description logic (and even first-order logic) is insufficiently expressive for people to model their business logic, so that they will end up writing transformation rules in a language like the Semantic Web Rule Language (Horrocks, Patel-Schneider, Boley, Tabet, Grosof, & Dean, 2004). Since such rules have few decidability properties, this defeats some of the purpose of basing the Semantic Web on logic in the first place. Also note that very few computer programs in use are even provably terminating. There is something to be said for having a decidable declarative data model, even if the information model built on top of that data model is undecidable. Rules of the type
clauses or prove that such a model does not exist. As we have seen above, if we can compute a
model we can check satisfiability, and therefore
subsumption. An overview of techniques is given
in Baader and Sattler (2001). Here we explain only the
simplest version, where we only have individuals,
classes, membership (x ∈ C), property assertions
(x p y), negation (¬), conjunction (⊓), disjunction
(⊔), existential restriction (∃p.C), and universal
restriction (∀p.C). Using exercise 4 above (de
Morgan's laws) we can and will assume that negation only occurs in front of
named classes.
Let A0 be the original set of membership and
property assertions (called the A-box). We can
consider the A-box quite concretely as an oriented
graph with two types of nodes, individuals and
classes, where the classes include those defined as restrictions, intersections, unions, or negations. The graph
has oriented edges labeled with properties if they
link individuals, or with the symbol ∈ if they link an
individual to a class.
Starting from A0, the algorithm now recursively defines a tree of A-boxes with at most
binary branches, such that each branch has the
same models in set theory or no model at all,
and at each step the complexity in terms of the
maximal number of conjunctions, disjunctions,
and restrictions decreases to zero as we go down
the tree.
To be precise, the algorithm applies the following rules in any order:

1. If An contains x ∈ C ⊓ D, then let An+1 := An with x ∈ C and x ∈ D added, provided they do not both exist already.
2. If An contains x ∈ C ⊔ D, then branch into two A-boxes, An+1 := An with x ∈ C added and A'n+1 := An with x ∈ D added, provided neither membership assertion exists already.
3. If An contains x ∈ ∃p.C, then let An+1 := An with x p y and y ∈ C added for a fresh individual y, provided the triple and the membership assertion do not exist already.
4. If An contains x ∈ ∀p.C and x p y for some node y, then let An+1 := An with y ∈ C added, provided the membership assertion does not exist already.
The algorithm as presented may need exponential space and time. However, refinements can
be made to prune the tree and terminate the algorithm sooner, making the algorithm polynomial
in space and time (Baader & Sattler, 2001; Horrocks
& Patel-Schneider, 1999).
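A toy version of this tableau procedure can be sketched in Python. The tuple encoding of concepts and all names are our own, and real reasoners add blocking and many optimizations; concepts are assumed to be in negation normal form, as the text permits.

```python
import itertools

def satisfiable(concept):
    """Toy tableau check: try to build a model for one concept."""
    counter = itertools.count(1)  # supplies fresh individuals

    def expand(memberships, triples):
        # clash: some x belongs to both a named class and its negation
        for (x, c) in memberships:
            if c[0] == "not" and (x, c[1]) in memberships:
                return False
        for (x, c) in memberships:
            if c[0] == "and":  # rule 1: add both conjuncts
                add = {(x, c[1]), (x, c[2])} - memberships
                if add:
                    return expand(memberships | add, triples)
            elif c[0] == "or":  # rule 2: binary branch on the disjuncts
                if (x, c[1]) not in memberships and (x, c[2]) not in memberships:
                    return (expand(memberships | {(x, c[1])}, triples)
                            or expand(memberships | {(x, c[2])}, triples))
            elif c[0] == "some":  # rule 3: create a fresh witness if needed
                _, p, d = c
                if not any(p2 == p and (y, d) in memberships
                           for (x2, p2, y) in triples if x2 == x):
                    y = next(counter)
                    return expand(memberships | {(y, d)},
                                  triples | {(x, p, y)})
            elif c[0] == "all":  # rule 4: propagate to all p-successors
                _, p, d = c
                add = {(y, d) for (x2, p2, y) in triples
                       if x2 == x and p2 == p} - memberships
                if add:
                    return expand(memberships | add, triples)
        return True  # no rule applies and no clash: a model was found

    return expand(frozenset({("x0", concept)}), frozenset())

A = ("atom", "A")
notA = ("not", A)
assert satisfiable(("and", A, ("some", "p", A)))
assert not satisfiable(("and", A, notA))
assert not satisfiable(("and", ("some", "p", A), ("all", "p", notA)))
```

The last assertion corresponds to ∃p.A ⊓ ∀p.¬A: rule 3 creates a p-successor in A, rule 4 forces it into ¬A, and the clash check closes the only branch.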
Reasoners

Unfortunately, polynomial time does not mean
it is easy to make a fast algorithm (Haarslev &
Möller, 2004; Haarslev, Möller, & Wessel, 2005;
Horrocks, 2003). Practical reasoners implement
variations and optimizations of the above tableau algorithm. An incomplete list of reasoners
includes FaCT/FaCT++,5 Pellet,6 the SWI-Prolog OWL
library,7 and the Racer engine.8 Most support DIG9
(Bechhofer, 2003), an interface for communication
between the reasoner and an application. A more
complete list is maintained by Sattler.10 In addition
to dedicated reasoners, triple databases such as
Sesame11 support some form of reasoning, especially to satisfy queries that take an OWL (Lite)
ontology into account (Kiryakov, Ognyanov, &
Manov, 2005).
Scenario

We now consider a simple scenario. Suppose we
want to create an application that supports users in discovering resources in a building (e.g.,
printers, projectors, or coffee machines) and their
own location, for example, to direct them to the
closest printer.
What we need is a way to represent physical resources and locations in buildings, and
to express where these resources are
located. We also need to express that a resource
is reachable from another location in a building.
For this we set up several ontologies with the
proper vocabulary that can be used generically,
and a knowledge base for a particular building.
For example, we could have:
space:sameSpaceAs a owl:SymmetricProperty,
    owl:TransitiveProperty,
    owlx:ReflexiveProperty;
  rdfs:subPropertyOf owl:sameAs;
  rdfs:domain space:Space;
  rdfs:range space:Space.
Exercise

7. Following (Mindswap, 2003,15 example 9),
show that

space:sameSpaceAs a owlx:ReflexiveProperty.
space:contains a owlx:ReflexiveProperty.
Accessibility

For our purposes it suffices to say that
a corridor gives access to rooms and to a flight of
stairs; we only express accessibility of rooms
and other indoor spaces. Although not quite true,
we will assume for simplicity that accessibility
is symmetric: if room A is accessible from corridor B, then corridor B is accessible from room
A (lock yourself out and see why this is only an
approximation). Moreover, we distinguish two
kinds of accessibility: direct physical accessibility
where, for example, a room is on a corridor, and
logical accessibility where you have to go from
a room to a directly accessible corridor, from
the corridor to the directly accessible stairs, and
so forth, until you finally have physical access to another room. Direct accessibility implies
logical accessibility, and logical accessibility is
transitive: if we can get from A to B and from B
to C, we can get from A to C. Note that unlike
logical accessibility, direct accessibility is readily observable by somebody walking through
the building.
building:accessTo a owl:TransitiveProperty,
    owl:SymmetricProperty;
  rdfs:domain building:Indoor;
  rdfs:range building:Indoor.

building:directAccessTo a owl:SymmetricProperty;
  rdfs:subPropertyOf building:accessTo;
  rdfs:domain building:Indoor;
  rdfs:range building:Indoor.
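How the directAccessTo assertions, the subproperty axiom, and the symmetry and transitivity of accessTo combine to yield logical accessibility can be sketched as a closure computation in Python (the room and corridor layout below is hypothetical):

```python
def access_closure(direct_edges):
    """All accessTo pairs implied by a set of directAccessTo assertions."""
    access = set(direct_edges)               # subPropertyOf: direct implies logical
    access |= {(b, a) for (a, b) in access}  # owl:SymmetricProperty
    changed = True
    while changed:                           # owl:TransitiveProperty, to a fixpoint
        changed = False
        for (a, b) in list(access):
            for (c, d) in list(access):
                if b == c and (a, d) not in access:
                    access.add((a, d))
                    changed = True
    return access

# hypothetical direct-access edges observed while walking the building
direct = {("room1", "corridor1"), ("corridor1", "stairs"),
          ("stairs", "corridor2"), ("corridor2", "room4")}
access = access_closure(direct)
assert ("room1", "room4") in access  # reachable via corridors and stairs
assert ("room4", "room1") in access  # symmetry is preserved by the closure
```

Only the direct edges need to be asserted; the reasoner (here, the fixpoint loop) supplies the rest.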
resource:Printer a owl:Class;
  rdfs:subClassOf space:PhysicalEntity.

resource:WirelessAccessPoint a owl:Class;
  rdfs:subClassOf space:PhysicalEntity.

resource:AccessRange rdfs:subClassOf space:Space.

resource:hasAccessRange a owl:ObjectProperty,
    owl:FunctionalProperty,
    owl:InverseFunctionalProperty;
  owl:inverseOf resource:isAccessRangeOf;
  rdfs:domain resource:WirelessAccessPoint;
  rdfs:range resource:AccessRange.
(Figure: floor plan of the building, showing rooms, corridors, and stairs, the printer in its room, and the wireless access points with their access ranges.)
our:prn a resource:Printer;
  space:hasExtension our:prnExtension.

our:prnExtension a space:Space;
  space:isContainedIn our:room.

our:wlan1 a resource:WirelessAccessPoint;
  resource:hasAccessRange our:accessRange1.

our:wlan2 a resource:WirelessAccessPoint;
  resource:hasAccessRange our:accessRange2.

# We will make this more precise later !
our:accessRange1 a space:Space;
  space:contains our:corridor, our:room, our:room, our:room.

our:accessRange2 a space:Space;
  space:contains our:corridor, our:room, our:room.

our:corridor a our:Corridor.
our:room a our:Room.
Consistency

Because

Indoor ⊑ ∃isContainedIn.Building
Room ⊑ Indoor

every room that we define must be contained in a
building:Building. Indeed, for all the defined rooms,
corridors, and stairs in the knowledge base there
is a triple asserting that they are contained in
our:building, and our:building is a building:Building.
Accessibility

The properties we have defined allow us to reason about the accessibility of spaces inside the
building. Remember that we have stated accessibility from a physical point of view, in fact by
stating which rooms are accessible from which
corridors. On the other hand, consider an application that informs a user of the status of a printer.
We might then want to show the user only the printers that are logically accessible from the user's
terminal. The data we have available allow
us to determine the logical accessibility from a
given room to, say, our:room. More precisely, we
can find the rooms that are known to contain a
printer and that are known to be accessible from
our:room. If there happens to be another printer
but the system administrator did not make this
knowledge available, then there is nothing we can
do about it. The above is a retrieval task that we
can formulate as follows: find all known instances
?prn, ?prnRoom such that
1. ?prn a resource:Printer.
2. ?prn space:hasExtension ?prnExtension.
3. ?prnRoom a our:Room.
4. ?prnRoom space:contains ?prnExtension.
5. ?prnRoom building:accessTo our:room.
Note that this example is a bit degenerate because the first two entries are always the same.
In general, in listing possibilities we have to take
combinations into account because only certain
combinations may be possible.
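Under a closed-world reading, this retrieval task is plain pattern matching over the known (asserted plus derived) triples. A sketch in Python, over a hypothetical fragment of the knowledge base with the namespace prefixes dropped:

```python
def match(patterns, triples, binding=None):
    """Yield every binding of ?-variables that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    head, rest = patterns[0], patterns[1:]
    for triple in triples:
        b = dict(binding)
        ok = True
        for term, value in zip(head, triple):
            if term.startswith("?"):
                # bind the variable, or check it against its prior value
                if b.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, b)

# hypothetical known triples (asserted plus reasoner-derived)
triples = {
    ("prn", "a", "Printer"),
    ("prn", "hasExtension", "prnExtension"),
    ("room3", "a", "Room"),
    ("room3", "contains", "prnExtension"),
    ("room3", "accessTo", "room1"),
}
query = [("?prn", "a", "Printer"),
         ("?prn", "hasExtension", "?prnExtension"),
         ("?prnRoom", "a", "Room"),
         ("?prnRoom", "contains", "?prnExtension"),
         ("?prnRoom", "accessTo", "room1")]
answers = list(match(query, triples))
assert answers == [{"?prn": "prn", "?prnExtension": "prnExtension",
                    "?prnRoom": "room3"}]
```

Each query line corresponds to one pattern; shared variables such as ?prnExtension are what force the combinations to be consistent.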
Line 4: Here we demand the existence of an implicit or explicit statement with the property
space:contains, with a possibility for ?prnRoom as subject
and a possibility for ?prnExtension as object. Like
in line 2, we also have to take into account subproperties of space:contains.
The only such subproperty is space:sameSpaceAs. The only triples with sameSpaceAs are
implied triples from a space to itself, forced by
reflexivity, but there is no statement making
a possibility for ?prnRoom space:sameSpaceAs
:FullyInAccessRange1 a owl:Restriction;
  owl:onProperty space:isContainedIn;
  owl:hasValue our:accessRange1.

:FullyInAccessRange2 a owl:Restriction;
  owl:onProperty space:isContainedIn;
  owl:hasValue our:accessRange2.

:ContainsPossibleHere a owl:Restriction;
  owl:onProperty space:contains;
  owl:someValuesFrom :PossibleHere.
(?pub ?x)
(our:stairs our:stairs)
(our:room our:room)
(our:room our:room)
(our:room our:room)
(our:room our:prnExtension)
(our:room our:room)
(our:corridor our:corridor)
(our:corridor our:corridor)
(our:stairs our:stairs)
(our:room our:room)
(our:room our:room)
(our:room our:room)
(our:room our:prnExtension)
(our:corridor our:corridor)
?pub a :PossiblePublicSpace.
(?pub ?x)
(our:room our:room)
(our:room our:prnExtension)
1. ?pub a our:PublicSpace.
2. ?pub space:contains ?x.
3. ?x a space:Space.
4. ?x space:isContainedIn our:accessRange1.
5. ?x space:isContainedIn our:accessRange2.
Let us do the formal argument. From the definition of PossiblePublicSpace we see that:

PossiblePublicSpace ⊑ (PublicSpace ⊓ ∃contains.∃isContainedIn.{accessRange1}) ⊑ PublicSpaceSomePartInAccessRange1

with a similar statement for accessRange2. We
conclude that:

PossiblePublicSpace ⊑ PublicSpaceSomePartInAccessRange1 ⊓ PublicSpaceSomePartInAccessRange2

We now just enumerate both classes on the
right-hand side and take the intersection. We
conclude that:

PossiblePublicSpace ⊆ {room3, corridor1, corridor2}
On the other hand, each of our:room3, our:corridor1, and our:corridor2 can be classified as a
PossiblePublicSpace.
What we learn from this example is the usefulness of stating a formal definition of a class in
terms of the conditions that members must satisfy,
independent of whether we know that such individuals
exist; but it is also useful to have precise information available about the full list of possibilities,
that is, about what is true and what is not true.
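The difference between the two stances can be made concrete with a small sketch (all names illustrative):

```python
# One asserted triple; nothing is said about any other room.
known = {("room1", "contains", "printer1")}

def cwa_holds(triple, facts):
    """Closed-world assumption: anything not asserted (or derivable) is false."""
    return triple in facts

def owa_status(triple, facts, known_complete=False):
    """Open-world assumption: an unasserted triple is merely unknown,
    unless the knowledge base is explicitly complete for this property."""
    if triple in facts:
        return "true"
    return "false" if known_complete else "unknown"

q = ("room2", "contains", "printer1")
assert cwa_holds(q, known) is False       # closed world: room2 has no printer
assert owa_status(q, known) == "unknown"  # open world: we simply do not know
```

Stating the full list of possibilities, as done above for PossiblePublicSpace, plays the role of the known_complete flag: it licenses conclusions that the open-world reading would otherwise withhold.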
Exercise

8. Show that, given the tightened knowledge base,
if a device is in a PublicSpace and in accessRange1 only, we can conclude that the extension
of the device is contained in the stairs, room1,
or room2.
Conclusion

Description logic is a knowledge representation
formalism with well-understood reasoning algorithms and existing implementations.
Expressive description logics have been used as
the basis for the OWL Web Ontology Language,
which has been accepted as a W3C recommendation.
However, the last word on the viability of description logic as a practical tool has not yet been said.
In particular, rule languages such as SWRL
(Semantic Web Rule Language) have been defined
that solve some reasonably obvious reasoning problems at the cost of decidability.
A good reason for using a logic language to
represent an information model is to make various assumptions explicit
by building them into a formal declarative model rather than
leaving them as part of a procedural computer program that uses
the data. This sometimes requires some thought, but
often we merely want to define a model with dots
and arrows that is close to what things are, rather
than to how one would represent them in a computer.
It is then useful to be able to define classes and
properties from other classes and properties, and
more generally to define the relationships between
them. However, this requires an algorithm to
compute the implications that are implicit in the
logical model. Practical reasoners to determine
these implications vary from extensions to (triple)
databases that efficiently deal with a limited set
of reasoning tasks, to reasoners that can deal
with more complex inferences at the cost of efficiency.
When using reasoners we encounter the important
distinction between the open and the closed world
assumption. This reflects the difference between knowing all things with a certain
property and knowing the existence of things with
a certain property. We gave an extensive example
that emphasized the importance of proper modeling next to proper reasoning, and showed how the
difference between the open and closed world
assumption can drastically change the results.
References

Baader, F., & Sattler, U. (2001). An overview of
tableau algorithms for description logics. Studia Logica, 69(1). Springer.
Retrieved October 18, 2006, from http://www.
cs.man.ac.uk/~franconi/dl/course/articles/baader-tableaux.ps.gz
Bechhofer, S. (2003). The DIG Description Logic
Interface: DIG/1.1. In Proceedings of DL2003
Workshop, Rome, Italy. Retrieved October 18,
2006, from http://dl-Web.man.ac.uk/dig/2003/02/
interface.pdf
Beckett, D. (2004). Turtle - Terse RDF triple
language. Retrieved October 18, 2006, from
http://www.dajobe.org/2004/01/turtle/
Berners-Lee, T. (2000). Primer: Getting into RDF
& Semantic Web using N3. Retrieved October
18, 2006, from http://www.w3.org/2000/10/swap/
Primer
Calvanese, D., McGuinness, D., Nardi, D., & Patel-Schneider, P. (2003). The description logic handbook: Theory, implementation and applications.
Cambridge University Press. Retrieved October
18, 2006, from http://www.cambridge.org/uk/
catalogue/catalogue.asp?isbn=0521781760
Dean, M., Schreiber, G. (Eds.), Bechhofer, S., van
Harmelen, F., Hendler, J., Horrocks, I., McGuinness, D.L., Patel-Schneider, P.F., & Stein, L.A.
(2004). OWL Web Ontology Language Reference, W3C Recommendation 10 February 2004.
Retrieved October 18, 2006, from http://www.
w3.org/TR/owl-ref/
Franconi, E. (2002). Description logics tutorial course description. Retrieved October 18,
2006, from http://www.inf.unibz.it/~franconi/
dl/course/
Haarslev, V., & Möller, R. (2004, June 2-4). Optimization techniques for retrieving resources
described in OWL/RDF documents: First results. In Proceedings of the Ninth International
Endnotes

1. http://www.w3.org/RDF/
2. http://www.inf.unibz.it/~franconi/dl/course/
3. http://protege.stanford.edu/
4. http://www.swi-prolog.org/packages/Triple20/
5. http://owl.man.ac.uk/factplusplus/
6. http://www.mindswap.org/2003/pellet/
7. http://www.swi-prolog.org/packages/semWeb.html, see also http://www.semanticWeb.gr/TheaOWLLib/
8. http://www.sts.tu-harburg.de/~r.f.moeller/racer/
9. http://dig.sourceforge.net/
10. http://www.cs.man.ac.uk/~sattler/reasoners.html
11. http://www.openrdf.org/
12. http://protege.stanford.edu/
13. http://www.swi-prolog.org/packages/Triple20/
14. http://owl1_1.cs.manchester.ac.uk/
15. http://www.mindswap.org/2003/pellet/demo.shtml
Chapter VII
Abstract
This chapter introduces the theory and design principles behind Web service technology. It explains
the models, specifications, and uses of this technology as a means to allow heterogeneous systems to
work together to achieve a task. Furthermore, the authors hope that this chapter will provide sufficient
background information, along with information about current areas of research in the area of Web services, so that readers will come away with an understanding of how this technology works and of ways that
it could be implemented and used.
Introduction
As the World Wide Web (WWW) exploded into
the lives of the public in the 1990s, people suddenly had vast amounts of information placed
at their fingertips. The system was developed to
Copyright 2007, IGI Global, distributing in print or electronic forms without written permission of IGI Group is prohibited.
solutions. EAI platforms were used for integrating incompatible and distributed systems such
as ERP (enterprise resource planning), CRM
(customer relationship management), SCM (supply chain management), databases, data sources,
data warehouses, and other important internal
systems across the corporate enterprise. While
useful, most EAI frameworks required costly
and proprietary protocols and formats, which
presented many technical difficulties when
organizations needed to integrate internal systems with external
systems running on partners' computers.
The limitations of EAI solutions made most
organizations realize that integrating internal
systems with the external systems of business supply
chain members was a key to staying competitive,
since the majority of business processes spanned
several organizations. Internal and external
systems needed to communicate over networks to
allow businesses to complete a transaction or part
of a transaction. To achieve this level of integration, business-to-business (B2B) solutions were
developed. B2B infrastructures were aimed at
helping organizations streamline their processes
so they could carry out business transactions more
efficiently with their business partners (such as
resellers and suppliers). To reach a higher level
of integration, most B2B solutions have relied on
the use of XML as the language to represent data.
XML allows one to model data at any level of
complexity, since it is extensible through the addition
of new tags. Data can be published in multiple
A Brief History of Distributed Computing

Once networking became widespread across
academia and industry, it became necessary to
share data and resources. In the early years of distributed computing, message passing (e.g., using
sockets, developed in the early 1980s)
was the prevailing method for communication.
This involved encoding the data into a message
format (i.e., how a structured piece of information
is encoded prior to transmission) and sending the
encoded data over the wire. The socket interface
allowed message passing using send and receive
primitives on the transmission control protocol (TCP)
or user datagram protocol (UDP) transport protocols for low-level messaging over Internet protocol
(IP) networks. Applications communicated by
sending and receiving text messages. In most
cases, the messages exchanged conformed to an
application-level protocol defined by the programmers. This worked well but was cumbersome in
that the data had to be coded and then
decoded. Using this approach, two programmers
developing a distributed application must have
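The kind of socket-level message passing described above can be sketched in Python; the record layout and field separator here are an invented application-level protocol of exactly the sort the two programmers would have had to agree on in advance:

```python
import socket

def encode(record):
    """Agreed application-level format: fields joined by '|'."""
    return "|".join(record).encode("utf-8")

def decode(payload):
    """Reverse the agreed format on the receiving side."""
    return payload.decode("utf-8").split("|")

# socketpair stands in for a TCP connection between two machines
sender, receiver = socket.socketpair()
sender.sendall(encode(["ORDER", "42", "widget"]))
sender.close()
data = receiver.recv(1024)
receiver.close()
assert decode(data) == ["ORDER", "42", "widget"]
```

Both endpoints must share encode and decode out of band; if either side changes the format, the other breaks silently, which is precisely the brittleness that later middleware and Web services set out to remove.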
Service-Oriented Architecture

As we have seen, distributed computing was introduced in the 1980s. This research led to the
development of distributed object architectures
through the 1990s. The distributed platforms
developed, such as Java RMI and DCOM, had
several restrictions. For example, RMI was limited
to Java, while DCOM was limited to Microsoft
platforms. Moreover, distributed applications
developed using different platforms were difficult
to integrate. Integration was, and still is, one of the
major concerns for chief information officers.
Figure 2 gives us a very good indication that application integration tops the priority list of high-ranking
business people.
To cope with the restrictions of more traditional
distributed object architectures, in the early
2000s the concept of service-oriented architecture (SOA) was introduced (or reintroduced, since
(Figure 2: survey bar chart of IT spending priorities by percentage of respondents; application integration tops the list, ahead of e-business, CRM, SCM/logistics, HR, database upgrades, intranet improvements, financial/accounting, and commerce server projects.)
Scalable: Past solutions were not designed with the scale of the Web in mind.
SOA should work in a variety of settings,
such as within an organization, between
business partners, and across the world.

Loosely coupled: SOA is an evolution from
tightly coupled systems to loosely coupled
ones. Senders and receivers in a SOA should
be independent of each other; the source can
send a message independently of the target.
Tight coupling is not suitable for SOA since
it leads to monolithic and brittle distributed
applications, where even trivial changes in one
component lead to catastrophic breaks in
function, and small changes in one application
require matching changes in partner applications (Channabasavaiah & Tuggle, 2003).

Interoperability: One party should be able
to communicate with another party regardless of the machines they are running on.

Discovery: One party should be able to
communicate with a second party selected
from a set of competent candidates. Services
need to be dynamically discoverable. This
is accomplished through services such as a
directory of service descriptions.

Abstraction: A SOA abstracts the underlying technology. Developers can concentrate
When comparing SOA with previous approaches, we can find the following major differences. Traditional middleware, such as distributed
object systems, is based on the client-server
paradigm, has a heavily asymmetric interaction
model, is biased towards synchronous protocols,
assigns public interfaces to network-accessible
objects, and supports name-oriented object
discovery. Service-oriented middleware, on the other hand,
is based on a peer-to-peer paradigm, has symmetric interaction models, mixes
synchronous and asynchronous protocols, assigns
public contracts to network-accessible objects,
and supports capability-based service discovery
(Cardoso, Curbera, & Sheth, 2004).
Figure 4. Web Services and list standards (Cardoso, Curbera, & Sheth, 2004)
WS-Policy

WS-Policy is a specification of a framework for
defining the requirements and capabilities of a
service. In this sense, a policy is nothing more
than a set of assertions that express the capabilities
and requirements of a service. The WS-Policy specification
(http://www-128.ibm.com/developerworks/library/specification/ws-polfram/) defines
terms that can be used to organize a policy. Once
a provider has a policy defined in XML, he
must publish that information by referencing it
in the description of the service.
WS-PolicyAttachment

This defines the method for attaching a policy
to a WSDL file so that it can be published to the
UDDI and thus used in deciding on services.
There are several mechanisms defined for accomplishing this task. The simplest method is
to write the policy directly into the WSDL file.
A more complex, and more powerful, method is
to construct the policy as a stand-alone file that
is referenced in the WSDL file as a URI. These
references can exist at any element of the WSDL.
WS-Policy and WS-PolicyAttachment together
give us a hierarchy based on the element to which a
policy is attached, and directions for merging policies together to create an effective policy for an
element (WS-PolicyAttachment, 2005).
Both WS-Policy and WS-PolicyAttachment
have recently been submitted to the W3C for standardization.
WS-Security Framework

The WS-Security specification provides a framework and vocabulary for requesters and providers
to secure messaging, as well as to communicate
information regarding security and privacy.
There are other security-related specifications
worth mentioning. XML-Encryption specifies
the process of encrypting data and messages.
XML-Signature provides a mechanism for message integrity and authentication, and signer
authentication. XACML is an XML representation of the Role-Based Access Control (RBAC)
standard. XACML will likely play an important
function in Web services authorization. The Security
Assertion Markup Language, or SAML, is an
OASIS framework for conveying user authentication and attribute information through XML
assertions. There are many specifications and
standards for Web services security. We would
like to encourage you to investigate these on your
own as an exercise.
WS-SecurityPolicy

These are policies for Web services that describe the access
permissions as well as actions that a requester
or provider is required to perform. For example,
a policy may indicate that requesters must have
an active account with the service and that messages must be encrypted using a PKI scheme from a
trusted certificate authority. A requester may
also have a policy indicating which encryption
schemes it accepts.
WS-Trust

Before two parties exchange sensitive
information, they must establish a secure communication channel. This can be done by the exchange
of security credentials. However, one problem
remains: how can one party trust the credentials of
the other? The Web Services Trust Language (WS-Trust) was developed to deal with this problem. It
offers extensions to the WS-Security elements for
exchanging security tokens and establishing trust
relationships (WS-Trust, 2005).
WS-SecureConversation

The Web services protocol stack is designed to be
a series of building blocks, and WS-SecureConversation is one of those blocks. WS-Security provides
message-level authentication, but is vulnerable to
some types of attacks. WS-SecureConversation
uses SOAP extensions to define key exchange
and key derivation from a security context so that
a secure communication can be ensured (WS-SecureConversation, 2005).
WS-Authorization

Authorization for Web services still remains an
area of research at the time of this publication. The
difficulty of authorization is the inability to dynamically determine authorization for a requester
to whom a Web service has just been introduced.
Some authorization frameworks being suggested
include assertion-based, role-based, context-based,
and hybrid approaches.
Assertion-based authorization uses assertions
about the requester to decide on the level of authorization. In a role-based approach, requesters
are given user labels and these labels are associated with roles, which in turn have permissions
assigned to them. Context-based authorization
examines the context in which a requester is acting: for instance, proximity to the resource, acting on
behalf of a partnership, or even the time of day.
A hybrid approach is some combination
of two or more approaches.
WS-Privacy

Privacy is in the context of data and can be associated with the requester or the provider. The
requester may be concerned that the information
given to a provider will be propagated to other
entities. Such information could be a credit card
number, an address, or a phone number. A provider
may be concerned with the proliferation of information that it has sold to a requester.
In this case the provider does not want the requester to resell this information without proper
compensation.
Transaction Processing

The perceived success of composite applications in a service-oriented architecture depends
on the reliability of participants that are often
beyond corporate boundaries. In addition to the already frequent errors and glitches in application
code, distributed applications must cope with
external factors such as network connectivity,
unavailability of participants, and even mistakes
in service configuration. Web services transaction management enables participating services
to have a greater degree of confidence that the
actions among them will progress successfully,
and that in the worst case such transactions can
be cancelled or compensated as necessary.
WS-Transaction

To date, probably the most comprehensive effort to
define transaction context management resides in
the WS-Coordination (WS-C) (Microsoft, BEA, &
IBM, Web Services Coordination, 2005), WS-AtomicTransaction (WS-AT) (Microsoft, BEA, &
IBM, Web Services Atomic Transaction, 2005),
and WS-BusinessActivity (WS-BA) (Microsoft,
BEA, & IBM, Web Services Business Activity,
2005) specifications. WS-C defines a coordination context, which represents an instance of
coordinated effort, allowing participant services
to share a common view. WS-AT targets existing
Messaging

WS-ReliableMessaging

Communication over a public network such as
the Internet imposes physical limitations on the
reliability of exchanged messages. Even though
failures are inevitable and unpredictable, certain
techniques increase message reliability and traceability even in the worst cases.
At a minimum, senders are interested in determining whether the message has been received
by the partner, that it was received exactly once,
and that messages arrived in the correct order. Additionally, it may be
necessary to determine the validity of the received
message: Has the message been altered on its
way to the receiver? Does it conform to standard
formats? Does it agree with the business rules
expected by the receiver?
WS-Reliability and WS-ReliableMessaging
have rules that dictate how and when services
must respond to other services concerning the
receipt of a message and its validity.
WS-Eventing

Web services eventing (WS-Eventing) is a
specification that defines a list of operations that
should be in a Web service interface to allow for
asynchronous messaging. WS-Eventing is based
on WS-Notification, which was submitted to OASIS
for standardization.
WS-Notification
Web service notification (WS-Notification) is
a family of specifications that provide several
capabilities.
1.
2.
3.
Figure 9 illustrates an example of a Java service that has been annotated. Note that in the
example @WebService and @WebMethod
are the annotations. The compiler will recognize
these tags and create the WSDL document.
5.
6.
7.
8.
Publish Service: Publishing a service requires the use of UDDI registries. Setting up
a registry varies based on which registry is
chosen. For our example, we used the jUDDI
registry on a Tomcat server. The action of
publishing a service is similar to advertising
a business. After deployment and testing,
the service is open to the world and ready to
accept requests, but until it is published, it is
unlikely that anyone will know about your
service. Tools that simplify this process are
Radiant and Lumina (Li, 2005), both from
the METEOR-S tool suite.
Conclusion

The service-oriented architecture (SOA) is currently a hot topic. It is an evolution of the
distributed systems technology of the 1990s,
such as DCOM, CORBA, and Java RMI. This
type of architecture requires the existence of
main components and concepts such as services,
service descriptions, service security parameters
and constraints, advertising and discovery, and
service contracts in order to implement distributed
systems. In contrast to the event-driven architecture, in which the services are independent,
the SOA-based approach requires services to be
loosely coupled.
SOA is often associated with Web services,
and sometimes SOA is even confused with
Web services, but SOA does not specifically
mean Web services. Instead, Web services can
be seen as a specialized SOA implementation that
embodies the core aspects of a service-oriented
approach to architecture. Web service technology
has come a long way toward achieving the goal of
the SOA. With Web services, developers do not
need to know how a remote program works, only
the input that it requires, the output it provides,
and how to invoke it for execution. Web services
provide standards and specifications that create
an environment where services can be designed,
rEfErEncEs
Arjuna Technologies Limited (2005). Arjuna
transaction service suite. Retrieved October 18,
2006, from http://www.arjuna.com/products/arjunats/index.html
Axis Development Team (2006). Webservices
Axis. Retrieved October 18, 2006, from http://
ws.apache.org/axis/
Bellwood, T. (2002). UDDI Version 2.04 API
specification. Retrieved February 20, 2007, from
http://uddi.org/pubs/ProgrammersAPI-V2.04-Published-20020719.htm
Birrell, A.D. & Nelson, B.J. (1984). Implementing remote procedure calls. ACM Transactions
on Computer Systems, 2(1), 39-54.
Booth, D., Haas, H., McCabe, F., Newcomer, E.,
Champion, M., Ferris, C., & Orchard, D. (2004)
Web services architecture, W3C Working Group
Note. Retrieved October 18, 2006, from http://
www.w3.org/TR/ws-arch/
Brewer, D., LSDIS Lab, University of Georgia
(2005). Radiant. Retrieved October 18, 2006,
from http://lsdis.cs.uga.edu/projects/meteor-s/
downloads/index.php?page=1
Chapter VIII
Service-Oriented Processes:
An Introduction to BPEL
Chun Ouyang
Queensland University of Technology, Australia
Wil M.P. van der Aalst
Eindhoven University of Technology, The Netherlands and Queensland University of Technology,
Australia
Marlon Dumas
Queensland University of Technology, Australia
Arthur H.M. ter Hofstede
Queensland University of Technology, Australia
Marcello La Rosa
Queensland University of Technology, Australia
Abstract
The Business Process Execution Language for Web Services (BPEL) is an emerging standard for specifying the behaviour of Web services at different levels of detail using business process modeling constructs. It
represents a convergence between Web services and business process technology. This chapter introduces
the main concepts and constructs of BPEL and illustrates them by means of a comprehensive example.
In addition, the chapter reviews some perceived limitations of BPEL and discusses proposals to address
these limitations. The chapter also considers the possibility of applying formal methods and Semantic
Web technology to support the rigorous development of service-oriented processes using BPEL.
Copyright 2007, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
Introduction
Web services are a standardised technology for building and integrating distributed software systems. Web services are an incarnation of a software development paradigm known as service-oriented architectures (SOAs). Although there is no broad consensus around the definition of SOAs, it can be said that SOAs revolve around at least three major principles: (1) software systems are functionally decomposed into independently developed and maintained software entities (known as services); (2) services interact through the exchange of messages containing meta-data; and (3) the interactions in which services can or should engage are explicitly described in the form of interfaces.
At present, the first generation of Web service
technology has reached a certain level of maturity
and is experiencing increasing levels of adoption,
especially in the context of business applications.
This first generation relies on XML, SOAP and
a number of so-called WS-* specifications for
message exchange (Curbera, Duftler, Khalaf,
Nagy, Mukhi, & Weerawarana, 2002), and on
XML Schema and WSDL for interface description. In the meantime, a second generation of Web services, based on richer service descriptions, is gestating. Whereas in first-generation Web services, interface descriptions are usually equated
to sets of operations and message types, in the
second generation the description of behavioural
dependencies between service interactions (e.g.,
the order in which messages must be exchanged)
plays a central role.
The Business Process Execution Language for
Web Services (BEA Systems, Microsoft, IBM, &
SAP, 2003), known as BPEL4WS or BPEL for
short, is emerging as a standard for describing
the behaviour of Web services at different levels
of abstraction. BPEL is essentially a layer on top of WSDL and XML Schema, with WSDL and XML Schema defining the structural aspects of service interactions, and BPEL defining the behavioural aspects.
Why BPEL?
BPEL supports the specification of service-oriented processes, that is, processes in which each elementary step is either an internal action performed by a Web service or a communication action performed by a Web service (sending and/or receiving a message). Such processes can be executed to implement a new Web service as a concrete aggregation of existing services that together deliver its functionality (i.e., a composite Web service). For example, a service-oriented process may specify that when a Sales Web service receives a purchase order from the Procurement Web service of a customer, the Sales service engages in a number of interactions with the Procurement Web service as well as several other Web services.
Messaging: BPEL provides primitive constructs for message exchange (i.e., send,
receive, send/receive).
Concurrency: To deal with concurrency
between messages sent and received, BPEL
incorporates constructs such as block-structured parallel execution, race conditions,
and event-action rules.
XML typing: To deal with the XML-intensive nature of Web services, BPEL variables
have XML types described in WSDL and
XML Schema. In addition, expressions may
be written in XML-specific languages such
as XPath or XSLT.
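As an illustration of the XML typing feature, the following sketch declares a WSDL-typed variable and fills one of its parts using an XPath expression inside an assign activity; the variable and message names follow the orderMsg example used later in the Data Handling section:

```xml
<!-- Sketch only; orderMsg and its orderNumber part follow the
     message definition shown later in the Data Handling section -->
<variables>
  <variable name="order" messageType="orderMsg"/>
</variables>
...
<assign>
  <copy>
    <!-- an XPath 1.0 expression computing an integer value -->
    <from expression="41 + 1"/>
    <to variable="order" part="orderNumber"/>
  </copy>
</assign>
```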
standardization within OASIS. The corresponding technical committee (OASIS Web Services Business Process Execution Language TC, 2006) has been working since the time of submission and is in the process of finalizing the standard specification, namely Web Services Business Process Execution Language (WS-BPEL) version 2.0 (OASIS, 2005).
Currently BPEL is implemented in a variety of tools (see http://en.wikipedia.org/wiki/BPEL for a compendium). Systems such as BEA WebLogic,
IBM WebSphere, Microsoft BizTalk, SAP XI and
Oracle BPEL Process Manager support BPEL
to various degrees, thus illustrating the practical relevance of this language. Also, there is a
relatively complete open-source implementation
of BPEL, namely ActiveBPEL.
Overview of BPEL
BPEL defines a model and a grammar for describing the behaviour of a business process based on
interactions between the process and its partners.
A BPEL process is composed of activities that can
be combined through structured operators and
related through so-called control links. In addition
to the main process flow, BPEL provides event
handling, fault handling and compensation (i.e.,
undo) capabilities. In long-running business processes, BPEL applies a correlation mechanism to route messages to the correct process instance.
BPEL is layered on top of several XML specifications: WSDL, XML Schema and XPath. WSDL
message types and XML Schema type definitions
provide the data model used in BPEL processes.
XPath provides support for data manipulation. All
external resources and partners are represented
as WSDL services.
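Putting these layers together, the overall shape of a BPEL process document can be sketched as follows (element contents elided; the process name is a placeholder):

```xml
<!-- Skeleton of a BPEL (BPEL4WS 1.1) process document; "somePO" is a placeholder name -->
<process name="somePO"
         xmlns="http://schemas.xmlsoap.org/ws/2003/03/business-process/">
  <partnerLinks> ... </partnerLinks>        <!-- the parties the process interacts with -->
  <variables> ... </variables>              <!-- data typed by WSDL and XML Schema -->
  <correlationSets> ... </correlationSets>  <!-- routing of messages to instances -->
  <faultHandlers> ... </faultHandlers>
  <eventHandlers> ... </eventHandlers>
  <sequence>                                <!-- the main activity of the process -->
    ...
  </sequence>
</process>
```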
Figure 1. A purchase order process interacting with two partners: the client and the invoicing service. (Diagram not reproduced: the client exchanges a purchase order with the purchase order process through port type "purchasePT"; the process sends a request for price calculation to the invoicing service through port type "computePricePT" and receives an invoice back through port type "invoiceCallbackPT".)
Activities
A BPEL process definition relates a number of
activities. Activities are split into two categories:
basic and structured activities. Basic activities are
also called primitive activities. They correspond
to atomic actions and stand for work being performed within a process. Structured activities
impose behavioural and execution constraints on
a set of activities contained within them. Structured activities can be nested and combined in
arbitrary ways, thus enabling the expression of complex structures.
Basic activities. These include: invoke, invoking an operation on some Web service; receive, waiting for a message from an external partner; reply, replying to an external partner; wait, pausing for a certain period of time; assign, copying data from one place to another; throw, indicating errors in the execution; compensate, undoing the effects of already completed activities; exit, terminating the entire service instance; and empty, doing nothing. Below, we take a closer look at three of these activities: invoke, receive, and reply.
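In the non-XML notation introduced later in this chapter (bold element names, nesting by indentation), a fragment combining a few of these basic activities might be sketched as follows; the operation and partner names are invented for illustration:

```
begin sequence
  receive PurchaseOrder operation on partner link purchasing
  invoke CheckStock (request-response) operation on the seller
  assign stock result to the reply message
  reply PurchaseOrder operation on partner link purchasing
end sequence
```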
Invoke, receive, and reply are the three types of interaction activities defined in BPEL. Interaction activities must specify the partner link through which the interaction occurs, the operation involved, and the port type in the partner's WSDL interface.
WSDL snippet:
...
<partnerLinkType name="purchasingPLT">
  <role name="purchaseService">
    <portType name="purchasePT"/>
  </role>
</partnerLinkType>
<partnerLinkType name="invoicingPLT">
  <role name="invoiceService">
    <portType name="computePricePT"/>
  </role>
  <role name="invoiceRequester">
    <portType name="invoiceCallbackPT"/>
  </role>
</partnerLinkType>
...
BPEL snippet:
...
<partnerLinks>
  <partnerLink name="purchasing"
               partnerLinkType="purchasingPLT"
               myRole="purchaseService"/>
  <partnerLink name="invoicing"
               partnerLinkType="invoicingPLT"
               myRole="invoiceRequester"
               partnerRole="invoiceService"/>
</partnerLinks>
...
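To see how these declarations are put to use, the interaction activities below refer to the partner links just defined; the operation and variable names are invented for illustration:

```xml
<!-- Hypothetical sketch: sendPurchaseOrder, computePrice and the
     variable names are invented; the partner links are those above -->
<receive partnerLink="purchasing" portType="purchasePT"
         operation="sendPurchaseOrder" variable="order"
         createInstance="yes"/>
<invoke partnerLink="invoicing" portType="computePricePT"
        operation="computePrice" inputVariable="order"/>
```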
First, BPEL does not allow two receive activities to be active (i.e., ready to consume messages) at the same time if they have the same partner link, port type, operation, and correlation set, where the correlation set is used for routing messages to process instances (see the subsection on Correlation). If this happens, a built-in fault named conflictingReceive is raised at runtime.
Second, BPEL does not allow a request on a request-response operation to be accepted when an active receive exists to consume that request but a reply has not yet been sent to a previous request with the same operation, partner link, and correlation set. If this happens, a built-in fault named conflictingRequest is thrown.
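For instance, a flow that activates the two receives below concurrently would, assuming they share the same correlation set, raise the conflictingReceive fault; the names pl, pt and op are placeholders in the style of the exercises in Appendix 1:

```xml
<!-- Both receives share partner link, port type and operation;
     activating them at the same time raises conflictingReceive -->
<flow>
  <receive partnerLink="pl" portType="pt"
           operation="op" variable="v1"/>
  <receive partnerLink="pl" portType="pt"
           operation="op" variable="v2"/>
</flow>
```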
Structured activities. BPEL defines six structured activities: sequence, switch, pick, while, flow,
and scope. The use of these activities and their
combinations enable BPEL to support most of the workflow patterns described in van der Aalst, ter Hofstede, Kiepuszewski, and Barros (2003).
A sequence activity contains one or more activities that are performed sequentially. It starts once the first activity in the sequence starts, and completes when the last activity in the sequence completes. For example, Figure 6 defines a sequence of activities performed within the purchase order process shown in Figure 1. To improve readability, this and the following code snippets do not use XML syntax. Instead, BPEL element names are written in bold, while the level of nesting of elements is captured through indentation.
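In this notation, a sequence for the purchase order process of Figure 1 might be sketched as follows (a sketch only; the activity and operation names are invented):

```
begin sequence
  receive purchase order from the client
  invoke ComputePrice (request-response) operation on the invoicing service
  reply invoice to the client
end sequence
```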
A switch activity supports conditional routing between activities. It contains an ordered list of one or more conditional branches called case branches. The conditions of the branches are evaluated in order, and only the activity of the first branch whose condition holds true is taken. There is also a default branch, called the otherwise branch, which follows the list of case branches and is selected if no case branch is taken. This ensures that exactly one branch is always taken in a switch activity. The switch activity completes when the activity of the selected branch completes. For example, consider a supply-chain process which interacts with a buyer and a seller. Assume that the buyer has ordered
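In the same notation, a switch over such an order might be sketched as follows (the condition and activities are invented for illustration):

```
begin switch
  case (orderAmount > 1000)
    invoke ApplyDiscount (request-response) operation on the seller
  otherwise
    invoke ProcessOrder (one-way) operation on the seller
end switch
```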
A while activity supports repeated performance of an activity in a structured loop, that is, a
loop with one entry point and one exit point. The
iterative activity is performed until the specified
while condition (a boolean expression) no longer
holds true. For example, the pick activity defined
in Figure 8 can occur in a loop where the seller
is accepting line items for a large order from
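Such a structured loop might be sketched in the chapter's notation as follows (the loop condition and the activities are invented for illustration):

```
while (moreLineItemsExpected)
  begin pick
    onMessage LineItem from the buyer
      invoke AddLineItem (one-way) operation on the seller
    onAlarm after 'PT1H'
      assign moreLineItemsExpected := false
  end pick
```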
Control Links
The sequence, flow, switch, pick, and while activities described in the previous subsection provide a means of expressing structured flow dependencies. In addition to these constructs, BPEL provides another construct known as control links
which, together with the associated notions of
join condition and transition condition, support
the definition of precedence, synchronization and
conditional dependencies on top of those captured
by the structured activities.
A control link denotes a conditional transition between two activities. A join condition, which is associated with an activity, is a boolean expression over the tokens carried by the incoming control links of this activity. Each token, which represents the status of the corresponding control link, may take either a positive (true) or a negative (false) value. For example, a control link between activities A and B indicates that B cannot start before A has either completed or been skipped (e.g., when A is part of an unselected branch of a switch or pick). Moreover, activity B can only be executed if its associated join condition evaluates to true; otherwise B will not run. A transition condition, which is associated with a control link, is a boolean expression over the process variables (just like the conditions of a switch activity).
Figure 10. A flow activity modeling two concurrent questionnaire interactions in a supply-chain process

begin flow
  invoke FillQuestionnaire (request-response) operation on the buyer
  invoke FillQuestionnaire (request-response) operation on the seller
end flow
Figure 11. A directed graph representing the stock inventory check procedure within a supply-chain process. (Diagram not reproduced: an invoke StockResultInquiry activity has outgoing control links toFulfilment (StockResult > 100), toOutOfStock (100 > StockResult > 0) and toDiscontinued (StockResult = 0) leading, respectively, to a fulfilment activity, a throw OutOfStock activity and a throw ItemDiscontinued activity; links afterFulfilment, afterOutOfStock and afterDiscontinued then lead to an invoke UpdateStockResult activity.)
Event Handlers
The purpose of event handlers is to specify logic
to deal with events that take place concurrently
while the process is running. An event handler is
an event-action rule associated with a scope, and
is in the form of an event followed by an activity.
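In the chapter's notation, event handlers attached to a scope might be sketched as follows (the event and activity names are invented for illustration):

```
begin eventHandlers
  onMessage CancelOrder operation on partner link purchasing
    invoke CancelFulfillment (one-way) operation on the seller
  onAlarm after 'P7D'
    throw OrderTimedOut fault
end eventHandlers
```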
Figure 12. Using control links to model the stock inventory check procedure sketched in Figure 11

begin flow (suppressJoinFailure = yes)
  begin link declaration
    link toFulfillment
    link toOutOfStock
    link toDiscontinued
    link afterFulfillment
    link afterOutOfStock
    link afterDiscontinued
  end link declaration
  invoke StockResultQuery (request-response) operation on the seller
    source of link toFulfillment with
      transitionCondition (StockResult > 100)
    source of link toOutOfStock with
      transitionCondition (StockResult > 0 and StockResult < 100)
    source of link toDiscontinued with
      transitionCondition (StockResult = 0)
  activity: performing fulfillment work
    joinCondition LinkStatus(toFulfillment)
    target of link toFulfillment
    source of link afterFulfillment
      transitionCondition (true)
  throw OutOfStock fault
    joinCondition LinkStatus(toOutOfStock)
    target of link toOutOfStock
    source of link afterOutOfStock
      transitionCondition (true)
  throw ItemDiscontinued fault
    joinCondition LinkStatus(toDiscontinued)
    target of link toDiscontinued
    source of link afterDiscontinued
      transitionCondition (true)
  invoke StockResultUpdate (one-way) operation on the seller
    joinCondition LinkStatus(afterFulfillment) or
                  LinkStatus(afterOutOfStock) or
                  LinkStatus(afterDiscontinued)
    target of link afterFulfillment
    target of link afterOutOfStock
    target of link afterDiscontinued
end flow
Fault Handling
Fault handling in a business process enables the
process to recover locally from anticipated faults that may arise during its execution. For example, consider a fault caused by insufficient funds in the client's account for payment during a purchase order process. The fault may be handled by requesting the details of another available account from the client, without having to restart the entire process.
BPEL considers three types of faults. These
are: application faults (or service faults), which
are generated by services invoked by the process,
such as communication failures; process-defined
faults, which are explicitly generated by the
process using the throw activity; and system faults,
which are generated by the process engine, such
as the conflictingReceive, conflictingRequest and join failures introduced earlier. Note that the first two types of faults are usually user-defined, while the last type consists of built-in faults defined in BPEL.
Fault handlers specify reactions to internal or
external faults that occur during the execution of
a scope, and are defined for a scope using catch
activities. Unlike event handlers, fault handlers
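In the chapter's notation, fault handlers attached to a scope might be sketched as follows (the fault and activity names are invented for illustration):

```
begin scope
  begin faultHandlers
    catch OutOfStock fault
      invoke NotifyBuyer (one-way) operation on the buyer
    catchAll
      reply with a fault message to the client
  end faultHandlers
  ... main activity of the scope ...
end scope
```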
Compensation
As part of the exception handling, compensation
refers to application-specific activities that
attempt to undo the already completed actions.
For example, consider a client who requests to cancel an air ticket reservation made through a ticket order process. The process needs to carry out the following compensation actions: cancelling the reservation with the airline and, optionally, charging the client a fee if fees apply to the cancellation of a reservation.
In BPEL, compensation actions are specified
within a compensation handler. Each scope, except
the top level scope (i.e. process scope), provides
one compensation handler that is defined either
explicitly or implicitly. Similarly to a default fault
handler, an implicit (or default) compensation
handler is created for a scope, if the scope is asked
for compensation but an explicit compensation
handler is missing for that scope. A fault handler
or the compensation handler of a given scope,
may perform a compensate activity to invoke the
compensation of one of the sub-scopes nested
within the given scope. Similarly to the control
link restrictions applied to event handlers, control
links are not allowed to cross the boundary of
compensation handlers.
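In the chapter's notation, a scope with an explicit compensation handler, invoked by a fault handler of the enclosing scope, might be sketched as follows (all names are invented for illustration):

```
begin scope enclosingScope
  begin scope bookTicket
    compensationHandler
      invoke CancelReservation (one-way) operation on the airline
    invoke ReserveTicket (request-response) operation on the airline
  end scope
  begin faultHandlers
    catchAll
      compensate scope bookTicket
  end faultHandlers
end scope
```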
It is important to mention that whether the
compensation handler of a scope is available for
Data Handling
In the previous subsections, we mainly focused on the control logic of a BPEL process. Careful readers may have noticed that process data is necessary for the process logic to make data-driven decisions (e.g., in a switch activity). In the following, we introduce how data is represented and manipulated in BPEL.
Messages. Business protocols specified in
BPEL prescribe exchange of messages between
interacting Web services. These messages are
WSDL messages defined in the appropriate WSDL
definitions. Briefly, a message consists of a set
of named parts, and each of these parts is typed
generally using XML Schema. For example, in
Figure 15 below, the orderMsg is shown with three
message parts: an orderNumber of an integer type,
an orderDetails of a string type, and a timeStamp
of a dateTime type. Note that the integer, string
and dateTime are all simple XML Schema types. If
a complex XML Schema type is needed, it needs
to be defined in the corresponding XML Schema
file (see Section on BPEL At Work).
<message name="orderMsg">
  <part name="orderNumber"
        type="integer"/>
  <part name="orderDetails"
        type="string"/>
  <part name="timeStamp"
        type="dateTime"/>
</message>

<variables>
  <variable name="order"
            messageType="orderMsg"/>
  <variable name="order_backup"
            messageType="orderMsg"/>
  <variable name="number"
            type="integer"/>
</variables>
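Given these declarations, an assign activity can copy data between the variables above, for example backing up the whole order message and extracting its orderNumber part (a sketch using BPEL4WS 1.1 copy syntax):

```xml
<assign>
  <!-- back up the whole order message -->
  <copy>
    <from variable="order"/>
    <to variable="order_backup"/>
  </copy>
  <!-- extract the orderNumber part into the integer variable -->
  <copy>
    <from variable="order" part="orderNumber"/>
    <to variable="number"/>
  </copy>
</assign>
```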
Correlation
Business processes may in practice occur over
a long period of time, possibly days or months.
In long-running business processes, it is necessary to route messages to the correct process
instance. For example, when a request is issued
from a partner, it is necessary to identify whether
a new business process should be instantiated
or the request should be directed to an existing
process instance. Instead of using the concept of
instance ID as often used in distributed object
system, BPEL reuses the information that can
be identified from the specifically marked parts
in incoming messages, such as order number or
client id, to route messages to existing instances
of a business process. This mechanism is known
as correlation. The concept of correlation set is
then defined by naming specific combinations of
certain parts in the messages within a process.
This set can be used in receive, reply and invoke
activities, the onMessage branch of pick activities,
and the onEvent handlers.
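A correlation set declaration and its use in an instance-creating receive might be sketched as follows; the property orderId and the operation name are invented, and such properties would be defined in the corresponding WSDL document:

```xml
<!-- Hypothetical sketch: the property "orderId" would be declared in WSDL -->
<correlationSets>
  <correlationSet name="orderCorrel" properties="orderId"/>
</correlationSets>
...
<receive partnerLink="purchasing" portType="purchasePT"
         operation="sendPurchaseOrder" variable="order"
         createInstance="yes">
  <correlations>
    <correlation set="orderCorrel" initiate="yes"/>
  </correlations>
</receive>
```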
Similarly to variables, each correlation set is
defined within a scope. Global correlation sets
are declared in the process scope, and local correlation sets are declared in the scopes nested
within a process. Correlation sets are only visible in the scope (Q) in which they are declared and in all scopes nested in Q. Also, a correlation set can
BPEL at Work
This section describes an example of a BPEL process which provides a sales service. This process, named salesBP, interacts with a customer process (customerBP) by means of asynchronous messages. The process salesBP enacts the role of service provider, whilst the customer is the service requester.
Process Description
Figure 19 depicts the behaviour of the process
salesBP. The process is instantiated upon receiving a request for quote (rfQ), which includes the
description and the amount of the goods needed, a
unique identifier of the request (rfQId), and a deadline (tO). Next, the process checks the availability
of the amount of the goods being requested. If not
available, a rejectRfQ is sent back to the customer, providing the reason for the rejection. Otherwise, the process prepares a quote with the cost of the offer and then sends it back to the customer. After
WSDL Document
BPEL Extensions
The BPEL specification defines only the kernel of the language: it mainly covers the control logic, provides limited definitions of data handling, and even less on the communication aspect. Given that BPEL is already a very complicated language, a complete specification covering full definitions would be less maintainable, and the corresponding implementations would become less manageable. For this reason, the OASIS technical committee on WS-BPEL decided to keep the scope of the current specification and to allow future extensions to be made in separate documents. So far, three extensions have been proposed to BPEL.
BPEL-SPE. BPEL currently does not support
the modularization and reuse of fragments of a
business process. This has driven the publication
of a joint proposal, the WS-BPEL Extension for Sub-Processes, known as BPEL-SPE (Kloppmann,
Koenig, Leymann, Pfau, Rickayzen, Riegen, et
al., 2005 September), by two major companies
involved in Web services standards: IBM and
SAP. BPEL-SPE proposes an extension to BPEL
that allows for the definition of sub-processes
which are fragments of BPEL code that can be
reused within the same or across multiple BPEL
processes.
BPEL4People. In practice, many business
process scenarios require human user interactions. For example, it may be desirable to define
which people are eligible to start a certain business
process; a process may be stuck because no one
has been assigned to perform a particular task;
BPEL-Related Research Efforts

There have been a number of research activities conducted on BPEL. These include: systematic
Figure 23. Oracle JDeveloper 10.1.2: Graphical view of the BPEL process salesBP
Figure 24. OracleBPEL Process Manager Console 10.1.2: Execution flow of a running instance of the
BPEL process salesBP
Choreography Conformance Checking Based on BPEL
To coordinate a collection of inter-communicating Web services, the concept of choreography defines collaborations between interacting parties, that is, the coordination process of interconnected Web services that all partners need to agree on. A choreography specification is used to describe the desired behaviour of interacting parties. Languages such as BPEL and the Web Services Choreography Description Language (WS-CDL) (Kavantzas, Burdett, Ritzinger, Fletcher, & Lafon, 2004, December) can be used to define a desired choreography specification.
Assuming that there is a running process and
a choreography specification, it is interesting to
check whether each partner (exposed as Web
service) is well behaved. Note that partners have
no control over each other's services. Moreover,
partners will not expose the internal structure
and state of their services. This triggers the question of conformance: Do all parties involved
operate as described? The term choreography
conformance checking is then used to refer to
this question. To address the question, one can
assume the existence of both a process model
which describes the desired choreography and
an event log which records the actual observed
behaviour, that is, an actual choreography.
Choreography conformance checking benefits
from the coexistence of event logs and process
models and may be viewed from two angles. First
Conclusion
In this chapter, we have presented the core concepts of BPEL and the usage of its constructs to
describe executable service-oriented processes.
We have also discussed extensions to BPEL that
have been proposed by tool vendors to address
some of its perceived limitations, as well as
long-term challenges related to the use of BPEL
in the context of rigorous system development
methodologies.
Currently, BPEL is being used primarily as a
language for implementing Web services using
a process-oriented paradigm. In this respect,
BPEL is competing with existing enhancements
to mainstream programming environments such
as WSE and WCF (which enhance the Microsoft
.Net framework with functionality for Web service
development), or Apache Axis and Beehive (which
do the same for the Java platform). Certainly, BPEL
is making inroads in this area, and there is little
doubt that it will occupy at least a niche position in
the space of service implementation approaches.
Several case studies related to the use of BPEL in
system development projects have been reported
in the trade press. These include a report of BPEL
use at the European Space Agency and in an outsourcing project conducted by Policy Systems for
a state health care service (http://tinyurl.com/zrcje
and http://tinyurl.com/krg3o).
However, it must not be forgotten that BPEL can
also be used to describe the behaviour of services
at a more abstract level. Unfortunately, up to now,
tool vendors have given little attention to exploring
the possibilities opened by the description of BPEL
abstract processes. BPEL abstract processes can be
used to represent service behaviour at different
References
Akkiraju, R., Farrell, J., Miller, J., Nagarajan, M.,
Schmidt, M., Sheth, A., & Verma, V. (2005, April).
Web Service Semantics WSDL-S (Technical
note). University of Georgia and IBM. Retrieved
October 18, 2006, from http://lsdis.cs.uga.edu/library/download/WSDL-S-V1.html
Ankolekar, A., Burstein, M., Hobbs, J., Lassila,
O., Martin, D., McDermott, D., McIlraith, S.,
Narayanan, S., Paolucci, M., Payne, T., & Sycara,
K. (2002). DAML-S: Web service description
for the Semantic Web. In Proceedings of the 1st
International Semantic Web Conference (pp.
348-363).
BEA Systems, Microsoft, IBM & SAP (2003,
May). Business process execution language for
Web services (BPEL4WS). Retrieved October
18, 2006, from ftp://www6.software.ibm.com/
software/developer/library/ws-bpel.pdf
Berardi, D., Calvanese, D., De Giacomo, G., Hull, R., & Mecella, M. (2005). Automatic

van der Aalst, W.M.P., ter Hofstede, A.H.M., Kiepuszewski, B., & Barros, A.P. (2003). Workflow
Appendix 1

Exercises
1. Describe two different ways supported by BPEL for describing business processes. What are the differences between them? What are their usages?
2. Describe how BPEL uses WSDL, XML Schema, and XPath.
3. Define the partner link between a purchase order process and the external shipping service, and the corresponding partner link type. In this relationship, the purchase order process plays the role of the service requester, and the shipping service plays the role of the service provider. The requester role is defined by a single port type called shippingCallbackPT. The provider role is defined by a single port type called shippingPT.
4. Consider the following fragments of a BPEL process definition:
<flow>
  <sequence>
    <invoke name="inv" partnerLink="pl"
            portType="pt" operation="op"
            inputVariable="var"/>
    <receive name="rcv" partnerLink="pl"
             portType="pt" operation="op"
             variable="var"/>
  </sequence>
  <sequence>
    <receive name="rcv" partnerLink="pl"
             portType="pt" operation="op"
             variable="var"/>
    <receive name="rcv" partnerLink="pl"
             portType="pt" operation="op"
             variable="var"/>
  </sequence>
</flow>
(a) Write down all possible execution sequences of activities in the above definition.
(b) Can we add the following pick activity in parallel to the two existing sequence activities in the above flow? If yes, write down all possible execution sequences of activities in this updated process definition; otherwise explain why not.
<pick>
  <onMessage partnerLink="pl" portType="pt"
             operation="op" variable="var">
    <invoke name="inv" partnerLink="pl"
            portType="pt" operation="op"/>
  </onMessage>
  <onAlarm for="PDTH">
    <exit/>
  </onAlarm>
</pick>
5. This exercise involves two interacting BPEL processes P1 and P2. P1 consists of a sequence of activities starting with a receive activity and ending with a reply activity. The pair of receive and reply defines an interaction with process P2. In P2, an invoke activity calls a request-response operation on P1, which triggers the execution of the above pair of receive and reply activities in P1.
(a) Define an appropriate partner link between P1 and P2 (assume that P1 plays myRole, and P2 plays partnerRole).
(b) Define the pair of receive and reply activities in P1.
(c) Define the invoke activity in P2.
6. Describe the difference between the switch and pick constructs. Given the four scenarios described below, which of them can be defined using switch and which can be defined using pick?
(a) After a survey is sent to a customer, the process starts to wait for a reply. If the customer
returns the survey in two weeks, the survey is processed; otherwise the result of the survey
is discarded.
(b) Based on the client's credit rating, the client's loan application is either approved or requires further financial capability analysis.
(c) After an insurance claim is evaluated, based on the findings the insurance service either
starts to organize the payment for the claimed damage, or contacts the customer for further
details.
(d) The escalation service of a call centre may receive a storm alert from a weather service which
triggers a storm alert escalation, or it may receive a long waiting time alert from the queue
management service which triggers a queue alert escalation.
7. The diagram below sketches a process with five activities A0, A1, A2, A3 and A4. A multi-choice node splits one incoming control flow into multiple outgoing flows. Based on the conditions associated with these outgoing flows, one or more of them may be chosen. A sync-merge node synchronises all active incoming control flows into one outgoing flow. Based on the above, sketch two possible BPEL definitions for this process using sequence, flow and switch constructs. Also, sketch another BPEL definition of the process using only control link constructs (within a flow).
(Diagram for Exercise 7 not reproduced: activity A0 is followed by a multi-choice node whose outgoing flows, guarded by conditions such as [y<=6], lead to intermediate activities that a sync-merge node synchronises into one flow before the final activity.)
8. The definition below specifies the execution order of the activities within a BPEL process:

begin flow
  begin switch
    case conditionC1: activityA1
    case conditionC2: activityA2
  end switch
  while conditionC3
    activityA3
  begin sequence
    activityA4
    activityA5
  end sequence
end flow
(a) Can we create the following two control links? Justify your answer.
i) a control link leading from activityA1 to activityA3
ii) a control link leading from activityA3 to activityA5
(b) Can we re-define the original process using only control links within the flow activity? If
so, re-write the process definition; otherwise explain why not.
(c) Assume that there exist two control links: one leading from activityA1 to activityA4, the
other from activityA2 to activityA4. Both links have a default transition condition, that
is, a transition condition that always evaluates to true if the source of the link is executed.
Consider the following two scenarios:
i) activityA4 has a join condition that is a disjunction of all incoming links.
ii) activityA4 has a join condition that is a conjunction of all incoming links.
In both scenarios, activityA4 has its suppressJoinFailure attribute set to yes.
Determine whether activityA4 will be performed in each scenario. Justify your answer and provide a possible execution sequence for each scenario.
(d) What could verification do when analysing a syntactically correct BPEL process? Argue
why automated verification of a BPEL specification is useful.
9. Sketch the control logic of a BPEL process for requesting quotes from an a priori known set of
N suppliers. The process is instantiated upon receiving a QuoteServiceRequest from the Client,
and then a QuoteRequest is sent in parallel to each of the N suppliers (Supplier1, Supplier2, ...,
SupplierN). Next, the process waits for QuoteResponse from these suppliers. Assume that (a) each
supplier replies with at most one response and (b) only M out of N responses are required (M<=N),
which means that after receiving the responses from M suppliers, the process can continue without
waiting for the responses from the remaining N-M suppliers. To provide the ability to define how
many responses are required, a loop is created that repeats until the required number of responses
has been received. The responses are collected in the order in which they are received. For each response
received, the number of responses received (NofResponse) is incremented, and the variable containing the result (Result) so far is updated. Also, to provide the ability to stop collecting responses
after some period of time (e.g., 2 hours), the above loop is contained within a scope activity that
has an alarm event handler. If the alarm is triggered, an exception (TimeOutFault) is thrown to
be caught in the outer scope, thus allowing the process to exit the loop before it finishes. If the
exception is thrown, then all that needs to be done is to incorporate a Timed Out indication to
the Result. Finally, the process completes by sending the Result to the Client.
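The control logic just described (collect M of N responses inside a loop, with an alarm that cuts the wait short) can be prototyped outside BPEL. The sketch below is plain Python, not BPEL: the names TimeOutFault, Result and NofResponse follow the exercise text, while everything else is invented for illustration.

```python
import time

class TimeOutFault(Exception):
    """Raised by the 'alarm handler' when the collection deadline passes."""

def collect_quotes(responses, m, deadline_s):
    """Collect quote responses until M have arrived or the deadline passes.
    `responses` is an iterable of (supplier, quote) pairs in arrival order."""
    result = []                       # the Result variable of the process
    nof_response = 0                  # the NofResponse counter
    start = time.monotonic()
    source = iter(responses)
    try:
        while nof_response < m:       # the while loop of the exercise
            if time.monotonic() - start > deadline_s:
                raise TimeOutFault()  # the onAlarm handler throwing the fault
            try:
                supplier, quote = next(source)
            except StopIteration:
                break                 # no further responses will arrive
            result.append((supplier, quote))
            nof_response += 1
    except TimeOutFault:
        result.append(("status", "Timed Out"))  # caught in the outer scope
    return result                     # finally, Result is sent to the Client

# three suppliers reply but only two responses are required (M = 2, N = 3)
print(collect_quotes([("S1", 10), ("S2", 12), ("S3", 9)], m=2, deadline_s=5))
```

In the BPEL version the same structure appears as a while inside a scope whose alarm event handler throws TimeOutFault to an enclosing fault handler.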
10. Below is the BPEL code for the definition of a Supplier abstract process. Since it is an abstract
BPEL process, not all elements are fully specified. In particular, you may note that the condition
in each while loop is omitted, which means that the loop may execute an arbitrary number of
times.
<process name="Supplier">
<partnerLinks>
<partnerLink name="client"
partnerLinkType="clientPLT"
myRole="supplierProvider"
partnerRole="supplierRequestor"/>
</partnerLinks>
<variables>
<variable name="inputVariable"
messageType="supplierRequestMessage"/>
<variable name="outputVariable"
messageType="supplierResponseMessage"/>
</variables>
<sequence>
<receive name="order"
partnerLink="client" portType="servicePT"
operation="order" variable="inputVariable"
createInstance="yes"/>
<invoke name="orderResponse"
partnerLink="client"
portType="serviceCallbackPT"
operation="orderResponse"
inputVariable="outputVariable"/>
<scope name="cancellationScope">
<faultHandlers>
<catch faultName="orderChange">
<sequence>
<invoke name="orderChangeResponse"
partnerLink="client"
portType="serviceCallbackPT"
operation="orderChangeResponse"
inputVariable="outputVariable"/>
<while>
<invoke name="orderChangeResponse"
partnerLink="client"
portType="serviceCallbackPT"
operation="orderChangeResponse"
inputVariable="outputVariable"/>
</while>
</sequence>
</catch>
</faultHandlers>
<eventHandlers>
<onMessage partnerLink="client"
portType="servicePT"
operation="change"
variable="outputVariable">
<throw name="throwFault" faultName="orderChange"/>
</onMessage>
</eventHandlers>
<while>
<invoke name="orderResponse"
partnerLink="client"
portType="serviceCallbackPT"
operation="orderResponse"
inputVariable="outputVariable"/>
</while>
</scope>
</sequence>
</process>
(a) Given the following sequences of executions, indicate which of them are possible and which
of them are not possible based on the above definition. Justify your answer.
i) receive order;
ii) receive order, send orderResponse;
iii) receive order, send orderResponse, receive change;
iv) receive order, send orderResponse, send orderResponse, receive change, send orderChangeResponse;
v) receive order, send orderResponse, receive change, send orderResponse, send orderChangeResponse.
(b) In the current process definition, the execution sequence receive order, receive change, send
orderChangeResponse is not possible. Indicate what minimal changes need to be made to
the current process definition, so that this execution sequence becomes possible and all the
previous valid execution sequences are preserved.
Chapter IX
Abstract
The promise of dynamic selection and automatic integration of software components written to Web
services standards is yet to be realized. This is partially attributable to the lack of semantics in the
current Web service standards. To address this, the Semantic Web community has introduced semantic
Web services. By encoding the requirements and capabilities of Web services in an unambiguous and
machine-interpretable form, semantics make the automatic discovery, composition and integration of
software components possible. This chapter introduces Semantic Web services as a means to achieve
this vision. It presents an overview of Semantic Web services, their representation mechanisms, related
work and use cases.
Introduction
Web services show promise to address the needs
of application integration by providing a standards-based framework for exchanging information dynamically between applications. Industry
efforts to standardize Web service description,
discovery and invocation have led to standards
such as WSDL (2001), UDDI (2002), and SOAP
(2000) respectively. These industry standards,
in their current form, are designed to represent
information about the interfaces of services, how
they are deployed, and how to invoke them, but
are limited in their ability to express what the
Copyright 2007, Idea Group Inc., distributing in print or electronic forms without written permission of IGI is prohibited.
[Figure: Semantic Web service lifecycle activities. Modeling activities: domain modeling, ontology creation and management, service annotation. Build-time activities: service composition, service registry. Deployment/runtime activities: service binding, runtime adaptation, service invocation, service execution monitoring.]
Service Discovery
Automatically discovering services involves finding a service that matches a given set of requirements (functional as well as nonfunctional) from
among a repository (either central or distributed)
of services. A match could be syntactic (based
on type and structure matching) and/or semantic
(based on lexical or other name similarities and
ontology matching). In a business-to-consumer
(B2C) setting, this would mean finding a service
..
<message name="request">
  <part name="SKU_in" type="xsd:string"/>
  <part name="reqAmt_in" type="xsd:float"/>
  <part name="reqDate_in" type="xsd:string"/>
  <part name="acctId_in" type="xsd:string"/>
</message>
<message name="response">
  <part name="quantity_out" type="xsd:float"/>
</message>
<portType name="verifyInventoryAvailability">
  <operation name="inventoryAvailability">
    <input message="tns:request" name="request"/>
    <output message="tns:response" name="response"/>
  </operation>
</portType>
..
[Table: degrees of match based on name and semantic similarity. Differing names or differing semantics lead to poor matches or false positives; when both differ, there is no match.]
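The idea of blending syntactic (name-based) and semantic (ontology-based) matching during discovery can be sketched as follows. This is a toy illustration, not a real matchmaker: the mini concept hierarchy, the service records, and the 0.4/0.6 weighting are all invented.

```python
# A toy matchmaker combining lexical name similarity with an
# ontology-based check on input concepts. Ontology and services invented.
from difflib import SequenceMatcher

SUBCLASS_OF = {"UPC": "EAN", "EAN": "ProductCode"}  # toy concept hierarchy

def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or is a transitive subclass of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False

def match_score(request, service):
    """Blend a lexical name similarity with an ontology-based input check."""
    name_sim = SequenceMatcher(None, request["name"], service["name"]).ratio()
    # semantic part: fraction of requested input concepts the service covers
    covered = sum(
        any(is_a(offered, wanted) for offered in service["inputs"])
        for wanted in request["inputs"]
    )
    semantic = covered / len(request["inputs"])
    return 0.4 * name_sim + 0.6 * semantic  # invented weighting

request = {"name": "checkInventory", "inputs": ["ProductCode", "dateTime"]}
services = [
    {"name": "inventoryAvailability", "inputs": ["UPC", "dateTime"]},
    {"name": "checkWeather", "inputs": ["city"]},
]
best = max(services, key=lambda s: match_score(request, s))
print(best["name"])
```

Note how the inventory service wins even though its name differs lexically, because UPC is subsumed by the requested ProductCode concept; a purely syntactic matcher would miss this.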
Service Invocation
Service discovery can help identify a suitable
service at a semantic level. For example, it can
help identify an inventory checking service that
fits within the general parameters specified in
the request. However, in order for the application on the requester's side to invoke the chosen
service automatically, a more detailed level of
matching may be required to identify the actual
interface mappings. For example, a service that
Service Composition
Composing existing Web services to deliver new
functionality is a requirement in many business
domains. Service composition extends the notion
of service discovery by enabling automatic composition of services to meet the requirements of a given high-level task description. For example, to fulfill a high-level task such as "place my purchase order with a supplier that can supply n number of parts of type x by a date y," one may require
composing services that can perform digital signing and encryption if the supplier requires the
information to be secure. Figure 3 shows such a
composition. Currently, if such transformations
are required to connect up services, a user must
Representation Mechanisms for Semantic Web Services
Web services can be broadly classified as simple
and complex services. The Semantic Web community defines a simple or atomic Web service as a Web
service where a single Web-accessible computer
program, sensor, or device is invoked by a request
message, performs its task and perhaps produces
a single response to the requester. With atomic
services there is no ongoing interaction between
the user and the service (OWL-S, 2005). A simple
Web service is typically considered stateless. In
contrast a complex or composite service is defined
as one that is composed of multiple more primitive
services, and may require an extended interaction
or conversation between the requester and the set
of services that are being utilized (OWL-S, 2005).
A complex Web service typically involves data
flow from one step to another, possibly carrying
state information. For example, a purchase order
process may involve checking the availability of
an item and then placing the purchase order with
the same supplier with whom item availability has
been verified. This process requires information to
[Figure: a service composition for ACME Inc. A purchase order processor service takes an order document as input and outputs an order confirmation document (an instance of a plain-text document); a digital signing service produces a signed document, which an encryption service turns into the signed and encrypted order confirmation document returned to ACME Inc.]
OWL-S
OWL-S defines an upper ontology (please refer to the endnotes section for a definition of the term upper ontology) for describing the properties and capabilities of Web services in OWL (OWL,
2004). It is intended to enable users and software agents to automatically discover, invoke,
compose, and monitor Web resources offering
services, under specified constraints. It defines
high level constructs such as a service profile: to
represent the interfaces of services including inputs, outputs, preconditions and effects, a service
(process) model to represent the details of inner
working of a service, and a service grounding to
provide information about how to use a service.
Whereas the OWL-S profile model views a service as an atomic process, the OWL-S service (process) model captures the state of a service as a complex interaction process. While the OWL-S profile defines
a model for describing the functional properties
of a service via constructs such as inputs, outputs,
preconditions and effects (sometimes referred to
as IOPEs), OWL-S service model uses workflow
constructs such as sequence, if-then-else, fork,
repeat-until and so forth, to define composite processes. The OWL-S grounding model defines the
necessary links to Web service industry standard
WSDL to use its invocation model. OWL-S, the
result of a DARPA funded project, is among the
first efforts to define a formal model for Semantic
Web services. This work helped trigger many
research efforts both in the academic and industrial research communities. OWL-S is one of the
submissions to W3C for defining a framework for
semantics in Web services.
The following OWL-S snippets illustrate how
the checkInventory() service discussed in the Motivation section can be represented in OWL-S. The
profile, process and grounding models together
with a corresponding ontology define a semantic
Web service.
<profileHierarchy:CheckInv rdf:ID="Profile_CheckInventory_Service">
  <profile:serviceName>CheckInventoryService_Agent</profile:serviceName>
  ...
  <profile:hasInput rdf:resource="http://www.daml.org/services/owl-s/1.0/CheckInvProcess.owl#UPC"/>
  <profile:hasInput rdf:resource="http://www.daml.org/services/owl-s/1.0/CheckInvProcess.owl#DueDate"/>
  <profile:hasInput rdf:resource="http://www.daml.org/services/owl-s/1.0/CheckInvProcess.owl#Qty"/>
  <profile:hasOutput rdf:resource="http://www.daml.org/services/owl-s/1.0/CheckInvProcess.owl#ItemAvailConfirmation"/>
</profileHierarchy:CheckInv>
OWL-S profile model snippet for CheckInventory()
service
<process:AtomicProcess rdf:ID="CheckInv">
  <process:hasInput>
    <process:Input rdf:ID="UPC">
      <process:parameterType rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
    </process:Input>
  </process:hasInput>
  <process:hasInput>
    <process:Input rdf:ID="dueDate">
      <process:parameterType rdf:resource="http://www.w3.org/2001/XMLSchema#dateTime"/>
    </process:Input>
  </process:hasInput>
  <process:hasInput>
    <process:Input rdf:ID="numItems">
      <process:parameterType rdf:resource="http://www.w3.org/2001/XMLSchema#float"/>
    </process:Input>
  </process:hasInput>
</process:AtomicProcess>
OWL-S process model snippet for CheckInventory()
service
<message name="CheckInv_Input">
  <part name="UPC" owl-s-wsdl:owl-s-parameter="CheckInv:#UPC"/>
  <part name="dueDate" owl-s-wsdl:owl-s-parameter="CheckInv:#DueDate"/>
  <part name="numItems" owl-s-wsdl:owl-s-parameter="CheckInv:#Qty"/>
</message>
OWL-S grounding model for CheckInventory() service
WSMO
Web Service Modeling Ontology (WSMO, 2004),
a European Union funded project, proposes four
main elements: (1) ontologies, which provide the
terminology used by other WSMO elements, (2)
Web service descriptions, which describe the
functional and behavioral aspects of a Web service, (3) goals that represent user desires, and (4)
mediators, which aim at automatically handling
interoperability problems between different
WSMO elements (WSMO, 2004).
Just as OWL-S relies on the OWL ontology language for defining its upper ontology, the WSMO framework relies on the Web Service Modeling
Language (WSML) (Bruijn, Fensel, Keller,
Kifer, Lausen, Krummenacher, Polleres, &
Predoiu, 2005) for language constructs. Just as
OWL defines multiple flavors, namely OWL-Lite, OWL-DL, and OWL-Full, WSML has variants
such as WSML-Core, WSML-DL, WSML-Rule,
and WSML-Flight. The conceptual difference
between OWL and WSML is primarily in the
formal logic languages that they rely on. While
OWL relies on Description Logics (DL), WSML
is based on different logical formalisms, namely,
[Figure: an ontology fragment for product codes in which UPC is a subclass of EAN, with UPC Version A and UPC Version E as types of UPC.]
ontology _"http://www.example.org/ontologies/exam."
concept Item
  nonFunctionalProperties
    dc#description hasValue "concept of an Item"
  endNonFunctionalProperties
  hasUPC ofType xsd#string
  hasDate ofType xsd#dateTime
  hasQty ofType xsd#float
concept AvailabilityResponse
  nonFunctionalProperties
    dc#description hasValue "concept of an AvailabilityResponse"
  endNonFunctionalProperties
  hasConfirmation ofType xsd#string
..
/*****************************
 * WEBSERVICE
 *****************************/
webService _"http://example.org/CheckInv"
capability
  sharedVariables ?item
  precondition
    nonFunctionalProperties
      dc#description hasValue "The input has to contain a UPC, due date and qty."
    endNonFunctionalProperties
    definedBy
      ?item memberOf Item.
  effect
    nonFunctionalProperties
      dc#description hasValue "After the item availability check confirmation is sent"
    endNonFunctionalProperties
    definedBy
      ?conf memberOf AvailabilityResponse
        [forItem hasValue ?item].
interface
  choreography _"http://example.org/exChoreography"
  orchestration _"http://example.org/exOrchestration"
WSDL-S
Keeping in view upward compatibility with WSDL, semantic support for XML Schema, and industry adoption issues, WSDL-S (2005) proposes an incremental approach to add semantic annotations
to WSDL documents. In WSDL-S, users can
add semantic annotations to WSDL documents
using the extensibility framework defined in
the WSDL specification. The semantic annotations could be references to concepts defined
in an external ontology. WSDL-S as such does
not prescribe any particular ontology language
and is defined to be agnostic to the semantic
representation language. Users can use OWL or
WSMO or any other modeling language of their
choice. The WSDL-S work came out of the METEOR-S (2003) project at the University of Georgia and was later significantly revised jointly by IBM and the METEOR-S team. Figure 5 captures the essence
of WSDL-S. It highlights how the domain model
[Figure 5: WSDL-S annotations. Within a WSDL document, types (complexType elements), interface operations, and precondition and effect declarations each carry semantic annotations referencing concepts in an external domain ontology.]
..
xmlns:wssem="http://www.myschema.com/schemas/00/wssem"
xmlns:ElectronicsOnt="http://www.standards.com/ontologies/ElectronicsDomain.owl">
<message name="CheckInventoryServiceRequest">
Table 1.
OWL-S Profile Model. Semantic language: OWL. Formalism: Description Logics. Comment: defines connectivity to WSDL via the grounding model, but overlaps exist in defining inputs, outputs and operations.
WSDL-S. Semantic language: agnostic to ontology language (can work with OWL, WSMO, UML, XML or any other modeling language). Formalism: agnostic; users are free to pick a formalism of their choice. Comment: specifies annotations directly in WSDL as extensibility elements.
WSMO. Semantic language: WSML. Formalism: Description Logics, First-Order Logic and Logic Programming (F-Logic). Comment: defines connectivity to WSDL via the grounding model, but overlaps exist in defining inputs, outputs and operations.
Discussion
Table 1 summarizes the Semantic Web service
representation language submissions to W3C. This
is a modified version of the table given by Sheth
(Sheth, Verma, & Gomadam, 2006).
OWL-S, WSMO, and WSDL-S have all been submitted to W3C as alternate proposals to define a framework for semantics in Web services. While in principle all three proposals use similar conceptual underpinnings, the difference is primarily in scope. As discussed, the OWL-S proposal presents a framework for simple (atomic) as well as complex (process) services. The WSMO proposal includes frameworks for Web service choreography and orchestration in its scope. WSDL-S, on the other hand, proposes an approach to add semantics to Web services, specifically aligning itself with the industry standard WSDL while excluding process specification models from its scope. In addition, while OWL-S and WSMO services are closely tied to the OWL and WSML ontology languages respectively, WSDL-S is agnostic to ontology languages; it can work with any ontology language because the annotations are externalized. WSDL-S thus has nothing to say about semantic Web processes, as it only deals with adding semantic annotations to simple Web services represented as WSDL documents.
At the time of writing of this chapter, a new working group has been initiated by the W3C to help define semantic annotations for Web services. The working group is called Semantic Annotations for Web Services (SAWSDL, http://www.w3.org/2002/ws/sawsdl/). The charter for the working group (http://www.w3.org/2005/10/sa-ws-charter) indicates that "A Member Submission, WSDL-S, related to this work, has been acknowledged by W3C and should be used as one input to the Working Group."
Apart from these, efforts to create architectures and language requirements for Semantic
Web services were started under the Semantic
Service Discovery
Many approaches, such as information retrieval, AI, database schema matching, software engineering and so forth, have been applied to accomplish syntactic and semantic matching of service requirements with capabilities.
One of the earliest works on matchmaking engines in the context of semantic Web services is presented by Sycara et al. (1999). An updated
system that uses OWL-S based semantics for
match making is given by Paolucci (2002a). In
addition to utilizing a capability-based semantic
[Table: process specification models. OWL-S Process Model (ontology language: OWL; formalism: Description Logics); FLOWS (formalism: First-Order Logic); WSMO Orchestration (ontology language: WSML).]
by inferencing the semantic annotations associated with Web service descriptions. Matches due to
the two cues are combined to determine an overall
semantic similarity score. They demonstrate that, by combining multiple cues, better relevancy results can be obtained for service matches from a large repository than could be obtained using any one cue alone.
Recently, clustering and classification techniques from machine learning are being applied
to the problem of Web service matching and
classification at either the whole Web service
level (Hess & Kushmerick, 2003) or at the operation level (Dong, Halevy, Madhavan, Nemes, &
Zhang., 2004). In Hess and Kushmerick (2003),
for example, all terms from portTypes, operations
and messages in a WSDL document are treated
as a bag of words, and multidimensional vectors created from these bags of words are used for Web service classification. Although this type of classification retrieves matches with higher precision than full-text indexed search, the matches produced do not guarantee a match of operations to operations, messages to messages,
and so forth. The paper by Madhavan and colleagues (2001) addresses this aspect by focusing
on matching of operations in Web services. Specifically, it clusters parameters present in inputs
and outputs of operations (i.e., messages) based
on their co-occurrence into parameter concept
clusters. This information is exploited at the
parameter, the inputs and output, and operation
levels to determine similarity of operations in
Web services.
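The bag-of-words approach described above can be sketched concretely. The following Python toy (all service names, categories and prototype documents are invented; real systems use TF-IDF weighting and trained classifiers) splits camelCase WSDL terms into words and classifies a service by cosine similarity to category prototypes:

```python
# A minimal bag-of-words classifier over WSDL terms, in the spirit of the
# whole-service classification discussed above. Data is invented.
import math
from collections import Counter

def tokenize(wsdl_terms):
    """Split camelCase WSDL names (portTypes, operations, messages) into words."""
    words = []
    for term in wsdl_terms:
        current = ""
        for ch in term:
            if ch.isupper() and current:
                words.append(current.lower())
                current = ch
            else:
                current += ch
        if current:
            words.append(current.lower())
    return words

def cosine(a, b):
    """Cosine similarity between two bags of words."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

# category prototypes built from previously classified services (invented)
categories = {
    "inventory": tokenize(["checkInventory", "itemAvailability", "stockLevel"]),
    "weather":   tokenize(["getForecast", "currentTemperature", "stormAlert"]),
}

service = tokenize(["inventoryAvailability", "verifyInventoryAvailability"])
label = max(categories, key=lambda c: cosine(service, categories[c]))
print(label)
```

As the chapter notes, such whole-document vectors classify the service as a whole but cannot guarantee that individual operations or messages actually correspond.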
service composition
The literature on Web services matching and
composition has focused on two main directions.
One body of work explored the application of AI
planning or logic-based algorithms to achieve
composition while the other investigated the application of information retrieval techniques for
searching and composing suitable services in
the presence of semantic ambiguity from large
repositories.
First, we consider work on composing Web services using planning based on
some notion of semantic annotations. A general
survey of planning based approaches for Web
services composition can be found in (Peer, 2005).
SWORD (Ponnekanti & Fox, 2002) was one of the
initial attempts to use planning to compose Web
services. It does not model service capabilities in
an ontology but uses rule chaining to compose
Web services. In McIlraith et al. (2001), a method
is presented to compose Web services by applying
logical inferencing techniques on predefined plan
templates. The service capabilities are annotated
in DAML-S/RDF and then manually translated
into Prolog. Given a goal description, the logic
programming language of Golog (which is implemented over Prolog) is used to instantiate the
appropriate plan for composing the Web services.
In Traverso and Pistore (2004), executable BPEL processes are automatically composed from a goal specification by casting the planning problem as a model checking problem on the message specification of partners. The approach is promising but presently restricted to logical goals and a small number of
partner services. Sirin and colleagues (2003) use
contextual information to find matching services
at each step of service composition. They further
filter the set of matching services by using ontological reasoning on the semantic description of
the services as well as by using user input. Synthy
(Agarwal, Chafle, & Dasgupta, 2005) takes an
end-to-end view of composition of Web services
and combines semantic matching based on domain ontologies with planning.
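The core idea shared by these planning-based approaches can be sketched as forward chaining over input/output types: a service becomes applicable once all of its inputs are producible, and its outputs then become available to later steps. The sketch below is deliberately naive (the services and type names are invented; real planners also reason over preconditions, effects and ontological subsumption):

```python
# Naive forward-chaining composition over input/output types.
# Service registry invented for illustration.

SERVICES = {
    "checkInventory": ({"UPC", "date", "qty"}, {"availability"}),
    "placeOrder":     ({"UPC", "qty", "availability"}, {"orderDoc"}),
    "signDocument":   ({"orderDoc"}, {"signedDoc"}),
    "encrypt":        ({"signedDoc"}, {"encryptedDoc"}),
}

def compose(available, goal):
    """Chain services until `goal` is producible; return the plan or None."""
    known = set(available)
    plan = []
    progress = True
    while goal not in known and progress:
        progress = False
        for name, (inputs, outputs) in SERVICES.items():
            if name not in plan and inputs <= known:
                plan.append(name)   # service is applicable: add a plan step
                known |= outputs    # its outputs become available data
                progress = True
    return plan if goal in known else None

print(compose({"UPC", "date", "qty"}, "encryptedDoc"))
print(compose({"UPC"}, "encryptedDoc"))  # unreachable goal
```

Note how the signing and encryption services from the earlier composition example are slotted in automatically once their input types become producible; with insufficient inputs the chain never reaches the goal and no plan is returned.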
While these bodies of work use the notion of
domain annotations and semantic reasoning with
planning, none of them use domain-independent
cues such as a thesaurus. Moreover, they do not consider text analysis techniques such as tokenization, abbreviation expansion, stop-word filtering and so forth in drawing the semantic
relationships among the terms referenced in Web
[Table: kinds and approaches. Service kinds: simple, complex. Requirements: functional, nonfunctional. Discovery approaches: information retrieval (text analysis, vector space, probabilistic models), AI techniques (machine learning, semantic reasoning), mathematical modeling techniques (linear programming) and combinations. Composition approaches: the same, plus AI planning. Processes: public, private.]
Examples in the B2C domain include consumer-oriented online shopping services such as travel reservation, book buying, financial services and so forth. Examples articulated by the semantic Web community include: "Find me an airline service that enables me to reserve a flight before providing a credit card number" (Gruninger et al., 2005) and "Find me a florist that enables me to pay with PayPal" (Gruninger et al., 2005).
Web services provide a foundation for easier system integration by providing a standards-based
approach. Usage of Web services is continuing
to grow with Web services forming the basis of
integration solutions in many industries. However,
integration and process automation projects are
often expensive and time consuming even with
Web services, and may not result in solutions that
are as flexible and reconfigurable as needed to
best react to today's dynamic business environment. Semantic Web services offer the promise
of automating integration tasks. This automation facilitated by semantic Web services could
potentially save development time and reduce
implementation costs. Technologically this seems
credible. However, these claims are yet to be
verified in rigorous benchmarking exercises by
applying the technology on real-world scenarios.
Potential Benefits of Semantic Web Services
Below we examine some of the potential benefits
of Semantic Web services as applied to business
process integration when the vision of semantic
Web services is realized.
Conclusion
Web services are becoming an important technological component in the application integration
domain. Large enterprises are deploying as many
as several hundred Web services, a situation that
brings into focus the need for tools to discover and
integrate these services with each other and with
the other applications in use. Semantics can play
a crucial role in the development of these tools
[Figures: the integration progression from data integration and application connectivity (syntax), through workflow and collaboration, to on-demand solutions (semantics); and the degree of semantic specification rising from HTML and XML/DTDs (presentation, syntax) through XML Schema and RDF to RDF Schema and OWL (semantics, modeling languages).]
Acknowledgments
The author would like to thank her collaborators Richard Goodwin, Tanveer Syeda-Mahmood, Anca-Andreea Ivan, Biplav Srivastava, Juhnyoung Lee, Prashant Doshi, Kunal Verma, Amit Sheth, Joel Farrell and John Miller for the stimulating and constructive discussions throughout her work on Semantic Web services.
References
Agarwal, V., Chafle, G., Dasgupta, K., et al. (2005). Synthy: A system for end-to-end composition of Web services. Journal of Web Semantics, 3(4).
Akkiraju, R., Goodwin, R., Doshi, P., & Roeder,
S. (2003). A method for semantically enhancing
In Proceedings of the 26th International Conference on Software Engineering (ICSE 2004) (pp.
189-199).
WSDL Technical Committee (2001). Web Services Description Language (WSDL) (W3C Tech. Rep.).
Retrieved October 25, 2006, from http://www.
w3.org/TR/wsdl
WSDL-S Technical Committee (2005). WSDL-S
Web services semantics WSDL-S, W3C Member
Submission. Retrieved October 25, 2006, from
http://www.w3.org/Submission/WSDL-S/
WSMX Technical Committee (2004). Web service
execution environment (WSMX). Retrieved October 25, 2006, from http://www.wsmx.org/
WSMO Technical Committee (2005). Web service
modeling ontology (WSMO). A W3C Member
Submission. Retrieved October 25, 2006, from
http://www.w3.org/Submission/WSMO/
Zheng, L., Benatallah, B., Dumas, M., Kalagna-
Endnote
Chapter X
Abstract
Web services are software components that are, in general, distributed over multiple organizations.
They provide functionality without showing implementation details for the purpose of abstracting from
implementation as well as for the purpose of hiding private, that is, organization-internal, processes.
Nevertheless, to use a Web service one must know some of its details, that is, what it does, what it requires,
what it assumes, what it achieves, and to some extent, how it achieves its purpose. The different Web
service standards, frequently summarized as WS*, allow Web services to be specified with descriptions
of such details. In this chapter, we argue that one should go beyond WS* and that it is preferable to
provide semantic descriptions, that is, specifications that can be understood and correctly interpreted
by machines. Thereby, the particular focus of this contribution lies in analyzing the process of semantic annotation, that is, the process of deriving semantic descriptions from lower level specifications,
implementations and contextual descriptions. Hence, the concern of this chapter is really orthogonal
to most other work which equates Web service annotation with Web service specification. We illustrate
here that this is not the case.
Copyright 2007, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
Introduction
Web services are software components that are
accessible as Web resources in order to be reused
by other Web services or software. Hence, they
function as middleware connecting different parties such as companies or organizations distributed
over the Web. Thereby, a party providing a service
may not be interested in exhibiting their organization-internal processes to the outside world. A
second party consuming such a service may not
be interested in analyzing a given Web service in
order to be able to use it. Therefore, an abstracting
description of a Web service is necessary to allow
for its effective provisioning and use.
The description of a Web service needs to
include some bare technical information in order
that it can be used. This includes:
1. What it does;
2. How to invoke the Web service (i.e., the used communication protocol);
3. What parameters to provide to the Web service (i.e., its signature);
The process of semantic Web service annotation in general requires input from multiple
sources, that is, legacy descriptions, as well as
a labor-intensive modeling effort. Information
about a Web service can be gathered for example
from the source code of a service (if annotation
is done by a service provider), from the API
documentation and description, from the overall
textual documentation of a Web service or from
descriptions in WS* standards. Depending on
the structuredness of these sources, semantic
annotations may (have to) be provided manually
(e.g., if full text is the input), semi-automatically (e.g., for some WS* descriptions) or fully
automatically (e.g., if Java interfaces constitute
the input). Hence, a semantic description of the
signature of a Web service may be provided by
automatic means, while the functionality of Web
service operations or pre- and postconditions of
a Web service operation may only be modeled
manually.
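As a rough illustration of the fully automatic case, the following sketch derives a signature-level annotation from a typed interface. The `transaction` operation, the type-to-concept table, and all `ont:` names are hypothetical; a real annotator would consult an actual domain ontology.

```python
# Sketch: deriving a signature-level semantic annotation automatically.
# The type-to-concept mapping and all names are illustrative assumptions.
import inspect

# Hypothetical mapping from programming-language types to ontology concepts.
TYPE_TO_CONCEPT = {
    "CreditCardNumber": "ont:CreditCard",
    "Money": "ont:MonetaryAmount",
}

def transaction(card: "CreditCardNumber", amount: "Money") -> "str":
    """Submit a payment transaction; returns a transaction id."""

def annotate_signature(func):
    """Map each parameter annotation of func to an ontology concept."""
    sig = inspect.signature(func)
    return {name: TYPE_TO_CONCEPT.get(param.annotation, "ont:Thing")
            for name, param in sig.parameters.items()}

print(annotate_signature(transaction))
# {'card': 'ont:CreditCard', 'amount': 'ont:MonetaryAmount'}
```

Note that such a mapping only covers the signature; the functionality and the pre- and postconditions of the operation would still have to be modeled by hand, as argued above.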
Benefits of semantic specifications of Web services include a common framework that integrates semantic descriptions of many relevant Web service properties. In contrast to WS* standards, such a semantic framework allows complex requirements to be queried for (e.g., "Give me all Web shop services that offer customer care facilities and use 128-bit encoding of all communication"; cf. Oberle, 2006). The benefits harvested from semantic specifications must be traded off against manual semantic modeling, that is, manual semantic annotation of Web services. While the benefits of semantic specifications, that is, the result of a semantic annotation process, are explored in many papers, there are few papers that investigate the efforts that have to be carried out during modeling (cf. some very notable exceptions like Hess & Kushmerick, 2003; Patil et al., 2004; Sabou, Wroe, Goble, & Stuckenschmidt, 2005; Zaremba, Kerrigan, Mocan, & Moran, 2006). It is the purpose of this chapter to explain the conceptual gap between legacy descriptions
Use Case
As a use case scenario we consider the development of a Web shop. In order to save time, reduce costs, and improve maintainability, existing software should be reused to implement the shop. Web services are chosen for the implementation because they meet these requirements and, among other things, allow services provided by other companies, for example payment services, to be used easily. The Web shop should provide a Web-based user interface, but should also be reusable. Hence, the Web shop will be a complex service composed of several other complex services. Ideally, the development process is supported by a Web service development environment that can be used to organize the Web services from which the Web shop is built.
Architecture
Figure 1 shows parts of the Web shop architecture
using other Web services.
The payment service is a central component
of the Web shop. It is not an atomic service but an
aggregation of several specialized services. The
aggregated payment service provides a uniform
Figure 1. Web shop architecture example
Development Process
In order to build the Web shop, the developer proceeds as follows: Whenever he needs a specific Web service, he queries a Web service registry by specifying the desired properties of that Web service. For instance, a query specifying the effect of a successful money transfer returns a list of all payment services. A query with the condition that only credit card payment can be used returns all credit card payment services. In the ideal case, the query processor finds a service that matches all requirements, but in most cases it only finds services that fulfill some but not all requirements, so the engineer has to make some adjustments to use them.
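A minimal sketch of such a registry query follows. The in-memory registry and the property names (`effect`, `methods`) are invented for illustration; a real registry would match semantic descriptions, not string tags.

```python
# Sketch: querying a service registry by desired properties.
# Registry contents and property names are illustrative assumptions.
registry = [
    {"name": "CreditCardPayment", "effect": "moneyTransferred",
     "methods": {"creditcard"}},
    {"name": "BankTransferPayment", "effect": "moneyTransferred",
     "methods": {"banktransfer"}},
    {"name": "AddressValidation", "effect": "addressChecked",
     "methods": set()},
]

def query(effect=None, method=None):
    """Return the names of all services matching the given properties."""
    result = []
    for svc in registry:
        if effect is not None and svc["effect"] != effect:
            continue
        if method is not None and method not in svc["methods"]:
            continue
        result.append(svc["name"])
    return result

print(query(effect="moneyTransferred"))                       # both payment services
print(query(effect="moneyTransferred", method="creditcard"))  # only the credit card service
```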
Sources of Information
Various information sources of different quality
describing different Web service properties exist.
In the following, we give a list of such sources
and analyze the provided information:
Description Example:
Documentation and API-Description
Listing 1 shows a cutout of the documentation provided for the payment service as used in our example. The cutout describes the Transaction operation and refers to the API description, here called format description. The description of the operation is very short and does not contain very valuable information, as it only inaccurately characterizes the input parameters by stating that all transaction information, including the payment service provider, is submitted to the service. The specification of the output parameter, the transaction identification number, is more detailed because only one parameter exists and is mentioned in the description, but its exact format is also missing. While information about preconditions that must hold before the operation of a Web service can be executed and
Description Example:
Operation Interface Description
To enable reuse and discovery of Web services, information about a Web service's interface is needed. The interface is the property of a Web service that can be described very simply, as it consists of all methods a Web service provides as an endpoint for machine-to-machine communication, and the messages which those methods accept and which they return. A method interface can be described with languages like the Web Services Description Language (WSDL), an XML-based general-purpose language designed for the description of Web service interfaces (Christensen, Curbera, Meredith, & Weerawarana, 2001). WSDL is a widespread language adopted by many Web service supporting application servers, which often facilitate the dynamic generation of WSDL files out of the source code or Java class descriptions. Because WSDL is designed for the syntactical specification of Web service interfaces, a WSDL description does not include any semantic information about the service.
Listing 3 illustrates the use of WSDL to describe the operation interfaces of a simplified version of a credit card payment service. The first lines (lines 2-6) of the description declare the used namespaces, which define the used XML elements and XML types, and facilitate readability. After the namespace specifications, new complex data types are declared from line 7 to line 23, for example the validity information type that holds all information needed for a validity check. WSDL describes operations of a Web service as network endpoints (Christensen et al., 2001) that communicate by retrieving and sending messages which contain the data delivered to or returned from the operation (lines 24-29). From line 31 to line 35, port types are defined and associated with corresponding operations, which are connected with the appropriate input and output messages. After the port types are defined, the binding between the Web service's endpoints, their port types, and the kind of their supported invocation is specified (in line 38 to line 55). The last part of the listing (from line 56) describes the Web service and contains the locations (as URIs) where the service's operations can be reached. Some further information can be gained from optional comments, but in general other information sources are needed to gain all the knowledge needed to automate the reuse of a Web service described with WSDL.
Using WSDL, an engineer developing the Web shop gets detailed descriptions of Web services telling him how to invoke operations on Web
Listing 3. continued

     </binding>
56   <service name="SimplifiedCreditCardPaymentService">
57     <documentation> This is a ... </documentation>
58     <port name="SimplifiedCreditCardPaymentPort"
59           binding="tns:SimplifiedCreditCardPaymentBinding">
60       <soap:address location="http://example.org/creditcardpayment/"/>
61     </port>
62     ...
63   </service>
64 </definitions>
Semantics of Descriptions
Functional Semantics
The functional semantics describe the functionality of a Web service operation. This can be expressed with the help of effects. Before the execution of an operation, certain preconditions have to be fulfilled. For instance, a money transfer from account A to account B can only take place if account A is solvent. Preconditions and effects can be used to express the protocol semantics and functional semantics of Web services. Before we continue to examine how information about them can support the Web service development process, we want to analyze both in more detail.
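The money-transfer example can be sketched as follows; the account store and the solvency predicate are illustrative assumptions, not part of any concrete payment service.

```python
# Sketch of a pre-/postcondition-annotated money transfer.
# Account balances and the solvency predicate are illustrative.
accounts = {"A": 100, "B": 20}

def precondition(accounts, src, amount):
    """Precondition: the source account must be solvent."""
    return accounts[src] >= amount

def transfer(accounts, src, dst, amount):
    """Effect: amount moves from src to dst, provided the precondition holds."""
    if not precondition(accounts, src, amount):
        raise ValueError("precondition violated: account not solvent")
    accounts[src] -= amount
    accounts[dst] += amount
    return accounts

transfer(accounts, "A", "B", 30)
print(accounts)  # {'A': 70, 'B': 50}
```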
We distinguish between internal preconditions
which are checked (internally) by the application
logic or ensured by type conditions of the programming language, and external, more general
preconditions which must be fulfilled for a cor-
Nonfunctional Semantics
Many different nonfunctional properties exist. We provide a short list of common nonfunctional properties below:

• The costs of a Web service are a nonfunctional property that is relevant for the customer who wants to use the service.
• Security is a nonfunctional property that is (like costs) relevant to most Web services. Security enfolds aspects like communication and data encryption.
• The quality of service property is a bundle of different properties that all affect the quality of specific aspects of the service, its usage and its output. Examples of such properties are the response time of an operation call and the capacity of the service.
• The information about the resource usage of a Web service can be of importance, if
Listing 8. Expansion to Listing 5: Definition of classes and relations in a payment domain ontology
securityType::enumeration[values *->> {none, norm_X}].
securityNFProps::nonFunctionalProperties[security *=> securityType].
Listing 9. Expansion to Listing 6: Service specification of a simplified credit card payment service
creditCardPayment:service[
  ...
  nfProp -> validityCheckNFP::securityNFProps[
    serviceName -> "SimplifiedCreditCardPaymentService":string,
    providerName -> "Credit Card Service Provider X":string,
    security -> validityCheckSType::securityType[
      values ->> {norm_X}].
  ...].
others the reuse of the service. To decrease annotation efforts, one has to increase the management efforts, and vice versa. A reasonable tradeoff is usually achieved when the combination of both is minimal; however, for different scenarios there are other factors influencing the optimal combination. For instance, a service provider may increase his modeling efforts to decrease his customers' management efforts and thus gain an advantage over his competitors. Thus, everyone annotating a Web service has to find his own tradeoff depending on his individual context.
Related Work
In this chapter we have shown how Web services
can be annotated to ease Web service discovery,
reuse, composition, and further Web service
management tasks. The approach followed in
this chapter uses existing resources to fulfill the
task of Web service annotation. The techniques
and methods presented here are independent of
the used languages or tools. In addition, our approach uses one description language to specify
all semantic descriptions. In this section we introduce some other approaches to describe Web
services.
While we have shown what specifications of
Web services are and how they can be achieved,
different approaches exist for developing semantic
descriptions of Web services:
• For instance, in Patil et al. (2004), the METEOR-S Web Service Annotation Framework (MWSAF) is introduced, which provides algorithms that semi-automatically match items in WSDL files to ontology concepts by means of SchemaGraphs. SchemaGraphs are devised by the authors as graphs that are used to represent ontologies independently of the language the ontology is specified in.
• Another example is ODESWS (Gómez-Pérez, González-Cabero, & Lama, 2004), a tool suite that provides tools for semi-automatic annotation, design and composition of Web services. For this purpose ODESWS provides miscellaneous ontologies to support semantic specifications.
Acknowledgments
This work was conducted in the context of the Adaptive Services Grid (ASG) project and the project Knowledge Sharing and Reuse across Media (X-Media), both funded by the Information Society Technologies (IST) 6th Framework Programme.
Conclusion

References

Endnotes
Chapter XI
Abstract
This chapter surveys existing approaches to Semantic Web service discovery. Such semantic discovery will probably replace existing keyword-based solutions in the near future, in order to overcome the limitations of the latter. First, the architectural components along with potential deployment scenarios are discussed. Subsequently, a wide range of algorithms and tools that have been proposed for the realization of Semantic Web service discovery are presented. Moreover, key challenges and open issues not addressed by current systems are identified. The purpose of this chapter is to update the reader on the current progress in this area of the distributed systems domain and to provide the required background knowledge and stimuli for further research and experimentation in semantics-based service discovery.
Introduction
Right after the Web infrastructure had matured enough, both academia and industry recognized the necessity of enabling interactions between processes over the Web. The nature
and conceptually similar to the service descriptions. Finally, the service requests are matched against the service descriptions with the aid of matching algorithms (implemented in a matching engine), so that the service requestor can discover the available services that are most relevant to its requirements. If the matching algorithm successfully discovers a service description, it provides the requestor with the service invocation details for the corresponding service instance.
the UDDI API used for inquiring and publishing UDDI data
Matching Methods
Standard UDDI registries allow for keyword-based search methods. Specifically, services can be searched by category, name, location, business, bindings or tModels. Such matching can be facilitated by the so-called find qualifiers, which enable case-sensitive/insensitive search and approximate search through wildcards. Many researchers have proposed extensions to this querying mechanism. The most interesting approaches are based on information retrieval techniques. Their core idea is to represent the
Service Advertisement
Service advertisements are no longer described through the WSDL parameters or operation/service names, but according to specific Service Annotation Ontologies. Such ontologies define semantic models for describing various perspectives of WS (functionality, execution flow, invocation details). The actual annotation terms used in the service advertisements are expressed through shared vocabularies, defined by Domain Ontologies. Hence, service advertisements are documents that comply with specific models and refer to description terms from external terminologies.
The values of the IOPE attributes are usually terms (i.e., atomic or complex concepts)
specified in domain ontologies (see subsequent
paragraph).
Several upper (i.e., application-independent) ontologies have already been proposed for service description. The first one was DAML-S (McIlraith & Martin, 2003), based on the DAML+OIL ontology definition language. However, with the wide acceptance of the Web Ontology Language (OWL) family of languages, DAML-S was replaced by OWL-S (Martin, 2005). At the same time, other academic and industry groups were working on similar specifications, resulting in the WSDL-S (Li, 2006), WSMO (Roman, 2005) and SWSO (SWSL Committee, 2005) ontologies.5 All these specifications, although sharing many modeling elements, differ in terms of expressiveness, complexity and tool support.
Most service annotation ontologies are divided into three logical parts, which, according to the popular OWL-S terminology, are:
Domain Ontologies
The semantic service annotations, as specified by the Service Annotation Ontologies, are no longer expressed through unstructured text, but rather refer to concepts in Domain Ontologies. Such ontologies describe the terminology and term relationships in specific application domains. For instance, when describing services about wines, a Wine Domain Ontology describes the various kinds of wine, also defining their special characteristics. Domain ontologies may be written in various ontology languages (e.g., OWL, DAML+OIL or RDF(S)). For an overview of such languages the reader is referred to (Antoniou, 2004; Gomez-Perez, 2004). Such languages, in general, have different expressive power. However, all of them can at least model hierarchies of concepts and concept roles (i.e., properties, attributes or relationships). As discussed in
Service Request
Depending on the system implementation, the service request may vary from natural language text to documents compliant with the service annotation ontologies. Irrespective of the adopted format, a request should contain information relevant to that specified by the used service annotation ontology (e.g., IOPE values). In general, possibly after appropriate mediation, the request is transformed into a document that structurally resembles the service advertisement. The use of the same Domain Ontologies by both the service requestors and providers, although not mandatory, significantly simplifies the matching process by eliminating the need for mediation layers. However, this is a rather optimistic scenario, not suitable for open information environments like the Web.
Service Registry
The traditional WS registries (e.g., UDDI) are still used by most SWS discovery architectures. However, besides the traditional WS descriptions (e.g., tModel entries, WSDL documents), they also store, or contain references to, semantic information annotating the advertised services. Several methods have been proposed for linking semantic service annotations to UDDI registry entries (e.g., tModels). Some of these will be briefly discussed in the following paragraphs.
Matching Algorithm
The semantic matching algorithms are, in
general, more complex and intelligent than
their syntax-based counterparts. They are designed so as to exploit the explicit functionality
semantics of the provided service descriptions and
requests. Many matchmaking algorithms have
Architecture I:
Importing Semantics to UDDI
The authors in (Paolucci, 2002b; Srinivasan, 2004) discuss an extension to the UDDI registry so that it takes advantage of the semantic
Architecture II:
External Matching Services
A more seamless integration of semantic matching mechanisms with the UDDI registry can be seen in Figure 4, where the matching algorithms are published as WS in the UDDI. Thus, UDDI can detect and select the most suitable matching service for each service request, and then invoke it for the actual semantic matching. Moreover, the UDDI registry may offer the possibility to use multiple matchmaking services to fulfill a given service request. For instance, there may be matching service providers that offer diverse semantic matching mechanisms (e.g., implementing different algorithms or supporting different service annotation ontologies).
In order to support such functionality, UDDI is extended with two processes. The first determines
Algorithmic Approaches to Matchmaking
Approach I:
Semantic Capabilities Matching
One of the first, and probably most influential, works in the field of SWS discovery is that described in (Paolucci, 2002a). The basic idea behind this approach is that "an advertisement matches a request when all the outputs of the request are matched by the outputs of the advertisement, and all the inputs of the advertisement are matched by the inputs of the request" (p. 338). Thus, this method takes into account only the inputs and outputs of the service profiles during matchmaking. The degree of match between two outputs or two inputs depends on the relationship between the domain ontology concepts associated with those inputs and outputs. Four degrees of match have been specified in Paolucci (2002a), as shown in Table 2 (in decreasing significance order). The DoM in this table is computed per output. For service inputs, req.o and adv.o should be substituted by adv.i and req.i, respectively (such inversion is dictated by the approach). The DoM between a service advertisement and a service request is the minimum DoM of all inputs and outputs.
Notation: adv denotes a single advertisement in A; req.O and adv.O denote the outputs of the request and the advertisement; req.I and adv.I the corresponding inputs; X.o and X.i a single output or input of X; X.SC the service category of X; X.par a parameter of X.
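The matching rule above can be sketched as follows; the toy subclass table standing in for a domain ontology and the numeric encoding of the four degrees are illustrative assumptions, not the original implementation.

```python
# Sketch of the capability matching in Approach I. The toy ontology
# (child -> parent edges) and the degree encoding are illustrative.
EXACT, PLUGIN, SUBSUMES, FAIL = 3, 2, 1, 0  # decreasing significance

PARENT = {"SciFiMovies": "Movies", "Movies": "Interest"}

def is_sub(a, b):
    """True if concept a is a (transitive) subconcept of b."""
    while a is not None:
        if a == b:
            return True
        a = PARENT.get(a)
    return False

def degree(req, adv):
    """Degree of match between a request concept and an advertised concept."""
    if req == adv:
        return EXACT
    if is_sub(req, adv):   # advertisement is more general: plug-in match
        return PLUGIN
    if is_sub(adv, req):
        return SUBSUMES
    return FAIL

def match(req_out, adv_out, req_in, adv_in):
    """Overall DoM: minimum over all request outputs (matched against the
    advertised outputs) and all advertised inputs (matched against the
    request inputs, with the inverted roles dictated by the approach)."""
    degs = [max(degree(r, a) for a in adv_out) for r in req_out]
    degs += [max(degree(a, r) for r in req_in) for a in adv_in]
    return min(degs)

print(match(["Movies"], ["Interest"], ["Person"], ["Person"]))  # 2 (PLUGIN)
```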
Approach II:
Multilevel Matching
A variant of Approach I is presented in (Jaeger, 2004). The key differentiator in this work is the
Table: the degrees of match EXACT, PLUGIN, SUBSUMES and FAIL, with their matching conditions, the corresponding subsumption relation, and their meaning/potential problems.
Approach III:
DL Matchmaking with Service Profile
Ontologies
Approach I, as already discussed, is based on subsumption matching of service capabilities. Particularly, the various capabilities (i.e., inputs and outputs) of service advertisements are matched against the corresponding request capabilities. The actual degree of match is evaluated according to the subsumption relations between the domain ontology concepts that describe these capabilities. In (Gonzales-Castillo, 2001; Li, 2004) another logic-based approach is presented, where the matchmaking of services is performed through a service profile ontology. In this ontology, each service is represented by a complex DL concept expression, which describes all the service constraints (inputs, outputs, etc.). For the Dating Services domain, a part of such an ontology would resemble that shown in Figure 8(a). In fact, such an ontology can be considered as a logic-based registry for service advertisements.
Figure 8. A service profile ontology before (a) and after (b) the insertion/classification of the service
request Q
(a) before
(b) after
Example. Figure 9 depicts two service advertisements in DL syntax. Specifically, FreeDatingService is described by its service profile through
the hasServiceProfile role (we assume that this
role and the concept ServiceProfile are specified
in a service annotation ontology, while all other
Approach IV:
Similarity Measures and Information
Retrieval Techniques
All the variants of the aforementioned logic-based
approaches have the drawback of exploiting only
the subsumption relations in various ontologies
(domain or service profile) in order to assess
similarity of services, service capabilities, and
so forth. However, this is not sufficient as shown
in the following example.
Example. A requestor of a dating service searches for services that take as input the concept InterestProfile ⊓ ∃hasInterest.SciFiMovies, which is a subconcept of InterestProfile, and return as output the concept ContactProfile (i.e., the service finds contact details for persons that are interested in SciFi movies). In the service registry there are only two services whose inputs match (as defined in Approach I) with the concept InterestProfile:

Find_Interests_Of_Female_MSN_Users: input={Person ⊓ ∃hasGender.Male}, output={InterestProfile, ChatID}
Find_Person_by_Interest: input={InterestProfile}, output={ContactProfile}
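A sketch of how a set-based similarity measure would rank the two services, where subsumption alone cannot; the flat feature sets are simplified stand-ins for the DL concepts above, and the averaging of input and output similarity is an illustrative choice.

```python
# Sketch: ranking candidate services with a set-based similarity measure
# (Jaccard) instead of subsumption alone. Feature sets are illustrative.
def jaccard(a, b):
    """Jaccard similarity of two concept-feature sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

request = {"in": {"InterestProfile", "SciFiMovies"}, "out": {"ContactProfile"}}
services = {
    "Find_Interests_Of_Female_MSN_Users":
        {"in": {"Person", "Gender"}, "out": {"InterestProfile", "ChatID"}},
    "Find_Person_by_Interest":
        {"in": {"InterestProfile"}, "out": {"ContactProfile"}},
}

def score(req, adv):
    """Average of input-side and output-side similarity."""
    return (jaccard(req["in"], adv["in"]) + jaccard(req["out"], adv["out"])) / 2

ranked = sorted(services, key=lambda s: score(request, services[s]), reverse=True)
print(ranked[0])  # Find_Person_by_Interest
```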
Approach V:
A Graph-Based Approach
In Trastour (2001) a semantic graph matching approach is proposed. A service description (request
or advertisement) is represented as a directed
graph (RDF-graph), whose nodes are instances of
concepts (i.e., individuals) and arcs are properties
(i.e., concept roles) relating such instances. The
root node of each graph is the individual representing the service advertisement/request itself.
The other nodes refer to concepts borrowed from
domain ontologies. Such concepts describe the service functionality (capabilities, constraints, etc.).
An example graph for a dating service is shown in
Figure 11. The actual nodes of the service graph
are those written in lowercase letters and having
outgoing io arcs. The matchmaking between
two graphs, one representing a service request
and another representing a service advertisement,
is performed with the recursive algorithm shown
in Figure 12.
This approach, as described in Figure 12,
exploits standard subsumption matching through
the subsumes operator. However, other
operators could be used as well. For instance,
one could replace the subsumes operator
Figure 11. Graph representing the service S (io:instance-of, dashed line:concept role)
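The recursive matching of Figure 12 might be sketched as follows; the dictionary-based graph encoding and the subsumption table standing in for a DL reasoner are illustrative assumptions.

```python
# Sketch of recursive graph matching (Approach V). Graphs are dicts with a
# node concept and outgoing (role, child) arcs; the subsumption table is an
# illustrative stand-in for a DL reasoner.
SUBSUMES = {("Interest", "Movies"), ("Profile", "DatingProfile")}

def subsumes(general, specific):
    return general == specific or (general, specific) in SUBSUMES

def graph_match(request, advert):
    """A request node matches if its concept subsumes the advertised one and
    every outgoing arc of the request is matched by an arc of the advert."""
    if not subsumes(request["concept"], advert["concept"]):
        return False
    for role, child in request.get("arcs", []):
        if not any(r == role and graph_match(child, c)
                   for r, c in advert.get("arcs", [])):
            return False
    return True

req = {"concept": "Profile",
       "arcs": [("hasInterest", {"concept": "Interest"})]}
adv = {"concept": "DatingProfile",
       "arcs": [("hasInterest", {"concept": "Movies"})]}
print(graph_match(req, adv))  # True
```

Replacing the `subsumes` check here with another operator, as the text suggests, changes the matching semantics without touching the recursion itself.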
Approach VI:
Indirect Graph-Based Matching
As already mentioned, indirect matching refers
to the identification of the service compositions
that are most relevant to the user request. In case
of simple compositions, we can think of them as
graph-like workflows (see Figure 13), which can
Table: example services S1-S7 with their input and output parameters (the letters A, B, C, D, E, F, K, M, N, Y, Z denote data items).
If we impose further restrictions on the discovery process, for example, always preferring the shortest service chains, then a shortest path algorithm could be used (e.g., Dijkstra's). In our example, this would result in selecting Service Chain 4. Such a matchmaking method can be further extended to become more flexible. One extension point is the introduction of weighted arcs, possibly depending on the degree of match between the inputs/outputs of two successive services. Moreover, one could define arcs as follows, resulting in more refined service composition:

(ni, nj): a directed arc originating from ni, created only when there is at least one match between an input of nj and an output of ni
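Chain discovery as a shortest-path search can be sketched as follows. With unit arc weights a breadth-first search already finds the shortest chain, while weighted arcs would call for Dijkstra's algorithm; the service table is invented for illustration.

```python
# Sketch: composing a shortest service chain via shortest-path search.
# Each service maps to (required inputs, produced outputs); illustrative data.
from collections import deque

services = {"S1": ({"A", "B"}, {"C"}), "S2": ({"C"}, {"K"}),
            "S3": ({"A"}, {"K"}), "S4": ({"K"}, {"Z"})}

def shortest_chain(available, goal):
    """BFS over sets of known data items; returns the shortest chain of
    services whose combined outputs cover every item in goal."""
    start = frozenset(available)
    queue, seen = deque([(start, [])]), {start}
    while queue:
        known, chain = queue.popleft()
        if goal <= known:
            return chain
        for name, (inputs, outputs) in services.items():
            if inputs <= known:           # service is invocable
                nxt = frozenset(known | outputs)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, chain + [name]))
    return None  # goal not reachable

print(shortest_chain({"A", "B"}, {"Z"}))  # ['S3', 'S4']
```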
Approach VII:
Indirect Backward Chaining Matching
Another, more logic-based, approach for indirect service matching is based on the backward
chaining inference method. Backward chaining
is a goal-driven reasoning procedure (see Figure
14) similar to the way the Prolog language works.
The main idea behind this approach is that, starting from services that match the service request
outputs but not its inputs, we recursively try to link
them with other services until we find a service
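The backward-chaining idea can be sketched as follows; the service table and the simple goal-reduction rule (replace satisfied goal items with the chosen service's inputs) are illustrative assumptions.

```python
# Sketch of backward-chaining composition: start from the requested outputs
# and recursively search for services whose outputs satisfy the still-open
# goals, until the available inputs suffice. Illustrative service table.
services = {"S1": ({"A"}, {"C"}), "S2": ({"C"}, {"Z"}), "S3": ({"B"}, {"Z"})}

def backward(goal, available, chain=()):
    """Return a service chain (in execution order) producing goal from
    available, or None if no such chain exists."""
    if goal <= available:
        return list(chain)
    for name, (inputs, outputs) in services.items():
        if name in chain or not outputs & goal:
            continue  # service does not help with any open goal item
        # new goals: the unmet goal items plus the service's own inputs
        subgoal = (goal - outputs) | inputs
        found = backward(subgoal, available, (name,) + chain)
        if found is not None:
            return found
    return None

print(backward({"Z"}, {"A"}))  # ['S1', 'S2']
```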
hybrid-s (i.e., logic- and similarity-based), and hybrid-g (i.e., logic- and graph-based).
Synopsis
In Table 5, we present a synopsis of the previously described matchmaking approaches. The
columns of this table reflect the following characteristics:
Other Approaches
The approaches described above are only a portion of the approaches that have been proposed in the relevant literature. The area of SWS discovery is still in its initial phase (i.e., no standards have gained wide adoption yet) and, thus, there is much experimentation from researchers worldwide. One of the most important approaches, which is also pursuing standardization, is WSMO, which is the cornerstone for WSMO Discovery (Keller, 2004). This involves a quite mature suite of modeling specifications and tools and has been developed by the Digital Enterprise Research Institute (DERI) in the general context of the Web Services Modeling Framework, WSMF (Fensel, 2002). This approach focuses on a very flexible SWS architecture suitable for commercial and industrial use scenarios. The researchers have developed a suite of languages, ontologies and execution environments (WSML, WSMO, and WSMX, respectively) that can support practical SWS provision. The discovery in this framework is supported by the WSMO service annotation ontology. This approach focuses on a more re-
Table 5. Synopsis of the matchmaking approaches

Approach   Matching elements   Indirect matching   Algorithm type
I          IO                  No                  Logic
II         IO                  No                  Logic
III        Service profile     No                  Logic
IV         IO                  No                  Hybrid-s
V          Service profile     No                  Hybrid-g
VI         IO                  Yes                 Hybrid-g
VII        IO                  Yes                 Logic
OWL-S/UDDI Matchmaker
(OWL-S/UDDIM)
This is one of the first tools supporting semantic
service matching. It adopts and implements the
matchmaking Approach I, a subsumption-based
matching of OWL-S service requests and advertisements. The tool also allows users to build their
OWL-S descriptions, either advertisements or
Feature          Service Annotation   Algorithmic Approach   Ontology Language
OWL-S/UDDIM      OWL-S                Approach I             OWL
STWS             WSDL-S               N/A                    OWL
OWLS-MX          OWL-S                Approach IV            OWL
MWSDI-Lumina     WSDL-S               N/A                    OWL
OWLSM            OWL-S 1.0            Approach II            OWL
WSMX             WSMO                 WSMO Discovery
a. By directly matching and aligning the concepts of the requestor's ontology to those of the provider's ontology, possibly through the use of similarity measures;
b. By linking the top-level concepts of each domain ontology to a common standard upper (i.e., abstract) ontology, like the Standard Upper Ontology (Niles, 2001).
mediation. The trend for service mediation is gaining attention in academia and industry. Among the leaders of this effort is the WSMO working group (Roman, 2005), who introduce mediators as key business actors in the service discovery architectures of the previous section. Finally, one should note that the necessity of mediation also depends upon the adopted architecture. For example, in a P2P discovery scenario we expect a wider diversity in domain ontologies than in a centralized scenario, since service providers are more inclined to use local ontologies.
Acknowledgment
This work was performed in the context of the
PENED Programme, co-funded by the European Union and the Hellenic Ministry of Development, General Secretariat for Research and
Technology (research grant 03ED173).
References
Akkiraju, R., Goodwin, R., Doshi, P., & Roeder, S. (2003, August). A method for semantically enhancing the service discovery capabilities of UDDI. Paper presented at the IJCAI-03 Workshop
Jaeger, M.C. & Tang, S. (2004). Ranked matching for service descriptions using DAML-S. In J. Grundspenkis & M. Kirikova (Eds.), CAiSE'04 Workshops (pp. 217-228). Riga, Latvia: Faculty of Computer Science and Information Technology, Riga Technical University.
Kalfoglou, Y. & Schorlemmer, M. (2003). Ontology mapping: The state of the art. The Knowledge
Engineering Review, 18(1), 1-31.
Keller, U., Lara, R., Polleres, A., Toma, I., Kifer,
M., & Fensel, D. (2004), WSMO Web service
discovery (WSML Deliverable D5.1v0.1). Retrieved October 24, 2006, from http://www.wsmo.
org/TR/
Klusch, M., Fries, B., Khalid, M., & Sycara, K.
(2005). OWLS-MX: Hybrid semantic Web service
retrieval. In Proceedings of the 1st International
AAAI Fall Symposium on Agents and the Semantic
Web, Arlington, Virginia. AAAI Press.
Langley, B., Paolucci, M., & Sycara, K. (2001,
May). Discovery of infrastructure in multi-agent
systems. Paper presented at Agents 2001 Workshop
on Infrastructure for Agents, MAS, and Scalable
MAS, Montreal, Canada.
Lara, R., Lausen, H., Arroyo, S., de Bruijn, J., &
Fensel, D. (2003, June). Semantic Web services:
Description requirements and current technologies. Paper presented at the Semantic Web Services for Enterprise Application Integration and eCommerce Workshop (SWSEE03), in conjunction
with ICEC 2003, Pittsburgh, Pennsylvania.
Li, L., & Horrocks, I. (2004). A software framework for matchmaking based on Semantic Web technology. International Journal of Electronic Commerce, 8(4), 39-60.
Li, K., Verma, K., Mulye, R., Rabbani, R., Miller,
J., & Sheth, A. (2006). Designing Semantic Web
processes: The WSDL-S approach. In J. Cardoso
& A. Sheth (Eds.), Semantic Web services, pro-
Richard, G.G. (2002). Service and device discovery: Protocols and programming. New York:
McGraw-Hill Professional.
STWS (2006). IBM semantic tools for Web services. Retrieved October 24, 2006, from http://
www.alphaworks.ibm.com/tech/wssem
Endnotes
1. In the rest of the chapter, the terms matching, discovery and matchmaking will be used interchangeably.
NAICS: North American Industry Classification System.
1. Creating an OWL ontology describing the context for the offered services, like person profile, ChatIDs, location, and interests;
2. Semantically annotating the service advertisements according to the IO parameters of the OWL-S service profile;
3. Semantically annotating the requestor's service requests describing the preferred IO attributes, also expressed in OWL-S;
4. Discovering the most relevant matching SWSs.
Step 1: Creation of the Domain Ontology and the Service Definitions (in OWL)
We introduce an OWL representation of the domain knowledge used by the advertised services. Such an ontology (Ont) consists of concepts and roles among them. Ont defines the concept Person, which is associated with a Location, a Gender (male or female) and a PersonalProfile. Moreover, the concept PersonalProfile subsumes the ChatID, the ContactProfile, and the InterestProfile concepts. The first denotes the specific chat identifier given by Instant Messaging providers, while the other two contain information related to contact information (e-mail, mobile phone number) and user interests (entertainment, movies, sport), respectively. The complete taxonomy and the basic roles of this ontology are shown in Figure 16. In practice, one would use a different spatial ontology for the spatial concepts, and so forth. We have merged all domain concepts into Ont for simplicity reasons.
Let us consider the example service S1 (see Figure 15). This service takes as inputs the person's gender, the person's location (any city area) and the person's interests (movies). It returns the person's ChatID and the person's contact information. Service S2 is expressed similarly (see Figure 15). The two service advertisements are those that will be matched against any request description in the OWL-S/UDDI Matchmaker.
Figure 15. Service advertisements S1 and S2 (written in DL notation). The concept ServiceProfile and the roles hasInput, hasOutput and presents are OWL-S terminology.

S1 ≡ ∃presents.(ServiceProfile
  ⊓ ∃hasInput.(Person ⊓ ∃hasGender.Gender ⊓ ∃hasLocation.CityArea
      ⊓ ∃hasInterestProfile.(InterestProfile ⊓ ∃hasInterest.Movies))
  ⊓ ∃hasOutput.(Person ⊓ ∃hasChatID.ChatID ⊓ ∃hasContactProfile.ContactProfile))

S2 ≡ ∃presents.(ServiceProfile
  ⊓ ∃hasInput.(Person ⊓ ∃hasGender.Female ⊓ ∃hasLocation.UrbanArea
      ⊓ ∃hasInterestProfile.(InterestProfile ⊓ ∃hasInterest.Music))
  ⊓ ∃hasOutput.(Person ⊓ ∃hasPersonalProfile.PersonalProfile))
ally, the service provider has to explicitly define: (1) the corresponding domain ontology, (2) the inputs and outputs, and (3) other service parameters, using the OWL-S notation. Figure 17 depicts a part of the advertisement S1 in OWL-S syntax. Note that the hasInput and hasOutput tags refer to Ont concepts.

Figure 19 shows how such specifications can be loaded into the Matchmaker registry. Specifically, the provider sets a service name and a text description.7 Furthermore, the service provider defines the two inputs and the two outputs through their corresponding concepts. In the service category field,7 the provider can define one or more service category classes referring to well-known service classification schemes, like NAICS and UNSPSC. Once the provider has successfully annotated the advertisement, she submits it to the OWL-S/UDDI registry, and an advertisement ID is returned. We repeat Step 2 for service S2.
<profile:hasOutput>
  <process:UnConditionalOutput rdf:ID="out_">
    <process:parameterType>
      http://p-comp.di.uoa.gr/ont#ChatID
    </process:parameterType>
  </process:UnConditionalOutput>
</profile:hasOutput>
matched with the S2 gender type (both are of type Female). Additionally, the Q1 interest type is subsumed by the S2 interest type, something that does not hold for S1. Finally, the Q1 location type (CapitalArea) is a sub-concept of the S2 UrbanArea. Thus, all of the S2 inputs subsume the inputs of Q1. Moreover, the Q1 output type is a ChatID, which is more specific than the S2 output type (PersonalProfile). Hence, S2 is selected with the same score as S1, and the Matchmaker returns it with a score of 5.

More interesting is the semantic matching of Q2, as defined in Table 7, with the two service advertisements. Q2 inputs are partially subsumed by the S1 and S2 inputs, but Q2 outputs are exactly matched with those of S1, something that does not hold for the outputs of S2 (PersonalProfile is a super-concept of both Q2 output types). Hence, the Matchmaker assigns a score of 7 to S1 and a score of 2 to S2.
Table 7. Query and service IO types, and the Matchmaker scores

      Input Type                                       Output Type
Q1    In_1: Female, In_2: CapitalArea,                 Out_1: ChatID
      In_3: ClassicMusic
Q2    In_1: Male, In_2: Sport, In_3: RuralArea         Out_1: ChatID, Out_2: ContactProfile
S1    In_1: Gender, In_2: CityArea, In_3: Movies       Out_1: ChatID, Out_2: ContactProfile
S2    In_1: Female, In_2: Music, In_3: UrbanArea       Out_1: PersonalProfile

Score out of 10: Q1 with S1: 5; Q1 with S2: 5; Q2 with S1: 7; Q2 with S2: 2.
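The subsumption checks behind these scores can be sketched with a toy taxonomy modeled after Ont. The parent links, concept names, and counting function below are illustrative only, not the Matchmaker's actual scoring algorithm:

```python
# Toy taxonomy mirroring the chapter's Ont (parent links are assumptions).
PARENT = {
    "Female": "Gender", "Male": "Gender",
    "CapitalArea": "UrbanArea", "UrbanArea": "CityArea", "RuralArea": "CityArea",
    "ClassicMusic": "Music", "Music": "Interest", "Movies": "Interest",
    "Sport": "Interest",
    "ChatID": "PersonalProfile", "ContactProfile": "PersonalProfile",
}

def subsumes(general, specific):
    """True if `general` is an ancestor of (or equal to) `specific`."""
    while specific is not None:
        if specific == general:
            return True
        specific = PARENT.get(specific)
    return False

def degree(request_concepts, advert_concepts):
    """Count request parameters subsumed by some advertised parameter."""
    return sum(any(subsumes(a, r) for a in advert_concepts)
               for r in request_concepts)

q1_inputs = ["Female", "CapitalArea", "ClassicMusic"]
s2_inputs = ["Female", "Music", "UrbanArea"]
print(degree(q1_inputs, s2_inputs))  # all three Q1 inputs are subsumed -> 3
```

Against the S1 inputs (Gender, CityArea, Movies), only two of the three Q1 inputs are subsumed, reflecting the interest mismatch discussed above.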
(see Table 8). Moreover, DLs have terminological axioms, which make statements about how concepts or roles are related to each other (see Table 9).

The concepts of a terminology may be either primitive (described through necessary conditions, i.e., inclusion axioms) or defined (described through necessary and sufficient conditions, i.e., equality axioms).

Example: The following DL description illustrates the primitive concept of those young males (C) that are interested in sports and dislike all kinds of movies. Hence, in DL syntax:

C ⊑ Young ⊓ Male ⊓ ∃hasInterest.(Interest ⊓ Sports) ⊓ ∀dislikes.(Movies)

Through this description it is implied that if a person is a kind of C, then she is interested in sports and dislikes all kinds of movies. The inverse does not hold. If, on the other hand, we define C as:

C ≡ Young ⊓ Male ⊓ ∃hasInterest.(Interest ⊓ Sports) ⊓ ∀dislikes.(Movies)

we additionally imply that if a young male person is interested in sports and dislikes all kinds of movies, then she definitely is a kind of C, which may not be the case in general. Thus, one should be sure that the defined concepts are well defined, or else false inferences may be drawn.
The popularity of DL-based ontologies is based on the fact that DL reasoning engines (a.k.a. reasoners) offer efficient services over the TBox and ABox assertions (i.e., concepts, roles and individuals).
Table 8. DL concept constructors

Constructor                    DL syntax   Example                Meaning
Intersection                   C ⊓ D       Young ⊓ Male           C^I ∩ D^I
Union                          C ⊔ D       Young ⊔ Male           C^I ∪ D^I
Value restriction              ∀R.C        ∀hasInterest.Movies    {x | ∀y: R(x,y) → C(y)}
Existential role quantification ∃R.C       ∃hasInterest.Sports    {x | ∃y: R(x,y) ∧ C(y)}
Atomic negation                ¬A          ¬Male                  Δ^I \ A^I
Table 9. Terminological axioms

Axiom                     DL syntax     Example                 Meaning
Inclusion (subsumption)   C ⊑ D         Young ⊑ Person          C^I ⊆ D^I
Equality                  C ≡ D         Young ≡ Teenager        C^I = D^I
Disjointness              C ⊓ D ⊑ ⊥     Teenager ⊓ Adult ⊑ ⊥    C^I ∩ D^I = ∅
The most important services are concept satisfiability (i.e., whether a concept can be populated with instances and, thus, the TBox knowledge is consistent) and determination of concept subsumption (i.e., whether a concept C is more general than a concept D or, otherwise stated, whether C subsumes D). Another service provided by a DL reasoner is deciding whether a set of ABox assertions is consistent, that is, whether the instances have no contradicting implications. Satisfiability and consistency checking are useful to determine whether a knowledge base is meaningful at all. The following example illustrates concept satisfiability.

Example: There could never exist a person P who has an interest I which is both Sports and Movies, that is, I ∈ Interest ⊓ Movies ⊓ Sports, since the latter two concepts are disjoint. However, the TBox containing a concept P such that:

P ≡ Person ⊓ ∃hasInterest.(Interest ⊓ Movies) ⊓ ∃hasInterest.(Interest ⊓ Sports)

is satisfiable, since it describes a person with at least one interest in Movies and at least another interest in Sports.
DL reasoners also perform classification in a TBox. This is the task of placing a new concept expression in the proper position in a hierarchy of concepts. An example of classification is the following:

Example: A young person who is interested in SciFiMovies (concept C) is subsumed by another young person who is interested in Movies (concept D). On the other hand, a young person who is interested in Sports (concept E) does not subsume C, since Movies (and consequently SciFiMovies) are considered disjoint with Sports. Hence, from the following TBox statements one could only infer that C ⊑ D:

C ≡ Young ⊓ ∃hasInterest.(Interest ⊓ SciFiMovies)
D ≡ Young ⊓ ∃hasInterest.(Interest ⊓ Movies)
E ≡ Young ⊓ ∃hasInterest.(Interest ⊓ Sports)
SciFiMovies ⊑ Movies ⊑ Interest
Sports ⊑ Interest
Sports ⊓ Movies ⊑ ⊥
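For this particular TBox, the classification result can be checked by simple ancestor enumeration in the concept hierarchy instead of invoking a real DL reasoner. The helper functions below are invented for illustration:

```python
# Toy check of the classification example: C is subsumed by D because
# SciFiMovies is below Movies, while C is not subsumed by E because
# Sports and Movies are disjoint. This enumerates a tiny taxonomy rather
# than performing general DL reasoning.

PARENT = {"SciFiMovies": "Movies", "Movies": "Interest", "Sports": "Interest"}

def ancestors(c):
    """All concepts subsuming c in the toy hierarchy, including c itself."""
    out = {c}
    while c in PARENT:
        c = PARENT[c]
        out.add(c)
    return out

def interest_subsumes(d_interest, c_interest):
    # In this simple taxonomy, Young ⊓ ∃hasInterest.X ⊑ Young ⊓ ∃hasInterest.Y
    # holds exactly when X is below Y.
    return d_interest in ancestors(c_interest)

print(interest_subsumes("Movies", "SciFiMovies"))  # C ⊑ D -> True
print(interest_subsumes("Sports", "SciFiMovies"))  # C ⊑ E -> False
```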
The reader is referred to (Baader, 2003) for further information on DL theory and applications.
API: Application Programming Interface
DAML: DARPA Agent Markup Language
DL: Description Logic
DoM: Degree of Match
FOL: First-Order Logic
IOPE: Inputs, Outputs, Preconditions, and Effects
IR: Information Retrieval
OWL: Web Ontology Language
P2P: Peer-to-Peer
RDF: Resource Description Framework
SLP: Service Location Protocol
SW: Semantic Web
SWS: Semantic Web Service
SWSL: Semantic Web Services Language
SWSO: Semantic Web Services Ontology
TFIDF: Term Frequency - Inverse Document Frequency
UDDI: Universal Description, Discovery and Integration
UPnP: Universal Plug and Play
W3C: World Wide Web Consortium
WS: Web Service
WSDL: Web Services Description Language
WSML: Web Service Modeling Language
WSMO: Web Service Modeling Ontology
Chapter XII

Abstract
The Web service modeling ontology (WSMO) provides a conceptual framework for semantically describing Web services and their specific properties. In this chapter we discuss how WSMO can be applied for
service discovery. We provide a proper conceptual grounding by strictly distinguishing between service
and Web service discovery and then present different techniques for realizing Web service discovery.
In order to cover the complete range of scenarios that can appear in practical applications, several
approaches to achieve the automation of Web service discovery are presented and discussed. They
require different levels of semantics in the description of Web services and requests, and have different
complexity and precision.
Copyright 2007, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
Introduction

The Web is a tremendous success story. Starting as an in-house solution for exchanging scientific information, it has become, in slightly more than a decade, a worldwide-used medium for information dissemination and access. In many respects, it has become the major means for publishing and accessing information. Its scalability and the comfort and speed in disseminating information have no precedent. However, it is solely a Web for humans: computers cannot understand the provided information and in return do not provide any support in processing this information. Two complementary trends are about to transform the Web from being for humans only into a Web that connects computers to provide support for human interactions at a much higher level than is available with current Web technology.

The Semantic Web promises to make information understandable to a computer, and Web services promise to provide smooth and painless integration of disparate applications. Web services offer a new level of automation in eWork and eCommerce, where fully open and flexible cooperation can be achieved, on-the-fly, with low programming costs. However, the current implementations of Web service technology are
Basics

In the following, we briefly overview the Web service modeling ontology (WSMO) (Roman et al., 2005) and discuss the notions that are relevant for discovery. In particular, WSMO distinguishes the notions of Web service and service in the context of discovery. Furthermore, we summarize what WSMO eventually aims at with regard to Web service discovery.
Web Services
WSMO defines a description model that encompasses the information needed for automatically determining the usability of a Web service. As shown in Figure 2, a WSMO Web service description comprises four elements: (1) nonfunctional properties, (2) a capability as the functional description of the service, and, summarized as service interfaces, (3) a choreography that describes the interface for service consumption by a client (i.e., how to interact with the Web service) and (4) an orchestration that describes how the functionality of the service is achieved by aggregating other Web services. These notions describe the functionality and behavior of a Web service, while its internal implementation is not of interest.
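The four elements can be rendered as a minimal data structure. The class and field names below are illustrative stand-ins, not WSMO syntax:

```python
from dataclasses import dataclass

# Hypothetical, minimal rendering of the four elements of a WSMO
# Web service description; all names and values are invented examples.

@dataclass
class Capability:
    preconditions: list   # conditions assumed before service provision
    postconditions: list  # what holds after successful provision

@dataclass
class WebServiceDescription:
    nonfunctional: dict        # (1) nonfunctional properties
    capability: Capability     # (2) functional description of the service
    choreography: list         # (3) how a client interacts with the service
    orchestration: list        # (4) which other services are aggregated

ws = WebServiceDescription(
    nonfunctional={"provider": "ExampleAirline"},
    capability=Capability(preconditions=["validCreditCard"],
                          postconditions=["ticketBooked"]),
    choreography=["sendRequest", "receiveOffer", "confirm"],
    orchestration=["paymentService"],
)
```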
Goals

In order to facilitate automated Web service usage and support the ontological separation of user desires, service usage requests, and Web service descriptions, goals in WSMO allow specifying objectives that clients (which can be humans or machines) wish to achieve.

The general structure of WSMO goal descriptions is similar to that of Web service descriptions. The client can specify the functionality expected in a requested capability. Also, a goal can carry information on the expected behavior of an acceptable Web service in so-called requested interfaces. These can define the expected communication behavior for consuming a Web service with respect to its choreography interface, as well as restrictions on other Web services aggregated in the orchestration of an acceptable Web service (e.g., only Web services that utilize a trusted payment facility are accepted). It is important to remark that goal descriptions are defined from the client perspective and thereby decoupled from Web service descriptions.
Mediators

Mediation is concerned with handling heterogeneity, that is, resolving possibly occurring mismatches between resources that ought to be interoperable. Heterogeneity naturally arises in open and distributed environments, and thus in the application areas of Semantic Web services. Hence, WSMO defines the concept of mediators as a top-level notion.

Mediator-oriented architectures as introduced in Wiederhold (1994) specify a mediator as an entity for establishing interoperability of resources that are not compatible a priori, by resolving mismatches between them at runtime. The aspired approach for mediation relies on the declarative description of resources, whereupon mechanisms for resolving mismatches work on a structural and a semantic level, in order to allow generic, domain-independent mediation facilities as well as reuse of mediators. Concerning the needs for mediation within Semantic Web services, WSMO distinguishes three levels of mediation:
1. Data-level mediation, resolving mismatches between heterogeneous terminologies;
2. Protocol-level mediation, resolving mismatches between heterogeneous communication behaviors;
3. Process-level mediation, resolving mismatches between heterogeneous business processes.
What is a Service?

It has been pointed out in Preist (2004) that the notion of service is semantically overloaded. Several communities have different interpretations, which makes it difficult to understand and relate single approaches and to exchange ideas and results. In order to reach a common understanding of the problem we address here, we need to precisely define the term service and, therefore, what kind of entities we aim at locating in principle.

In this chapter, we use the following interpretation for the term service, as described in the conceptual architecture for Semantic Web services presented in Preist (2004): service as provision of value in some domain. This definition regards a service as a provision of value (not necessarily monetary value) in some given domain, independently of how supplier and consumer interact. Examples of services in this sense are the provision of information about flight tickets or the booking of a trip with certain characteristics by a tourism service provider.
description: in general, an abstract service AP offered by some provider P does not stay the same, but changes over time. For instance, a hotel will not be able to book a room with a single bed on a specific date if all such rooms in the hotel are already booked on that date.

Since clients are basically interested in finding abstract services which actually can solve their problem at hand (as specified in a WSMO goal), discovery in general needs to take these dynamics into account in order to produce accurate results. This basically means that purely static descriptions of Web services are not sufficient in the general case. In applications where highly accurate results are requested, Web service descriptions will have to include a dynamic component as well.
On the description level there are various options for properly representing the dynamically changing abstract services of a provider: (1) use a purely static description of the abstract service and change the description in its entirety every time the abstract service changes, or (2) use a static description where some parts refer to a well-defined resource that reflects the currently valid information of the dynamic part (for instance, a database table with all available rooms of a hotel at a certain point in time). Clearly, the first approach is not really scalable, since constant changes of a stored abstract service description are needed, whereas the second approach is a viable one, as we can see with dynamically generated Web pages of online stores like Amazon.com that give an up-to-date perspective on the prices of single books. In the latter case, we simply externalize the details of the changing aspects of an abstract service and provide a reference in the remaining part of the Web service description. Hence, the abstract service description including the reference does not need to change over time when the abstract service of the provider changes, and Web service descriptions become more compact.

Nonetheless, the latter approach requires communication with the provider of an abstract service (in some way, for instance via a Web service that
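Option (2) above, externalizing the dynamic part behind a reference that is resolved at discovery time, can be sketched as follows (all names and data are invented):

```python
# Sketch of a static service description whose dynamic part is a reference
# resolved only when a request is checked. Illustrative example only.

def current_free_rooms():
    """Stands in for the provider's live resource, e.g., a database table."""
    return {"single": 0, "double": 3}  # contents change over time

hotel_description = {
    "service": "HotelBooking",
    "static": {"location": "Innsbruck", "currency": "EUR"},
    # the description stores a *reference* to the volatile data, not the data
    "availability": current_free_rooms,
}

def can_book(description, room_type):
    # resolve the dynamic reference at discovery time
    return description["availability"]().get(room_type, 0) > 0

print(can_book(hotel_description, "single"))  # no single rooms left -> False
```

The static description itself never changes; only the resource behind the `availability` reference does.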
Syntactic Level

In the case of a natural-language free text, we could literally use the scenario description given above, that is,

A = {ABank is a specialized German bank that offers international wire transfers to arbitrary branches of Austrian banks, whereby money can only be transferred in European currencies, and a transfer may not exceed 100.000 Euros for private customers and may not exceed 2.000.000 Euros for companies. For the successful execution of the transfer, a (customer-specific) minimum balance of the customer's account is required. The minimum balance is computed as follows: (…)}
or a more condensed description based on a list of keywords, for example,

A = {Bank Wire Transfer, ABank, in Germany, to Branch of Austrian Bank, European Currencies only, not more than 100.000 Euros for
Heuristic Classification

Clancey (1985) provided a landmark analysis in the area of expert systems. Based on an analysis of numerous rule-based systems for classification and diagnosis, he extracted a pattern of three inference steps that helped to understand the production rules implemented in those systems.8 The problem-solving method he called heuristic classification separates abstraction, matching, and refinement as the three major activities in any classification and diagnosis task (see Figure 4).
Abstraction

Abstraction is the process of translating the concrete description of a case into features that can be used for classifying the case. For example, the name of a patient can be ignored when making a diagnosis, his precise age may be translated into an age class, and his precise body temperature may be translated into the finding "low fever". The process is about extracting classification-relevant features from a concrete case description.
Matching

Matching is the process of inferring potential explanations, diagnoses, or classifications from the extracted features. It matches the abstracted case description with abstract categories describing potential solutions.
Refinement

Refinement is the process of inferring a final diagnosis explaining the given findings. This pro-
Service Discovery

Now, what has this to do with Web service or service discovery? We strongly believe that a scalable and workable service discovery approach should follow the same pattern (see Figure 5).
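The three inference steps of heuristic classification can be sketched as a small pipeline. The rules, thresholds, and categories below are invented for illustration:

```python
# Abstraction -> matching -> refinement, as a toy diagnosis pipeline.
# All rules, thresholds, and category names are made up.

def abstraction(case):
    """Translate concrete findings into classification-relevant features."""
    features = set()  # the patient's name is simply ignored
    if case["age"] < 30:
        features.add("young")
    if 37.5 <= case["temperature"] < 38.5:
        features.add("low fever")
    return features

# abstract categories describing potential solutions (hypothetical)
RULES = {frozenset({"young", "low fever"}): "viral infection"}

def matching(features):
    """Match abstracted features against abstract solution categories."""
    return RULES.get(frozenset(features), "unknown")

def refinement(category, case):
    """Narrow the abstract category down to a final, case-specific answer."""
    return f"{category} (patient aged {case['age']})"

case = {"name": "ignored", "age": 24, "temperature": 37.9}
print(refinement(matching(abstraction(case)), case))
```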
Service Discovery

Service discovery is based on the usage of Web services for discovering actual services. Web service technology provides automated interfaces to the information provided by software artifacts, which is needed to find, select, and eventually buy a real-world service, or simply to find the piece of information somebody is looking for. Service discovery requires strong mediation and wrapping, since the specific needs of a choreography of a Web service have to be met in order to interoperate with it. Notice that automation of service discovery places significantly higher requirements on mediation than Web service discovery, as it also requires protocol and process mediation. In a sense, the role of Web service discovery can be compared with the role of an Internet search engine like Google,10 and service discovery with the process of extracting the actual information from the retrieved Web sites.
Assumptions on Mediation

One could assume that Web services and goals are described by the same terminology. Then no data mediation problem exists during the discovery process. However, it is unlikely that a potentially huge number of distributed and autonomous parties will agree beforehand on a common terminology.

Alternatively, one could assume that goals and Web services are described by completely independent vocabularies. Although this case might happen in a real setting, discovery would be impossible to achieve. In consequence, only an intermediate approach can lead to a scenario where neither unrealistic assumptions nor complete failure of discovery has to occur. Such a scenario relies on three main assumptions:

1. Goals and Web services most likely use different vocabularies; in other words, we do not restrict our approach to the case where both need to use the same vocabulary.
2. Goals and Web services use controlled vocabularies or ontologies to describe requested and provided services.
3. There is some mediation service in place. Given the previous assumption, we can optimistically assume that a mapping has already been established between the used terminologies, not to facilitate our specific discovery problem but rather to support the general information exchange process between these terminologies.
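Under these assumptions, data mediation during discovery reduces to applying a previously established terminology mapping before matching. All terms and mappings below are invented examples:

```python
# Sketch of the intermediate assumption: goal and Web service use different
# vocabularies, but an already-established mapping mediates between them.

MAPPING = {  # requester vocabulary -> provider vocabulary (hypothetical)
    "purchase": "buy",
    "aircraft_ticket": "flight_ticket",
}

def mediate(goal_terms):
    """Translate goal terms into the provider's vocabulary where a mapping
    exists; unmapped terms pass through unchanged."""
    return {MAPPING.get(t, t) for t in goal_terms}

goal = {"purchase", "aircraft_ticket"}
service = {"buy", "flight_ticket"}
print(mediate(goal) == service)  # discovery succeeds after mediation -> True
```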
Keyword-Based Discovery

Keyword-based discovery is a basic ingredient in a complete framework for Semantic Web service discovery. By performing a keyword-based search, the huge amount of available services can be filtered.
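A minimal keyword-based pre-filter can be sketched as follows; the service descriptions and query are invented examples:

```python
# Keep only service descriptions that contain all requested keywords.
# This is a deliberately naive pre-filter, not a full IR ranking scheme.

def keyword_filter(query, descriptions):
    words = set(query.lower().split())
    return [d for d in descriptions
            if words <= set(d.lower().split())]  # all query words present

services = [
    "bank wire transfer to austrian branches in european currencies",
    "hotel booking service for innsbruck",
]
hits = keyword_filter("wire transfer", services)
print(len(hits))  # only the bank service survives the filter -> 1
```

A production filter would add stemming, stop-word removal, and TFIDF-style weighting, but the narrowing role in the overall discovery pipeline is the same.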
w ∈ U where the world has changed (some effects are observable) and some output has been provided to the user. Both effects effS(w, i1, …, in) and outputs outS(w, i1, …, in) can be seen as sets of objects depending on the initial state w and the input information i1, …, in which has been provided to the service provider by the service requester in w. The circumstances under which a service S can be delivered by the provider are represented by w and i1, …, in. For example, the description of a concrete service provided by a European airline could be that a business-class flight is booked for the male passenger James Joyce on January 5th, 2005 from Dublin to Innsbruck, and 420 Euros are charged on a MasterCard with number #120127933.
If we abstract the description of an abstract service A from the dependency of the contained concrete services on the provided inputs i1, …, in and on the particular initial states w ∈ dom(A(i1, …, in)), the description will only specify which objects we can expect from the abstract service as effects effA and as outputs outA. For example, an abstract description of a European airline could state that the airline provides information about flights within Europe as well as reservations for these flights, but not what input has to be provided and how this input will determine the results of the service provision. In general, we expect completeness but not necessarily correctness of the abstract capability: every concrete service provided by an abstract service should be covered by the capability (on this intermediate level of abstraction), but there might be services which are models of the capability but can actually not be delivered as part of the abstract service A by the provider (since we abstract from the circumstances under which a service can be provided). More formally, we assume:

effS(w, i1, …, in) ⊆ effA

and

outS(w, i1, …, in) ⊆ outA
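The completeness requirement can be sketched as plain set containment. All concrete sets and object names below are invented examples:

```python
# Completeness of the abstract capability: every concrete outcome
# eff_S(w, i1, ..., in) / out_S(w, i1, ..., in) must be contained in the
# abstract eff_A / out_A. Illustrative sets only.

eff_A = {"booked(DUB->INN)", "booked(VIE->MUC)"}
out_A = {"itinerary", "receipt"}

def complete(concrete_effects, concrete_outputs):
    """True if one concrete provision is covered by the abstract capability."""
    return concrete_effects <= eff_A and concrete_outputs <= out_A

# one concrete service provision (for a fixed world state and fixed inputs)
eff_S = {"booked(DUB->INN)"}
out_S = {"itinerary"}
print(complete(eff_S, out_S))  # covered by the abstract capability -> True
```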
Table 1.

Goal / WS: R = {f | f is a flight starting at city s and ending at city e, s any city in Europe, e any city in Europe}
Intention of R: Existential (∃) / Universal (∀)
Semantic Matching

In order to consider a goal G and an abstract service A to match on a semantic level, the sets RG and RA describing these elements have to be interrelated; precisely spoken, we expect that some set-theoretic relationship between RG and RA exists. The most basic set-theoretic relationships that might be considered are the following: RG = RA, RG ⊆ RA, RG ⊇ RA, RG ∩ RA ≠ ∅, and RG ∩ RA = ∅.

These set-theoretic relationships provide the basic means for formalizing our intuitive understanding of a match between goals and abstract services. For this reason, they have been considered to some extent already in the literature, for instance in Li and Horrocks (2003) or Paolucci et al. (2002), in the context of Description Logics.
Figure 6. Interaction between set-theoretic criteria, intentions and our intuitive understanding of matching

Relation        IG = ∀, IA = ∀   IG = ∀, IA = ∃   IG = ∃, IA = ∀   IG = ∃, IA = ∃
RG = RA         Match            PossMatch        Match            Match
RG ⊆ RA         Match            PossMatch        Match            PossMatch
RG ⊇ RA         ParMatch         ParMatch         Match            Match
RG ∩ RA ≠ ∅     ParMatch         PossParMatch     Match            PossMatch
RG ∩ RA = ∅     NoMatch          NoMatch          NoMatch          NoMatch
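One reading of this decision table can be encoded directly. The cell entries follow the reconstruction above (itself a best-effort reading of the original figure), with "relation" computed as the strongest applicable set-theoretic relation between RG and RA:

```python
# Degree of match from the intentions (forall/exists) of goal and abstract
# service plus the set-theoretic relation between R_G and R_A.

TABLE = {
    ("forall", "forall"): {"equal": "Match", "subset": "Match",
                           "superset": "ParMatch", "overlap": "ParMatch",
                           "disjoint": "NoMatch"},
    ("forall", "exists"): {"equal": "PossMatch", "subset": "PossMatch",
                           "superset": "ParMatch", "overlap": "PossParMatch",
                           "disjoint": "NoMatch"},
    ("exists", "forall"): {"equal": "Match", "subset": "Match",
                           "superset": "Match", "overlap": "Match",
                           "disjoint": "NoMatch"},
    ("exists", "exists"): {"equal": "Match", "subset": "PossMatch",
                           "superset": "Match", "overlap": "PossMatch",
                           "disjoint": "NoMatch"},
}

def relation(rg, ra):
    """Strongest set-theoretic relation between the two reference sets."""
    if rg == ra: return "equal"
    if rg < ra:  return "subset"
    if rg > ra:  return "superset"
    if rg & ra:  return "overlap"
    return "disjoint"

def degree_of_match(ig, ia, rg, ra):
    return TABLE[(ig, ia)][relation(rg, ra)]

rg = {"DUB->INN"}
ra = {"DUB->INN", "VIE->MUC"}
print(degree_of_match("forall", "forall", rg, ra))  # R_G ⊂ R_A -> Match
```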
Table 2.

Goal / WS: R = {f | f is a flight starting at city s and ending at city e, s any city in Europe, e any city in Europe}
Intention of R: Existential (∃) / Universal (∀)
Match (IG = ∀, IA = ∀, RG ⊆ RA):
The requester wants to get all the objects specified as relevant (IG = ∀), whereas the provider claims that he is able to deliver all the objects specified in RA (IA = ∀). In this case, the requester's needs are fully covered by the abstract service, since all the requested objects RG can be delivered by the abstract service according to its abstract capability A.

Example.
ParMatch (IG = ∀, IA = ∀, RG ∩ RA ≠ ∅):
The requester wants to get all the objects that he has specified as relevant, whereas the provider claims that the abstract service is able to deliver all the objects specified in RA. However, the two sets of reference objects only overlap. In this case, the requester's needs cannot be fully satisfied by the abstract service. At best, the service can contribute to resolving the desire of the client. Thus, we consider this case as a partial match.

Example.
Table 3.

Goal / WS: R = {f | f is a flight starting at city s and ending at city e, s any city in Europe, e any city in Europe}
Intention of R: Existential (∃) / Universal (∀)

Table 4.

Goal / WS: R = {f | f is a flight starting at city s and ending at city e, s any city in Europe, e any city in Europe}
Intention of R: Universal (∀) / Existential (∃)
PossMatch (IG = ∀, IA = ∃, RG ⊆ RA):
The requester wants to get all the objects that he has specified as relevant, whereas the provider claims he is only able to deliver some of the objects specified in RA. Further, the set of objects relevant to the service requester is a subset of the set of reference objects advertised by the service provider. In this case, we cannot determine from the given descriptions whether there is a match or not, since we don't know which (non-empty) subset of RA the provider actually can deliver. However, it might turn out when examining a more detailed description (or interacting with the provider at service discovery time) that there is a match. Such a detailed description is considered during service discovery. Hence, we consider this as a possible match.

Example.
PossParMatch (IG = ∀, IA = ∃, RG ∩ RA ≠ ∅):
The requester wants to get all the objects that he has specified as relevant, whereas the provider claims that the abstract service is able to deliver only some of the objects specified in RA. Additionally, the two sets of reference objects only overlap (and this is the strongest applicable set-theoretic relation between RG and RA). In this case, the requester's needs cannot be fully satisfied by the abstract service, but at best only partially. However, we cannot determine from the given descriptions whether there is such a partial match or not, since we don't know which (non-empty) subset of RA the provider actually can deliver. When examining a more detailed description (or
Table 5.

Goal / WS: R = {f | f is a flight starting at city s and ending at city e, s any city in Europe, e any city in Europe}
Intention of R: Universal (∀) / Existential (∃)

Table 6.

Goal / WS: R = {f | f is a train connection starting at city s and ending at city e, s any city in Europe, e any city in Europe}
Intention of R: Existential (∃) / Universal (∀)
NoMatch (IG = ∃, IA = ∀, RG ∩ RA = ∅):
The requester wants to get some of the objects that he has specified as relevant, whereas the provider claims that the abstract service is able to deliver all the objects specified in RA. However, the two sets of reference objects have no common elements. In this case, the requester's needs clearly cannot be satisfied by the abstract service, and we consider this case as a non-match.

Example.
The Plugin-Criterion (RG ⊆ RW)
precisely combines the Subsumes- and the Plugin-criterion and thus specifies that the objects the Web service description refers to and the objects the requester refers to precisely match; in particular, it holds (independent of the intention of the Web service description) that irrelevant objects will not be delivered by the Web service. For the property represented by the Plugin part, the same argument as for the Plugin-Match holds. Hence, the corresponding semantic property is irrelevant, and the Exact-Match basically coincides (in the context of our discussion in this paragraph) with the Subsumes-Match.

To sum up, we have seen that there are cases where a client could benefit from exploiting the additional semantics captured by matching criteria that are stronger (i.e., ⊆-smaller) than the weakest (i.e., ⊆-maximal) criterion which represents an actual match. Hence, it makes sense to not only allow the weakest (i.e., ⊆-maximal) criterion that actually denotes a match (for the respective intentions of the goal and the Web service) to be applied for matching, but also to allow the user to manually raise the semantic requirements that are captured by the criterion to apply, and thus to reflect his interest faithfully.

We have seen as well that in our general framework there is only one such additional property that actually can be considered useful, namely the property of a Web service to not deliver objects that are irrelevant to the user.
Discovery Scenario

During the discovery process, the scenario for matching between goal and Web service descriptions can in general be considered as follows: A requester specifies his goal by means of a set of relevant objects and the respective intention. Moreover, he might additionally specify that he is interested only in Web services which deliver objects that are relevant for his goal (and thus raise the semantic requirement for matches). Furthermore, the requester can indicate in his request whether
Figure 7. Which formal criteria should be used for checking different degrees of matching? [Decision table mapping the intentions of goal and Web service (∃/∀) and the set-theoretic relation between RG and RW to the degrees Match, ParMatch, PossMatch, PossParMatch, and NoMatch.]
Discussion

The proposed modeling approach is based on set theory and ontologies for capturing domain knowledge. By abstracting from the dynamic aspects of abstract services, we provide static and general abstract capability descriptions. All the information necessary for checking a match is already available when abstract service descriptions are published, and no interaction with any of the involved parties (requester and provider) is needed for this discovery step. On the other hand, the accuracy we can achieve is limited. Hence, this discovery step based on such simple descriptions allows an efficient identification of candidate abstract services, but does not guarantee that a
Figure 8. Which formal criteria should be used for checking different degrees of matching when a requester insists on services delivering relevant objects only? [Decision table analogous to Figure 7, with the set-theoretic criteria strengthened by the additional requirement that only relevant objects are delivered.]
This way, in principle we end up with (almost) eight different notions of matching, which potentially could be used by a client to specify his desire in a service request.
Exact-Match (W ≡1 G, W ≡+ G).
If we consider the Exact-Match under the assumption that the Web service is executed only once, we have to formalize the following statement: there are input values i1, …, in such that the set of objects RW(i1, …, in) that the Web service claims to deliver15 when being invoked with input values i1, …, in coincides with the set RG of objects which are relevant for the requester. In this case we write W ≡1 G to indicate this particular kind of match.
If we instead want to consider multiple executions, we use the following condition: for each object x it holds that x can be delivered by a Web service execution on some input values i1, …, in iff x is relevant for the client. In this case we write W ≡+ G to indicate this particular kind of match.
Subsumes-Match (W ⊑1 G, W ⊑+ G).
If we consider the Subsumes-Match under the assumption that the Web service is executed only once, we have to formalize the following statement: there are input values i1, …, in such that the set of objects RW(i1, …, in) that the Web service claims to deliver when being invoked with input values i1, …, in is a subset of the set RG of objects which are relevant for the requester. In this case we write W ⊑1 G to indicate this particular kind of match.
If we instead want to consider multiple executions, we would have to formalize the following statement: for each object x in the universe it holds that if x can be delivered by a Web service execution on some input values i1, …, in, then x is relevant for the client. In this case we write W ⊑+ G to indicate this particular kind of match.
Plugin-Match (W ⊒₁ G, W ⊒₊ G)

Plugin-match is the dual of Subsumes-match: for a single execution, there are input values i1, …, in such that the set of objects RW(i1, …, in) that the Web service claims to deliver is a superset of the set RG of objects which are relevant for the requester (W ⊒₁ G); for multiple executions, every object that is relevant for the client can be delivered by some Web service execution (W ⊒₊ G).
Intersection-Match (W ⊓₁ G, W ⊓₊ G)

If we consider Intersection-match under the assumption that the Web service is executed only once, we have to formalize the following statement: there are input values i1, …, in such that the set of objects RW(i1, …, in) that the Web service claims to deliver when being invoked with input values i1, …, in has a common element with the set of objects RG which are relevant for the requester. In this case we write W ⊓₁ G to indicate this particular kind of match.

If we instead want to consider multiple executions, we would have to formalize the following statement: there is an object x in the universe such that x can be delivered by a Web service execution on some input values i1, …, in and x is relevant for the client. In this case we write W ⊓₊ G to indicate this particular kind of match.
Obviously, the two criteria ⊓₁ and ⊓₊ are logically equivalent and thus denote the very same criterion, so we do not have to distinguish between the two cases.
Related Work
By defining a mathematical model for Web services, goals and the notion of matchmaking we
provide a basis for applications like Semantic Web
service repositories and discovery engines. Work
in this area has previously leveraged a different
(less detailed) formal view on the concept of a Web
service: Web services there have been formally
mostly considered as sets of objects (describing
inputs and outputs). At the description (language) level, these sets allow for a natural representation by means of concept expressions in Description Logics.
Matching has then been reduced to standard reasoning tasks in the language (Li & Horrocks, 2003; Paolucci et al., 2002); however, the dynamics associated with a detailed (state-based) perspective on Web services cannot be represented in such a setting.
Until recently, it seemed to be common practice in the Semantic Web community, when considering semantic descriptions of Web services, to focus strictly on languages (e.g., Description Logics).
Conclusion
In this chapter we have presented a conceptual
model for service discovery which avoids unrealistic assumptions and is suitable for a wide range
of applications. One of the key features of this
model is the explicit distinction between the notions of Web services and services. Moreover, the
model does not neglect one of the core problems
one has to face in order to make discovery work
in a real-world setting, namely the heterogeneity
of descriptions of requestors and providers and
the required mediation between heterogeneous
representations. As discussed in the conceptual model for service discovery, the presented approaches are based on terminologies, controlled vocabularies, or rich descriptions based on ontologies. For each of them, a working solution to the mediation problem is possible, or at least not unrealistic.
We have outlined various approaches to the discovery of Web services, with different requirements on the description of Web services and on the discovery request itself. Our main focus has been on semantics-based approaches to Web service discovery.
For further details, we refer the interested reader to the related documents of the WSMO and WSML working groups.
Acknowledgments
The work summarized in this chapter was carried out in the context of the WSMO and WSML
working groups, and various academic projects.
We want to express our special thanks to Michael
Stollberg, Cristina Feier and Dumitru Roman
for contributing to the first section. Numerous
people have been involved in the discussions and
in generating ideas. We especially would like to
thank Axel Polleres (Universidad Rey Juan Carlos, Madrid, Spain) and Michael Kifer
(State University of New York (SUNY) at Stony
Brook, USA) for their continuous critique and
advice. Finally, we would like to thank all the members of the WSMO, WSML, and WSMX working groups for their input and fruitful discussions
about the work presented in this chapter.
The presented work has been funded by the
European Commission under the projects DIP,
Knowledge Web, InfraWebs, SEKT, SWWS, ASG
and Esperonto; by Science Foundation Ireland
under Grant No. SFI/02/CE1/I131 and the Austrian Federal Ministry for Transport, Innovation,
and Technology under the project RW (FIT-IT
contract FFG 809250).
References
Akkiraju, R., Goodwin, R., Doshi, P., & Roeder,
S. (2003). A method for semantically enhancing
the service discovery capabilities of UDDI. In S.
Kambhampati & C.A. Knoblock (Eds.), Proceedings of the IJCAI-03 Workshop on Information
Integration on the Web (IIWeb 03) (pp. 87-92).
Acapulco, Mexico.
Bonner, A. & Kifer, M. (1998). A logic for programming database transactions. In J. Chomicki
& G. Saake (Eds.), Logics for databases and information systems (Chapter 5, pp. 17-66). Kluwer
Academic Publishers.
Clancey, W.J. (1985). Heuristic classification.
Artificial Intelligence, 27(3), 289-350.
de Bruijn, J. (2005). The WSML specification.
Working draft, Digital Enterprise Research Institute (DERI). Retrieved October 25, 2006, from
http://www.wsmo.org/TR/d16/
Fellbaum, C. (Ed.). (1998). WordNet: An electronic
lexical database. MIT Press.
Fensel, D. (2003). Ontologies: Silver bullet for
knowledge management and electronic commerce
(2nd ed.). Berlin: Springer-Verlag.
Fensel, D. & Bussler, C. (2002). The Web service modeling framework (WSMF). Electronic
Commerce Research and Applications, 1(2),
113-137.
González-Castillo, J., Trastour, D., & Bartolini,
C. (2001). Description logics for matchmaking of
services. In Proceedings of the KI-2001 Workshop
on Applications of Description Logics.
Gruber, T.R. (1993). A translation approach to
portable ontology specification. Knowledge Acquisition, 5(2), 199-220.
Hoare, C.A.R. (1969). An axiomatic basis for
computer programming. Communications of the
ACM, 12(10), 576-580.
Jones, C.B. (1990). Systematic software development using VDM. Upper Saddle River, NJ:
Prentice Hall.
Keller, U., Lara, R., & Polleres, A. (2004). WSMO
Web service discovery. Deliverable D5.1, WSMO
Working Group. Retrieved October 25, 2006, from
http://www.wsmo.org/TR/d5/d5.1/
Lara, R., Binder, W., Constantinescu, I., Fensel, D., …

Wiederhold, G. (1992). Mediators in the architecture of future information systems. Computer, 25(3), 38-49.
Zein, O. & Kermarrec, Y. (2004). An approach for
describing/discovering services and for adapting
them to the needs of users in distributed systems.
In Proceedings of the AAAI Spring Symposium
on Semantic Web Services.
Endnotes
http://www.google.com

The WordNet homepage: http://www.cogsci.princeton.edu/wn/

Again, this observation has not been made by the Semantic Web community before, because of a strict language focus.

Please note that, when assigning the intuitive notions, we assume that the listed set-theoretic properties between RG and RA are the strongest ones that actually hold between RG and RA.
Chapter XIII
Abstract
As the use of the World Wide Web has become increasingly widespread, the business of commercial
search engines has become a vital and lucrative part of the Web. Search engines are commonplace tools for virtually every user of the Internet, and companies such as Google and Yahoo! have become household names. Semantic search engines try to augment and improve traditional Web search engines by using not just words, but concepts and logical relationships. In this chapter, a relevant class of semantic search engines, based on a peer-to-peer, mediator-based data integration architecture, is described. The architectural and functional features are presented with respect to two projects involving the authors, SEWASIE and WISDOM. The methodology used to create a two-level ontology and the query processing in the SEWASIE project are fully described.
Introduction
Commercial search engines are mainly based upon human-directed search. Human-directed search engine technology utilizes a database of keywords, concepts, and references. Keyword searches are used to rank pages, but this simplistic
Copyright 2007, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
We believe that data integration systems, domain ontologies and peer-to-peer architectures
are good ingredients for developing Semantic
Search Engines with good performance. In the
following, we will provide empirical evidence for
our hypothesis. More precisely, we will describe
two projects, SEWASIE and WISDOM, which
rely on these architectural features and developed
key semantic search functionalities. They both
exploit the MOMIS (Mediator EnvirOnment for
Multiple Information Sources) data integration
system (Beneventano, Bergamaschi, Guerra, &
Vincini, 2003; Bergamaschi, Castano, Vincini,
& Beneventano, 2001).
SEWASIE Architecture
Figure 1 gives an overview of the architecture of
SEWASIE. A user is able to access the system
through a central user interface where the user
is provided with tools for query composition, for
visualizing and monitoring query results, and
for communicating with other business partners
about search results, for example, in electronic
negotiations. SEWASIE Information Nodes
(SINodes) are mediator-based systems, providing
a virtual view of the information sources managed within a SINode. The system may contain
multiple SINodes, each integrating several data
sources of an organization. Within a SINode,
wrappers are used to extract the data and metadata
(local schemas) from the sources. The Ontology
Builder, based on the MOMIS framework, is a
semi-automatic tool which creates a bootstrap
domain ontology by extracting and integrating
the local schemata of the sources into a GVV (Global Virtual View).
The GVV is annotated w.r.t. a lexical ontology (WordNet) (Miller, 1995). The annotated GVV
and the mappings to the data source schemas are
stored in a metadata repository (SINode ontology
in Figure 1) and queries expressed in terms of the
GVV can be processed by the Query Manager of
a SINode.
Brokering agents (BA) integrate several GVVs
from different SINodes into a BA Ontology, that
is of central importance to the SEWASIE system.
On the one hand, the user formulates the queries
using this ontology. On the other hand, it is used
to guide the Query Agents to select the useful
SINodes to solve a query.
The SEWASIE network may have multiple BAs, each one representing a collection of SINodes. A peer (called SINode) contains a mediator-based data integration system, which integrates heterogeneous data sources into an ontology composed of an annotated GVV and, for each global class C:

- A (possibly empty) set of local classes, denoted by L(C), belonging to the local sources in N.
- A conjunctive query QN over L(C).
Intuitively, the GVV is the intensional representation of the information provided by the Integration System, whereas the mapping assertions specify how such an intensional representation relates to the local sources managed by the Integration System. The semantics of an Integration System, and hence of the SEWASIE system, is defined in Beneventano and Lenzerini (2005) and Calì, Calvanese, De Giacomo, and Lenzerini (2004).
SINodes and the BA are defined as integration systems:
2. Mapping refinement: The Ontology Designer interactively refines and completes the automatic integration result; in particular, the mappings which have been automatically created by the system can be fine-tuned, and the query associated with each global class can be defined.
Ontology Generation
The ontology generation process can be outlined as follows (see Figure 3):

1. Extraction of Local Source Schemata: Wrappers acquire the schemas of the involved local sources and convert them into ODLI3. Schema descriptions of structured sources (e.g., relational databases and object-oriented databases) can be directly translated, while the extraction of schemata from semistructured sources needs suitable techniques, as described in Abiteboul, Buneman, and Suciu (2000). To perform information extraction
5. GVV Annotation:
Mapping Refinement
The system automatically generates a Mapping Table (MT) for each global class C of a GVV, whose columns represent the local classes L(C) belonging to C and whose rows represent the global attributes of C. An element MT[GA][LC] represents the set of local attributes of LC which are mapped onto the global attribute GA (see Figure 6).
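A Mapping Table of this kind can be pictured as a nested dictionary; the sketch below is ours, not MOMIS's internal representation, and the class/attribute names mirror the chapter's SN1.Enterprise/SN2.Manufacturer example.

```python
# Illustrative Mapping Table as a nested dict:
# MT[global_attribute][local_class] -> set of local attributes mapped onto it.
# An empty set models a null mapping (the local class has no attribute
# for that global attribute).

MT = {
    "Name":          {"SN1.Enterprise": {"Name"},          "SN2.Manufacturer": {"Name"}},
    "Address":       {"SN1.Enterprise": {"Address"},       "SN2.Manufacturer": {"Address"}},
    "Web":           {"SN1.Enterprise": {"Web"},           "SN2.Manufacturer": {"Website"}},
    "ContactPerson": {"SN1.Enterprise": {"ContactPerson"}, "SN2.Manufacturer": set()},
}

def local_attrs(mt, global_attr, local_class):
    """Local attributes of local_class mapped onto global_attr (may be empty)."""
    return mt.get(global_attr, {}).get(local_class, set())
```

For instance, the global attribute Web is mapped onto the local attribute Website in SN2.Manufacturer, while ContactPerson has a null mapping there.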
Join Conditions
Merging data from different sources requires
different instantiations of the same real world
object to be identified; this process is called object
identification (Naumann & Häussler, 2002). The
topic of object identification is currently a very
active research area with significant contributions
both from the artificial intelligence (Tejada, Knoblock, & Minton, 2001) and database communities
(Ananthakrishna, Chaudhuri, & Ganti, 2002;
Chaudhuri, Ganjam, Ganti, & Motwani, 2003).
To identify instances of the same object and
fuse them we introduce Join Conditions among
pairs of local classes belonging to the same global
class. Given two local classes L1 and L2 belonging to C, a Join Condition between L1 and L2, denoted by JC(L1,L2), is an expression over L1.Ai and L2.Aj, where Ai (Aj) are global attributes with a non-null mapping in L1 (L2).
As an example, in Figure 4, the designer can define:

JC(L1, L2): L1.CompanyName = L2.Name

where L1 = usawear.Company and L2 = fibre2fashion.Company.
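A join condition of this shape is just a predicate over a pair of local instances. The sketch below models instances as plain dicts; this framing (and the sample data) is ours, not MOMIS's implementation.

```python
# A join condition as a predicate over two records, following
# JC(L1, L2): L1.CompanyName = L2.Name. Records are plain dicts here.

def jc(l1_record: dict, l2_record: dict) -> bool:
    """True if the two local instances denote the same real-world company."""
    return l1_record["CompanyName"] == l2_record["Name"]

# Hypothetical sample instances from the two local classes.
usawear = {"CompanyName": "Acme Textiles", "Address": "Milan"}
fibre2fashion = {"Name": "Acme Textiles", "Website": "acme.example"}
```

Here `jc(usawear, fibre2fashion)` holds, so the two tuples would be fused into one during full-disjunction computation.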
Resolution Functions
The fusion of data coming from different sources
taking into account the problem of inconsistent
information among sources is a hot research
topic (Bertossi & Chomicki, 2003; Di Giacomo,
Lembo, Lenzerini, & Rosati, 2004; Greco, Greco,
& Zumpano, 2003; Lin & Mendelzon, 1998; Naumann & Häussler, 2002). In MOMIS the approach proposed in Naumann and Häussler (2002) has
been adopted: a Resolution Function for solving
data conflicts may be defined for each global
attribute mapping onto local attributes coming
from more than one local source.
Homogeneous Attributes: If the designer knows that there are no data conflicts for a global attribute mapped onto more than one source (that is, the instances of the same real object in different local classes have the same value for this common attribute), he can define this attribute as a Homogeneous Attribute; this is the default in MOMIS. Of course, for homogeneous attributes resolution functions are not necessary. A global attribute mapped onto only one source is a particular case of a homogeneous attribute.
As an example, in Enterprise we defined all the global attributes as homogeneous attributes, except for Address, where we used a precedence function: L1.Company.Address has a higher precedence than L2.Company.Address.
Full Disjunction
QN is defined in such a way that it contains a unique tuple resulting from the merge of all the different tuples representing the same real-world object. This problem is related to that of computing the natural outer-join of many relations in a way that preserves all possible connections among facts (Rajaraman & Ullman, 1996). Such a computation has been termed Full Disjunction (FD) by Galindo-Legaria (1994).
In our context, given a global class C composed of L1, L2, …, Ln, we consider:

T(L1) full join T(L2) on (JC(L1, L2))
full join T(L3) on (JC(L1, L3) OR JC(L2, L3))
…
full join T(Ln) on (JC(L1, Ln) OR JC(L2, Ln) OR … OR JC(Ln-1, Ln))
Figure 5.
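The pairwise full-join step underlying this chain can be illustrated over lists of dicts. This is a naive quadratic sketch under our own record layout, not MOMIS's actual full-disjunction algorithm.

```python
# Naive full outer join of two tuple lists on a join-condition predicate,
# as used pairwise in the T(L1) full join T(L2) ... chain.

def full_join(left: list, right: list, on) -> list:
    """Merge matching tuples; keep dangling tuples from both sides."""
    out, matched_right = [], set()
    for l in left:
        hit = False
        for j, r in enumerate(right):
            if on(l, r):
                out.append({**r, **l})   # merge; left values win on clashes
                matched_right.add(j)
                hit = True
        if not hit:
            out.append(dict(l))          # dangling left tuple
    # dangling right tuples
    out += [dict(r) for j, r in enumerate(right) if j not in matched_right]
    return out
```

Chaining `full_join` with disjunctions of join conditions, as in the expression above, yields all connected combinations while preserving unmatched tuples from every source.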
Finally, the agent-based prototype for query processing implemented in the SEWASIE system is briefly described.
Query Formulation
Initially the user is presented with a choice of
different query scenarios which provide a mean-
unfolding process of an atom by taking into account the query QN introduced before.
Query Reformulation
Query reformulation takes into account the two different levels of mappings (Figure 2): in Calì et al. (2004) and Beneventano and Lenzerini (2005), it is proved that if m1 and m2 are GAV mappings, the overall mapping is indeed the composition of m1 and m2; this implies that query answering can be carried out in terms of two reformulation steps: (1) reformulation w.r.t. the BA Ontology, and (2) reformulation w.r.t. the SINode Ontology.

These two reformulation steps are similar and are:
1.
2.

EXPQuery = Q1 ∪ Q2

where Q1 = Q and

Q2 = { (X1, X2, X3) | Enterprise(X), Name(X, X1), Address(X, X2), Web(X, X3), Like(X3, "*.com"), ContactPerson(X, X4), EqualTo(X4, "yes"), BusinessOrganization(X), HasCategory(X, X5), ProductClassification(X5) }
that is, Q2 takes into account the constraint ProductClassification IS-A Category.
A set of EXPAtoms:
Query Unfolding
The query unfolding process is performed for each EXPAtom, which is a Single Global Query Q over a global class C of the GVV (for the sake of simplicity, we consider the query in an SQL-like format):

Q = SELECT <Q_SELECT-list> FROM C WHERE <Q_condition>
Global attribute    SN1.Enterprise    SN2.Manufacturer
Name                Name              Name
Address             Address           Address
Web                 Web               Website
ContactPerson       ContactPerson     (none)
The set of FDAtoms for the EXPAtom is:

FDAtom1: SELECT Name, Address, Web
FROM SN1.Enterprise
WHERE Web LIKE '*.com' AND ContactPerson = 'yes'

Step 2) Generation of the FDQuery, which computes the Full Disjunction of the FDAtoms. In our example:

SELECT * FROM FDAtom1 FULL JOIN FDAtom2 ON (FDAtom1.Name = FDAtom2.Name)
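The generation of per-source FDAtoms from the Mapping Table can be sketched as follows; the helper, the string-based mapping table, and the omission of WHERE-condition rewriting are our own simplifications, not the SEWASIE implementation.

```python
# Hypothetical sketch of FDAtom generation: rewrite a global SELECT list
# into one SELECT per local class, renaming global attributes to local
# ones via the mapping table and skipping null mappings.
# (Condition rewriting for the WHERE clause is omitted.)

MT = {
    "Name":          {"SN1.Enterprise": "Name",          "SN2.Manufacturer": "Name"},
    "Address":       {"SN1.Enterprise": "Address",       "SN2.Manufacturer": "Address"},
    "Web":           {"SN1.Enterprise": "Web",           "SN2.Manufacturer": "Website"},
    "ContactPerson": {"SN1.Enterprise": "ContactPerson", "SN2.Manufacturer": None},
}

def fd_atom(local_class: str, select_list: list, mt: dict) -> str:
    """Build the SELECT clause of the FDAtom for one local class."""
    cols = [mt[ga][local_class] for ga in select_list if mt[ga][local_class]]
    return f"SELECT {', '.join(cols)} FROM {local_class}"
```

For SN2.Manufacturer, the global attribute Web is renamed to Website, so `fd_atom("SN2.Manufacturer", ["Name", "Address", "Web"], MT)` yields `SELECT Name, Address, Website FROM SN2.Manufacturer`.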
The coordination of query processing is performed by the Query Agent, which accepts the
query from the Query Tool Interface, interacts
with a Brokering Agent and its underlying SINode
Agents, and returns the result as a materialized
view in the SEWASIE DB. The Playmaker performs the reformulation of the query w.r.t. the BA ontology. It has two components: the Expander, which performs the query expansion, and the Unfolder, which performs the query unfolding. Once the execution of the Playmaker is completed, the output of its computation is sent from the BA to the QA in a single message.
The Query Agent performs the following three steps:
Step 1) Execution: for each FDAtom (Parallel
Execution):
INPUT: FDAtom
MESSAGES: from QA to an SINode Agent
OUTPUT: a view storing the FDAtom result in the
SEWASIE DB
The WISDOM Project
The WISDOM (Web Intelligent Search based on DOMain ontologies) project aims at studying, developing, and experimenting with methods and techniques for searching and querying information sources available on the Web. The main goal of the project is the definition of a software framework that allows computer applications to leverage the huge amount of information content offered by Web sources (typically, Web sites). In the project it is assumed that the number of sources of interest might be extremely large, and that the sources are independent of and autonomous from each other. These factors raise significant issues, in particular because such an information space implies heterogeneities at different levels of abstraction (format, logical, semantic). Providing
Semantic Peer
A semantic peer contains a data integration system, which integrates heterogeneous data sources
[Figure: a semantic peer P integrating local sources, each with a local class Ci and mapping Mi, through GVV mappings into the Peer Ontology.]
Related Work
Several projects have been developed in the area
of Semantic Web and semantic search engines.
approach is not the traditional Semantic Web approach with coded metadata, but rather an engine
that can build on content through semi-automatic
analysis.
Conclusion
In this chapter we discussed some ingredients
for developing Semantic Search Engines based
on Data Integration Systems and peer-to-peer
architectures.
With reference to Data Integration Systems,
we refer to the list outlined in the invited tutorial
on Information Integration by Lenzerini (2003)
to point out the topics covered by the chapter.
First of all, the strength of the proposed approach is a solution to the problems of "how to construct the global schema (GVV)" and "how to model the mappings between the sources and the global schema." MOMIS follows a semantic
approach to perform extraction and integration
from both structured and semistructured data
sources. Semantics here means semantic annotation w.r.t. a lexical ontology and a Description Logics
background. MOMIS follows a GAV approach
where the GVV and the mappings among the
local sources and the GVV are defined in a semiautomatic way.
Regarding the problem of data extraction, cleaning, and reconciliation, we adopted some ad-hoc solutions, such as Data Conversion Functions,
Join Conditions and Resolution Functions. For
more general solutions and a deeper discussion,
the reader may refer to the bibliography given
in the Building the SEWASIE System Ontology
section.
Regarding the querying problem ("how to answer queries expressed on the global schema"), we gave an overview of the major aspects involved in querying the system, that is, the query building, the query reformulation, and the query evaluation process, in the context of query reformulation for the two-level ontology.
References
Boley, H. (2002). The Semantic Web in ten passages. Retrieved October 24, 2006, from http://www.dfki.uni-kl.de/~boley/sw10pass/sw10pass-en.htm
Buntine, W.L. & Taylor, M.P. (2004). Alvis: Superpeer semantic search engine. In Proceedings
of the EWIMT Workshop (pp. 49-58).
tion representation and exploration. In Proceedings of the WEBIST Conference (pp. 332-340).
Gong, L. (2001). Industry report: JXTA: A network programming environment. IEEE Internet
Computing, 5(3), 88-95.
Endnote

1. Synonym of (SYN) relationships are defined between two terms ti and tj that are synonyms;
Broader terms (BT) relationships are defined between two terms ti and tj, where ti has a broader,
more general meaning than tj. BT relationships are not symmetric.
Narrower terms (NT) relationships are the opposite of BT relationships.
Related terms (RT) relationships are defined between two terms ti and tj that are generally used
together in the same context in the considered sources.
An intensional relationship is only a terminological relationship, with no implications on the extension/compatibility of the structure (domain) of the two involved classes (attributes).
Extensional relationships. Intensional relationships SYN and BT (NT) between two class names may be strengthened by establishing that they are also extensional relationships:
Ci SYNext Cj: this means that the instances of Ci are the same as those of Cj;
Ci BText Cj: this means that the instances of Ci are a superset of the instances of Cj;
The standard IS-A relationship Ci IS-A Cj of object-oriented languages implies the extensional
relationship Cj BText Ci .
ODLI3 also extends ODL with the addition of integrity-constraint rules, which declaratively express if-then rules at both the intra- and inter-source level. ODLI3 descriptions are translated into the Description Logic OLCD (Object Language with Complements allowing Descriptive cycles) (Beneventano, Bergamaschi, Lodi, & Sartori, 1998), in order to perform inferences that will be useful for semantic integration.
Because the ontology is composed of concepts (represented as global classes in ODLI3) and simple
binary relationships, translating ODLI3 into a Semantic Web standard such as RDF, DAML+OIL, or
OWL is a straightforward process. In fact, from a general perspective, an ODLI3 concept corresponds
to a class of the Semantic Web standards, and ODLI3 attributes are translated into properties. In particular, the IS-A ODLI3 relationships are equivalent to subclass-of in the considered Semantic Web
standards. Analyzing the syntax and semantics of each standard, further specific correspondences
might be established. For example, there is a correlation between ODLI3's simple domain attributes
and the DAML+OIL DataTypeProperty concept. Complex domain attributes further correspond to the
DAML+OIL ObjectProperty concept (www.w3.org/TR/daml+oil-reference).
Jorge Cardoso joined the University of Madeira (Portugal) in 2003. He previously gave lectures at
University of Georgia (USA) and at the Instituto Politécnico de Leiria (Portugal). Dr. Cardoso received
his PhD in computer science from the University of Georgia (2002). In 1999, he worked at the Boeing Company on enterprise application integration. Dr. Cardoso was the co-organizer and co-chair of
the first, second, and third International Workshop on Semantic and Dynamic Web Processes. He has
published over 60 refereed papers in the areas of workflow management systems, Semantic Web, and
related fields. He has also edited three books on Semantic Web and Web services. Prior to joining the
University of Georgia, he worked for two years at CCG, Zentrum für Graphische Datenverarbeitung,
where he did research on computer supported cooperative work.
***
Wil M. van der Aalst is a full professor of information systems at the Technische Universiteit
Eindhoven (TU/e) having a position in both the Mathematics and Computer Science Department and
the Technology Management Department. Currently he is also an adjunct professor at Queensland University of Technology (QUT) working within the BPM group. His research interests include workflow
management, process mining, Petri nets, business process management, process modeling, and process
analysis.
Rama Akkiraju is a senior technical staff member at the IBM T.J. Watson Research Center in New
York. She holds a master's degree in computer science and an MBA from New York University, Stern
School of Business. Since joining IBM Research in 1995, she has worked on agent-based decision support systems, electronic market places and business process integration technologies. She is interested
in applying artificial intelligence techniques to solving business problems. Her current focus is on
Semantic Web services and its applications to services science.
Christos Anagnostopoulos received his BSc (2002) in informatics from the informatics and telecommunications department at the University of Athens, Greece, and his MSc (2004) in the division of
advanced information systems from the same department. He is currently a PhD student in the same
department and member of the Pervasive Computing Research Group. He has been involved as software
designer and developer in several national and European R&D projects (E2R, PoLoS). His research
interests are in the areas of knowledge representation, context awareness, pervasive computing, and
Semantic Web.
Grigoris Antoniou is a professor of computer science at the University of Crete, and head of the
Information Systems Laboratory at FORTH, Greece. Previously he held professorial appointments at
Griffith University, Australia, and the University of Bremen, Germany. His research interests lie in
knowledge representation and reasoning, and their applications to the Semantic Web, e-commerce and
ubiquitous computing. He is author of over 150 technical papers, and co-author of three books, among
them A Semantic Web Primer, MIT Press 2004. In 2006 he was elected ECCAI Fellow by the European
Coordinating Committee for Artificial Intelligence.
Domenico Beneventano, PhD, is an associate professor at the Faculty of Engineering of the University
of Modena e Reggio Emilia (Italy). His research activity has been mainly devoted to the application of
Description Logics reasoning techniques to databases for knowledge representation and query optimization. His current research interests are in the area of intelligent information integration and Semantic Web
and are devoted to the development of techniques for building a common ontology, that is, an integrated
view of the information in the separate sources, and for query processing and optimization.
Sonia Bergamaschi is a full professor at the Information Engineering Department at the University
of Modena and Reggio Emilia (Italy). She leads the database research group (DBGROUP). Her research
activity has been mainly devoted to the application of description logics techniques to databases for
consistency check and query optimization. Her current research efforts are devoted to intelligent information integration and Semantic Web. She has published about 90 international journal and conference
papers and was the coordinator of research projects founded by the European Community and Italian
MURST, CNR, ASI institutions. She has served on the committees of international and national database and AI conferences.
Jos de Bruijn received his master's degree in technical informatics from Delft University of Technology, The Netherlands (2003). Since 2003 he has been employed as a researcher at the Digital
Enterprise Research Institute (DERI), at the University of Innsbruck, Austria. His research interests
include Semantic Web (services, languages), logical languages, logic programming and nonmonotonic
reasoning. He is the author of over 15 peer-reviewed publications in the area of Semantic Web (Services)
languages, including several journal publications. He has been actively involved in European funded
projects COG, SEKT, DIP, and Knowledge Web, and has been the project lead for SEKT at DERI. He
is a member of the WSMO and WSML working groups and of the W3C RIF working group.
Rogier Brussee has been a researcher at the Telematica Institute in Enschede, The Netherlands, since 2000. He works on multimedia content management, information extraction, and applications to knowledge management. He received a PhD in mathematics at Leiden University and subsequently held positions
Asia-Link Eastweb, COG, DIP, enIRaF, Esperonto, eSwan, h-TechSight, IBROW, InfraWebs, Knowledge Web, Multiple, MUSING, Ontoknowledge, Ontoweb, SALERO, SEEMP, SEKT, SemanticGov,
SUPER, SWAP, SWING, SWWS, SystemOne, TransIT, TripCom, and Wonderweb, the SFI-funded project DERI-Lion, as well as the Austrian projects DSSE, Grisino, LBSCult, RW2, SemBiz, Sense,
Semantic Web Fred, SemNetMan, and TSC. He has supervised over 50 master's theses and PhDs and is
a recipient of the Carl-Adam-Petri-Award of the faculty of economic sciences from the University of
Karlsruhe (2000).
Mariano Fernández-López is director of the Software and Knowledge Engineering Department
at the Technical School of Universidad San Pablo CEU, and he belongs to the Ontological Engineering
Group at Universidad Politécnica de Madrid. His current research activities include, among others:
methodologies for ontology development, ontological foundations and ontology mappings.
Thomas Franz is a doctoral student in the Computer Science Department at the University of Koblenz-Landau, Germany. His research interests include knowledge representation, personal information
management, and the Semantic Web. He received his MSc in computer science from the University of
Freiburg.
Asunción Gómez-Pérez (BA) is an associate professor at the Computer Science School at Universidad Politécnica de Madrid, Spain. She has been the director of the Ontological Engineering Group since 1995.
The most representative projects she is participating in are: SEEMP (FP6-23747), NeOn (FP6-027595),
OntoGrid (FP6-511513) as project coordinator, Knowledge Web NoE (FP6-507482) acting as scientific
vice-director, Esperonto (IST-2001-34373), the OntoWeb (IST-2000-25056) thematic network, and also
the MKBEEM (IST-1999-10589) project. She has published more than 25 papers on the above issues.
She is author of one book on ontological engineering and co-author of a book on knowledge engineering. She has been codirector of the summer school on ontological engineering and the Semantic Web
in 2003, 2004, 2005. She is program chair of ESWC05 and was of EKAW02.
Stathes Hadjiefthymiades received his BSc, MSc, PhD degrees (in computer science) from the
University of Athens (UoA), Athens, Greece and a joint engineering-economics MSc from the National
Technical University of Athens. From 1992, he was with the consulting firm Advanced Services Group.
He has been a member of the Communication Networks Laboratory of the UoA. He has participated in
numerous EU-funded and national projects. He served as visiting assistant professor at the University of
Aegean, in the information and communication systems engineering department. He joined the faculty
of Hellenic Open University (Patras, Greece) as an assistant professor. Since December 2003 he has
belonged to the faculty of the informatics and telecommunications department, UoA, where he is an
assistant professor. His research interests are in the areas of mobile/pervasive computing and networked
multimedia applications. He is the author of over 100 publications in the above areas.
Arthur H.M. ter Hofstede received his PhD in computer science from the University of Nijmegen
in The Netherlands (1993). Currently he works as an associate professor at the School of Information
Systems of the Faculty of Information Technology of Queensland University of Technology in Brisbane, Australia. He is co-leader of the BPM group in the faculty. His main research interests are in the
conceptual and formal foundations of workflow. He is committed to the Workflow Patterns Initiative
and the Yet Another Workflow Language (YAWL) Initiative.
Uwe Keller is a researcher in semantic technologies and their applications at the Digital Enterprise
Research Institute (DERI), Leopold Franzens University, Innsbruck, Austria. He joined DERI in 2004.
His current research interests include logics and automated reasoning with specific focus on Semantic
Web applications, semantic description of Web services and the exploitation of such descriptions in applications, knowledge-based applications, formal methods and semantically-enriched service-oriented
architectures. He has been involved in various European and national research projects, such as DIP,
Knowledge Web, ASG, RW2 and SENSE. At DERI, he is responsible for the Semantic Engineering
Support Environment (SEnSE) project. Uwe holds a diploma degree in computer science from the University of Karlsruhe (TH), Germany. He was awarded a scholarship by the Studienstiftung des deutschen
Volkes in 1996.
Marcello La Rosa received his MS in computer engineering from Politecnico di Torino, Italy (2005).
His thesis focused on model-based development of collaborative business processes. As part of
his degree, he also investigated the areas of Service Oriented Design and Web Services standards. He
is currently a PhD candidate within the Business Process Management Group of the faculty of IT, at
the Queensland University of Technology, Australia. His PhD research concerns the tool-based design
and configuration of reference process models, with the aim of facilitating their workflow-based execution.
Rubén Lara is R&D director at Tecnología, Información y Finanzas (TIF). He is an active researcher
in the area of Semantic Web services and service-oriented architectures. Before joining TIF, he
worked as a researcher in the area of Semantic Web services at the Digital Enterprise Research Institute (DERI),
where he was also the managing director of the EU Network of Excellence Knowledge Web. Rubén
obtained his MS in computer science at Universidad Autónoma de Madrid in 2001, and received the
First National Award in computer science from the Spanish Ministry of Culture and Science, as well as
the Special Award in computer science at Universidad Autónoma de Madrid.
Holger Lausen obtained his diploma in computer science at Flensburg University of Applied Science in 2003. His diploma thesis discussed the integration of Semantic Web technologies and document
management systems. Before he joined the Digital Enterprise Research Institute (DERI) as a researcher
in April 2003, he carried out various software development projects in Germany and abroad. Within
DERI, he has been project coordinator for the Semantic Web Enabled Web Services (SWWS) project,
a European research project pioneering the field of Web services, and is coordinating the Austrian-funded project Reasoning with Web Services (RW2). Continuing his engagement in semantic portal
technologies, he currently is active in the field of ontology and language design for the annotation of
services.
Miltiadis Lytras holds a PhD from the management science and technology department of the Athens
University of Economics and Business (AUEB), Greece. His research focuses on e-learning, knowledge
management, and the Semantic Web, with more than 35 publications in these areas. He is guest co-editing
a special issue of the International Journal of Distance Education Technologies on the theme
Knowledge Management Technologies for E-Learning, as well as one of the IEEE Educational Technology
and Society journal on the theme Ontologies and the Semantic Web for E-Learning.
John A. Miller is a professor of computer science at the University of Georgia. Dr. Miller received
a BS (1980) in applied mathematics from Northwestern University and an MS (1982) and PhD (1986)
in information and computer science from the Georgia Institute of Technology. Dr. Miller is the author
of over 100 technical papers in the areas of databases, simulation, bioinformatics, and Web services. He
has been active in the organizational structures of research conferences in all these areas. He has served
in positions from track coordinator to publications chair to general chair of the following conferences:
Annual Simulation Symposium (ANSS), Winter Simulation Conference (WSC), Workshop on Research
Issues in Data Engineering (RIDE), NSF Workshop on Workflow and Process Automation in Information Systems, and Conference on Industrial & Engineering Applications of Artificial Intelligence and
Expert Systems (IEA/AIE). He is an associate editor for ACM Transactions on Modeling and Computer
Simulation and IEEE Transactions on Systems, Man and Cybernetics as well as a guest editor for the
International Journal in Computer Simulation and IEEE Potentials.
Chun Ouyang received her PhD in computer systems engineering from the University of South
Australia (2004). She currently works as a postdoctoral research fellow at the faculty of information
technology in Queensland University of Technology, Australia. Her research interests are in the areas
of business process management, process modeling and analysis, Petri nets, and formal specification
and verification.
Richard Scott Patterson is a master's candidate in computer science at the University of Georgia.
His research interests include Web services, the Semantic Web, security, and access control. Patterson
received his BBA in economics from the University of Georgia (1998). He worked for five years
in IT architecture and security consulting before returning to pursue his master's degree.
Cary Pennington has worked in the computer science field for eight years. He received his undergraduate degree in computer science from Furman University (under Dr. Ken Abernethy), where he
focused on software engineering. He is currently completing the master's program in computer science
at the University of Georgia (with Dr. John Miller) focusing on Semantic Web services. His work was
on the automatic deployment time binding of Web services into a composite business process. This
should aid nontechnical personnel in using the advances that are being made by researchers to develop
efficient and accurate processes. He is committed to making the world of information technology easier
to understand and use for the people it will benefit the most. Before continuing his education at UGA,
Cary worked at Ipswitch Inc., a leading internet software company, on the WS-FTP product and IMail
Server.
Stanislav Pokraev has been a member of the scientific staff at Telematica Instituut, the Netherlands,
since 2001 and a PhD candidate in the computer science department of the University of Twente, the Netherlands,
since 2003. Previously, he was employed as a scientific researcher at KPN Research, the Netherlands.
Stanislav holds an MSc (Eng) degree from the Technical University of Sofia, Bulgaria. His main expertise
is in the area of information modeling and service-oriented business integration.
Christoph Ringelstein is a doctoral student in the Computer Science Department at the University
of Koblenz-Landau. His research interests include Web service annotation, knowledge representation,
ontology engineering, and the Semantic Web. He received his diploma in computer science from the
University of Koblenz-Landau.
Steffen Staab is professor of databases and information systems in the computer science department at the University of Koblenz-Landau and heads the research group on information systems and the
Semantic Web (ISWeb). His research interests range from ontology management and ontology learning
to the application of Semantic Web technologies in areas such as multimedia, personal information
management, peer-to-peer and Web services. In particular, he is interested in combining sound and
diligent ontology engineering with real world concerns and applications. His research activities have
led to over 100 refereed publications and 7 books as well as to the foundation of Ontoprise GmbH, a
company focusing on the exploitation of Semantic Web technology.
Vassileios Tsetsos received his BSc in informatics from the Informatics and Telecommunications
Department at the University of Athens, Greece, in 2003, and his MSc in the division of communication systems and data networks from the same department in 2005. He is currently a PhD student in
the same department and member of the Pervasive Computing Research Group. He has been involved
as software designer and developer in several national and European R&D projects (PoLoS, PASO,
PENED). His research interests are in the areas of mobile and pervasive computing, Semantic Web,
ontological engineering, and Web applications.
Ivan Vasquez is an MSc candidate at the LSDIS Lab (University of Georgia), working under the
direction of Dr. John Miller. He joined the program in Fall 2002, while working at the Information
Technology Outreach Services (ITOS) of the University of Georgia. Given his industry experience
with relational databases and middleware, his research has focused on conducting transactions in
service-oriented computing environments. His research resulted in OpenWS-Transaction, a METEOR-S module that can be easily adopted by existing applications to deliver reliable Web
service transactions.
Index
D
data
semistructured 5–23
structured 5–23
unstructured 5–23
description logic 111–133
classes 111–133
Dewey Decimal Classification (DDC) 18–23
document type definition (DTD) 4
Dublin Core metadata 13
E
enterprise resource planning (ERP) 6–23
Extensible Markup Language (XML) 3–23
F
first order logic (FOL) 16–23
G
Grid 19–23
H
Hyper Text Markup Language (HTML) 1–23
hypertext transfer protocol (HTTP) 2–23
L
logic 16–23, 24–43
classical first-order 25
description (DL) 25, 30–43, 41
ABox 30–43
attributive language with complement (ALC) 30–43
reasoning 32–43
SHIQ 30–43
SHOIQ 30–43
TBox 30–43, 32–43
first-order (FOL) 25–43
formal 24
frame (F-Logic) 25, 38–43, 42
intentionally 38
object identity 38
program 38–43
Horn logic 33
programming 25–43, 42
Horn formula 33
minimal Herbrand model 34–43
negation 36–43
recursion 35–43
RuleML 40–43
terminological 30
M
metadata 4
Dublin Core 13
microarray gene expression data (MGED) 20–23
O
ontology 8–23, 44–70
activity 47
alignment 49
configuration management 49
development 52
documentation 49
evaluation 49
integration 49
knowledge acquisition 49
merging 49
scheduling 47
alignment 53–70
method 53–70
AnchorPrompt 54
Artemis 54
Cupid 54
Pan and colleagues' proposal 54
PROMPT 55
S-Match 54
Similarity Flooding 54
technique 53–70
tools
OLA 60
Pan and colleagues 60
QOM toolset 60
S-Match 60
classes 46
concept attributes 46
construction 71–95
editing tool 71–95
core ontologies 53
Cyc method 51
definition of 45, 96
description model (ODM) 46
development 47–70
method 52–70
oriented activity 48
SENSUS 53
technique 52–70
tools 57–70
KAON1 58
OntoEdit 58
Protégé 57
WebODE 57
evaluation
method 56–70
content perspective 56
Gómez-Pérez 56
methodology perspective 56
OntoClean 56
evolution and versioning 55–70
method 55–70
change management KAON plug-in 56
PromptDiff algorithm 56
formal axioms 46
instances 46
knowledge base (KB) 51
language 61–70, 96–109
alternatives and extensions 106–109
basics 99
hierarchies 99
design issues 100
ontology markup language 61
Ontology Web Language 61
OWL 101
RDF Schema 101
Web-based ontology language 61
merging 49
multilingualism activity 49
tool 57–70
Altova SemanticWorks 2006 82–95
IsaViz 79
OilEd 83–95
OntoEdit 75–95
Ontolingua 80–95
pOWL 87–95
Protégé 73–95
software architecture and tool evolution 58
SWOOP 89–95
Uschold and King's method 51
versioning 55–70
Web Ontology Language (OWL) 72
R
resource description framework (RDF) 3, 11–23
basic structure 12
schema (RDFS) 14–23, 72
RuleML 42–43
S
Semantic
Web 1–23
reasoning 110–133
case study 120–133
service
discovery peer-to-peer (P2P) architecture 251–280
semantic
bioinformatic system 20–23
digital library 18–23
search engines 317–342
data integration systems 319–342
peer-to-peer (P2P)
schema 320
SEWASIE 321
ontology 322
WISDOM 334
tourism information system 18–23
Web 1–23
architecture 9–23
discovery
centralized architectures 248–280
reasoning
knowledge base (KB) 125–133
ontology 125–133
process 118–133
tasks 117–133
service 17–23, 191–216
composition 205
discovery 240–280, 281–316
discovery architectures 248–280
discovery open issues 266
matchmaking tools 264
registries 204
semantics
levels 6–23
service
discovery
heuristic classification 294
model 293
service-oriented architecture (SOA) 138–154
syntactic Web 1–23
T
taxonomy 8–23
thesaurus relationship types
associative 8
equivalence 8
hierarchical 8
homographic 8
U
Unicode 10–23
uniform resource locator (URL) 2, 10–23
universal resource identifier (URI) 10–23
W
Web 1–23
Ontology Language (OWL) 3–23
Semantic 1–23
architecture 9–23
services 17–23
server 1–23
services 17–23
syntactic 1–23
Web service 134–154, 217–239, 287
abstraction 289
authorization 147–154
description technologies 244–280
developing 149–154
discovery 243–280
matching methods 245–280
shortcomings and limitations 245–280
messaging 148–154
nonsemantic description 224–239
policy 144–154
reference architecture 243–280
security 144–154
semantic annotation
descriptions 227–239
purpose 221
use case 220
architecture 220
development process 220
up-and-running Web shop 221
semantic annotation
process 233–239
standards 141–154
Web service modeling ontology (WSMO) 281–316
basics 283
top level elements 284
Web services
semantic annotation 217–239
World Wide Web (WWW) 1
X
XML Schema 40