Design Patterns in Application Integration
Design patterns
in application integration
based on messages
by
Warsaw, 2007
Abstract
This thesis is devoted to the issues connected with application integration. It
presents the importance of knowing the design patterns connected with this subject.
The thesis first introduces the reader to the area of application integration and
the different types of problems connected with it. It explains why the task of
integration is undertaken and why it can be very difficult and complicated. Next,
the integration styles are presented, starting with the oldest and simplest ones,
going through more complex ones and ending with integration based on messages,
which forms the main area of interest of this thesis. After familiarising the
reader with the basics of this integration approach, the thesis provides the
essential theoretical background, which covers the basic terms and design patterns
connected with integration based on messages. Afterwards, the practical application
of the discussed terms is shown in a case study describing system integration
issues. The last part of this thesis is dedicated to the integration platform that
has been created as an integral part of the thesis. The description of this
platform contains information about the technologies used, the application
architecture and an example of its usage based on the case study presented earlier.
Contents
1 Introduction
1.1 Loose Coupling
1.2 Case study
2 Integration Styles
2.1 Application integration
2.2 Application coupling
2.3 Integration simplicity
2.4 Integration technology
2.5 Data format
2.6 Data timeliness
2.7 Data or functionality
2.8 Asynchronicity
2.9 Styles of integration
2.9.1 File Transfer
2.9.2 Shared Database
2.9.3 Remote Procedure Invocation
2.9.4 Messaging
7 Implementation
7.1 The origin of the name
7.2 Concept
7.3 Technology
7.4 Classification
7.5 Architecture
7.6 Processing sequence
7.7 Configuration
7.7.1 Configuration model
7.7.2 Configuration example
7.7.3 More on transformers and routers
7.7.4 Performing the configuration
7.8 Integration design patterns supported by pESB
7.9 Problems encountered during the implementation
8 Summary
Bibliography
List of Figures
List of Tables
Index
Acknowledgements
We would like to thank Dr Piotr Habela, who conducted a class on software
integration (Technologie Internetu), which inspired us to seek more information
about this subject.
I, Adam Siemion, would also like to thank Remigiusz Weska, with whom I had
the pleasure of working, and the company IMPAQ Sp. z o.o., where we both
worked on a project that aimed to choose the best messaging-based
integration solution to fulfil the requirements of our customer.
That experience also motivated me to delve into the subject of integration.
Preface
that would satisfy all of the participants can be a demanding and difficult task
in itself. Apart from this aspect, the designers must also concentrate on the
functionality, flexibility and reliability of the solution, which makes this task
even more difficult.
After the integration project is completed, the appropriate tool must be chosen
to put the whole solution in motion. The IT market offers many solutions that are
specifically designed to solve this type of problem (i.e. building integration
solutions). We can choose from both open source solutions, such as Mule or
ServiceMix, and proprietary products from IBM, Oracle, Sonic Software, BEA and
so on.
Upon taking a closer look at those products and considering the problems and
challenges of the application integration topic, we decided to make it our object
of interest and the topic of this thesis.
Having a limited amount of time and resources, we did not aim to create
a solution that could compete with those made by large IT companies. Instead,
we decided to take a different approach: to create a lightweight integration
platform. It would provide the basic functionality needed for application
integration, combined with ease of use and a gentle learning curve, so that
a potential user with programming knowledge would be able to create an
integration solution effectively without having to sacrifice a large amount of
time to learn the software itself. The simplicity of our platform, in comparison
to the large and complicated tools offered by large IT companies, would be its
greatest strength. We aimed to create a tool, based on widely available
technologies, that could be used as a base for further development by adding
extensions to it.
The work itself has been structured in such a way that a reader would get
an overview of the whole topic of integration and especially messaging systems
before moving forward to the description of the created integration solution itself.
The thesis is divided into two main parts. Part one contains chapters one,
two, three and four, which describe the basic theory behind the subject of
integration. Part two introduces the reader to the integration solutions
currently available on the market, such as Message Oriented Middleware (MOM) and
Enterprise Service Bus (ESB), which rely heavily on the concepts described in
part one, and presents a case study that shows the terms and concepts from the
first part of this thesis from a practical point of view. Part two contains
chapters five, six and seven. The eighth and last chapter contains the summary
of our work.
this integration style. The description will cover two different concepts of
communication using this integration style, the synchronous and the
asynchronous one, and basic concepts directly connected with this integration
style: Message, Message Channel, Message Router, etc.
4. Chapter four covers the concept of design patterns in messaging systems.
It describes several selected design patterns, together with possible ways
of using them and the variants that can be applied in different situations.
5. Part two begins with chapter five, which introduces the concept of an
Enterprise Service Bus (ESB). First, we briefly define what an ESB is; then
we focus on the basis of the ESB, Message Oriented Middleware (MOM), on the
advantages of introducing an ESB and on its capabilities; finally, we provide
a couple of ESB integration patterns.
6. Chapter six presents a case study, which aims to present the usage of the
concepts from the previous chapters in a real life business example. It starts
with a general overview of the problem and goes through all phases of the
integration process up to the final solution.
8. Finally, chapter eight contains the summary of our work, with final remarks
regarding the goals that we managed to achieve and the ones that have not
been achieved, along with possible reasons why. It also contains suggestions
about possible ways of further development and some ideas for extensions
that could make the existing tool more usable and enrich it with new
functionality.
Chapter 1
Introduction
• cost reduction
• increased effectiveness
and data exchange have not been taken into consideration during the design of
this kind of system.
What makes the process of integration even more difficult is the fact that:
of view, it might be considered the best one to use. Moreover, this measure
emphasises the main idea of loose coupling: the integrated systems should
make as few assumptions as possible about each other. The fewer assumptions
are made, the more changes can be made inside a connected system without
affecting the operation of the others.
The opposite of loose coupling is tight coupling. Tight coupling can be
illustrated using local method invocation as an example. A local method call
imposes a lot of assumptions on the caller, which are as follows:
The main advantage of this approach is that it is easier for developers, who
are used to invoking local methods, to start using these technologies. As
a consequence, using those techniques may lead to making the same assumptions as
in the case of local method invocation. However, it should be kept in mind that
while those assumptions are valid in a local environment, many of them are not
valid in the case of remote calls (making them valid, if possible at all, would
greatly restrict the flexibility of the integration solution).
When integrating applications, it is usually not desirable for a calling
application to wait until the results of the remote processing are available.
Such waiting, at best, might lead to delays that are very often not acceptable
to the business entities taking part in the integration process. Moreover, in
case of a communication failure (e.g. due to loss of connectivity) an
application can get suspended waiting for the response. This can lead to an
application crash. The processing time of a remote call is also much longer than
in the case of a local call, and the call as a whole is far less reliable. When
a remote call is made, it cannot be assumed, as in the case of a local call,
that a response will be received, because there might be a communication
failure, a crash of the remote system, and so on.
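The risk of a caller being suspended by an unresponsive remote system can be
illustrated with a minimal sketch in Python. Everything in it is an assumption
made for illustration: the function name call_with_timeout, the timeout value
and the local socket that merely stands in for a remote system that accepts
a connection but never replies.

```python
import socket


def call_with_timeout(host, port, payload, timeout_seconds):
    """Send payload and wait for a reply, giving up after timeout_seconds."""
    with socket.create_connection((host, port), timeout=timeout_seconds) as conn:
        conn.sendall(payload)
        try:
            return conn.recv(1024)
        except socket.timeout:
            # No reply arrived in time; the caller decides how to proceed.
            return None


# Stand-in for an unresponsive remote system: it listens but never answers.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
host, port = server.getsockname()

reply = call_with_timeout(host, port, b"ping", 0.2)
server.close()
print("reply:", reply)
```

Without the timeout the call to recv would block indefinitely, which is the
suspended state described above. A timeout only bounds the waiting time; it
does not tell the caller whether the request was processed.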
One of the assumptions made in the case of tight coupling is that both the
called and the calling method are written in the same programming language. This
assumption significantly reduces the scope of possible integration scenarios,
because it is only possible to integrate applications written in the same
programming language (e.g. it is not possible to integrate applications written
in Java and C# using Java RMI). This restriction greatly reduces the flexibility
and the range of applications of the technologies mentioned above. What is more,
it makes it impossible to integrate newly written systems with legacy systems
written in other languages. As can easily be seen, this approach is also
burdened with problems that do not occur in the case of tightly coupled local
calls. An example showing what problems can appear while trying to integrate
systems with tightly coupled dependencies can be found in [6], along with
a detailed description of this approach and the problems that it might cause.
Loose coupling, apart from being a popular term, is also one of the core
concepts of application integration. By making integrated systems less dependent
on things such as the programming language in which they are written, their data
model, internal business logic and architecture, they become more flexible and
change tolerant. This approach allows one application to be modified, up to
a point, without negative effects on the communication with the other system.
This assures flexibility, which cannot be achieved in the case of tight coupling
due to the restrictions mentioned earlier.
Apart from benefits such as flexibility, scalability and higher tolerance for
internal changes, loose coupling also has some disadvantages. Designing a loosely
coupled integration solution is a more complex task than designing a tightly
coupled one. A lot of new problems need to be solved in order to effectively perform
• transfer title
• transfer amount
The sample source code snippet, written in C#, could look like this:
String hostName = "www.mybankingapp.com";
int port = 8080;
socket.Close();
The above source code excerpt first initiates the connection to the banking
system, then sets the amount of money that will be transferred to the
destination account, along with the transfer title and the destination account
number, and finally sends that information as a byte stream. Of course, in real
life this method would be much more sophisticated, but the goal here is to show
the general concept, not to write a complete business solution.
The communication solution presented above is quite straightforward and
simple. It does not require the usage of any sophisticated integration software.
But this solution carries hidden problems that can be very hard to track and
repair.
In many books about network programming the above solution would be presented
as one that enables the client (presented above) to communicate with the server
regardless of the operating system and programming language these two systems
are using. This is not completely true, as will be explained later.
In order to obtain the data that will be sent to the banking application, the
transfer amount, transfer title and destination account number are converted to
arrays of bytes. Then each of them is sent to the destination. The BitConverter
class is used to convert the transfer amount to an array of bytes. The
conversion made by this class is performed using the internal memory
representation of the given data type (an integer in this case). .NET uses
a 32-bit integer type, and this type is used here to convert the integer to an
array of bytes. Other systems might use not a 32-bit representation but, for
example, a 64-bit one. A system using a 64-bit representation will read not 32
but 64 bits from the incoming byte stream. What does this mean in the case of
our example? If the destination system uses the 64-bit integer representation,
it will read not only the 4 bytes of the transfer amount but also 4 adjacent
bytes of the transfer title, and try to interpret the whole 8 bytes as an
integer. This difference in data types would cause a different amount of money
to be transferred than the user had intended! Apart from that, for the same
reason the destination account number would be different from the one given by
the user. Such behaviour is undesirable, to say the least, and could have
disastrous effects both for the bank and for the client.
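The effect of mismatched integer widths can be reproduced with a short sketch.
It is written in Python with the standard struct module rather than the C#
BitConverter class used in the thesis, and the concrete values, the 4-byte
title and the field order (amount first, title directly after it) are
assumptions made only for this illustration.

```python
import struct

# Hypothetical values; the field layout is an assumption for illustration.
amount = 1000
title = b"Rent"  # exactly 4 bytes, sent directly after the amount

# Sender: a 32-bit little-endian signed integer followed by the title bytes.
stream = struct.pack("<i", amount) + title

# A receiver that agrees on the width reads 4 bytes and recovers the amount.
(amount_32,) = struct.unpack("<i", stream[:4])

# A receiver that assumes a 64-bit integer consumes 8 bytes: the amount plus
# the 4 title bytes next to it, interpreted together as a single number.
(amount_64,) = struct.unpack("<q", stream[:8])

print("read as 32-bit:", amount_32)
print("read as 64-bit:", amount_64)
```

The second receiver not only obtains a wrong amount; it has also consumed the
title bytes, so every field read after it is shifted as well.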
Moreover, the client and bank computer systems may use different formats to
store numbers. One of them may use the big-endian format, which stores numbers
starting with the most significant byte, while the other one may use the
little-endian format, which stores numbers starting with the least significant
byte. This will also cause a difference in the transfer amount!
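The byte-order problem can be made concrete with a small sketch in Python
using the standard struct module. The amount is a hypothetical value, and the
32-bit signed integer width is an assumption carried over from the example
above.

```python
import struct

amount = 1000  # hypothetical transfer amount

big = struct.pack(">i", amount)     # most significant byte first
little = struct.pack("<i", amount)  # least significant byte first

# A little-endian sender whose bytes are read by a big-endian receiver.
(misread,) = struct.unpack(">i", little)

# Agreeing on an explicit byte order ("!" is network order, i.e. big-endian)
# on both sides removes the ambiguity.
(agreed,) = struct.unpack("!i", struct.pack("!i", amount))

print(big.hex(), little.hex())
print("misread:", misread, "agreed:", agreed)
```

The same four bytes yield two different numbers depending on the order the
reader assumes, which is why a wire format has to state its byte order
explicitly.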
At least two assumptions must be made about the integrated systems in order
for the above solution to work properly: first, that both of them use the same
data types, and second, that both of them use the same internal number storage
format. However, this is not the end of the restrictions imposed by this
approach.
Upon closer examination of the above source code, a couple of things can be
spotted. First of all, the connection information has been written directly in
the source code. Any change concerning this data, like changing the destination
host name or adding an alternate destination address, would require altering the
source code. For those changes to take effect, the whole application would have
to be recompiled and redeployed. In the case of a simple application this might
not appear to be a difficulty, but for more complex, critical applications such
a way of making changes can become a very serious issue. It would significantly
increase the cost and time needed to introduce even the simplest change to the
application. The above source code should be written in such a way that changes
can be made to it in the most efficient way possible (efficiency in this case
covers both time and cost).
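One common way to avoid recompiling for such changes is to move the connection
details into configuration. The following is a minimal sketch in Python using
the standard configparser module; the section name "banking", the key names and
the function load_endpoint are assumptions made for illustration, while the
host and port values are the ones from the snippet above.

```python
import configparser

# Hypothetical configuration text; in practice it would live in a file
# deployed next to the application and be loaded with parser.read(path).
CONFIG_TEXT = """
[banking]
host = www.mybankingapp.com
port = 8080
"""


def load_endpoint(text):
    """Return (host, port) parsed from INI-style configuration text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["banking"]
    return section["host"], section.getint("port")


host, port = load_endpoint(CONFIG_TEXT)
print(host, port)
```

With this arrangement a changed host name requires editing the configuration
only. The client is still coupled to one concrete address, however, so this
removes the recompilation cost but not the location dependency itself.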
Furthermore, the client-server model assumes that both the server and the
client are connected to the network at the same time. If one of the
participants is currently not available, because of network problems, excessive
network traffic, connection link problems, etc., then the connection cannot be
established and the data cannot be exchanged.
As has already been mentioned, in the presented solution any change to the
application requires changes within the application code itself. Those changes
would have to be made each time the destination of the request changed (that
would involve changing the code, recompiling it and redeploying the
application).
The same way of introducing changes (changing the source code, recompiling it
and redeploying) would also have to be followed if there were a need to change
the number of parameters sent to the banking application. But this time the
changes would have to take place in both applications, because the banking
application needs to know in advance the exact structure of the request so that
it can parse and process it correctly.
This example shows what a tightly coupled solution could look like and what
assumptions must be made in order for it to work properly. To sum up, those
assumptions are as follows:
1. Both client and host system must use the same internal number format
representation and use the same data types.
2. Both client and host applications must be working and connected to the
network at the same time in order to exchange data.
3. A client application must know the host location during the coding phase
and this location cannot be changed without updating the application code
itself.
4. A host application must know the exact number and type of the request
parameters during the coding phase in order to process and interpret them
correctly. Changes can only be made by changing the host application’s
code itself.
As has been shown, a lot of assumptions must be made about the applications in
order to make them communicate correctly. Therefore, the presented solution can
be classified as tightly coupled, and it is a good illustration of the
restrictions and limitations of this approach. In order to make this solution
more flexible and less restricted, it should be designed as loosely coupled.
Redesigning it to achieve that goal would mean removing the restrictions that
limit its flexibility. That goal would be achieved if the conditions listed
earlier no longer had to hold in order for the systems to be fully functional.
The first step on the way to decoupling the previous solution would be defining
a platform-independent data format, which would be resistant to issues connected
with different internal number representations and the usage of different data
types. XML can be a solution to this problem: it can be used to define the
request description format, and requests would then be sent as XML documents.
The destination host could parse such a document and extract all the
information necessary to process the request.
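A request expressed as an XML document could be built and parsed along the
following lines. This is a sketch in Python using the standard
xml.etree.ElementTree module; the element names (transferRequest,
destinationAccount, title, amount) and the sample values are assumptions, since
the thesis does not define a concrete schema at this point.

```python
import xml.etree.ElementTree as ET


def build_request(account, title, amount):
    """Serialise a transfer request to a UTF-8 encoded XML document."""
    root = ET.Element("transferRequest")
    ET.SubElement(root, "destinationAccount").text = account
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "amount").text = str(amount)
    return ET.tostring(root, encoding="utf-8")


def parse_request(document):
    """Extract the request fields from an XML document."""
    root = ET.fromstring(document)
    return {
        "account": root.findtext("destinationAccount"),
        "title": root.findtext("title"),
        # The amount travels as text, so integer width and byte order of
        # either platform no longer matter.
        "amount": int(root.findtext("amount")),
    }


document = build_request("12345678", "Rent", 1000)
request = parse_request(document)
print(document.decode("utf-8"))
print(request)
```

Because every value is transmitted as named text, the receiver no longer
depends on the sender's in-memory representation or on the order of the
fields.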
In order to remove the restrictions caused by the assumptions concerning the
host location and communication issues, the Message Channel design pattern (3.3)
can be introduced. A Message Channel is a logical address, which both the client
and the host application use to communicate. The application only has to be able
to connect to the channel, not directly to the server. This resolves the
location issue: there is no longer any need to know the connection details of
the host application, because the channel is used to communicate.
If the channel is able to store the requests in the form of a request queue,
then the necessity for both systems to be connected at the same time is
eliminated. Every request is stored in the channel until the destination system
fetches it. The response delivery works the same way. Thanks to that, the
systems are able to communicate without the requirement for both of them to be
online at the same time.
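The queuing behaviour of such a channel can be sketched in a few lines of
Python. The class below is an in-memory stand-in built on the standard queue
module, not the API of any messaging product; the names MessageChannel, send
and receive are assumptions made for illustration.

```python
import queue


class MessageChannel:
    """In-memory stand-in for a channel that queues messages."""

    def __init__(self):
        self._messages = queue.Queue()

    def send(self, message):
        """Store the message until a receiver asks for it."""
        self._messages.put(message)

    def receive(self):
        """Return the oldest waiting message, or None if there is none."""
        try:
            return self._messages.get_nowait()
        except queue.Empty:
            return None


requests = MessageChannel()

# The client sends while the host application is not running at all.
requests.send("transfer 1")
requests.send("transfer 2")

# Later the host comes online and drains the channel in arrival order.
received = []
while (message := requests.receive()) is not None:
    received.append(message)

print(received)
```

The client and the host never address each other; both only know the channel.
A real channel would additionally run as a separate process and store the
messages durably.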
Applying the above mechanisms to the given problem would change the solution
from a tightly coupled one into a loosely coupled one. With those mechanisms
a solution far more flexible than the previous one can be obtained, free of all
the restrictions of the first approach. From now on the client and the host
application can be developed simultaneously, independently of each other.
Changes made in one participant would not require altering the other one.
However, loosely coupled solutions also have disadvantages. The main drawback
is that the solution becomes much more sophisticated and complicated. This
means that it would take many more lines of code to implement, and it would not
be as simple and straightforward. The process of debugging and testing also
becomes more complex. It is very important to keep that in mind when choosing
between the loose and the tight coupling model.
The above example forms a good illustration of the differences between the
tight and loose coupling. It shows what both of them have to offer and what
their advantages and drawbacks are.
Chapter 2
Integration Styles
• Application integration
• Application coupling
• Integration simplicity
• Integration technology
• Data format
• Data timeliness
• Data or functionality
• Asynchronicity
The integration task at hand should be examined and analysed based on those
criteria. The conclusions resulting from such an analysis can be very helpful
later on. Based on those conclusions, a decision can be made on which
integration style will be the most suitable for the given situation. Before
moving on to the description of those styles, a brief overview of each of the
mentioned criteria will be given. This will give the reader a better
understanding of those criteria and enable them to be applied properly.
the data sender. In the case of the synchronous communication scenario, short
data exchange time is essential to prevent delays in the application processing.
Moreover, long delays may cause another problem. The data, while being
transferred to the destination, may become stale. In this situation the
processing of this data can lead to errors that may have serious consequences
for the business. This issue, named the data timeliness issue, is especially
important in the case of applications that deal with volatile data, which
changes very frequently within short periods of time. Delays are not the only
threat connected with this issue. A so-called deadlock situation may also occur
when the sender is waiting to receive the result of the processing from the
receiver and the receiver application is out of order, because it has just
crashed or because, for some other reason, it cannot currently process the
sender's request. In that case the whole system is suspended and cannot perform
its activities.
The issues mentioned make the integration process even more complex and should
be taken into consideration when choosing the most appropriate integration
technique. The solution that provides the shortest latency in the given
circumstances should be chosen. This should help prevent communication and
processing deadlocks and errors caused by stale data.
2.8 Asynchronicity
Another issue that should be thought through while designing an integration
solution is the way in which the integrated applications will communicate. There
are two possibilities:
• synchronous communication
• asynchronous communication
results from the remote system. When the results finally come, the sender system
is notified; it then postpones its current activities and processes the response
from the remote system. An asynchronous call for which the sender does not need
the results of the request at all is named "Fire and Forget" and is a design
pattern [5] used in application integration.
Synchronous communication is simpler to design and implement, but reduces the
overall performance of the system. It may also cause deadlocks (see the section
"Data timeliness" (2.6) above) and increase the time spent by the system in the
idle state, waiting for the response from the other system. Asynchronous
communication, on the other hand, is more effective and offers greater
performance at the cost of greater design and implementation complexity.
Synchronous communication is more suitable in cases when there is no need to
create a complex solution, when the messages being sent are small and the
processing time is negligible. In that case the delays caused by waiting for the
response will not affect the overall performance of the application to
a noticeable degree. A synchronous model of communication is also necessary when
an application needs to maintain the request-response interaction model in order
to provide the desired functionality (e.g. web browsers, online chats, etc.).
Asynchronous communication can be used when the sender does not expect the
response to arrive right away. This situation occurs very frequently in real
life. For example, after filling in a visa application form, we do not expect
the embassy to examine our application before we leave the building. Nor do we
wait inside the embassy for the decision. Instead, we can continue with our
lives. Similarly, after posting a letter at the post office, we do not expect
the post office to deliver the letter before we leave. Those situations are very
similar to the way the asynchronous model works. They show that this method of
communication is very common in real life, so it is natural for it to be
available in computer software as well.
Asynchronous communication might be more efficient, especially when there is
no room for delays caused by waiting for a reply to the request. An asynchronous
application, instead of waiting for the response as a synchronous one does, can
continue its processing. Although this requires solving some additional issues,
such as the ability to process the data received in a response to a request sent
earlier, which complicates the design of an asynchronous application, it may
significantly improve its performance.
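The idea of continuing local processing while a request is outstanding can be
sketched in Python with the standard concurrent.futures module. The function
remote_system, its artificial delay and the request text are hypothetical
stand-ins for a slow remote application.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_system(request):
    """Stand-in for a slow remote system (hypothetical)."""
    time.sleep(0.1)
    return f"done: {request}"


with ThreadPoolExecutor(max_workers=1) as pool:
    # Send the request without blocking; the Future represents the reply
    # that has not arrived yet.
    pending = pool.submit(remote_system, "transfer 1")

    # The sender carries on with its own processing in the meantime.
    local_result = sum(range(1000))

    # Later the reply is picked up and matched to the request it belongs to.
    reply = pending.result(timeout=5)

print(local_result, reply)
```

Here the Future object does the bookkeeping that ties a reply to its request.
In a messaging system the sender would have to do this itself, which is the
additional design effort mentioned above.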
Moreover, asynchronous communication can be more reliable than the synchronous
one. Because the asynchronous model usually involves queues, which can store
every received message persistently, a message accepted by the queue is not lost
if the receiver is unavailable. Even if the receiver system is currently not
operating, the queue will store all messages destined for it, and when the
system is online again it will fetch all of them.
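Persistent storage of queued messages can be illustrated with a minimal sketch
in Python that keeps the messages in an SQLite file through the standard
sqlite3 module. This is an assumption made for illustration, not the storage
mechanism of any particular messaging product, and the class name DurableQueue
is hypothetical.

```python
import os
import sqlite3
import tempfile


class DurableQueue:
    """Minimal file-backed FIFO queue; messages survive a restart."""

    def __init__(self, path):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS messages "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self._db.commit()

    def put(self, body):
        self._db.execute("INSERT INTO messages (body) VALUES (?)", (body,))
        self._db.commit()

    def get(self):
        """Remove and return the oldest message, or None if the queue is empty."""
        row = self._db.execute(
            "SELECT id, body FROM messages ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self._db.execute("DELETE FROM messages WHERE id = ?", (row[0],))
        self._db.commit()
        return row[1]

    def close(self):
        self._db.close()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "queue.db")

    sender_side = DurableQueue(path)
    sender_side.put("transfer 1")
    sender_side.put("transfer 2")
    sender_side.close()  # simulates the queue process stopping

    receiver_side = DurableQueue(path)  # reopened later from the same file
    received = [receiver_side.get(), receiver_side.get(), receiver_side.get()]
    receiver_side.close()

print(received)
```

The messages written before the queue was closed are still there when it is
reopened. A production queue would also need to handle concurrent consumers and
acknowledge a message only after it has been processed successfully.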
Another advantage of the asynchronous model is that it makes it possible to
create systems more resistant to high loads. The two models differ in the way
they behave during high traffic. In such a situation a synchronous application
would not be able to provide a service to all clients: some of them would get an
error, some of them would get no response at all, and finally the whole
application could even become inaccessible. An asynchronous application, on the
other hand, would on average take longer to process each request, because it
would have a lot of requests waiting in the queue, but sooner or later each of
them would be processed and no request would be left without a response.
The trade-offs between the synchronous and the asynchronous model are summarised
in Table 2.1.
• File Transfer
• Shared Database
• Remote Procedure Invocation
• Messaging
Each of those techniques has been developed to handle the same task:
application integration. Although the task remains the same, the approach
represented by each of them is different. Each of them is more sophisticated
than its predecessor (e.g. the Shared Database is a more complicated solution
than the File Transfer). When faced with an application integration task, the
point is not to use the same technique in all cases, but to be flexible and,
based on the criteria described above, choose the most suitable style for the
given task. More than one style can be used to achieve the best final result. As
mentioned before, Messaging is the style on which this thesis concentrates, but
the other styles will also be briefly described to give a wider view of the
possibilities at hand.
applications. Because files exist on every operating system and almost every
programming language supports file operations, files are a very universal
solution for the purpose of information exchange. Also, as using files does not
require any additional integration tools and as they are already available, they
might seem to be an obvious solution, but they do have a lot of disadvantages.
What is required in order to integrate applications using this technique is
an agreement on the format of the file used to exchange data. In most cases two
special components (Figure 2.1), usually designed by the integration team, are
created for that purpose:
• Export - component responsible for putting the data from Application A
into the file (according to the file format)
• Import - component responsible for reading and parsing the data from the
file (according to the file format) and inserting that data into
Application B
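The two components can be sketched as a pair of functions that share nothing
but the agreed file format. The example below is in Python and uses the
standard csv module; the CSV format, the field names and the sample record are
assumptions chosen for illustration, and an in-memory buffer stands in for the
shared file.

```python
import csv
import io

# The agreed file format: a CSV file with these columns (an assumption).
FIELDS = ["account", "title", "amount"]


def export_records(records, stream):
    """Export component: write Application A's data in the agreed format."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


def import_records(stream):
    """Import component: parse the file into Application B's representation."""
    return [
        {"account": row["account"], "title": row["title"], "amount": int(row["amount"])}
        for row in csv.DictReader(stream)
    ]


# An in-memory buffer stands in for the shared file.
shared_file = io.StringIO()
export_records([{"account": "12345678", "title": "Rent", "amount": 1000}], shared_file)

shared_file.seek(0)
imported = import_records(shared_file)
print(imported)
```

Neither function knows anything about the other application's internals. The
only contract between them is the list of columns and the way values are
encoded as text.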
Apart from the file format, other arrangements must be made. A naming
convention for the files has to be agreed on, so that the file names remain
unique (it should be impossible for two separate files to have the same name).
This is very important in order to avoid name conflicts and situations in which
a file with old data would be processed.
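One possible naming convention combines a creation timestamp with a random
identifier. The sketch below is in Python; the prefix, the extension and the
exact pattern are assumptions made for illustration, not a convention
prescribed by the thesis.

```python
import uuid
from datetime import datetime, timezone


def unique_file_name(prefix="transfers", extension="csv"):
    """Build a file name from a UTC timestamp and a random identifier."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{prefix}_{stamp}_{uuid.uuid4().hex}.{extension}"


first = unique_file_name()
second = unique_file_name()
print(first)
print(second)
```

The timestamp lets the importing application process files in order of
creation and recognise old ones, while the random part keeps two files created
within the same second from colliding.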
Another important issue is to decide when the files will be written and read.
Creating and processing such a file too often will burden an application
unnecessarily. Usually some fixed time periods are set based on the business
activity cycles, e.g. files can be created on a daily or weekly basis. Based on
those time periods the recipient application (Application B) checks if there is
a new file available to process. If the time period is too long, the
applications could become desynchronised and errors in data processing could
arise, because by the time the data in a shared file is consumed by the second
application, it could have become stale.
When a file is created and data is being written to it, a locking mechanism
must be used in order to make sure that the other application does not try to
access the same file. This issue is also very important and should be taken care
of in order to prevent errors while reading data from a file (e.g. an unexpected
end of file).
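A frequently used alternative to an explicit lock is to write the data under
a temporary name and rename the file only once it is complete, so that the
reader never sees a partially written file. The sketch below shows this
swapped-in technique in Python, not a locking mechanism as such; the function
name publish_file and the ".tmp" suffix are assumptions, and the importing
application is assumed to ignore files with that suffix.

```python
import os
import tempfile


def publish_file(directory, name, data):
    """Write data to a temporary file, then rename it into place."""
    descriptor, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(descriptor, "w", encoding="utf-8") as handle:
            handle.write(data)
        final_path = os.path.join(directory, name)
        # The rename is a single step, so the final name never refers to
        # a half-written file.
        os.replace(temp_path, final_path)
    except BaseException:
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
    return final_path


with tempfile.TemporaryDirectory() as exchange_dir:
    path = publish_file(exchange_dir, "transfers.csv", "account,title,amount\n")
    with open(path, encoding="utf-8") as handle:
        content = handle.read()
    leftovers = [entry for entry in os.listdir(exchange_dir) if entry.endswith(".tmp")]

print(content, leftovers)
```

The temporary file is created in the same directory as the final one so that
the rename stays within one file system.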
Applications using the File Transfer technique can be modified without
affecting each other, because the components responsible for the export and the
import are separated from the applications themselves. Also, because the
applications only need to access the file containing the exchanged data, no
knowledge about the internal processing performed in each of them is required
(such as method return types, method names, the number and types of parameters
passed to methods and so on).
One of the main disadvantages of this technique is that data is synchronised in
batch mode. There might be situations when data processed by Application B is
no longer valid, but because synchronisation takes place periodically rather
than in real time, Application B is not aware of that fact until the next
synchronisation. This rules out this type of integration in certain situations,
e.g. checking the current bank account balance.
• how error situations are handled: is an exception thrown or is some
negative value returned?
2.9.4 Messaging
Messaging (Figure 2.4) is considered the integration style that involves the
fewest assumptions about other parties and hence the most promising one for the
integration task. At the same time, it is the most sophisticated technique that
can be used to solve the integration problem. It can be said that this approach
combines the features of the previous styles. Just like File Transfer it allows
the applications to be loosely coupled (sent messages can be transformed in
order to comply with the format expected by the receiver, without the sender
and the receiver being aware of the transformation itself), but it is also free
of its weakness, i.e. a high frequency of changes does not cause
desynchronisation of the integrated applications and processing of stale data
by one of them.
Messaging enables quicker data exchange and collaboration between integrated
applications. In contrast to the Shared Database approach it does not couple
applications to one database. The Shared Database also does not cope well with
very frequent data changes, especially if the data is shared between
applications placed in different locations, while Messaging is free of this
problem. The usage of Remote Procedure Invocation forces the developer to make
many assumptions about the applications and as a result couples them tightly.
What is more, the semantics and syntax of those invocations can be misleading,
i.e. they cause the developer to think about remote invocations in the same way
as he/she thinks about local invocations. That way of thinking may lead to slow
and inefficient solutions. Messaging gives the means to transfer data in a quick
and efficient way (a large number of small data units), with the receiver
application being notified automatically when more data is waiting to be
processed.
Messaging also provides a retry mechanism in order to assure the delivery of
the sent data. Applications integrated using this technique do not need to use
the same unified data structure and are not forced to make as many assumptions
about each other as in the case of Remote Procedure Invocation. Messaging also
offers asynchronous data transfer, which means that the sender does not have to
wait for the results in order to continue its processing. It also does not
require both systems to be operational at the same time in order to pass data
from the sender to the receiver. More about the asynchronous method of
communication can be found in the earlier section "Asynchronicity" (2.8).
Chapter 3
Messaging was one of the integration techniques briefly described in the
previous chapter. In the current chapter this description will be broadened and
detailed. As mentioned before, messaging is the most sophisticated integration
style; in exchange for its higher complexity it provides strong decoupling,
asynchronous communication between integrated applications and other features
that make it the most flexible of all the solutions described in the previous
chapter.
The previous chapter covered messaging in comparison to the other three
integration techniques. This chapter will describe the concept of messaging,
key terms connected with this topic and the mechanism by which messaging based
solutions work. Before going deeper into the description of this technique, an
understanding of the basic messaging terms and concepts (such as channel,
message, routing, transformation, endpoint, synchronous and asynchronous
communication) is needed.
3.1 Message
In order to transmit data, the sender must first marshal it into a byte form;
the receiver then unmarshals it so that it has its own local copy. During the
transmission data is wrapped into a Message (Figure 3.1). Each Message forms an
indivisible entity: it cannot be split into parts. It is the data record that
can be transmitted and read by the messaging system. In order to communicate,
the sender’s application must transform the data being transmitted into one or
more messages and then send those messages to the receiver. The receiver
gathers these messages, extracts the data from them, merges it if the data has
been split into more than one Message, and finally processes it. Messaging
solutions guarantee delivery of the message to the receiver (it can be
repeatedly transmitted from the sender to the receiver until the transmission
succeeds).
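The marshaling, splitting and merging described above can be sketched as
follows. This is a minimal illustration under stated assumptions: JSON is used
as the byte form, and the fields sequence_id, position and total are
hypothetical names introduced here so that the receiver can put the parts back
together. Real messaging systems define their own message layout.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sequence_id: str  # identifies the data this part belongs to (assumed field)
    position: int     # index of this part within the sequence
    total: int        # number of parts the data was split into
    body: bytes       # opaque payload carried by the messaging system


def to_messages(data, sequence_id, max_body_size=16):
    """Sender: marshal the data into bytes and wrap it into one or more messages."""
    raw = json.dumps(data).encode("utf-8")
    chunks = [raw[i:i + max_body_size] for i in range(0, len(raw), max_body_size)]
    return [Message(sequence_id, i, len(chunks), c) for i, c in enumerate(chunks)]


def from_messages(messages):
    """Receiver: merge the parts in order and unmarshal a local copy of the data."""
    ordered = sorted(messages, key=lambda m: m.position)
    if len(ordered) != ordered[0].total:
        raise ValueError("not all parts have arrived yet")
    raw = b"".join(m.body for m in ordered)
    return json.loads(raw.decode("utf-8"))


order = {"order": 42, "items": ["pen", "ink"]}
parts = to_messages(order, "order-42")
# The parts may arrive in any order; here they are deliberately reversed.
restored = from_messages(reversed(parts))
print(len(parts), restored == order)
```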
A message is the smallest indivisible portion of data exchanged between
30 CHAPTER 3. MESSAGING BASED SYSTEMS
• Body — contains the data being transmitted, usually this part of the
message is treated as a black box by the messaging systems and sent between
the sender and the receiver as it is
Moreover, the message payload might contain a special, separate section called
Properties, which contains a list of key-value pairs defined by the sender of
a message.
The messaging system does not differentiate between the types of messages being
sent. The programmer, however, can choose among different types of messages.
Those types are as follows:
• Event Message — used to notify the receiver about some event that has
occurred on the sender’s machine
The concept of sending a stream of data divided into discrete parts is not
only used in messaging systems. It is also applied in the network protocols,
where data is grouped into discrete units of data, i.e. datagrams/packets in case
of the Internet Protocol (IP) and segments in case of the Transmission Control
Protocol (TCP).
1. JAVA JMS
2. .NET Messaging
3. SOAP
case of a communication error, how the message is being converted into a stream
of bytes and so on.
There are two types of channels:
The above division is based on the way the messages are distributed from the
sender to the receiver. Another division is based on the purpose of the
message channel:
messages using the same Message Channel. Quite often it is necessary to perform
some processing of the sent message before it is directed to its destination.
Messages sent by a single sender may require different processing while being
sent through the Message Channel. Different processing can be required
depending on the message origin, business rules, message type or some other
criteria. In order to assure this, each Filter component connected to the
channel has to know those rules. However, if the rules change, then all of the
components within the Message Channel also have to be changed so that they have
the updated rules. This would make any change to an existing solution very time
consuming and inefficient. Very often the components that would be used to
determine the further processing of the message cannot be changed, because it
would be too expensive, too time consuming or even impossible.
Moreover, in order to determine the further processing of the message
(e.g. to state, using business rules based on the message content, whether the
message is destined for this component or not) the component has to fetch the
message from the Message Channel. But after the message has been consumed, it
cannot simply be put back on the channel as it was before, because the
messaging system does not allow that.
In order to redirect a message depending on a set of conditions without
involving all the components participating in the message processing, a new
type of component has been introduced into messaging solutions. This component
is called a Message Router (Figure 3.3). The role of a router is to decide
where a particular message should be delivered based on a set of defined
business rules.
Other components using the messaging system are not aware of the router’s
existence, because it does not change the message content; it only redirects
messages to the proper channel. If the decision rules need to change, then only
the router component has to be changed; the other components remain unchanged.
A router is a single point where the decision concerning the further path of a
message is made. Therefore, in case of heavy traffic the routing component
might become a system bottleneck, but the likelihood of such a situation can be
significantly decreased by using several parallel routing components or by
improving the hardware used to run the system.
The Message Router needs to know the full list of possible message recipients
along with the rules that govern the routing process. The alternative solution,
which can be used when the list of final recipients changes frequently, is to
let each recipient decide whether to fetch the message from the queue or not.
This alternative can be built using Publish-Subscribe channels and Message
Filters; it is called reactive filtering, while using a routing component is
called proactive routing.
There are a few possible variants of a Message Router that can be used in
an integration solution:
• Fixed Router
This is the simplest variant. In this variant the router has one input and
one output channel defined. It does not perform routing as such, but is
used to decouple systems or pass messages between different integration
solutions. Most often routers of this type are combined with a Message
Translator or a Message Adapter in order to pass the message between
different integration solutions or different types of message channels.
• Content-Based Router
Routers of this type use the properties of the message, such as the type of
the message or the values of specified message fields, in order to determine
the message destination. It is the most commonly used router type.
• Context-Based Router
Routers of this type use information about the surrounding environment to
determine the message destination. They can be used to perform load balancing
or to change the message destination if the original recipient is not
responding. Context-Based Routers can be used to increase the flexibility and
reliability of the system in case of unexpected errors.
Routers can also be divided into two other groups: stateless and stateful.
A stateless router only considers the message that it has just received and
makes the routing decision based only on that single, current message.
A stateful router, on the other hand, also takes previous messages into account
in order to determine the destination of an incoming message. This feature
might be used to remove duplicated messages, for example.
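The difference between a stateless and a stateful router can be illustrated
with a short sketch. The class names, the dictionary-based message layout and
the channel names are assumptions made for this example only. The first class
is a stateless Content-Based Router; the second wraps it and remembers the
identifiers of earlier messages in order to drop duplicates.

```python
class ContentBasedRouter:
    """Stateless: the decision depends only on the current message."""

    def __init__(self, rules, default_channel):
        self.rules = rules  # list of (predicate, channel name) pairs
        self.default_channel = default_channel

    def route(self, message):
        for predicate, channel in self.rules:
            if predicate(message):
                return channel
        return self.default_channel


class DeduplicatingRouter:
    """Stateful: remembers identifiers of earlier messages to drop duplicates."""

    def __init__(self, inner):
        self.inner = inner
        self.seen_ids = set()

    def route(self, message):
        if message["id"] in self.seen_ids:
            return None  # duplicate, not forwarded to any channel
        self.seen_ids.add(message["id"])
        return self.inner.route(message)


rules = [(lambda m: m["type"] == "order", "orders")]
router = DeduplicatingRouter(ContentBasedRouter(rules, "other"))

first = router.route({"id": 1, "type": "order"})
repeat = router.route({"id": 1, "type": "order"})
other = router.route({"id": 2, "type": "invoice"})
print(first, repeat, other)
```

Because the routing rules live only inside the router, changing them does not
require any change in the senders or receivers.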
changes to the integrated applications can be very difficult or in some cases
even impossible. It may also cause some changes to the internal business logic
of the application, which is undesirable, because integrated applications
should be affected by the integration process as little as possible. Making
such changes would also contradict the idea of loose coupling described earlier
(1.1). After implementing that kind of change in both applications, they would
not be loosely coupled anymore. A change in the data format in one of them
would have to be reflected immediately in the other one, otherwise the
integration solution would not work as intended.
The simplest way to ensure that the data format of the arriving message
corresponds to the internal data format of the receiver’s application is to use
a separate component, which changes the message body to the appropriate
format. This component is called a Message Translator or a Message Transformer
(Figure 3.4). Using this component makes it possible to preserve the loose
coupling between applications. If the internal data format of any of the
integrated applications changes, only the component performing the
transformation has to be changed; the applications remain unaffected. This way
they do not depend on each other, and changes made in one of them do not force
changes in the others.
The transformation process itself can take place on many different levels of
data representation. It may refer to the names of the data fields, the data
representation in those fields, the data structure as a whole (different ways
of representing the data) and so on. Hohpe and Woolf [7] divide data
transformation into levels and organise them in a form similar to the ISO/OSI
model. This division is presented in Table 3.1.
As presented in the table, the levels of transformation are divided into four
layers:
• Transport
Transformations performed in the scope of the communication protocols
(Transport layer) enable data transfer between systems using different
communication protocols and ensure reliable message transfer between
those systems.
• Data Representation
The Data Representation layer performs the transformation concerning the
representation of the data. Transformation within the Transport layer operates
on the stream of bytes, while the Data Representation transformation operates
on the data representation (e.g. it changes the XML representation into a
name-value representation).
• Data Types
The Data Types layer performs the conversion of the data contained in the
message. The conversions include changing field names, changing the data types
of those fields, combining data from multiple fields into one or splitting
data from one field into two and so on. The goal of this transformation is
to make the data comply with the data model of the receiver’s application.
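A transformation on the Data Types level can be sketched as a single function.
The field names of both formats below are hypothetical and were chosen only to
show the kinds of conversion listed above: renaming fields, changing data types
and combining two fields into one.

```python
from datetime import datetime
from decimal import Decimal


def translate(source):
    """Convert a message from the sender's data model to the receiver's one."""
    return {
        # two sender fields combined into one receiver field
        "customer_name": f'{source["FirstName"]} {source["LastName"]}',
        # text amount converted to an integer number of cents
        "amount_cents": int(Decimal(source["Amount"]) * 100),
        # day/month/year text converted to an ISO 8601 date
        "created": datetime.strptime(source["Created"], "%d/%m/%Y").date().isoformat(),
    }


sender_message = {
    "FirstName": "Jan",
    "LastName": "Kowalski",
    "Amount": "12.50",
    "Created": "31/12/2007",
}
receiver_message = translate(sender_message)
print(receiver_message)
```

If the sender later changes its field names, only this function has to be
adapted; neither application needs to know the other's data model.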
Figure 3.6 summarises all of the terms explained so far. The application on
the left side (Application A) wants to send a Message to the application on the
right side of the figure (Application B). The steps of that communication
would look as follows:
3. The Message is directed to the Router, which decides where the message
should be delivered (let us assume that the message is delivered as
presented in the figure).
sender. Of course, the sender can resend the message after some period of time
to increase the chance that the receiver will finally receive it. But in the
case of some types of errors this will not assure a successful delivery. This
drawback can also lead to a loss of application performance and processing
speed. An application has to wait until it receives a response instead of
performing its usual activities. In the case of one application accepting
requests from several clients those drops in performance and processing speed
can become even greater and lead to deadlocks.
The main advantage of the synchronous approach is its simplicity. The
application does not have to use additional resources to monitor the whole
processing, as is the case with an asynchronous approach.
This simplified approach is sufficient for systems that do not send
sophisticated requests that would require a lot of processing from the message
receiver. A short processing time of requests reduces the time the sender
application has to wait for an answer and, as a result, does not affect the
overall performance of the senders in a noticeable way.
Asynchronous communication, in contrast to the synchronous one, is a much more
complicated concept. The main idea behind it is that the application, after
sending the request, does not wait for a response but continues its processing.
It requires a different approach to design and implementation. An application
using an asynchronous model cannot be designed as a sequence of method
invocations. It has to be designed in such a way that the remote functionality
is invoked without affecting the main application flow. A possible scenario
might look as follows:
2. If the receipt of the message has been confirmed by the messaging system,
Application A stores the information about the sent message, along with the
identifier assigned to it, in the database and switches to other operations.
5. Application A receives a new message with the response to its previously
sent request.
6. Application A looks up in the database the request identified by the
message identifier contained in the received message.
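The part of the scenario that is visible in the steps above (storing the
request under its identifier, continuing with other work and matching the
response later) can be sketched as follows. The sketch is a simplification
under stated assumptions: two in-memory queues stand in for the message
channels, a dictionary stands in for the database of Application A, and the
field name correlation_id is a hypothetical name for the message identifier
carried by the response.

```python
import queue
import uuid

request_channel = queue.Queue()
reply_channel = queue.Queue()
pending_requests = {}  # stands in for the database of Application A


def send_request(payload):
    """Application A: send the request, remember it and return immediately."""
    message_id = uuid.uuid4().hex
    request_channel.put({"id": message_id, "body": payload})
    pending_requests[message_id] = payload
    return message_id


def serve_one_request():
    """Application B: process one request and reply with the same identifier."""
    request = request_channel.get()
    reply_channel.put(
        {"correlation_id": request["id"], "body": request["body"].upper()}
    )


def handle_reply():
    """Application A: match the reply with the stored request."""
    reply = reply_channel.get()
    original = pending_requests.pop(reply["correlation_id"])
    return original, reply["body"]


sent_id = send_request("price of item 7")
# Application A is free to perform other operations here.
serve_one_request()
result = handle_reply()
print(result)
```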
Design pattern is a term that has been adopted from architecture into software
engineering [1]. It describes a well-known method of solving a commonly
occurring problem. In computer science a design pattern should not be perceived
as a ready-to-use solution, such as source code, but only as a template that
can be used in multiple situations. In this paragraph the usage of design
patterns in application integration will be discussed and the term design
pattern itself will be explained in more detail.
Designing and implementing an application can be a very complex task.
Applications may vary in the technologies used, working environment, performed
task, complexity, and so on. But very often the designers encounter the same
problems to solve. Design patterns are a set of proven ways of solving those
problems. They do not contain a ready solution that can be put into an
application to solve a problem; rather, they are templates that can be used to
solve the problem [4].
Design patterns also occur in messaging systems and application integration
solutions. Some of the design patterns described in this section derive from
the basic concepts of the messaging systems described earlier (3). Each of them
is an answer to a particular problem, for example, how to connect two
applications within the messaging system: by using the Message Channel.
As said before, design patterns are only general templates, not ready-to-use
solutions. The Message Channel pattern, for example, can be implemented in
different ways: as a Datatype Channel, a Point-to-Point Channel, a
Publish-Subscribe Channel and so on. Each of those channels performs a
different task, but all of them are based on the same Message Channel pattern
template.
As mentioned before, the Message Channel pattern can be applied in various
ways. The simplest usage of this pattern is the Point-to-Point Channel that
connects two different systems directly. If the aim is to deliver the message to
more than one receiver at the same time, the Publish-Subscribe Channel can be
used. This channel has one input (Publisher) channel and many outputs (Message
44 CHAPTER 4. DESIGN PATTERNS IN THE APPLICATION INTEGRATION
• Content-Based Router
The Content-Based Router (Figure 4.1) reads the message content and, based on
it and on the encoded routing rules, directs the message to the proper
recipient.
• Message Filter
The Message Filter (Figure 4.2) works in a similar way to the Content-Based
Router. It reads the message content and checks whether it matches the encoded
criteria. If it does, the filter sends the message further; if not, the
message is discarded.
• Dynamic Router
The Dynamic Router (Figure 4.3) is a more flexible variant of the Message
Router. It allows the routing rules to be modified by sending control messages
to a given port of the router. This makes it more flexible than a router with
fixed routing rules and allows the routing rules to be changed dynamically
when such a need arises. It can be useful when a new system is being
connected to the messaging solution and all routers in the system have to
be updated with new routing rules.
• Recipients List
The Recipients List (Figure 4.4) extends the functionality of the
Content-Based Router. It works in a similar way to the Publish-Subscribe
Channel: it inspects the incoming message, determines the list of message
recipients based on the message content, and then forwards the message to
those recipients. The list of recipients may vary depending on the message
content, and it can also be specified dynamically.
• Splitter
The Splitter (Figure 4.5) is used when an incoming message contains mul-
tiple elements, which cannot all be processed in the same way. In that case
the message is split into separate elements and each of them is sent inde-
pendently to an appropriate system to be processed. The Splitter produces
one message for each element contained in the incoming message (e.g. if
an incoming message contains order data with a list of ordered items, for
each item from the list a new message will be produced and published to
an appropriate channel).
• Aggregator
The Aggregator (Figure 4.6) works in the opposite way to the Splitter
described above. The Aggregator receives incoming messages and identifies
the ones that are correlated with each other. When the complete set of
correlated messages has been received, it aggregates those messages: it
collects the information from each of the correlated messages and publishes a
new, single message containing all of the collected information.
• Routing Slip
The Routing Slip (Figure 4.7) makes it possible to determine the whole
processing path for every message. Each incoming message has a routing slip
attached to it, specifying the sequence of processing steps for this
particular message. Every processing component is wrapped in a special router
that reads the routing slip attached to the incoming message and forwards
the message to the next processing step from the routing slip. This way,
a whole processing chain can be composed and managed from one location.
Moreover, a Routing Slip for a new type of message can be defined if
necessary.
• Process Manager
The Process Manager (Figure 4.8) works in a similar way to the Routing Slip,
but more dynamically. It forwards the message to the first processing unit
and, based on the processing results from this unit and the information about
the previously executed processing steps stored by the Process Manager, it
determines the next processing step. The processing path is not fixed as in
the case of the Routing Slip but is constructed dynamically by the Process
Manager.
• Message Broker
The Message Broker (Figure 4.9) is a central component of the integration
solution. It connects all integrated systems. Internally it uses the design
patterns described before to route the messages effectively between the
connected systems. If each pair of integrated systems that need to interact
with each other were connected directly, the number of required channels would
increase to an unmanageable number. The Message Broker significantly reduces
the number of required channels and becomes a central component of the system,
where all message routing operations are performed.
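The cooperation of the Splitter and the Aggregator from the list above can be
sketched as follows. The message layout is an assumption made for this
example: each message produced by the splitter carries the hypothetical fields
order_id, position and count, which let the aggregator recognise correlated
messages and detect when the set is complete.

```python
def split(order):
    """Splitter: produce one message per item of the incoming order message."""
    items = order["items"]
    return [
        {"order_id": order["order_id"], "position": i, "count": len(items), "item": item}
        for i, item in enumerate(items)
    ]


class Aggregator:
    """Collects correlated messages and publishes one message when all arrived."""

    def __init__(self):
        self.partial = {}  # order_id -> {position: item}

    def receive(self, message):
        parts = self.partial.setdefault(message["order_id"], {})
        parts[message["position"]] = message["item"]
        if len(parts) < message["count"]:
            return None  # the set of correlated messages is not complete yet
        del self.partial[message["order_id"]]
        return {
            "order_id": message["order_id"],
            "items": [parts[i] for i in range(message["count"])],
        }


order = {"order_id": "A-1", "items": ["pen", "ink", "paper"]}
item_messages = split(order)

aggregator = Aggregator()
# The item messages may come back in any order; here they are reversed.
results = [aggregator.receive(m) for m in reversed(item_messages)]
print(results[-1])
```

In a real solution each item message would be processed by a different system
between the two steps; the sketch leaves that processing out.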
As can be observed, there are a lot of routing components that can be created
based on the Message Router pattern. Each of them responds to a different kind
of need and can be used to solve a particular problem. Those components vary
from the simplest ones with fixed routing algorithms to more complicated ones
that perform routing dynamically, based on the results of the application
processing, and dynamically build a processing path for an incoming message.
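As an illustration of a fixed processing path, the Routing Slip can be
sketched in a few lines. This is a simplification under stated assumptions:
the three processing steps are hypothetical, and a single loop plays the role
of the special routers that, in the pattern, wrap every processing component
and forward the message to the next step on the slip.

```python
def validate(body):
    return body.strip()


def enrich(body):
    return body + " [enriched]"


def archive(body):
    return body.upper()


# Registry of available processing steps (hypothetical names).
STEPS = {"validate": validate, "enrich": enrich, "archive": archive}


def process(message):
    """Run the message through the steps listed on its routing slip, in order."""
    slip = list(message["routing_slip"])
    body = message["body"]
    history = []
    while slip:
        step_name = slip.pop(0)  # the next processing step from the slip
        body = STEPS[step_name](body)
        history.append(step_name)
    return {"body": body, "history": history}


outcome = process({"routing_slip": ["validate", "archive"], "body": " hello "})
print(outcome)
```

A Process Manager would differ in that the next step is not read from a list
fixed in advance but chosen after each step from the result of the previous
one.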
Another pattern that is commonly used in integration solutions is the
Message Translator. As mentioned before, the Message Translator pattern is
used to reformat the data in such a way that it fits the internal data
representation model of the other system. Such a need may arise very often, as
the systems being integrated usually have different internal data
representation models. In the case of this pattern, as in the case of the
previous ones, the pattern itself is a base for different variants of Message
Translators that can be used in various situations depending on the problem
faced. The idea of the Message Translator has already been described in the
section devoted to the main concepts of the messaging systems (3.5). Now let us
concentrate on the different variants of translators based on the same Message
Translator pattern:
• Envelope Wrapper
The Envelope Wrapper (Figure 4.10) wraps the sent data into an envelope
in such a way that it fits the message format used by the given messaging
system (adds header and body sections, encryption, etc.). After the message
arrives at its destination point it is unwrapped by the unwrapper, which
reverses any modifications made by the wrapper and passes the data, as
it was initially sent by the sender application, for further processing.
• Content Enricher
The Content Enricher (Figure 4.11) is used when the destination system
requires more information than the sender can provide. The Content Enricher
adds to the message additional information fetched from an external data
source. After this step the message is forwarded to the next processing
component.
• Content Filter
The Content Filter (Figure 4.12) works in the opposite way to the Content
Enricher. When an incoming message contains complex information and
only a small part of that information is required by the message receiver,
a Content Filter removes the unneeded data from the message, leaving only
the data needed by the message receiver.
• Normalizer
The Normalizer (Figure 4.13) is a combination of the Message Router and
multiple Message Transformers. It is used when integrated systems use
different message formats and each of those formats requires a different
type of translation in order to fit the model used by the messaging system.
Information can arrive as an XML document, as a plain text file containing
comma-separated data fields, as an Excel file, and so on. Each of those
formats requires different processing in order to transform it into the
format appropriate to the messaging system. When those messages arrive at the
Message Router, they are forwarded to the Message Transformer responsible for
dealing with the particular data format. The range of accepted incoming
message formats can easily be widened by adding a new routing rule to the
Message Router and connecting the Router to an additional Message Transformer
by a Message Channel. This way the integration solution can be dynamically
adapted to the changing business environment and extend its functionality.
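The structure of the Normalizer (a router in front of several transformers)
can be sketched as follows. The common format with the fields id and name, the
three input formats and the very naive format detection are assumptions made
only for this illustration.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def from_json(raw):
    data = json.loads(raw)
    return {"id": str(data["id"]), "name": data["name"]}


def from_csv(raw):
    row = next(csv.reader(io.StringIO(raw)))
    return {"id": row[0], "name": row[1]}


def from_xml(raw):
    root = ET.fromstring(raw)
    return {"id": root.findtext("id"), "name": root.findtext("name")}


# One Message Transformer per accepted format; adding a format means adding
# one entry here and one detection rule below.
TRANSLATORS = {"json": from_json, "csv": from_csv, "xml": from_xml}


def detect_format(raw):
    """Router part: a deliberately simple rule based on the first character."""
    text = raw.lstrip()
    if text.startswith("<"):
        return "xml"
    if text.startswith("{"):
        return "json"
    return "csv"


def normalize(raw):
    """Forward the message to the transformer responsible for its format."""
    return TRANSLATORS[detect_format(raw)](raw)


normalized = [
    normalize('{"id": 7, "name": "Anna"}'),
    normalize("7,Anna"),
    normalize("<customer><id>7</id><name>Anna</name></customer>"),
]
print(normalized)
```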
The above examples of integration patterns show how a single template can
be used to solve different kinds of challenges of the same nature, in this case,
connecting multiple computer systems.
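The Content Enricher and the Content Filter are simple enough to be shown
together in one sketch. The customer directory below is a hypothetical
external data source, and the field names are assumptions made for the
example.

```python
# Hypothetical external data source consulted by the enricher.
CUSTOMER_DIRECTORY = {"C-1": {"email": "anna@example.com", "city": "Warsaw"}}


def enrich(message, directory):
    """Content Enricher: add data that the sender could not provide."""
    extra = directory.get(message["customer_id"], {})
    return {**message, **extra}


def filter_content(message, needed_fields):
    """Content Filter: keep only the fields required by the receiver."""
    return {key: value for key, value in message.items() if key in needed_fields}


incoming = {"customer_id": "C-1", "order_total": 120}
enriched = enrich(incoming, CUSTOMER_DIRECTORY)
filtered = filter_content(enriched, {"customer_id", "email"})
print(enriched)
print(filtered)
```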
A single template can become the source for various types of components
designed to solve different types of problems. Each of those patterns finds an appli-
5.1 Definition
The Enterprise Service Bus (ESB) is an integration solution that enables
integrating systems in a loosely coupled way. It heavily uses open standards
such as XML and Web Services. It is based on Message Oriented Middleware,
which provides reliable communication using messages. It simplifies creating
a computer systems architecture focused on providing business services, i.e.
services that have a meaning to the business, as opposed to implementation
services, which have a meaning to the developers.
Currently there is no formal, industry-agreed definition of an Enterprise
Service Bus. A lot of vendors provide products claiming they are ESB solutions,
but there is no precise definition of what such a product should contain. One
way of explaining what an Enterprise Service Bus is, is to focus on the
capabilities that it is able to provide and the advantages of deploying it in
a company. This approach will be taken in this chapter.
According to the Gartner Group an ESB [14] consists of the following four
things:
52 CHAPTER 5. ENTERPRISE SERVICE BUS
• Web Services
• Intelligent Routing based on content
• XML data transformation
It is worth keeping in mind that the term ESB does not necessarily have to
refer to a software product. It may also be considered as:
• a pattern
• an architectural component
• a hardware component (there are devices which have all the capabilities
required in order to be called an ESB solution)
One of the following sections, "ESB components" (5.6), will describe
the meaning of an Enterprise Service Bus as an architectural component.
• reliable transport
• efficient method of communication using messages
• end-to-end reliability
Let us apply the same scenario to the right side of the figure. Application 5
communicates with Applications 6 and 7. Application 5 does not have to worry
about the applications’ accessibility, because it does not communicate directly
with them: it uses message queues, which are always available. If
Application 6 or 7 is currently not working, the messages designated for them
will not be dropped, but will be stored in the queues until the applications
are up again. Message Oriented Middleware guarantees that eventually every
message will reach its destination application. Application 5 does not have to
be concerned about it.
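The behaviour described above, in which messages wait in a queue until the
destination application is up again, can be sketched as follows. The class
QueueingMiddleware and its methods are hypothetical names for this example,
and the queues are kept in memory only; an actual Message Oriented Middleware
product would store them persistently.

```python
from collections import deque


class QueueingMiddleware:
    """Keeps one queue per destination and delivers when the destination is up."""

    def __init__(self):
        self.queues = {}  # destination name -> deque of waiting messages
        self.online = {}  # destination name -> handler of a running application

    def send(self, destination, message):
        # The sender only talks to the queue, never to the application itself.
        self.queues.setdefault(destination, deque()).append(message)
        self._deliver(destination)

    def connect(self, name, handler):
        self.online[name] = handler
        self._deliver(name)  # hand over everything stored in the meantime

    def disconnect(self, name):
        self.online.pop(name, None)

    def _deliver(self, name):
        handler = self.online.get(name)
        pending = self.queues.get(name)
        while handler and pending:
            handler(pending.popleft())


mom = QueueingMiddleware()
received = []

# Application 6 is down: the messages are stored, not dropped.
mom.send("Application 6", "m1")
mom.send("Application 6", "m2")
stored_while_down = len(mom.queues["Application 6"])

# Application 6 comes up and receives the stored messages in order.
mom.connect("Application 6", received.append)
mom.send("Application 6", "m3")
print(stored_while_down, received)
```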
of the figure (Figure 5.2) shows the typical model of interconnected systems.
Component1, in order to connect to Component4, has to know the following:
• Component4 IP address
The same applies to all of the connections in the figure. Before one component
is able to connect to another, it has to know a lot about the other party and
also assume that this information will not change.
The introduction of the ESB, presented on the right side of Figure 5.2,
relieves the client from the need to know who is providing the service, because
the ESB is responsible for creating a communication channel between the client
and the service providers. That way, the application does not need to contain
the integration code, because it is no longer responsible for creating the
TCP/IP connections, reconnecting in case of a communication error, knowing the
connection information (URL, port number) of the service provider and so on.
Thus, introducing an ESB simplifies the design of the client. Furthermore, in
a tightly coupled system, any change of the connection data in one of the
service providers also requires a change in all of the clients that are using
this service provider. This is no longer true in the case of an ESB, because
it is the ESB that is responsible for storing that information.
Let us imagine what would happen if the IP address of Component4 changed one
day. In the tightly coupled scenario, all of the applications connected
directly to Component4 (i.e. Component1 and Component5) would have to be
updated. In the ESB scenario, only the configuration held by the Enterprise
Service Bus would have to be updated.
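The effect described above can be sketched in a few lines of Java. This is an illustrative sketch only, not code taken from any ESB product: the class names ServiceRegistry and BusClient and the addresses are hypothetical, and no real network connection is made.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry held by the bus: maps a logical service name to its current address. */
final class ServiceRegistry {
    private final Map<String, String> addresses = new ConcurrentHashMap<>();

    /** Registers or updates the address of a service; this is the only place that changes. */
    void register(String serviceName, String address) {
        addresses.put(serviceName, address);
    }

    /** Resolves a logical name; clients never store the address themselves. */
    String resolve(String serviceName) {
        String address = addresses.get(serviceName);
        if (address == null) {
            throw new IllegalArgumentException("Unknown service: " + serviceName);
        }
        return address;
    }
}

/** Hypothetical client that knows only the logical name of the service it calls. */
final class BusClient {
    private final ServiceRegistry registry;

    BusClient(ServiceRegistry registry) {
        this.registry = registry;
    }

    /** Describes where the request would be delivered (no real network I/O is performed). */
    String call(String serviceName, String request) {
        return "deliver '" + request + "' to " + registry.resolve(serviceName);
    }
}

final class RegistryDemo {
    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();
        registry.register("Component4", "10.0.0.4:8080");

        BusClient component1 = new BusClient(registry);
        BusClient component5 = new BusClient(registry);
        System.out.println(component1.call("Component4", "getStatus"));

        // Component4 moves: one registry update, no change in either client.
        registry.register("Component4", "10.0.0.40:8080");
        System.out.println(component1.call("Component4", "getStatus"));
        System.out.println(component5.call("Component4", "getStatus"));
    }
}
```

Both clients pick up the new address after a single update, which is the property the ESB scenario relies on.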
CHAPTER 5. ENTERPRISE SERVICE BUS 55
1. Mediator
The Mediator is the most important component in an ESB. The crucial
functionality provided by this component is routing, communication
and protocol transformation; a product lacking these cannot be
considered an ESB solution. The Mediator is used as the entry point of
the ESB: messages sent to the ESB are received and processed by this
component. It might also be responsible for message transformation and
enhancement. In order to enable reliable and secure processing of requests,
it must support security, error handling and transaction management.
2. Service registry
The Service registry is a component that provides service mapping.
3. Choreographer
The role of the Choreographer is to enable process choreography, i.e. the
coordination of business processes. This component is actually a client of
an ESB. It knows the sequence of business services that must be called in
order to perform one sophisticated business request. If the Mediator decides
(according to its rules) that a particular request needs to be choreographed,
it forwards the request to this component. The Choreographer, after looking
up its configuration, invokes the proper service providers by sending
messages to the Mediator, just like any other ESB client.
4. Rules engine
The Rules engine is an additional component, which may not be required
in some integration projects. This component enables rule-based routing.
Its functionality includes message routing, message transformation,
security and transaction management.
Validate
The aim of the validate step is to ensure that messages received by the service
provider have proper syntax and semantics. This step should be performed
independently, not inside the service provider, because that would limit the
reusability of the validation and complicate any further modification of it.
Moreover, implementing validation as a separate component ensures that every
message that reaches the service is in a proper format, which simplifies the
design of the service provider and lets the Operate step focus on business
logic. The simplest way of validating an incoming message is to check whether
it is a well-formed XML document that conforms to an XML schema or WSDL, but
there are other possibilities, for example validation scripts.
Enrich
The aim of the enrich step is to add to the message content additional data
that the service provider needs, for example information about the customer
who placed the order. That information might be fetched from a database or
might be the result of invoking another service.
Transform
The aim of the transform step is to change the message format to the one accepted
by the service provider. This step might transform the message into the internal
message format of the service provider, relieving the Operate step of the need
to perform this task and therefore increasing its efficiency.
Operate
The aim of the operate step is to invoke the target service or to interact in
some way with the target application.
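The four steps can be sketched as a chain of small functions, each consuming the output of the previous one. This is a minimal sketch under stated assumptions: the message is modelled as a map of strings rather than an XML document, and the field names (customerId, productId, customerName), the semicolon-separated internal format and the class name VetoPipeline are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.UnaryOperator;

/** Sketch of the Validate, Enrich, Transform, Operate sequence for a map-based message. */
final class VetoPipeline {

    /** Validate: reject messages that lack the fields the service provider relies on. */
    static Map<String, String> validate(Map<String, String> message) {
        if (!message.containsKey("customerId") || !message.containsKey("productId")) {
            throw new IllegalArgumentException("customerId and productId are required");
        }
        return message;
    }

    /** Enrich: add customer data; the lookup stands in for a database or another service. */
    static Map<String, String> enrich(Map<String, String> message,
                                      Function<String, String> customerNameLookup) {
        Map<String, String> enriched = new HashMap<>(message);
        enriched.put("customerName", customerNameLookup.apply(message.get("customerId")));
        return enriched;
    }

    /** Transform: convert to the (assumed) internal format of the service provider. */
    static String transform(Map<String, String> message) {
        return message.get("productId") + ";" + message.get("customerId")
                + ";" + message.get("customerName");
    }

    /** Operate: invoke the target service with the prepared message. */
    static String operate(String internalMessage, UnaryOperator<String> service) {
        return service.apply(internalMessage);
    }

    /** Runs the four steps in order; the output of each step is the input of the next. */
    static String process(Map<String, String> message,
                          Function<String, String> customerNameLookup,
                          UnaryOperator<String> service) {
        Map<String, String> valid = validate(message);
        Map<String, String> enriched = enrich(valid, customerNameLookup);
        String internal = transform(enriched);
        return operate(internal, service);
    }

    public static void main(String[] args) {
        Map<String, String> order = new HashMap<>();
        order.put("customerId", "C-7");
        order.put("productId", "P-1");
        System.out.println(process(order, id -> "Jan Kowalski", msg -> "ACCEPTED:" + msg));
    }
}
```

Because each step is a separate function, validation or enrichment can be reused or replaced without touching the code that invokes the service.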
usually performed in one step, when the output from one step is used as an input
to the following step:
• XSLT transformation
• XPath query
• JDBC query
• SQL statement
The concept of the Two-step XRef pattern [3] (Figure 5.5) is to create two
separate components, each responsible for only one type of operation:
• loose coupling: problems with the database do not affect the operation
of the XML parsing component
• using message routing — duplicate every response from the integrated sys-
tem to the Cache Service
Publish-and-subscribe model
In this scenario every change to the data held by an integrated system causes
a message with the set of changes to be sent to a message topic. The Cache
Service is a subscriber of that topic. This solution is only suitable for small
computer software infrastructures whose systems do not change data frequently.
It is not hard to imagine what would happen if multiple integrated systems were
constantly changing their own data: most of the traffic would be consumed by
update messages, making the ESB incapable of handling any regular messages.
Message routing
This scenario assumes the use of one of the main ESB components, the router.
Every response, before getting back to a portal application, should also be sent
to the Cache Service. That way, the Cache Service has a copy of all the
information that has been presented by the portal application, and if an
integrated application becomes inaccessible, that information can be supplied
by the Cache Service.
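The message routing scenario can be illustrated with a small in-memory sketch. It is only an approximation of the idea: the class names CacheService and CachingRouter are hypothetical, the integrated system is represented by a plain function, and a real ESB would copy the response onto a message channel instead of calling the cache directly.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical Cache Service: remembers the last response seen for each request key. */
final class CacheService {
    private final Map<String, String> lastResponses = new HashMap<>();

    void store(String requestKey, String response) {
        lastResponses.put(requestKey, response);
    }

    Optional<String> lookup(String requestKey) {
        return Optional.ofNullable(lastResponses.get(requestKey));
    }
}

/** Hypothetical router that copies every response to the cache before returning it. */
final class CachingRouter {
    private final CacheService cache;
    private final Function<String, String> integratedSystem;

    CachingRouter(CacheService cache, Function<String, String> integratedSystem) {
        this.cache = cache;
        this.integratedSystem = integratedSystem;
    }

    /** Returns the live response when possible, otherwise the cached copy. */
    String request(String requestKey) {
        try {
            String response = integratedSystem.apply(requestKey);
            cache.store(requestKey, response); // duplicate the response to the Cache Service
            return response;
        } catch (RuntimeException unavailable) {
            // The integrated system is unreachable: fall back to the cached copy, if any.
            return cache.lookup(requestKey).orElseThrow(
                    () -> new IllegalStateException("No cached response for " + requestKey,
                            unavailable));
        }
    }
}
```

A request that was answered at least once can still be served while the integrated system is down; a request never seen before cannot, which is the inherent limit of this approach.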
Chapter 6
• Internet Shop: responsible for the interaction with the user and for placing
(and confirming) orders
• Orders Fulfilment System: responsible for fulfilling the orders
• Storage System: responsible for providing information about product
supplies
• Pricing System: manages the prices of all products available for purchase
• Loyalty System: responsible for storing information about discounts
for those customers who purchase most frequently and/or purchase large
quantities of products
66 CHAPTER 6. CASE STUDY: MESSAGING SYSTEMS WORK PRINCIPLES
The task at hand is to connect all those systems into one business entity
using the messaging system. The high-level design of the systems after the
integration is presented in Figure 6.1.
Of course, each of the systems has to have a number of endpoints attached,
so that it is capable of sending and receiving messages. Message Channels
between the systems must also be set up so that communication can take place.
The connections between the systems must be determined beforehand (there is no
point in connecting two systems using a Message Channel if no communication
between them will ever occur). During this process, possible locations for
additional components, such as Message Routers or Message Transformers, should
also be determined.
For every router, the set of rules by which the router determines the
destination of an incoming message must be defined. If there is a Message
Transformer, the rules for message transformation must also be defined.
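A rule set for a router can be pictured as an ordered list of conditions, each paired with an output channel. The sketch below is a hypothetical illustration, not part of the case study implementation: the header name type, the channel names and the classes RoutedMessage and RuleBasedRouter are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Minimal message for this sketch: a header map plus a body. */
final class RoutedMessage {
    final Map<String, String> headers;
    final String body;

    RoutedMessage(Map<String, String> headers, String body) {
        this.headers = headers;
        this.body = body;
    }
}

/** Hypothetical Message Router: rules are checked in the order in which they were added. */
final class RuleBasedRouter {
    private static final class Rule {
        final Predicate<RoutedMessage> condition;
        final String channel;

        Rule(Predicate<RoutedMessage> condition, String channel) {
            this.condition = condition;
            this.channel = channel;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    private final String defaultChannel;

    RuleBasedRouter(String defaultChannel) {
        this.defaultChannel = defaultChannel;
    }

    RuleBasedRouter addRule(Predicate<RoutedMessage> condition, String channel) {
        rules.add(new Rule(condition, channel));
        return this;
    }

    /** Returns the output channel of the first matching rule, or the default channel. */
    String route(RoutedMessage message) {
        for (Rule rule : rules) {
            if (rule.condition.test(message)) {
                return rule.channel;
            }
        }
        return defaultChannel;
    }

    /** Example rule set (assumed) for a router attached to the Internet Shop channel. */
    static RuleBasedRouter internetShopRouter() {
        return new RuleBasedRouter("invalid.messages")
                .addRule(m -> "availabilityCheck".equals(m.headers.get("type")), "storage.in")
                .addRule(m -> "priceQuery".equals(m.headers.get("type")), "pricing.in")
                .addRule(m -> "order".equals(m.headers.get("type")), "fulfilment.in");
    }
}
```

Keeping the rules in the router, rather than in the sending application, means that a new destination can be added by changing the rule set alone.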
It must also be decided whether the solution will use a synchronous or an
asynchronous communication model. The system being the subject of this study
should be resistant to communication failures and as flexible as possible.
Also, we do not want the Supply System to stop working and receiving requests
from the Storage System if it does not get an acknowledgement of the received
order from the factory, and so on. In this case the most suitable model of
communication is the asynchronous one. Using this model means that much more
effort must be put into the design and implementation of the solution, but it
ensures that the final solution will operate in the desired way.
A detailed description of the design and implementation of this case study,
covering all possible issues, could easily fill the whole volume of this thesis.
Because our goal here is only to give a brief taste of the integration task and
to show the practical usage of the concepts described previously, we will give
just a brief description of the integration solution and will not focus on the
technical details.
First, let us take a closer look at the simplest scenario: an order placed
through the Internet Shop and paid by credit card, when there is no need to
order the products from the factory and wait for the fulfilment of the order,
because the products are available in the desired quantity at the store. This
scenario is presented in the sequence diagram in Figure 6.2.
The user visits the web site of the Internet Shop, logs in using his/her user
name and password, browses through the list of available products, selects the
one that he/she is interested in and places an order. After the order is
placed, the Internet Shop sends a request to the Storage System to check whether
the selected products are currently available. As mentioned before, in our
scenario we assume that the ordered products are available. Otherwise, the
Storage System would send a request to order them to the Supply System. Then
the Supply System, once the amount of products needed reaches a specified
quantity, sends a request to the factory to have those goods produced and
delivered to the storage in Europe and then forwards the order to
of the user). A message is also sent to the Logistics System to add the
package to the list of packages to be picked up and delivered on the next day.
When the package is sent out to the client, a message is sent to the Order
Fulfilment System to notify it that the package has been sent. Upon receiving
this message, the system sends a notification e-mail to the user, informing
him/her that the package has been sent and stating the approximate delivery
time. Sending this notification e-mail completes the processing of the order by
the system.
As can be seen, even in the simplest scenario the interaction between the
involved systems is quite complex. Many systems are involved in exchanging
different types of information. It is worth keeping in mind that the data being
sent is subject to many changes and much preprocessing before it can be
consumed by the next system in the processing chain (different data formats,
internal data models of the applications, and so on).
Designing the integration solution for this system requires all of the
components explained in the previous chapters, i.e. Message Routers, Message
Translators and Message Endpoints. The Message Routers can be used to determine
the destination of a message when one system can send messages to different
receivers. The Message Translators can be used to translate the data contained
in those messages so that it fits the internal data model of the receiving
system.
The schema of the systems integrated by the messaging system and incorporating
the elements mentioned above is presented in Figure 6.3.
Although the diagram in Figure 6.3 may look simple and straightforward, it is
in fact an example of a badly designed, tightly coupled computer software
infrastructure. One of the disadvantages of this solution is the unmanageable
number of message channels (depicted in the figure as arrows). The major
disadvantage, however, is that an application, in order to communicate with
other applications, must know a lot of details about them, for example message
channel addresses and message formats. Moreover, every time a message format
changes in one application, all applications communicating with that
application also have to be updated. For example, if the Storage System changes
the format of the date field, applications such as the Logistics System and the
Order Fulfilment System also have to change the format of the messages that
they send to the Storage System.
A solution to the problems mentioned above might be the use of integration
patterns: the Message Router (3.4) and the Message Transformer (3.5).
Figure 6.4 presents a new architecture utilising those concepts. Although the
number of message channels has increased, this solution enables greater
decoupling of the applications. The knowledge about the format of messages
accepted by a system is no longer hard-coded inside each application, but is
delegated to a new, intermediary component, the Message Transformer (depicted
in the figure as the letter T). Likewise, decisions about the routing of
messages are taken not by each application, but by the Message Router (depicted
in the figure as the letter R). This approach introduces a greater degree of
loose coupling: the message format of each application might be changed
independently of the others.
Each system shown in Figure 6.4 has been enriched with endpoints
that enable the communication between the system and the messaging system.
The number of endpoints corresponds to the number of channels from which
the given system can receive messages or to which it can send them. Message
Routers are used to direct the messages to their destinations based on given
business rules (e.g. the name of the destination system placed within the
message header). The router connected to the Internet Shop channel decides
whether to send a message to the Storage System (to check the availability of
a product), to the Pricing System (to get the prices for a given product) or to
the Orders Fulfilment System, which passes it (a message containing information
about a placed order) on for further fulfilment. The router connected to the
Order Fulfilment System channel routes messages to the Storage System
(a message containing a request to create the package for shipment), to the
Payment System (to check whether the payment for the ordered products has been
made) or to the Internet Shop (notification messages about the various stages
of order fulfilment). Message Transformers are used to transform messages to
the format readable by
Figure 6.4: Case study: Message Channel with Router and Translator scenario
the recipient system (e.g. messages sent from the Storage System to the
Internet Shop need to be transformed by the Transformer component in order to
be correctly read by the destination system, in this case the Internet Shop).
Let us assume that the Order Fulfilment System stores information in such
a way that the receiver’s name and surname are combined in the field receiver
and the address is stored in the field package destination. Thus, the role of
the transformer would be to extract the receiver’s name and surname and put
them in the field receiver, and then to extract the destination address and
put it in the field package destination. Only after those transformations can
the message be sent on to the Order Fulfilment System. After those changes, it
is assured that the data will be read and interpreted correctly by the system
and will not cause any errors during processing.
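The transformation described above might look as follows. This is a sketch under assumptions: the target field names receiver and package destination come from the text, while the source field names (name, surname, street, city), the map-based message and the class name OrderMessageTranslator are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical Message Translator producing the format assumed for the Order Fulfilment System. */
final class OrderMessageTranslator {

    /** Builds the target message: a combined receiver field and a single destination field. */
    static Map<String, String> translate(Map<String, String> source) {
        Map<String, String> target = new HashMap<>();
        target.put("receiver", required(source, "name") + " " + required(source, "surname"));
        target.put("package destination",
                required(source, "street") + ", " + required(source, "city"));
        return target;
    }

    /** Fails early when the source message lacks a field the translation depends on. */
    private static String required(Map<String, String> message, String field) {
        String value = message.get(field);
        if (value == null) {
            throw new IllegalArgumentException("Missing field: " + field);
        }
        return value;
    }
}
```

If the sending system later renames one of its fields, only this translator has to change; the Order Fulfilment System keeps receiving the same two fields.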
The problem of the unmanageable number of message channels is resolved by
the introduction of an Enterprise Service Bus (Figure 6.5). In this scenario,
each of the systems communicates only with the ESB, which is responsible for
message transformation and routing. This approach simplifies the development
and management of applications, because there is only one message channel
between each application and the ESB. Moreover, an Enterprise Service Bus
provides a wide range of adapters (components for accessing an ESB), which
removes the need for an application to know the details of communication with
the ESB.
An Enterprise Service Bus incorporates implementations of multiple integration
design patterns, such as the Message Channel (3.3), the Message
Router (3.4), the Message Transformer (3.5), the Message Endpoint (3.6), etc.
The internals of that implementation, however, are hidden from the applications
using an ESB. This simplifies their design and makes the process of integration
easier.
The example described above should give a good overview of the usage and
practical application of the concepts presented in the previous chapters. Even
with the most basic components, such as a Message Router and a Message
Translator, a complex integration solution can be designed.
Chapter 7
Implementation
This chapter covers the details of the implementation of the ESB platform
named pESB. It was developed by the authors of this paper as an integral part
of this master’s thesis.
First, the basic concept behind this product, its technology and its
architecture are described. Then its internal workings are presented,
illustrated by a sequence diagram. Having covered the essential information,
we introduce a real-life example, along with source code snippets, which
familiarises the reader with the process of configuring the ESB. Finally, the
problems that occurred during the implementation are presented.
7.2 Concept
Being aware of our limited time and resources, we decided to create an
integration solution that is, first of all, lightweight and easy to use. We
knew that we would not be able to create a tool that could compete with
integration products that have existed on the market for several years. Thus,
we decided to create an integration product that provides simple functionality,
but has an architecture that enables easy scalability and further development.
The main concept of this approach to an ESB solution was to base the
implementation on standards and technologies already available on the market,
like Java Enterprise Edition (EJB, JMS), XML, Web Services, etc. This approach
is very different from that of other existing ESB solutions. pESB takes
advantage of the services provided by the application server, like security
management, transaction management, pool and resource management, etc. It does
not come with its own Message Queue implementation. In our opinion this should
be treated as an advantage, because we do not force users to use any particular
solution. Nowadays, every company thinking about integration solutions already
has an infrastructure that might be used. Furthermore, our solution does not
reinvent the wheel. The products providing those features have been available
on the market for a long time, and the authors of those products certainly had
to overcome a lot of problems and solve a lot of issues that appeared while
customers were using them.
pESB was designed as a multi-tier application (Figure 7.1). The basis of this
solution is the Message Bus. Its main responsibility is to provide a reliable
method of communication between multiple points using data channels.
The next layer is the Application server, which provides a lot of facilities
that an application running in a container can take advantage of. These
facilities are as follows:
7.3 Technology
pESB is an enterprise application using a set of state-of-the-art Java
technologies:
7.4 Classification
According to the classification presented in the comprehensive Forrester Wave
report on the Enterprise Service Bus products available on the market [13],
customer requirements can be divided into two segments:
• "keep it simple"
For the first group of customers, the most important thing is that the
solution should be as simple as possible in order to enable low-cost
integration. It should also have a plug-in architecture to enable easy
customisation to the customer’s needs. The products from the second group, on
the other hand, should feature a wide range of additional services like
business process management, process simulation, monitoring or optimisation.
pESB suits customers from the first segment.
7.5 Architecture
The main focus in the development of pESB has been not on the optimisation
of particular methods or functionalities, but on an architecture that enables
scalability, distribution, reliability and security of the solution. One of
the ways to achieve that goal was to use asynchronous rather than synchronous
processing. The internal communication between the components of pESB is
done using asynchronous messages (Figure 7.2). A message processed by one
component is put into the queue of the next component. Every message is
persisted. The combination of those two factors provides scalability and
reliability.
Message queues make it possible to have multiple consumers running on different
computers. The number of consumers may therefore change dynamically: new
consumers might be added and removed on the fly. This feature provides
scalability: pESB can easily be adjusted to the growing requirement of
processing a greater number of messages, simply by adding new application
servers.
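The scaling property of message queues can be demonstrated with an in-memory stand-in. The sketch below is not pESB code: it uses a java.util.concurrent queue and a thread pool in place of a persistent JMS queue and separate application servers, and the class name CompetingConsumersDemo is hypothetical. It only shows that the same workload is fully processed, each message exactly once, whatever the number of consumers.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class CompetingConsumersDemo {

    /** Processes all messages with the given number of consumers; returns how many were handled. */
    static int process(int messageCount, int consumerCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < messageCount; i++) {
            queue.add("message-" + i);
        }

        AtomicInteger processed = new AtomicInteger();
        ExecutorService consumers = Executors.newFixedThreadPool(consumerCount);
        for (int c = 0; c < consumerCount; c++) {
            consumers.execute(() -> {
                // Each consumer takes the next available message; poll() hands out
                // every message to exactly one consumer.
                while (queue.poll() != null) {
                    processed.incrementAndGet();
                }
            });
        }

        consumers.shutdown();
        try {
            if (!consumers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Consumers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(process(1000, 1)); // one consumer
        System.out.println(process(1000, 4)); // four consumers, same total
    }
}
```

The producers do not need to know how many consumers exist, which is why consumers can be added or removed without reconfiguring the rest of the system.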
ponent providing an API to access the pESB. From the client’s point of view
(depicted on the figure as System1 and System2) the whole complexity of mes-
sage based communication, asynchronous processing, XML documents, etc. is
hidden. Besides providing an interface for receiving and fetching messages,
pESB also provides a method of dynamic configuration (interface Config) using
a Java API.
• the pESB
1. System1 invokes the send method of the Agent, passing as an argument the
Data Transfer Object (DTO) that it would like to send to the ESB.
2. System1’s Agent receives the DTO and creates an XML message out of the
DTO content, then invokes a method of the remote interface of pESB
using EJB, passing that XML document as an argument.
3. The ReceiverBean in pESB receives that XML document, puts it into
an input queue and returns information about the status of that operation
to the Agent.
7. System2 wants to fetch new messages, so it invokes the receive method of
the Agent.
8. System2’s Agent invokes a method of the remote interface of pESB
using EJB.
9. The FetcherBean in pESB checks whether there are any new messages waiting
in the output queue for System2; if there is a new message, it fetches it
and returns it to System2’s Agent.
10. System2’s Agent creates a new Data Transfer Object using the content
of the XML document and returns that new object to System2.
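The client-side part of this sequence (steps 1 to 3 and 7 to 10) can be sketched as follows. This is a simplified illustration, not the pESB implementation: OrderDTO, InMemoryBus and the XML layout are hypothetical, a single in-memory queue replaces the remote EJB calls and the separate input and output queues, and the processing that happens inside the bus between those queues is left out.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical Data Transfer Object used only in this sketch. */
final class OrderDTO {
    final String orderId;
    final int quantity;

    OrderDTO(String orderId, int quantity) {
        this.orderId = orderId;
        this.quantity = quantity;
    }
}

/** In-memory stand-in for the remote pESB interface; the real beans are reached through EJB. */
final class InMemoryBus {
    private final Queue<String> queue = new ArrayDeque<>();

    /** Plays the role of the ReceiverBean: stores the XML document and reports the status. */
    boolean receive(String xml) {
        return queue.offer(xml);
    }

    /** Plays the role of the FetcherBean: returns the next document, or null if none waits. */
    String fetch() {
        return queue.poll();
    }
}

/** Hypothetical Agent: hides XML and messaging from the client system. */
final class Agent {
    private final InMemoryBus bus;

    Agent(InMemoryBus bus) {
        this.bus = bus;
    }

    /** Steps 1 to 3: turn the DTO into XML and hand the document to the bus. */
    boolean send(OrderDTO dto) {
        String xml = "<order><orderId>" + dto.orderId + "</orderId>"
                + "<quantity>" + dto.quantity + "</quantity></order>";
        return bus.receive(xml);
    }

    /** Steps 7 to 10: fetch a document and rebuild a DTO; returns null if nothing is waiting. */
    OrderDTO receive() {
        String xml = bus.fetch();
        if (xml == null) {
            return null;
        }
        return new OrderDTO(textOf(xml, "orderId"), Integer.parseInt(textOf(xml, "quantity")));
    }

    /** Naive extraction of element text; a real Agent would use an XML parser and escaping. */
    private static String textOf(String xml, String tag) {
        String open = "<" + tag + ">";
        int start = xml.indexOf(open) + open.length();
        int end = xml.indexOf("</" + tag + ">");
        return xml.substring(start, end);
    }
}
```

From the point of view of System1 and System2, only send and receive with plain objects are visible; the XML representation never leaves the Agent.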
It is worth mentioning that all the operations performed inside the Transform
Routing Unit are performed in a transaction. This means that in case of an
error no message will ever be dropped.
7.7 Configuration
All configuration is done at runtime: no restart or reload is required
after changing the configuration, and no XML file editing is required. The
whole configuration can be changed using a Java API. Furthermore, even Java
code provided by the user is compiled on the fly, and the functionality it
provides is available right away.
Each component in pESB, whether a router or a transformer, can be configured
in multiple ways. For example, a transformer can be configured using either an
XSLT file or Java code. It is up to the user which method of configuration
he/she chooses. In the case of the Java code method, the source code provided
by the user is compiled on the fly, during the process of configuration. This
approach has a lot of advantages over dynamic execution of the code. The
compilation process can detect a lot of errors which, in the case of dynamic
execution, would be found only at runtime. Moreover, to develop a component the
user can use an IDE of his/her choice (Eclipse, NetBeans, IDEA, etc.), which
helps him/her avoid common programming mistakes like misspelling variable or
method names, using wrong types and so on.
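How compilation at configuration time catches errors early can be shown with the standard javax.tools API. This is a sketch of the general technique, not the mechanism actually used inside pESB, which is not shown in this chapter; the class name OnTheFlyCompiler is hypothetical, a full JDK is assumed, and loading the compiled class is omitted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class OnTheFlyCompiler {

    /** Wraps source code held in a String so the compiler can read it without a file on disk. */
    private static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/')
                    + Kind.SOURCE.extension), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Compiles the class and returns the error messages; an empty list means it compiled. */
    static List<String> compile(String className, String sourceCode) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("A JDK (not only a JRE) is required");
        }
        Path outputDir;
        try {
            outputDir = Files.createTempDirectory("compiled"); // class files go here
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, diagnostics,
                Arrays.asList("-d", outputDir.toString()),
                null,
                Collections.singletonList(new StringSource(className, sourceCode)));
        task.call();
        return diagnostics.getDiagnostics().stream()
                .filter(d -> d.getKind() == Diagnostic.Kind.ERROR)
                .map(d -> "line " + d.getLineNumber() + ": " + d.getMessage(null))
                .collect(Collectors.toList());
    }
}
```

A misspelled variable name in the user's code is reported by compile at the moment the component is configured, instead of surfacing later while a message is being processed.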
The configuration of pESB is performed using a Java API. We believe
that this solution provides the greatest flexibility, because it does not
dictate the way the configuration data is stored. With this approach, it is
possible to save the configuration in an XML file, an LDAP directory,
a relational database, etc. The only thing that must be done is to create an
import tool, which reads the data from the particular medium (XML, LDAP,
database), parses it and invokes the proper methods of the Java API. Moreover,
this approach makes it possible to have some additional logic in the import
tool itself, which is executed before the configuration Java API is called.
Furthermore, it is easier to modify an import tool (XML or database) than
a system (pESB).
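An import tool of this kind can be very small. The sketch below reads a properties text instead of XML or LDAP and calls a configuration interface once per entry. Everything in it is an assumption made for illustration: ConfigApi is a hypothetical stand-in for the pESB configuration API, and the property key systems and the class name PropertiesImportTool are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical stand-in for the pESB configuration API (the real interface is not shown here). */
interface ConfigApi {
    void registerSystem(String systemName);
}

/** Hypothetical import tool: reads a medium, parses it and invokes the configuration API. */
final class PropertiesImportTool {

    /** Reads a line such as "systems=A,B,C" and registers each listed system; returns the count. */
    static int importSystems(String propertiesText, ConfigApi api) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int registered = 0;
        for (String name : properties.getProperty("systems", "").split(",")) {
            String trimmed = name.trim();
            // Additional logic that lives in the import tool: blank entries are skipped.
            if (!trimmed.isEmpty()) {
                api.registerSystem(trimmed);
                registered++;
            }
        }
        return registered;
    }
}
```

Switching to another storage medium would mean replacing only the parsing part of the tool; the calls made to the configuration API stay the same.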
• System
1. Internet Shop — responsible for the interaction with the user and placing
(and confirming) orders
• InternetShop
• OrderSystem
• StorageSystem
• InternetShop
• OrderSystem
• StorageSystem
The following snippet of code creates a Transform Routing Unit object with an
XSLT Transformer:
// XSLT stylesheet source (content elided here)
String xsltCode = "<?xml ...";
// Create an XSLT-type transformer and attach the stylesheet to it
TransformerDTO transformerDTO = new TransformerDTO(TransformerDTO.TYPE_XSLT);
transformerDTO.setParameter(TransformerDTO.CODE, xsltCode);
// Build the Transform Routing Unit from the transformer and a router
// (routerDTO is created elsewhere; its construction is not shown here)
TransformRoutingUnitDTO truDTO =
    new TransformRoutingUnitDTO("OrderSystem", transformerDTO, routerDTO);
Once the Transform Routing Unit object has been created (variable truDTO), it
must be registered in pESB:
// bean is the EJB remote interface of the pESB configuration bean
int result1 = bean.registerTransformRoutingUnit(truDTO);
The next step in the process of configuration is creating links between
Transform Routing Units and Systems. The following snippet creates the
appropriate connections:
// Link systemDTO to the input and to the output of the Transform Routing Unit
int result3 = bean.setTransformRoutingUnitInput(truDTO, systemDTO);
int result4 = bean.setTransformRoutingUnitOutput(truDTO, systemDTO);
It is worth mentioning that the links should be created on both sides, which in
the context of the above example means that the OrderSystem output and input
must be connected with the TRU, and the TRU input and output must be connected
with the OrderSystem.
• Normalizer (4) — this design pattern has been actually used to solve the
case study integration problem (7.7.2), the solution consists of multiple
Transform Routing Units performing the role of Message Translators and
one central Transform Routing Unit performing the role of a Message
Router
routing operations are being performed. It also reduces the number of mes-
sage channels, because an application in order to be able to communicate
with other systems needs only one message channel — to the pESB.
Summary
The integration problem has been well known since the 1970s. Since that time
many technologies and integration styles have been in use; some of them are
described in chapter 2. The latest approach to the integration problem is
the Enterprise Service Bus, based on the concept of messages and asynchronous
communication. Our software project, the implementation of an ESB, tries
to fill a gap in the market for such products. The main focus in its
development was on the use of well-known, well-tested, existing products
and technologies available on the market, such as Java, a Java Enterprise
Edition (Java EE) application server performing the role of a container
(hosting platform) for an Enterprise JavaBeans (EJB) application, Java Message
Service (JMS) and XML. We believe that those technologies, combined with
a deliberately designed architecture, will enable our product to be a reliable,
efficient, secure and (the most important thing in the era of constantly
growing IT systems) scalable integration solution.
The domain of this paper, integration, usually concerns large IT systems,
quite often systems that are crucial for the operation of a company, so
introducing a solution that is not of high quality is not an option. A
high-quality solution is one that is reliable, secure, easily scalable,
distributed and high-performing. Only such solutions can be an answer to
the integration problem. For the reasons mentioned at the beginning, there
is no place for poorly tested, buggy, unreliable software products. This, as is
well known in IT, is very difficult to achieve. In solving this challenging
task, integration design patterns may come in handy.
As has already been mentioned in the introduction to this thesis, due to the
limitations of time and resources our effort has been directed so that
the created application would fit into an empty niche. This niche leaves room
for a lightweight solution suitable for solving the most common integration
problems. For obvious reasons, it was not possible to create, in such a short
time, software that would provide functionality comparable to the solutions
delivered by large software vendors. Nevertheless, aiming for this niche made
it possible to create fully functional software while, at the same time, gaining
86 CHAPTER 8. SUMMARY
knowledge about the topic of application integration. The main assumptions
made for the application have been outlined in the first chapter of this
paper. Now that it is finished and working, we can say that those assumptions
have been fulfilled. Ease of use and simplicity were our main goals, and
they have been achieved. We not only managed to describe the chosen topic
from the theoretical point of view, but also used this theoretical knowledge
while creating our application. We also managed to apply the design patterns
presented in this paper in a practical way, in order to obtain the desired
results.
Due to the shortage of time, our application lacks a graphical user interface
that would simplify its usage. Such a user interface would significantly
speed up the process of designing an integration solution. Adding it is one
possible direction for the further development of this application.
Another is to enrich its functionality by adding a larger number of design
patterns for the user to choose from. While adding those new patterns, the
overall simplicity of the solution should be kept in mind, as it is one of the
most important features of this application and must not be lost during its
further development.
The application created during our work can be seen as a base system that can
be further extended in various ways to improve its functionality and ease of
use. Sample ways in which it can be extended have been pointed out in the
previous paragraph. Those are, of course, only the extensions that in our
opinion would contribute most significantly to the development of our system;
other directions are also possible. By gradually developing it and adding new
features, a very powerful integration tool can be created, which can be used
to solve not only basic integration problems but also more sophisticated ones.
Of course, it will not be able to compete with the solutions provided by large
IT companies, but over time it can become an interesting alternative to
complicated and expensive solutions.
Our work proved that knowledge of integration design patterns is essential
not only for the designers and developers of integration solutions, but also
for people involved in creating IT products in general. Nowadays, even the
simplest applications, created to simplify everyday routines, may one day be
integrated into a large computer software infrastructure. Awareness of the
problems covered in this thesis makes it possible to design and implement
solutions flexible enough to one day become a part of some other, larger
system.
Bibliography
[3] David Chappell. Enterprise Service Bus. O’Reilly Media, Inc., 2004.
[4] Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides. Design Patterns:
Elements of Reusable Object-Oriented Software. Addison-Wesley Professional,
1995.
[7] Gregor Hohpe, Bobby Woolf. Enterprise Integration Patterns: Designing,
Building, and Deploying Messaging Solutions. Addison-Wesley Professional, 2003.
[8] William Grosso. Java RMI. O’Reilly Media, Inc., 2001.
[9] Stany Blanvalet, Jeremy Bolie, Michael Cardella, Matjaz Juric. BPEL
Cookbook: Best Practices for SOA-based integration and composite applications
development. Packt Publishing, 2006.
[10] Jim Waldo, Geoff Wyant, Ann Wollrath, Sam Kendall. A Note on Distributed
Computing. Sun Microsystems Laboratories, Inc., 1994.
[11] Doug Kaye. Loosely Coupled: The Missing Pieces of Web Services. RDS Press,
2003.
[12] Martin Keen, Amit Acharya, Susan Bishop, Alan Hopkins, Sven Milinski,
Chris Nott, Rick Robinson, Jonathan Adams, Paul Verschueren. Patterns:
Implementing an SOA Using an Enterprise Service Bus. IBM Corp., 2004.
[13] Ken Vollmer, Mike Gilpin. The Forrester Wave: Enterprise Service Bus,
Q4 2005. Forrester Research, Inc., 2005.
[14] Roy W. Schulte. Predicts 2003: Enterprise Service Buses Emerge. Gartner,
Inc., 2002.
[15] Scott McLean, James Naftel, Kim Williams. Microsoft .NET Remoting.
Microsoft Press, 2002.
[16] Wing Lam, Venky Shankararaman. Enterprise Architecture and Integration:
Methods, Implementation and Technologies. IGI Global, 2007.
Appendices
List of Figures
6.4 Case study: Message Channel with Router and Translator scenario
6.5 Case study: Enterprise Service Bus scenario
Index
legacy application, 7
loose coupling, 9
message body, 30
Message Broker, 47
message channel, 32
Message Endpoint, 38
Message Filter, 44
message header, 30
Message Oriented Middleware, 52
message properties, 30
Message Router, 34