Kafka Event System
Designing Event-Driven Systems
Concepts and Patterns for Streaming Services with Apache Kafka
Ben Stopford
Foreword by Sam Newman
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Designing Event-Driven Systems,
the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.
While the publisher and the author have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and the author disclaim all responsi‐
bility for errors or omissions, including without limitation responsibility for damages resulting from
the use of or reliance on this work. Use of the information and instructions contained in this work is
at your own risk. If any code samples or other technology this work contains or describes is subject
to open source licenses or the intellectual property rights of others, it is your responsibility to ensure
that your use thereof complies with such licenses and/or rights.
This work is part of a collaboration between O’Reilly and Confluent. See our statement of editorial
independence.
978-1-492-03822-1
Table of Contents
Foreword vii
Preface xi

Part II. Designing Event-Driven Systems
5. Events: A Basis for Collaboration 29
    Commands, Events, and Queries 30
    Coupling and Message Brokers 32
    Using Events for Notification 34
    Using Events to Provide State Transfer 37
    Which Approach to Use 38
    The Event Collaboration Pattern 39
    Relationship with Stream Processing 41
    Mixing Request- and Event-Driven Protocols 42
    Summary 44
10. Lean Data 91
    If Messaging Remembers, Databases Don’t Have To 91
    Take Only the Data You Need, Nothing More 92
    Rebuilding Event-Sourced Views 93
    Automation and Schema Migration 94
    Summary 96

Windows, Joins, Tables, and State Stores 135
Summary 138
Foreword
For as long as we’ve been talking about services, we’ve been talking about data. In
fact, before we even had the word microservices in our lexicon, back when it was
just good old-fashioned service-oriented architecture, we were talking about
data: how to access it, where it lives, who “owns” it. Data is all-important—vital
for the continued success of our business—but has also been seen as a massive
constraint in how we design and evolve our systems.
My own journey into microservices began with work I was doing to help organi‐
zations ship software more quickly. This meant a lot of time was spent on things
like cycle time analysis, build pipeline design, test automation, and infrastructure
automation. The advent of the cloud was a huge boon to the work we were
doing, as the improved automation made us even more productive. But I kept
hitting some fundamental issues. All too often, the software wasn’t designed in a
way that made it easy to ship. And data was at the heart of the problem.
Back then, the most common pattern I saw for service-based systems was sharing
a database among multiple services. The rationale was simple: the data I need is
already in this other database, and accessing a database is easy, so I’ll just reach in
and grab what I need. This may allow for fast development of a new service, but
over time it becomes a major constraint.
As I expanded upon in my book, Building Microservices, a shared database cre‐
ates a huge coupling point in your architecture. It becomes difficult to under‐
stand what changes can be made to a schema shared by multiple services. David
Parnas1 showed us back in 1971 that the secret to creating software whose parts
could be changed independently was to hide information between modules. But
at a swoop, exposing a schema to multiple services prohibits our ability to inde‐
pendently evolve our codebases.
1 D. L. Parnas, On the Criteria to Be Used in Decomposing Systems into Modules (Pittsburgh, PA: Carnegie
Mellon University, 1971).
As the needs and expectations of software changed, IT organizations changed
with them. The shift from siloed IT toward business- or product-aligned teams
helped improve the customer focus of those teams. This shift often happened in
concert with the move to improve the autonomy of those teams, allowing them
to develop new ideas, implement them, and then ship them, all while reducing
the need for coordination with other parts of the organization. But highly cou‐
pled architectures require heavy coordination between systems and the teams
that maintain them—they are the enemy of any organization that wants to opti‐
mize autonomy.
Amazon spotted this many years ago. It wanted to improve team autonomy to
allow the company to evolve and ship software more quickly. To this end, Ama‐
zon created small, independent teams who would own the whole lifecycle of
delivery. Steve Yegge, after leaving Amazon for Google, attempted to capture
what it was that made those teams work so well in his infamous (in some circles)
“Platform Rant”. In it, he outlined the mandate from Amazon CEO Jeff Bezos
regarding how teams should work together and how they should design systems.
These points in particular resonate for me:
1) All teams will henceforth expose their data and functionality through service
interfaces.
2) Teams must communicate with each other through these interfaces.
3) There will be no other form of interprocess communication allowed: no direct
linking, no direct reads of another team’s datastore, no shared-memory model, no
backdoors whatsoever. The only communication allowed is via service interface
calls over the network.
In my own way, I came to the realization that how we store and share data is key
to ensuring we develop loosely coupled architectures. Well-defined interfaces are
key, as is hiding information. If we need to store data in a database, that database
should be part of a service, and not accessed directly by other services. A well-
defined interface should guide when and how that data is accessed and manipu‐
lated.
Much of my time over the past several years has been taken up with pushing this
idea. But while people increasingly get it, challenges remain. The reality is that
services do need to work together and do sometimes need to share data. How do
you do that effectively? How do you ensure that this is done in a way that is sym‐
pathetic to your application’s latency and load conditions? What happens when
one service needs a lot of information from another?
Enter streams of events, specifically the kinds of streams that technology like
Kafka makes possible. We’re already using message brokers to exchange events,
but Kafka’s ability to make that event stream persistent allows us to consider a
new way of storing and exchanging data without losing out on our ability to cre‐
ate loosely coupled autonomous architectures. In this book, Ben talks about the
idea of “turning the database inside out”—a concept that I suspect will get as
many skeptical responses as I did back when I was suggesting moving away from
giant shared databases. But after the last couple of years I’ve spent exploring
these ideas with Ben, I can’t help thinking that he and the other people working
on these concepts and technology (and there is certainly lots of prior art here)
really are on to something.
I’m hopeful that the ideas outlined in this book are another step forward in how
we think about sharing and exchanging data, helping us change how we build
microservice architecture. The ideas may well seem odd at first, but stick with
them. Ben is about to take you on a very interesting journey.
—Sam Newman
Preface
In 2006 I was working at ThoughtWorks, in the UK. There was a certain energy
to the office at that time, with lots of interesting things going on. The Agile
movement was in full bloom, BDD (behavior-driven development) was flourish‐
ing, people were experimenting with Event Sourcing, and SOA (service-oriented
architecture) was being adapted to smaller projects to deal with some of the
issues we’d seen in larger implementations.
One project I worked on was led by Dave Farley, an energetic and cheerful fellow
who managed to transfer his jovial bluster into pretty much everything we did.
The project was a relatively standard, medium-sized enterprise application. It
had a web portal where customers could request a variety of conveyancing serv‐
ices. The system would then run various synchronous and asynchronous pro‐
cesses to put the myriad of services they requested into action.
There were a number of interesting elements to that particular project, but the
one that really stuck with me was the way the services communicated. It was the
first system I’d worked on that was built solely from a collaboration of events.
Having worked with a few different service-based systems before, all built with
RPCs (remote procedure calls) or request-response messaging, I thought this one
felt very different. There was something inherently spritely about the way you
could plug new services right into the event stream, and something deeply satis‐
fying about tailing the log of events and watching the “narrative” of the system
whizz past.
A few years later, I was working at a large financial institution that wanted to
build a data service at the heart of the company, somewhere applications could
find the important datasets that made the bank work—trades, valuations, refer‐
ence data, and the like. I find this sort of problem quite compelling: it was techni‐
cally challenging and, although a number of banks and other large companies
had taken this kind of approach before, it felt like the technology had moved on
to a point where we could build something really interesting and transformative.
Yet getting the technology right was only the start of the problem. The system
had to interface with every major department, and that meant a lot of stakehold‐
ers with a lot of requirements, a lot of different release schedules, and a lot of
expectations around uptime. I remember discussing the practicalities of the
project as we talked our design through in a two-week stakeholder kick-off meet‐
ing. It seemed a pretty tall order, not just technically, but organizationally, but it
also seemed plausible.
So we pulled together a team, with a bunch of people from ThoughtWorks and
Google and a few other places, and the resulting system had some pretty interest‐
ing properties. The datastore held queryable data in memory, spread over 35
machines per datacenter, so it could handle being hit from a compute grid.
Writes went directly through the query layer into a messaging system, which
formed (somewhat unusually for the time) the system of record. Both the query
layer and the messaging layer were designed to be sharded so they could scale
linearly. So every insert or update was also a published event, and there was no
side-stepping it either; it was baked into the heart of the architecture.
The interesting thing about making messaging the system of record is you find
yourself repurposing the data stream to do a whole variety of useful things:
recording it on a filesystem for recovery, pushing it to another datacenter, hyd‐
rating a set of databases for reporting and analytics, and, of course, broadcasting
it to anyone with the API who wants to listen.
But the real importance of using messaging as a system of record evaded me
somewhat at the time. I remember speaking about the project at QCon, and there
were more questions about the lone “messaging as a system of record” slide,
which I’d largely glossed over, than there were about the fancy distributed join
layer that the talk had focused on. So it slowly became apparent that, for all its
features—the data-driven precaching that made joins fast, the SQL-over-
Document interface, the immutable data model, and late-bound schema—what
most customers needed was really subtly different, and somewhat simpler. While
they would start off making use of the data service directly, as time passed, some
requirement would often lead them to take a copy, store it independently, and do
their own thing. But despite this, they still found the central dataset useful and
would often take a subset, then later come back for more. So, on reflection, it
seemed that a messaging system optimized to hold datasets would be more
appropriate than a database optimized to publish them. A little while later Con‐
fluent formed, and Kafka seemed a perfect solution for this type of problem.
The interesting thing about these two experiences (the conveyancing application
and the bank-wide data service) is that they are more closely related than they
may initially appear. The conveyancing application had been wonderfully collab‐
orative, yet pluggable. At the bank, a much larger set of applications and services
integrated through events, but also leveraged a historic reference they could go
back to and query. So the contexts were quite different—the first was a single
application, the second a company—but much of the elegance of both systems
came from their use of events.
Streaming systems today are in many ways quite different from both of these
examples, but the underlying patterns haven’t really changed all that much. Nev‐
ertheless, the devil is in the details, and over the last few years we’ve seen clients
take a variety of approaches to solving both of these kinds of problems, along
with many others: problems that both distributed logs and stream processing
tools are well suited to. I’ve tried to extract the key elements of these
approaches in this short book.
Acknowledgments
Many people contributed to this book, both directly and indirectly, but a special
thanks to Jay Kreps, Sam Newman, Edward Ribeiro, Gwen Shapira, Steve Counsell,
Martin Kleppmann, Yeva Byzek, Dan Hanley, Tim Berglund, and of course
my ever-patient wife, Emily.
PART I
Setting the Stage

CHAPTER 1
Introduction
While the main focus of this book is the building of event-driven systems of dif‐
ferent sizes, there is a deeper focus on software that spans many teams. This is
the realm of service-oriented architectures: an idea that arose around the start of
the century, where a company reconfigures itself around shared services that do
commonly useful things.
This idea became quite popular. Amazon famously banned all intersystem com‐
munications by anything that wasn’t a service interface. Later, upstart Netflix
went all in on microservices, and many other web-based startups followed suit.
Enterprise companies did similar things, but often using messaging systems,
which have a subtly different dynamic. Much was learned during this time, and
there was significant progress made, but it wasn’t straightforward.
One lesson learned, which was pretty ubiquitous at the time, was that service-
based approaches significantly increased the probability of you getting paged at 3
a.m., when one or more services go down. In hindsight, this shouldn’t have been
surprising. If you take a set of largely independent applications and turn them
into a web of highly connected ones, it doesn’t take too much effort to imagine
that one important but flaky service can have far-reaching implications, and in
the worst case bring the whole system to a halt. As Steve Yegge put it in his
famous Amazon/Google post, “Organizing into services taught teams not to trust
each other in most of the same ways they’re not supposed to trust external devel‐
opers.”
What did work well for Amazon, though, was the element of organizational
change that came from being wholeheartedly service based. Service teams think
of their software as being a cog in a far larger machine. As Ian Robinson put it,
“Be of the web, not behind the web.” This was a huge shift from the way people
built applications previously, where intersystem communication was something
teams reluctantly bolted on as an afterthought. But the services model made
interaction a first-class entity. Suddenly your users weren’t just customers or
businesspeople; they were other applications, and they really cared that your ser‐
vice was reliable. So applications became platforms, and building platforms is
hard.
LinkedIn felt this pain as it evolved away from its original, monolithic Java appli‐
cation into 800–1,100 services. Complex dependencies led to instability, version‐
ing issues caused painful lockstep releases, and early on, it wasn’t clear that the
new architecture was actually an improvement.
One difference in the way LinkedIn evolved its approach was its use of a messag‐
ing system built in-house: Kafka. Kafka added an asynchronous publish-
subscribe model to the architecture that enabled trillions of messages a day to be
transported around the organization. This was important for a company in
hypergrowth, as it allowed new applications to be plugged in without disturbing
the fragile web of synchronous interactions that drove the frontend.
But this idea of rearchitecting a system around events isn’t new—event-driven
architectures have been around for decades, and technologies like enterprise
messaging are big business, particularly with (unsurprisingly) enterprise compa‐
nies. Most enterprises have been around for a long time, and their systems have
grown organically, over many iterations or through acquisition. Messaging sys‐
tems naturally fit these complex and disconnected worlds for the same reasons
observed at LinkedIn: events decouple, and this means different parts of the
company can operate independently of one another. It also means it’s easier to
plug new systems into the real-time stream of events.
A good example is the regulation that hit the finance industry in January 2018,
which states that trading activity has to be reported to a regulator within one
minute of it happening. A minute may seem like a long time in computing terms,
but it takes only one batch-driven system, on the critical path in one business
silo, for that to be unattainable. So the banks that had gone to the effort of instal‐
ling real-time trade eventing, and plumbed it across all their product-aligned
silos, made short work of these regulations. For the majority that hadn’t it was a
significant effort, typically resulting in half-hearted, hacky solutions.
So enterprise companies start out complex and disconnected: many separate,
asynchronous islands—often with users of their own—operating independently
of one another for the most part. Internet companies are different, starting life as
simple, front-facing web applications where users click buttons and expect things
to happen. Most start as monoliths and stay that way for some time (arguably for
longer than they should). But as internet companies grow and their business gets
more complex, they see a similar shift to asynchronicity. New teams and depart‐
ments are introduced and they need to operate independently, freed from the
synchronous bonds that tie the frontend. So ubiquitous desires for online utilit‐
ies, like making a payment or updating a shopping basket, are slowly replaced by
a growing need for datasets that can be used, and evolved, without any specific
application lock-in.
But messaging is no panacea. Enterprise service buses (ESBs), for example, have
vocal detractors and traditional messaging systems have a number of issues of
their own. They are often used to move data around an organization, but the
absence of any notion of history limits their value. So, even though recent events
typically have more value than old ones, business operations still need historical
data—whether it’s users wanting to query their account history, some service
needing a list of customers, or analytics that need to be run for a management
report.
On the other hand, data services with HTTP-fronted interfaces make lookups
simple. Anyone can reach in and run a query. But they don’t make it so easy to
move data around. To extract a dataset you end up running a query, then period‐
ically polling the service for changes. This is a bit of a hack, and typically the
operators in charge of the service you’re polling won’t thank you for it.
But replayable logs, like Kafka, can play the role of an event store: a middle
ground between a messaging system and a database. (If you don’t know Kafka,
don’t worry—we dive into it in Chapter 4.) Replayable logs decouple services
from one another, much like a messaging system does, but they also provide a
central point of storage that is fault-tolerant and scalable—a shared source of
truth that any application can fall back to.
A shared source of truth turns out to be a surprisingly useful thing. Microservi‐
ces, for example, don’t share their databases with one another (referred to as the
IntegrationDatabase antipattern). There is a good reason for this: databases have
very rich APIs that are wonderfully useful on their own, but when widely shared
they make it hard to work out if and how one application is going to affect oth‐
ers, be it data couplings, contention, or load. But the business facts that services
do choose to share are the most important facts of all. They are the truth that the
rest of the business is built on. Pat Helland called out this distinction back in
2006, denoting it “data on the outside.”
But a replayable log provides a far more suitable place to hold this kind of data
because (somewhat counterintuitively) you can’t query it! It is purely about stor‐
ing data and pushing it to somewhere new. This idea of pure data movement is
important, because data on the outside—the data services share—is the most
tightly coupled of all, and the more services an ecosystem has, the more tightly
coupled this data gets. The solution is to move data somewhere that is more
loosely coupled, so that means moving it into your application where you can
manipulate it to your heart’s content. So data movement gives applications a
level of operability and control that is unachievable with a direct, runtime
dependency. This idea of retaining control turns out to be important—it’s the
same reason the shared database pattern doesn’t work out well in practice.
So, this replayable log–based approach has two primary benefits. First, it makes it
easy to react to events that are happening now, with a toolset specifically
designed for manipulating them. Second, it provides a central repository that can
push whole datasets to wherever they may be needed. This is pretty useful if you
run a global business with datacenters spread around the world, need to boot‐
strap or prototype a new project quickly, do some ad hoc data exploration, or
build a complex service ecosystem that can evolve freely and independently.
So there are some clear advantages to the event-driven approach (and there are
of course advantages for the REST/RPC models too). But this is, in fact, only half
the story. Streaming isn’t simply an alternative to RPCs that happens to work
better for highly connected use cases; it’s a far more fundamental change in
mindset that involves rethinking your business as an evolving stream of data, and
your services as functions that transform these streams of data into something
new.
This can feel unnatural. Many of us have been brought up with programming
styles where we ask questions or issue commands and wait for answers. This is
how procedural or object-oriented programs work, but the biggest culprit is
probably the database. For nearly half a century databases have played a central
role in system design, shaping—more than any other tool—the way we write
(and think about) programs. This has been, in some ways, unfortunate.
As we move from chapter to chapter, this book builds up a subtly different
approach to dealing with data, one where the database is taken apart, unbundled,
deconstructed, and turned inside out. These concepts may sound strange or even
novel, but they are, like many things in software, evolutions of older ideas that
have arisen somewhat independently in various technology subcultures. For
some time now, mainstream programmers have used event-driven architectures,
Event Sourcing, and CQRS (Command Query Responsibility Segregation) as a
means to break away from the pains of scaling database-centric systems. The big
data space encountered similar issues as multiterabyte-sized datasets highlighted
the inherent impracticalities of batch-driven data management, which in turn led
to a pivot toward streaming. The functional world has sat aside, somewhat know‐
ingly, periodically tugging at the imperative views of the masses.
But these disparate progressions—turning the database inside out, destructuring,
CQRS, unbundling—all have one thing in common. They are all simple
metaphors for the need to separate the conflation of concepts embedded into
every database we use, to decouple them so that we can manage them separately
and hence efficiently.
There are a number of reasons for wanting to do this, but maybe the most
important of all is that it lets us build larger and more functionally diverse sys‐
tems. So while a database-centric approach works wonderfully for individual
applications, we don’t live in a world of individual applications. We live in a
world of interconnected systems—individual components that, while all valuable
in themselves, are really part of a much larger puzzle. We need a mechanism for
sharing data that complements this complex, interconnected world. Events lead
us to this. They constantly push data into our applications. These applications
react, blending streams together, building views, changing state, and moving
themselves forward. In the streaming model there is no shared database. The
database is the event stream, and the application simply molds it into something
new.
In fairness, streaming systems still have database-like attributes such as tables
(for lookups) and transactions (for atomicity), but the approach has a radically
different feel, more akin to functional or dataflow languages (and there is much
cross-pollination between the streaming and functional programming communi‐
ties).
So when it comes to data, we should be unequivocal about the shared facts of our
system. They are the very essence of our business, after all. Facts may be evolved
over time, applied in different ways, or even recast to different contexts, but they
should always tie back to a single thread of irrevocable truth, one from which all
others are derived—a central nervous system that underlies and drives every
modern digital business.
This book looks quite specifically at the application of Apache Kafka to this
problem. In Part I we introduce streaming and take a look at how Kafka works.
Part II focuses on the patterns and techniques needed to build event-driven pro‐
grams: Event Sourcing, Event Collaboration, CQRS, and more. Part III takes
these ideas a step further, applying them in the context of multiteam systems,
including microservices and SOA, with a focus on event streams as a source of
truth and the aforementioned idea that both systems and companies can be
reimagined as a database turned inside out. In the final part, we take a slightly
more practical focus, building a small streaming system using Kafka Streams
(and KSQL).
CHAPTER 2
The Origins of Streaming
This book is about building business systems with stream processing tools, so it
is useful to have an appreciation for where stream processing came from. The
maturation of this toolset, in the world of real-time analytics, has heavily influ‐
enced the way we build event-driven systems today.
Figure 2-1 shows a stream processing system used to ingest data from several
hundred thousand mobile devices. Each device sends small JSON messages to
denote applications on each mobile phone that are being opened, being closed,
or crashing. This can be used to look for instability—that is, where the ratio of
crashes to usage is comparatively high.
Figure 2-1. A typical streaming application that ingests data from mobile devices
into Kafka, processes it in a streaming layer, and then pushes the result to a serving
layer where it can be queried
The mobile devices land their data into Kafka, which buffers it until it can be
extracted by the various applications that need to put it to further use. For this
type of workload the cluster would be relatively large; as a ballpark figure Kafka
ingests data at network speed, but the overhead of replication typically divides
that by three (so a three-node 10 GbE cluster will ingest around 1 GB/s in prac‐
tice).
To the right of Kafka in Figure 2-1 sits the stream processing layer. This is a clus‐
tered application, where queries are either defined up front via the Java DSL or
sent dynamically via KSQL, Kafka’s SQL-like stream processing language. Unlike
in a traditional database, these queries compute continuously, so every time an
input arrives in the stream processing layer, the query is recomputed, and a result
is emitted if the value of the query has changed.
Once a new message has passed through all streaming computations, the result
lands in a serving layer from which it can be queried. Cassandra is shown in
Figure 2-1, but pushing to HDFS (Hadoop Distributed File System), pushing to
another datastore, or querying directly from Kafka Streams using its interactive
queries feature are all common approaches as well.
To understand streaming better, it helps to look at a typical query. Figure 2-2
shows one that computes the total number of app crashes per day. Every time a
new message comes in, signifying that an application crashed, the count of total
crashes for that application will be incremented. Note that this computation
requires state: the count for the day so far (i.e., within the window duration)
must be stored so that, should the stream processor crash/restart, the count will
continue where it was before. Kafka Streams and KSQL manage this state inter‐
nally, and that state is backed up to Kafka via a changelog topic. This is discussed
in more detail in “Windows, Joins, Tables, and State Stores” on page 135 in Chap‐
ter 14.
Figure 2-2. A simple KSQL query that evaluates crashes per day
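A query of this kind can also be written in a few lines of the Kafka Streams Java DSL. The sketch below is illustrative only: the app-events topic, its key and value layout, and the "CRASH" marker are assumptions made for the example, not details taken from the figure, and it assumes a reasonably recent Kafka Streams version.

import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.TimeWindows;

public class CrashCountExample {
  public static void main(String[] args) {
    StreamsBuilder builder = new StreamsBuilder();

    // Hypothetical topic of mobile events, keyed by application ID, with a
    // plain-text event type as the value ("OPEN", "CLOSE", "CRASH", ...).
    builder.stream("app-events", Consumed.with(Serdes.String(), Serdes.String()))
        .filter((appId, eventType) -> "CRASH".equals(eventType))
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofDays(1)))  // one window per day
        .count()                                         // stateful: kept in a local store
        .toStream()
        .foreach((windowedAppId, crashes) ->
            System.out.println(windowedAppId.key() + " crashed " + crashes + " times"));

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "crash-counter");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    new KafkaStreams(builder.build(), props).start();
  }
}

Because the count is stateful, the Streams API keeps it in a local store and backs that store up to Kafka via a changelog topic, exactly as described above.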
Multiple queries of this type can be chained together in a pipeline. In Figure 2-3,
we break the preceding problem into three steps chained over two stages. Queries
(a) and (b) continuously compute apps opened per day and apps crashed per
day, respectively. The two resulting output streams are combined together in a
third query, which joins them to derive the ratio of crashes to usage for each
application.
Figure 2-3. Two initial stream processing queries are pushed into a third to create a
pipeline
There are a few other things to note about this streaming approach:
The streaming layer is fault-tolerant
It runs as a cluster on all available nodes. If one node exits, another will pick
up where it left off. Likewise, you can scale out the cluster by adding new
processing nodes. Work, and any required state, will automatically be rerou‐
ted to make use of these new resources.
Each stream processor node can hold state of its own
This is required for buffering as well as holding whole tables, for example, to
do enrichments (streams and tables are discussed in more detail in “Win‐
dows, Joins, Tables, and State Stores” on page 135 in Chapter 14). This idea of
local storage is important, as it lets the stream processor perform fast,
message-at-a-time queries without crossing the network—a necessary fea‐
ture for the high-velocity workloads seen in internet-scale use cases. But this
ability to internalize state in local stores turns out to be useful for a number
of business-related use cases too, as we discuss later in this book.
Each stream processor can write and store local state
Making message-at-a-time network calls isn’t a particularly good idea when
you’re handling a high-throughput event stream. For this reason stream pro‐
cessors write data locally (so writes and reads are fast) and back those writes
up to Kafka via a changelog topic.
There is an old parable about an elephant and a group of blind men. None of the
men had come across an elephant before. One blind man approaches the leg and
declares, “It’s like a tree.” Another man approaches the tail and declares, “It’s like
a rope.” A third approaches the trunk and declares, “It’s like a snake.” So each
blind man senses the elephant from his particular point of view, and comes to a
subtly different conclusion as to what an elephant is. Of course the elephant is
like all these things, but it is really just an elephant!
Likewise, when people learn about Kafka they often see it from a certain view‐
point. These perspectives are usually accurate, but highlight only some subsec‐
tion of the whole platform. In this chapter we look at some common points of
view.
also no requirement for storage. So this leaves the question: would you be better
off using a stateless protocol like HTTP?
So Kafka is a mechanism for programs to exchange information, but its home
ground is event-based communication, where events are business facts that have
value to more than one service and are worth keeping around.
messages matter, to mission-critical use cases where messages and their relative
ordering must be preserved with the same guarantees as you’d expect from a
DBMS (database management system) or storage system. The price paid for this
scalability is a slightly simpler contract that lacks some of the obligations of JMS
or AMQP, such as message selectors.
But this change of tack turns out to be quite important. Kafka’s throughput prop‐
erties make moving data from process to process faster and more practical than
with previous technologies. Its ability to store datasets removes the queue-depth
problems that plagued traditional messaging systems. Finally, its rich APIs, par‐
ticularly Kafka Streams and KSQL, provide a unique mechanism for embedding
data processing directly inside client programs. These attributes have led to its
use as a message and storage backbone for service estates in a wide variety of
companies that need all of these capabilities.
Taking a log-structured approach has an interesting side effect. Both reads and
writes are sequential operations. This makes them sympathetic to the underlying
media, leveraging prefetch, the various layers of caching, and naturally batching
operations together. This in turn makes them efficient. In fact, when you read
messages from Kafka, the server doesn’t even import them into the JVM (Java
virtual machine). Data is copied directly from the disk buffer to the network
buffer, an optimization commonly referred to as zero copy.
Linear Scalability
As we’ve discussed, logs provide a hardware-sympathetic data structure for mes‐
saging workloads, but Kafka is really many logs, spanning many different
machines. The system ties these together, routing messages reliably, replicating
for fault tolerance, and handling failure gracefully.
While running on a single machine is possible, production clusters typically start
at three machines with larger clusters in the hundreds. When you read and write
to a topic, you’ll typically be reading and writing to all of them, partitioning your
data over all the machines you have at your disposal. Scaling is thus a pretty sim‐
ple affair: add new machines and rebalance. Consumption can also be performed
in parallel, with messages in a topic being spread over several consumers in a
consumer group (see Figure 4-2).
Figure 4-2. Producers spread messages over many partitions, on many machines,
where each partition is a little queue; load-balanced consumers (denoted a con‐
sumer group) share the partitions between them; rate limits are applied to produc‐
ers, consumers, and groups
The main advantage of this, from an architectural perspective, is that it takes the
issue of scalability off the table. With Kafka, hitting a scalability wall is virtually
impossible in the context of business systems. This can be quite empowering,
especially when ecosystems grow, allowing implementers to pick patterns that
are a little more footloose with bandwidth and data movement.
Scalability opens other opportunities too. Single clusters can grow to company
scales, without the risk of workloads overpowering the infrastructure. For exam‐
ple, New Relic relies on a single cluster of around 100 nodes, spanning three
datacenters, and processing 30 GB/s. In other, less data-intensive domains, 5- to
10-node clusters commonly support whole-company workloads. But it should be
noted that not all companies take the “one big cluster” route. Netflix, for exam‐
ple, advises using several smaller clusters to reduce the operational overheads of
running very large installations, but their largest installation is still around the
200-node mark.
To manage shared clusters, it’s useful to carve bandwidth up, using the band‐
width segregation features that ship with Kafka. We’ll discuss these next.
Figure 4-4. If an instance of a service dies, data is redirected and ordering guaran‐
tees are maintained
Should one of the services fail, Kafka will detect this failure and reroute messages
from the failed service to the one that remains. If the failed service comes back
online, load flips back again.
This process actually works by assigning whole partitions to different consumers.
A strength of this approach is that a single partition can only ever be assigned to
one service instance (consumer) at a time, so the ordering of messages within each
partition is preserved.
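A minimal consumer sketch shows the mechanics: every instance started with the same group.id joins one consumer group, Kafka assigns each partition to exactly one of them, and a rebalance redistributes partitions when instances come and go. The topic and group names are assumptions for illustration.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OrdersWorker {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    // All instances started with this group.id share the topic's partitions.
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-service");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      consumer.subscribe(Collections.singletonList("orders"));
      while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
          // Messages from any one partition arrive here in order.
          process(record.key(), record.value());
        }
      }
    }
  }

  static void process(String orderId, String payload) { /* business logic */ }
}

If a second OrdersWorker is started with the same group.id, the partitions of the orders topic are split between the two; if one dies, its partitions move back to the survivor.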
Compacted Topics
By default, topics in Kafka are retention-based: messages are retained for some
configurable amount of time. Kafka also ships with a special type of topic that
manages keyed datasets—that is, data that has a primary key (identifier) as you
might have in a database table. These compacted topics retain only the most
recent events, with any old events, for a certain key, being removed. They also
support deletes (see “Deleting Data” on page 127 in Chapter 13).
Compacted topics work a bit like simple log-structure merge-trees (LSM trees).
The topic is scanned periodically, and old messages are removed if they have
been superseded (based on their key); see Figure 4-5. It’s worth noting that this is
an asynchronous process, so a compacted topic may contain some superseded
messages, which are waiting to be compacted away.
Figure 4-5. In a compacted topic, superseded messages that share the same key are
removed. So, in this example, for key K2, messages V2 and V1 would eventually be
compacted as they are superseded by V3.
Compacted topics let us make a couple of optimizations. First, they help us slow
down a dataset’s growth (by removing superseded events), but we do so in a
data-specific way rather than, say, simply removing messages older than two
weeks. Second, having smaller datasets makes it easier for us to move them from
machine to machine.
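Compaction is a per-topic setting, controlled by the cleanup.policy configuration. Below is a sketch of creating such a topic with the Java AdminClient; the topic name, partition count, and replication factor are illustrative choices, not recommendations.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class CreateCompactedTopic {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

    try (AdminClient admin = AdminClient.create(props)) {
      NewTopic customers = new NewTopic("customers", 12, (short) 3)
          // Keep only the latest event per key, rather than expiring by time.
          .configs(Collections.singletonMap(
              TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_COMPACT));
      admin.createTopics(Collections.singleton(customers)).all().get();
    }
  }
}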
This is important for stateful stream processing. Say a service uses the Kafka
Streams API to load the latest version of the product catalogue into a table (as
discussed in "Windows, Joins, Tables, and State Stores" on page 135 in Chapter 14).
Compaction keeps that dataset small, so it can be loaded, or moved to another
machine, far more quickly.
Security
Kafka provides a number of enterprise-grade security features for both authenti‐
cation and authorization. Client authentication is provided through either Ker‐
beros or Transport Layer Security (TLS) client certificates, ensuring that the
Kafka cluster knows who is making each request. There is also a Unix-like per‐
missions system, which can be used to control which users can access which data.
Network communication can be encrypted, allowing messages to be securely sent
across untrusted networks. Finally, administrators can require authentication for
communication between Kafka and ZooKeeper.
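As a rough illustration, a client might enable TLS as in the sketch below. The file paths and passwords are placeholders, and a Kerberos (SASL) setup would use a different set of properties.

import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.config.SslConfigs;
import org.apache.kafka.common.serialization.StringSerializer;

public class TlsProducer {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9093");
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

    // Encrypt traffic and authenticate this client with a TLS certificate,
    // so the broker knows who is making each request.
    props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
    props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/etc/kafka/client.truststore.jks");
    props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, "changeit");
    props.put(SslConfigs.SSL_KEYSTORE_LOCATION_CONFIG, "/etc/kafka/client.keystore.jks");
    props.put(SslConfigs.SSL_KEYSTORE_PASSWORD_CONFIG, "changeit");

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      // ... send records as usual.
    }
  }
}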
The quotas mechanism, discussed in the section “Segregating Load in Multiser‐
vice Ecosystems” on page 21, can be linked to this notion of identity, and Kafka’s
security features are extended across the different components of the Confluent
platform (the Rest Proxy, Confluent Schema Registry, Replicator, etc.).
Summary
Kafka is a little different from your average messaging technology. Being
designed as a distributed, scalable infrastructure component makes it an ideal
backbone through which services can exchange and buffer events. There are
obviously a number of elements unique to the technology itself, but the ones that
stand out for building service ecosystems are the ones covered in this chapter: a
replayable, log-structured storage layer; linear scalability; compacted topics; and
built-in mechanisms for load segregation and security.
CHAPTER 5
Events: A Basis for Collaboration
Life is a series of natural and spontaneous changes. Don’t resist them—that only
creates sorrow. Let reality be reality. Let things flow naturally forward.
—Lao-Tzu, 6th–5th century BCE
In practice we tend to embrace this automatically. We’ve all found ourselves poll‐
ing database tables for changes, or implementing some kind of scheduled cron
job to churn through updates. These are simple ways to break the ties of synchro‐
nicity, but they always feel like a bit of a hack. There is a good reason for this:
they probably are.
So we can condense all these issues into a single observation. The imperative pro‐
gramming model, where we command services to do our bidding, isn’t a great fit
for estates where services are operated independently.
In this chapter we’re going to focus on the other side of the architecture coin:
composing services not through chains of commands and queries, but rather
through streams of events. This is an implementation pattern in its own right,
and has been used in industry for many years, but it also forms a baseline for the
more advanced patterns we’ll be discussing in Part III and Part V, where we
blend the ideas of event-driven processing with those seen in streaming plat‐
forms.
1 The term command originally came from Bertrand Meyer’s CQS (Command Query Separation) princi‐
ple. A slightly different definition from Bertrand’s is used here, leaving it optional as to whether a com‐
mand should return a result or not. There is a reason for this: a command is a request for something
specific to happen in the future. Sometimes it is desirable to have no return value; other times, a return
value is important. Martin Fowler uses the example of popping a stack, while here we use the example of
processing a payment, which simply returns whether the command succeeded. By leaving the command
with an optional return type, the implementer can decide if it should return a result or not, and if not
CQS/CQRS may be used. This saves the need for having another name for a command that does return
a result. Finally, a command is never an event. A command has an explicit expectation that something
(a state change or side effect) will happen in the future. Events come with no such future expectation.
They are simply a statement that something happened.
Events
Events are both a fact and a notification. They represent something that hap‐
pened in the real world but include no expectation of any future action. They
travel in only one direction and expect no response (sometimes called “fire
and forget”), but one may be “synthesized” from a subsequent event.
• Example: OrderCreated{Widget}, CustomerDetailsUpdated{Customer}
• When to use: When loose coupling is important (e.g., in multiteam sys‐
tems), where the event stream is useful to more than one service, or
where data must be replicated from one application to another. Events
also lend themselves to concurrent execution.
Queries
Queries are a request to look something up. Unlike events or commands,
queries are free of side effects; they leave the state of the system unchanged.
• Example: getOrder(ID=42) returns Order(42,…).
• When to use: For lightweight data retrieval across service boundaries, or
heavyweight data retrieval within service boundaries.
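As a concrete illustration of the fire-and-forget nature of an event, the sketch below publishes an OrderCreated fact to a Kafka topic and expects nothing back. The topic name, key, and JSON layout are assumptions for illustration, not a prescribed format.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderEvents {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      // An event is a statement of fact: published once, no reply expected.
      producer.send(new ProducerRecord<>("orders",
          "order-42",
          "{\"type\":\"OrderCreated\",\"orderId\":\"order-42\",\"product\":\"Widget\"}"));
    }
  }
}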
The beauty of events is they wear two hats: a notification hat that triggers services
into action, but also a replication hat that copies data from one service to
another. But from a services perspective, events lead to less coupling than com‐
mands and queries. Loose coupling is a desirable property where interactions
cross deployment boundaries, as services with fewer dependencies are easier to
change.
3 “Shared nothing” is also used in the database world but to mean a slightly different thing.
4 As an anecdote, I once worked with a team that would encrypt sections of the information they pub‐
lished, not so it was secure, but so they could control who could couple to it (by explicitly giving the
other party the encryption key). I wouldn’t recommend this practice, but it makes the point that people
really care about this problem.
Let’s look at a simple example based on a customer ordering an iPad. The user
clicks Buy, and an order is sent to the orders service. Three things then happen:
The same flow can be built with an event-driven approach (Figure 5-4), where
the orders service simply journals the event, “Order Created,” which the shipping
service then reacts to.
If we look closely at Figure 5-4, the interaction between the orders service and
the shipping service hasn’t changed all that much, other than that they commu‐
nicate via events rather than calling each other directly. But there is an important
change: the orders service has no knowledge that the shipping service exists. It
just raises an event denoting that it did its job and an order was created. The
shipping service now has control over whether it partakes in the interaction. This
is an example of receiver-driven routing: logic for routing is located at the
receiver of the events, rather than at the sender. The burden of responsibility is
flipped! This reduces coupling and adds a useful level of pluggability to the sys‐
tem.
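To make receiver-driven routing concrete, here is a minimal sketch of a shipping service built with Kafka Streams. The orders topic, the JSON event format, and the arrangeShipping method are illustrative assumptions; the point is that the subscription and the filtering logic live entirely in the receiver.

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;

public class ShippingService {
  public static void main(String[] args) {
    StreamsBuilder builder = new StreamsBuilder();

    // The routing decision lives here, in the receiver, not in the orders service.
    builder.stream("orders", Consumed.with(Serdes.String(), Serdes.String()))
        .filter((orderId, event) -> event.contains("\"type\":\"OrderCreated\""))
        .foreach((orderId, event) -> arrangeShipping(orderId));

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "shipping-service");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    new KafkaStreams(builder.build(), props).start();
  }

  static void arrangeShipping(String orderId) { /* start the shipping workflow */ }
}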
Pluggability becomes increasingly important as systems get more complex. Say
we decide to extend our system by adding a repricing service, which updates the
price of goods in real time, tweaking a product’s price based on supply and
demand (Figure 5-5). In a REST- or RPC-based approach we would need to
introduce a maybeUpdatePrice() method, which is called by both the orders ser‐
vice and the payment service. But in the event-driven model, repricing is just a
service that plugs into the event streams for orders and payments, sending out
price updates when relevant criteria are met.
Figure 5-6. Extending the system described in Figure 5-4 to be fully event-driven;
here events are used for notification (the orders service notifies the shipping service)
as well as for data replication (data is replicated from the customer service to the
shipping service, where it can be queried locally).
This makes use of the other property events have—their replication hat. (Formally
this is termed event-carried state transfer, which is essentially a form of data
replication: the dataset is copied to the receiving service so it can be queried
locally.)
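One way to sketch this state transfer with Kafka Streams is a global table: each instance of the shipping service materializes the customers topic into a local store and enriches orders against it without a network call. The topic names, the string payloads, and extractCustomerId are assumptions for illustration.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class ShippingTopology {
  public static StreamsBuilder build() {
    StreamsBuilder builder = new StreamsBuilder();

    // Every instance keeps a full local copy of this dataset,
    // continuously updated from the customers topic.
    GlobalKTable<String, String> customers =
        builder.globalTable("customers", Consumed.with(Serdes.String(), Serdes.String()));

    KStream<String, String> orders =
        builder.stream("orders", Consumed.with(Serdes.String(), Serdes.String()));

    // Enrich each order with customer details without any remote lookup.
    orders.join(customers,
            (orderId, order) -> extractCustomerId(order),   // key to look up in the table
            (order, customer) -> order + " ship to: " + customer)
        .to("orders-for-dispatch", Produced.with(Serdes.String(), Serdes.String()));

    return builder;
  }

  static String extractCustomerId(String order) { return "customer-1"; /* parse from payload */ }
}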
The lack of any one point of central control means systems like these are often
termed choreographies: each service handles some subset of state transitions,
which, when put together, describe the whole business process. This can be con‐
trasted with orchestration, where a single process commands and controls the
whole workflow from one place—for example, via a process manager.5 A process
manager is implemented with request-response.
Choreographed systems have the advantage that they are pluggable. If the pay‐
ment service decides to create three new event types for the payment part of the
workflow, so long as the Payment Processed event remains, it can do so without
affecting any other service. This is useful because it means if you’re implement‐
ing a service, you can change the way you work and no other services need to
know or care about it. By contrast, in an orchestrated system, where a single ser‐
vice dictates the workflow, all changes need to be made in the controller. Which
of these approaches is best for you is quite dependent on use case, but the advan‐
tage of orchestration is that the whole workflow is written down, in code, in one
place. That makes it easy to reason about the system. The downside is that the
model is tightly coupled to the controller, so broadly speaking choreographed
approaches better suit larger implementations (particularly those that span teams
and hence change independently of one another).
Figure 5-8. Stateful stream processing is similar to using events for both notifica‐
tion and state transfer (left), while stateless stream processing is similar to using
events for notification (right)
Figure 5-9. A very simple event-driven services example: data is imported from a
legacy application via the Connect API; user-facing services provide REST APIs to
the UI; state changes are journaled to Kafka as events; at the bottom, business
processing is performed via Event Collaboration
In Figure 5-10 three departments communicate with one another only through
events. Inside each department (the three larger circles), service interfaces are
shared more freely and there are finer-grained event-driven flows that drive col‐
laboration. Each department contains a number of internal bounded contexts—
small groups of services that share a domain model, are usually deployed
together, and collaborate closely. In practice, there is often a hierarchy of sharing.
At the top of this hierarchy, departments are loosely coupled: the only thing they
share is events. Inside a department, there will be many applications and those
applications will interact with one another with both request-response and
event-based mechanisms, as in Figure 5-9. Each application may itself be com‐
posed from several services, but these will typically be more tightly coupled to
one another, sharing a domain model and having synchronized release sched‐
ules.
This approach, which confines reuse within a bounded context, is an idea that
comes from domain-driven design, or DDD. One of the big ideas in DDD was
that broad reuse could be counterproductive, and that a better approach was to
create boundaries around areas of a business domain and model them separately.
So within a bounded context the domain model is shared, and everything is
available to everything else, but different bounded contexts don’t share the same
model, and typically interact through more restricted interfaces.
Summary
Businesses are a collection of people, teams, and departments performing a wide
range of functions, backed by technology. Teams need to work asynchronously
with respect to one another to be efficient, and many business processes are
inherently asynchronous—for example, shipping a parcel from a warehouse to a
user’s door. So we might start a project as a website, where the frontend makes
synchronous calls to backend services, but as it grows the web of synchronous
calls tightly couples services together at runtime. Event-based methods reverse
this, decoupling systems in time and allowing them to evolve independently of
one another.
In this chapter we noticed that events, in fact, have two separate roles: one for
notification (a call for action), and the other a mechanism for state transfer
(pushing data wherever it is needed). Events make the system pluggable, and for
reasonably sized architectures it is sensible to blend request- and event-based
protocols, but you must take care when using these two sides of the event duality:
they lead to very different types of architecture. Finally, we looked at how to scale
the two approaches by separating out different bounded contexts that collaborate
only through events.
But with all this talk of events, we’ve talked little of replayable logs or stream pro‐
cessing. When we apply these patterns with Kafka, the toolset itself creates new
opportunities. Retention in the broker becomes a tool we can design for, allowing
us to embrace data on the outside with a central store of events that services can
refer back to. So the ops engineers, whom we discussed in the opening section of
this chapter, will still be playing detective, but hopefully not quite as often—and
at least now the story comes with a script!
6 Neil Ford, Rebecca Parsons, and Pat Kua, Building Evolutionary Architectures (Sebastopol, CA: O’Reilly,
2017).
Imperative styles of programming are some of the oldest of all, and their popu‐
larity persists for good reason. Procedures execute sequentially, spelling out a
story on the page and altering the program’s state as they do so.
As mainstream applications became distributed in the 1980s and 1990s, the same
mindset was applied to this distributed domain. Approaches like Corba and EJB
(Enterprise JavaBeans) raised the level of abstraction, making distributed pro‐
gramming more accessible. History has not always judged these so well. EJB,
while touted as a panacea of its time, fell quickly by the wayside as systems
creaked with the pains of tight coupling and the misguided notion that the net‐
work was something that should be abstracted away from the programmer.
In fairness, things have improved since then, with popular technologies like
gRPC and Finagle adding elements of asynchronicity to the request-driven style.
But the application of this mindset to the design of distributed systems isn’t nec‐
essarily the most productive or resilient route to take. Two styles of program‐
ming that better suit distributed design, particularly in a services context, are the
dataflow and functional styles.
You will have come across dataflow programming if you’ve used utilities like Sed
or languages like Awk. These are used primarily for text processing; for example,
a stream of lines might be pushed through a regex, one line at a time, with the
output piped to the next command, chaining through stdin and stdout. This style
of program is more like an assembly line, with each worker doing a specific task,
as the products make their way along a conveyor belt. Since each worker is con‐
cerned only with the availability of data inputs, they have no “hidden state” to
track. This is very similar to the way streaming systems work. Events accumulate
in a stream processor waiting for a condition to be met, say, a join operation
between two different streams. When the correct events are present, the join
operation completes and the pipeline continues to the next command. So Kafka
provides the equivalent of a pipe in Unix shell, and stream processors provide the
chained functions.
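The analogy maps quite directly onto code: in the sketch below each operator plays the part of a command in a shell pipeline, with Kafka topics acting as the pipes. The topic names and the transformation itself are arbitrary illustrations.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class PipelineTopology {
  public static StreamsBuilder build() {
    StreamsBuilder builder = new StreamsBuilder();

    // Roughly: cat raw-lines | grep ERROR | tr '[:lower:]' '[:upper:]' > error-lines
    builder.stream("raw-lines", Consumed.with(Serdes.String(), Serdes.String()))
        .filter((key, line) -> line.contains("ERROR"))
        .mapValues(line -> line.toUpperCase())
        .to("error-lines", Produced.with(Serdes.String(), Serdes.String()));

    return builder;
  }
}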
There is a similarly useful analogy with functional programming. As with the
dataflow style, state is not mutated in place, but rather evolves from function to
function, and this matches closely with the way stream processors operate. So
most of the benefits of both functional and dataflow languages also apply to
streaming systems. These can be broadly summarized as:
But streaming approaches also inherit some of the downsides. Purely functional
languages must negotiate an impedance mismatch when interacting with more
procedural or stateful elements like filesystems or the network. In a similar vein,
streaming systems must often translate to the request-response style of REST or
RPCs and back again. This has led some implementers to build systems around a
functional core, which processes events asynchronously, wrapped in an impera‐
tive shell, used to marshal to and from outward-facing request-response inter‐
faces. The “functional core, imperative shell” pattern keeps the key elements of
the system both flexible and scalable, encouraging services to avoid side effects
and express their business logic as simple functions chained together through the
log.
In the next section we’ll look more closely at why statefulness, in the context of
stream processing, matters.
Figure 6-1. A simple event-driven service that looks up the data it needs as it pro‐
cesses messages
So a single event stream is processed, and lookups that pull in any required data
are performed inline. The solution suffers from two problems:
This solves the two aforementioned issues with the event-driven approach. There
are no remote lookups, addressing the first point. It also no longer matters what
order events arrive in, addressing the second point.
The second point turns out to be particularly important. When you’re working
with asynchronous channels there is no easy way to ensure relative ordering
across several of them. So even if we know that the order is always created before
the payment, it may well be delayed, arriving the other way around.
Finally, note that this approach isn’t, strictly speaking, stateless. The buffer
actually makes the email service stateful, albeit just a little. When Kafka Streams
restarts, before it does any processing, it will reload the contents of each buffer.
This is important for achieving deterministic results. For example, the output of
a join operation is dependent on the contents of the opposing buffer when a mes‐
sage arrives.
This is of course a perfectly valid approach (in fact, many production systems do
exactly this), but a stateful stream processing system can make a further optimi‐
zation. It uses the same process of local buffering used to handle potential delays
in the orders and payments topics, but instead of buffering for just a few minutes,
it preloads the whole customer event stream from Kafka into the email service,
where it can be used to look up historic values (Figure 6-4).
Figure 6-4. A stateful streaming service that replicates the Customers topic into a
local table, held inside the Kafka Streams API
So now the email service is both buffering recent events and creating a
local lookup table. (The mechanics of this are discussed in more detail in Chap‐
ter 14.)
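In the Streams API, this might look something like the following sketch, assuming orders are keyed by customer ID and that the serdes, the EmailTuple type, and the emailer are defined elsewhere:
//Replicate the Customers topic into a table held locally by the email service
KTable<String, Customer> customers = builder.table("customers");
KStream<String, Order> orders = builder.stream("orders");

//The join buffers until both sides are present, then fires; no remote lookups
orders.join(customers, (order, customer) -> new EmailTuple(order, customer))
    .foreach((customerId, tuple) -> emailer.sendMail(tuple));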
• The service is now stateful, meaning for an instance of the email service to
operate it needs the relevant customer data to be present. This means, in the
worst case, loading the full dataset on startup.
as well as advantages:
This final point is particularly important for the increasingly data-centric sys‐
tems we build today. As an example, imagine we have a GUI that allows users to
browse order, payment, and customer information in a scrollable grid. The grid
lets the user scroll up and down through the items it displays.
In a traditional, stateless model, each row on the screen would require a call to all
three services. This would be sluggish in practice, so caching would likely be
added, along with some hand-crafted polling mechanism to keep the cache up to
date.
But in the streaming approach, data is constantly pushed into the UI
(Figure 6-5). So you might define a query for the data displayed in the grid,
something like select * from orders, payments, customers where…. The
API executes it over the incoming event streams, stores the result locally, and
keeps it up to date. So streaming behaves a bit like a declaratively defined cache.
Figure 6-5. Stateful stream processing is used to materialize data inside a web
server so that the UI can access it locally for better performance, in this case via a
scrollable grid
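One way to sketch this, using Kafka Streams interactive queries, is to materialize the result into a named state store inside the web server’s process and have the UI read from that store locally; the joinedOrders stream, the store name, and the OrderSummary type here are assumptions:
//Materialize the latest joined row per order into a local, queryable store
KTable<String, OrderSummary> grid = joinedOrders
    .groupByKey()
    .reduce((oldValue, newValue) -> newValue, Materialized.as("orders-grid-view"));

//The web layer reads the store in-process, with no remote calls per row
ReadOnlyKeyValueStore<String, OrderSummary> view =
    streams.store("orders-grid-view", QueryableStoreTypes.keyValueStore());
OrderSummary row = view.get(orderId);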
• It uses a technique called standby replicas, which ensure that for every table
or state store on one node, there is a replica kept up to date on another. So, if
any node fails, it will immediately fail over to its backup node without inter‐
rupting processing unduly.
• Disk checkpoints are created periodically so that, should a node fail and
restart, it can load its previous checkpoint, then top up, from the log, the few
messages it missed while it was offline.
• Finally, compacted topics are used to keep the dataset as small as possible.
This acts to reduce the load time for a complete rebuild should one be neces‐
sary.
Kafka Streams uses intermediary topics, which can be reset and rederived using
the Streams Reset tool.1
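Most of this behavior is driven by configuration rather than code; a sketch, with illustrative values, might look like:
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "email-service");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
//Keep a hot standby of each state store on another node for fast failover
props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
//Directory where local state and its checkpoint files are kept between restarts
props.put(StreamsConfig.STATE_DIR_CONFIG, "/var/lib/kafka-streams");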
Summary
This chapter covers three different ways of doing event-based processing: the
simple event-driven approach, where you process a single event stream one mes‐
sage at a time; the streaming approach, which joins different event streams
together; and finally, the stateful streaming approach, which turns streams into
tables and stores data in the log.
So instead of pushing the state problem down a layer into a database, stateful
stream processors, like Kafka’s Streams API, are proudly stateful. They make data
available wherever it is required. This increases performance and autonomy. No
remote calls needed!
CHAPTER 7
Event Sourcing, CQRS, and Other Stateful
Patterns
asynchronously, decoupling the two in time so the two parts can be optimized
independently.
Command Sourcing is essentially a variant of Event Sourcing but applied to
the events that come into a service, rather than to the events it creates.
That’s all a bit abstract, so let’s walk through the example in Figure 7-1. We’ll use
one similar to the one used in the previous chapter, where a user makes an online
purchase and the resulting order is validated and returned.
When a purchase is made (1), Command Sourcing dictates that the order request
be immediately stored as an event in the log, before anything happens (2). That
way, should anything go wrong, the service can be rewound and replayed—for
example, to recover from a corruption.
Next, the order is validated, and another event is stored in the log to reflect the
resulting change in state (3). In contrast to an update-in-place persistence model
like CRUD (create, read, update, delete), the validated order is represented as an
entirely new event, being appended to the log rather than overwriting the exist‐
ing order. This is the essence of Event Sourcing.
Finally, to query orders, a database is attached to the resulting event stream,
deriving an event-sourced view that represents the current state of orders in the
system (4). So (1) and (4) provide the Command and Query sides of CQRS.
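Steps (2) and (3) amount to little more than appending events; a bare-bones sketch, with hypothetical topic names and event types, might be:
//(2) Journal the command exactly as it arrived, before any processing happens
producer.send(new ProducerRecord<>("order-requests", order.getId(), orderRequested));

//...validation runs here...

//(3) Append the state change as a new, immutable event; nothing is overwritten
producer.send(new ProducerRecord<>("orders", order.getId(), orderValidated));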
These patterns have a number of benefits, which we will examine in detail in the
subsequent sections.
Figure 7-2. A programmatic bug can lead to data corruption, both in the service’s
own database as well as in data it exposes to other services
Recovering from this situation is tricky for a couple of reasons. First, the original
inputs to the system haven’t been recorded exactly, so we only have the corrup‐
ted version of the order. We will have to uncorrupt it manually. Second, unlike a
version control system, which can travel back in time, a database is mutated in
place, meaning the previous state of the system is lost forever. So there is no easy
way for this service to undo the damage the corruption did.
To fix this, the programmer would need to go through a series of steps: applying
a fix to the software, running a database script to fix the corrupted timestamps in
the database, and finally, working out some way of resending any corrupted data
previously sent to other services. At best this will involve some custom code that
pulls data out of the database, fixes it up, and makes new service calls to redis‐
tribute the corrected data. But because the database is lossy—as values are over‐
written—this may not be enough. (If rather than the release being fixed, it was
rolled back to a previous version after some time running as the new version, the
data migration process would likely be even more complex.)
Figure 7-3. Adding Kafka and an Event Sourcing approach to the system described
in Figure 7-2 ensures that the original events are preserved before the code, and
bug, execute
As Kafka can store events for as long as we need (as discussed in “Long-Term
Data Storage” on page 25 in Chapter 4), correcting the timestamp corruption is
now a relatively simple affair. First the bug is fixed, then the log is rewound to
before the bug was introduced, and the system is replayed from the stream of
order requests. The database is automatically overwritten with corrected time‐
stamps, and new events are published downstream, correcting the previous cor‐
rupted ones. This ability to store inputs, rewind, and replay makes the system far
better at recovering from corruptions and bugs.
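Mechanically, the rewind is just repositioning a consumer before replaying. A sketch, assuming we know roughly when the bad release shipped (beforeBadRelease below is that timestamp), might be:
//Find, for each partition, the offset corresponding to a time before the bad release
Map<TopicPartition, Long> timestamps = new HashMap<>();
for (TopicPartition partition : consumer.assignment())
    timestamps.put(partition, beforeBadRelease);

Map<TopicPartition, OffsetAndTimestamp> offsets = consumer.offsetsForTimes(timestamps);

//Rewind, then let the fixed service reprocess the order requests from that point
offsets.forEach((partition, offset) -> {
    if (offset != null) consumer.seek(partition, offset.offset());
});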
So Command Sourcing lets us record our inputs, which means the system can
always be rewound and replayed. Event Sourcing records our state changes,
which ensures we know exactly what happened during our system’s execution,
and we can always regenerate our current state (in this case the contents of the
database) from this log of state changes (Figure 7-4).
Being able to store an ordered journal of state changes is useful for debugging
and traceability purposes, too, answering retrospective questions like “Why did
this order mysteriously get rejected?” or “Why is this balance suddenly in the
red?”—questions that are harder to answer with mutable data storage.
It is also worth mentioning that there are other well-established database pat‐
terns that provide some of these properties. Staging tables can be used to hold
unvalidated inputs, triggers can be applied in many relational databases to create
audit tables, and Bitemporal databases also provide an auditable data structure.
These are all useful techniques, but none of them lends itself to “rewind and
replay” functionality without a significant amount of effort on the programmer’s
part. By contrast, with the Event Sourcing approach, there is little or no addi‐
tional code to write or test. The primary execution path is used both for runtime
execution as well as for recovery.
Figure 7-5. The order request is validated and a notification is sent to other serv‐
ices, but the service fails before the data is persisted to the database
The second problem is that, in practice, it’s quite easy for the data in the database
and the data in the notification to diverge as code churns and features are imple‐
mented. The implication here is that, while the database may well be correct, if
the notifications don’t quite match, then the data quality of the system as a whole
suffers. (See “The Data Divergence Problem” on page 95 in Chapter 10.)
Event Sourcing addresses both of these problems by making the event stream the
primary source of truth (Figure 7-6). Where data needs to be queried, a read
model or event-sourced view is derived directly from the stream.
Event Sourcing ensures that the state a service communicates and the
state a service saves internally are the same.
This actually makes a lot of sense. In a traditional system the database is the
source of truth. This is sensible from an internal perspective. But if you consider
it from the point of view of other services, they don’t care what is stored inter‐
nally; it’s the data everyone else sees that is important. So the event being the
source of truth makes a lot of sense from their perspective. This leads us to
CQRS.
Figure 7-6. When we make the event stream the source of truth, the notification
and the database update come from the same event, which is stored immutably
and durably; when we split the read and write model, the system is an implementa‐
tion of the CQRS design pattern
2 Some databases—for example, DRUID—make this separation quite concrete. Other databases block
until indexes have been updated.
Materialized Views
There is a close link between the query side of CQRS and a materialized view in a
relational database. A materialized view is a table that contains the results of
some predefined query, with the view being updated every time any of the under‐
lying tables change.
Materialized views are used as a performance optimization so, instead of a query
being computed when a user needs data, the query is precomputed and stored.
For example, if we wanted to display how many active users there are on each
page of a website, this might involve us scanning a database table of user visits,
which would be relatively expensive to compute. But if we were to precompute
the query, the summary of active users that results will be comparatively small
and hence fast to retrieve. Thus, it is a good candidate to be precomputed.
We can create exactly the same construct with CQRS using Kafka. Writes go into
Kafka on the command side (rather than updating a database table directly). We
can transform the event stream in a way that suits our use case, typically using
Kafka Streams or KSQL, then materialize it as a precomputed query or material‐
ized view. As Kafka is publish-subscribe, we can have many such views, precom‐
puted to match the various use cases we have (Figure 7-7). But unlike with
materialized views in a relational database, the underlying events are decoupled
from the view. This means (a) they can be scaled independently, and (b) the writ‐
ing process (so whatever process records user visits) doesn’t have to wait for the
view to be computed before it returns.
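For the active-users example, the precomputed query might be expressed as a small Kafka Streams topology; the topic names, the Visit type, and its serde below are assumptions:
//Continuously maintain a count of visits per page, materialized as a view
builder.stream("user-visits", Consumed.with(Serdes.String(), visitSerde))
    .groupBy((userId, visit) -> visit.getPage())
    .count(Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("active-users-per-page"))
    .toStream()
    .to("active-users-view", Produced.with(Serdes.String(), Serdes.Long()));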
This idea of storing data in a log and creating many derived views is taken fur‐
ther when we discuss “Event Streams as a Shared Source of Truth” in Chapter 9.
If an event stream is the source of truth, you can have as many different
views in as many different shapes, sizes, or technologies as you may
need. Each is focused on the use case at hand.
Polyglot Views
Whatever sized data problem you have, be it free-text search, analytic aggrega‐
tion, fast key/value lookups, or a host of others, there is a database available
today that is just right for your use case. But this also means there is no “one-
size-fits-all” approach to databases, at least not anymore. A supplementary bene‐
fit of using CQRS is that a single write model can push data into many read
models or materialized views. So your read model can be in any database, or even
a range of different databases.
A replayable log makes it easy to bootstrap such polyglot views from the same
data, each tuned to different use cases (Figure 7-7). A common example of this is
to use a fast key/value store to service queries from a website, but then use a
search engine like Elasticsearch or Solr to support the free-text-search use case.
Whole Fact or Delta?
One question that arises in event-driven—particularly event-sourced—programs,
is whether the events should be modeled as whole facts (a whole order, in its
entirety) or as deltas that must be recombined (first a whole order message, fol‐
lowed by messages denoting only what changed: “amount updated to $5,” “Order
cancelled,” etc.).
As an analogy, imagine you are building a version control system like SVN or
Git. When a user commits a file for the first time, the system saves the whole file
to disk. Subsequent commits, reflecting changes to that file, might save only the
delta—that is, just the lines that were added, changed, or removed. Then, when
the user checks out a certain version, the system opens the version-0 file and
applies all subsequent deltas, in order, to derive the version the user asked for.
The alternate approach is to simply store the whole file, exactly as it was at the
time it was changed, for every single commit. This will obviously take more stor‐
age, but it means that checking out a specific version from the history is a quick
and easy file retrieval. However, if the user wanted to compare different versions,
the system would have to use a “diff” function.
These two approaches apply equally to data we keep in the log. So to take a more
business-oriented example, an order is typically a set of line items (i.e., you often
order several different items in a single purchase). When implementing a system
that processes purchases, you might wonder: should the order be modeled as a
single order event with all the line items inside it, or should each line item be a
separate event with the order being recomposed by scanning the various inde‐
pendent line items? In domain-driven design, an order of this latter type is
termed an aggregate (as it is an aggregate of line items) with the wrapping entity
—that is, the order—being termed an aggregate root.
As with many things in software design, there are a host of different opinions on
which approach is best for a certain use case. There are a few rules of thumb that
can help, though. The most important one is journal the whole fact as it arrived.
So when a user creates an order, if that order turns up with all line items inside it,
we’d typically record it as a single entity.
But what happens when a user cancels a single line item? The simple solution is
to just journal the whole thing again, as another aggregate but cancelled. But
what if for some reason the order is not available, and all we get is a single can‐
celed line item? Then there would be the temptation to look up the original order
internally (say from a database), and combine it with the cancellation to create a
new Cancelled Order with all its line items embedded inside it. This typically
isn’t a good idea, because (a) we’re not recording exactly what we received, and
(b) having to look up the order in the database erodes the performance benefits
In Chapter 15 we walk through a set of richer code examples that create different
types of views using tables and state stores, along with discussing how this
approach can be scaled.
The most reliable and efficient way to achieve this is using a technique called
change data capture (CDC). Most databases write every modification operation
to a write-ahead log, so, should the database encounter an error, it can recover its
state from there. Many also provide some mechanism for capturing modification
operations that were committed. Connectors that implement CDC repurpose
these, translating database operations into events that are exposed in a messaging
system like Kafka. Because CDC makes use of a native “eventing” interface it is
(a) very efficient, as the connector is monitoring a file or being triggered directly
when changes occur, rather than issuing queries through the database’s main
API, and (b) very accurate, as issuing queries through the database’s main API
will often create an opportunity for operations to be missed if several arrive, for
the same row, within a polling period.
In the Kafka ecosystem CDC isn’t available for every database, but the ecosystem
is growing. Some popular databases with CDC support in Kafka Connect are
MySQL, Postgres, MongoDB, and Cassandra. There are also proprietary CDC
connectors for Oracle, IBM, SQL Server, and more. The full list of connectors is
available on the Connect home page.
The advantage of this database-fronted approach is that it provides a consistency
point: you write through it into Kafka, meaning you can always read your own
writes.
We discuss this use of state stores for holding application-level state in the sec‐
tion “Windows, Joins, Tables, and State Stores” on page 135 in Chapter 14.
Figure 7-10. Applying the write-through pattern with Kafka Streams and a state
store
So part of our legacy system might allow admins to manage and update the prod‐
uct catalog. We might retain this functionality by importing the dataset into
Kafka from the legacy system’s database. Then that product catalog can be reused
in the validation service, or any other.
An issue with attaching to legacy, or any externally sourced dataset, is that the
data is not always well formed. If this is a problem, consider adding a post-
processing stage. Kafka Connect’s single message transforms are useful for this
type of operation (for example, adding simple adjustments or enrichments),
while Kafka’s Streams API is ideal for simple to very complex manipulations and
for precomputing views that other services need.
MemoryImages provide a simple and efficient model for datasets that (a) fit in
memory and (b) can be loaded in a reasonable amount of time. To reduce the
load time issue, it’s common to keep a snapshot of the event log using a compac‐
ted topic (which represents the latest set of events, without any of the version his‐
tory). The MemoryImage pattern can be hand-crafted, or it can be implemented
with Kafka Streams using in-memory state stores. The pattern suits high-
performance use cases that don’t need to overflow to disk.
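With Kafka Streams, a MemoryImage is little more than a table backed by an in-memory store; in this sketch the topic name, Product type, and serde are assumptions:
//Load the compacted product-catalog topic into an in-memory table,
//rebuilt from the log whenever the service starts
KTable<String, Product> products = builder.table(
    "product-catalog",
    Materialized.<String, Product>as(Stores.inMemoryKeyValueStore("product-image"))
        .withKeySerde(Serdes.String())
        .withValueSerde(productSerde));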
What differentiates an event-sourced view from a typical database, cache, and the
like is that, while it can represent data in any form the user requires, its data is
sourced directly from the log and can be regenerated at any time.
For example, we might create a view of orders, payments, and customer informa‐
tion, filtering anything that doesn’t ship within the US. This would be an event-
sourced view if, when we change the view definition—say to include orders that
ship to Canada—we can automatically recreate the view in its entirety from the
log.
An event-sourced view is equivalent to a projection in Event Sourcing parlance.
Summary
In this chapter we looked at how an event can be more than just a mechanism for
notification, or state transfer. Event Sourcing is about saving state using the exact
same medium we use to communicate it, in a way that ensures that every change
is recorded immutably. As we noted in the section “Version Control for Your
Data” on page 57, this makes recovery from failure or corruption simpler and
more efficient when compared to traditional methods of application design.
CQRS goes a step further by turning these raw events into an event-sourced view
—a queryable endpoint that derives itself (and can be rederived) directly from
the log. The importance of CQRS is that it scales, by optimizing read and write
models independently of one another.
We then looked at various patterns for getting events into the log, as well as
building views using Kafka Streams and Kafka’s Connect interface and our data‐
base of choice.
Ultimately, from the perspective of an architect or programmer, switching to this
event-sourced approach will have a significant effect on an application’s design.
Event Sourcing and CQRS make events first-class citizens. This allows systems to
relate the data that lives inside a service directly to the data it shares with others.
Later we’ll see how we can tie these together with Kafka’s Transactions API. We
will also extend the ideas introduced in this chapter by applying them to inter‐
team contexts, with the “Event Streams as a Shared Source of Truth” approach
discussed in Chapter 9.
If you squint a bit, you can see the whole of your organization’s systems and data
flows as a single distributed database.
—Jay Kreps, 2013
CHAPTER 8
Sharing Data and Services Across an
Organization
When we build software, our main focus is, quite rightly, aimed at solving some
real-world problem. It might be a new web page, a report of sales figures, an
analytics program searching for fraudulent behavior, or an almost infinite set of
options that provide clear and physical benefits to our users. These are all very
tangible goals—goals that serve our business today.
But when we build software we also consider the future—not by staring into a
crystal ball in some vain attempt to predict what our company will need next
year, but rather by facing up to the fact that whatever does happen, our software
will need to change. We do this without really thinking about it. We carefully
modularize our code so it is comprehensible and reusable. We write tests, run
continuous integration, and maybe even do continuous deployment. These
things take effort, yet they bear little resemblance to anything a user might ask
for directly. We do these things because they make our code last, and that doesn’t
mean sitting on some filesystem the far side of git push. It means providing for
a codebase that is changed, evolved, refactored, and repurposed. Aging in soft‐
ware isn’t a function of time; it is a function of how we choose to change it.
But when we design systems, we are less likely to think about how they will age.
We are far more likely to ask questions like: Will the system scale as our user base
increases? Will response times be fast enough to keep users happy? Will it pro‐
mote reuse? In fact, you might even wonder what a system designed to last a long
time looks like.
If we look to history to answer this question, it would point us to mainframe
applications for payroll, big-name client programs like Excel or Safari, or even
operating systems like Windows or Linux. But these are all complex, individual
programs that have been hugely valuable to society. They have also all been diffi‐
cult to evolve, particularly with regard to organizing a large engineering effort
around a single codebase. So if it’s hard to build large but individual software
programs, how do we build the software that runs a company? This is the ques‐
tion we address in this particular section: how do we design systems that age well
at company scales and keep our businesses nimble?
As it happens, many companies sensibly start their lives with a single system,
which becomes monolithic as it slowly turns into the proverbial big ball of mud.
The most common response to this today is to break the monolith into a range of
different applications and services. In Chapter 1 we talked about companies like
Amazon, LinkedIn, and Netflix, which take a service-based approach to this. This
is no panacea; in fact, many implementations of the microservices pattern suffer
from the misconceived notion that modularizing software over the network will
somehow improve its sustainability. This of course isn’t what microservices are
really about. But regardless of your interpretation, breaking a monolith, alone,
will do little to improve sustainability. There is a very good reason for this too.
When we design systems at company scales, those systems become far more
about people than they are about software.
As a company grows it forms into teams, and those teams have different respon‐
sibilities and need to be able to make progress without extensive interaction with
one another. The larger the company, the more of this autonomy they need. This
is the basis of management theories like Slack.1
In stark contrast to this, total independence won’t work either. Different teams
or departments need some level of interaction, or at least a shared sense of pur‐
pose. In fact, dividing sociological groups is a tactic deployed in both politics and
war as a mechanism for reducing the capabilities of an opponent. The point here
is that a balance must be struck, organizationally, in terms of the way people,
responsibility, and communication structures are arranged in a company, and
this applies as acutely to software as it does to people, because beyond the con‐
fines of a single application, people factors invariably dominate.
Some companies tackle this at an organizational level using approaches like the
Inverse Conway Maneuver, which applies the idea that, if the shape of software
and the shape of organizations are intrinsically linked (as Conway argued), then
it’s often easier to change the organization and let the software follow suit than it
is to do the reverse. But regardless of the approach taken, when we design soft‐
ware systems where components are operated and evolved independently, the
problem we face has three distinct parts—organization, software, and data—
which are all intrinsically linked. To complicate matters further, what really dif‐
1 Tom DeMarco, Slack: Getting Past Burnout, Busywork, and the Myth of Total Efficiency (New York:
Broadway Books, 2001).
Reuse can be a bad thing. Reuse lets us develop software quickly and
succinctly, but the more we reuse a component, the more dependencies
that component has, and the harder it is to change.
Figure 8-1. An SSO service provides a good example of encapsulation and reuse
The problem is that, in the real world, business services can’t typically retain the
same clean separation of concerns, meaning new requirements will inevitably
crosscut service boundaries and several services will need to change at once. This
can be measured.3 So if one team needs to implement a feature, and that requires
another team to make a code change, we end up having to make changes to both
services at around the same time. In a monolithic system this is pretty straight‐
forward—you make the change and then do a release—but it’s considerably more
painful where independent services must synchronize. The coordination between
teams and release cycles erodes agility.
This problem isn’t actually restricted to services. Shared libraries suffer from the
same problem. If you work in retail, it might seem sensible to create a library that
models how customers, orders, payments, and the like all relate to one another.
You could include common logic for standard operations like returns and
refunds. Lots of people did this in the early days of object orientation, but it
turned out to be quite painful because suddenly the most sensitive part of your
system was coupled to many different programs, making it really fiddly to
change and release. This is why microservices typically don’t share a single
domain model. But some library reuse is of course OK. Take a logging library,
for example—much like the earlier SSO example, you’re unlikely to have a busi‐
ness requirement that needs the logging library to change.
Data sits at the very heart of this problem: most business services inevitably rely
heavily on one another’s data. If you’re an online retailer, the stream of orders,
the product catalog, or the customer information will find its way into the
requirements of many of your services. Each of these services needs broad access
to these datasets to do its work, and there is no temporary workaround for not
having the data you need. So you need access to shared datasets, but you also
want to stay loosely coupled. This turns out to be a pretty hard bargain to strike.
Now creating something that looks like a kooky, shared database can lead to a set
of issues of its own. The more functionality, data, and users data services have,
the more tightly coupled they become and the harder (and more expensive) they
are to operate and evolve.
But to extract data from some service, then keep that data up to date, you need
some kind of polling mechanism. While this is not altogether terrible, it isn’t
ideal either.
What’s more, as this happens again and again in larger architectures, with data
being extracted and moved from service to service, little errors or idiosyncrasies
often creep in. Over time these typically worsen and the data quality of the whole
ecosystem starts to suffer. The more mutable copies, the more data will diverge
over time.
Making matters worse, divergent datasets are very hard to fix in retrospect.
(Techniques like master data management are in many ways a Band-aid over
this.) In fact, some of the most intractable technology problems that businesses
encounter arise from divergent datasets proliferating from application to applica‐
tion. This issue is discussed in more detail in Chapter 10.
So a cyclical pattern of behavior emerges between (a) the drive to centralize data‐
sets to keep them accurate and (b) the temptation (or need) to extract datasets
and go it alone—an endless cycle of data inadequacy (Figure 8-5).
Figure 8-6. Tradeoff between service interfaces, messaging, and a shared database
A better solution is to use a replayable log like Kafka. This works like a kind of
event store: part messaging system, part database.
Summary
Patterns like microservices are opinionated when it comes to services being inde‐
pendent: services are run by different teams, have different deployment cycles,
don’t share code, and don’t share databases. The problem is that replacing this
with a web of RPC calls fails to address the question: how do services get access
to these islands of data for anything beyond trivial lookups?
The data dichotomy highlights this question, underlining the tension between
the need for services to stay decoupled and their need to control, enrich, and
combine data in their own time.
This leads to three core conclusions: (1) as architectures grow and systems
become more data-centric, moving datasets from service to service becomes an
inevitable part of how systems evolve; (2) data on the outside—the data services
share—becomes an important entity in its own right; (3) sharing a database is
not a sensible solution to data on the outside, but sharing a replayable log better
balances these concerns, as it can hold datasets long-term, and it facilitates event-
driven programming, reacting to the now.
This approach can keep data across many services in sync, through a loosely cou‐
pled interface, giving them the freedom to slice, dice, enrich, and evolve data
locally.
4 Neil Ford, Rebecca Parsons, and Pat Kua, Building Evolutionary Architectures (Sebastopol, CA: O’Reilly,
2017).
CHAPTER 9
Event Streams as a Shared
Source of Truth
As we saw in Part II of this book, events are a useful tool for system design, pro‐
viding notification, state transfer, and decoupling. For a couple of decades now,
messaging systems have leveraged these properties, moving events from system
to system, but only in the last few years have messaging systems started to be
used as a storage layer, retaining the datasets that flow through them. This cre‐
ates an interesting architectural pattern. A company’s core datasets are stored as
centralized event streams, with all the decoupling effects of a message broker
built in. But unlike traditional brokers, which delete messages once they have
been read, historic data is stored and made available to any team that needs it.
This links closely with the ideas developed in Event Sourcing (see Chapter 7) and
Pat Helland’s concept of data on the outside. ThoughtWorks calls this pattern
event streaming as the source of truth.
Figure 9-1. A streaming engine and a replayable log have the core components of a
database
“The database inside out” is an analogy for stream processing where the
same components we find in a database—a commit log, views, indexes,
caches—are not confined to a single place, but instead can be made
available wherever they are needed.
This idea actually comes up in a number of other areas too. The Clojure commu‐
nity talks about deconstructing the database. There are overlaps with Event
Sourcing and polyglot persistence as we discussed in Chapter 7. But the idea was
• The log makes data available centrally as a shared source of truth but with
the simplest possible contract. This keeps applications loosely coupled.
• Query functionality is not shared; it is private to each service, allowing teams
to move quickly by retaining control of the datasets they use.
Figure 9-2. A number of different applications and services, each with its own
views, derived from the company’s core datasets held in Kafka; the views can be
optimized for each use case
As we’ll see in Chapter 10, this leads to some further optimizations where each
view is optimized to target a specific use case, in much the same way that materi‐
alized views are used in relational databases to create read-optimized, use-case-
focused datasets. Of course, unlike in a relational database, the view is decoupled
from the underlying data and can be regenerated from the log should it need to
be changed. (See “Materialized Views” on page 62 in Chapter 7.)
Summary
This chapter introduced the analogy that stream processing can be viewed as a
database turned inside out, or unbundled. In this analogy, responsibility for data
storage (the log) is segregated from the mechanism used to query it (the Stream
Processing API). This makes it possible to create views and embed them exactly
where they are needed—in another application, in another geography, or on
another platform. There are two main drivers for pushing data to code in this
way:
Lean data is a simple idea: rather than collecting and curating large datasets,
applications carefully select small, lean ones—just the data they need at a point in
time—which are pushed from a central event store into caches, or stores they
control. The resulting lightweight views are propped up by operational processes
that make rederiving those views practical.
and the data in the database will have been subject to many operational fixes over
its lifetime. So it is unsurprising that errors and inaccuracies creep in.
In stream processing, files aren’t copied around in this way. If a stream processor
creates a view, then does a release that changes the shape of that view, it typically
throws the original view away, resets to offset 0, and derives a new one from the
log.
Looking to other areas of our industry—DevOps and friends—we see similar
patterns. There was a time when system administrators would individually
tweak, tune, and mutate the computers they managed. Those computers would
end up being subtly different from one another, and when things went wrong it
was often hard to work out why.
Today, issues like these have been largely solved within as-a-service cultures that
favor immutability through infrastructure as code. This approach comes with
some clear benefits: deployments become deterministic, builds are identical, and
rebuilds are easy. Suddenly ops engineers transform into happy people empow‐
ered by the predictability of the infrastructure they wield, and comforted by the
certainty that their software will do exactly what it did in test.
Streaming encourages a similar approach, but for data. Event-sourced views are
kept lean and can be rederived from the log in a deterministic way. The view
could be a cache, a Kafka Streams state store, or a full-blown database. But for
this to work, we need to deal with a problem. Loading data can be quite slow. In
the next section we look at ways to keep this manageable.
Figure 10-1. If the messaging system can store data, then the views or databases it
feeds don’t have to
Kafka Streams
If you create event-sourced views with Kafka Streams, view regeneration is par
for the course. Views are either tables, which are a direct materialization of a
Kafka topic, or state stores, which are populated with the result of some declara‐
tive transformation, defined in JVM code or KSQL. Both of these are automati‐
cally rebuilt if the disk within the service is lost (or removed) or if the Streams
Reset tool is invoked.1 We discussed how Kafka Streams manages statefulness in
2 As a yardstick, RocksDB (which Kafka Streams uses) will bulk-load ~10M 500-byte objects per minute
(roughly GbE speed). Badger will do ~4M × 1K objects a minute per SSD. Postgres will bulk-load ~1M
rows per minute.
Figure 10-2. Data is replicated to a UAT environment where views are regenerated
from source
A good example of this is when schemas change. If you have used traditional
messaging approaches before to integrate data from one system into another, you
may have encountered a time when the message schema changed in a non-
backward-compatible way. For example, if you were importing customer infor‐
mation from a messaging system into a database when the customer message
schema undergoes a breaking change, you would typically craft a database script
to migrate the data forward, then subscribe to the new topic of messages.
If using Kafka to hold datasets in full,3 instead of performing a schema migration,
you can simply drop and regenerate the view.
3 We assume the datasets have been migrated forward using a technique like dual-schema upgrade win‐
dow, discussed in “Handling Schema Change and Breaking Backward Compatibility” on page 124 in
Chapter 13.
Summary
So when it comes to data, we should be unequivocal about the shared facts of our
system. They are the very essence of our business, after all. Lean practices
encourage us to stay close to these shared facts, a process where we manufacture
PART IV
Consistency, Concurrency, and
Evolution
The term consistency is quite overused in our industry, with several different
meanings applied in a range of contexts. Consistency in CAP theorem differs
from consistency in ACID transactions, and there is a whole spectrum of subtly
different guarantees, including strong consistency and eventual consistency,
among others. The lack of consistent terminology around this word may seem a
little ironic, but it is really a reflection of the complexity of a subject that goes
way beyond the scope of this book.1
But despite these many subtleties, most people have an intuitive notion of what
consistency is, one often formed from writing single-threaded programs2 or mak‐
ing use of a database. This typically equates to some general notions about the
transactional guarantees a database provides. When you write a record, it stays
written. When you read a record, you read the most recently written value. If you
perform multiple operations in a transaction, they all become visible at once, and
you don’t need to be concerned with what other users may be doing at the same
1 For a full treatment, see Martin Kleppmann’s encyclopedic Designing Data-Intensive Applications
(Sebastopol, CA: O’Reilly, 2017).
2 The various consistency models really reflect optimizations on the concept of in-order execution against
a single copy of data. These optimizations are necessary in practice, and most users would prefer to
trade a slightly weaker guarantee for the better performance (or availability) characteristics that typically
come with them. So implementers come up with different ways to slacken the simple “in-order execu‐
tion” guarantee. These various optimizations lead to different consistency models, and because there are
many dimensions to optimize, particularly in distributed systems, there are many resulting models.
When we discuss “Scaling Concurrent Operations in Streaming Systems” on page 142 in Chapter 15, we’ll
see how streaming systems achieve strong guarantees by partitioning relevant data into different stream
threads, then wrapping those operations in a transaction, which ensures that, for that operation, we
have in-order execution on a single copy of data (i.e., a strong consistency model).
time. We might call this idea intuitive consistency (which is closest in technical
terms to the famous ACID properties).
A common approach to building business systems is to take this intuitive notion
and apply it directly. If you build traditional three-tier applications (i.e., client,
server, and database), this is often what you would do. The database manages
concurrent changes, isolated from other users, and everyone observes exactly the
same view at any one point in time. But groups of services generally don’t have
such strong guarantees. A set of microservices might call one another synchro‐
nously, but as data moves from service to service it will often become visible to
users at different times, unless all services coordinate around a single database
and force the use of a single global consistency model.
But in a world where applications are distributed (across geographies, devices,
etc.), it isn’t always desirable to have a single, global consistency model. If you
create data on a mobile device, it can only be consistent with data on a backend
server if the two are connected. When disconnected, they will be, by definition,
inconsistent (at least in that moment) and will synchronize at some later point,
eventually becoming consistent. But designing systems that handle periods of
inconsistency is important. For a mobile device, being able to function offline is a
desirable feature, as is resynchronizing with the backend server when it recon‐
nects, converging to consistency as it does so. But the usefulness of this mode of
operation depends on the specific work that needs to be done. A mobile shop‐
ping application might let you select your weekly groceries while you’re offline,
but it can’t work out whether those items will be available, or let you physically
buy anything until you come back online again. So these are use cases where
global strong consistency is undesirable.
Business systems often don’t need to work offline in this way, but there are still
benefits to avoiding global strong consistency and distributed transactions: they
are difficult and expensive to scale, don’t work well across geographies, and are
often relatively slow. In fact, experience with distributed transactions that span
different systems, using techniques like XA, led the majority of implementers to
design around the need for such expensive coordination points.
But on the other hand, business systems typically want strong consistency to
reduce the potential for errors, which is why there are vocal proponents who
consider stronger safety properties valuable. There is also an argument for want‐
ing a bit of both worlds. This middle ground is where event-driven systems sit,
often with some form of eventual consistency.
Eventual Consistency
The previous section refers to an intuitive notion of consistency: the idea that
business operations execute sequentially on a single copy of data. It’s quite easy
Figure 11-1. A set of services that notify one another and share data via a database
Timeliness
Consider the email service (4) and orders view (5). Both subscribe to the same
event stream (Validated Orders) and process its events concurrently. Executing con‐
currently means one will lag slightly behind the other. Of course, if we stopped
writes to the system, then both the orders view and the email service would even‐
tually converge on the same state, but in normal operation they will be at slightly
different positions in the event stream. So they lack timeliness with respect to one
another. This could cause an issue for a user, as there is an indirect connection
between the email service and the orders service. If the user clicks the link in the
confirmation email, but the view in the orders service is lagging, the link would
either fail or return an incorrect state (7).
So a lack of timeliness (i.e., lag) can cause problems if services are linked in some
way, but in larger ecosystems it is beneficial for the services to be decoupled, as it
allows them to do their work concurrently and in isolation, and the issues of
timeliness can usually be managed (this relates closely to the discussion around
CQRS in Chapter 7).
But what if this behavior is unacceptable, as this email example demonstrates?
Well, we can always add serial execution back in. The call to the orders service
might block until the view is updated (this is the approach taken in the worked
When the single writer principle is applied in conjunction with Event Collabora‐
tion (discussed in Chapter 5), each writer evolves part of a single business work‐
flow through a set of successive events. So in Figure 11-3, which shows a larger
online retail workflow, several services collaborate around the order process as
an order moves from inception, through payment processing and shipping, to
Figure 11-3. Here each circle represents an event; the color of the circle designates
the topic it is in; a workflow evolves from Order Requested through to Order Com‐
pleted; on the way, four services perform different state transitions in topics they
are “single writer” to; the overall workflow spans them all
So, instead of sharing a global consistency model (e.g., via a database), we use the
single writer principle to create local points of consistency that are connected via
the event stream. There are a couple of variants on this pattern, which we will
discuss in the next two sections.
As we’ll see in “Scaling Concurrent Operations in Streaming Systems” on page 142
in Chapter 15, single writers can be scaled out linearly through partitioning, if we
use Kafka’s Streams API.
Command Topic
A common variant on this pattern uses two topics per entity, often named Com‐
mand and Entity. This is logically identical to the base pattern, but the Com‐
mand topic can be written to by any process and is used only for the initiating
event. The Entity topic can be written to only by the owning service: the single
writer. Splitting these two allows administrators to enforce the single writer prin‐
ciple strictly by configuring topic permissions. So, for example, we might break
order events into two topics, shown in Table 11-1.
Table 11-2. The order service and payment services both write to the
orders topic, but each service is responsible for a different state transition
Service               Orders service                   Payment service
Topic                 OrdersTopic                      OrdersTopic
Writable transition   OrderRequested->OrderValidated   OrderValidated->PaymentReceived
                                                       PaymentReceived->OrderConfirmed
• Messages sent to different topics, within a transaction, will either all be writ‐
ten or none at all.
• Messages sent to a single topic, in a transaction, will never be subject to
duplicates, even on failure.
Summary
In this chapter we looked at why global consistency can be problematic and why
eventual consistency can be useful. We adapted eventual consistency with the
single writer principle, keeping its lack of timeliness but avoiding collisions.
Finally, we looked at implementing identity and concurrency control in event-
driven systems.
Kafka ships with built-in transactions, in much the same way that most relational
databases do. The implementation is quite different, as we will see, but the goal is
similar: to ensure that our programs create predictable and repeatable results,
even when things fail.
Transactions do three important things in a services context:
In this chapter we delve into transactions, looking at the problems they solve,
how we should make use of them, and how they actually work under the covers.
One issue with this is that retries can result in duplicate processing, and this can
cause very real problems. Taking a payment twice from someone’s account will
lead to an incorrect balance (Figure 12-1). Adding duplicate tweets to a user’s
feed will lead to a poor user experience. The list goes on.
Figure 12-1. The UI makes a call to the payment service, which calls an external
payment provider; the payment service fails before returning to the UI; as the UI
did not get a response, it eventually times out and retries the call; the user’s account
could be debited twice
If you are using the Kafka Streams API, no extra code is required. You simply
enable the feature.
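Enabling it is a single configuration switch (the exact constant varies slightly between Kafka versions); something along these lines:
//Turn on transactional, exactly-once processing for a Kafka Streams application
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
//Transactions are batched per commit; tune the interval to trade overhead for latency
props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 100);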
Figure 12-4. Message brokers provide two opportunities for failure—one when
sending to the broker, and one when reading from it
1 In practice a clever optimization is used to move buffering from the consumer to the broker, reducing
memory pressure. Begin markers are also optimized out.
To ensure each transaction is atomic, sending the Commit markers involves the
use of a transaction coordinator. There will be many of these spread throughout
the cluster, so there is no single point of failure, but each transaction uses just
one.
The transaction coordinator is the ultimate arbiter that marks a transaction com‐
mitted atomically, and maintains a transaction log to back this up (this step
implements two-phase commit).
For those that worry about performance, there is of course an overhead that
comes with this feature, and if you were required to commit after every message,
the performance degradation would be noticeable. But in practice there is no
need for that, as the overhead is dispersed among whole batches of messages,
allowing us to balance transactional overhead with worst-case latency. For exam‐
ple, batches that commit every 100 ms, with a 1 KB message size, have a 3% over‐
head when compared to in-order, at-least-once delivery. You can test this out
yourself with the performance test scripts that ship with Kafka.
In reality, there are many subtle details to this implementation, particularly
around recovering from failure, fencing zombie processes, and correctly allocat‐
ing IDs, but what we have covered here is enough to provide a high-level under‐
standing of how this feature works. For a comprehensive explanation of how
transactions work, see the post “Transactions in Apache Kafka” by Apurva
Mehta and Jason Gustafson.
The database used by Kafka Streams is a state store. Because state stores
are backed by Kafka topics, transactions let us tie messages we send and
state we save in state stores together, atomically.
Imagine we extend the previous example so our validation service keeps track of
the balance as money is deposited. So if the balance is currently $50, and we
deposit $5 more, then the balance should go to $55. We record that $5 was
deposited, but we also store this current balance, $55, by writing it to a state store
(or directly to a compacted topic). See Figure 12-6.
Figure 12-6. Three messages are sent atomically: a deposit, a balance update, and
the acknowledgment
If transactions are enabled in Kafka Streams, all these operations will be wrapped
in a transaction automatically, ensuring the balance will always be atomically in
sync with deposits. You can achieve the same process with the producer and con‐
sumer by wrapping the calls manually in your code, and the current account bal‐
ance can be reread on startup.
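Done by hand with the producer and consumer, the wrapping looks roughly like the sketch below; the topic names, account fields, and offset bookkeeping are simplified, and the producer is assumed to be configured with a transactional.id.
producer.initTransactions();
try {
    producer.beginTransaction();
    //The deposit event and the new balance are written atomically: both or neither
    producer.send(new ProducerRecord<>("deposits", accountId, deposit));
    producer.send(new ProducerRecord<>("balances", accountId, newBalance));
    //Commit the consumer's position as part of the same transaction
    producer.sendOffsetsToTransaction(consumedOffsets, consumerGroupId);
    producer.commitTransaction();
} catch (KafkaException e) {
    producer.abortTransaction();
}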
What’s powerful about this example is that it blends concepts of both messaging
and state management. We listen to events, act, and create new events, but we
Summary
Transactions affect the way we build services in a number of specific ways:
• They take idempotence right off the table for services interconnected with
Kafka. So when we build services that follow the pattern “read, process,
(save), send,” we don’t need to worry about deduplicating inputs or con‐
structing keys for outputs.
• We no longer need to worry about ensuring there are appropriate unique
keys on the messages we send. This typically applies less to topics containing
business events, which often have good keys already. But it’s useful when
we’re managing derivative/intermediary data—for example, when we’re
remapping events, creating aggregate events, or using the Streams API.
So, to put it simply, when you are building event-based systems, Kafka’s transac‐
tions free you from the worries of failure and retries in a distributed world—wor‐
ries that really should be a concern of the infrastructure, not of your code. This
raises the level of abstraction, making it easier to get accurate, repeatable results
from large estates of fine-grained services.
Having said all that, we should also be careful. Transactions remove just one of
the issues that come with distributed systems, but there are many more. Coarse-
grained services still have their place. But in a world where we want to be fast and
nimble, streaming platforms raise the bar, allowing us to build finer-grained
services that behave as predictably in complex chains as they would standing
alone.
CHAPTER 13
Evolving Schemas and Data over Time
Schemas are the APIs used by event-driven services, so a publisher and sub‐
scriber need to agree on exactly how a message is formatted. This creates a logical
coupling between sender and receiver based on the schema they both share. In
the same way that request-driven services make use of service discovery technol‐
ogy to discover APIs, event-driven technologies need some mechanism to dis‐
cover what topics are available, and what data (i.e., schema) they provide.
There are a fair few options available for schema management: Protobuf and
JSON Schema are both popular, but most projects in the Kafka space use Avro.
For central schema management and verification, Confluent has an open source
Schema Registry that provides a central repository for Avro schemas.
Unfortunately, you can’t move or remove fields from a schema in a compatible
way, although it’s typically possible to synthesize a move with a clone. The data
will be duplicated in two places until such time as a breaking change can be
released.
This ability to evolve a schema with additive changes that don’t break old pro‐
grams is how most shared messaging models are managed over time.
The Confluent Schema Registry can be used to police this approach. The Schema
Registry provides a mapping between topics in Kafka and the schema they use
(Figure 13-1). It also enforces compatibility rules before messages are added. So
the Schema Registry will check every message sent to Kafka for Avro compatibil‐
ity, ensuring that incompatible messages will fail on publication.
Figure 13-1. Calling out to the Schema Registry to validate schema compatibility
when reading and writing orders in the orders service
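Wiring a service up to the registry is mostly configuration. A producer-side sketch, with an illustrative registry URL and an Avro-generated Order class assumed, might be:
Properties props = new Properties();
props.put("bootstrap.servers", "broker:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
//Serialize values as Avro; the serializer registers schemas with the registry,
//which rejects any change that breaks compatibility
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("schema.registry.url", "http://schema-registry:8081");
Producer<String, Order> producer = new KafkaProducer<>(props);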
• The orders service can dual-publish in both schemas at the same time, to two
topics, using Kafka’s transactions API to make the publication atomic. (This
approach doesn’t solve back-population so isn’t appropriate for topics used
for long-term storage.)
• The orders service can be repointed to write to orders-v2. A Kafka Streams
job is added to down-convert from the orders-v2 topic to the orders-v1 for
backward compatibility. (This also doesn’t solve back-population.) See
Figure 13-2.
• The orders service continues to write to orders-v1. A Kafka Streams job is
added that up-converts from orders-v1 topic to orders-v2 topic until all cli‐
ents have upgraded, at which point the orders service is repointed to orders-
v2. (This approach handles back-population.)
• The orders service can migrate its dataset internally, in its own database,
then republish the whole view into the log in the orders-v2 topic. It then
continues to write to both orders-v1 and orders-v2 using the appropriate
formats. (This approach handles back-population.)
All four approaches achieve the same goal: to give services a window in which
they can upgrade. The last two options make it easier to port historic messages
from the v1 to the v2 topics, as the Kafka Streams job will do this automatically if
it is started from offset 0. This makes it better suited to long-retention topics
such as those used in Event Sourcing use cases.
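The converting job itself is typically tiny. For the up-converting variant, a sketch (the serdes and conversion function are hypothetical) might be:
//Continuously up-convert v1 orders into the v2 topic until all writers have migrated
builder.stream("orders-v1", Consumed.with(Serdes.String(), orderV1Serde))
    .mapValues(v1Order -> convertToV2(v1Order))   //e.g., fill defaults for fields added in v2
    .to("orders-v2", Produced.with(Serdes.String(), orderV2Serde));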
Services continue in this dual-topic mode until fully migrated to the v2 topic, at
which point the v1 topic can be archived or deleted as appropriate.
As an aside, we discussed the single writer principle in Chapter 11. One of the
reasons for applying this approach is that it makes schema upgrades simpler. If
we had three different services writing orders, it would be much harder to sched‐
ule a non-backward-compatible upgrade without a conjoined release.
Deleting Data
When you keep datasets in the log for longer periods of time, or even indefi‐
nitely, there are times you need to delete messages, correct errors or corrupted
data, or redact sensitive sections. A good example of this is recent regulation like the General Data Protection Regulation (GDPR), which, among other things, gives
users the right to be forgotten.
The simplest way to remove messages from Kafka is to simply let them expire. By default, Kafka keeps data for one week (168 hours), and you can tune this to an arbitrarily large (or small) period of time. There is also an Admin API that lets you delete messages explicitly if they are older than some specified time or offset. When using Kafka for Event Sourcing or as a source of truth, you typically don't need delete. Instead, removal of a record is performed with a null value (a delete marker, or tombstone) written for that record's key to a compacted topic; when compaction runs, earlier messages for the key are removed.
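For example, a tombstone for a key might be written like this (the topic name and key are placeholders); once compaction runs, earlier records for that key disappear from the log:
//Write a null value (tombstone) so compaction removes this key's history
producer.send(new ProducerRecord<>("customers", customerId, null));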
Summary
In this chapter we looked at a collection of somewhat disparate issues that affect
event-driven systems. We considered the problem of schema change: something
that is inevitable in the real world. Often this can be managed simply by evolving
the schema with a format like Avro or Protobuf that supports backward compati‐
bility. At other times evolution will not be possible and the system will have to
undergo a non-backward-compatible change. The dual-schema upgrade window
is one way to handle this.
Then we briefly looked at handling unreadable messages as well as how data can be deleted. For many users deleting data won't be an issue; it will simply age out of the log. But for those who keep data for longer periods, this typically becomes important.
CHAPTER 14
Kafka Streams and KSQL
When it comes to building event-driven services, the Kafka Streams API provides the most complete toolset for handling a distributed, asynchronous world. Kafka Streams is designed to perform streaming computations. In Chapter 2 we discussed a simple example of such a use case, where we processed app open/close events emitted from mobile phones, and in Chapter 6 we touched on its stateful elements. This led us to three types of services we can build: event-driven, streaming, and stateful streaming.
In this chapter we look more closely at this unique tool for stateful stream pro‐
cessing, along with its powerful declarative interface: KSQL.
either by implementing a user-defined function (UDF) directly or, more com‐
monly, by pushing the output to a Kafka topic and using a native Kafka client, in
whatever language our service is built in, to process the manipulated streams one
message at a time. Whichever approach we take, these tools let us model business
operations in an asynchronous, nonblocking, and coordination-free manner.
Let’s consider something more concrete. Imagine we have a service that sends
emails to platinum-level clients (Figure 14-1). We can break this problem into
two parts. First, we prepare by joining a stream of orders to a table of customers
and filtering for the “platinum” clients. Second, we need code to construct and
send the email itself. We would do the former in the DSL and the latter with a
per-message function:
//Join customers and orders
orders.join(customers, Tuple::new)
      //Consider confirmed orders for platinum customers
      .filter((k, tuple) -> tuple.customer.level().equals(PLATINUM)
          && tuple.order.state().equals(CONFIRMED))
      //Send email for each customer/order pair
      .peek((k, tuple) -> emailer.sendMail(tuple));
Figure 14-1. An example email service that joins orders and customers, then sends
an email
We can perform the same operation using KSQL (Figure 14-2). The pattern is
the same; the event stream is dissected with a declarative statement, then pro‐
cessed one record at a time:
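The statement itself is not reproduced in this excerpt, but a KSQL query of roughly this shape would do the preparation step (the stream, table, and field names here are illustrative, not the book's):
CREATE STREAM platinum_emails AS
  SELECT *
  FROM orders o
  LEFT JOIN customers c ON o.customer_id = c.id
  WHERE c.level = 'PLATINUM'
    AND o.state = 'CONFIRMED';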
Incoming event streams are buffered for a defined period of time (denoted reten‐
tion). But to avoid doing all of this buffering in memory, state stores—disk-
backed hash tables—overflow the buffered streams to disk. Thus, regardless of
which event turns up later, the corresponding event can be quickly retrieved
from the buffer so the join operation can complete.
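For illustration, a windowed stream-stream join might be declared like this in the Kafka Streams DSL (the one-hour window and the OrderWithPayment type are assumptions):
//Join orders to payments that arrive within an hour of one another.
//Each side is buffered in a windowed, disk-backed state store for the
//retention period, so a late-arriving event can still complete the join.
KStream<String, OrderWithPayment> joined = orders.join(
    payments,
    (order, payment) -> new OrderWithPayment(order, payment),
    JoinWindows.of(Duration.ofHours(1)));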
Kafka Streams also manages whole tables. Tables are a local manifestation of a
complete topic—usually compacted—held in a state store by key. (You might
also think of them as a stream with infinite retention.) In a services context, such
tables are often used for enrichments. So to look up the customer’s email, we
might use a table loaded from the Customers topic in Kafka.
The nice thing about using a table is that it behaves a lot like tables in a database.
So when we join a stream of orders to a table of customers, there is no need to
worry about retention periods, windows, or other such complexities. Figure 14-4
shows a three-way join between orders, payments, and customers, where cus‐
tomers are represented as a table.
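A sketch of such an enrichment might look like the following, assuming orders are keyed by customer ID and that the Customer, Order, and EmailTuple types exist:
//Build a table from the compacted Customers topic, then use it to
//enrich each order with the customer's email address.
KTable<String, Customer> customers = builder.table("customers");
builder.<String, Order>stream("orders")
       .join(customers, (order, customer) -> new EmailTuple(order, customer.getEmail()))
       .to("orders-enriched");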
Figure 14-5. Using a state store to keep application-specific state within the Kafka
Streams API as well as backed up in Kafka
CHAPTER 15
Building Streaming Services with Kafka Streams
Starting from the lefthand side of Figure 15-1, the REST interface provides
methods to POST and GET orders. Posting an order creates an Order Created
event in Kafka. Three validation engines (Fraud, Inventory, Order Details) sub‐
scribe to these events and execute in parallel, emitting a PASS or FAIL based on
whether each validation succeeds. The result of these validations is pushed
through a separate topic, Order Validations, so that we retain the single writer
relationship between the orders service and the Orders topic.1 The results of the
various validation checks are aggregated back in the orders service, which then
moves the order to a Validated or Failed state, based on the combined result.
Validated orders accumulate in the Orders view, where they can be queried historically. This is an implementation of the CQRS design pattern (see “Command Query Responsibility Segregation” on page 61 in Chapter 7). The email service sends confirmation emails.
1 In this case we choose to use a separate topic, Order Validations, but we might also choose to update the Orders topic directly using the single-writer-per-transition approach discussed in Chapter 11.
The inventory service both validates orders and reserves inventory for the pur‐
chase—an interesting problem, as it involves tying reads and writes together
atomically. We look at this in detail later in this chapter.
Join-Filter-Process
Most streaming systems implement the same broad pattern where a set of
streams is prepared, and then work is performed one event at a time. This
involves three steps:
1. Join. The DSL is used to join a set of streams and tables emitted by other
services.
2. Filter. Anything that isn’t required is filtered out. Aggregations are often used
here too.
3. Process. The join result is passed to a function where business logic executes.
The output of this business logic is pushed into another stream.
This pattern is seen in most services but is probably best demonstrated by the
email service, which joins orders, payments, and customers, forwarding the
result to a function that sends an email. The pattern can be implemented in
either Kafka Streams or KSQL equivalently.
Figure 15-2. Close-up of the Orders Service, from Figure 15-1, demonstrating the
materialized view it creates which can be accessed via an HTTP GET; the view rep‐
resents the Query-side of the CQRS pattern and is spread over all three instances of
the Orders Service
Because data is partitioned it can be scaled out horizontally (Kafka Streams sup‐
ports dynamic load rebalancing), but it also means GET requests must be routed
to the right node—the one that has the partition for the key being requested. This
is handled automatically via the interactive queries functionality in Kafka
Streams.2
There are actually two parts to this. The first is the query, which defines what
data goes into the view. In this case we are grouping orders by their key (so new
orders overwrite old orders), with the result written to a state store where it can
be queried. We might implement this with the Kafka Streams DSL like so:
builder.stream(ORDERS.name(), serializer)
       .groupByKey(groupSerializer)
       .reduce((agg, newVal) -> newVal, getStateStore());
2 It is also common practice to implement such event-sourced views via Kafka Connect and your data‐
base of choice, as we discussed in “Query a Read-Optimized View Created in a Database” on page 69 in
Chapter 7. Use this method when you need a richer query model or greater storage capacity.
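The second part, serving the GET from the materialized view, is not shown in this excerpt, but with interactive queries it might look roughly like this (the store name and types are assumptions):
//Look up an order in the local state store to answer an HTTP GET.
//If the key is owned by another instance, the request must be proxied
//there; Kafka Streams exposes the host metadata needed to route it.
ReadOnlyKeyValueStore<String, Order> view = streams.store(
    StoreQueryParameters.fromNameAndType("orders-store",
        QueryableStoreTypes.keyValueStore()));
Order order = view.get(orderId);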
The first point should be pretty obvious: if we fail and we’re not wrapped in a
transaction, we have no idea what state the system will be in. But the second
point should be a little less clear, because for it to make sense we need to think
about this particular operation being scaled out linearly over several different
threads or machines.
Stateful stream processing systems like Kafka Streams have a novel and high-
performance mechanism for managing stateful problems like these concurrently.
We have a single critical section: first, check that there is enough stock available for the order (the product’s inventory less what has already been reserved); second, update the reserved-stock count to reserve the inventory for that order.
Let’s first consider how a traditional (i.e., not stateful) streaming system might
work (Figure 15-4). If we scale the operation to run over two parallel processes,
we would run the critical section inside a transaction in a (shared) database. So
both instances would bottleneck on the same database instance.
Stateful stream processing systems like Kafka Streams avoid remote transactions
or cross-process coordination. They do this by partitioning the problem over a
set of threads or processes using a chosen business key. (“Partitions and Parti‐
tioning” was discussed in Chapter 4.) This provides the key (no pun intended) to
scaling these systems horizontally.
Partitioning in Kafka Streams works by rerouting messages so that all the state
required for one particular computation is sent to a single thread, where the
computation can be performed.3 The approach is inherently parallel, which is
how streaming systems achieve such high message-at-a-time processing rates
(for example, in the use case discussed in Chapter 2). But the approach works
only if there is a logical key that cleanly segregates all operations: both state that
they need, and state they operate on.
So splitting (i.e., partitioning) the problem by ProductId ensures that all opera‐
tions for one ProductId will be sequentially executed on the same thread. That
means all iPads will be processed on one thread, all iWatches will be processed
on one (potentially different) thread, and the two will require no coordination
between each other to perform the critical section (Figure 15-5). The resulting
operation is atomic (thanks to Kafka’s transactions), can be scaled out horizon‐
tally, and requires no expensive cross-network coordination. (This is similar to
the Map phase in MapReduce systems.)
3 As an aside, one of the nice things about this feature is that it is managed by Kafka, not Kafka Streams.
Kafka’s Consumer Group Protocol lets any group of consumers control how partitions are distributed
across the group.
The inventory service must rearrange orders so they are processed by ProductId.
This is done with an operation called a rekey, which pushes orders into a new
intermediary topic in Kafka, this time keyed by ProductId, and then back out to
the inventory service. The code is very simple:
orders.selectKey((id, order) -> order.getProduct())//rekey by ProductId
Part 2 of the critical section is a state mutation: inventory must be reserved. The
inventory service does this with a Kafka Streams state store (a local, disk-resident
hash table, backed by a Kafka topic). So each thread executing will have a state
store for “reserved stock” for some subset of the products. You can program with
these state stores much like you would program with a hash map or key/value
store, but with the benefit that all the data is persisted to Kafka and restored if the
process restarts. A state store can be created in a single line of code:
KeyValueStore<Product, Long> store = context.getStateStore(RESERVED);
Then we make use of it, much like a regular hash table:
//Get the current reserved stock for this product (null if nothing is reserved yet)
Long reserved = store.get(order.getProduct());
long current = (reserved == null) ? 0L : reserved;
//Add the quantity for this order and submit it back
store.put(order.getProduct(), current + order.getQuantity());
Writing to the store also partakes in Kafka’s transactions, discussed in Chap‐
ter 12.
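Turning those transactional guarantees on for a streams application is a single configuration setting; as a sketch (the application ID and bootstrap servers are placeholders):
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "inventory-service");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
//State store updates, changelog writes, and output records then commit atomically
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);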
Rekey to Join
We can apply exactly the same technique used in the previous section, for parti‐
tioning writes, to partitioning reads (e.g., to do a join). Say we want to join a
stream of orders (keyed by OrderId) to a table of warehouse inventory (keyed by
ProductId), as we do in Figure 15-3. The join will have to use the ProductId.
Figure 15-6. To perform a join between orders and warehouse inventory by Pro‐
ductId, orders are repartitioned by ProductId, ensuring that for each product all
corresponding orders will be on the same instance
There are limitations to this approach, though. The keys used to partition the
event streams must be invariant if ordering is to be guaranteed. So in this particu‐
lar case it means the keys, ProductId and OrderId, on each order must remain
fixed across all messages that relate to that order. Typically, this is a fairly easy
thing to manage at a domain level (for example, by enforcing that, should a user
want to change the product they are purchasing, a brand new order must be cre‐
ated).
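A sketch of this repartition-then-join, with assumed type and topic names, might be:
//Rekey orders by ProductId; the key change triggers a repartition before
//the join, so each order lands on the instance holding that product's state.
orders.selectKey((orderId, order) -> order.getProduct())
      .join(warehouseInventory,
            (order, stock) -> new OrderWithStock(order, stock))
      .to("orders-with-stock");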
Basket writer/view
These represent an implementation of CQRS, as discussed in “Command
Query Responsibility Segregation” on page 61 in Chapter 7. The Basket
writer proxies HTTP requests, forwarding them to the Basket topic in Kafka
when a user adds a new item. The Confluent REST proxy (which ships with
the Confluent distribution of Kafka) is used for this. The Basket view is an event-sourced view, implemented in Kafka Streams, with the contents of its state store available for other services to query.
Summary
When we build services using a streaming platform, some will be stateless—sim‐
ple functions that take an input, perform a business operation, and produce an
output. Some will be stateful, but read only, as in event-sourced views. Others
will need to both read and write state, either entirely inside the Kafka ecosystem
(and hence wrapped in Kafka’s transactional guarantees), or by calling out to
other services or databases. One of the most attractive properties of a stateful
stream processing API is that all of these options are available, allowing us to
trade the operational ease of stateless approaches for the data processing capabil‐
ities of stateful ones.
But there are of course drawbacks to this approach. While standby replicas,
checkpoints, and compacted topics all mitigate the risks of pushing data to code,
there is always a worst-case scenario where service-resident datasets must be
rebuilt, and this should be considered as part of any system design.
There is also a mindset shift that comes with the streaming model, one that is
inherently more asynchronous and adopts a more functional and data-centric
style, when compared to the more procedural nature of traditional service inter‐
faces. But this is—in the opinion of this author—an investment worth making.
In this chapter we looked at a very simple system that processes orders. We did
this with a set of small streaming microservices that implement the Event Collab‐
oration pattern we discussed in Chapter 5. Finally, we looked at how we can cre‐
ate a larger architecture using the broader range of patterns discussed in this
book.