O’Reilly Report: What Is Distributed SQL?
REPORT
What Is Distributed SQL?
Scale, Resilience, and Data Locality for Modern Applications
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. What Is
Distributed SQL?, the cover image, and related trade dress are trademarks of
O’Reilly Media, Inc.
The views expressed in this work are those of the authors and do not represent the
publisher’s views. While the publisher and the authors have used good faith efforts
to ensure that the information and instructions contained in this work are accurate,
the publisher and the authors disclaim all responsibility for errors or omissions,
including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this
work is at your own risk. If any code samples or other technology this work contains
or describes is subject to open source licenses or the intellectual property rights of
others, it is your responsibility to ensure that your use thereof complies with such
licenses and/or rights.
This work is part of a collaboration between O’Reilly and Cockroach Labs. See our
statement of editorial independence.
978-1-098-11645-3
Table of Contents
4. Looking Forward
    Distributed SQL: The Enabler
    Serverless
    The Distributed Mindset: An Exhortation
The Distributed Mindset
Resilience
It doesn’t matter if one computer or one datacenter disappears.
Applications exist in computing fabric and simply survive
failures.
Locality
Computing resources go where they are needed, no matter where
that is.
The distributed mindset and cloud computing concepts are understood and
accepted in the areas of storage, compute, development, deployment, and
analytical data—but there is a straggler: the transactional database. In
recent years, however, traditional SQL (Structured Query Language) databases
have been reimagined to incorporate this distributed mindset, establishing an
emerging category known as distributed SQL.
Distributed SQL suits a wide range of applications and eliminates complex
challenges, such as sharding, that burden traditional RDBMSs. Distributed
SQL databases can scale to become global without introducing the consistency
trade-offs found in NoSQL solutions. Distributed SQL comes to life through
cloud computing, where legacy databases simply can’t rise to meet this
elastic and ubiquitous paradigm.
This report will take you through why the distributed SQL category
has emerged, what it consists of, how it can be (and is) used, and
what the future holds for it.
Let’s start with the motivation.
authentication with easily integrated Firebase Authentication. Pretty
simple.
The application toolkit can run entirely on small, cloud-hosted
development systems. There, developers run isolated versions of
their system and unit-test feature builds. When they’re ready, they
initiate a full suite of cloud-based automated tests. If those tests
pass, code and assets automatically deploy to a global content distri‐
bution network, and automatically update a set of virtual machines
or containers. Google is not alone in the distributed application
development game. See Table 1-1 for an incomplete list of products
offered by major cloud vendors in this same space.
PaaS stands for platform as a service; SaaS stands for software as a service.
Totals may not add up due to rounding. Source: Gartner (November 2019).
Again, existing SQL databases haven’t kept up. They are not partic‐
ularly resilient to outages in infrastructure zones, especially when
the zone that goes offline contains the main transactional instance.
And while NoSQL alternatives help with this particular challenge,
they can’t promise transactional consistency, and they present developers
with the complexity of a document model that lacks the elegance and
power of the relational model (for example, normalization, referential
integrity, secondary indexes, and joins).
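To make those relational features concrete, here is a minimal sketch of referential integrity, a secondary index, and a join. It uses Python’s built-in SQLite driver purely as a convenient stand-in; the table and column names are hypothetical, and the SQL is generic enough that distributed SQL databases accept very similar statements.

```python
import sqlite3

# In-memory database; enable foreign key enforcement for this connection.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total REAL)""")
# A secondary index speeds lookups by a non-primary-key column.
conn.execute("CREATE INDEX orders_by_customer ON orders(customer_id)")

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (10, 1, 42.0)")

# A join reassembles normalized data without duplicating it in a document.
row = conn.execute("""SELECT c.name, o.total FROM orders o
                      JOIN customers c ON c.id = o.customer_id""").fetchone()
print(row)  # ('Ada', 42.0)

# Referential integrity: an order for a nonexistent customer is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (11, 99, 5.0)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

A document store would typically embed the order inside the customer document; the relational approach keeps each fact in one place and lets the database enforce the relationship.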
Distributed SQL databases fill in the gaps. They are highly resilient
to any type of outage. They deliver the time-tested and familiar
SQL query syntax that developers know and love and promise
truly consistent ACID transactions. And most importantly, they are
aligned with the elastic scale and ubiquitous nature of our cloud
infrastructure.
overlap each other and create strange read or write scenarios. After
we lay out the key concepts, we’ll look at Google’s Spanner project, a
pioneer in distributed SQL.
CAP Theorem
Computer scientist Eric Brewer published the foundational pieces of
the CAP theorem in the late 1990s. It lays out the boundaries for
what a distributed data store can guarantee about three desirable
read and write capabilities:
Consistency
Every data node provides the latest state of requested data. If it
doesn’t have the latest state, it won’t provide a state.
Availability
Every data node can always read and write data in the system.
Partition tolerance
The data system works even if partitions occur in the network.
Partitions mean network functionality has broken between two
or more data nodes in a distributed system.
These three capabilities relate in such a way that a distributed data
store can guarantee at most two of them at once; because partitions can
never be ruled out, the practical choice during a partition is between
consistency and availability. Looking at the Venn diagram (Figure 2-1),
the CAP theorem could be restated to mean that there is no central
three-circle overlap—no distributed data store can guarantee all three
at the same time.
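The trade-off can be sketched in a few lines of code. This is a toy model, not a real database: two replicas hold a value, a partition cuts one off from its peer, and a "CP" read refuses to answer when it can't confirm the latest state, while an "AP" read always answers and may return stale data.

```python
class Replica:
    def __init__(self, value):
        self.value = value
        self.reachable = True  # can this replica sync with its peer?

def read(replica, mode):
    """mode is 'CP' (consistency over availability) or 'AP' (the reverse)."""
    if mode == "CP" and not replica.reachable:
        raise RuntimeError("unavailable: cannot confirm latest state")
    return replica.value  # AP: always answers, possibly stale

# Before the partition, both replicas agree on the value 1.
a, b = Replica(1), Replica(1)

# A write reaches replica a; the network partitions before it syncs to b.
a.value = 2
b.reachable = False

print(read(a, "CP"))   # 2: a is up to date and reachable
print(read(b, "AP"))   # 1: stale, but available
try:
    read(b, "CP")      # refuses to answer rather than serve stale data
except RuntimeError as e:
    print(e)
```

Real systems make this choice with quorums and consensus rather than a boolean flag, but the shape of the decision during a partition is the same.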
Leader election
Systems implementing Raft split nodes into groups. Each group has
a leader node, which handles incoming data requests from clients
and sends its log to other nodes. The leader is the authoritative
source of truth within the replica group. The leader election process
provides Raft-implementing systems a way to determine which node
is the leader, and when to redetermine leadership. Each period of
leadership is called a term and is identified by a monotonically
increasing number.
A leader node sends heartbeat messages to its followers. When a
follower node doesn’t receive a heartbeat from the leader after a
certain interval, it makes itself a candidate node. A candidate node
assigns itself a new term number and sends messages to other nodes
to request their vote. If the candidate node receives votes from a
majority of other nodes, it becomes the leader and starts sending
heartbeat messages.
Other nodes in the group may have undergone the follower-to-
candidate process as well. Those nodes have chosen their own term
number and are still waiting for election results. Recall that log
data flows one way in the Raft group: the leader sends log entries to
followers. When the newly elected leader node sends log data messages,
it includes the term number it was elected under. When candidate nodes
receive log data messages with a term number at least as high as their
own, they accept that log entry and revert to the follower state.
It’s possible for an election cycle to end with no clear winner. Raft
minimizes this by giving each node a randomized election timeout as it
enters a new election cycle, so that the nodes are unlikely to enter
the candidate state at the same time.
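The election mechanics above can be sketched as a small state machine. This is an illustrative simplification only: real Raft implementations also compare log completeness before granting votes, persist state to disk, and run these steps over the network. All names here are hypothetical.

```python
import random

FOLLOWER, CANDIDATE, LEADER = "follower", "candidate", "leader"

class Node:
    def __init__(self, node_id, cluster_size):
        self.id = node_id
        self.cluster_size = cluster_size
        self.state = FOLLOWER
        self.term = 0
        self.voted_for = None
        # Randomized election timeout keeps nodes from all becoming
        # candidates at once (the Raft paper suggests 150-300 ms).
        self.election_timeout = random.uniform(0.150, 0.300)

    def on_election_timeout(self):
        """No heartbeat arrived in time: become a candidate for a new term."""
        self.state = CANDIDATE
        self.term += 1
        self.voted_for = self.id  # a candidate votes for itself
        return {"term": self.term, "candidate": self.id}

    def on_request_vote(self, request):
        """Grant a vote if the candidate's term is newer than ours."""
        if request["term"] > self.term:
            self.term = request["term"]
            self.state = FOLLOWER
            self.voted_for = request["candidate"]
            return True
        return False

    def on_votes(self, granted):
        """Become leader on a majority (counting our own vote)."""
        if self.state == CANDIDATE and granted + 1 > self.cluster_size // 2:
            self.state = LEADER
        return self.state

# Three-node cluster: n0 times out first, requests votes, and wins.
n0, n1, n2 = (Node(i, 3) for i in range(3))
req = n0.on_election_timeout()
votes = sum(n.on_request_vote(req) for n in (n1, n2))
print(n0.on_votes(votes))  # leader
```

Note how granting a vote also updates the follower's term, which is exactly the mechanism that makes stale candidates step down when they later hear from the winner.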
Log replication
The leader of a Raft group maintains a log of data activity. As the
data comes from the leader and moves on to the followers, the
leader needs to keep all the node logs in sync with its own. It does
this through the Raft log replication process.
A client-initiated request to operate on data is added to the leader’s
log. This log entry includes an incremented index marking the log
entry’s position in the log, and the term number that the leader has
been elected with. The leader then sends every follower a message
containing the log entry, its index, the election term, and the index
of the preceding entry, retrying until the followers replicate it.
When a majority of followers respond in the affirmative, the leader
commits the data change in the log and responds to the client with a
success message.
The key to maintaining log consistency is that followers reject log
entries if their logs contain no entry matching the previous index and
term. The leader then moves backward through the follower’s indices to
find where the two logs agree; from that point, the leader forces the
follower to duplicate the leader’s log.
Safety
In its election and replication processes, Raft ensures that the
following five rules apply, so that all nodes apply the same log
entries in the same order. Ongaro and Ousterhout from Stanford briefly
lay out the safety mechanisms in their original Raft paper:1
Election safety
At most one leader can be elected in a given term.
Leader append-only
A leader never overwrites or deletes entries in its log; it only
appends new entries.
Google Spanner
In the implementation of distributed SQL, everything started with
Spanner. It takes the key elements we’ve described (distributed
consensus and multiversion concurrency control) and turns them into
distributed SQL’s first real-life database product. Spanner reached
the public eye with the publication of a paper describing it in 2012.2
The paper’s abstract describes it well: “Spanner is Google’s scalable,
multiversion, globally-distributed, and synchronously-replicated
database. It is the first system to distribute data at global scale
and support externally-consistent distributed transactions.”
Google had data needs that stretched the boundaries of existing
database systems. Very large datasets like Google Ads had grown
to enormous size, spread all over the world, in a highly sharded
MySQL system. Having many shards makes management, mainte‐
nance, and upgrades incredibly difficult. And Ads wasn’t the only
system facing this problem. It was time to make it easier for pro‐
grammers to build applications that fit the growing need for global
transactional data. The Google team invented Spanner and put into
practice the previously discussed consensus and MVCC concepts,
using a different consensus algorithm called Paxos.
Management
This could be said for any distributed system: if it’s mission critical,
it needs to be managed. Distributed SQL is no different. Enterprises
need a variety of ways to visualize and control database resources.
3 See David A. Bacon et al. “Spanner: Becoming a SQL System.” Paper included in
the Proceedings of the 2017 ACM International Conference on Management of Data
(SIGMOD’17). Association for Computing Machinery, New York, NY, May 2017, 331–
43. https://doi.org/10.1145/3035918.3056103.
Security
Beyond monitoring and control, enterprises need security. Dis‐
tributed SQL databases support a variety of authentication mech‐
anisms. Username and password are a given for any system, but
many others are available. For the data protection side of security,
almost all distributed SQL products support row-level permissions
and in-motion and at-rest encryption.
Integration
These days, no enterprise runs just one large-scale system. They
need to integrate many systems. CockroachDB and YugabyteDB
have wire compatibility with PostgreSQL, so integration tools that
connect to databases are likely to have drivers that already work.
Google Cloud Spanner, while not wire-compatible with other stan‐
dard database drivers, has deep integration with other Google Cloud
Platform products and a growing set of open source connectors.
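In practice, wire compatibility means an ordinary PostgreSQL connection string and driver just work. The sketch below builds such a string; the host, database, and user names are hypothetical placeholders, and the ports shown are the documented defaults (26257 for CockroachDB, 5433 for YugabyteDB's YSQL interface).

```python
def make_dsn(host, port, db, user, sslmode="require"):
    """Build a libpq-style PostgreSQL connection string."""
    return f"postgresql://{user}@{host}:{port}/{db}?sslmode={sslmode}"

# Hypothetical CockroachDB cluster endpoint (default port 26257).
dsn = make_dsn("crdb.example.internal", 26257, "app", "app_user")
print(dsn)

# Any PostgreSQL driver can then connect, for example psycopg2
# (not executed here, since it requires a running cluster):
#   import psycopg2
#   conn = psycopg2.connect(dsn)
#   with conn.cursor() as cur:
#       cur.execute("SELECT version()")
```

Because the database speaks the PostgreSQL protocol, the same pattern applies to ORMs, migration tools, and BI connectors that already ship PostgreSQL support.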
Databases are always part of a larger (and often quite complex)
enterprise data architecture that will allow for streaming analytics,
data warehousing, AI/ML, and other data operations. The database
should fit into this landscape and be able to use shared storage
or change data capture (CDC) to integrate and work with these
systems.
Usage
A major US telecommunications company uses a virtual customer
support agent as the triage point for customer requests. This vir‐
tual agent was originally built based on a standard RDBMS—which
caused major pain when a cloud provider connectivity issue made
the database unavailable. To mitigate this in the future, this telecom
reengineered the virtual agent’s backend to run on a distributed SQL
solution with high availability.
A Greek telecommunications company provides businesses with
VoIP telephony. A build-out of their new Session Initiation Protocol
(SIP) platform required a global database that could live anywhere—
on-premises or in the cloud—and deliver high performance in both
throughput and latency. Providing VoIP services to businesses
means ensuring that those businesses never miss a call from their
customers, so the database solution also required extreme resilience
to regional outages.
Deployment
Hybrid deployment of distributed SQL means that some servers are
hosted in private datacenters and some are hosted in public clouds.
In both cases here, hybrid deployment was key. Telecoms often have
their own datacenters, so their distributed computing landscape
includes both on-premises and cloud computing.
Usage
A global retailer with hundreds of physical stores in dozens of coun‐
tries faced significant “distributed” challenges. A set of legacy sys‐
tems based on MySQL limited their ability to scale, and availability
was becoming an issue more and more often. They moved to using
a set of managed instances of a distributed SQL system to process
orders for wholesale partners. The success of this application has led
to plans for further electronic data interchange (EDI) integration and,
later, moving all product data to the distributed SQL system.
A data platform that helps retailers match promotions and tactics to
customers’ usage habits chose to consolidate several types of existing
database systems into one system. This simplified their data infra‐
structure and enabled extremely responsive, fast query operations.
Moving to a single database spread across multiple availability zones
made their data more highly available than ever.
A business-to-business (B2B) company creates white-label experiences
for its retail customers, tuned for before and after a purchase. A
growing set of retailers and brands onboarding to its platform made
scaling on AWS databases very expensive, especially considering
large-impact scaling events like Black Friday and Cyber Monday.
Switching to multicloud distributed SQL saved money versus AWS
solutions, kept them in compliance with customers’ EU General
Data Protection Regulation (GDPR) requirements, and simplified
operations.
Deployment
Two styles of deployment are common in the retail sector: multi‐
cloud and managed cloud. A multicloud deployment means hosting
replicas of the distributed SQL system in datacenters managed by
different cloud providers. For example, one CockroachDB system
addressable from one entry point might have hosts in DigitalOcean,
AWS, and GCP. Multicloud is particularly helpful in minimizing
latency while maintaining availability.
Managed cloud means letting a service provider do the heavy lifting
of operations for your distributed SQL database. Often, you still
have a limited choice in which cloud provider hosts your database.
A managed cloud deployment is especially powerful when you have
a small engineering team or your technical resources do not focus
on system administration, but your solutions still require distributed
SQL features.
Usage
A leading gaming company processes billions of financial transac‐
tions per year. Forecasting continued growth and new markets, they
chose a distributed SQL solution to manage customer wallet data,
so that georeplication could enhance responsiveness and compliance
for customers no matter their location.
A game development studio has created successful games for years.
As they built on the success of their intellectual property, they
began to see scalability issues with their Aurora MySQL databases.
They migrated to a new distributed SQL solution with automated
Deployment
Multicloud deployments are common in gaming, which makes
sense in light of georeplication and scaling needs. For gaming com‐
panies with lots of in-house administration skill, Kubernetes is a
popular choice for orchestrating distributed SQL clusters. Vendors
such as Cockroach Labs and Yugabyte offer Kubernetes operators to
simplify operations.
Themes
In examining use cases, a few other themes pop up alongside
industry-specific use cases. Understanding real-world use wouldn’t
be complete without these.
Internet of Things (IoT) use cases abound across industries. Logistics,
social, gaming, and other sectors all need to capture events as they
happen on connected devices. For example, with the ability to scale
to enormous size and keep data localized, distributed SQL is a great
way to keep a multinational company in tune with its plant opera‐
tions. Or when trucks, ships, and containers all move constantly—
distributed SQL presents a single source of truth on tracking and
load data. Gaming consoles and peripherals, social groups, and edu‐
cation all benefit from global-yet-local data.
Cost is, of course, a huge factor for any company considering tech‐
nology solutions. Across industries and use cases, customers report
cost reduction as a major benefit of standardizing on distributed
SQL. Scale is a well-known cost killer in cloud platform-as-a-service
solutions like Amazon DynamoDB or Azure SQL Database. Taking tight control
of when and how to scale with Kubernetes and multicloud deploy‐
ments makes distributed SQL an effective way to reduce total cost of
ownership of database solutions.
The final chapter of this report is your crystal ball, looking forward
at distributed SQL and other distributed-computing-enabled
capabilities.
Serverless
Since the mid-2010s, a distributed paradigm known as serverless has
risen in popularity. Its most common form is code functions that
can be invoked remotely from anywhere, with the distinguishing
feature that while they execute on computers, they don’t need to
be managed in any way as physical machines (see Table 4-1 for
hyperscaler serverless function solutions).
For example, in AWS Lambda a developer can create a utility function
in whatever language they like, and when the function is invoked, the
hosting environment provides the resources to execute it. If no
instance of the function code is running, one is quickly started to
respond to the request. The developer does not know or care about
details at the machine level, and the hosting environment automatically
scales up or down to as much computing power as needed to respond to
multiple invocations of the function.
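A minimal sketch of such a function, written in the AWS Lambda handler style: the platform invokes handler(event, context), and there is no server for the developer to provision. The event shape and field names below are hypothetical examples, not a fixed AWS schema.

```python
import json

def handler(event, context):
    """Echo a greeting; hosting and scaling are the platform's concern."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Locally, the handler is just a function call; in production the
# platform constructs the event and invokes it on demand.
print(handler({"name": "distributed SQL"}, None))
```

The same function-as-unit-of-deployment idea appears, with differing handler signatures, across the vendors in Table 4-1.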
Table 4-1. Just some of the serverless function compute solutions available
in the market today

Vendor                   Serverless function product
Alibaba Cloud            Function Compute
Amazon Web Services      AWS Lambda
Cloudflare               Cloudflare Workers
Google Cloud Platform    Cloud Functions, Firebase Functions
Microsoft Azure          Azure Functions
Netlify                  Netlify Functions
Figure 4-1. The CockroachDB approach to multitenancy with server‐
less distributed SQL