Notes 01
Introduction From networked systems to distributed systems
Alternative approach
Two definitions
• A decentralized system is a networked computer system in which
processes and resources are necessarily spread across multiple
computers.
• A distributed system is a networked computer system in which processes
and resources are sufficiently spread across multiple computers.
Important
There are many poorly founded misconceptions regarding scalability, fault
tolerance, security, etc. We need to develop the skills to understand
distributed systems well enough to judge such misconceptions.
Introduction Design goals
Sharing resources
Canonical examples
• Cloud-based shared storage and files
• Peer-to-peer assisted multimedia streaming
• Shared mail services (think of outsourced mail systems)
• Shared Web hosting (think of content distribution networks)
Observation
“The network is the computer”
(quote from John Gage, then at Sun Microsystems)
Distribution transparency
What is transparency?
The phenomenon by which a distributed system attempts to hide the fact that
its processes and resources are physically distributed across multiple
computers, possibly separated by large distances.
Observation
Distribution transparency is handled through many different techniques in a
layer between applications and operating systems: a middleware layer.
Distribution transparency
Types
Transparency   Description
Access         Hide differences in data representation and how an object is accessed
Location       Hide where an object is located
Relocation     Hide that an object may be moved to another location while in use
Migration      Hide that an object may move to another location
Replication    Hide that an object is replicated
Concurrency    Hide that an object may be shared by several independent users
Failure        Hide the failure and recovery of an object
Degree of transparency
Conclusion
Distribution transparency is a nice goal, but achieving it is a different story, and
it should often not even be aimed at.
Openness
On strict separation
Observation
The stricter the separation between policy and mechanism, the more we need
to ensure proper mechanisms, potentially leading to many configuration
parameters and complex management.
Finding a balance
Hard-coding policies often simplifies management, and reduces complexity at
the price of less flexibility. There is no obvious solution.
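The trade-off can be made concrete with a minimal sketch, assuming a small in-memory cache as the running example. The class `Cache` and the functions `evict_oldest` and `evict_newest` are hypothetical names chosen for illustration. The cache provides the mechanism (bounded storage and eviction), while the policy (which entry to evict) is passed in from outside.

```python
from collections import OrderedDict


def evict_oldest(entries):
    """Policy: pick the entry that was inserted first."""
    return next(iter(entries))


def evict_newest(entries):
    """Policy: pick the entry that was inserted last."""
    return next(reversed(entries))


class Cache:
    """Mechanism: a bounded key-value store. Which entry to drop is delegated."""

    def __init__(self, capacity, policy):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.policy = policy          # the only configuration parameter
        self.entries = OrderedDict()  # remembers insertion order

    def put(self, key, value):
        # Evict only when a new key would exceed the capacity.
        if key not in self.entries and len(self.entries) >= self.capacity:
            del self.entries[self.policy(self.entries)]
        self.entries[key] = value

    def keys(self):
        return list(self.entries)
```

Hard-coding `evict_oldest` inside `put` would remove one parameter and simplify use, at the price of not being able to swap the policy later.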
Dependability
Basics
A component provides services to clients. To provide services, the component
may require the services from other components ⇒ a component may depend
on some other component.
Specifically
A component C depends on C* if the correctness of C's behavior depends on
the correctness of C*'s behavior. (Components are processes or channels.)
Dependability
Requirement       Description
Availability      Readiness for usage
Reliability       Continuity of service delivery
Safety            Very low probability of catastrophes
Maintainability   How easily a failed system can be repaired
Traditional metrics
• Mean Time To Failure (MTTF): The average time until a component fails.
• Mean Time To Repair (MTTR): The average time needed to repair a
component.
• Mean Time Between Failures (MTBF): Simply MTTF + MTTR.
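As a small worked example, the metrics above can be combined in a few lines. The helper names `mtbf` and `availability` are hypothetical. The availability formula MTTF / (MTTF + MTTR) is the standard steady-state definition and is an addition here, not something stated in these notes.

```python
def mtbf(mttf, mttr):
    """Mean time between failures: simply MTTF + MTTR (same time unit for both)."""
    return mttf + mttr


def availability(mttf, mttr):
    """Long-run fraction of time a component is up (standard definition, assumed here)."""
    return mttf / mtbf(mttf, mttr)


# Example: a component that runs 999 hours on average and takes 1 hour to repair.
print(mtbf(999.0, 1.0))          # total cycle length in hours
print(availability(999.0, 1.0))  # fraction of time the component is usable
```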
Terminology
Handling faults
On security
Observation
A distributed system that is not secure is not dependable.
What we need
• Confidentiality: information is disclosed only to authorized parties
• Integrity: alterations to assets of a system can be made only in an
authorized way
Security mechanisms
Keeping it simple
It’s all about encrypting and decrypting data using security keys.
Notation
K(data) denotes that we use key K to encrypt/decrypt data.
Security mechanisms
Symmetric cryptosystem
With encryption key E_K(data) and decryption key D_K(data):
if data = D_K(E_K(data)) then D_K = E_K. Note: the encryption and decryption
keys are the same and should be kept secret.
Asymmetric cryptosystem
Distinguish a public key PK(data) and a private (secret) key SK(data).
• Encrypt message from Alice to Bob:
\[ \text{data} = \underbrace{SK_{bob}\bigl(\overbrace{PK_{bob}(\text{data})}^{\text{Sent by Alice}}\bigr)}_{\text{Action by Bob}} \]
• Sign message for Bob by Alice:
\[ \bigl[\text{data},\ \underbrace{\text{data} \stackrel{?}{=} PK_{alice}(SK_{alice}(\text{data}))}_{\text{Check by Bob}}\bigr] = \underbrace{[\text{data},\ SK_{alice}(\text{data})]}_{\text{Sent by Alice}} \]
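The two uses of an asymmetric key pair can be tried out with a toy sketch, assuming textbook RSA with tiny primes. This is an illustration only and is not secure. The helper `make_keypair` and the chosen primes are assumptions made for the example; the notes do not name a specific cryptosystem.

```python
def make_keypair(p, q, e=17):
    """Return (PK, SK) functions for toy textbook RSA. NOT secure: tiny primes, no padding."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e; needs Python 3.8+

    def PK(x):  # public operation: anyone may apply it
        return pow(x, e, n)

    def SK(x):  # private operation: only the key owner can apply it
        return pow(x, d, n)

    return PK, SK


PK_bob, SK_bob = make_keypair(61, 53)
PK_alice, SK_alice = make_keypair(47, 59)

data = 65  # a message encoded as a small integer (must be below both moduli)

# Encryption: Alice sends PK_bob(data); Bob recovers data with SK_bob.
ciphertext = PK_bob(data)
recovered = SK_bob(ciphertext)

# Signing: Alice sends [data, SK_alice(data)]; Bob checks with PK_alice.
signature = SK_alice(data)
signature_ok = PK_alice(signature) == data
```

Real systems sign a hash of the message rather than the message itself and use vetted libraries with proper padding.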
Security mechanisms
Secure hashing
In practice, we use secure hash functions: H(data) returns a fixed-length
string.
• Any change from data to data* will lead to a completely different string
H(data*).
• Given a hash value h, it is computationally infeasible to find data such that
h = H(data).
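Both properties are easy to observe with a minimal sketch, assuming SHA-256 from Python's standard `hashlib` as the hash function. The notes do not prescribe a particular algorithm, and the wrapper name `H` simply mirrors the notation above.

```python
import hashlib


def H(data: bytes) -> str:
    """Return the SHA-256 digest of data as a fixed-length hex string."""
    return hashlib.sha256(data).hexdigest()


original = H(b"transfer 100 euro to Bob")
tampered = H(b"transfer 900 euro to Bob")  # a single character changed

print(len(original), len(tampered))  # both digests have the same length
print(original == tampered)          # the digests differ
```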
Observation
Many developers of modern distributed systems easily use the adjective
“scalable” without making clear why their system actually scales.
Observation
Most systems account only, to a certain extent, for size scalability. A common
solution: multiple powerful servers operating independently in parallel. Today,
the challenge still lies in geographical and administrative scalability.
Size scalability
Formal analysis
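With arrival rate λ and service rate µ, the fraction of time having k requests
in the system is
\[ p_k = \left(1 - \frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^{k} \]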
Formal analysis
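Utilization U (fraction of time the server is busy):
\[ U = \sum_{k>0} p_k = 1 - p_0 = \frac{\lambda}{\mu} \;\Rightarrow\; p_k = (1-U)U^{k} \]
Average number of requests in the system:
\[ N = \sum_{k \ge 0} k \cdot p_k = \sum_{k \ge 0} k \cdot (1-U)U^{k} = (1-U)\sum_{k \ge 0} k \cdot U^{k} = \frac{(1-U)U}{(1-U)^{2}} = \frac{U}{1-U} \]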
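The computation of N relies on a closed form for the sum of k · U^k that is used without proof. A short derivation, assuming 0 ≤ U < 1 so that the geometric series converges:

\[
\sum_{k \ge 0} k\,U^{k}
= U \frac{d}{dU} \sum_{k \ge 0} U^{k}
= U \frac{d}{dU}\left(\frac{1}{1-U}\right)
= \frac{U}{(1-U)^{2}}
\]

Multiplying by (1 − U) then gives N = U / (1 − U).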
Average throughput
\[ X = \underbrace{U \cdot \mu}_{\text{server at work}} + \underbrace{(1-U) \cdot 0}_{\text{server idle}} = \frac{\lambda}{\mu} \cdot \mu = \lambda \]
Formal analysis
\[ R = \frac{N}{X} = \frac{S}{1-U} \;\Rightarrow\; \frac{R}{S} = \frac{1}{1-U} \]
with \( S = \frac{1}{\mu} \) being the service time.
Observations
• If U is small, the response-to-service time ratio R/S is close to 1: a
request is processed immediately.
• If U approaches 1, the system comes to a grinding halt.
Solution: decrease S.
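The formulas of the formal analysis can be evaluated numerically to see how quickly the response time grows. The following is a minimal sketch, assuming a stable queue (arrival rate λ strictly below service rate µ). The function names `p_k` and `queue_metrics` are hypothetical.

```python
def p_k(U, k):
    """Fraction of time with exactly k requests in the system."""
    return (1 - U) * U ** k


def queue_metrics(lam, mu):
    """Return U, N, X, S, R and R/S for arrival rate lam and service rate mu."""
    if not 0 < lam < mu:
        raise ValueError("need 0 < lam < mu for a stable queue")
    U = lam / mu      # utilization
    N = U / (1 - U)   # average number of requests in the system
    X = lam           # average throughput
    S = 1 / mu        # service time
    R = N / X         # response time, equal to S / (1 - U)
    return {"U": U, "N": N, "X": X, "S": S, "R": R, "R_over_S": R / S}


# R/S stays near 1 at low utilization and grows sharply as U approaches 1.
for lam in (1.0, 5.0, 9.0, 9.9):
    m = queue_metrics(lam, mu=10.0)
    print(f"U={m['U']:.2f}  N={m['N']:.2f}  R/S={m['R_over_S']:.2f}")
```

Decreasing S (a faster server, hence a larger µ) lowers U for the same λ, which is why it appears as the solution above.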
Essence of administrative scalability problems
Conflicting policies concerning usage (and thus payment), management, and
security
Examples
• Computational grids: share expensive resources between different
domains.
• Shared equipment: how to control, manage, and use a shared radio
telescope constructed as a large-scale shared sensor network?
Observation
If we can tolerate inconsistencies, we may reduce the need for global
synchronization, but tolerating inconsistencies is application dependent.
Introduction A simple classification of distributed systems
Parallel computing
Observation
High-performance distributed computing started with parallel computing
Observation
Multiprocessors are relatively easy to program in comparison to
multicomputers, yet have problems when increasing the number of processors
(or cores). Solution: Try to implement a shared-memory model on top of a
multicomputer.
Problem
Performance of distributed shared memory could never compete with that of
multiprocessors, and failed to meet the expectations of programmers. It has
been widely abandoned by now.
Cluster computing
Grid computing
Note
To allow for collaborations, grids generally use virtual organizations. In
essence, this is a grouping of users (or better: their IDs) that allows for
authorization on resource allocation.
The layers
• Fabric: Provides interfaces to local resources (for querying state and
capabilities, locking, etc.)
• Connectivity: Communication/transaction protocols, e.g., for moving data
between resources. Also various authentication protocols.
• Resource: Manages a single resource, such as creating processes or reading
data.
• Collective: Handles access to multiple resources: discovery, scheduling,
replication.
• Application: Contains actual grid applications in a single organization.
Integrating applications
Situation
Organizations were confronted with many networked applications, but achieving
interoperability was painful.
Basic approach
A networked application is one that runs on a server making its services
available to remote clients. Simple integration: clients combine requests for
(different) applications; send that off; collect responses, and present a
coherent result to the user.
Next step
Allow direct application-to-application communication, leading to Enterprise
Application Integration.
Issue: all-or-nothing
• Atomic: happens indivisibly (seemingly)
• Consistent: does not violate system invariants
• Isolated: no mutual interference
• Durable: commit means changes are permanent
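The all-or-nothing property can be demonstrated with a minimal sketch, assuming a single local SQLite database from Python's standard `sqlite3` module. This is not a distributed transaction and involves no TP monitor. The table layout and the functions `make_bank`, `balance`, and `transfer` are hypothetical.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two accounts (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn


def balance(conn, name):
    row = conn.execute("SELECT balance FROM accounts WHERE name = ?", (name,)).fetchone()
    return row[0]


def transfer(conn, src, dst, amount):
    """Move amount from src to dst. Either both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            if balance(conn, src) < 0:  # invariant: no negative balances
                raise ValueError("insufficient funds")
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False
```

A failed transfer leaves both balances untouched, even though the first update had already been executed inside the transaction.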
Observation
Often, the data involved in a transaction is distributed across several servers. A
TP Monitor is responsible for coordinating the execution of a transaction.
Observation
An emerging next generation of distributed systems in which nodes are small,
mobile, and often embedded in a larger system, characterized by the fact that
the system naturally blends into the user’s environment.
Ubiquitous systems
Core elements
1. (Distribution) Devices are networked, distributed, and accessible
transparently
2. (Interaction) Interaction between users and devices is highly unobtrusive
3. (Context awareness) The system is aware of a user’s context to optimize
interaction
4. (Autonomy) Devices operate autonomously without human intervention,
and are thus highly self-managed
5. (Intelligence) The system as a whole can handle a wide range of dynamic
actions and interactions
Mobile computing
Distinctive features
• A myriad of different mobile devices (smartphones, tablets, GPS devices,
remote controls, active badges).
• Mobile implies that a device’s location is expected to change over time ⇒
change of local services, reachability, etc. Keyword: discovery.
• Maintaining stable communication can introduce serious problems.
• For a long time, research has focused on directly sharing resources
between mobile devices. It never became popular and is by now
considered to be a fruitless path for research.
Bottom line
Mobile devices set up connections to stationary servers, essentially putting
mobile devices in the position of clients of cloud-based services.
Sensor networks
Characteristics
The nodes to which sensors are attached are:
• Many (10s-1000s)
• Simple (small memory/compute/communication capacity)
• Often battery-powered (or even battery-less)
Two extremes
Observation
Many distributed systems are needlessly complex because of mistakes that
required patching later on. Many false assumptions are often made.