Distributed File Systems
1
Organization
General concepts
Introduction
NFS
AFS
CODA
2
Introduction
File systems were originally developed for
centralized computer systems and desktop
computers.
A file system is an operating-system facility
that provides a convenient programming
interface to disk storage.
3
Introduction
The main purposes of using files in operating systems
are:
Permanent storage of information
Sharing of information: a file can be created by one
application and then shared with different applications.
4
What Distributed File Systems Provide
6
Why DFSs are Useful
Data sharing among multiple users
User mobility
Location transparency
Backups and centralized management
7
Cont…
Clients and Servers:
Clients access files and directories that are provided
by one or more file servers.
File Servers provide a client with a file service
interface and a view of the file system
Servers allow clients to perform operations from the
file service interface on the files and directories
Operations: add/remove, read/write (a minimal sketch of such an interface follows this slide)
Servers may provide different views to different
clients
8
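The slide above lists the operations a file server exposes to clients. A minimal sketch of such a file-service interface, with made-up names and an in-memory dictionary standing in for the server's storage (not any particular DFS API):

```python
# Hypothetical sketch of a minimal file-service interface a DFS server might
# expose to clients; names and semantics are illustrative, not a real protocol.

class FileService:
    def __init__(self):
        self._files = {}                       # name -> bytes (the server's view)

    def create(self, name):                    # "add"
        self._files.setdefault(name, b"")

    def remove(self, name):
        self._files.pop(name, None)

    def read(self, name, offset, count):
        return self._files[name][offset:offset + count]

    def write(self, name, offset, data):
        old = self._files[name]
        self._files[name] = old[:offset] + data + old[offset + len(data):]

# A client would invoke these operations remotely, e.g. over RPC:
svc = FileService()
svc.create("/dir1/file")
svc.write("/dir1/file", 0, b"hello")
print(svc.read("/dir1/file", 0, 5))            # b'hello'
```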
Desirable features of DFS
A good distributed file system should have the following features.
Transparency:
Location transparency: a client cannot tell where a file is located; the
name of a file does not reveal any hint of the file's physical storage
location.
• Location transparent: the location of the file does not appear in the name of
the file
• ex: /server1/dir1/file specifies the server but not where the server is
located, so the server can move the file within the network without changing
the path
9
Cont…
Flexibility: In a flexible DFS, it must be
possible to add or replace file servers.
Also, a DFS should support multiple
underlying file system types (e.g., various
Unix file systems, various Windows file
systems, etc.)
10
Cont…
Reliability:
Consistency: employing replication and allowing
concurrent access to files may introduce consistency
problems.
11
Cont…
Performance: In order for a DFS to offer good
performance it may be necessary to distribute
requests across multiple servers.
Multiple servers may also be required if the
amount of data stored by a file system is very
large.
12
Cont…
Scalability: A scalable DFS will avoid
centralized components such as a centralized
naming service, a centralized locking facility,
and a centralized file store.
A scalable DFS must be able to handle an
increasing number of files and users.
It must also be able to handle growth over a
geographic area (e.g., clients that are widely
spread over the world), as well as clients from
different administrative domains.
13
Cont…
Other desirable features are:
User mobility
Simplicity and ease of use
High availability
High reliability
Security
Heterogeneity
14
Accessing Remote Files
There are several models for servicing a client's file access request when the
accessed file is remote.
Remote service model: the client's request is performed at the server's node.
15
Units of Data Transfer
In systems that use the data-caching model, an important issue is
deciding the unit of data transfer. Two common choices, sketched after this slide, are:
Block-level transfer model: file data transfers across the
network between a client and a server take place in units of file blocks.
Byte-level transfer model: file data transfers across the
network between a client and a server take place in units of bytes.
16
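A small illustration of the two transfer units, assuming a 4 KiB block size and an in-memory byte string standing in for the server's file:

```python
# Illustrative only: the block size and the in-memory "server file" are
# assumptions of the sketch, not part of any real protocol.

BLOCK_SIZE = 4096
server_file = bytes(range(256)) * 64           # 16 KiB of sample data

def block_level_read(offset, count):
    """Transfer every whole block overlapping [offset, offset + count)."""
    first = (offset // BLOCK_SIZE) * BLOCK_SIZE
    last = -(-(offset + count) // BLOCK_SIZE) * BLOCK_SIZE   # round up
    blocks = server_file[first:last]           # whole blocks cross the network
    return blocks[offset - first:offset - first + count]

def byte_level_read(offset, count):
    """Transfer exactly the requested byte range."""
    return server_file[offset:offset + count]

assert block_level_read(100, 10) == byte_level_read(100, 10)
```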
Semantics of File Sharing
(a) On a single processor, when a read follows a write,
the value returned by the read is the value just written.
(b) In a distributed system with caching, obsolete values may
be returned.
17
Semantics of File Sharing . Unix
Distributed UNIX Semantics
One could use a single centralized server, which would serialize all file
operations, but this provides poor performance.
In the case of a DFS, it is possible to achieve such semantics
if there is only a single file server and no client-side caching
is used.
In practice, such a system is unrealistic, because caches are
needed for performance and write-through caches are
expensive.
Furthermore, deploying only a single file server is bad for
scalability.
Because of this, it is practically impossible to achieve UNIX semantics
with distributed file systems (see the sketch after this slide).
20
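A tiny sketch of the effect described above: with per-client caches and no write-through, a read at one client can return an obsolete value (the dict-based "server" and class names are made up for illustration):

```python
# Sketch of why client-side caching breaks strict UNIX semantics.
# Two clients cache the same file; all names are illustrative.

server = {"f": b"old"}

class CachingClient:
    def __init__(self):
        self.cache = {}

    def read(self, name):
        if name not in self.cache:            # cache miss: fetch from server
            self.cache[name] = server[name]
        return self.cache[name]

    def write(self, name, data):
        self.cache[name] = data               # update only the local cache

a, b = CachingClient(), CachingClient()
b.read("f")                                   # B now caches b"old"
a.write("f", b"new")                          # A's update stays in A's cache
print(b.read("f"))                            # b'old'  -- an obsolete value
```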
Semantics of File Sharing . Session
In the case of session semantics, changes to an open
file are only locally visible.
Only after a file is closed, changes are propagated to
the server (and other clients).
This raises the issue of what happens if two clients
modify the same file simultaneously.
It is generally up to the server to resolve conflicts
and merge the changes.
21
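A minimal sketch of session semantics, assuming a dict stands in for the server and each open creates a private working copy:

```python
# Sketch of session semantics: updates become visible to others only when
# the writer closes the file. The dict-based "server" is an assumption.

server = {"f": b"v1"}

class Session:
    def __init__(self, name):
        self.name = name
        self.local = server[name]          # private working copy on open

    def write(self, data):
        self.local = data                  # only locally visible

    def close(self):
        server[self.name] = self.local     # propagate to server on close

s1 = Session("f")
s1.write(b"v2")
print(server["f"])     # b'v1'  -- other clients still see the old contents
s1.close()
print(server["f"])     # b'v2'  -- changes visible only after close
```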
Semantics of File Sharing . Immutable files
Immutable files cannot be altered after they have
been closed.
In order to change a file, instead of overwriting the
contents of the existing file a new file must be
created.
This file may then replace the old one as a whole.
Problems with this approach include a race condition
when two clients try to replace the same file …, and the
question of what to do with processes that are reading a file
at the same time as it is being replaced by another
process.
22
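The "create a new file, then replace the old one as a whole" pattern can be sketched with an atomic rename; the file name below is illustrative:

```python
# Sketch of replacing an immutable file: write a new version to a fresh file,
# then atomically substitute it for the old one. Readers that already opened
# the old file keep seeing the old (immutable) contents.
import os, tempfile

def replace_file(path, new_contents):
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        f.write(new_contents)
    os.replace(tmp, path)          # atomic replacement of the whole file

replace_file("example.txt", b"version 2")
```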
Semantics of File Sharing . Atomic Transactions
In the transaction model, a sequence of file manipulations can
be executed indivisibly, which implies that two transactions
can never interfere.
Changes are all or nothing
Begin-Transaction
End-Transaction
System responsible for enforcing serialization
Ensuring that concurrent transactions produce results consistent with
some serial execution
Transaction systems commonly track the read/write component
operations
The atomicity provided by the transaction model is a familiar aid to
implementers of distributed systems
Commit and rollback are both very useful in simplifying implementation (a minimal sketch follows this slide)
This is the standard model for databases, but it is expensive to
implement.
23
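A minimal all-or-nothing sketch of the transaction idea over an in-memory store; a real DFS or database would also need concurrency control and durable logging (all names here are made up):

```python
# All-or-nothing transaction sketch over an in-memory "file store".

class TransactionError(Exception):
    pass

class Store:
    def __init__(self):
        self.data = {}

    def run(self, ops):
        """Apply a sequence of (name, new_value) writes atomically."""
        snapshot = dict(self.data)          # begin-transaction
        try:
            for name, value in ops:
                if value is None:
                    raise TransactionError(f"invalid write to {name}")
                self.data[name] = value
            # end-transaction: commit (changes are kept)
        except TransactionError:
            self.data = snapshot            # rollback: as if nothing happened
            raise

store = Store()
store.run([("a", 1), ("b", 2)])             # commits
try:
    store.run([("a", 99), ("c", None)])     # fails -> rolled back
except TransactionError:
    pass
print(store.data)                           # {'a': 1, 'b': 2}
```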
Caching
Caching is often used to improve the performance of
a DFS.
In a DFS caching involves storing either a whole
file, or the results of file service operations.
Caching can be performed at two locations: at the
server and at the client.
Server-side caching makes use of file caching
provided by the host operating system. This is
transparent to the server and helps to improve the
server’s performance by reducing costly disk
accesses.
24
Cont…
Client-side caching comes in two flavours: on-disk
caching, and in-memory caching.
On-disk caching involves the creation of (temporary)
files on the client’s disk. These can either be
complete files (as in the upload/download model) or
they can contain partial file state, attributes, etc.
In-memory caching stores the results of requests in
the client-machine’s memory. This can be process-
local (in the client process), in the kernel, or in a
separate dedicated caching process.
25
Caching…
Cache hit: the requested data is found in the local cache.
For example, a web browser might check its local cache on
disk to see whether it has a local copy of the contents of a web page at a
particular URL.
27
Caching…
Modification propagation:
The aim is to keep file data cached at multiple client
nodes consistent.
There are several approaches, concerning:
When to propagate modifications made to cached data to
the corresponding file server
How to verify the validity of cached data
The modification propagation scheme used has a critical
effect on the system's performance and reliability.
The file-sharing semantics supported depend greatly on the
modification propagation scheme used.
30
Caching…
Write-through scheme:
When a cache entry is modified, the new value is immediately
sent to the server to update the original copy of the file.
Advantages: reliability and suitability for UNIX-like
semantics.
Drawback: every write immediately generates network traffic,
which makes write-through caches expensive.
Delayed-write scheme:
The aim is to reduce network traffic for writes.
Writes are not propagated to the server immediately,
but in the background later on; with write-on-close,
the server receives updates only after the file is closed.
(Both schemes are sketched after this slide.)
31
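A sketch contrasting the two schemes; the network_writes counter stands in for traffic to the server, and all class names are invented for illustration:

```python
# Write-through vs. delayed-write (write-back) client caches.

class Server:
    def __init__(self):
        self.files = {}
        self.network_writes = 0

    def store(self, name, data):
        self.network_writes += 1
        self.files[name] = data

class WriteThroughCache:
    def __init__(self, server):
        self.server, self.cache = server, {}

    def write(self, name, data):
        self.cache[name] = data
        self.server.store(name, data)        # every write goes to the server

class DelayedWriteCache:
    def __init__(self, server):
        self.server, self.cache = server, {}

    def write(self, name, data):
        self.cache[name] = data              # completes quickly, locally

    def flush(self):                         # e.g. periodically or on close
        for name, data in self.cache.items():
            self.server.store(name, data)

wt, dw = WriteThroughCache(Server()), DelayedWriteCache(Server())
for i in range(5):
    wt.write("f", bytes([i]))
    dw.write("f", bytes([i]))
dw.flush()
print(wt.server.network_writes, dw.server.network_writes)   # 5 vs 1
```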
Caching…
The delayed-write scheme helps improve performance
for write accesses for the following
reasons:
Write accesses complete more quickly, because the new
value is written only in the cache of the client performing
the write.
Delaying writes (e.g. until close) has the benefit of
avoiding redundant writes.
Gathering all file updates and sending them together to
the server is more efficient than sending each update
separately.
Reliability problem: modifications not yet sent to
the server from a client's cache will be lost if the
client crashes.
32
Caching…
Cache validation schemes:
File data may simultaneously reside in the caches of multiple nodes.
The modification propagation policy only specifies when the master
copy of a file at the server node is updated upon modification of a
cache entry.
Client-initiated approach:
Check on file open. With this option, a client's cache entry is
validated only when the client opens the corresponding file for use; it
is suitable for supporting session semantics. (A sketch follows this slide.)
Server-initiated approach:
The server maintains a record of which files are cached by which clients.
33
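A sketch of the client-initiated "check on open" option, using a version number (a modification timestamp would work the same way); the dict-based server and field names are assumptions of the sketch:

```python
# Client-initiated validation on open(): the client remembers which version
# it cached and re-fetches the data if the server's version has moved on.

server = {"f": {"data": b"v1", "version": 1}}

class Client:
    def __init__(self):
        self.cache = {}                              # name -> (data, version)

    def open(self, name):
        entry = self.cache.get(name)
        current = server[name]["version"]
        if entry is None or entry[1] != current:     # missing or stale
            self.cache[name] = (server[name]["data"], current)
        return self.cache[name][0]

c = Client()
print(c.open("f"))                           # fetches b'v1'
server["f"] = {"data": b"v2", "version": 2}  # file changes at the server
print(c.open("f"))                           # re-validation fetches b'v2'
```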
File Replication
One way to improve the performance and fault
tolerance of a DFS is to replicate its
content.
A replicating DFS maintains multiple copies
of files on different servers.
This can prevent data loss, protect a system
against down time of a single server, and
distribute the overall workload.
34
File Replication
There are three approaches to replication in a DFS:
35
File Replication
Differences between Replication and Caching
A replica is associated with a server, whereas a cached
copy is normally associated with a client.
The existence of a cached copy is primarily dependent on
the locality in file access patterns, whereas the existence
of a replica normally depends on availability and
performance requirements.
As compared to a cached copy, a replica is widely known,
secure, available, complete and accurate.
A cached copy is dependent upon a replica.
36
File Replication …
The possible benefits offered by the replication
of data are:
Increased availability
Increased reliability
Better scalability
37
Fault tolerance
Fault tolerance is an important issue in the design of a
distributed file system. The characteristics of such systems
make several fault situations possible.
The primary file properties that directly influence the
ability of a distributed file system to tolerate faults are:
38
Distinctions between Stateful vs Stateless
service
The file servers that implement a distributed file service can be stateless or stateful.
Failure Recovery-
Stateful server:
If a stateful server loses all its volatile state in a crash,
• it must restore the state by a recovery protocol based on a dialog with clients, or
abort the operations that were under way when the crash occurred.
• The server also needs to be aware of client failures in order to reclaim space
allocated to record the state of crashed client processes (orphan
detection and elimination).
Stateless server:
With a stateless server, the effects of server failures and recovery are almost
unnoticeable.
The main advantage of stateless servers is that they can easily recover
from failure. Because there is no state that must be restored, a failed server
can simply restart after a crash and immediately provide services to clients
as though nothing happened.
A newly restored server can respond to a self-contained request without
any difficulty. (A sketch of such a self-contained request follows this slide.)
39
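A sketch of why recovery is trivial for a stateless server: every request is self-contained (file name, offset, count), so the client rather than the server keeps track of file positions. All names are made up for illustration:

```python
# Stateless read operation: the server keeps no per-client open-file state,
# so it can simply restart after a crash and keep answering requests.

server_files = {"f": b"abcdefghij"}

def handle_read(request):
    name, offset, count = request["name"], request["offset"], request["count"]
    return server_files[name][offset:offset + count]

# The client, not the server, remembers where it is in the file:
print(handle_read({"name": "f", "offset": 0, "count": 4}))   # b'abcd'
print(handle_read({"name": "f", "offset": 4, "count": 4}))   # b'efgh'
```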
NFS Introduction
Network File System (NFS) is a distributed
file system protocol originally developed
by Sun Microsystems in 1984, allowing a user
on a client computer to access files over
a network in a manner similar to how local
storage is accessed.
The basic idea behind NFS is that each file
server provides a standardized view of its
local file system (each NFS server supports
the same model).
NFS allows an arbitrary number of clients and
servers to share a common file system over
local area or wide area networks.
40
NFS Introduction
NFS versions 1, 2, 3 & 4.
NFS is a collection of protocols that together
provide clients with a model of a distributed
file system.
The NFS protocols have been designed in
such a way that different implementations
should easily interoperate. In this way, NFS
can run on a heterogeneous collection of
computers.
An open standard with clear and simple
interfaces.
41
Cont…
Supports many of the design requirements
already mentioned:
transparency
heterogeneity
efficiency
fault tolerance
42
NFS Architecture
43
Cont…
There are benefits and drawbacks to both models.
The remote access model -
In the first approach (The remote access model) all operations are
performed at the server itself, with clients simply sending commands to
the server.
It makes it possible for the file server to order all operations and therefore
allow concurrent modifications to the files.
A drawback is that the client can only use files if it has contact with the file
server. If the file server goes down, or the network connection is broken, then
the client loses access to the files.
The upload/download model -
In the second model (The upload/download model), files are downloaded
from the server to the client. Modifications are performed directly at the
client after which the file is uploaded back to the server.
It avoids generating network traffic for every individual file operation. Also, a client can potentially
use a file even if it cannot access the file server.
A drawback of performing operations locally and then sending an updated file
back to the server is that concurrent modification of a file by different clients
can cause problems.
44
NFS Architecture…
A client accesses the file system using the system calls provided by its local operating
system.
The local OS file system interface is replaced by an interface to the Virtual File System
(VFS), which is a standard for interfacing to different (distributed) file systems. The whole
idea of the VFS is to hide the differences between various file systems.
Operations on the VFS interface are either passed to a local file system, or passed to a
separate component known as the NFS Client, which takes care of handling access to
files stored at a remote server.
In NFS, all client-server communication is done through RPCs.
On the server side, the NFS server is responsible for handling incoming client requests.
The RPC stub unmarshals requests and the NFS server converts them to regular VFS file
operations that are subsequently passed to the VFS layer.
45
NFS Architecture…
An important advantage of this scheme is that
NFS is largely independent of local file
systems.
It really does not matter whether the operating
systems at the client or server implements a
UNIX file system, a Windows 2000 file
system, or even an old MS-DOS file system.
The only important issue is that these file
systems comply with the file
system model offered by NFS.
46
File System Model
47
Communication
48
NFS. Characteristics
Stateless server, so the user's identity and access rights must be checked
by the server on each request.
In the local file system they are checked only on open()
Every client request is accompanied by the user ID and group ID
Server is exposed to imposter attacks unless the user ID and group ID are
protected by encryption
Kerberos has been integrated with NFS to provide a stronger and more
comprehensive security solution
Mount operation:
• Mount ( remotehost, remotedirectory, localdirectory)
Server maintains a table of clients who have mounted file systems at that
server
Each client maintains a table of mounted file systems holding:
< IP address, port number, file handle>
50
NFS Naming
51
NFS Naming
53
Cont…
As an example of a simple automounter, assume that
the home directories of all users are
available through the local directory /home.
When a client machine boots, the automounter starts by
mounting this directory.
The effect of this local mount is that whenever a
program attempts to access /home, the OS kernel
will forward a lookup operation to the NFS client,
which in this case will forward the request to the
automounter in its role as NFS server, as shown in
the figure below.
54
Cont…
55
Cont…
The automounter first creates a subdirectory /alice in
/home. It then looks up the NFS server that exports
Alice's home directory and subsequently mounts
that directory at /home/alice.
56
NFS. File Attributes
An NFS file has a number of associated attributes. In version
3, the set of attributes was fixed and every implementation
was expected to support those attributes. With version 4, the
set of file attributes has been split into a set of mandatory
attributes that every implementation must support, a set of
recommended attributes that should be preferably supported,
and an additional set of named attributes.
57
Cont…
58
Synchronization: File Locking in NFS
The lock operation is used to request a read or write lock on a consecutive range of bytes in a file.
If the lock cannot be granted due to another, conflicting lock, the client gets back an error
message and has to ask the server again at a later time. Alternatively, the client can request to be
put on a FIFO-ordered list maintained by the server. As soon as the conflicting lock has been removed,
the server grants the next lock to the client at the top of the list.
Removing a lock from a file is done by means of the locku operation.
Using the renew operation, a client requests the server to renew the lease on its lock. (A client-side locking sketch follows this slide.)
59
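On a client, this kind of byte-range locking is normally reached through the ordinary POSIX advisory-lock interface; a minimal sketch using Python's fcntl module (the file name is illustrative, and when the file lives on an NFS mount the kernel forwards the lock and unlock requests to the server's locking protocol):

```python
# Acquire and release an advisory lock on a byte range. On an NFS-mounted
# file, the kernel forwards these requests to the server's lock service.
import fcntl

path = "example.dat"              # illustrative; could be a file on an NFS mount
with open(path, "w+b") as f:
    f.write(b"0" * 100)
    # Request an exclusive lock on bytes 0..99; this blocks on a conflict.
    fcntl.lockf(f, fcntl.LOCK_EX, 100, 0)
    try:
        f.seek(0)
        f.write(b"update")
    finally:
        fcntl.lockf(f, fcntl.LOCK_UN, 100, 0)   # corresponds to the unlock (locku) step
```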
NFS Client Caching
A server may delegate some of its rights to a client when a file is opened.
An important consequence of delegating a file to a client is that the server needs
to be able to recall the delegation, for example,
when another client on a different machine needs to obtain access rights to
the file. Recalling a delegation requires that the server can do a callback to the
client.
60
NFS The security architecture
61
NFS. Access Control
62
NFS. Scalability
The performance of a single server can be increased
by the addition of processors, disks and controllers.
To improve performance further, additional servers
must be installed and the file systems must be
reallocated between them.
63
Andrew File System (AFS)
64
Introduction
The Andrew File System (AFS) is a distributed file system
that was developed at Carnegie Mellon University (CMU).
Main goal:
AFS provides scalability to thousands of workstations at one
site while offering users, applications and administrators the
convenience of a shared file system.
65
AFS Design
Two design characteristics:
• Whole-file serving
• Whole-file caching
66
AFS Design (cont.)
Whole-file serving: the entire contents of directories and files are transmitted to client computers by AFS servers.
67
AFS Design (cont.)
Whole-file caching
once a copy of a file or a chunk has been
transferred to a client computer it is stored
in a cache on the local disk. The cache
contains several hundred of the files most
recently used on that computer. The cache
is permanent, surviving reboots of the client
computer. Local copies of files are used to
satisfy clients’ open requests in preference to
remote copies whenever possible.
68
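A sketch of the whole-file caching idea: on the first open the entire file is copied into an on-disk cache directory, and later opens are satisfied from that local copy (the cache directory name and dict-based server are assumptions of the sketch):

```python
# Whole-file caching: fetch the complete file once, then serve later opens
# from the on-disk local copy (which survives reboots because it is on disk).
import os

server_files = {"doc.txt": b"whole file contents"}
CACHE_DIR = "afs_cache"

def open_cached(name):
    os.makedirs(CACHE_DIR, exist_ok=True)
    local = os.path.join(CACHE_DIR, name)
    if not os.path.exists(local):                 # first open: fetch whole file
        with open(local, "wb") as f:
            f.write(server_files[name])
    return open(local, "rb")                      # later opens: local copy only

with open_cached("doc.txt") as f:
    print(f.read())
```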
Implementation
69
Distribution of processes in the AFS
[Figure: on each workstation, user programs and the Venus process run above the UNIX kernel; on each server, the Vice process runs above the UNIX kernel.]
70
Implementation (cont.)
The files available to user processes
running on workstations are either local or
shared.
• Local files are handled as normal UNIX
files, stored on a workstation’s disk and
are available only to local user
processes.
73
Implementation (cont.)
• User programs use conventional UNIX
pathnames to refer to files, but AFS uses file
identifiers (fids) in the communication between
the Venus and Vice processes.
75
Implementation of file system calls in AFS
Cache Consistency Mechanisms
76
Cache Consistency Mechanisms (cont.)
77
Other aspects of AFS:
Management issues
Economy
Reliability
Security
Availability
Threads
Location database
Read-only replicas
79
Security
AFS uses Kerberos authentication to enforce
security without inhibiting legitimate uses of
the system .
80
Performance
81
Availability
82
Reliability
As servers go down, only non-replicated
portions of the AFS filespace become
inaccessible, leaving the rest of the file
system intact (undamaged). Replicated
data is automatically fetched from one of
the remaining servers. The clients
automatically balance requests between
all the servers containing a particular file.
83
Bulk Transfers
84
UNIX Kernel Modifications
85
Location database
86
Threads
87
Read-only replicas
88
Partial File Caching
89
Wide-area support
90
CODA File System
91
Introduction
Coda was developed at Carnegie Mellon University (CMU)
in the 1990s.
Coda descends from version 2 of the Andrew File System (AFS), which was
also developed at CMU.
Coda is integrated with a number of UNIX-based operating
systems, such as Linux.
The Coda file system allows a client to continue operating despite
being disconnected from a server.
Coda was designed to be a scalable and highly available
distributed file system.
92
Overall Organization Of AFS
[Figure: Virtue workstations give users and processes access to the file system; the centrally administered Vice servers maintain the shared collection of files.]
93
Continue…
Venus
Every Virtue Workstation hosts a user-level process called
Venus.
94
Internal Organization Of a Virtue
Workstation
1. A client process issues an operation on a file.
96
Side Effect in Coda’s RPC2 system
97
Side Effect
A side effect is a mechanism by which the client and server can
communicate using an application-specific protocol.
Example:
a client opening a file at a video server.
RPC2 allows the client and server to set up a separate
connection for transferring the video data to the client on time.
Connection setup is done as a side effect of an RPC call.
The RPC2 runtime system provides an interface of side-effect
routines.
There are routines for setting up a connection and routines
for transferring data.
These routines are automatically called by the RPC2 runtime
system at the client and the server.
98
RPC2 Supports Multicasting
100
Continue…
101
File Identifiers
The collection of files may be replicated and distributed
across multiple Vice servers.
Each file in Coda is contained in exactly one volume.
A volume may be replicated across several servers.
Logical volume: represents a possibly replicated physical
volume and is associated with a Replicated Volume Identifier
(RVID).
Each physical volume has its own Volume Identifier (VID).
Coda assigns each file a 96-bit file identifier (packed as sketched after this slide):
RVID (32 bits) | File handle (64 bits)
102
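A sketch of packing such a 96-bit identifier as a 32-bit RVID followed by a 64-bit file handle; the concrete values are made up:

```python
# Pack and unpack a Coda-style 96-bit file identifier (32-bit RVID + 64-bit
# file handle). Values are illustrative only.
import struct

def make_fid(rvid, file_handle):
    return struct.pack(">IQ", rvid, file_handle)   # 4 + 8 bytes = 96 bits

def split_fid(fid):
    return struct.unpack(">IQ", fid)

fid = make_fid(0x0000002A, 0x0123456789ABCDEF)
print(len(fid) * 8)        # 96
print(split_fid(fid))      # (42, 81985529216486895)
```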
Continue…
1. The client passes the RVID part of the file
identifier.
• Look up the server that is currently hosting the particular replica of the
logical volume.
• Return the current location of that specific physical volume.
103
Sharing Files in Coda
104
Client Caching
Cache consistency in Coda is maintained by callbacks.
The server keeps track of which clients have a copy of a file
cached locally.
The server records a callback promise for each such client.
When a client updates its local copy of the file, it notifies
the server.
The server then sends an invalidation message to the other clients;
this invalidation message is called a callback break.
The server discards the callback promise it held for each
client to which it just sent an invalidation. (A sketch follows this slide.)
105
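A minimal sketch of the callback mechanism described above; class names, fields and the in-memory data are all invented for illustration:

```python
# Callback-based cache consistency: the server records a callback promise per
# cached copy and sends callback breaks to the other clients on an update.

class Client:
    def __init__(self, name):
        self.name, self.cache, self.valid = name, None, False

    def callback_break(self):
        self.valid = False                    # cached copy is now suspect

class Server:
    def __init__(self):
        self.data = b"v1"
        self.promises = set()                 # clients holding a cached copy

    def fetch(self, client):
        self.promises.add(client)             # record a callback promise
        client.cache, client.valid = self.data, True
        return client.cache

    def store(self, writer, data):
        self.data = data
        for c in list(self.promises):
            if c is not writer:
                c.callback_break()            # invalidate other cached copies
                self.promises.discard(c)      # and discard their promises

srv, a, b = Server(), Client("A"), Client("B")
srv.fetch(a)
srv.fetch(b)
srv.store(a, b"v2")
print(a.valid, b.valid)                       # True False
```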
Client Caching
106
SUMMARY
Introduction to distributed file system
NFS
AFS
CODA
107