The Google File System (GFS) was designed to address the needs of Google's massive data storage and high query load. GFS uses a master-chunk server architecture with replication to provide scalability, fault tolerance, and high performance. It operates on a shared storage cluster and allows applications to read, write, and append files in parallel via simple operations. GFS's use of chunking, replication and load balancing enables it to scale efficiently across thousands of commodity servers.

GOOGLE FILE SYSTEM (GFS)

Hans Vatne Hansen


Distributed File Systems

● Andrew File System (AFS)
● Network File System (NFS)
● Coda
● Microsoft Distributed File System (DFS)
● Apple Filing Protocol (AFP)
Distributed File Systems

Andrew File System (AFS)
● Performance
● Scalability
● Availability
● Security

Coda = AFS +
● Disconnected operation
  ● For mobile computing
  ● For partial network failures
● Network bandwidth adaptation
● Client side caching
● Server replication
Motivation for GFS
● Nothing is small in Google land
  ● Peta-bytes of data
  ● Millions of users
  ● Thousands of queries served per second
  ● One query reads hundreds of MB of data
  ● One query consumes billions of CPU cycles
  ● Lots of services and servers
  ● Clusters all over the world
  → Scalability
● Failures are normal
  ● Network connections
  ● Hard disks
  ● Power supplies
  → Fault tolerance
● Monitoring and maintenance is hard
  → Autonomic computing
● A distributed, fault-tolerant file system is needed!
Google Data Centers

● Scaling out on commodity hardware is cheaper than scaling up on high-end servers
● Google servers:
  ● > 15 000 servers (2003)
  ● ~ 200 000 servers (2005)
  ● ~ 1 M servers (2010)
● Data centers are composed of standard shipping containers with 1160 servers in each
Chunks and Chunk Servers

Chunk
● Similar to a block in conventional file systems
● Fixed size of 64 MB
● Less fragmentation
● Eases management
● Chunk data is sent directly from chunk servers to clients
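
The fixed 64 MB chunk size makes it easy for a client to turn a byte offset into a chunk index before it ever contacts the master. A minimal Python sketch, assuming hypothetical helper names (chunk_index, chunk_span) rather than the real GFS client code:

CHUNK_SIZE = 64 * 1024 * 1024  # every chunk is 64 MB

def chunk_index(byte_offset: int) -> int:
    """Index of the chunk that contains the given byte offset."""
    return byte_offset // CHUNK_SIZE

def chunk_span(offset: int, length: int) -> range:
    """All chunk indices touched by an operation of `length` bytes at `offset`."""
    first = chunk_index(offset)
    last = chunk_index(offset + length - 1)
    return range(first, last + 1)

# Example: a 150 MB read starting at byte 0 touches chunks 0, 1 and 2.
assert list(chunk_span(0, 150 * 1024 * 1024)) == [0, 1, 2]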
Master Servers

Master Server
● Coordinates cluster
● Updates operation log
● Stores meta-data
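
A rough Python sketch of the kind of metadata the master keeps in memory: the file namespace, the file-to-chunk mapping, and the replica locations of each chunk, together with the operation log. The structures and names below are illustrative assumptions, not Google's implementation:

from dataclasses import dataclass, field

@dataclass
class ChunkInfo:
    handle: int                                          # globally unique chunk handle
    version: int = 0                                     # used to detect stale replicas
    replicas: list[str] = field(default_factory=list)    # chunk server addresses

@dataclass
class MasterState:
    namespace: dict[str, list[int]] = field(default_factory=dict)  # path -> chunk handles
    chunks: dict[int, ChunkInfo] = field(default_factory=dict)     # handle -> chunk info
    op_log: list[str] = field(default_factory=list)                # replicated operation log

    def create_file(self, path: str) -> None:
        # Namespace mutations are appended to the operation log before they
        # are applied, so a recovering master can rebuild its state by
        # replaying the log.
        self.op_log.append(f"CREATE {path}")
        self.namespace[path] = []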
Master Server – Chunk Server Communication

State updates
● Is a chunk server down?
● Are there disk failures on a chunk server?
● Are any replicas corrupted?
● Which chunk replicas does a chunk server store?

Instructions
● Create new chunk
● Delete existing chunks
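
These exchanges are carried by periodic HeartBeat messages between the master and each chunk server. A hedged sketch of what such a message might contain; the field names are invented for illustration:

from dataclasses import dataclass

@dataclass
class HeartBeat:                      # chunk server -> master
    server_id: str
    held_chunks: dict[int, int]       # chunk handle -> version stored locally
    disk_ok: bool                     # any disk failures since the last report?
    corrupt_chunks: list[int]         # replicas that failed checksum verification

@dataclass
class HeartBeatReply:                 # master -> chunk server
    create_chunks: list[int]          # handles of new chunks to create
    delete_chunks: list[int]          # handles of chunks to delete / garbage-collect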
GFS Architecture
Read operation
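
In outline, the read path works as follows: the client converts the byte offset into a chunk index, asks the master for the chunk handle and replica locations (and caches the answer), and then reads the data directly from a chunk server. A short Python sketch; the lookup/read RPCs and the pick_closest helper are assumptions for illustration:

CHUNK_SIZE = 64 * 1024 * 1024

def pick_closest(replicas):
    # Placeholder policy: the real client prefers a nearby replica
    # (same rack or switch); here we simply take the first one.
    return replicas[0]

def gfs_read(master, path, offset, length):
    index = offset // CHUNK_SIZE
    # 1. Ask the master (or the client's metadata cache) for the chunk
    #    handle and the locations of its replicas.
    handle, replicas = master.lookup(path, index)
    # 2. Read directly from one chunk server; the master is never in
    #    the data path.
    chunkserver = pick_closest(replicas)
    return chunkserver.read(handle, offset % CHUNK_SIZE, length)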
Write operation (1/2)
Write operation (2/2)
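
The write path decouples data flow from control flow: data is pushed to all replicas first, and only then does the primary serialize and apply the mutation. A condensed Python sketch with invented RPC names (lease_info, push_data, apply_write):

def gfs_write(master, path, chunk_index, data):
    # 1. Ask the master which replica holds the lease (the primary)
    #    and where the secondary replicas are.
    handle, primary, secondaries = master.lease_info(path, chunk_index)
    # 2. Push the data to all replicas; they buffer it without applying
    #    it yet (pipelined along a chain of chunk servers in the real
    #    system).
    for replica in (primary, *secondaries):
        replica.push_data(handle, data)
    # 3. Ask the primary to commit: it assigns the mutation a serial
    #    order, applies it locally, forwards the order to the
    #    secondaries, and replies once they have all acknowledged.
    return primary.apply_write(handle, secondaries)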
Record Append
● Record append allows multiple clients to append data to the same file concurrently while guaranteeing atomicity

Algorithm
● The application originates a record append request.
● The GFS client translates the request and sends it to the master.
● The master responds with the chunk handle and the (primary + secondary) replica locations.
● The client pushes the write data to all locations.
● The primary checks whether the record fits in the specified chunk.
● If the record does not fit, the primary:
  ● Pads the chunk.
  ● Tells the secondaries to do the same.
  ● Informs the client.
● The client then retries the append on the next chunk.
● If the record fits, the primary:
  ● Appends the record.
  ● Tells the secondaries to do the same.
  ● Receives the responses from the secondaries.
  ● Sends the final response to the client.
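
The primary's branch of the algorithm above, condensed into a Python sketch; the chunk and secondary methods (pad_to, append, append_at) are invented for illustration:

CHUNK_SIZE = 64 * 1024 * 1024

def primary_record_append(chunk, record, secondaries):
    """Return the offset the record was appended at, or None if the
    client has to retry on the next chunk."""
    if chunk.used + len(record) > CHUNK_SIZE:
        # The record does not fit: pad this chunk to its full size,
        # tell the secondaries to do the same, and make the client retry.
        chunk.pad_to(CHUNK_SIZE)
        for s in secondaries:
            s.pad_to(chunk.handle, CHUNK_SIZE)
        return None
    # The record fits: append it at an offset chosen by the primary and
    # tell every secondary to write it at the same offset.
    offset = chunk.used
    chunk.append(record)
    for s in secondaries:
        s.append_at(chunk.handle, offset, record)
    return offset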
Fault Tolerance

● The master and chunk servers recover extremely fast
● Chunks, the operation log, and the master state are replicated
● Replication is done across multiple machines and data centers in case of severe failures
Conclusions

● GFS has
  ● Performance
  ● Scalability
  ● Fault tolerance

● GFS is
  ● Easy to maintain
  ● The cheapest solution for Google

● Clients and applications can
  ● Read in parallel
  ● Write in parallel
  ● Append in parallel
References

● Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The Google File System. ACM Symposium on Operating Systems Principles, 2003.
● Naushad UzZaman. Survey on Google File System. CSC 456 (Operating Systems), 2007.
● Jonathan Strickland. How the Google File System Works. HowStuffWorks.com, 2010.
● Wikipedia Contributors. Google File System. Wikipedia, The Free Encyclopedia, 2010.
