The Google File System (GFS)
Authors
- Sanjay Ghemawat
- Howard Gobioff
- Shun-Tak Leung
Abstract
- GFS is a scalable distributed file system for large, distributed data-intensive applications.
- Provides fault tolerance on inexpensive hardware.
- Delivers high performance to many clients.
- Designed based on Google's application workloads and technological environment.
- Successfully used within Google for storage needs, generating and processing large datasets.
Design Overview
Assumptions
- Built from inexpensive components prone to failure; requires constant monitoring and recovery.
- Stores a modest number of large files: a few million files, typically 100MB or larger; small files are supported but not optimized for.
- Workloads consist of large streaming reads and small random reads.
- Large, sequential writes that append data to files are common; once written, files are seldom modified.
- Many clients may append to the same file concurrently, so append semantics must be well defined and efficient.
- High sustained bandwidth prioritized over low latency.
Interface
- Provides a familiar file system interface (not POSIX).
- Files are organized in directories, identified by pathnames.
- Supports create, delete, open, close, read, write, snapshot, and record append operations.
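The paper does not publish the client library's API, so the sketch below is only a guess at what an interface covering these operations might look like; every name and signature here is a hypothetical illustration, not actual GFS client code.

```go
// Hypothetical sketch of a GFS-style client interface covering the
// operations listed above. All names and signatures are assumptions.
package gfs

type Handle struct{ path string }

type Client interface {
	Create(path string) (*Handle, error)
	Delete(path string) error
	Open(path string) (*Handle, error)
	Close(h *Handle) error
	// Read fills buf starting at a client-specified byte offset.
	Read(h *Handle, offset int64, buf []byte) (n int, err error)
	// Write writes data at a client-specified byte offset.
	Write(h *Handle, offset int64, data []byte) (n int, err error)
	// Snapshot copies a file or directory tree at low cost.
	Snapshot(src, dst string) error
	// RecordAppend appends data atomically at least once, at an
	// offset GFS chooses, and returns that offset to the caller.
	RecordAppend(h *Handle, data []byte) (offset int64, err error)
}
```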
Architecture
- Composed of a single master and multiple chunkservers.
- Files are divided into fixed-size 64MB chunks; each chunk is replicated on multiple chunkservers (three replicas by default).
- Master maintains all file system metadata: the namespace, access control information, the file-to-chunk mapping, and chunk replica locations.
- Applications link in a GFS client library that implements the file system API and communicates with the master and chunkservers on their behalf.
- Clients translate byte offsets into fixed-size chunk indices (see the sketch below), cache metadata, and exchange file data directly with chunkservers, keeping the master off the data path.
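The offset-to-chunk translation is simple arithmetic: per the paper, a client converts a (file name, byte offset) pair into a chunk index before asking the master for the chunk handle and replica locations. A minimal illustration:

```go
package main

import "fmt"

// ChunkSize is the fixed chunk size from the paper.
const ChunkSize = 64 << 20 // 64MB

// chunkIndex maps a byte offset within a file to the chunk that holds
// it and the offset within that chunk.
func chunkIndex(offset int64) (index, within int64) {
	return offset / ChunkSize, offset % ChunkSize
}

func main() {
	idx, off := chunkIndex(200 << 20) // byte 200MB into a file
	fmt.Printf("chunk %d, offset %d within the chunk\n", idx, off)
	// Prints: chunk 3, offset 8388608 within the chunk
}
```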
Key Features
Single Master
- A single master vastly simplifies the design and lets the master make sophisticated chunk placement and replication decisions using global knowledge.
- Clients directly communicate with chunkservers for reads/writes after initial metadata request.
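A minimal sketch of this two-step read path, assuming hypothetical RPC stubs; `FindChunk`, `ReadChunk`, and `dial` are illustrative names, not the paper's interfaces:

```go
package gfs

// ChunkHandle is the globally unique, immutable chunk identifier the
// master assigns at chunk creation time.
type ChunkHandle uint64

// Master and Chunkserver are hypothetical RPC stubs for illustration.
type Master interface {
	// FindChunk maps (path, chunk index) to the chunk handle and the
	// current replica locations.
	FindChunk(path string, index int64) (ChunkHandle, []string, error)
}

type Chunkserver interface {
	ReadChunk(h ChunkHandle, offset int64, length int) ([]byte, error)
}

// read shows the two-step flow: one metadata request to the master,
// then data traffic directly to a chunkserver replica. Real clients
// cache the handle and locations so further reads skip the master.
func read(m Master, dial func(addr string) Chunkserver,
	path string, offset int64, length int) ([]byte, error) {

	const chunkSize = 64 << 20
	handle, replicas, err := m.FindChunk(path, offset/chunkSize)
	if err != nil {
		return nil, err
	}
	cs := dial(replicas[0]) // typically the closest replica is picked
	return cs.ReadChunk(handle, offset%chunkSize, length)
}
```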
Chunk Size
- The large chunk size (64MB) reduces clients' need to interact with the master and lets a client keep a persistent TCP connection to a chunkserver across many operations.
- Shrinks the metadata the master must store, allowing all metadata to be kept in memory.
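Some back-of-the-envelope arithmetic shows why in-memory metadata is practical: the paper reports that the master holds fewer than 64 bytes of metadata per 64MB chunk, so even a petabyte of file data needs only about a gigabyte of metadata.

```go
package main

import "fmt"

func main() {
	const (
		chunkSize     = 64 << 20       // 64MB per chunk
		bytesPerChunk = 64             // master metadata per chunk (paper's figure)
		fileData      = int64(1) << 50 // 1PB of file data
	)
	chunks := fileData / chunkSize
	fmt.Printf("%d chunks -> about %d MB of metadata\n",
		chunks, chunks*bytesPerChunk>>20)
	// 16777216 chunks -> about 1024 MB of metadata for 1PB of data
}
```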
Metadata Management
- Includes namespace, file-to-chunk mapping, and replica locations.
- Metadata is kept in the master's memory for fast operations (sketched after this list); the namespace and file-to-chunk mapping are persisted via an operation log that is replicated and periodically checkpointed.
- Chunk replica locations are not persisted; the master polls chunkservers for them at startup and keeps them current through heartbeats.
- Deleted files are reclaimed lazily by a regular garbage collection process rather than being erased immediately.
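A sketch of the three metadata tables the paper describes, using plain maps for illustration; the real master uses more compact, prefix-compressed structures.

```go
package gfs

type ChunkHandle uint64

// Master holds the three kinds of metadata described above.
type Master struct {
	// Namespace and file-to-chunk mapping: each mutation is appended
	// to a replicated operation log before it is applied here.
	namespace map[string]bool          // full pathname -> exists
	chunks    map[string][]ChunkHandle // pathname -> chunk handles, in order

	// Replica locations are deliberately NOT persisted: the master
	// rebuilds this table by polling chunkservers at startup and keeps
	// it current through heartbeat messages.
	locations map[ChunkHandle][]string // handle -> chunkserver addresses
}
```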
Consistency Model
- File namespace mutations (e.g., file creation) are atomic, handled exclusively by the master.
- Guarantees are stated in terms of file region states: a region is consistent if all clients see the same data on every replica, and defined if it is consistent and reflects a mutation in its entirety.
- Record append guarantees that data is appended atomically at least once, at an offset of GFS's choosing, even in the presence of concurrent mutations.
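Because record append is at-least-once, the paper notes that readers cope with padding and occasional duplicates using checksums and unique record IDs embedded by writers. The record layout below is an assumption for illustration, with crc32 standing in for whatever checksum writers actually used:

```go
package main

import (
	"fmt"
	"hash/crc32"
)

type record struct {
	id   uint64 // writer-assigned unique ID, used to drop duplicates
	sum  uint32 // checksum over the payload, used to skip padding/garbage
	data []byte
}

// valid reports whether a record's payload matches its checksum.
func valid(r record) bool {
	return crc32.ChecksumIEEE(r.data) == r.sum
}

func main() {
	seen := map[uint64]bool{} // IDs already consumed by this reader
	stream := []record{
		{1, crc32.ChecksumIEEE([]byte("a")), []byte("a")},
		{1, crc32.ChecksumIEEE([]byte("a")), []byte("a")}, // duplicate append
		{2, 0, []byte("x")},                               // padding / corrupt region
	}
	for _, r := range stream {
		if !valid(r) || seen[r.id] {
			continue // skip padding and duplicates
		}
		seen[r.id] = true
		fmt.Printf("consumed record %d: %s\n", r.id, r.data)
	}
}
```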
System Interactions
- The master grants a chunk lease to one replica, the primary, which picks a serial order for all mutations to the chunk; all replicas apply mutations in that order.
- Data flow is decoupled from control flow: data is pushed linearly along a chain of chunkservers, pipelined so each machine's full outbound bandwidth is used (sketched below).
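A sketch of the pipelined push: each chunkserver stores incoming bytes and simultaneously streams them to the next replica in the chain, so the total transfer time stays close to a single hop. The `forward` helper below is an illustration of the idea, not the paper's code.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// forward stores incoming data locally while simultaneously streaming
// it to the next chunkserver in the chain, so forwarding begins before
// the full mutation has arrived (pipelining).
func forward(in io.Reader, local, next io.Writer) error {
	buf := make([]byte, 64<<10) // forward in 64KB pieces
	for {
		n, err := in.Read(buf)
		if n > 0 {
			if _, werr := local.Write(buf[:n]); werr != nil {
				return werr
			}
			if next != nil {
				if _, werr := next.Write(buf[:n]); werr != nil {
					return werr
				}
			}
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	var a, b strings.Builder // stand-ins for two replicas' local storage
	forward(strings.NewReader("mutation data"), &a, &b)
	fmt.Println(a.String(), "|", b.String())
}
```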
Fault Tolerance
- Relies on chunk replication and fast recovery: the master and chunkservers are designed to restart and restore their state in seconds.
- Each chunkserver uses checksumming (a 32-bit checksum per 64KB block) to detect corruption of stored data, since replicas are not guaranteed to be byte-identical.
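A sketch of that block-level integrity check: one 32-bit checksum per 64KB block, verified on reads. The paper does not name the checksum function, so crc32 is assumed here.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

const blockSize = 64 << 10 // 64KB blocks within a 64MB chunk

// checksums computes one 32-bit checksum per 64KB block of a chunk.
func checksums(chunk []byte) []uint32 {
	var sums []uint32
	for off := 0; off < len(chunk); off += blockSize {
		end := off + blockSize
		if end > len(chunk) {
			end = len(chunk)
		}
		sums = append(sums, crc32.ChecksumIEEE(chunk[off:end]))
	}
	return sums
}

func main() {
	chunk := make([]byte, 3*blockSize)
	sums := checksums(chunk)

	chunk[blockSize+5] ^= 0xFF // simulate a corrupted byte in block 1
	for i, s := range checksums(chunk) {
		if s != sums[i] {
			// On a mismatch the chunkserver returns an error and the
			// master re-replicates the chunk from a good replica.
			fmt.Printf("corruption detected in block %d\n", i)
		}
	}
}
```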
Measurements
- The system demonstrates high throughput and efficient metadata management.
- Sustains high read and write rates across large clusters with hundreds of chunkservers.
Experiences and Challenges
- Hardware quirks (e.g., IDE protocol mismatches that silently corrupted data) motivated checksumming, while Linux kernel problems such as fsync() cost and a reader-writer lock conflict were addressed with kernel modifications.
- GFS has evolved from backing a single production system to supporting research and development tasks across Google.
Related Work
- GFS differs from systems such as AFS, xFS, and NASD through design choices driven by Google's needs, e.g., no client-side data caching and commodity machines as chunkservers.
- A centralized master simplifies the design while preserving reliability and scalability, because the master's involvement in common operations is kept minimal.
Conclusions
- GFS effectively supports large-scale data processing on commodity hardware.
- Focuses on fault tolerance, efficient handling of large files, and a simple, centralized master design.
- Continues to be a vital tool for Google's data processing and innovation.
These notes capture the core design, architecture, and operational principles of the Google File System as presented in the paper. They highlight the unique design choices made to accommodate Google's specific needs and the system's ability to handle large-scale, distributed data processing efficiently.