The Hadoop Distributed File System
PaoMin Wu University at Buffalo
ARCHITECTURE
1. NameNode: stores the metadata of the system; keeps the entire namespace in RAM
2. DataNode: stores application data as block replicas
3. HDFS Client: user applications access the file system using the HDFS client
HDFS Client Process
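The division of labor between the three components can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class names, the `get_block_locations` method, and the block ids are hypothetical and are not the real HDFS API. It shows the client fetching metadata from the NameNode and then reading the data itself from DataNodes.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Stores application data as block replicas (block id -> bytes)."""
    name: str
    blocks: dict = field(default_factory=dict)


@dataclass
class NameNode:
    """Keeps the whole namespace in RAM (hypothetical, simplified layout)."""
    files: dict = field(default_factory=dict)      # path -> [block_id, ...]
    locations: dict = field(default_factory=dict)  # block_id -> [DataNode, ...]

    def get_block_locations(self, path):
        # Metadata only: which blocks make up the file and where replicas live.
        return [(b, self.locations[b]) for b in self.files[path]]


class HdfsClient:
    """Hypothetical client: metadata from the NameNode, data from DataNodes."""

    def __init__(self, namenode):
        self.namenode = namenode

    def read(self, path):
        data = b""
        for block_id, replicas in self.namenode.get_block_locations(path):
            # Read each block from the first listed replica that holds it.
            holder = next(dn for dn in replicas if block_id in dn.blocks)
            data += holder.blocks[block_id]
        return data


dn1 = DataNode("dn1", {"blk_1": b"hello "})
dn2 = DataNode("dn2", {"blk_1": b"hello ", "blk_2": b"world"})
nn = NameNode(
    files={"/demo.txt": ["blk_1", "blk_2"]},
    locations={"blk_1": [dn1, dn2], "blk_2": [dn2]},
)
client = HdfsClient(nn)
print(client.read("/demo.txt"))  # b'hello world'
```

Note that no file contents pass through the NameNode in this model; it only answers the question of where the block replicas are.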
ARCHITECTURE
4. Image and Journal: the namespace image is the file system metadata; the persistent record of the image is a checkpoint
5. CheckpointNode (NameNode): protects file system metadata
6. BackupNode (NameNode): capable of creating periodic checkpoints
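The relationship between the image, the journal, and a checkpoint can be sketched as follows. This is a toy model and not the HDFS implementation: the `Namespace` class, its method names, and the two operations (`create`, `delete`) are assumptions made for illustration, and the journal is assumed to be a log of changes made since the last checkpoint. Creating a checkpoint merges the old checkpoint with the journal and empties the journal.

```python
import copy


class Namespace:
    """Toy NameNode state: in-memory image, a checkpoint, and a journal."""

    def __init__(self):
        self.image = {}       # in-memory namespace image: path -> metadata
        self.checkpoint = {}  # persistent record of the image (simulated)
        self.journal = []     # changes logged since the last checkpoint

    @staticmethod
    def _apply_to(image, op, path, meta):
        if op == "create":
            image[path] = meta
        elif op == "delete":
            image.pop(path, None)

    def apply(self, op, path, meta=None):
        # Log the change first, then update the in-memory image.
        self.journal.append((op, path, meta))
        self._apply_to(self.image, op, path, meta)

    def _replay(self):
        # Checkpoint + journal replay reproduces the current image.
        image = copy.deepcopy(self.checkpoint)
        for op, path, meta in self.journal:
            self._apply_to(image, op, path, meta)
        return image

    def create_checkpoint(self):
        """Merge checkpoint and journal into a new checkpoint; empty the journal."""
        self.checkpoint = self._replay()
        self.journal = []

    def recover(self):
        """Rebuild the image after a restart from the checkpoint and journal."""
        return self._replay()


ns = Namespace()
ns.apply("create", "/a", {"replication": 3})
ns.create_checkpoint()
ns.apply("create", "/b", {"replication": 3})
ns.apply("delete", "/a")
recovered = ns.recover()
print(sorted(recovered))  # ['/b']
```

Periodic checkpoints keep the journal short, so a restart has fewer logged changes to replay.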
FILE I/O OPERATIONS AND REPLICA MANAGEMENT
Sort Benchmark
Future Work
Problem: NameNode contains all important information
Solution: Allow multiple namespaces (and NameNodes) to share the physical storage within a cluster
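One way to picture the proposed solution is several NameNodes, each owning its own namespace, writing into one shared pool of storage. The sketch below is purely illustrative: `SharedStorage`, the namespace ids, and the prefix-based `route` function are hypothetical and do not describe how HDFS actually implements this.

```python
class SharedStorage:
    """Physical storage shared by all namespaces: (namespace_id, block_id) -> bytes."""

    def __init__(self):
        self.blocks = {}


class NameNode:
    """Owns one namespace; stores its blocks in the shared storage."""

    def __init__(self, namespace_id, storage):
        self.namespace_id = namespace_id
        self.storage = storage
        self.files = {}       # path -> block id (one block per file, for brevity)
        self.next_block = 0

    def write(self, path, data):
        block_id = f"blk_{self.next_block}"
        self.next_block += 1
        self.files[path] = block_id
        # Keying by namespace id keeps namespaces from colliding on block ids.
        self.storage.blocks[(self.namespace_id, block_id)] = data

    def read(self, path):
        return self.storage.blocks[(self.namespace_id, self.files[path])]


def route(mounts, path):
    """Pick the NameNode whose mount prefix is the longest match for the path."""
    prefix = max((p for p in mounts if path.startswith(p)), key=len)
    return mounts[prefix]


storage = SharedStorage()
mounts = {"/user": NameNode("ns1", storage), "/logs": NameNode("ns2", storage)}
route(mounts, "/user/a.txt").write("/user/a.txt", b"A")
route(mounts, "/logs/x.log").write("/logs/x.log", b"X")
print(len(storage.blocks))  # 2
```

Each NameNode holds only part of the metadata, while the physical storage is still used as a single cluster-wide resource.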
PaoMin Wu University at Buffalo
MapReduce: Simplified Data Processing on Large Clusters
Introduction
•key/value pair
•execution across a set of machines
•handling machine failures
•managing the required inter-machine communication
•runs on a large cluster
•powerful interface
•automatic parallelization
•distribution of large-scale computations
Programming Model
Map, written by the user, takes an input pair and produces a set of intermediate key/value pairs.
The Reduce function, also written by the user, accepts an intermediate key and a set of values for that key.
The intermediate values are supplied to the user's reduce function via an iterator.
Example:
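A word count is a common way to illustrate the model. The sketch below is a single-process stand-in written in Python, not the distributed library: `run_mapreduce` is a hypothetical helper that plays the role of the framework by grouping intermediate pairs by key and handing each group to the reduce function through an iterator.

```python
from collections import defaultdict


def map_fn(key, value):
    """key: document name, value: document contents. Emits (word, 1) pairs."""
    for word in value.split():
        yield word, 1


def reduce_fn(key, values):
    """key: a word, values: an iterator over its counts. Returns the total."""
    return sum(values)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical single-process driver: map, group by key, then reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # Values reach the reduce function through an iterator, as in the model.
    return {k: reduce_fn(k, iter(vs)) for k, vs in sorted(groups.items())}


docs = [("doc1", "the quick fox"), ("doc2", "the lazy dog the end")]
counts = run_mapreduce(docs, map_fn, reduce_fn)
print(counts)
# {'dog': 1, 'end': 1, 'fox': 1, 'lazy': 1, 'quick': 1, 'the': 3}
```

The user writes only `map_fn` and `reduce_fn`; grouping, distribution, and failure handling belong to the framework.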
Execution Overview:
Backup Tasks:
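The idea behind backup tasks is to run a second copy of a slow task and accept whichever copy finishes first. The sketch below simplifies this by launching both copies at once on local threads; the function names and the delays that stand in for slow and fast machines are assumptions made for illustration.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_task(worker, delay, payload):
    """Simulated task: the delay models how slow the worker machine is."""
    time.sleep(delay)
    return worker, sum(payload)


def run_with_backup(payload, primary_delay, backup_delay):
    """Run the same task twice and accept whichever copy finishes first."""
    pool = ThreadPoolExecutor(max_workers=2)
    futures = [
        pool.submit(run_task, "primary", primary_delay, payload),
        pool.submit(run_task, "backup", backup_delay, payload),
    ]
    done, _ = wait(futures, return_when=FIRST_COMPLETED)
    # Do not block on the straggler; its result is simply ignored.
    pool.shutdown(wait=False)
    return next(iter(done)).result()


# The primary is a straggler here, so the backup copy supplies the result.
winner, total = run_with_backup([1, 2, 3], primary_delay=0.5, backup_delay=0.05)
print(winner, total)
```

Both copies compute the same answer, so taking the first result changes only the completion time, not the output.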
Conclusions
1. Restricting the programming model is beneficial
2. Network bandwidth is a scarce resource
3. Redundant execution can help
References:
The Hadoop Distributed File System. Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler. Yahoo!, Sunnyvale, California, USA. {Shv, Hairong, SRadia, Chansler}@Yahoo-Inc.com
MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. Google, Inc.