MapReduce: Simplified Data Processing on
Large Clusters
Authors: Jeffrey Dean and Sanjay Ghemawat
Presenter: Guangdong Liu
Jan 28th, 2011
Presentation Outline
Motivation
Goal
Programming Model
Implementation
Refinement
Motivation
Large-scale data processing
Many data-intensive applications involve processing huge amounts of data and then producing lots of other data
Such applications share certain common themes:
Hundreds or thousands of machines are used
Two categories of basic operation on the input data:
1) Map(): process a key/value pair to generate a set of intermediate key/value pairs
2) Reduce(): merge all intermediate values associated with the same key
Goal
MapReduce: an abstraction that allows users to perform simple computations across large data sets distributed over large clusters of commodity PCs, while hiding the details of parallelization, data distribution, load balancing and fault tolerance
User-defined functions
Automatic parallelization and distribution
Fault tolerance
I/O scheduling
Status monitoring
Programming Model
Inspired by Lisp primitives map and reduce
Map(key, val): written by the user
Processes a key/value pair to generate intermediate key/value pairs
The MapReduce library groups all intermediate values associated with the same key and passes them to the reduce function
Reduce(key, vals): also written by the user
Merges all intermediate values associated with the same key
Programming Model
Example: count words in documents (a sketch follows)
Input consists of (doc_url, doc_contents) pairs
Map(key=doc_url, val=doc_contents): for each word w in contents, emit (w, "1")
Reduce(key=word, vals=counts_list): sum all the "1"s in the value list and emit (word, sum)
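As a rough illustration, here is a minimal Python sketch of the two user-written word-count functions. The paper's own pseudocode is C++-like; the names map_fn/reduce_fn and the word-splitting regex below are assumptions of this sketch, not part of the slide.

import re

def map_fn(key, value):
    # key: doc_url, value: doc_contents
    for word in re.findall(r"\w+", value):  # strip punctuation (sketch's choice)
        yield (word, 1)  # emit an intermediate (word, 1) pair

def reduce_fn(key, values):
    # key: a word, values: all counts emitted for that word
    yield (key, sum(values))  # emit (word, total count)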
Programming Model
Worked dataflow (figure): three documents are read from the DFS, pass through the map phase to an intermediate result, then through the reduce phase back to the DFS.
Input documents:
M1: "Hello World, Bye World!"
M2: "Hello MapReduce, Goodbye to MapReduce."
M3: "Welcome to UNL, Goodbye to UNL."
Intermediate pairs (map phase):
M1: (Hello, 1) (World, 1) (Bye, 1) (World, 1)
M2: (Hello, 1) (MapReduce, 1) (Goodbye, 1) (to, 1) (MapReduce, 1)
M3: (Welcome, 1) (to, 1) (UNL, 1) (Goodbye, 1) (to, 1) (UNL, 1)
Final output (reduce phase):
R1: (Hello, 2) (Bye, 1) (Welcome, 1) (to, 3)
R2: (World, 2) (UNL, 2) (Goodbye, 2) (MapReduce, 2)
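The figure's end-to-end flow can be mimicked by a toy single-process driver, reusing map_fn and reduce_fn from the sketch above. This sketches only the semantics, not the distributed implementation.

from collections import defaultdict

def run_mapreduce(inputs, map_fn, reduce_fn):
    groups = defaultdict(list)           # group intermediate values by key
    for key, value in inputs:
        for k, v in map_fn(key, value):
            groups[k].append(v)
    output = []
    for k in sorted(groups):             # reduce tasks see keys in sorted order
        output.extend(reduce_fn(k, groups[k]))
    return output

docs = [("doc1", "Hello World, Bye World!"),
        ("doc2", "Hello MapReduce, Goodbye to MapReduce."),
        ("doc3", "Welcome to UNL, Goodbye to UNL.")]
print(run_mapreduce(docs, map_fn, reduce_fn))
# [('Bye', 1), ('Goodbye', 2), ('Hello', 2), ..., ('to', 3)] -- the figure's totals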
Implementation
User to-do list (a sketch follows):
Indicate input and output files
M: number of map tasks
R: number of reduce tasks
W: number of machines
Write the map and reduce functions
Submit the job
This requires no knowledge of parallel/distributed systems!!!
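As a hypothetical illustration of that to-do list in Python (the paper's real API is a C++ MapReduceSpecification object; every name and path below is made up for this sketch), a job submission might look like:

spec = {
    "input_files":  ["dfs://input/part-*"],    # hypothetical input location
    "output_files": "dfs://output/wordcount",  # hypothetical output location
    "M": 200_000,   # map tasks (M = 200,000 and R = 5,000 on
    "R": 5_000,     # 2,000 workers are values used in the paper's runs)
    "W": 2_000,     # worker machines
    "map": map_fn,          # user functions from the sketch above
    "reduce": reduce_fn,
}
# submit_job(spec)  -- submission itself is cluster-specific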
Implementation
Execution overview (figure): the user program forks a master and workers. The master assigns map tasks and reduce tasks. Map workers M1 ... Mn read input splits B1 ... Bn from the DFS, apply the map function, and write partitions P1 ... Pr of the intermediate result to their local disks. Reduce workers R1 ... Rr remote-read those partitions, apply the reduce function, and write Output 1 ... Output r back to the DFS.
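For context on what the master tracks while orchestrating this figure, here is a small Python sketch of per-task bookkeeping, loosely following the paper's description (idle / in-progress / completed states plus the assigned worker's identity); the exact field names are this sketch's assumption.

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class State(Enum):
    IDLE = 1
    IN_PROGRESS = 2
    COMPLETED = 3

@dataclass
class Task:
    kind: str                     # "map" or "reduce"
    state: State = State.IDLE
    worker: Optional[str] = None  # identity of the assigned worker
    # For completed map tasks: locations/sizes of the R intermediate
    # regions, which the master forwards to reduce workers.
    regions: list = field(default_factory=list)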
Implementation
1. Input file splitting (M splits; see the sketch below)
Each split is typically 16~64 MB
Many copies of the user program are started on a cluster of machines
2. Master & workers
One special instance becomes the master; workers are assigned tasks by the master
There are M map tasks and R reduce tasks to assign
The master finds idle workers and assigns each one a map or reduce task
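A minimal Python sketch of step 1, assuming the input is addressed as one contiguous byte range (real splitting is file- and record-aware; 64 MB is chosen arbitrarily from the 16~64 MB range above):

def split_input(total_bytes, split_mb=64):
    # Divide the input into M byte ranges of at most split_mb each.
    size = split_mb * 1024 * 1024
    return [(lo, min(lo + size, total_bytes))
            for lo in range(0, total_bytes, size)]

# e.g. a 1 GiB input yields M = 16 splits of 64 MiB
print(len(split_input(1 << 30)))  # -> 16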
Implementation
3. Map tasks
Map workers read the contents of the corresponding input split
They perform the user-defined map computation to create intermediate <key, value> pairs
The intermediate <key, value> pairs produced by the map function are buffered in memory
4. Writing intermediate data to disk (R regions)
Buffered output pairs are periodically written to local disk
They are partitioned into R regions by a partitioning function (see the sketch below)
The locations of these buffered pairs on the local disk are passed back to the master
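The paper's default partitioning function is hash(key) mod R. A Python sketch follows; it uses a stable hash (md5) rather than Python's built-in hash, which is randomized per process, so that every worker agrees on the region.

import hashlib

def partition(key: str, R: int) -> int:
    # hash(key) mod R: all pairs with the same key land in the same
    # of the R regions, so a single reduce task sees the whole key.
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return h % R

print(partition("Hello", 5000))  # a stable region index in [0, 5000)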
Implementation
5. Read & sort
Reduce workers use remote procedure calls to read the buffered data from the local disks of the map workers
They sort the intermediate data by the intermediate keys
6. Reduce tasks
The reduce worker iterates over the ordered intermediate data; for each unique key encountered, the key and the corresponding values are passed to the user's reduce function (see the sketch below)
The output of the user's reduce function is written to an output file on a global file system
7. When all tasks have completed, the master wakes up the user program
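On the reduce side, steps 5-6 amount to a sort followed by a grouped iteration. A minimal Python sketch, reusing reduce_fn from the word-count sketch:

from itertools import groupby
from operator import itemgetter

def reduce_phase(fetched_pairs, reduce_fn, out_path):
    # Sort the fetched intermediate pairs by key, then hand each
    # unique key and its values to the user's reduce function.
    fetched_pairs.sort(key=itemgetter(0))
    with open(out_path, "w") as out:
        for key, group in groupby(fetched_pairs, key=itemgetter(0)):
            for result in reduce_fn(key, (v for _, v in group)):
                out.write(f"{result}\n")  # one output record per line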
Implementation
Fault tolerance: in a word, redo
Workers are periodically pinged by the master
No response = failed worker
Failed tasks are rescheduled (see the sketch below)
Note: map tasks completed by the failed worker must also be re-executed, because their output is stored on that worker's local disk
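A sketch of that re-execution rule, reusing the Task/State bookkeeping sketched earlier; completed reduce tasks survive a worker failure because their output already lives in the global file system.

def handle_worker_failure(tasks, failed_worker):
    for t in tasks:
        if t.worker != failed_worker:
            continue
        # Redo in-progress tasks of either kind, and *completed* map
        # tasks too: their output sat on the failed worker's disk.
        if t.state is State.IN_PROGRESS or t.kind == "map":
            t.state = State.IDLE
            t.worker = None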
Implementation
Locality
Input data is managed by GFS and has several replicas
A task is scheduled on a machine containing a local replica, or near one
Task granularity
There are M map tasks and R reduce tasks
Make M and R much larger than the number of worker machines (the paper often uses M = 200,000 and R = 5,000 on 2,000 workers)
Implementation
Backup tasks
Straggler: a machine that takes an unusually long time to complete one of the last few map or reduce tasks in the computation
Causes: a bad disk, competition for CPU, …
Resolution: schedule backup executions of the remaining in-progress tasks when a MapReduce operation is close to completion (see the sketch below)
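A sketch of that resolution, again over the Task/State bookkeeping from above; the 95% threshold is an illustrative assumption, not a number from the paper.

def backup_candidates(tasks, threshold=0.95):
    # Once most tasks are done, duplicate the stragglers: whichever
    # copy of a task finishes first is the one that counts.
    done = sum(t.state is State.COMPLETED for t in tasks)
    if not tasks or done / len(tasks) < threshold:
        return []
    return [t for t in tasks if t.state is State.IN_PROGRESS]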
Source
The example is quoted from: Wei Wei, Juan Du, Ting Yu, and Xiaohui Gu, "SecureMR: A Service Integrity Assurance Framework for MapReduce," Annual Computer Security Applications Conference (ACSAC '09), pp. 73-82, Dec. 7-11, 2009.
Making Cluster Applications Energy-Aware
Authors: Nedeljko Vasic, Martin Barisits and Vincent Salzgeber
Jan 28th, 2011
Outline
Introduction
Case Study
Approach
Introduction
Power consumption: a critical issue in large-scale clusters
Data centers consume as much energy as a city, costing 7.4 billion dollars per year
Current techniques for efficiency
Consolidate the workload onto fewer machines
Minimize energy consumption while keeping the same overall performance level
Problems
They cannot operate at multiple power levels
They cannot deal with energy consumption limits
Case Study
Google's Server Utilization and Energy Consumption (figure)
Hadoop Distributed File System (HDFS) (figures)
MapReduce (figure)
Conclusion
Aggregating load onto fewer machines to save energy is a wise decision
Distributed applications must actively participate in power management in order to avoid poor performance
Approach
On the Energy (In)efficiency of Hadoop Clusters
Authors: Jacob Leverich and Christos Kozyrakis
Jan 28th, 2011
Introduction
Improving the energy efficiency of a cluster:
Place some nodes into low-power standby modes
Avoid energy waste on oversized components in each node
Problems
Approach
Hadoop data layout overview
Replicas are distributed across different nodes to improve performance and reliability
The user specifies a block replication factor n to ensure that n identical copies of any data block are stored across the cluster (typically n = 3)
The largest number of nodes that can be disabled without impacting data availability is n - 1
Approach
Covering subset
At least one replica of every data block must be stored in a subset of nodes called the covering subset (a placement sketch follows)
This ensures that a large number of nodes can be gracefully removed from the cluster without affecting the availability of data or interrupting its normal operation
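A minimal Python sketch of the covering-subset invariant; the random node-selection policy below is purely illustrative, not the paper's actual placement algorithm.

import random

def place_replicas(covering_subset, all_nodes, n=3):
    # Keep at least one replica inside the covering subset, so that
    # subset alone can serve every block; spread the remaining
    # n - 1 replicas over the other nodes.
    first = random.choice(covering_subset)
    rest = random.sample([m for m in all_nodes if m != first], n - 1)
    return [first] + rest

nodes = [f"node{i}" for i in range(10)]
print(place_replicas(covering_subset=nodes[:3], all_nodes=nodes))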