SSS: An Implementation of Key -value Store based MapReduce...

23
SSS: An Implementation of Key-value Store based MapReduce Framework Hirotaka Ogawa (AIST, Japan) Hidemoto Nakada Ryousei Takano Tomohiro Kudoh

Transcript of SSS: An Implementation of Key -value Store based MapReduce...

Page 1: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

SSS: An Implementation of Key-value Store based MapReduce Framework

Hirotaka Ogawa (AIST, Japan)Hidemoto Nakada

Ryousei TakanoTomohiro Kudoh

Page 2: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

MapReduce• A promising programming tool for implementing large-

scale data-intensive apps• Essentially provides a data-parallel computing model

– Map• Spreads a segment of a single array computation over multiple

processors• Performs each computation on the relevant processor

– Reduce• Aggregates distributed reduction variables• Performs computation over them

• (Theoretically) most SPMD-type apps can be realized by MR model

Page 3: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Extends MapReduce to HPC

• User can develop HPC apps faster and easier than ever!– Provides a higher level programming model than

parallel programming languages, e.g., HPF, OpenMP– Provides simpler communication topologies and

synchronization model than message-passing libraries, e.g., MPI

• But there are limitations in MapReduce– Sacrificing runtime performance– Fixed workflow

Page 4: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Why?

• Semantic gap between MR data model and the input/output data format– MR apps handle KV data– Backend DFS provides an interface to large data files

• No opportunity to reuse the internal KV data– These KV data exist

only in the MR runtime

Considerable overhead for reading/writing large amount of data, in particular, iterative apps

Flexible workflows, e.g., reusing KV data (incl. the int. KV data) across multiple maps and reduces, are

infeasible!

Page 5: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Related Work• Twister

– Map tasks can read input data from:• Distributed in-memory cache• Local disks

• MRAP– Map tasks can read input data from:

• Preprocessed files optimized to efficient access• Original files

• →Partial Solutions– Cannot handle the intermediate KV data– Users need to determine which data is cacheable (immutable)

Page 6: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Our Solution

File System

MapReduce Runtime

Task Manager

M M M R R

Internal Data (Cache) Manager

KV KV KV KV KV

KV KV KV KV

Serialized Files

KV KV KV KV

Storage System

MapReduce Runtime

Task Manager

M M M R R

KV KV KV KV

KV KV KV KV

Storage-side Cache

KV KV KV KV

Page 7: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

SSS: Distributed KVS based MapReduce

• Our first prototype of KVS-based MapReduce• Hadoop-compatible API• Distributed KVS-centric

– Scale-out• Horizontally adding new nodes

– Owner computes• Each map and reduce is distributed to the node where the target KV

data is stored– Shuffle-and-Sort phase is not required– On-memory cache

• Enables reuse of KV data across multiple maps and reduces– Flexible workflows are supported

• Any combination of map and reduce, including iterative apps, are available

Why?

Page 8: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

KVSKVS

KV

KV

KV

KV

KV

KV

KV

KV

KV

Map

Map

Map

iKV

iKV

iKV

iKV

iKV

iKV

iKV

iKV

iKV

KVS

iKViKV

iKV

iKV

iKV

iKV

iKV

iKV

iKV

grouping

grouping

grouping

Reduce

Reduce

Reduce

KV

KV

KV

KV

KV

KV

KV

How MapReduce runs in SSS

Page 9: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

SSS: Distributed KVS based MapReduce

• Our first prototype of KVS-based MapReduce• Hadoop-compatible API• Distributed KVS-based

– Scale-out• Horizontally adding new nodes

– Owner computes• Each map and reduce is distributed to the node where the target KV

data is stored– Shuffle-and-Sort phase is not required– On-memory cache

• Enables reuse of KV data across multiple maps and reduces– Flexible workflows are supported

• Any combination of map and reduce, including iterative apps, are available

How?

Page 10: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Input Key-value Data

Map Map Map Map Map

Output Key-value Data

Intermediate Key-value Data

Red Red Red Red Red

IterativeApplication

Map Map Map Map Map

Output Key-value Data

Intermediate Key-value Data

Red Red Red Red Red

Map Map Map Map Map

Intermediate Key-value Data

Red Red Red Red Red

Output Key-value Data

Reuse the Int.Key-value Data

Page 11: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

SSS: Architectural Overview

Page 12: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

SSS Map Thread Pipeline

• To limit the memory usage and hide the latency of KVS, SSS runtime employs pipelining technique

Page 13: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Packed-SSS Map Thread Pipeline

• To reduce the number of KV data– Stores KV data into thread-local buffer– Converts them to a single large KV data– Stores it to the KVS

Page 14: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Preliminary Evaluation

• Environment– Number of nodes: 16 + 1 (master)– CPUs per node: Intel Xeon W5590 3.33GHz x 2– Memory per node: 48GB– OS: CentOS 5.5 x86_64– Storage: Fusion-io ioDrive Duo 320GB– NIC: Mellanox ConnectX-II 10G

• MapReduce implementations– SSS– Packed-SSS– Hadoop (replica count is 1, to avoid unintended

replications)

Page 15: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Benchmarks• Word count

– 128 text files• Total file size: 12.5GiB• Each file size: almost 100MiB

– No combiners employed– Input: Coarse grain, Output: Fine grain

• Iterative identity map and reduce– Both of map and reduce generate a set of KV data same as

their inputs– Iteratively 8 times applied– 8 million keys

• Total amount of KV data: almost 128MiB– Input/Output: Fine grain

Page 16: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

DistCopy: Distributing 12.5GiB text files for WordCount

020406080

100120140160180200

Hadoop (serial) SSS (serial) SSS (parallel)

DistCopy [sec]

DistCopy [sec]

17% faster

90% faster

Our KVS is not slower than HDFSParallelization is very effective due to ioDrive + 10GbE

Page 17: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

0

20

40

60

80

100

120

140

160

1 2 3 4 5

HadoopSSSpacked-SSS

Trials

Runn

ing

Tim

e [s

ec]

WordCount12% faster

3x faster

Page 18: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

0

50

100

150

200

250

300

350

1 2 3 4 5 6 7 8

HadoopSSSpacked-SSS

Iteration Count

Runn

ing

Tim

e [s

ec]

Iterative Identity Map/Reduce2.9x faster 10x faster

Page 19: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Numbers of KV data/files• WordCount

• Iterative Identity Map/Reduce

# Map Inputs # Intermediate # Reduce Outputs

SSS 128 1.5 billion 2.7 million

Packed-SSS 128 2,048 16

Hadoop 128 files ~256 files 16 files

# Map Inputs # Intermediate # Reduce Outputs

SSS 8 million 8 million 8 million

Packed-SSS 128 128 128

Hadoop 128 files 128 files 128 files

Page 20: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Conclusion

• SSS is our first prototype of KVS-based MapReduce• Distributed KVS centric

– Scale-out– Owner computes– Shuffle-and-Sort phase is not required– On-memory cache– Flexible workflows are supported

• Hadoop-compatible API• Runtime performance is better than Hadoop, but not

enough (we think)

Page 21: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Future Work

• Performance• Fault-tolerance• More comprehensive benchmarks

– To identify the characteristics and feasibility to various class of HPC and data-intensive apps

• Higher-level programming tool– Pig, Szl, DryadLINQ, HAMA, R, etc.– We have already implemented our own Sawzall-

clone running on top of Hadoop and SSS

Page 22: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

Thank you!

Page 23: SSS: An Implementation of Key -value Store based MapReduce ...salsahpc.indiana.edu/CloudCom2010/slides/PDF/SSS An... · SSS: Distributed KVS based MapReduce • Our first prototype

1,1 1,2 1,3 1,4

2,1 2,2 2,3 2,4

3,1 3,2 3,3 3,4

4,1 4,2 4,3 4,4

Matrix A (blocked)

1,1 1,2 1,3 1,4

2,1 2,2 2,3 2,4

3,1 3,2 3,3 3,4

4,1 4,2 4,3 4,4

Matrix B (blocked)

Blockmultiply

<C11, A11*B11> <C11, A12*B21>

<C11, A13*B31> <C11, A14*B41>

<C12, A11*B12> <C12, A12*B22>

<C12, A13*B32> <C12, A14*B42>

<C13, A11*B13>

<C11, A11*B11 + A12*B21 + A13*B31 + A14*B41>

<C12, A11*B12 + A12*B22 + A13*B32 + A14*B42>

<C13, A11*B13 + A12*B23 + A13*B33 + A14*B43>

Blockadd

Matrix Multiply by MR

map red