Large-scale computation without sacrificing expressiveness
![Page 1: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/1.jpg)
Large-scale computation without sacrificing expressiveness
Sangjin Han, Sylvia Ratnasamy (UC Berkeley)
![Page 2: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/2.jpg)
Review: MapReduce and Friends
Input → Computation → Output
Computation: map, filter, group by, reduce, join, …
![Page 3: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/3.jpg)
Review: MapReduce and Friends
Observation 1: Bulk transformation of immutable data (no fine-grained updates)
![Page 4: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/4.jpg)
Example 1: Sparse Operations
• k-hop reachability with iterative MapReduce
![Page 7: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/7.jpg)
Example 1: Sparse Operations
• k-hop reachability with iterative MapReduce
Source node + Graph → MR → 1-hop nodes
1-hop nodes + Graph → MR → 2-hop nodes
2-hop nodes + Graph → MR → …
![Page 9: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/9.jpg)
Example 1: Sparse Operations
• k-hop reachability with iterative MapReduce
Internet router topology graph (1.7M nodes, 22.2M edges)
[Figure: number of processed edges (millions, axis 0 to 20) per iteration (1 to 22); series: Iterative MapReduce, Optimal]
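The difference between the two series can be illustrated with a small sketch. It is not the authors' code: the toy graph and the names `khop_full_scan` and `khop_frontier` are hypothetical. The first function mimics an iterative MapReduce job that takes the whole edge list as input in every round; the second touches only the edges leaving newly reached nodes, which is one assumed reading of the "Optimal" series.

```python
from collections import defaultdict

def khop_full_scan(edges, source, k):
    """Iterative-MapReduce style: every round rescans the whole edge list."""
    reached = {source}
    processed = []                      # edges touched per iteration
    for _ in range(k):
        new_nodes = {dst for src, dst in edges if src in reached}
        processed.append(len(edges))    # the full graph is input to each round
        reached |= new_nodes
    return reached, processed

def khop_frontier(edges, source, k):
    """Frontier style: only edges leaving newly reached nodes are touched."""
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
    reached = {source}
    frontier = {source}
    processed = []
    for _ in range(k):
        touched = 0
        next_frontier = set()
        for node in frontier:
            for dst in adjacency[node]:
                touched += 1
                if dst not in reached:
                    reached.add(dst)
                    next_frontier.add(dst)
        processed.append(touched)
        frontier = next_frontier
    return reached, processed

# Toy directed graph, made up for this sketch.
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5)]

full_reached, full_cost = khop_full_scan(EDGES, source=0, k=3)
frontier_reached, frontier_cost = khop_frontier(EDGES, source=0, k=3)
```

Both functions reach the same node set, but the full-scan cost stays at the graph size in every round while the frontier cost follows the number of newly reached nodes.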
![Page 12: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/12.jpg)
Review: MapReduce and Friends (cont’d)
Observation 2: Static dataflow (no data-dependent control flow)
[Diagram: dataflow graph with Filter, Map, Filter, Union, and Join operators and a “Converged?” check]
![Page 16: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/16.jpg)
Example 2: Irregular parallelism
• Parallel SAT solver
E = (p ∨ !q) ∧ (!p ∨ r ∨ s) ∧ (q ∨ !s ∨ !t) ∧ (!p ∨ s) ∧ …
Search tree (one level per variable, branching on T / F):
p = T F
q = T F T F
r = T F
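A minimal sketch of why this workload is irregular, under stated assumptions: it uses only the four clauses visible on the slide (the elided part of E is not filled in), a fixed variable order, and a plain work queue standing in for parallel workers. The names `falsified` and `solve` are hypothetical and this is not the solver from the talk.

```python
from collections import deque

# Only the four clauses visible on the slide; the elided "…" part is omitted.
# A literal is (variable, polarity); ("q", False) stands for !q.
CLAUSES = [
    [("p", True), ("q", False)],
    [("p", False), ("r", True), ("s", True)],
    [("q", True), ("s", False), ("t", False)],
    [("p", False), ("s", True)],
]
VARIABLES = ["p", "q", "r", "s", "t"]

def falsified(clause, assignment):
    """A clause is falsified when all of its literals are assigned and false."""
    return all(var in assignment and assignment[var] != polarity
               for var, polarity in clause)

def solve(clauses, variables):
    """Explore the assignment tree; each queue entry is an independent task."""
    tasks = deque([{}])          # start from the empty partial assignment
    tasks_run = 0
    while tasks:
        assignment = tasks.popleft()
        tasks_run += 1
        if any(falsified(c, assignment) for c in clauses):
            continue             # dead branch: spawns no further work
        if len(assignment) == len(variables):
            return assignment, tasks_run
        var = variables[len(assignment)]
        # Data-dependent fan-out: children exist only for live branches.
        tasks.append({**assignment, var: True})
        tasks.append({**assignment, var: False})
    return None, tasks_run

model, tasks_run = solve(CLAUSES, VARIABLES)
```

How many tasks exist, and where in the tree they appear, depends on which branches die early, so the amount of work cannot be laid out as a fixed dataflow graph in advance.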
![Page 17: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/17.jpg)
MapReduce-like frameworks assume:
1. Bulk transformation of immutable data
2. Static dataflow
![Page 18: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/18.jpg)
Existing frameworks assume → Our work:
1. Bulk transformation of immutable data → Fine-grained operations on mutable data
2. Static dataflow → Dynamic, data-dependent control flow
Yet we still want elastic scalability and fault tolerance
![Page 19: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/19.jpg)
CELIAS PROGRAMMING MODEL
Putting a small twist on Linda
![Page 20: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/20.jpg)
Programming model = data model + computation model
![Page 24: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/24.jpg)
Data Models for Mutable Shared Memory
• Global address space: UPC, X10, Fortress…
  – Too low level
• Key-value tables: RAMCloud, Dynamo, Piccolo…
  – Limited lookup ability
  – Consistency concerns
• Tuplespace: Linda
  – Flexible lookup with any attributes
  – Individual tuples are immutable
  – Example tuples: (‘employee’, ‘John’, 29), (‘todo’, ‘shopping’), (‘todo’, ‘walk’)
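The tuplespace idea can be sketched in a few lines. This is a toy, single-process illustration, not the Linda or Celias implementation: the class `TupleSpace` is hypothetical, `None` is used as the wildcard, and, unlike Linda's blocking `in` and `rd`, these versions return `None` when nothing matches.

```python
class TupleSpace:
    """A toy, single-process tuplespace with Linda-style operations."""

    def __init__(self):
        self._tuples = []

    def out(self, tup):
        """Add a tuple to the space."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        # None acts as a wildcard; any other field must match exactly.
        return len(pattern) == len(tup) and all(
            want is None or want == have for want, have in zip(pattern, tup))

    def rd(self, pattern):
        """Return a matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def in_(self, pattern):
        """Remove and return a matching tuple, or None ("in" is a keyword)."""
        tup = self.rd(pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup

space = TupleSpace()
space.out(("employee", "John", 29))
space.out(("todo", "shopping"))
space.out(("todo", "walk"))

by_name = space.rd(("employee", "John", None))   # lookup by name
by_age = space.rd(("employee", None, 29))        # lookup by age
first_todo = space.in_(("todo", None))           # take one todo item
```

Because individual tuples are immutable, an update in this style is a removal with `in_` followed by an `out` of the new tuple, while lookup can use any field rather than a single designated key.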
![Page 25: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/25.jpg)
Programming model = data model + computation model
Linda = Tuplespace + Linda processes
![Page 27: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/27.jpg)
Linda Processes
Process A: in(…) … out(…) …
Process B: … out(…) … out(…) …
Process C: … in(…) … in(…) …
☹ No automatic scaling
☹ No fault tolerance
![Page 28: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/28.jpg)
Programming model = data model + computation model
Linda = Tuplespace + Linda processes
Celias = Tuplespace + microtasks
![Page 29: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/29.jpg)
Microtasks
Tuplespace:
(‘hello’, 5)
(‘hello’, 7)
(‘world’, 2)
…
Function wordcount()
Signature: (?word, ?cnt1), (?word, ?cnt2)
Code:
  sum := cnt1 + cnt2
  emit (word, sum)
![Page 30: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/30.jpg)
Microtasks
When a signature matches:
1. microtask launch, with bindings word = ‘hello’, cnt1 = 5, cnt2 = 7
![Page 31: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/31.jpg)
Microtasks
When a signature matches:
1. microtask launch
2. code execution (5 + 7 = ??)
![Page 32: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/32.jpg)
Microtasks
When a signature matches:
1. microtask launch
2. code execution
3. atomic replacement: emit (word, sum) puts (‘hello’, 12) in place of the two matched tuples
Tuplespace afterwards:
(‘hello’, 12)
(‘world’, 2)
…
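The three steps can be sketched as follows. This is an illustration under assumptions, not the Celias runtime: the tuplespace is a plain Python list, matching is a brute-force pair search, and the replacement is trivially atomic only because the sketch is single-threaded. The helper names `find_match` and `run_to_quiescence` are hypothetical.

```python
from itertools import combinations

def wordcount(word, cnt1, cnt2):
    """Microtask body: merge two partial counts for the same word."""
    return (word, cnt1 + cnt2)

def find_match(space):
    """Find two distinct tuples fitting (?word, ?cnt1), (?word, ?cnt2)."""
    for i, j in combinations(range(len(space)), 2):
        if space[i][0] == space[j][0]:      # ?word must bind consistently
            return i, j
    return None

def run_to_quiescence(space):
    """Fire microtasks until no signature match remains."""
    space = list(space)
    while (match := find_match(space)) is not None:
        i, j = match
        (word, cnt1), (_, cnt2) = space[i], space[j]
        result = wordcount(word, cnt1, cnt2)           # 1. launch, 2. execute
        # 3. replacement: consumed inputs leave, the emitted tuple enters.
        space = [t for k, t in enumerate(space) if k not in (i, j)]
        space.append(result)
    return space

final = run_to_quiescence([("hello", 5), ("hello", 7), ("world", 2)])
```

Starting from the three tuples on the slide, the only match is the pair of ‘hello’ tuples, so the space settles with one tuple per word.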
![Page 33: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/33.jpg)
(A + B) × (C + D)
Two functions: add() and multiply()
![Page 38: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/38.jpg)
E × F (with E = A + B and F = C + D)
Two functions: add() and multiply()
☺ Automatic scaling
☺ Fault tolerance
![Page 39: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/39.jpg)
More Examples in the Paper…
• MapReduce
  – Celias is Turing-complete MapReduce-complete!
  – without any artificial sync. barriers
• Single-source shortest path
  – Pregel-style graph processing
• Quicksort
  – Recursive control flow
![Page 40: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/40.jpg)
Summary
• MapReduce-like frameworks are not suitable for algorithms with:
  – Sparse/incremental/fine-grained computation
  – Dynamic dataflow
• Celias comes to our rescue, yet it is also
  – automatically scalable
  – fault tolerant
![Page 41: Large-scale computation without sacrificing expressiveness](https://reader034.fdocuments.net/reader034/viewer/2022052601/558b22a1d8b42a98478b4592/html5/thumbnails/41.jpg)
Open Questions
• Microtask abstraction: good enough? went too far?
• Feasibility of an efficient implementation
  – Reliable tuplespace
  – Signature matching
  – Microtask transactions
• … what is a killer app of Celias?
• <Your questions here>