Posted on 17-Aug-2020
Indian Institute of Science Bangalore, India
One Trillion Edges:
Graph Processing at Facebook Scale
Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, Sambhavi Muthukrishnan
Facebook
Presented by: Swapnil Gandhi
21st November 2018
Königsberg* Bridge Problem
Its negative resolution by Leonhard Euler in 1736 laid the foundations of graph theory.
* Located in the Kingdom of Prussia (now Kaliningrad, Russia)
Graphs are Common
Web & Social Networks ‣ Web graph, citation networks, Twitter, Facebook, Internet
Knowledge networks & relationships ‣ Google’s Knowledge Graph, CMU’s NELL
Cybersecurity ‣ Telecom call logs, financial transactions, malware
Internet of Things ‣ Transport, power, water networks
Bioinformatics ‣ Gene sequencing, gene expression networks
Graph Algorithms
Traversals: Paths & flows between different parts of the graph ‣ Breadth-First Search, Shortest Path, Minimum Spanning Tree, Eulerian paths, Max-Cut
Clustering: Closeness between sets of vertices ‣ Community detection & evolution, Connected Components, K-means clustering, Max Independent Set
Centrality: Relative importance of vertices ‣ PageRank, Betweenness Centrality
Graphs are Central to Analytics
[Pipeline diagram: raw Wikipedia XML is parsed into tables; the hyperlink graph feeds PageRank to produce the top 20 pages; article text feeds a term-document graph and a topic model (LDA) to produce word topics; the editor graph feeds community detection to map users to communities, which join with discussion tables and topics to yield topic communities.]
But Graphs can be Challenging
Shared-memory algorithms don’t scale!
Do not fit naturally into Hadoop/MapReduce ‣ Multiple MR jobs (iterative MR) ‣ Topology & data written to HDFS on every iteration ‣ Tuple-centric, rather than graph-centric, abstraction
Lots of work on parallel graph libraries for HPC ‣ Boost Graph Library, Graph500 ‣ Storage & compute are (loosely) coupled, not fault tolerant ‣ But not everyone has a supercomputer! • If you do own one, stick with HPC algorithms
PageRank using MR
MapReduce: https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
PageRank using MR
MR will run for multiple iterations (typically ~30)
Mapper will ‣ Initially, load the adjacency list and initialize a default PR ‣ In subsequent iterations, load the adjacency list and the new PR ‣ Emit two types of messages from Map: the adjacency list itself and per-neighbour PR contributions
Reducer will ‣ Reconstruct the adjacency list for each vertex ‣ Update the vertex’s PageRank based on its neighbours’ PR messages ‣ Write the adjacency list and new PR values to HDFS, to be used by the next Map iteration
SQL vs. MapReduce: http://www.science.smith.edu/dftwiki/images/6/6a/ComparisonOfApproachesToLargeScaleDataAnalysis.pdf
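The map and reduce phases above can be sketched in miniature. The single-process C++ version below is illustrative only: all names are hypothetical, and a real MR job would additionally serialize the adjacency list and rank values to HDFS between iterations rather than hold them in memory.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <vector>

// vertex id -> out-neighbours (the adjacency list the mapper loads)
using Graph = std::map<int, std::vector<int>>;

// One PageRank iteration expressed as a map phase and a reduce phase.
std::map<int, double> PageRankIteration(const Graph& g,
                                        const std::map<int, double>& rank,
                                        double d = 0.85) {
  // Map phase: each vertex emits rank/out_degree to every out-neighbour
  // (in real MR it also re-emits its adjacency list so topology survives).
  std::map<int, double> contributions;
  for (const auto& [v, nbrs] : g) {
    if (nbrs.empty()) continue;
    double share = rank.at(v) / nbrs.size();
    for (int u : nbrs) contributions[u] += share;
  }
  // Reduce phase: sum the contributions per vertex and apply damping.
  std::map<int, double> next;
  for (const auto& kv : g)
    next[kv.first] = (1.0 - d) / g.size() + d * contributions[kv.first];
  return next;
}
```

On a 3-vertex cycle starting from the uniform distribution, one iteration leaves every rank at 1/3, which is a quick sanity check of the phase logic.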
Two Birds
Half-caf Double Espresso ‣ Less data movement over the network ‣ Fault tolerance
Credits: Google Images
One Stone
Pregel
It’s wordplay on the English idiom “kill two birds with one stone”.
Pregel
To overcome these challenges, Google came up with Pregel.
Valiant’s BSP
[Diagram: processors P1–P4 executing Superstep 1 and Superstep 2, each superstep consisting of a Computation phase, a Communication phase, and Barrier Synchronization.]
Barrier synchronization is “often expensive and should be used as sparingly as possible”.
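The superstep structure can be made concrete with a tiny sequential simulation of the BSP model. This is an illustrative sketch (the struct and method names are invented, not Pregel’s API): local computation runs on every processor, outgoing messages are buffered, and they become visible only after the barrier, at the start of the next superstep.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct BSP {
  int procs;
  std::vector<std::vector<int>> inbox;   // messages visible this superstep
  std::vector<std::vector<int>> outbox;  // buffered until the barrier

  explicit BSP(int p) : procs(p), inbox(p), outbox(p) {}

  void Send(int to, int msg) { outbox[to].push_back(msg); }

  // One superstep: compute on every processor, then barrier-exchange.
  void Superstep(const std::function<void(BSP&, int)>& compute) {
    for (int p = 0; p < procs; ++p) compute(*this, p);  // computation phase
    inbox = outbox;                                     // communication phase
    for (auto& o : outbox) o.clear();                   // barrier complete
  }
};
```

The key property the sketch demonstrates: a message sent during superstep *k* is never readable in superstep *k*, only in superstep *k+1*.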
Vertex State Machine
In superstep 0, every vertex is in the active state.
A vertex deactivates itself by voting to halt.
It can be reactivated by receiving an (external) message.
The algorithm terminates once every vertex has voted to halt and no messages are in flight.
Vertex-Centric Programming
Vertex-centric programming model ‣ Logic written from the perspective of a single vertex ‣ Executed on all vertices
Vertices know ‣ Their own value(s) ‣ Their outgoing edges
Finding the Largest Value in a Graph using Pregel
[Diagram: four vertices with initial values 3, 6, 2, 1. Superstep 0: 3 6 2 1; Superstep 1: 6 6 2 6; Superstep 2: 6 6 6 6; Superstep 3: 6 6 6 6. Legend: active vertex, voted to halt, message. By superstep 3 every vertex holds the maximum value, 6, and has voted to halt.]
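The max-value example can be reproduced with a toy superstep loop: every active vertex sends its value to its out-neighbours, adopts any larger value it receives, and votes to halt when its value stops changing; a halted vertex is reactivated by an incoming message. The engine below is an illustrative sketch with invented names, not Giraph or Pregel code, and the graph shape (a 4-vertex ring) is an assumption since the slide’s exact edges are not recoverable.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct MaxValueGraph {
  std::vector<int> value;
  std::vector<bool> active;
  std::vector<std::vector<int>> out;  // out-edges per vertex

  // Runs supersteps until every vertex has halted and no messages remain.
  int Run() {
    int supersteps = 0;
    std::vector<std::vector<int>> msgs(value.size());
    bool any_active = true;
    while (any_active) {
      std::vector<std::vector<int>> next(value.size());
      for (size_t v = 0; v < value.size(); ++v) {
        // A halted vertex wakes up only when a message arrives.
        if (!active[v] && msgs[v].empty()) continue;
        int before = value[v];
        for (int m : msgs[v]) value[v] = std::max(value[v], m);
        if (supersteps == 0 || value[v] != before) {
          for (int u : out[v]) next[u].push_back(value[v]);  // notify
          active[v] = true;
        } else {
          active[v] = false;  // vote to halt
        }
      }
      msgs = std::move(next);
      ++supersteps;
      any_active = false;
      for (size_t v = 0; v < value.size(); ++v)
        if (active[v] || !msgs[v].empty()) any_active = true;
    }
    return supersteps;
  }
};
```

With values 3, 6, 2, 1 on a directed ring, every vertex converges to 6, exercising all three states: active, voted-to-halt, and reactivated-by-message.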
Advantages
Makes distributed programming easy ‣ No locks, semaphores, or race conditions ‣ Separates the computation phase from the communication phase
Vertex-level parallelization ‣ Bulk message passing for efficiency
Stateful (in-memory) ‣ Only messages & checkpoints hit disk
Lifecycle of a Pregel Program
Apache Giraph, Claudio Martella, Hadoop Summit, Amsterdam, April 2014
Applications
SSSP

class ShortestPathVertex : public Vertex<int, int, int> {
  void Compute(MessageIterator* msgs) {
    int mindist = IsSource(vertex_id()) ? 0 : INF;
    for (; !msgs->Done(); msgs->Next())
      mindist = min(mindist, msgs->Value());
    if (mindist < GetValue()) {
      *MutableValue() = mindist;
      OutEdgeIterator iter = GetOutEdgeIterator();
      for (; !iter.Done(); iter.Next())
        SendMessageTo(iter.Target(), mindist + iter.GetValue());
    }
    VoteToHalt();
  }
};

In the 0th superstep, only the source vertex will update its value.
SSSP Walkthrough (1–6)
[Diagrams: an input graph of 8 vertices A–H with weighted edges, partitioned across Workers 1–4.]
Superstep 0: the source vertex takes distance 0; all other vertices remain at ∞.
Superstep 1: the source’s neighbours update to distances 1 and 2; the rest remain at ∞.
Superstep 2: distances propagate further (0, 1, 2, 3, 4, 6, 3, ∞).
Superstep 3: the remaining tentative distances are corrected (0, 1, 2, 3, 4, 4, 3, 6).
No vertex changes thereafter: the algorithm has converged.
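The walkthrough can be reproduced by driving the Compute() logic from the SSSP slide with a plain superstep loop instead of a Pregel runtime. This is a sketch: the function name is invented, and the test graph below is a small hypothetical example, not the exact 8-vertex graph from the slides.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

constexpr int INF = std::numeric_limits<int>::max();

// edges[v] = list of (target, weight); returns shortest distance per vertex.
std::vector<int> PregelSSSP(
    const std::vector<std::vector<std::pair<int, int>>>& edges, int source) {
  int n = edges.size();
  std::vector<int> dist(n, INF);
  std::vector<std::vector<int>> msgs(n);
  bool messages_in_flight = true;
  while (messages_in_flight) {
    std::vector<std::vector<int>> next(n);
    for (int v = 0; v < n; ++v) {
      // mindist = IsSource(v) ? 0 : INF, then min over incoming messages.
      int mindist = (v == source) ? 0 : INF;
      for (int m : msgs[v]) mindist = std::min(mindist, m);
      if (mindist < dist[v]) {  // value improved: update and notify neighbours
        dist[v] = mindist;
        for (auto [u, w] : edges[v]) next[u].push_back(mindist + w);
      }
      // VoteToHalt() is implicit: a vertex only acts again on a message.
    }
    msgs = std::move(next);
    messages_in_flight = false;
    for (auto& m : msgs)
      if (!m.empty()) messages_in_flight = true;
  }
  return dist;
}
```

As on the slides, only the source updates in superstep 0, tentative distances are later corrected when shorter paths arrive, and the loop stops once no messages are in flight.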
Apache Giraph
Platform Improvements (1/2)
Efficient Memory Management ‣ Vertex and edge data stored as serialized byte arrays ‣ Better memory management → less GC pressure
Support for Multi-Threading ‣ Maximizes resource utilization ‣ Linear speed-up for CPU-bound applications like K-means clustering
Platform Improvements (2/2)
Flexible IO Format ‣ Reduces pre-processing ‣ Allows vertex and edge data to be loaded from different sources
Sharded Aggregators ‣ Aggregator responsibilities are balanced across workers ‣ Different aggregators can be assigned to different workers
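The sharded-aggregator idea can be sketched as follows: each aggregator is owned by one worker (chosen here by hashing its name), every worker sends its partial value to that owner, and the owner combines them and broadcasts the result, so the master is no longer the bottleneck. Names and structure are hypothetical, not Giraph’s actual classes.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Deterministically picks the worker responsible for an aggregator.
int OwnerOf(const std::string& aggregator, int num_workers) {
  return static_cast<int>(std::hash<std::string>{}(aggregator) % num_workers);
}

// The owning worker combines the per-worker partial values (a sum here;
// any commutative, associative combine works) and broadcasts the result.
long Aggregate(const std::string& name,
               const std::vector<std::map<std::string, long>>& partials) {
  long total = 0;
  for (const auto& worker_partials : partials) {
    auto it = worker_partials.find(name);
    if (it != worker_partials.end()) total += it->second;
  }
  return total;
}
```

With many aggregators, the hash spreads ownership across workers, which is the load-balancing property the slide describes.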
Refer to classroom discussion.
Beyond Pregel
Master Compute ‣ Allows centralized execution of computation ‣ Refer to classroom discussion
Worker Phases ‣ Special methods which bypass the Pregel model but add ease of use ‣ Applicability is application-specific
Computation Composability ‣ Decouples the Vertex from the Computation ‣ Existing Computation implementations can be reused across multiple applications
Superstep Splitting
The master runs the same “message-heavy” superstep for a fixed number of iterations.
In each iteration: ‣ Vertex computation is invoked only if the vertex passes hash function H ‣ A message is sent to a destination only if the destination passes hash function H’
Applicable to computations whose messages are not “aggregatable” ‣ If messages can be aggregated (commutative and associative), stick with Combiners
Example: friends-of-friends computation
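The gating by hash function H’ can be sketched in a few lines: a message-heavy superstep is re-run k times, and in sub-iteration i only messages whose destination hashes into bucket i are delivered, so roughly 1/k of the traffic is in memory at once. This is an illustrative sketch (using dst mod k as the hash), not Giraph’s implementation.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Messages delivered in sub-iteration `i` of `k`: only those whose
// destination vertex passes the hash gate H'(dst) = dst mod k == i.
std::vector<std::pair<int, int>> SubIteration(
    const std::vector<std::pair<int, int>>& all_msgs, int i, int k) {
  std::vector<std::pair<int, int>> delivered;
  for (const auto& [src, dst] : all_msgs)
    if (dst % k == i) delivered.emplace_back(src, dst);
  return delivered;
}
```

Because every destination falls in exactly one bucket, each message is delivered in exactly one of the k passes and none is duplicated or dropped.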
Key Take-aways
Usability, performance, and scalability improvements to Apache Giraph ‣ Code available as open source to try out!
A memoir detailing Facebook’s experience of using Giraph for production applications
Headline grabber: “Scales to a trillion-edge graph”