Boosting Vertex-Cut Partitioning for Streaming Graphs


Hooman Peiro Sajjad*, Amir H. Payberah†, Fatemeh Rahimian†, Vladimir Vlassov*, Seif Haridi†

* KTH Royal Institute of Technology † SICS Swedish ICT

5th IEEE International Congress on Big Data

Introduction

Graph Partitioning

Partition large graphs for applications such as:
• Complexity reduction, parallelization, and distributed graph analysis

[Figure: a graph divided into four partitions, P1–P4]

Partitioning Models

[Figure: edge-cut partitioning vs. vertex-cut partitioning of the same graph across partitions P1 and P2]

Vertex-cut partitioning is more efficient for power-law graphs.

A Good Vertex-Cut Partitioning

• Low replication factor
• Balanced partitions with respect to the number of edges

Streaming Graph Partitioning

• Graph elements are assigned to partitions as they are being streamed

• No global knowledge

[Figure: a stream of edges entering the partitioner, which assigns them to partitions P1, P2, …, Pp]
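
To make this concrete, here is a minimal sketch of such a streaming loop in Python; choose_partition stands in for a pluggable heuristic, and all names are illustrative assumptions rather than HoVerCut's API:

from collections import defaultdict

def partition_stream(edges, num_partitions, choose_partition):
    # choose_partition is a pluggable heuristic; it sees only the state
    # accumulated so far -- the partitioner has no global knowledge.
    replicas = defaultdict(set)   # vertex id -> partitions holding a replica
    sizes = [0] * num_partitions  # number of edges assigned to each partition
    for u, v in edges:            # edges arrive one at a time
        p = choose_partition(u, v, replicas, sizes)
        replicas[u].add(p)        # assigning the edge replicates its end-vertices
        replicas[v].add(p)
        sizes[p] += 1
        yield (u, v), p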

State-of-the-Art Partitioners

• Centralized partitioners:
  • Single-threaded partitioner
  • Multi-threaded partitioner: each thread partitions a subset of the graph and shares the state information
• Distributed partitioners:
  • Oblivious partitioners: several independent partitioners

Partitioning Time vs. Partition Quality

• Centralized partitioners: slow partitioning time, low replication factor
• Distributed partitioners: fast partitioning time, high replication factor
• Can we get fast partitioning time and a low replication factor at the same time? That is where HoVerCut comes in.

HoVerCut Framework

HoVerCut is:

• A streaming vertex-cut partitioner
• Horizontally and vertically scalable
• Able to scale without degrading the quality of partitions
• Able to employ different partitioning policies

Architecture Overview

[Figure: n subpartitioners consume edge streams; each contains a core, a partitioning policy, a tumbling window, and a local state, and synchronizes asynchronously with a shared state]
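
As a reading aid, the components in this figure can be sketched as plain data structures. The class and field names below are assumptions made for illustration, not HoVerCut's actual code:

from dataclasses import dataclass, field
from typing import Callable, Dict, Set

@dataclass
class LocalState:
    # What each subpartitioner tracks about the edges it has processed.
    degree: Dict[str, int] = field(default_factory=dict)         # partial degrees
    replicas: Dict[str, Set[int]] = field(default_factory=dict)  # partitions per vertex
    sizes: Dict[int, int] = field(default_factory=dict)          # num. edges per partition

@dataclass
class Subpartitioner:
    policy: Callable     # scoring heuristic, e.g. Greedy or HDRF
    window_size: int     # length of the tumbling window
    state: LocalState    # local view of the vertex and partition tables
    shared: object       # shared state, synchronized asynchronously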

Architecture: Input

• Input graphs are streamed by their edges
• Each subpartitioner receives an exclusive subset of the edges

Architecture: Configurable Window

Subpartitioners collect a number of incoming edges in a tumbling window of a configurable size.

Architecture: Partitioning Policy

Each subpartitioner assigns the edges to the partitions based on a given policy.

Architecture: Local State

Each subpartitioner has a local state, which includes information about the edges processed locally:

• the partial degree of each vertex
• the partitions of each vertex
• the number of edges in each partition

Architecture: Shared State

The shared state is the global state accessible by all subpartitioners. They read it with getState and update it with putState. It consists of two tables:

Vertex Table
  ID   Partial degree   Partitions
  v1   12               p1
  v2   50               p1, p2

Partition Table
  ID   Num. of edges
  p1   5000
  p2   6500
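
A hedged sketch of such a shared state with delta-based updates follows. The getState/putState names come from the figure; the bodies, table layout, and locking are assumptions:

from collections import defaultdict
import threading

class SharedState:
    def __init__(self):
        self.degree = defaultdict(int)    # vertex table: partial degrees
        self.replicas = defaultdict(set)  # vertex table: partitions per vertex
        self.sizes = defaultdict(int)     # partition table: num. edges per partition
        self.lock = threading.Lock()

    def get_state(self, vids):
        # Return the vertex table restricted to vids, plus the partition table.
        with self.lock:
            vt = {v: (self.degree[v], set(self.replicas[v])) for v in vids}
            pt = dict(self.sizes)
        return vt, pt

    def put_state(self, degree_deltas, replica_deltas, size_deltas):
        # Merge the deltas produced by a subpartitioner after one window.
        with self.lock:
            for v, d in degree_deltas.items():
                self.degree[v] += d
            for v, ps in replica_deltas.items():
                self.replicas[v] |= ps
            for p, d in size_deltas.items():
                self.sizes[p] += d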

Architecture: Core

The core is HoVerCut's main algorithm, parametrised with the partitioning policy and the window size.

Vertex-Cut Partitioning Heuristics

For an edge with end-vertices u and v, and for every partition p:

Score(u, v, p) = ReplicationScore(u, v, p) + LoadBalanceScore(p)

Choose the partition that maximizes the Score.

State-of-the-art heuristics:
• Greedy
• HDRF
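
As one concrete instance of this score, below is a sketch of the HDRF heuristic (following Petroni et al.'s formulation); lam and eps are HDRF's balance parameters, and all names are illustrative:

def hdrf_score(u, v, p, degree, replicas, sizes, lam=1.0, eps=1.0):
    # Normalized partial degrees of the two end-vertices.
    theta_u = degree[u] / (degree[u] + degree[v])
    theta_v = 1.0 - theta_u

    def g(w, theta_w):
        # Reward partitions that already hold a replica of w. The bonus is
        # larger for the lower-degree vertex, so it is the high-degree
        # vertex that ends up replicated first.
        return 1.0 + (1.0 - theta_w) if p in replicas.get(w, ()) else 0.0

    replication_score = g(u, theta_u) + g(v, theta_v)
    load_balance_score = lam * (max(sizes) - sizes[p]) / (eps + max(sizes) - min(sizes))
    return replication_score + load_balance_score

# The edge (u, v) goes to the partition that maximizes the score:
# best_p = max(range(len(sizes)), key=lambda p: hdrf_score(u, v, p, degree, replicas, sizes))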

Greedy vs. HDRF

Greedy: places the end-vertices u and v of an edge in a partition that already has a replica of u or v.

HDRF (High-Degree Replicated First): replicates the higher-degree end-vertex.

[Figure: placement of a new edge (u, v) across partitions P1 and P2 under Greedy and under HDRF]

Partitioning a Window of Edges

vids: the set of vertex ids in the current window
edges: the set of edges in the current window
pt = get the partition table
vt = get the vertex table restricted to vids

for each e ∊ edges:
    u = e.src, v = e.dst
    increment vt(u).degree and vt(v).degree
    given a partitioning policy: select p based on vt(u), vt(v) and pt
    add p to vt(u).partitions and vt(v).partitions
    increment pt(p).size
end

Update the shared state by sending vt and pt represented as deltas, for example:

vt deltas
  ID   Degree   Partitions
  v1   +4       +p1
  v2   +2       +p2

pt deltas
  ID   Size
  p1   +3
  p2   +1
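
The pseudocode translates almost directly into Python. The sketch below reuses the SharedState sketch from the Shared State section and treats the policy as a scoring function over the local tables; it is illustrative, not HoVerCut's implementation:

def process_window(edges, shared, policy, num_partitions):
    vids = {w for e in edges for w in e}      # vertex ids in the current window
    vt, pt = shared.get_state(vids)           # local copies of the two tables
    degree_d, replica_d, size_d = {}, {}, {}  # deltas to ship back afterwards

    for u, v in edges:
        for w in (u, v):                      # increment the partial degrees
            deg, parts = vt[w]
            vt[w] = (deg + 1, parts)
            degree_d[w] = degree_d.get(w, 0) + 1
        # Select the partition with the highest score under the given policy.
        p = max(range(num_partitions), key=lambda q: policy(u, v, q, vt, pt))
        for w in (u, v):                      # record the new replicas
            vt[w][1].add(p)
            replica_d.setdefault(w, set()).add(p)
        pt[p] = pt.get(p, 0) + 1              # update the local partition table
        size_d[p] = size_d.get(p, 0) + 1

    shared.put_state(degree_d, replica_d, size_d)  # send only the deltas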

Evaluation

Datasets

Dataset                            |V|     |E|
Autonomous systems (AS)            1.7M    11M
Pokec social network (PSN)         1.6M    22M
LiveJournal social network (LSN)   4.8M    48M
Orkut social network (OSN)         3.1M    117M

Number of partitions: 16

Evaluation Metrics

• Replication Factor (RF): the average number of replicas per vertex
• Load Relative Standard Deviation (LRSD): the relative standard deviation of the number of edges in each partition (LRSD = 0 indicates equal-size partitions)
• Partitioning time: the time it takes to partition the whole graph
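
Given the replica and partition-size tables maintained by the partitioner, both quality metrics are straightforward to compute; a minimal sketch under the definitions above:

import statistics

def replication_factor(replicas):
    # Average number of replicas per vertex (1.0 means no vertex is cut).
    return sum(len(ps) for ps in replicas.values()) / len(replicas)

def load_rsd(sizes):
    # Relative standard deviation of the partition sizes;
    # 0 indicates perfectly balanced partitions.
    mean = statistics.mean(sizes)
    return statistics.pstdev(sizes) / mean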

One Host: Summary

HoVerCut's configuration: subpartitioners (threads) = 32, window size = 32

Distributed Configuration

[Figures: distributed results on AS (|V| = 1.7M, |E| = 11M) and on OSN (|V| = 3.1M, |E| = 117M)]

Conclusion

• We presented HoVerCut, a parallel and distributed vertex-cut partitioner
• HoVerCut can employ different partitioning policies in a scalable fashion
• HoVerCut scales to partition larger graphs without degrading the quality of partitions

Thank You!