Architecture-Aware Graph (Re)Partitioning
people.cs.pitt.edu/~anz28/papers/proposal.slides.pdf

Page 1:

Architecture-Aware Graph (Re)Partitioning

Thesis Proposal Defense
Angen Zheng

Committee:
Alexandros Labrinidis, Depart. of Comp. Science, Pitt (Advisor)
Panos K. Chrysanthis, Depart. of Comp. Science, Pitt (Co-advisor)
Jack Lange, Depart. of Comp. Science, Pitt
Peyman Givi, Swanson School of Engineering, Pitt
Patrick Pisciuneri, Swanson School of Engineering, Pitt

Page 2:

Graph (Re)Partitioning

❖ Applications of Graph (Re)Partitioning
  ▪ Scientific Simulations
  ▪ Distributed Graph Computation
    o e.g., Pregel and Giraph
  ▪ VLSI Design
  ▪ Task Scheduling
  ▪ Linear Programming

Page 3:

Vertex-Centric BSP Computing Model

★ Vertex
  ○ a unique identifier
  ○ a modifiable, user-defined value

★ Edge
  ○ a modifiable, user-defined value
  ○ a target vertex identifier

[Figure: four vertices, each labelled UDF]

★ Vertex-Centric UDF
  ○ Change vertex/edge state
  ○ Send msg to neighbours
  ○ Receive msg from neighbours
  ○ Mutate the graph topology
  ○ Deactivate at end of the superstep
  ○ Reactivate by external msgs
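The vertex-centric model on this slide can be sketched as a small superstep loop. This is a minimal illustration only, not the API of Pregel or Giraph: `run_bsp` and `propagate_max` are hypothetical names, graph mutation is left out, and every vertex is assumed to deactivate at the end of each superstep and to be reactivated only by incoming messages.

```python
from collections import defaultdict

def run_bsp(values, edges, compute, max_supersteps=50):
    """Run supersteps until no vertex is active (hypothetical driver).

    values: {vertex_id: value}; edges: {vertex_id: [target vertex ids]}.
    compute(step, value, incoming, targets) -> (new_value, [(target, msg), ...])
    """
    values = dict(values)
    active = set(values)          # every vertex is active in superstep 0
    inbox = {}
    for step in range(max_supersteps):
        if not active:
            break
        outbox = defaultdict(list)
        for v in active:
            values[v], messages = compute(step, values[v], inbox.get(v, []), edges.get(v, []))
            for target, msg in messages:
                outbox[target].append(msg)
        inbox = outbox            # messages are delivered at the superstep barrier
        active = set(outbox)      # all vertices deactivate; messages reactivate receivers
    return values

def propagate_max(step, value, incoming, targets):
    """Example UDF: adopt the largest value seen and forward it when it changes."""
    new_value = max([value] + incoming)
    if step == 0 or new_value != value:
        return new_value, [(t, new_value) for t in targets]
    return new_value, []

result = run_bsp({1: 3, 2: 6, 3: 2}, {1: [2], 2: [1, 3], 3: [2]}, propagate_max)
print(result)  # {1: 6, 2: 6, 3: 6}
```

Every message between vertices on different machines crosses the network, which is why the placement of neighbouring vertices matters in the slides that follow.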

Page 4:

A Balanced Partitioning = Even Load Distribution

[Figure: example graph partitioned across nodes N1, N2, N3]

Balanced:

Page 5:

Minimal Edge-Cut = Minimal Data Comm

[Figure: example graph partitioned across nodes N1, N2, N3]

Minimizing Edge-Cut:

Page 6:

Minimal Edge-Cut = Minimal Data Comm
But Minimal Data Comm ≠ Minimal Comm Cost

Group neighboring vertices as close as possible

The (re)partitioner has to be Architecture-Aware

Figure 1. Pair-Wise Network Bandwidth (J. Xue, BigData’15)
Standard deviations shown in the three panels: 416.82 Mb/s, 358.34 Mb/s, 269.71 Mb/s
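The difference between edge-cut and communication cost can be made concrete with a small sketch. The graph, the two placements, and the cost values below are illustrative assumptions (the cost values reuse the relative cost matrix of the Aragon example later in the deck); `edge_cut` and `comm_cost` are hypothetical helper names.

```python
def edge_cut(edges, part):
    """Number of edges whose endpoints are in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])

def comm_cost(edges, part, cost):
    """Sum of the relative network cost of every cut edge."""
    return sum(cost[part[u]][part[v]] for u, v in edges if part[u] != part[v])

# Relative cost between nodes (assumed values): N1-N2 = 1, N2-N3 = 1, N1-N3 = 6.
cost = {"N1": {"N2": 1, "N3": 6}, "N2": {"N1": 1, "N3": 1}, "N3": {"N1": 6, "N2": 1}}
edges = [("a", "b"), ("c", "d")]
near = {"a": "N1", "b": "N2", "c": "N1", "d": "N2"}   # cut edges cross the cheap link
far = {"a": "N1", "b": "N3", "c": "N1", "d": "N3"}    # cut edges cross the expensive link

print(edge_cut(edges, near), comm_cost(edges, near, cost))  # 2 2
print(edge_cut(edges, far), comm_cost(edges, far, cost))    # 2 12
```

Both placements have the same edge-cut, yet one is six times more expensive once link costs are taken into account.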

Page 7:

Overview of the State-of-the-Art

Balanced Graph (Re)Partitioning

Partitioners (static graphs)
  Offline Methods (High Quality, Poor Scalability)
  Online Methods (Moderate Quality, High Scalability)

Repartitioners (dynamic graphs)
  Offline Methods (High Quality, Poor Scalability)
  Online Methods (Moderate~High Quality, High Scalability)

Methods shown in the figure: Metis’95, ICA3PP’08, SoCC’12, TKDE’15, BigData’15, DG/LDG’12, Parmetis’97, CatchW’13, xdgp’13, Hermes’15, Mizan’13, Aragon’14, Paragon’16, Planar’16

Figure label: Architecture-Aware

Page 8:

Architecture-Aware Graph Repartitioning

Given G=(V, E) and an initial Partitioning P:

Minimizing Communication:

Balancing Load:

Network Cost

Minimizing Migration:

Page 9:

Road Map

Introduction

Aragon

Paragon

Planar

Contention

Evaluation

Future Work

Page 10:

Aragon: Sequential Architecture-Aware Graph Partition Refinement [BigGraphs’14]

❖ Goal: Group neighbouring vertices as close as possible
❖ Input:
  o A partitioned graph
  o The relative network comm cost matrix
❖ Output:
  o A partitioning with neighbouring vertices being grouped as close as possible.

Page 11:

Aragon: Illustrate Aragon through an example

[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

Page 12:

Aragon: Send partitions to a centralized place

[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place

Page 13:

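Aragon: Refine each partition pair sequentially

[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining two partitions at a time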

Page 14:

Aragon: Move vertices between N1 and N2?

[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)

Page 15:

Aragon: Compute initial gain for vertices of N1 & N2

[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain

Page 16:

Aragon: Compute initial gain for vertex a

[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain
  ■ a: N1 -> N2

Page 17:

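Aragon: How the movement affects comm(N1, N2)?

[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain
  ■ a: N1 -> N2
  ■ g_std(a) = (1-2)*1 = -1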

Page 18:

Aragon: How the movement affects comm(N1, N3)?

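[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain
  ■ a: N1 -> N2
  ■ g_std(a) = (1-2)*1 = -1
  ■ g_topo(a) = 1*(6-1) = 5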

Page 19:

Aragon: What is the migration cost of vertex a?

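[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain
  ■ a: N1 -> N2
  ■ g_std(a) = (1-2)*1 = -1
  ■ g_topo(a) = 1*(6-1) = 5
  ■ g_mig(a) = 1*1 = 1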

Page 20:

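Aragon: What’s the initial gain of moving a?

[Figure: example graph partitioned across nodes N1, N2, N3; vertex a is annotated with gain 3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain
  ■ a: N1 -> N2
  ■ g_std(a) = (1-2)*1 = -1
  ■ g_topo(a) = 1*(6-1) = 5
  ■ g_mig(a) = 1*1 = 1
  ■ g(a) = -1 + 5 - 1 = 3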
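The gain arithmetic for vertex a can be reproduced with a short sketch. The general formulas are an interpretation of the numbers on the slide, not a statement of Aragon's exact definition, and the neighbour counts of a (two in N1, one in N2, one in N3) are inferred from the arithmetic because the figure is not available. `aragon_gain` is a hypothetical helper name.

```python
def aragon_gain(nbr_count, src, dst, cost, vertex_size=1):
    """Gain of moving one vertex from src to dst (assumed decomposition).

    nbr_count: {partition: number of the vertex's neighbours located there}.
    Returns (g_std, g_topo, g_mig, total).
    """
    # Standard term: edges to dst become internal, edges to src become cut.
    g_std = (nbr_count.get(dst, 0) - nbr_count.get(src, 0)) * cost[src][dst]
    # Topology term: neighbours on other nodes are now reached from dst instead of src.
    g_topo = sum(n * (cost[src][p] - cost[dst][p])
                 for p, n in nbr_count.items() if p not in (src, dst))
    # Migration term: cost of shipping the vertex itself from src to dst.
    g_mig = vertex_size * cost[src][dst]
    return g_std, g_topo, g_mig, g_std + g_topo - g_mig

cost = {"N1": {"N2": 1, "N3": 6}, "N2": {"N1": 1, "N3": 1}, "N3": {"N1": 6, "N2": 1}}
gains_a = aragon_gain({"N1": 2, "N2": 1, "N3": 1}, "N1", "N2", cost)
print(gains_a)  # (-1, 5, 1, 3)
```

The topology term is what a purely edge-cut-based refinement would miss: moving a increases the edge-cut by one, but it replaces a cost-6 link to N3 with a cost-1 link.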

Page 21:

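Aragon: What’s the initial gain of other vertices?

[Figure: example graph partitioned across nodes N1, N2, N3; vertices annotated with initial gains 3, -2, -3, -2, 0, 0, -2]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain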

Page 22:

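Aragon: Which vertex has the max gain?

[Figure: example graph partitioned across nodes N1, N2, N3; vertices annotated with initial gains 3, -2, -3, -2, 0, 0, -2]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain
  ■ Select max gain vertex, a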

Page 23:

Aragon: Move the vertex with max gain

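[Figure: example graph partitioned across nodes N1, N2, N3; remaining vertices annotated with gains -2, -3, -2, 0, 0, -2]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain
  ■ Select max gain vertex, a
  ■ Move a to N2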

Page 24:

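Aragon: Update the gain of a’s nbors

[Figure: example graph partitioned across nodes N1, N2, N3; vertices annotated with updated gains -2, -1, -2, -2, 0, 0]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain
  ■ Select max gain vertex, a
  ■ Move a to N2
  ■ Update gain of a’s nbors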

Page 25:

Aragon: Repeat the whole process

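[Figure: example graph partitioned across nodes N1, N2, N3; vertices annotated with gains -2, -1, -2, -2, 0, 0]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain
  ■ Repeat
    ■ Select max gain vertex, a
    ■ Move a to N2
    ■ Update gain of a’s nbors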
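The select, move, update loop for one partition pair can be sketched as follows. This is a simplified illustration rather than Aragon's implementation: it recomputes all gains in every round instead of updating only the neighbours of the moved vertex, moves each vertex at most once, and ignores load-balance constraints. The example graph (a with neighbours b and c in N1, d in N2, e in N3) is an assumption consistent with the gain of 3 computed for a; `gain` and `refine_pair` are hypothetical names.

```python
def gain(v, dst, adj, part, cost):
    """Comm cost saved by moving v to dst, minus a unit-size migration cost."""
    src = part[v]
    before = sum(cost[src][part[u]] for u in adj[v] if part[u] != src)
    after = sum(cost[dst][part[u]] for u in adj[v] if part[u] != dst)
    return before - after - cost[src][dst]

def refine_pair(p, q, adj, part, cost, max_moves=100):
    """Greedily move the max-gain vertex between partitions p and q."""
    part = dict(part)
    moved = set()
    for _ in range(max_moves):
        candidates = [(gain(v, q if part[v] == p else p, adj, part, cost), v)
                      for v in sorted(adj) if part[v] in (p, q) and v not in moved]
        if not candidates:
            break
        best_gain, v = max(candidates)
        if best_gain <= 0:          # no remaining move reduces the cost
            break
        part[v] = q if part[v] == p else p
        moved.add(v)
    return part

cost = {"N1": {"N2": 1, "N3": 6}, "N2": {"N1": 1, "N3": 1}, "N3": {"N1": 6, "N2": 1}}
adj = {"a": ["b", "c", "d", "e"], "b": ["a"], "c": ["a"], "d": ["a"], "e": ["a"]}
part = {"a": "N1", "b": "N1", "c": "N1", "d": "N2", "e": "N3"}
refined = refine_pair("N1", "N2", adj, part, cost)
print(refined["a"])  # N2
```

In this toy graph only a moves: after the move, the best remaining gain is 0, so the loop stops.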

Page 26:

Aragon: Output for the refinement of N1 and N2

● 4 Units Comm Cost (4 Edge-Cuts)
● 1 Unit Migration Cost

[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N2)
  ■ Compute initial gain
  ■ Repeat
    ■ Select max gain vertex, a
    ■ Move a to N2
    ■ Update gain of a’s nbors

Page 27:

Aragon: Refining N1 and N3

[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N1, N3)
  ■ Compute initial gain
  ■ Repeat
    ■ Select max gain vertex, a
    ■ Move a to N2
    ■ Update gain of a’s nbors

Page 28:

Aragon: Refining N2 and N3

[Figure: example graph partitioned across nodes N1, N2, N3]

Relative network comm cost matrix:
      N1  N2  N3
  N1   -   1   6
  N2   1   -   1
  N3   6   1   -

● Sending partitions to one place
● Refining (N2, N3)
  ■ Compute initial gain
  ■ Repeat
    ■ Select max gain vertex, a
    ■ Move a to N2
    ■ Update gain of a’s nbors

Page 29:

Road Map

Introduction

Aragon

Paragon

Planar

Contention

Evaluation

Future Work

Page 30:

Paragon: Parallel Architecture-Aware Graph Partitioning Refinement [EDBT’16]

❖ Goal: Group neighbouring vertices as close as possible

Paragon vs Aragon
  ○ lower overhead
  ○ scale to much larger graphs

Page 31:

Paragon: Illustrate Paragon via an example

[Figure: graph G split into partitions P1 to P9, placed on nodes N1 to N9]

Page 32:

Paragon: Partition Grouping

Groups:
  {P1, P2, P3}
  {P4, P6, P9}
  {P5, P7, P8}

[Figure: partitions P1 to P9 on nodes N1 to N9, divided into the three groups above]

Page 33:

Paragon: Group Server Selection

[Figure: partitions P1 to P9 on nodes N1 to N9]

Group servers: N9, N8, N2
Groups: {P1, P2, P3}, {P4, P6, P9}, {P5, P7, P8}

Page 34:

Paragon: Sending “Partition” to Group Servers

[Figure: partitions P1 to P9 on nodes N1 to N9; copies of P1, P3, P4, P6, P5, and P7 are shown next to the group servers]

Only send boundary vertices

Group servers: N9, N8, N2
Groups: {P1, P2, P3}, {P4, P6, P9}, {P5, P7, P8}

Page 35:

Paragon: Parallel Refinement

[Figure: partitions P1 to P9 on nodes N1 to N9; Aragon runs on each of the three group servers]

Group servers: N9, N8, N2
Groups: {P1, P2, P3}, {P4, P6, P9}, {P5, P7, P8}

Number of Groups
  ○ Degree of Parallelism
  ○ Parallelism vs Quality

Page 36:

Paragon: Shuffle Refinement

Groups before the swap (each refined by Aragon, in parallel):
  {P1, P2, P3}
  {P4, P6, P9}
  {P5, P7, P8}

Groups after the swap:
  N2: {P1, P2, P4}
  N9: {P5, P7, P9}
  N8: {P3, P6, P8}

Repeat k times

To increase the # of partition pairs being refined!
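A short sketch shows why shuffling helps. It only counts which partition pairs share a group under the two groupings listed on this slide; how Paragon chooses the partitions to swap is not specified here and is not modelled. `refined_pairs` is a hypothetical helper name.

```python
from itertools import combinations

def refined_pairs(groups):
    """Partition pairs that share a group and can therefore be refined together."""
    return {frozenset(pair) for group in groups for pair in combinations(group, 2)}

round1 = [["P1", "P2", "P3"], ["P4", "P6", "P9"], ["P5", "P7", "P8"]]
round2 = [["P1", "P2", "P4"], ["P5", "P7", "P9"], ["P3", "P6", "P8"]]

covered = refined_pairs(round1) | refined_pairs(round2)
print(len(refined_pairs(round1)), len(covered))  # 9 16
```

With three groups of three, one round refines 9 of the 36 possible partition pairs; after one shuffle the two rounds together cover 16.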

Page 37:

Road Map

Introduction

Aragon

Paragon

Planar

Contention

Evaluation

Future Work

Page 38:

Planar: Parallel Lightweight Architecture-Aware Graph Repartitioning [ICDE’16]

Phase-1: Logical Vertex Migration
  ★ Migration Planning
    ○ What vertices to move?
    ○ Where to move?
  ○ Phase-1a: Minimizing Comm Cost
  ○ Phase-1b: Ensuring Balanced Partitions

Phase-2: Physical Vertex Migration
  ★ Perform the Migration Plan

Phase-3: Convergence Check
  ★ Still beneficial?

[Figure: timeline of supersteps Sk, Sk+1, Sk+2, Sk+4, Sk+5, each labelled Planar]

Page 39:

Phase-1a: Minimizing Comm Cost

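Relative network comm cost matrix:
      N1  N2  N3
  N1   -   6   1
  N2   6   -   1
  N3   1   1   -

[Figure: example graph partitioned across nodes N1, N2, N3, annotated with the link costs 6 and 1]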

Page 40:

Phase-1a: Minimizing Comm Cost

★ Run Planar on each partition in Parallel
  ○ Each boundary vertex of my partition
    ■ make a migration decision on my own
    ■ Probabilistic vertex migration

[Figure: example graph partitioned across nodes N1, N2, N3, annotated with the link costs 6 and 1]

Page 41:

Phase-1a: Minimizing Comm Cost

[Figure: example graph partitioned across nodes N1, N2, N3, annotated with the link costs 6 and 1]

★ Run Planar on each partition in Parallel
  ○ Each boundary vertex of my partition
    ■ make a migration decision on my own
    ■ Probabilistic vertex migration

Page 42:

Phase-1a@N1: Use vertex a as an example

[Figure: example graph partitioned across nodes N1, N2, N3, annotated with the link costs 6 and 1]

g(a, N1, N1) = 0

★ Run Planar on each partition in Parallel
  ○ Each boundary vertex of my partition
    ■ make a migration decision on my own
    ■ Probabilistic vertex migration

Max Gain: 0
Optimal Dest: N1

Page 43:

Phase-1a@N1: Move vertex a to N2?

[Figure: the example graph on nodes N1, N2, N3 before and after moving a to N2, annotated with link costs]

old_comm(a, N1) = 2 * 6 + 1 * 1 = 13
new_comm(a, N2) = 1 * 6 + 1 * 1 = 7
mig(a, N1, N2) = 1 * 6 = 6
g(a, N1, N2) = 13 - 7 - 6 = 0

★ Run Planar on each partition in Parallel
  ○ Each boundary vertex of my partition
    ■ make a migration decision on my own
    ■ Probabilistic vertex migration

Max Gain: 0
Optimal Dest: N1

Page 44:

Phase-1a@N1: Move vertex a to N3?

[Figure: the example graph on nodes N1, N2, N3 before and after moving a to N3, annotated with link costs]

old_comm(a, N1) = 2 * 6 + 1 * 1 = 13
new_comm(a, N3) = 1 * 1 + 2 * 1 = 3
mig(a, N1, N3) = 1 * 1 = 1
g(a, N1, N3) = 13 - 3 - 1 = 9

★ Run Planar on each partition in Parallel
  ○ Each boundary vertex of my partition
    ■ make a migration decision on my own
    ■ Probabilistic vertex migration

Max Gain: 9
Optimal Dest: N3
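The per-vertex decision worked through on the last three slides can be sketched in a few lines. The neighbour counts of a (one in N1, two in N2, one in N3) are inferred from the old_comm and new_comm arithmetic, and unit vertex size is assumed for the migration cost; `comm`, `planar_gain`, and `best_destination` are hypothetical names, not Planar's API.

```python
def comm(nbrs, home, cost):
    """Comm cost of a vertex placed on `home`; nbrs is {partition: neighbour count}."""
    return sum(n * cost[home][p] for p, n in nbrs.items() if p != home)

def planar_gain(nbrs, src, dst, cost, size=1):
    """g(v, src, dst) = old_comm - new_comm - mig; staying put has gain 0."""
    if dst == src:
        return 0
    return comm(nbrs, src, cost) - comm(nbrs, dst, cost) - size * cost[src][dst]

def best_destination(nbrs, src, cost):
    """Return (max gain, destination) over all candidate partitions."""
    return max(((planar_gain(nbrs, src, d, cost), d) for d in cost), key=lambda t: t[0])

cost = {"N1": {"N2": 6, "N3": 1}, "N2": {"N1": 6, "N3": 1}, "N3": {"N1": 1, "N2": 1}}
nbrs_a = {"N1": 1, "N2": 2, "N3": 1}

print(planar_gain(nbrs_a, "N1", "N2", cost))   # 13 - 7 - 6 = 0
print(planar_gain(nbrs_a, "N1", "N3", cost))   # 13 - 3 - 1 = 9
print(best_destination(nbrs_a, "N1", cost))    # (9, 'N3')
```

Because each boundary vertex only needs its own neighbour counts and the cost matrix, every partition can evaluate its vertices independently, which is what allows the phase to run in parallel.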

Page 45:

Phase-1a: Probabilistic Vertex Migration

Migration Planning

Partition  Boundary Vtx  Migration Dest  Gain  Max Gain  Probability
N1         a             N3              9     9         9/9
N2         b             N3              2     3         2/3
N2         d             N3              3     3         3/3
N3         e             N3              0     0         0
N3         g             N3              0     0         0

Migrate with a probability proportional to the gain

[Figure: example graph partitioned across nodes N1, N2, N3, annotated with the link costs 6 and 1]
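The probabilities in the table follow from dividing each vertex's gain by the maximum gain within its partition. A minimal sketch of that rule, with hypothetical function names and the assumption that a partition whose maximum gain is 0 migrates nothing:

```python
import random

def migration_probabilities(gains):
    """gains: {vertex: gain} for the boundary vertices of one partition."""
    max_gain = max(gains.values(), default=0)
    if max_gain <= 0:
        return {v: 0.0 for v in gains}
    return {v: max(g, 0) / max_gain for v, g in gains.items()}

def choose_migrants(gains, rng):
    """Each vertex migrates independently with its own probability."""
    probs = migration_probabilities(gains)
    return [v for v in sorted(probs) if rng.random() < probs[v]]

print(migration_probabilities({"a": 9}))          # partition N1
print(migration_probabilities({"b": 2, "d": 3}))  # partition N2
print(migration_probabilities({"e": 0, "g": 0}))  # partition N3
print(choose_migrants({"a": 9}, random.Random(0)))
```

A vertex holding the maximum gain in its partition always migrates, while lower-gain vertices move only some of the time.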

Page 46:

Phase-1b: Balancing Partitions

❖ Quota-Based Vertex Migration

Q1: How much work should each overloaded partition migrate to each underloaded partition?
  ■ Potential Gain Computation
    ● Similar to Phase-1a vertex gain computation
  ■ Iteratively allocate quota starting from the partition pair having the largest gain.

Q2: What vertices to migrate?
  ■ Phase-1a vertex migration, but limited by the quota.
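One way to read the quota allocation in Q1 is as a greedy pass over (overloaded, underloaded) partition pairs in decreasing order of potential gain. The sketch below is an assumption-laden illustration, not Planar's algorithm: the loads, the `capacity` threshold, and the `pair_gain` values are made up, and `allocate_quota` is a hypothetical name.

```python
def allocate_quota(load, capacity, pair_gain):
    """Assign migration quotas, visiting partition pairs from largest gain down.

    load: {partition: current load}; capacity: target load per partition.
    pair_gain: {(overloaded, underloaded): potential gain of migrating between them}.
    """
    surplus = {p: l - capacity for p, l in load.items() if l > capacity}
    deficit = {p: capacity - l for p, l in load.items() if l < capacity}
    quota = {}
    for (src, dst), _ in sorted(pair_gain.items(), key=lambda kv: kv[1], reverse=True):
        amount = min(surplus.get(src, 0), deficit.get(dst, 0))
        if amount > 0:
            quota[(src, dst)] = amount
            surplus[src] -= amount
            deficit[dst] -= amount
    return quota

quota = allocate_quota({"N1": 13, "N2": 8, "N3": 9}, capacity=10,
                       pair_gain={("N1", "N2"): 3, ("N1", "N3"): 7})
print(quota)  # {('N1', 'N3'): 1, ('N1', 'N2'): 2}
```

The quota then caps how many vertices the Phase-1a style migration may move between each pair, which answers Q2.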

Page 47:

Planar: Physical Vertex Migration

Phase-1: Logical Vertex Migration
  ★ Migration Planning
    ○ What vertices to move?
    ○ Where to move?
  ○ Phase-1a: Minimizing Comm Cost
  ○ Phase-1b: Ensuring Balanced Partitions

Phase-2: Physical Vertex Migration
  ★ Perform the Migration Plan

Phase-3: Convergence Check
  ★ Still beneficial?

[Figure: timeline of supersteps Sk, Sk+1, Sk+2, Sk+4, Sk+5, each labelled Planar]

Page 48:

Planar: Convergence Check

Phase-1: Logical Vertex Migration
  ★ Migration Planning
    ○ What vertices to move?
    ○ Where to move?
  ○ Phase-1a: Minimizing Comm Cost
  ○ Phase-1b: Ensuring Balanced Partitions

Phase-2: Physical Vertex Migration
  ★ Perform the Migration Plan

Phase-3: Convergence Check
  ★ Still beneficial?

[Figure: timeline of supersteps Sk, Sk+1, Sk+2, Sk+4, Sk+5, each labelled Planar]

Page 49:

Phase-3: Convergence

[Figure: a repartitioning epoch over supersteps Sk, Sk+1, Sk+2, Sk+4, Sk+5, each labelled Planar; figure labels: Enough changes (structure/load), Converge]

★ Converge
  ○ improvement achieved per adaptation superstep < a threshold
  ○ after a given number of consecutive adaptation supersteps

Threshold = 1% and number of consecutive supersteps = 10 (via Sensitivity Analysis on 12 datasets)
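The convergence rule can be sketched as a check over the recent history of communication costs. The parameter names `epsilon` and `delta` are placeholders chosen here (the slide's own symbols did not survive extraction), and treating the 1% threshold as a relative improvement is an assumption.

```python
def converged(costs, epsilon=0.01, delta=10):
    """True if each of the last `delta` supersteps improved the cost by less than epsilon.

    costs: communication cost after each adaptation superstep, oldest first.
    """
    if len(costs) <= delta:
        return False                      # not enough history yet
    recent = costs[-(delta + 1):]
    return all((prev - cur) < epsilon * prev
               for prev, cur in zip(recent, recent[1:]))

print(converged([100.0] * 11))                      # True: no improvement for 10 supersteps
print(converged([100, 50, 49.9, 49.8], delta=2))    # True: last two steps improved by < 1%
print(converged([100, 50, 49.9], delta=2))          # False: the 100 -> 50 step is too large
```

Once the check returns True, repartitioning stops until enough structural or load changes accumulate to start a new epoch.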

Page 50:

Road Map

Introduction

Aragon

Paragon

Planar

Contention

Evaluation

Future Work

Page 51:

Network may not always be the bottleneck!

★ Dual-socket Xeon E5v2 server with
  ○ DDR3-1600
  ○ 2 FDR 4x NICs per socket

Revisit the Impact of Memory Subsystem Carefully!

★ Infiniband: 1.7GB/s~37.5GB/s
★ DDR3: 6.25GB/s~16.6GB/s

Network vs Memory Bandwidth (C. Bing, CoRR’15)

Page 52:

Contention on Memory Subsystems: Intra-node data comm via shared memory!

[Figure: Sending Core, Receiving Core; Send Buffer, Shared Buffer, Receive Buffer; steps 1. Load, 2a. Load, 2b. Write, 3. Load, 4a. Load, 4b. Write]

Page 53:

Cached Send/Shared/Receive Buffer

Contention on Memory Subsystems: Intra-node data comm-->cache pollution!

Multiple copies of the same data in LLC, contending for LLC and MC

Page 54:

Contention on Memory Subsystems: Intra-node data comm-->cache pollution!

Cached Send/Shared Buffer Cached Receive/Shared Buffer

Multiple copies of the same data in LLC, contending for LLC, MC, and QPI

Page 55:

(P)aragon/Planar Contention Avoiding: Penalize intra-node network comm cost!

[Formula labels: Intra-Node Network Comm Cost; Maximal Inter-Node Network Comm Cost; Degree of Contention (λ)]

System Bottleneck:
  Memory (λ=1): Clusters with High-Speed Networks
  Network (λ=0): Geo-distributed clusters or Cloud

Page 56:

(P)aragon/Planar Contention Avoiding: RDMA allows inter-node data comm without polluting the cache!

[Figure: Node#1 (Sending Core, Send Buffer, IB HCA) and Node#2 (IB HCA, Receive Buffer, Receiving Core)]

Page 57:

Road Map

Introduction

Aragon

Paragon

Planar

Contention

Evaluation

Future Work

Page 58:

Evaluation

❖ Microbenchmarks
  ▪ Partitioning Quality

❖ Real-World Workloads
  ▪ Breadth First Search (BFS)
  ▪ Single Source Shortest Path (SSSP)

❖ Scalability Test
  ▪ Scalability vs Graph Size
  ▪ Scalability vs # of Partitions
  ▪ Scalability vs Graph Size and # of Partitions

Page 59:

Real-World Workload: Setup

Cluster Configuration   MPICluster (FDR Infiniband)         Gordon (QDR Infiniband)
# of Nodes              32                                  1024
Network Topology        Single Switch (32 nodes / switch)   4*4*4 3D Torus of Switches (16 nodes / switch)
Network Bandwidth       56Gbps                              8Gbps

Node Configuration      MPICluster (Intel Haswell)          Gordon (Intel Sandy Bridge)
# of Sockets            2 (10 cores / socket)               2 (8 cores / socket)
L3 Cache                25MB                                20MB
Memory Bandwidth        65GB/s                              85GB/s

Page 60:

Real-World Workload: System Bottleneck

[Formula labels: Intra-Node Network Comm Cost; Maximal Inter-Node Network Comm Cost; Degree of Contention (λ)]

System Bottleneck:
  MPICluster: Memory (λ=1)
  Gordon: Network (λ=0)

Page 61:

Real-World Workload: Baselines

Balanced Graph (Re)Partitioning

Partitioners (static graphs)
  Offline Methods (High Quality, Poor Scalability)
  Online Methods (Moderate Quality, High Scalability)

Repartitioners (dynamic graphs)
  Offline Methods (High Quality, Poor Scalability)
  Online Methods (Moderate~High Quality, High Scalability)

Methods shown in the figure: Metis’95, ICA3PP’08, SoCC’12, TKDE’15, BigData’15, DG/LDG’12, Parmetis’97, Aragon’14, Paragon’16, CatchW’13, xdgp’13, Hermes’15, Mizan’13, Planar, uniPlanar

Initial Partitioner: DG

Page 62:

BFS Exec. Time on MPICluster (λ=1): Planar achieved up to 9x speedups

[Figure: bar chart of BFS execution time; speedup labels: 9x, 7.5x, 5.8x, 4.1x, 1.48x, 1.37x, 1x]

★ as-skitter: |V|=1.6M, |E| = 22M

★ 60 Partitions: three 20-core machines

Page 63:

BFS Comm Volume on MPICluster (λ=1): Planar had the lowest intra-node comm volume

★ as-skitter: |V|=1.6M, |E| = 22M

★ 60 Partitions: three 20-core machines

Reduction    Intra-Socket   Inter-Socket
DG           51%            38%
METIS        51%            36%
PARMETIS     47%            34%
uniPLANAR    44%            28%
ARAGON       4.3%           0.8%
PARAGON      5.2%           2.6%

Page 64:

BFS Exec. Time on Gordon (λ=0): Planar achieved up to 3.2x speedups

[Figure: bar chart of BFS execution time; speedup labels: 3.2x, 1.05x, 1.16x, 1.21x, 1x]

★ as-skitter: |V|=1.6M, |E| = 22M

★ 48 Partitions: three 16-core machines

Page 65:

BFS Comm. Volume on Gordon (λ=0): Planar had the lowest inter-node comm volume

[Figure: bar chart of BFS communication volume; labels: 51%, 25%, 11%, 0.1%]

★ as-skitter: |V|=1.6M, |E| = 22M

★ 48 Partitions: three 16-core machines

Page 66:

Road Map

Introduction

Aragon

Paragon

Planar

Contention

Evaluation

Future Work

Page 67:

Argo: Architecture-Aware Graph Partitioning (05/2016~09/2016)

❖ Goal: make initial partitioning architecture-aware
  ❑ To further confirm the contention issue (experimentally)
    o By collecting a set of low-level metrics
      ✓ (e.g., cache misses, TLB misses)
  ❑ Architecture-aware static graph partitioner
    o For the initial partitioning step

Page 68:

Sargon: Skew-Resistant Graph Partitioning (05/2016~09/2016)

❖ Goal: make initial partitioning skew-resistant
  ❑ Workload characteristics
    ➢ Traversal-style graph workloads (e.g., BFS/SSSP)
      ✓ Not all vertices are always active
      ✓ A balanced partitioning of the entire graph ≠ even active vertex distribution

Page 69:

Sargon: Skew-Resistant Graph Partitioning (05/2016~09/2016)

❖ Goal: make initial partitioning skew-resistant
  ❑ Workload characteristics
    ➢ Traversal-style graph workloads (e.g., BFS/SSSP)
      ✓ Not all vertices are always active
      ✓ A balanced partitioning of the entire graph ≠ even active vertex distribution
  ❑ Graph structure characteristics
    ➢ Skewed vertex degree distribution (scale-free)
    ➢ A balanced partitioning of the entire graph ≠ even high-degree vertex distribution.

Page 70:

Skew-Resistant Graph Repartitioning (09/2016~12/2016)

❖ Goal: make repartitioning skew-resistant
  ❑ Workload characteristics
    ➢ Traversal-style graph workloads (e.g., BFS/SSSP)
      ✓ Not all vertices are always active
      ✓ A balanced partitioning of the entire graph ≠ even active vertex distribution
  ❑ Graph structure characteristics
    ➢ Skewed vertex degree distribution (scale-free)
    ➢ A balanced partitioning of the entire graph ≠ even high-degree vertex distribution.

Page 71:

Research Overview & Timeline

Timeline          Work                                                                   Progress
-                 Aragon: Heterogeneity-Aware Graph Partition Refinement                 Completed [BigGraphs’14]
-                 Paragon: Parallel Architecture-Aware Graph Partition Refinement        Completed [EDBT’16]
-                 Planar: Parallel Lightweight Architecture-Aware Graph Repartitioning   Completed [ICDE’16]
05/2016~09/2016   Argo: Architecture-Aware Graph Partitioning                            Evaluation & Writing
05/2016~09/2016   Sargon: Skew-Resistant Graph Partitioning                              Evaluation & Writing
09/2016~12/2016   Skew-Resistant Graph Repartitioning                                    Algorithm Design
07/2016~03/2017   Thesis writing                                                         Ongoing
04/2017           Thesis defense                                                         -

Page 72:

Thanks!
