Ligra: A Lightweight Graph Processing Framework for Shared Memory

Julian Shun Miller Institute, UC Berkeley

(work done while at Carnegie Mellon University)

Joint work with Laxman Dhulipala and Guy Blelloch

(Principles and Practice of Parallel Programming, 2013 and Data Compression Conference, 2015)

HPGM 2015

Large Graphs

Need to efficiently analyze these graphs
• Computational efficiency
• Space efficiency
• Programming efficiency

Breadth-first Search (BFS)

• Compute a BFS tree rooted at source r containing all vertices reachable from r

[Figure: BFS from source r, with the current frontier marked]

• Can process each frontier in parallel
• Challenges: race conditions, load balancing

BFS Abstractly

• Operate on a subset of vertices
• Map computation over subset of edges in parallel
• Return new subset of vertices
• (Map computation over subset of vertices in parallel)

Can we build an abstraction for these types of algorithms?

• Breadth-first search
• Betweenness centrality
• Connected components
• Bellman-Ford shortest paths
• Graph eccentricity estimation
• PageRank

All graph traversal algorithms do this!

Graph Processing Systems

• Existing: Pregel/Giraph, GraphLab, Pegasus, Knowledge Discovery Toolbox, GraphChi, Parallel BGL, and many others…

• Our system: Ligra - Lightweight graph processing system for shared memory
• Efficient for “frontier-based” algorithms

Ligra

Shared memory, Cilk Plus/OpenMP

Components: Graph, VertexSubset, EdgeMap, VertexMap

• Operate on a subset of vertices
• Map computation over subset of edges in parallel
• Return new subset of vertices
• (Map computation over subset of vertices in parallel)

Ligra Framework

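VertexMap

[Figure: a 9-vertex graph (vertices 0-8). VertexMap applies f to each vertex of the input VertexSubset {0, 4, 6, 8} and returns a VertexSubset, shown as {6, 8}.]

bool f(v){ data[v] = data[v] + 1; return (data[v] == 1);}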
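The VertexMap step can be sketched sequentially as follows. This is only an illustration in Python, not Ligra's parallel implementation: the name `vertex_map` and the initial contents of `data` are assumptions chosen for the example, and the sketch returns the vertices for which f returns true.

```python
def vertex_map(subset, f):
    """Apply f to every vertex in subset; keep the vertices where f returns True."""
    return [v for v in subset if f(v)]


# Hypothetical per-vertex data for a 9-vertex graph (vertices 0..8).
data = [1, 0, 0, 0, 1, 0, 0, 0, 0]


def f(v):
    # Same body as the slide's f: increment, then test for the value 1.
    data[v] = data[v] + 1
    return data[v] == 1


result = vertex_map([0, 4, 6, 8], f)
print(result)  # [6, 8]
```

With this made-up `data`, vertices 0 and 4 already hold 1 and are filtered out, while 6 and 8 go from 0 to 1 and are kept.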

Ligra Framework

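EdgeMap

[Figure: the same 9-vertex graph. EdgeMap takes the input VertexSubset {0, 4, 6, 8}, evaluates update(u,v) on edges leaving the subset and cond(v) on their targets (edges are marked T or F), and returns a new VertexSubset.]

bool update(u,v){…}

bool cond(v){…}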

Breadth-first Search in Ligra

parents = {-1, …, -1}; //-1 indicates “unvisited”

procedure UPDATE(s, d):
    return compare_and_swap(parents[d], -1, s);

procedure COND(i):
    return parents[i] == -1; //checks if “unvisited”

procedure BFS(G, r):
    parents[r] = r;
    frontier = {r}; //VertexSubset
    while (size(frontier) > 0):
        frontier = EDGEMAP(G, frontier, UPDATE, COND);
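The BFS pseudocode above can be tried out with a small sequential sketch. It is written in Python for readability and is not the Ligra code: `edge_map` here is a single-threaded stand-in for EDGEMAP, the compare-and-swap is replaced by a plain check-then-write (safe only because nothing runs in parallel), and the example graph is made up.

```python
def edge_map(graph, frontier, update, cond):
    """Sequential stand-in for EDGEMAP: apply update(s, d) to each edge (s, d)
    leaving the frontier whose target satisfies cond(d), and return the
    targets for which update returned True."""
    next_frontier = []
    for s in frontier:
        for d in graph[s]:
            if cond(d) and update(s, d):
                next_frontier.append(d)
    return next_frontier


def bfs(graph, r):
    parents = [-1] * len(graph)  # -1 indicates "unvisited"

    def update(s, d):
        # Sequential replacement for compare_and_swap(parents[d], -1, s).
        if parents[d] == -1:
            parents[d] = s
            return True
        return False

    def cond(i):
        return parents[i] == -1  # checks if "unvisited"

    parents[r] = r
    frontier = [r]
    while len(frontier) > 0:
        frontier = edge_map(graph, frontier, update, cond)
    return parents


# Hypothetical directed graph as adjacency lists; vertex 5 is unreachable from 0.
graph = [[1, 2], [3], [3, 4], [], [0], []]
print(bfs(graph, 0))  # [0, 0, 0, 1, 2, -1]
```

In a parallel setting the atomic compare-and-swap is what guarantees that exactly one frontier vertex becomes the parent of each newly reached vertex.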

Actual BFS code in Ligra

Sparse or Dense EdgeMap?

[Figure: example graph on vertices 1-15 with the current frontier marked]

• Dense method better when frontier is large and many vertices have been visited

• Sparse (traditional) method better for small frontiers

• Switch between the two methods based on frontier size [Beamer et al. SC ’12]

Limited to BFS?

EdgeMap

procedure EDGEMAP(G, frontier, Update, Cond):
    if (|frontier| + sum of out-degrees > threshold) then:
        return EDGEMAP_DENSE(G, frontier, Update, Cond);
    else:
        return EDGEMAP_SPARSE(G, frontier, Update, Cond);

• EDGEMAP_SPARSE: loop through outgoing edges of frontier vertices in parallel
• EDGEMAP_DENSE: loop through incoming edges of “unexplored” vertices (in parallel), breaking early if possible
• Not limited to BFS!
• Generalization to other problems: Bellman-Ford shortest paths, betweenness centrality, graph eccentricity estimation, connected components, PageRank
• Users need not worry about this
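The sparse/dense switch can be illustrated with a sequential Python sketch. Everything here is an assumption made for illustration rather than Ligra's implementation: the function names, passing `threshold` as an explicit parameter, building `in_edges` inside `bfs`, and the toy graph. The sparse variant pushes along out-edges of the frontier; the dense variant scans the in-edges of each vertex that still satisfies `cond` and stops as soon as it no longer does.

```python
def edge_map_sparse(out_edges, frontier, update, cond):
    """Loop through the outgoing edges of the frontier vertices."""
    result = []
    for s in frontier:
        for d in out_edges[s]:
            if cond(d) and update(s, d):
                result.append(d)
    return result


def edge_map_dense(in_edges, frontier, update, cond):
    """Loop through the incoming edges of vertices that still satisfy cond,
    leaving a vertex's edge list as soon as cond stops holding."""
    in_frontier = set(frontier)
    result = []
    for d in range(len(in_edges)):
        added = False
        for s in in_edges[d]:
            if not cond(d):
                break  # d needs no more updates: skip its remaining in-edges
            if s in in_frontier and update(s, d) and not added:
                result.append(d)
                added = True
    return result


def edge_map(out_edges, in_edges, frontier, update, cond, threshold):
    # |frontier| + sum of out-degrees, as in the EDGEMAP pseudocode.
    work = len(frontier) + sum(len(out_edges[v]) for v in frontier)
    if work > threshold:
        return edge_map_dense(in_edges, frontier, update, cond)
    return edge_map_sparse(out_edges, frontier, update, cond)


def bfs(out_edges, r, threshold):
    n = len(out_edges)
    in_edges = [[] for _ in range(n)]
    for s in range(n):
        for d in out_edges[s]:
            in_edges[d].append(s)
    parents = [-1] * n

    def update(s, d):
        if parents[d] == -1:
            parents[d] = s
            return True
        return False

    def cond(i):
        return parents[i] == -1

    parents[r] = r
    frontier = [r]
    while frontier:
        frontier = edge_map(out_edges, in_edges, frontier, update, cond, threshold)
    return parents


# Hypothetical graph; the three thresholds force sparse-only, dense-only, and mixed runs.
out_edges = [[1, 2], [3], [3, 4], [], [0], []]
always_sparse = bfs(out_edges, 0, threshold=10**9)
always_dense = bfs(out_edges, 0, threshold=0)
mixed = bfs(out_edges, 0, threshold=4)
print(always_sparse, always_dense, mixed)
```

On this graph all three runs produce the same parent array; in general the two methods may pick different (equally valid) parents, but they reach the same set of vertices.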

PageRank

PageRank in Ligra

p_curr = {1/|V|, …, 1/|V|}; p_next = {0, …, 0}; diff = {};

procedure UPDATE(s, d):
    return atomic_increment(p_next[d], p_curr[s] / degree(s));

procedure COMPUTE(i):
    p_next[i] = α ∙ p_next[i] + (1 - α) ∙ (1/|V|);
    diff[i] = abs(p_next[i] - p_curr[i]);
    p_curr[i] = 0;
    return 1;

procedure PageRank(G, α, ε):
    frontier = {0, …, |V|-1};
    error = ∞
    while (error > ε):
        frontier = EDGEMAP(G, frontier, UPDATE, COND_true);
        frontier = VERTEXMAP(frontier, COMPUTE);
        error = sum of diff entries;
        swap(p_curr, p_next)
    return p_curr;
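A sequential Python sketch of the PageRank pseudocode above, for experimentation only. The assumptions: the EdgeMap is written as a plain double loop because COND_true accepts every vertex, `atomic_increment` becomes an ordinary `+=`, every vertex in the made-up example graph has at least one out-edge, and the parameter values are arbitrary.

```python
def pagerank(out_edges, alpha, eps):
    n = len(out_edges)
    p_curr = [1.0 / n] * n
    p_next = [0.0] * n
    diff = [0.0] * n

    def update(s, d):
        # Sequential replacement for atomic_increment.
        p_next[d] += p_curr[s] / len(out_edges[s])
        return True

    def compute(i):
        p_next[i] = alpha * p_next[i] + (1 - alpha) * (1.0 / n)
        diff[i] = abs(p_next[i] - p_curr[i])
        p_curr[i] = 0.0  # becomes the zeroed p_next after the swap
        return True

    frontier = list(range(n))
    error = float("inf")
    while error > eps:
        # EDGEMAP with UPDATE and a condition that is always true.
        for s in frontier:
            for d in out_edges[s]:
                update(s, d)
        # VERTEXMAP with COMPUTE; every vertex stays in the frontier.
        frontier = [i for i in frontier if compute(i)]
        error = sum(diff)
        p_curr, p_next = p_next, p_curr
    return p_curr


# Hypothetical 3-vertex graph in which every vertex has an out-edge.
out_edges = [[1, 2], [2], [0]]
ranks = pagerank(out_edges, alpha=0.85, eps=1e-10)
print(ranks)
```

Because the frontier always contains every vertex, each iteration touches all edges, which is what motivates the sparse PageRank-Delta variant.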

PageRank

• Sparse version?
• PageRank-Delta: Only update vertices whose PageRank value has changed by more than some Δ-fraction (discussed in GraphLab and McSherry WWW ’05)

PageRank-Delta in Ligra

PR = {1/|V|, …, 1/|V|};

nghSum = {0, …, 0};

Change = {}; //store changes in PageRank values

procedure UPDATE(s, d): //passed to EdgeMap
    return atomic_increment(nghSum[d], Change[s] / degree(s));

procedure COMPUTE(i): //passed to VertexMap
    Change[i] = α ∙ nghSum[i];
    PR[i] = PR[i] + Change[i];
    return (abs(Change[i]) > Δ); //check if absolute value of change is big enough
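The slide shows only UPDATE and COMPUTE, so the driver loop in the following Python sketch is an assumption rather than the talk's code. In particular, these choices are hypothetical: `Change` starts equal to the initial ranks, the first round adds the (1 - α)/|V| term so that later rounds can propagate pure differences, COMPUTE is applied to all vertices each round, `nghSum` is reset after each round, and the loop stops when the frontier is empty or after `max_iters` rounds.

```python
def pagerank_delta(out_edges, alpha, delta, max_iters=200):
    n = len(out_edges)
    pr = [1.0 / n] * n
    ngh_sum = [0.0] * n
    change = [1.0 / n] * n  # assumption: initial change equals the initial rank
    frontier = list(range(n))
    first_round = True
    for _ in range(max_iters):
        if not frontier:
            break
        # EdgeMap with UPDATE: only frontier vertices push their change.
        for s in frontier:
            for d in out_edges[s]:
                ngh_sum[d] += change[s] / len(out_edges[s])
        # VertexMap with COMPUTE over all vertices.
        next_frontier = []
        for i in range(n):
            if first_round:
                # Assumed first-round handling: one full PageRank step.
                new_pr = alpha * ngh_sum[i] + (1 - alpha) / n
                change[i] = new_pr - pr[i]
                pr[i] = new_pr
            else:
                change[i] = alpha * ngh_sum[i]
                pr[i] = pr[i] + change[i]
            if abs(change[i]) > delta:
                next_frontier.append(i)
            ngh_sum[i] = 0.0  # reset for the next round
        frontier = next_frontier
        first_round = False
    return pr


# Hypothetical 3-vertex graph in which every vertex has an out-edge.
out_edges = [[1, 2], [2], [0]]
pr = pagerank_delta(out_edges, alpha=0.85, delta=1e-12)
print(pr)
```

With a very small Δ the result approaches ordinary PageRank; a larger Δ shrinks the frontier sooner and trades accuracy for less work per round.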

Ligra Performance

• Ligra performance close to hand-written code
• Faster than distributed-memory on per-core basis
• Several shared-memory graph processing systems subsequently developed: Galois [SOSP ’13], X-Stream [SOSP ’13], PRISM [SPAA ’14], Polymer [PPoPP ’15], Ringo [SIGMOD ’15]

[Figure: running time (seconds) on the Twitter graph (41M vertices, 1.5B edges) for PageRank (1 iteration), BFS, and Connected Components, comparing GraphLab, Ligra (40-core machine), and hand-written Cilk/OpenMP (40-core machine). Annotations on the chart: (64 x 8-cores), (64 x 32-cores), (16 x 8-cores), and one bar labeled 244 seconds.]

Large Graphs

6.6B edges

~20B edges (latest version)

~150B edges

• All fit in a Terabyte of memory; can fit on commodity shared memory machine

• What if you don’t have that much RAM?

[Figure: running time versus memory required, with available RAM marked]

Ligra+: Adding Graph Compression to Ligra

• Difference encoding (using variable-length codes) for sorted edges per vertex
• Modify EdgeMap: parallel edge decoding on-the-fly
• All hidden from the user!
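Difference encoding with a variable-length code can be sketched as follows. This Python example is a simplified illustration, not the Ligra+ format: it assumes a byte code with 7 data bits per byte, stores the first neighbor as a gap from 0, and uses made-up neighbor IDs. Decoding is written as a generator so that edges are produced on the fly while iterating.

```python
def encode_varint(value, out):
    """Append a non-negative integer to out using 7 data bits per byte."""
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return


def encode_neighbors(neighbors):
    """Difference-encode a sorted neighbor list into a byte string."""
    out = bytearray()
    prev = 0
    for v in neighbors:
        encode_varint(v - prev, out)  # store the gap, not the ID
        prev = v
    return bytes(out)


def decode_neighbors(data):
    """Yield the neighbors one at a time while scanning the bytes."""
    prev = 0
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += value  # add the gap back to recover the ID
            yield prev
            value = 0
            shift = 0


# Hypothetical sorted neighbor list of one vertex.
neighbors = [1000, 1003, 1004, 1200, 70000]
encoded = encode_neighbors(neighbors)
print(len(encoded))  # 9 bytes, versus 20 bytes as five 32-bit integers
print(list(decode_neighbors(encoded)))
```

Sorting the edges keeps the gaps small, which is what lets most of them fit in a single byte.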

Ligra+: Adding Graph Compression to Ligra

[Figure: two charts comparing Ligra and Ligra+. Left: space relative to Ligra (axis 0 to 1) on the graphs soc-LJ, cit-Patents, com-LJ, com-O..., nlpkkt240, Twitter, uk-union, and Yahoo. Right: 40-core time relative to Ligra (axis 0.6 to 1.4) for BFS, Betweenness, Radii, Components, PageRank, and Bellman-Ford.]

• Cost of decoding on-the-fly?
• Memory bottleneck a bigger issue as graph algorithms are memory-bound

Conclusion

• Ligra: lightweight graph processing framework for shared memory
  • “frontier-based” algorithms
  • Switches computation based on frontier size
• Ligra+: extension which incorporates graph compression
  • Reduces space usage and improves parallel performance
• Code: http://github.com/jshun/ligra
• An Evaluation of Parallel Eccentricity Estimation Algorithms on Undirected Real-World Graphs [KDD 2015], RT06 Tue 13:00-15:00 Social and Graphs 2

Thank you!

http://github.com/jshun/ligra
