X-Stream: Edge-centric Graph Processing using Streaming ... Presented by: Amir H. Payberah (SICS)...

download X-Stream: Edge-centric Graph Processing using Streaming ... Presented by: Amir H. Payberah (SICS) X-Stream

of 34

  • date post

    25-Jul-2020
  • Category

    Documents

  • view

    1
  • download

    0

Embed Size (px)

Transcript of X-Stream: Edge-centric Graph Processing using Streaming ... Presented by: Amir H. Payberah (SICS)...

  • X-Stream: Edge-centric Graph Processing using Streaming Partitions

    Amitabha Roy, Ivo Mihailovic, Willy Zwaenepoel (EPFL - SOSP’13)

    Presented by: Amir H. Payberah amir@sics.se

    Feb. 7, 2014

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 1 / 32

  • Graphs

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 2 / 32

  • Big Graphs

    I Large graphs are a subset of the big data problem.

    I Billions of vertices and edges, hundreds of gigabytes.

    I Normally tackled on large clusters. • Pregel, Giraph, GraphLab, PowerGraph ... • Complexity, power consumption ...

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 3 / 32

  • Question

    Could we compute Big Graphs on a single machine?

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 4 / 32

  • Challenges

    I Disk-based processing • Problem: graph traversal = random access • Random access is inefficient for storage

    Eiko Y., and Roy A., “Scale-up Graph Processing: A Storage-centric View”, 2013.

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 5 / 32

  • Challenges

    I Disk-based processing • Problem: graph traversal = random access • Random access is inefficient for storage

    Eiko Y., and Roy A., “Scale-up Graph Processing: A Storage-centric View”, 2013.

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 5 / 32

  • Proposed Solution

    Solution

    X-Stream makes graph accesses sequential.

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 6 / 32

  • X-Stream Contribution

    I Edge-centric scatter-gather model

    I Streaming partitions

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 7 / 32

  • Edge-Centric Scatter-Gather Model

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 8 / 32

  • Scatter-Gather Programming Model

    I State stored in vertices.

    I Vertex operations: • Scatter updates along outgoing edges • Gather updates from incoming edges

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 9 / 32

  • Vertex-Centric Scatter

    I Iterates over vertices

    for each vertex v

    if v has update

    for each edge e from v

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 10 / 32

  • Vertex-Centric Scatter-Gather (1/5)

    for each vertex v

    if v has update

    for each edge e from v

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 11 / 32

  • Vertex-Centric Scatter-Gather (2/5)

    for each vertex v

    if v has update

    for each edge e from v

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 12 / 32

  • Vertex-Centric Scatter-Gather (3/5)

    for each vertex v

    if v has update

    for each edge e from v

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 13 / 32

  • Vertex-Centric Scatter-Gather (4/5)

    for each vertex v

    if v has update

    for each edge e from v

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 14 / 32

  • Vertex-Centric Scatter-Gather (5/5)

    for each vertex v

    if v has update

    for each edge e from v

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 15 / 32

  • Vertex-Centric vs. Edge-Centric Access

    Vertex-centric Edge-centric

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 16 / 32

  • Edge-Centric Scatter

    I Iterates over edges

    for each edge e

    if e.src has update

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 17 / 32

  • Edge-Centric Scatter-Gather (1/5)

    for each edge e

    if e.src has update

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 18 / 32

  • Edge-Centric Scatter-Gather (2/5)

    for each edge e

    if e.src has update

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 19 / 32

  • Edge-Centric Scatter-Gather (3/5)

    for each edge e

    if e.src has update

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 20 / 32

  • Edge-Centric Scatter-Gather (4/5)

    for each edge e

    if e.src has update

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 21 / 32

  • Edge-Centric Scatter-Gather (5/5)

    for each edge e

    if e.src has update

    scatter update along e

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 22 / 32

  • Vertex-Centric vs. Edge-Centric Tradeoff

    I Vertex-centric scatter-gather: EdgeDataRandomAccessBandwidth

    I Edge-centric scatter-gather: Scatters×EdgeDataSequentialAccessBandwidth

    I Sequential Access Bandwidth � Random Access Bandwidth.

    I Few scatter gather iterations for real world graphs.

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 23 / 32

  • Streaming Partitions

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 24 / 32

  • Problem

    I Problem: still have random access to vertex set.

    Solution

    Partition the graph into streaming partitions.

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 25 / 32

  • Problem

    I Problem: still have random access to vertex set.

    Solution

    Partition the graph into streaming partitions.

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 25 / 32

  • Partitioning the Graph (1/2)

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 26 / 32

  • Partitioning the Graph (2/2)

    Random access for free.

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 27 / 32

  • Streaming Partition

    I A subset of the vertices that fits in RAM.

    I All edges whose source vertex is in that subset.

    I No requirement on quality of the partition, e.g., sorting edges.

    I Consists of three sets: vertex set, edge list, and update list.

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 28 / 32

  • Streaming Partition Scatter-Gather

    I The scatter phase iterates over all streaming partitions, rather than over all edges.

    I The gather phase iterates over all streaming partitions, rather than over all updates.

    I The vertex sets and edge lists remain fixed during the entire com- putation.

    I The update list of a partition varies over time.

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 29 / 32

  • X-Stream Edge-Centric Scatter-Gather

    // Scatter phase

    for each streaming_partition p {

    read in vertex set of p

    for each edge e in edge list of p

    edge_scatter(e): append update to Uout

    }

    // Shuffle phase

    for each update u in Uout {

    let p = partition containing target of u

    append u to Uin(p)

    }

    destroy Uout

    // Gatter phase

    for each streaming_partition p {

    read in vertex set of p

    for each update u in Uin(p)

    edge_gather(u): apply update u to u.destination

    destroy Uin(p)

    }

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 30 / 32

  • Summary

    I X-Stream

    I Scatter-Gather model

    I Edge-centric: sequential access to the graph edges

    I Streaming partition: vertex set, edge list, and update list

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 31 / 32

  • Questions?

    Acknowledgement

    Some slides were derived from the slides of Amitabha Roy (EPFL)

    Presented by: Amir H. Payberah (SICS) X-Stream Feb. 7, 2014 32 / 32