Resource Management and Query Optimization in the Cloud Centralized Resource Management [YARN,...

download Resource Management and Query Optimization in the Cloud Centralized Resource Management [YARN, Mesos,

of 48

  • date post

    17-Apr-2020
  • Category

    Documents

  • view

    0
  • download

    0

Embed Size (px)

Transcript of Resource Management and Query Optimization in the Cloud Centralized Resource Management [YARN,...

  • cluster ROI Consolidate workloads

    • Wide variety

    Workload Heterogeneity in Shared Clusters

  • APIs

    high cluster utilization

    centralized distributed

    Resource Management in Shared Clusters

  • Centralized Resource Management [YARN, Mesos, Omega, Borg]

    Node Manager

    Node Manager

    Node Manager

  • Centralized Resource Management [YARN, Mesos, Omega, Borg]

    Node Manager

    Node Manager

    Node Manager

  • Centralized Resource Management [YARN, Mesos, Omega, Borg]

    Node Manager

    Node Manager

    Node Manager

    1. Request

  • Centralized Resource Management [YARN, Mesos, Omega, Borg]

    Node Manager

    Node Manager

    Node Manager

    1. Request

    2. Allocation

  • Centralized Resource Management [YARN, Mesos, Omega, Borg]

    Node Manager

    Node Manager

    Node Manager

    1. Request

    2. Allocation

    3. Start task

  • Distributed Resource Management [Apollo, Sparrow]

    Node Manager

    Node Manager

    Node Manager

  • Distributed Resource Management [Apollo, Sparrow]

    Node Manager

    Node Manager

    Node Manager

  • Distributed Resource Management [Apollo, Sparrow]

    Node Manager

    Node Manager

    Node Manager

  • Distributed Resource Management [Apollo, Sparrow]

    Node Manager

    Node Manager

    Node Manager

  • Centralized vs. Distributed Scheduling

    Centralized Distributed

    Workload heterogeneity 

    Task placement 

    Enforcing scheduling

    invariants 

    Allocation latency 

    Slot utilization 

    Scalability 

  • • “Trade performance guarantees for allocation latency”

    choose among scheduling types Based on job type, job characteristics, cluster load, etc.

    Mercury provides a programmatic way to use otherwise idle resources

    Mercury: Key Insight

  • • “Trade performance guarantees for allocation latency”

    choose among scheduling types Based on job type, job characteristics, cluster load, etc.

    Mercury provides a programmatic way to use otherwise idle resources

    Mercury: Key Insight

    Gains over YARN:

    Up to 40% task throughput

    Up to 66% mean job latency

  • Mercury Architecture (Conceptual)

    Mercury Runtime

    Mercury Runtime

    Mercury Runtime

    Mercury Resource Management Framework •

  • Mercury Architecture (Conceptual)

    Mercury Runtime

    Mercury Runtime

    Mercury Runtime

    Mercury Resource Management Framework •

  • Mercury Architecture (Conceptual)

    Mercury Runtime

    Mercury Runtime

    Mercury Runtime

    Mercury Resource Management Framework •

    resource type

  • Mercury Architecture (Conceptual)

    Mercury Runtime

    Mercury Runtime

    Mercury Runtime

    Mercury Resource Management Framework •

    resource type

  • Container Types

    GUARANTEED containers

    QUEUEABLE containers

    • opportunistically

  • Container Types

    GUARANTEED containers

    QUEUEABLE containers

    • opportunistically

    central

    distributed

  • Use of Container Types in DAGs: Examples

  • • LocalRM

    • Queuing

    • Framework

    • Application

    Mercury Architecture over YARN

  • GUARANTEED Request and Allocation

  • GUARANTEED Request and Allocation

  • GUARANTEED Request and Allocation

    request(GUARANTEED, …)

  • GUARANTEED Request and Allocation

    request(GUARANTEED, …)

    allocate(…)

  • GUARANTEED Request and Allocation

    request(GUARANTEED, …)

    allocate(…)

  • GUARANTEED Request and Allocation

    start(GUARANTEED, …)

    request(GUARANTEED, …)

    allocate(…)

  • QUEUEABLE Request and Allocation

  • QUEUEABLE Request and Allocation

  • QUEUEABLE Request and Allocation

    request(QUEUEABLE, …)

  • QUEUEABLE Request and Allocation

    request(QUEUEABLE, …)allocate(…)

  • QUEUEABLE Request and Allocation

    start(QUEUEABLE, …)

    request(QUEUEABLE, …)allocate(…)

  • Task Execution: Conflict Resolution two priorities

    types of schedulers shared resources

  • Application Policies

    • container type to be requested for each task

  • Framework Policies

  • rebalance

    reordering job arrival time

    QUEUEABLE containers per node

    Load Shaping Policies

    Mercury Runtime

    Mercury Runtime

    Mercury Runtime

  • Experimental Setup

  • Task Throughput for Increasing Task Duration

  • Cosmos-based Workload: Task Throughput

  • Cosmos-based Workload: Job Latency

  • Application Engines

    M/R AM REEFTezSpark Runtime

    Cluster-wide resource management: YARN++

    YARN + Federation

    YARN + Rayon

    YARN + Mercury

    YARN + Mercury

    YARN + Mercury YARN + Mercury YARN + Mercury

    Per-job/framework Resource Management

    Hive …Storm Giraph PigSpark

    The Bigger Picture

  • Conclusion