Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Page 1:

Sparrow: Distributed Low-Latency Spark Scheduling

Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Page 2:

Outline

The Spark scheduling bottleneck

Sparrow’s fully distributed, fault-tolerant technique

Sparrow’s near-optimal performance

Page 3:

Spark Today

[Diagram: User 1, User 2, and User 3; Spark Context (Query Compilation, Storage, Scheduling); six Workers]

Page 5:

Job Latencies Rapidly Decreasing

[Chart: job latency on a scale from 10 min. down to 1 ms (axis marks: 10 min., 10 sec., 100 ms, 1 ms)]

2004: MapReduce batch job

2009: Hive query

2010: Dremel query

2012: Impala query

2010: In-memory Spark query

2013: Spark streaming

Page 6:

Job latencies rapidly decreasing

Page 7:

Job latencies rapidly decreasing

+ Spark deployments growing in size

Scheduling bottleneck!

Page 8:

Spark scheduler throughput: 1500 tasks / second

Task Duration    Cluster size (# 16-core machines)
10 second        1000
1 second         100
100 ms           10
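The cluster sizes on this slide appear to follow from simple arithmetic. The sketch below assumes every core runs one task at a time and stays fully busy, so a scheduler that launches 1500 tasks per second can keep 1500 x (task duration) cores occupied. The function name `max_machines` is hypothetical, and the slide's figures read as rounded versions of these results.

```python
import math

def max_machines(throughput_tasks_per_s: float,
                 task_duration_s: float,
                 cores_per_machine: int = 16) -> int:
    """Largest cluster (in machines) that one scheduler can keep busy.

    Assumption: each core runs one task at a time, so the scheduler must
    launch (total cores / task duration) tasks per second to avoid idle cores.
    """
    busy_cores = throughput_tasks_per_s * task_duration_s
    return math.floor(busy_cores / cores_per_machine)

for label, duration_s in [("10 second", 10.0), ("1 second", 1.0), ("100 ms", 0.1)]:
    print(f"{label:>9} tasks: about {max_machines(1500, duration_s)} machines")
```

Shorter tasks shrink the cluster a single scheduler can serve in direct proportion, which is the bottleneck the talk goes on to diagnose.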

Page 9:

Optimizing the Spark Scheduler

0.8: Monitoring code moved off critical path

0.8.1: Result deserialization moved off critical path

Future improvements may yield 2-3x higher throughput

Page 10:

Is the scheduler the bottleneck in my cluster?

Page 11:

[Diagram: six Workers and a Cluster Scheduler, with Task launch and Task completion messages]

Page 13:

[Diagram: six Workers and a Cluster Scheduler, with Task launch and Task completion messages and the Scheduler delay marked]

Page 16:

Spark Today

[Diagram: User 1, User 2, and User 3; Spark Context (Query Compilation, Storage, Scheduling); six Workers]

Page 17:

Future Spark

[Diagram: User 1, User 2, and User 3, each with its own Scheduler and Query compilation; six Workers]

Benefits:

High throughput

Fault tolerance

Page 18:

Future Spark

[Diagram: User 1, User 2, and User 3, each with its own Scheduler and Query compilation; six Workers]

Storage: Tachyon

Page 19:

Scheduling with Sparrow

[Diagram: four Schedulers, one of them with a Stage; six Workers]

Page 20:

Batch Sampling

[Diagram: a Stage, four Schedulers, and six Workers]

Place m tasks on the least loaded of 2m workers

4 probes (d = 2)
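The placement rule on this slide can be sketched in a few lines. This is a minimal illustration, not Sparrow's implementation: it assumes each probe returns the worker's current queue length, and the function name `batch_sample` and its parameters are hypothetical.

```python
import random

def batch_sample(queue_lengths, m, d=2, rng=random):
    """Place m tasks on the least loaded of d*m randomly probed workers.

    queue_lengths: list of current queue lengths, indexed by worker id.
    Returns the ids of the m chosen workers and updates their queue lengths.
    Requires d*m <= number of workers.
    """
    # Probe d*m distinct workers chosen uniformly at random.
    probed = rng.sample(range(len(queue_lengths)), d * m)
    # Rank the probed workers by reported queue length, shortest first.
    probed.sort(key=lambda w: queue_lengths[w])
    chosen = probed[:m]
    for w in chosen:
        queue_lengths[w] += 1
    return chosen

queues = [3, 0, 2, 5, 1, 4]
print(batch_sample(queues, m=2, d=2))  # 4 probes for a 2-task stage
```

Sharing the probes across the whole stage means one heavily loaded probe does not force any single task onto a bad worker, as it could if each task were probed independently.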

Page 21:

Queue length poor predictor of wait time

[Diagram: two Workers with queued tasks of 80 ms, 155 ms, and 530 ms]

Poor performance on heterogeneous workloads
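A small worked example shows why queue length can mislead. The transcript does not preserve which worker holds which task, so the split below (two short tasks on one worker, one long task on the other) is an assumption made for illustration, as are the names `worker_1` and `worker_2`.

```python
# Assumed layout: worker_1 queues the 80 ms and 155 ms tasks,
# worker_2 queues the single 530 ms task.
queues_ms = {"worker_1": [80, 155], "worker_2": [530]}

# Choosing by queue length picks the worker with the fewest queued tasks.
by_queue_length = min(queues_ms, key=lambda w: len(queues_ms[w]))

# Choosing by actual wait picks the worker that frees up soonest.
by_wait_time = min(queues_ms, key=lambda w: sum(queues_ms[w]))

print(by_queue_length, sum(queues_ms[by_queue_length]), "ms wait")
print(by_wait_time, sum(queues_ms[by_wait_time]), "ms wait")
```

Under this assumed layout, the shorter queue carries the longer wait, so a probe that reports only queue length sends the task to the slower worker.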

Page 22:

Late Binding

[Diagram: a Stage, four Schedulers, and six Workers]

Place m tasks on the least loaded of dm workers

4 probes (d = 2)

Page 23:

Late Binding

[Diagram: a Stage, four Schedulers, and six Workers]

Place m tasks on the least loaded of dm workers

4 probes (d = 2)

Page 24:

Late Binding

[Diagram: a Stage, four Schedulers, and six Workers; a Worker requests a task]

Place m tasks on the least loaded of dm workers
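Late binding can be sketched as follows. The slides show only that a worker requests a task, so the details here are assumptions: the scheduler leaves a placeholder on each of the d*m probed workers, each worker asks for a task once it is free, and the scheduler hands its m tasks to the first m workers that ask. The name `late_binding` and the `time_until_free_ms` input are hypothetical.

```python
import random

def late_binding(time_until_free_ms, m, d=2, rng=random):
    """Assign m tasks to the first m of d*m probed workers to become free.

    time_until_free_ms: time each worker needs to drain its current queue,
    indexed by worker id (unknown to the scheduler at probe time).
    Returns (assignments, unused): task id -> worker id, plus the probed
    workers whose placeholder is dropped because all tasks were taken.
    """
    probed = rng.sample(range(len(time_until_free_ms)), d * m)
    # Workers request a task in the order in which they become free.
    request_order = sorted(probed, key=lambda w: time_until_free_ms[w])
    assignments = {task: worker for task, worker in enumerate(request_order[:m])}
    unused = request_order[m:]
    return assignments, unused

assignments, unused = late_binding([235, 530, 0, 40], m=2, d=2)
print(assignments, unused)
```

Because tasks are bound only when a worker is actually ready, the scheduler never has to estimate wait time from queue length.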

Page 25:

What about constraints?

Page 26:

Per-Task Constraints

[Diagram: a Stage, four Schedulers, and six Workers]

Probe separately for each task
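A minimal sketch of probing separately for each task, assuming a constraint is expressed as the set of workers on which a task may run (for example, the machines that hold its input data). The function name `place_with_constraints` and the representation of constraints are hypothetical.

```python
import random

def place_with_constraints(queue_lengths, task_constraints, d=2, rng=random):
    """Probe d workers per task, drawn only from that task's allowed set.

    task_constraints: dict mapping task id -> set of allowed worker ids.
    Returns a dict mapping task id -> chosen worker id.
    """
    placement = {}
    for task, allowed in task_constraints.items():
        candidates = sorted(allowed)
        # Sample without replacement; probe every candidate if fewer than d exist.
        probes = rng.sample(candidates, min(d, len(candidates)))
        best = min(probes, key=lambda w: queue_lengths[w])
        queue_lengths[best] += 1
        placement[task] = best
    return placement

queues = [4, 0, 3, 1, 2, 5]
print(place_with_constraints(queues, {"t0": {0, 1}, "t1": {2, 3}}))
```

Probes can no longer be shared across the stage in this case, because a worker that is valid for one task may not be valid for another.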

Page 27:

Technique Recap

Batch sampling

+ Late binding

+ Constraints

[Diagram: four Schedulers and six Workers]

Page 28:

How well does Sparrow perform?

Page 29:

How does Sparrow compare to Spark’s native scheduler?

100 16-core EC2 nodes, 10 tasks/job, 10 schedulers, 80% load

Page 30:

TPC-H Queries: Background

TPC-H: Common benchmark for analytics workloads

Sparrow

Spark

Shark: SQL execution engine

Page 31:

TPC-H Queries

100 16-core EC2 nodes, 10 schedulers, 80% load

[Chart: percentiles shown: 5, 25, 50, 75, 95]

Within 12% of ideal

Median queuing delay of 9 ms

Page 32:

Policy Enforcement

Priorities: Serve queues based on strict priorities

[Diagram: Worker with High Priority and Low Priority queues]

Fair Shares: Serve queues using weighted fair queuing

[Diagram: Worker with queues for User A (75%) and User B (25%)]
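Both policies can be enforced locally at each worker by choosing which queue to serve next. The sketch below is an illustration under stated assumptions, not Sparrow's code: the fair-share part is a simplified stand-in for weighted fair queuing that serves the backlogged user with the smallest tasks-served-to-weight ratio and ignores task length. The names `next_by_priority` and `WeightedFairWorker` are hypothetical.

```python
from collections import deque

def next_by_priority(queues):
    """Strict priorities: serve the first non-empty queue.

    queues: list of deques ordered from highest to lowest priority.
    """
    for q in queues:
        if q:
            return q.popleft()
    return None

class WeightedFairWorker:
    """Simplified weighted fair sharing across per-user queues."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.queues = {user: deque() for user in weights}
        self.served = {user: 0 for user in weights}

    def submit(self, user, task):
        self.queues[user].append(task)

    def next_task(self):
        backlogged = [u for u in self.queues if self.queues[u]]
        if not backlogged:
            return None
        # Serve the user that is furthest below its weighted share.
        user = min(backlogged, key=lambda u: self.served[u] / self.weights[u])
        self.served[user] += 1
        return self.queues[user].popleft()

worker = WeightedFairWorker({"A": 0.75, "B": 0.25})
for i in range(8):
    worker.submit("A", ("A", i))
    worker.submit("B", ("B", i))
print([worker.next_task()[0] for _ in range(8)])
```

While both users stay backlogged, User A receives roughly three slots for each one given to User B, matching the 75% and 25% shares on the slide.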

Page 33:

Weighted Fair Sharing

Page 34:

Fault Tolerance

[Diagram: Scheduler 1, Scheduler 2, Spark Client 1, and Spark Client 2, with a ✗ marking a failure]

Timeout: 100 ms

Failover: 5 ms

Re-launch queries: 15 ms
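The client side of this recovery path might look like the sketch below. It is a rough illustration only: the slide gives the timings but not the mechanism, so the class `FailoverClient`, its methods, and the idea that the client keeps a list of in-flight queries to re-launch are all assumptions. Scheduler liveness is simulated with a set of names rather than real network calls.

```python
TIMEOUT_MS = 100  # detection timeout taken from the slide

class FailoverClient:
    """Hypothetical client that switches schedulers after a timeout."""

    def __init__(self, schedulers):
        self.schedulers = list(schedulers)  # names, in order of preference
        self.current = self.schedulers[0]
        self.in_flight = []

    def submit(self, query):
        self.in_flight.append(query)
        return self.current

    def on_timeout(self, alive):
        """Called when the current scheduler is silent for TIMEOUT_MS.

        alive: set of scheduler names that still respond (simulated).
        Returns the new scheduler and the queries to re-launch on it.
        """
        for name in self.schedulers:
            if name in alive:
                self.current = name
                return name, list(self.in_flight)
        raise RuntimeError("no scheduler available")

client = FailoverClient(["scheduler_1", "scheduler_2"])
client.submit("q1")
client.submit("q2")
print(client.on_timeout(alive={"scheduler_2"}))
```

Failover can be this simple when schedulers share no state, since any other scheduler can accept the re-launched queries.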

Page 35:

Making Sparrow feature-complete

Interfacing with UI

Delay scheduling

Speculation

Page 36:

(1) Diagnosing a Spark scheduling bottleneck

(2) Distributed, fault-tolerant scheduling with Sparrow

www.github.com/radlab/sparrow

[Diagram: four Schedulers and six Workers]