2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed...

36
Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Transcript of 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed...

Page 1: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Sparrow Distributed Low-Latency Spark Scheduling

Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Page 2: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Outline

The Spark scheduling bottleneck Sparrow’s fully distributed, fault-tolerant technique Sparrow’s near-optimal performance

Page 3: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Spark Today

Worker

Worker

Worker

Worker

Worker

Worker

Spark Context

User 1 User 2 User 3

Query Compilation

Storage

Scheduling

Page 4: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Spark Today

Worker

Worker

Worker

Worker

Worker

Worker

Spark Context

User 1 User 2 User 3

Query Compilation

Storage

Scheduling

Page 5: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Job Latencies Rapidly Decreasing

10 min. 10 sec. 100 ms 1 ms

2004: MapReduce batch job

2009: Hive query

2010: Dremel Query

2012: Impala query 2010:

In-memory Spark query

2013: Spark

streaming

Page 6: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Job latencies rapidly decreasing

Page 7: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Job latencies rapidly decreasing +

Spark deployments growing in size

Scheduling bottleneck!

Page 8: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Spark scheduler throughput:

1500 tasks / second

1 second 100

100 ms 10

10 second 1000

Task Duration Cluster size

(# 16-core machines)

Page 9: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Optimizing the Spark Scheduler

0.8: Monitoring code moved off critical path 0.8.1: Result deserialization moved off critical path Future improvements may yield 2-3x higher throughput

Page 10: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Is the scheduler the bottleneck in my cluster?

tinyurl.com/sparkdemo

Page 11: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Worker

Worker

Worker

Worker

Worker

Worker

Cluster Scheduler

Task launch

Task completion

tinyurl.com/sparkdemo

Page 12: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Worker

Worker

Worker

Worker

Worker

Worker

Cluster Scheduler

Task launch

Task completion

tinyurl.com/sparkdemo

Page 13: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Worker

Worker

Worker

Worker

Worker

Worker

Cluster Scheduler

Task launch

Task completion

Scheduler delay

tinyurl.com/sparkdemo

Page 14: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica
Page 15: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica
Page 16: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Spark Today

Worker

Worker

Worker

Worker

Worker

Worker

Spark Context

User 1 User 2 User 3

Query Compilation

Storage

Scheduling

Page 17: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Future Spark

Worker

Worker

Worker

Worker

Worker

Worker

User 1 User 2 User 3

Scheduler Query compilation

Scheduler Query compilation

Scheduler Query compilation

Benefits: High throughput Fault tolerance

Page 18: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Future Spark

Worker

Worker

Worker

Worker

Worker

Worker

User 1 User 2 User 3

Scheduler Query compilation

Scheduler Query compilation

Scheduler Query compilation

Storage:

Tachyon

Page 19: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Scheduling with Sparrow

Worker

Worker

Worker

Worker

Worker Scheduler

Scheduler

Scheduler

Scheduler Stage

Worker

Page 20: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Stage

Batch Sampling

Worker

Worker

Worker

Worker

Worker Scheduler

Scheduler

Scheduler

Scheduler

Worker

Place m tasks on the least loaded of 2m workers

4 probes (d = 2)

Page 21: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Queue length poor predictor of wait time

Worker

Worker

80 ms 155 ms

530 ms

Poor performance on heterogeneous workloads

Page 22: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Stage

Late Binding

Worker

Worker

Worker

Worker

Worker Scheduler

Scheduler

Scheduler

Scheduler

Worker

Place m tasks on the least loaded of d�m workers

4 probes (d = 2)

Page 23: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Late Binding

Scheduler

Scheduler

Scheduler

Scheduler

Place m tasks on the least loaded of d�m workers

4 probes (d = 2)

Worker

Worker

Worker

Worker

Worker

Worker

Stage

Page 24: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Late Binding

Scheduler

Scheduler

Scheduler

Scheduler

Place m tasks on the least loaded of d�m workers

Worker requests

task Worker

Worker

Worker

Worker

Worker

Worker

Stage

Page 25: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

What about constraints?

Page 26: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Stage

Per-Task Constraints

Scheduler

Scheduler

Scheduler

Scheduler

Worker

Worker

Worker

Worker

Worker

Worker

Probe separately for each task

Page 27: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Technique Recap

Scheduler

Scheduler

Scheduler

Scheduler

Batch sampling +

Late binding +

Constraints

Worker

Worker

Worker

Worker

Worker

Worker

Page 28: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

How well does Sparrow perform?

Page 29: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

How does Sparrow compare to Spark’s native scheduler?

!"!#"""!$"""!%"""!&"""!'"""!("""

!"!#"""!$"""!%"""!&"""!'"""!("""

)*+,-.+*!/01*!21+3

/4+5!678490-.!21+3

:,485!.490;*!+<=*>7?*8:,488-@A>*4?

100 16-core EC2 nodes, 10 tasks/job, 10 schedulers, 80% load

Page 30: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

TPC-H Queries: Background

TPC-H: Common benchmark for analytics workloads

Sparrow

Spark

Shark: SQL execution engine

Page 31: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

!"!#""!$"""!$#""!%"""!%#""!&"""!&#""!'"""

(& (' () ($%

*+,-./,+!012+!32,4

'%$5!32+674 #&8)!32+674 599$!32+674

*:/6.2 ;-:<<.= >6+:?

TPC-H Queries

100 16-core EC2 nodes, 10 schedulers, 80% load

95

75

25

50

Percentiles

5

Within 12% of ideal Median queuing delay of 9ms

Page 32: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Policy Enforcement

Worker High Priority

Low Priority Worker User A (75%)

User B (25%)

Fair Shares Serve queues using weighted fair

queuing

Priorities Serve queues based on strict

priorities

Page 33: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Weighted Fair Sharing

!"!#"!$""!$#"!%""!%#"!&""!&#"!'""

!" !$" !%" !&" !'" !#"

()**+*,!-./0/

-+12!3/4

5/26!"5/26!$

Page 34: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Fault Tolerance

Scheduler 1

Scheduler 2

Spark Client 1 ✗ Spark Client 2

Timeout: 100ms Failover: 5ms

Re-launch queries: 15ms

!"!#"""!$"""!%"""!&"""

!" !#" !$" !%" !&" !'" !(")*+,!-./

!"!#"""!$"""!%"""!&"""

01,23!2,.456.,!7*+,!-+./

89*:12,;492<!=:*,67!#

;492<!=:*,67!$

Page 35: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

Making Sparrow feature-complete

Interfacing with UI

Delay scheduling

Speculation

Page 36: 2013 12 02 Sparrow - EECS at UC Berkeleykeo/talks/sparrow-spark-summit... · Sparrow Distributed Low-Latency Spark Scheduling Kay Ousterhout, Patrick Wendell, Matei Zaharia, Ion Stoica

(2) Distributed, fault-tolerant scheduling

with Sparrow

www.github.com/radlab/sparrow

Scheduler

Scheduler

Scheduler

Scheduler

Worker

Worker

Worker

Worker

Worker

Worker

(1) Diagnosing a Spark scheduling

bottleneck