Low Latency Execution For Apache Spark
-
Upload
jen-aman -
Category
Data & Analytics
-
view
2.144 -
download
0
Transcript of Low Latency Execution For Apache Spark
![Page 1: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/1.jpg)
Low latency execution for apache spark
Shivaram Venkataraman, Aurojit Panda, Kay Ousterhout
![Page 2: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/2.jpg)
Who am I ?
PhD candidate, AMPLab UC Berkeley Dissertation: System design for large scale machine learning Apache Spark PMC Member. Contributions to Spark core, MLlib, SparkR
![Page 3: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/3.jpg)
Low latency: SPARK STREAMING
![Page 4: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/4.jpg)
Low latency: SPARK STREAMING
![Page 5: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/5.jpg)
Low latency: SPARK STREAMING
![Page 6: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/6.jpg)
Low latency: execution
Scheduler
Few milliseconds
Large Clusters
![Page 7: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/7.jpg)
Low latency execution engine for iterative workloads
Machine learning Stream processing
State
This talk
![Page 8: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/8.jpg)
Execution Models
![Page 9: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/9.jpg)
Centralized task scheduling
Lineage, Parallel Recovery
Microsoft Dryad
Execution models: batch processing
Task
Control Message Driver
Driver: Coordinate shuffles, data sharing
SHUFFLE
Network Transfer
Iteration
B ROADCAST
![Page 10: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/10.jpg)
Execution models: parallel operators
Long-lived operators
Checkpointing based fault tolerance
Naiad
SHUFFLE
SHUFFLE
Iteration
Task
Control Message Driver
Network Transfer
Low Latency
![Page 11: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/11.jpg)
SCALING behavior
0 50
100 150 200 250
4 8 16 32 64 128
Tim
e (m
s)
Machines
Network Wait Compute Task Fetch Scheduler Delay
Cluster: 4 core, r3.xlarge machines Workload: Sum of 10k numbers per-core
Median-task time breakdown
![Page 12: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/12.jpg)
Can we achieve low latency with Apache Spark ?
![Page 13: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/13.jpg)
DESIGN INSIGHT Fine-grained execution with coarse-grained scheduling
![Page 14: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/14.jpg)
DRIZZLE
SHUFFLE
B ROADCAST
Iteration
Batch Scheduling Pre-Scheduling Shuffles Distributed Shared Variables
![Page 15: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/15.jpg)
DAG scheduling
Assign tasks to hosts using (a) locality preferences (b) straggler mitigation (c) fair sharing etc.
Tasks Host1
Host2
Driver
Host1
Host2
Serialize & Launch
Host Metadata
Scheduler
Same DAG structure for many iterations Can reuse scheduling decisions
![Page 16: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/16.jpg)
Batch scheduling
Schedule a batch of iterations at once
Fault tolerance, scheduling at batch boundaries
1 stage in each iteration
b = 2
![Page 17: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/17.jpg)
How much does this help ?
1
10
100
1000
4 8 16 32 64 128
Tim
e / I
ter (
ms)
Machines
Apache Spark Drizzle-10 Drizzle-50 Drizzle-100
Workload: Sum of 10k numbers per-core
Single Stage Job, 100 iterations – Varying Drizzle batch size
![Page 18: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/18.jpg)
DRIZZLE
SHUFFLE
B ROADCAST
Iteration
Batch Scheduling Pre-Scheduling Shuffles Distributed Shared Variables
![Page 19: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/19.jpg)
coordinating shuffles: APACHE SPARK
Task
Control Message
Data Message
Driver
Intermediate Data
Driver sends metadata Tasks pull data
![Page 20: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/20.jpg)
coordinating shuffles: PRE-SCHEDULING
Pre-schedule down-stream tasks on executors
Trigger tasks once dependencies are met
Task
Control Message Data Message
Driver
Intermediate Data
Pre-scheduled task
![Page 21: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/21.jpg)
Push-metadata, pull-data
Coalesce metadata across cores Transfer during downstream stage
Data Message Intermediate Data Metadata Message
![Page 22: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/22.jpg)
Micro-benchmark: 2-stages
0 50
100 150 200 250 300 350
4 8 16 32 64 128
Tim
e / I
ter (
ms)
Machines
Apache Spark Drizzle-Pull
Workload: Sum of 10k numbers per-core
100 iterations
![Page 23: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/23.jpg)
DRIZZLE
SHUFFLE
B ROADCAST
Iteration
Batch Scheduling Pre-Scheduling Shuffles Distributed Shared Variables
![Page 24: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/24.jpg)
Shared variables
Fine-grained updates to shared data Distributed shared variables
- MPI using MPI_AllReduce - Bloom, CRDTs - Parameter servers, key-value stores - reduce followed by broadcast
![Page 25: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/25.jpg)
Drizzle: Replicated shared variables
Custom merge strategies 1
Commutative updates e.g. vector addition
Task Shared Variable
MERGE
1
1
1
1
2
2
2
2
![Page 26: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/26.jpg)
Enabling async updates
Synchronous
Asynchronous semantics within a batch Synchronous semantics enforced at batch boundaries
Task Shared Variable
1 2
Async Stale versions 1 1 2
3
Merge
![Page 27: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/27.jpg)
Evaluation
Micro-benchmarks - Single stage - Multiple stages - Shared variables
End-to-end experiments - Streaming benchmarks - Logistic Regression
Implemented on Apache Spark 1.6.1 Integrations with Spark Streaming, MLlib
![Page 28: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/28.jpg)
USING DRIZZLE API
DAGScheduler.scala defrunJob( rdd:RDD[T], func:Iterator[T]=>U)
Internal defrunBatchJob( rdds:Seq[RDD[T]], funcs:Seq[Iterator[T]=>U])
![Page 29: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/29.jpg)
Using drizzle
StreamingContext.scala defthis( sc:SparkContext, batchDuration:Duration)
Spark Streaming defthis( sc:SparkContext, batchDuration:Duration, jobsPerBatch:Int)
![Page 30: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/30.jpg)
b=1 à Batch processing
Choosing batch size
Higher overhead Smaller window for fault tolerance
In practice: Largest batch such that overhead is below fixed threshold e.g., For 128 machines, batch of few seconds is enough
b=N à Parallel operators
Lower overhead Larger window for fault tolerance
![Page 31: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/31.jpg)
Evaluation
Micro-benchmarks - Single stage - Multiple stages - Shared variables
End-to-end experiments - Streaming benchmarks - Logistic Regression
Implemented on Apache Spark 1.6.1 Integrations with Spark Streaming, MLlib
![Page 32: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/32.jpg)
Streaming BENCHMARK
0
25
50
75
100
0 300 600 900 1200 1500
CDF
Event Latency (ms)
Drizzle, 90ms Spark Streaming 560ms
Yahoo Streaming Benchmark: 1M JSON Ad-events / second, 64 machines
Event Latency: Difference between window end, processing end
![Page 33: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/33.jpg)
Mllib: LOGISTIC REGRESSION
0 100 200 300 400 500 600
4 8 16 32 64 128
Tim
e / I
ter (
ms)
Machines
Apache Spark Drizzle
Sparse Logistic Regression on RCV1 677K examples, 47K features
![Page 34: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/34.jpg)
Work In progress
Automatic batch size tuning Open source release Apache Spark JIRA to discuss potential contribution
![Page 35: Low Latency Execution For Apache Spark](https://reader031.fdocuments.net/reader031/viewer/2022030310/58f9a91d760da3da068b6b1d/html5/thumbnails/35.jpg)
conclusion
Overheads when using Apache Spark for streaming, ML workloads Drizzle: Low Latency Execution Engine
- Decouple execution from centralized scheduling - Milliseconds latency for iterative workloads
Workloads / Questions / Contributions
Shivaram Venkataraman [email protected]