Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka


Page 1: Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka

Back-Pressure in Action

Handling High-Burst Workloads with Akka Streams & Kafka

Akara Sucharitakul, PayPal

Anil Gursel, PayPal

Page 2

Intro & Agenda

Crawler Intro & Problem Statements

Crawler Architecture

Infrastructure: Akka Streams, Kafka, etc.

The Goodies

Page 3

High-Level View

[Diagram components: Crawl Jobs, Job DB, Validate, URLCache, Download Process; flow labels: URLs, Timestamps]

Page 4

Requirements

Ever-expanding # of URLs

Can’t crawl all URLs at once

Control over concurrent web GETs

Efficient resource usage

Resilient under high burst

Scales horizontally & vertically

Page 5

Sizing the Crawl Job

Let:
i = Number of crawl URLs in a job
n = Average number of links per page
d = The crawl depth (how many layers to follow links)
u = The max number of URLs to process

Then:
$u = i \cdot n^d$

[Chart: totalURLs (log scale) vs depth, with initialURLs = 1, outLinks = 5]

[Chart: totalURLs (log scale) vs initialURLs (log scale), with depth = 5, outLinks = 5]
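As a rough check of the sizing formula, the sketch below evaluates u = i · n^d for the parameter values used in the charts. The names `CrawlSizing` and `totalUrls` are illustrative only and do not come from the talk.

```scala
// Illustrative helper (not from the talk): evaluates u = i * n^d.
object CrawlSizing {
  /** Upper bound on the number of URLs to process.
    * @param initialUrls i, number of crawl URLs in the job
    * @param outLinks    n, average number of links per page
    * @param depth       d, how many layers of links to follow
    */
  def totalUrls(initialUrls: Long, outLinks: Int, depth: Int): BigInt = {
    require(initialUrls >= 0 && outLinks >= 0 && depth >= 0, "arguments must be non-negative")
    // BigInt avoids overflow, since the result grows exponentially with depth.
    BigInt(initialUrls) * BigInt(outLinks).pow(depth)
  }

  def main(args: Array[String]): Unit = {
    // depth = 5 and outLinks = 5, as in the charts
    println(totalUrls(1, 5, 5))         // 3125
    println(totalUrls(100, 5, 5))       // 312500
    println(totalUrls(10000000L, 5, 5)) // 31250000000
  }
}
```

With 100 initial URLs, depth 5, and an average of 5 out-links per page (the midpoint of a 0 to 10 range), the result is 312500, which agrees with the totalCrawled figure reported on the Back-Pressure page.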

Page 6

The Reactive Manifesto

Responsive

Message Driven

Elastic

Resilient

Page 7

Why Does it Matter?

Responsive: Responds in a deterministic, timely manner

Resilient: Stays responsive in the face of failure – even cascading failures

Elastic: Stays responsive under workload spikes

Message Driven: Basic building block for responsive, resilient, and elastic systems

Page 8

The Right Ingredients

• Kafka
  • Huge persistent buffer for the bursts
  • Load distribution to a very large number of processing nodes
  • Enables horizontal scalability
• Akka Streams
  • High performance, highly efficient processing pipeline
  • Resilient with end-to-end back-pressure
  • Fully asynchronous – utilizes mapAsyncUnordered with Async HTTP client
• Async HTTP client
  • Non-blocking and consumes no threads while waiting
  • Integrates with Akka Streams for a high parallelism, low resource solution

[Diagram labels: Efficient, Resilient, Scale; Akka Streams, Async HTTP, Reactive Kafka]

Page 9

Adding Kafka & Akka Streams

[Diagram components: Crawl Jobs, Job DB, Validate, URLCache, Download Process, Akka Streams; flow labels: URLs, Timestamps]

Page 10

Akka Streams, what???

High performance, pure async, stream processing

Conforms to reactive streams

Simple, yet powerful GraphDSL allows clear stream topology declaration

Central point to understand processing pipeline

Page 11

Crawl Stream

Actual Stream Declaration in Code

    prioritizeSource ~> crawlerFlow ~> bCast0 ~> result ~> bCast ~> outLinksFlow ~> outLinksSink
                                                           bCast ~> dataSinkFlow ~> kafkaDataSink
                                                           bCast ~> hdfsDataSink
                                                           bCast ~> graphFlow ~> merge ~> graphSink
                                       bCast0 ~> maxPage ~> merge
                                       bCast0 ~> retry ~> bCastRetry ~> retryFailed ~> merge
                                                          bCastRetry ~> errorSink

[Stream diagram stages: PrioritizedSource, Crawl, Result, MaxPageReached, Retry, OutLinks, Data, Graph, CheckFail, CheckErr; sinks: OutLinksSink, Kafka DataSink, HDFS DataSink, GraphSink, ErrorSink]
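To show how a topology of this kind is wired, here is a much smaller, self-contained sketch of the same broadcast-and-merge pattern using the Akka Streams GraphDSL. It assumes Akka 2.6 (where an implicit ActorSystem supplies the materializer). The object name, the integer elements, and the `ok`/`retry` flows are hypothetical stand-ins and are not the crawler's actual stages.

```scala
import akka.actor.ActorSystem
import akka.stream.ClosedShape
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Merge, RunnableGraph, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical miniature of a fan-out/fan-in topology; not the crawler's real graph.
object FanOutSketch {
  def run(n: Int): Seq[String] = {
    implicit val system: ActorSystem = ActorSystem("fan-out-sketch")
    try {
      val graph = RunnableGraph.fromGraph(
        GraphDSL.create(Sink.seq[String]) { implicit builder => sink =>
          import GraphDSL.Implicits._

          val bCast = builder.add(Broadcast[Int](2)) // every element goes to both branches
          val merge = builder.add(Merge[String](2))  // the branches are joined again

          // Each branch keeps only the elements it is responsible for.
          val ok    = Flow[Int].filter(_ % 2 == 0).map(i => s"ok-$i")
          val retry = Flow[Int].filter(_ % 2 != 0).map(i => s"retry-$i")

          Source(1 to n) ~> bCast
                            bCast ~> ok    ~> merge ~> sink
                            bCast ~> retry ~> merge
          ClosedShape
        })
      // The materialized value is the Future produced by Sink.seq.
      Await.result(graph.run(), 10.seconds)
    } finally Await.result(system.terminate(), 10.seconds)
  }
}
```

Because Broadcast only emits when all of its outputs signal demand, a slow branch slows the whole graph rather than letting elements pile up, which is the back-pressure behavior the talk relies on.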

Page 12

Resulting Characteristics

Efficient
• Low thread count, controlled by Akka and pure non-blocking async HTTP
• High latency URLs do not block low latency URLs using mapAsyncUnordered
• Well-controlled download concurrency using mapAsyncUnordered
• Thread per concurrent crawl job

Resilient
• Processes only what can be processed – no resource overload
• Kafka as short-term, persistent queue

Scale
• Kafka feeds next batch of URLs to available node cluster
• Pull model – only processes that have capacity will get the load
• Kafka distributes work to large number of processing nodes in cluster
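The concurrency claims above can be illustrated with a small sketch of mapAsyncUnordered, assuming Akka 2.6. The `fetch` function is a hypothetical stand-in for a non-blocking HTTP GET (it only schedules a delayed completion), and the object name and counters are illustrative; none of this is the crawler's actual code.

```scala
import akka.actor.ActorSystem
import akka.pattern.after
import akka.stream.scaladsl.{Sink, Source}
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object BoundedCrawlSketch {
  /** Pushes `urls` through a simulated async download with at most `parallelism`
    * requests outstanding. Returns the results in completion order together with
    * the peak number of simultaneously outstanding requests. */
  def crawl(urls: List[String], parallelism: Int): (Seq[String], Int) = {
    implicit val system: ActorSystem = ActorSystem("bounded-crawl-sketch")
    import system.dispatcher
    val inFlight = new AtomicInteger(0)
    val peak     = new AtomicInteger(0)

    // Hypothetical stand-in for an async HTTP client call: the Future completes
    // after a URL-dependent delay via the scheduler, without occupying a thread.
    def fetch(url: String): Future[String] = {
      val now = inFlight.incrementAndGet()
      peak.accumulateAndGet(now, (a: Int, b: Int) => math.max(a, b))
      val delay = (20 + math.abs(url.hashCode % 5) * 20).millis
      after(delay, system.scheduler)(Future.successful(s"fetched $url"))
        .map { body => inFlight.decrementAndGet(); body }
    }

    try {
      val done = Source(urls)
        .mapAsyncUnordered(parallelism)(fetch) // fast URLs are emitted ahead of slow ones
        .runWith(Sink.seq)
      (Await.result(done, 30.seconds), peak.get)
    } finally Await.result(system.terminate(), 10.seconds)
  }
}
```

The stage calls `fetch` for a new element only when fewer than `parallelism` futures are outstanding, so the upstream source is back-pressured instead of overloading the downloader.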

Page 13

Back-Pressure

[Chart: Queue Size vs Time (seconds)]

[Chart: URLs/sec vs Time (seconds)]

initialURLs : 100
parallelism : 1000
processTime : 1 – 5 s
outLinks : 0 – 10
depth : 5
totalCrawled : 312500

Page 14

Challenges

Training
• Developers not used to E2E stream definitions
• More familiar with deeply nested function calls

Maturity of Infrastructure
• Kafka 0.9 uses fetch as heartbeat
• Slow nodes cause timeout & rebalance
• Solved in 0.10

Page 15

What it would have been…

Bloated, ineffective concurrency control

Lack of well-thought-out and visible processing pipeline

Clumsy code, hard to manage & understand

Low training cost, high project TCO

Dev / Support / Maintenance

Page 16

Bottom Line

Crawl Time Reduced to 1/10th (compared to thread-based architecture)

Page 17

Standardized Reactive Platform

For Large Scale Internet Deployments

Page 18

Efficiency & Resilience meets Standardization

• Monitoring
  • Need to collect metrics, consistently
• Logging
  • Correlation across services
  • Uniformity in logs
• Security
  • Need to apply standard security configuration
• Environment Resolution
  • Staging, production, etc.

Consistency in the face of Heterogeneity

Page 19

squbs is not…

A framework on its own

A programming model – use Akka

Take all or none – Components/patterns can mostly be used independently

Page 20

squbs

Akka for large scale deployments

Bootstrap

Lifecycle management

Loosely-coupled module system

Integration hooks for logging, monitoring, ops integration

Page 21

squbs

Akka for large scale deployments

JSON console

HttpClient with pluggable resolver and monitoring/logging hooks

Test tools and interfaces

Goodies:
- Activators for Scala & Java
- Programming patterns and helpers for Akka and Akka Stream use cases…, and growing

Page 22

PerpetualStream

• Provides a convenience trait to help write streams controlled by system lifecycle
  • Minimal/no message losses
• Register PerpetualStream to make stream start/stop
• Provides customization hooks – especially for how to stop the stream
• Provides killSwitch (from Akka) to be embedded into stream
• Implementers - just provide your stream!

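A non-stop stream; starts and stops with the system

    import akka.Done
    import akka.stream.ClosedShape
    import akka.stream.scaladsl.{GraphDSL, RunnableGraph, Sink, Source}
    import org.squbs.stream.PerpetualStream
    import scala.concurrent.Future

    // Sink.ignore materializes a Future[Done], so that is the stream's materialized type.
    class MyStream extends PerpetualStream[Future[Done]] {

      // Endless counter that wraps around at Int.MaxValue.
      def generator = Iterator.iterate(0) { p => if (p == Int.MaxValue) 0 else p + 1 }
      val source = Source.fromIterator(generator _)
      val ignoreSink = Sink.ignore

      override def streamGraph = RunnableGraph.fromGraph(
        GraphDSL.create(ignoreSink) { implicit builder => sink =>
          import GraphDSL.Implicits._
          // killSwitch is provided by PerpetualStream and stops the stream on shutdown.
          source ~> killSwitch.flow[Int] ~> sink
          ClosedShape
        })
    }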

Page 23

PersistentBuffer/BroadcastBuffer

• Data & indexes in rotating memory-mapped files
• Off-heap rotating file buffer – very large buffers
• Restarts gracefully with no or minimal message loss
• Not as durable as a remote data store, but much faster
• Does not back-pressure upstream beyond data/index writes
• Similar usage to Buffer and Broadcast
• BroadcastBuffer – a FanOutShape decouples each output port making each downstream independent
• Useful if downstream stage blocked or unavailable
  • Kafka is unavailable/rebalancing but system cannot backpressure/deny incoming traffic
• Optional commit stage for at-least-once delivery semantics
• Implementation based on Chronicle Queue

A buffer of virtually unlimited size

Page 24

Summary

• Kafka + Akka Streams + Async I/O = Ideal Architecture for High Bursts & High Efficiency
• Akka Streams
  • Clear view of stream topology
  • Back-pressure & Kafka allow buffering load bursts
• Standardization
  • Walk like a duck, quack like a duck, and manage it like a duck
• squbs: Have the cake, and eat it too, with goodies like
  • PerpetualStream
  • PersistentBuffer
  • BroadcastBuffer

Page 25

Q&A – Feedback Appreciated

Join us on – link from https://github.com/paypal/squbs

@squbs, @S_Akara, @anilgursel
