Journey into Reactive Streams and Akka Streams

A journey into stream processing with Reactive Streams and Akka Streams

Transcript of Journey into Reactive Streams and Akka Streams

Page 1: Journey into Reactive Streams and Akka Streams

A journey into stream processing with Reactive Streams and Akka Streams

Page 2: Journey into Reactive Streams and Akka Streams

Before we get started...

http://scalaupnorth.com/

Scala Up North, September 25 & 26

• Keynote from Bill Venners

• BoldRadius offering Scala training

http://boldradius.com

Page 3: Journey into Reactive Streams and Akka Streams

What to expect

• Core concepts

• What is a stream?

• Common use cases?

• The Reactive Streams specification

• A deep-dive into Akka Streams

• Code walkthrough and demo

• Q&A

Page 4: Journey into Reactive Streams and Akka Streams

Disclaimer

• I am not a stream processing expert, but I am passionately curious about an alternate approach to common problems

• This is a deep topic; the contents of this talk are a starting point for further exploration

• Feel free to jump in

Page 5: Journey into Reactive Streams and Akka Streams

Core Concepts

Part 1 of 5

Page 6: Journey into Reactive Streams and Akka Streams

What is a stream?

• Flow of data

• Events, commands, machine data, etc

• Live or at rest

• Bounded or unbounded in size

• Similar to an array laid out in time instead of memory

Page 7: Journey into Reactive Streams and Akka Streams

Appeal of stream processing?

• Scaling business logic

• Processing real-time data (fast data)

• Batch processing of large data sets (big data)

• Monitoring, analytics, complex event processing, etc

Page 8: Journey into Reactive Streams and Akka Streams

Scaling business logic

• Streams can be useful for modelling and breaking apart monolithic apps that primarily transform data

• Async stream processing steps can be scaled individually

Page 13: Journey into Reactive Streams and Akka Streams

Processing real-time data

• Ephemeral

• Unbounded in size

• Potential "flooding" downstream

You cannot step twice into the same stream. For as you are stepping in, other waters are ever flowing on to you. — Heraclitus

Page 14: Journey into Reactive Streams and Akka Streams

Push vs pull

Page 16: Journey into Reactive Streams and Akka Streams

Pull

1. Consumer calls producer

2. Consumer blocks

3. Producer sends data when available

Works best when producer is faster than consumer

Page 17: Journey into Reactive Streams and Akka Streams

Push

1. Producer sends data to consumer

Works best when producer is slower than the consumer

Page 18: Journey into Reactive Streams and Akka Streams

Backpressure

Page 19: Journey into Reactive Streams and Akka Streams

Backpressure?

• We need a way to signal when a consumer is able to process more data

• Propagate backpressure through the entire flow

• Without backpressure data keeps flowing at full speed

• Leads to OOM errors, crashes, etc

Page 20: Journey into Reactive Streams and Akka Streams

Consumer usually has some kind of buffer.

Page 21: Journey into Reactive Streams and Akka Streams

Fast producers can overwhelm the buffer of a slow consumer.

Page 22: Journey into Reactive Streams and Akka Streams

Option 1: Use bounded buffer and drop messages.
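
Option 1 can be sketched in plain Java with a bounded queue. This is only an illustration under assumptions: the class and method names (DroppingBuffer, offerAll) are hypothetical, and the "producer" is just a loop. ArrayBlockingQueue.offer never blocks; it returns false when the buffer is full, which is where a message gets dropped.

```java
import java.util.concurrent.ArrayBlockingQueue;

/** Hypothetical sketch: a fast producer writing into a small bounded buffer. */
final class DroppingBuffer {

    /** Offers the items 1..count to the buffer and returns how many were dropped. */
    static int offerAll(ArrayBlockingQueue<Integer> buffer, int count) {
        int dropped = 0;
        for (int i = 1; i <= count; i++) {
            // offer() does not block: it returns false when the queue is full
            if (!buffer.offer(i)) {
                dropped++;
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        ArrayBlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(3);
        int dropped = offerAll(buffer, 5);
        // prints: buffered=[1, 2, 3] dropped=2
        System.out.println("buffered=" + buffer + " dropped=" + dropped);
    }
}
```

Memory stays bounded, but the consumer silently loses the newest elements, which is the trade-off this option accepts.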

Page 23: Journey into Reactive Streams and Akka Streams

Option 2: Increase buffer size if memory available.

Page 24: Journey into Reactive Streams and Akka Streams

Option 3: Pull-based backpressure.

Page 25: Journey into Reactive Streams and Akka Streams

Reactive Streams

Part 2 of 5

Page 26: Journey into Reactive Streams and Akka Streams

Reactive Streams

Reactive Streams is a specification and low-level API for library developers.

Compliant RS implementations include the following:

• RxJava (Netflix)

• Reactor (Pivotal)

• Vert.x (Red Hat)

• Akka Streams and Slick (Typesafe)

Page 28: Journey into Reactive Streams and Akka Streams

Three main repositories

• Reactive Streams for the JVM

• Reactive Streams for JavaScript

• Reactive Streams IO (for network protocols such as TCP, WebSockets and possibly HTTP/2)

• Early exploration kicked off by Netflix

• 2016 timeframe

Page 29: Journey into Reactive Streams and Akka Streams

Reactive Streams JVM API spec

Only for library builders, not for direct usage.

public interface Processor<T, R> extends Subscriber<T>, Publisher<R> {}

public interface Publisher<T> {
    public void subscribe(Subscriber<? super T> s);
}

public interface Subscriber<T> {
    public void onSubscribe(Subscription s);
    public void onNext(T t);
    public void onError(Throwable t);
    public void onComplete();
}

public interface Subscription {
    public void request(long n);
    public void cancel();
}
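
To make the request(n) contract concrete, here is a minimal sketch using java.util.concurrent.Flow (Java 9+), whose nested interfaces mirror the four interfaces above. Everything else is an assumption for illustration: RangePublisher, CollectingSubscriber and DemandDemo are hypothetical names, the code is single-threaded, and it has not been checked against the TCK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

final class DemandDemo {
    public static void main(String[] args) {
        CollectingSubscriber sub = new CollectingSubscriber();
        new RangePublisher(1, 6).subscribe(sub);
        System.out.println(sub.received);  // prints [] : no demand signalled yet
        sub.subscription.request(2);
        System.out.println(sub.received);  // prints [1, 2]
        sub.subscription.request(3);
        System.out.println(sub.received + " completed=" + sub.completed);
        // prints [1, 2, 3, 4, 5] completed=true
    }
}

/** Hypothetical publisher of the integers in [start, end); emits only what was requested. */
final class RangePublisher implements Flow.Publisher<Integer> {
    private final int start;
    private final int end;

    RangePublisher(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
        subscriber.onSubscribe(new Flow.Subscription() {
            private int next = start;
            private long demand = 0;
            private boolean emitting = false;
            private boolean done = false;

            @Override
            public void request(long n) {
                if (done) return;
                if (n <= 0) {
                    done = true;
                    subscriber.onError(new IllegalArgumentException("request must be positive"));
                    return;
                }
                // accumulate demand, capping at Long.MAX_VALUE on overflow
                demand = (demand + n < 0) ? Long.MAX_VALUE : demand + n;
                if (emitting) return; // re-entrant request from onNext: the loop below picks it up
                emitting = true;
                while (!done && demand > 0 && next < end) {
                    demand--;
                    subscriber.onNext(next++);
                }
                emitting = false;
                if (!done && next >= end) {
                    done = true;
                    subscriber.onComplete();
                }
            }

            @Override
            public void cancel() {
                done = true;
            }
        });
    }
}

/** Hypothetical subscriber that records elements and leaves demand to the caller. */
final class CollectingSubscriber implements Flow.Subscriber<Integer> {
    final List<Integer> received = new ArrayList<>();
    boolean completed = false;
    Throwable error;
    Flow.Subscription subscription;

    @Override public void onSubscribe(Flow.Subscription s) { subscription = s; }
    @Override public void onNext(Integer item) { received.add(item); }
    @Override public void onError(Throwable t) { error = t; }
    @Override public void onComplete() { completed = true; }
}
```

The publisher never calls onNext more times than the total requested, so the subscriber decides how much it is willing to buffer. In this sketch completion is only signalled from inside request, so a subscriber that never requests anything is never completed.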

Page 30: Journey into Reactive Streams and Akka Streams

Faster publisher responsibilities?

• Not generate elements, if it is able to control their production rate

• Try buffering the elements in a bounded manner until more demand is signalled

• Drop elements until more demand is signalled

• Tear down the stream if unable to apply any of the above strategies

Page 31: Journey into Reactive Streams and Akka Streams

Reactive Streams

Visit the Reactive Streams website for more information.

http://www.reactive-streams.org/

Details:

• TCK (Technology Compatibility Kit)

• API (JVM, JavaScript)

• Specifications

• Early conversation on future spec for IO

Page 32: Journey into Reactive Streams and Akka Streams

Akka Streams

Part 3 of 5

Page 33: Journey into Reactive Streams and Akka Streams

Akka Streams

Akka Streams provides a way to express and run a chain of asynchronous processing steps acting on a sequence of elements.

• DSL for async/non-blocking stream processing

• With "free" backpressure

• Conforms to the Reactive Streams spec for compatibility

Page 34: Journey into Reactive Streams and Akka Streams

Basics

• Source - A processing stage with exactly one output

• Sink - A processing stage with exactly one input

• Flow - A processing stage which has exactly one input and output

• RunnableFlow - A Flow that has both ends "attached" to a Source and Sink

Page 40: Journey into Reactive Streams and Akka Streams

API design

Goals

• Supremely composable

• Exhaustive model, everything you need for stream processing including error handling

Page 41: Journey into Reactive Streams and Akka Streams

API design

Considerations

• Immutable, reusable stream blueprints

• Explicit materialization step

• No magic at the expense of some extra code

Page 42: Journey into Reactive Streams and Akka Streams

Materialization

• Separate the what from the how

• Declarative Source/Flow/Sink to create a blueprint

• FlowMaterializer turns blueprint into actors

• Involves an extra step, but no magic

Page 43: Journey into Reactive Streams and Akka Streams

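Error handling

• The element causing division by zero will be dropped

• Result will be a Future completed with Success(228)

// Assumes the pre-1.0 akka-stream-experimental API used in this talk
// and an ActorSystem named `system` already in scope.
import akka.stream.{ActorFlowMaterializer, ActorFlowMaterializerSettings, Supervision}
import akka.stream.scaladsl.{Sink, Source}

// Resume (drop the failing element) on ArithmeticException, stop the stream on anything else
val decider: Supervision.Decider = {
  case _: ArithmeticException => Supervision.Resume
  case _ => Supervision.Stop
}

// ActorFlowMaterializer takes the list of transformations comprising an akka.stream.scaladsl.Flow
// and materializes them in the form of org.reactivestreams.Processor
implicit val mat = ActorFlowMaterializer(
  ActorFlowMaterializerSettings(system).withSupervisionStrategy(decider))

val source = Source(0 to 5).map(100 / _)
val result = source.runWith(Sink.fold(0)(_ + _))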
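
Where the 228 comes from can be checked without Akka. The following is a plain Java analogue, not the Akka Streams API: the class and method names are hypothetical, and the catch block stands in for Supervision.Resume by skipping the element that fails.

```java
/** Hypothetical stand-in for the "resume" strategy: skip the failing element and keep folding. */
final class ResumeOnError {

    /** Sums 100 / i for every i in [from, to], dropping elements whose division throws. */
    static int sumSkippingFailures(int from, int to) {
        int sum = 0;
        for (int i = from; i <= to; i++) {
            try {
                sum += 100 / i; // integer division; throws ArithmeticException when i == 0
            } catch (ArithmeticException e) {
                // analogue of Supervision.Resume: drop this element, continue with the next
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // 100/1 + 100/2 + 100/3 + 100/4 + 100/5 = 100 + 50 + 33 + 25 + 20
        System.out.println(sumSkippingFailures(0, 5)); // prints 228
    }
}
```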

Page 44: Journey into Reactive Streams and Akka Streams

Dynamic push/pull backpressure

• Fast consumer can issue more Request(n) even before more data arrives

• Producer can accumulate demand

• Total demand of elements is safe to publish

• Consumer's buffer will never overflow

• Default is push-based until consumer cannot cope

Page 45: Journey into Reactive Streams and Akka Streams

Fan out

• Broadcast[T] (1 input, n outputs)

• Signals each output given an input signal

• Balance[T] (1 input, n outputs)

• Signals one of its output ports given an input signal

• FlexiRoute[In] (1 input, n outputs)

• Write custom fan out elements using a simple DSL

Page 46: Journey into Reactive Streams and Akka Streams

Fan in

• Merge[In] (n inputs, 1 output)

• Picks signals randomly from inputs

• Zip[A,B,Out] (2 inputs, 1 output)

• Zipping into an (A,B) tuple stream

• Concat[T] (2 inputs, 1 output)

• Concatenate streams (first, then second)

Page 48: Journey into Reactive Streams and Akka Streams

val g = FlowGraph.closed() { implicit builder: FlowGraph.Builder =>
  import FlowGraph.Implicits._
  val in = Source(1 to 10)
  val out = Sink.ignore

  val bcast = builder.add(Broadcast[Int](2))
  val merge = builder.add(Merge[Int](2))

  val f1, f2, f3, f4 = Flow[Int].map(_ + 10)

  in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out
  bcast ~> f4 ~> merge
}

Page 49: Journey into Reactive Streams and Akka Streams

conflate

abstract def conflate[S](seed: (T) ⇒ S, aggregate: (S, T) ⇒ S): Flow[S]

Allows a faster upstream to progress independently of a slower consumer by conflating elements into a summary until the consumer is ready to accept them.
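
The seed/aggregate idea can be sketched in plain Java. This is an assumption-laden illustration, not Akka's implementation: Conflater, push and pull are hypothetical names, the code is single-threaded, and it assumes seed and aggregate never return null.

```java
import java.util.Optional;
import java.util.function.BiFunction;
import java.util.function.Function;

/** Hypothetical single-threaded conflation slot between a fast upstream and a slow downstream. */
final class Conflater<T, S> {
    private final Function<T, S> seed;
    private final BiFunction<S, T, S> aggregate;
    private S summary; // null means nothing is pending

    Conflater(Function<T, S> seed, BiFunction<S, T, S> aggregate) {
        this.seed = seed;
        this.aggregate = aggregate;
    }

    /** Called by the upstream: never waits, folds the element into the pending summary. */
    void push(T element) {
        summary = (summary == null) ? seed.apply(element) : aggregate.apply(summary, element);
    }

    /** Called by the downstream when it is ready: takes the pending summary, if any, and resets. */
    Optional<S> pull() {
        Optional<S> out = Optional.ofNullable(summary);
        summary = null;
        return out;
    }

    public static void main(String[] args) {
        Conflater<Integer, Integer> sums = new Conflater<>(x -> x, (s, x) -> s + x);
        sums.push(1);
        sums.push(2);
        sums.push(3);                    // three elements arrive before the consumer is ready
        System.out.println(sums.pull()); // prints Optional[6]
        System.out.println(sums.pull()); // prints Optional.empty
    }
}
```

The upstream is never slowed down; the price is that the downstream sees a summary instead of the individual elements.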

Page 50: Journey into Reactive Streams and Akka Streams

groupedWithin

abstract def groupedWithin(n: Int, d: FiniteDuration): Flow[Seq[T]]

Chunk up this stream into groups of elements received within a time window, or limited by the given number of elements, whichever happens first.
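
The grouping rule can be illustrated offline in plain Java. This is a simplified analogue under stated assumptions, not how Akka implements it: arrival times are passed in explicitly instead of using a timer, a group is closed only when it holds n elements or when the next element arrives at least windowMillis after the group's first element, and GroupedWithin and Timed are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical offline analogue of groupedWithin over elements with explicit arrival times. */
final class GroupedWithin {

    /** A value tagged with its arrival time in milliseconds (helper type for this sketch). */
    static final class Timed {
        final long atMillis;
        final String value;

        Timed(long atMillis, String value) {
            this.atMillis = atMillis;
            this.value = value;
        }
    }

    static List<List<String>> group(List<Timed> events, int n, long windowMillis) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long windowStart = 0;
        for (Timed e : events) {
            // time limit: the open group is too old for this element, so close it first
            if (!current.isEmpty() && e.atMillis - windowStart >= windowMillis) {
                groups.add(current);
                current = new ArrayList<>();
            }
            if (current.isEmpty()) {
                windowStart = e.atMillis; // the first element of a group opens its window
            }
            current.add(e.value);
            // size limit: close the group as soon as it holds n elements
            if (current.size() == n) {
                groups.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            groups.add(current); // flush the last partial group at end of stream
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Timed> events = List.of(
            new Timed(0, "a"), new Timed(10, "b"), new Timed(20, "c"),
            new Timed(500, "d"), new Timed(510, "e"));
        // prints [[a, b], [c], [d, e]]
        System.out.println(group(events, 2, 100));
    }
}
```

Here "a, b" and "d, e" are closed by the size limit, while "c" is closed by the time limit because the next element arrives 480 ms later.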

Page 51: Journey into Reactive Streams and Akka Streams

Simple streaming from/to Kafka

implicit val actorSystem = ActorSystem("ReactiveKafka")
implicit val materializer = ActorMaterializer()

val kafka = new ReactiveKafka(host = "localhost:9092", zooKeeperHost = "localhost:2181")
val publisher = kafka.consume("lowercaseStrings", "groupName", new StringDecoder())
val subscriber = kafka.publish("uppercaseStrings", "groupName", new StringEncoder())

// consume lowercase strings from kafka and publish them transformed to uppercase
Source(publisher).map(_.toUpperCase).to(Sink(subscriber)).run()

Page 52: Journey into Reactive Streams and Akka Streams

Akka Streams versus other streams

Part 4 of 5

Page 53: Journey into Reactive Streams and Akka Streams

Akka Streams

• Distributed and fault-tolerant

• Sensitive to bidirectional pressure

• Easy to program complex processing flow graphs

Page 54: Journey into Reactive Streams and Akka Streams

Java Streams

• Iterators with a weaker but more parallelism-friendly interface

• Only high-level control (no next/hasNext)

• Transformation, not distribution

• Push or pull chosen statically

Page 55: Journey into Reactive Streams and Akka Streams

RxJava

• Pure push model

• Extensive DSL for transformations

• Only allows blocking backpressure

• Unbounded buffering across async boundary

Page 56: Journey into Reactive Streams and Akka Streams

Code review and demo

Part 5 of 5

Source code available at https://github.com/rocketpages

Page 58: Journey into Reactive Streams and Akka Streams

Thank you!