Fast Data: Reactive Systems Meet Big Data
February 25, [email protected] @deanwampler ©Dean Wampler 2014-2016, All Rights Reserved
Motivation: eCommerce
• Cyber Monday?
• On demand?
Motivation: Internet of Things
• Medical Devices, IT Systems
• Aircraft Engines
• Trucks, Farm Equipment
• Remote Sensors
• Robotics
• Health Monitoring, Home Automation
Reactive Systems
• Responsive
• Resilient
• Elastic
• Message Driven
Responsive
• Requests or commands require timely responses.
• Requires predictable response times and quality of service.
• Requires pre-planned graceful degradation of service.
• Awareness of time is first class.
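As a minimal sketch of "timely responses" combined with "pre-planned graceful degradation", the caller below waits for an answer only up to a deadline and otherwise returns a prepared fallback. The names `Responsive`, `respondWithin`, and `degraded` are hypothetical and not from the slides; the blocking `Await` keeps the sketch short, whereas a real reactive service would likely compose futures without blocking.

```scala
import scala.concurrent.{Await, Future, TimeoutException}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.control.NonFatal

object Responsive {
  // Hypothetical fallback returned when the real answer is late or unavailable.
  val degraded = "cached recommendations"

  // Wait at most `deadline` for `work`. On timeout or any non-fatal failure,
  // return the pre-planned degraded answer instead of leaving the caller hanging.
  def respondWithin(deadline: FiniteDuration)(work: => String): String =
    try Await.result(Future(work), deadline)
    catch {
      case _: TimeoutException => degraded
      case NonFatal(_)         => degraded
    }
}
```

Making the deadline an explicit parameter is one way to treat time as a first-class part of the interface.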
Resilient
• Recovers from errors.
• Failure is not disruptive. It’s routine.
• Therefore, failure is a first-class concept.
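One way to read "failure is a first-class concept" is that a failure becomes an ordinary value the program handles, rather than an exception that escapes. The sketch below uses the standard library's `Try` for that; the `Resilient` object and its `retry` helper are hypothetical illustrations, not something the slides define.

```scala
import scala.util.{Failure, Success, Try}

object Resilient {
  // Run `op` up to `maxAttempts` times. The outcome is always a value:
  // Success with the result, or Failure carrying the last error.
  @annotation.tailrec
  def retry[A](maxAttempts: Int)(op: () => A): Try[A] =
    Try(op()) match {
      case s @ Success(_)                => s
      case Failure(_) if maxAttempts > 1 => retry(maxAttempts - 1)(op)
      case f @ Failure(_)                => f
    }
}
```

Callers then pattern match on `Success` or `Failure`, so the recovery path is as visible in the code as the happy path.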
Elastic
• Scale up and down.
• Detect changing input patterns. Automatically adjust services.
• No bottlenecks or contention points.
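To make "detect changing input patterns, automatically adjust services" concrete, here is a deliberately simple sizing rule. It assumes a hypothetical capacity model in which each worker handles a fixed number of events per second; the names and the model are illustrative assumptions only.

```scala
object Elastic {
  // Hypothetical capacity model: each worker handles `perWorker` events/second.
  // Returns the worker count for the observed rate, clamped to [min, max].
  def desiredWorkers(eventsPerSec: Double, perWorker: Double, min: Int, max: Int): Int = {
    val needed = math.ceil(eventsPerSec / perWorker).toInt
    math.max(min, math.min(max, needed))
  }
}
```

For example, an observed rate of 950 events/second with 100 events/second per worker asks for 10 workers, while a quiet period falls back to the configured minimum.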
Message Driven
• To react, you must be message driven.
• Asynchronous message passing.
• Defines boundaries, promotes loose coupling and isolation.
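The sketch below illustrates asynchronous message passing with nothing but the JDK: senders hand a message to a mailbox and return immediately, and a single worker thread processes the messages one at a time. The `Counter` class is a hypothetical, actor-like stand-in and not an API from the slides.

```scala
import java.util.concurrent.{Executors, TimeUnit}

// A minimal actor-like component: callers send messages and return at once;
// a single worker thread processes the mailbox one message at a time.
final class Counter {
  private val mailbox = Executors.newSingleThreadExecutor()
  @volatile private var total = 0 // written only by the mailbox thread

  // Asynchronous send: enqueue the message and return immediately.
  def send(amount: Int): Unit =
    mailbox.execute(new Runnable { def run(): Unit = total += amount })

  // Stop accepting messages, drain the mailbox, then report the total.
  def shutdownAndGet(): Int = {
    mailbox.shutdown()
    mailbox.awaitTermination(5, TimeUnit.SECONDS)
    total
  }
}
```

Because the state is reachable only through messages, the mailbox is the component's boundary, which is where the loose coupling and isolation come from.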
Fast Data
Spark
Productivity?
Very concise, elegant, functional APIs.
• Python, R
• Scala, Java
• ... and SQL!
Interactive shell (REPL)
• Scala, Python, R, and SQL!
• “Notebooks”
Batch + Streaming?
Streams - “mini batch” processing:
• Reuse “batch” code
• Adds “window” functions
• “Discretized Streams”: DStreams
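The mini-batch idea can be sketched without Spark at all: cut the stream into fixed time intervals, run ordinary batch code on each interval, and then apply a window over consecutive batches. Everything below (`MiniBatch`, `Event`, `batchSums`, `windowedSums`) is a hypothetical illustration on plain Scala collections, not the DStream API.

```scala
object MiniBatch {
  final case class Event(timeMs: Long, value: Int)

  // Discretize a stream: bucket events into intervals of `batchMs`, then reuse
  // ordinary "batch" code (here, a sum) on each bucket, in time order.
  // Intervals with no events produce no batch in this sketch.
  def batchSums(events: Seq[Event], batchMs: Long): Seq[Int] =
    events.groupBy(_.timeMs / batchMs).toSeq.sortBy(_._1).map(_._2.map(_.value).sum)

  // A "window" function: the sum over each run of `size` consecutive batches.
  def windowedSums(sums: Seq[Int], size: Int): Seq[Int] =
    sums.sliding(size).map(_.sum).toList
}
```

With one-second batches, events at 0 ms and 500 ms land in the first batch, and a window of size 2 then combines each batch with its successor.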
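import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

// master, toWords, and sortByCount are defined elsewhere (not shown on the slide).
val sparkContext = new SparkContext(master, "Inv. Index")
sparkContext.textFile("/path/to/input")
  .map { line =>
    val array = line.split(",", 2)          // "id,contents" => (id, contents)
    (array(0), array(1))
  }
  .flatMap { case (id, contents) =>
    toWords(contents).map(w => ((w, id), 1)) // one count per (word, id) occurrence
  }
  .reduceByKey(_ + _)                        // total count per (word, id)
  .map { case ((word, id), n) => (word, (id, n)) }
  .groupByKey                                // all (id, n) pairs for each word
  .mapValues { seq => sortByCount(seq) }
  .saveAsTextFile("/path/to/output")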
Powerful, beautiful combinators
[Diagram: the pipeline as a chain of steps: textFile, map, flatMap, reduceByKey, map, groupByKey, map, saveAsTextFile.]
Lazy API, inspired by Scala collections.
• Combines steps into “stages”.
• Keeps intermediate data in memory.
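To show how closely the API follows Scala collections, here is a sketch of the same inverted index on an ordinary in-memory `Seq`. The helpers `toWords` and `sortByCount` are not defined on the slides, so the versions below are hypothetical stand-ins; `reduceByKey` and `groupByKey` are approximated with `groupBy`. Unlike Spark's lazy pipeline, these strict collections evaluate each step eagerly.

```scala
object InvertedIndexLocal {
  // Hypothetical stand-ins for the helpers the slide leaves undefined.
  def toWords(contents: String): Seq[String] =
    contents.toLowerCase.split("""\W+""").filter(_.nonEmpty).toSeq

  // Highest count first; ties broken by document id.
  def sortByCount(seq: Iterable[(String, Int)]): Seq[(String, Int)] =
    seq.toSeq.sortBy { case (id, n) => (-n, id) }

  // Input lines look like "id,contents"; output maps word -> Seq((id, count)).
  def build(lines: Seq[String]): Map[String, Seq[(String, Int)]] =
    lines
      .map { line => val a = line.split(",", 2); (a(0), a(1)) }
      .flatMap { case (id, contents) => toWords(contents).map(w => ((w, id), 1)) }
      // Local equivalent of reduceByKey(_ + _):
      .groupBy(_._1).map { case (key, ones) => (key, ones.map(_._2).sum) }
      .toSeq
      .map { case ((word, id), n) => (word, (id, n)) }
      // Local equivalent of groupByKey followed by mapValues:
      .groupBy(_._1).map { case (word, pairs) => (word, sortByCount(pairs.map(_._2))) }
}
```

Calling `InvertedIndexLocal.build(Seq("d1,hello world hello", "d2,hello"))` maps `hello` to `(d1, 2), (d2, 1)` and `world` to `(d1, 1)`.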
• Message Queue Semantics
• Log-oriented ingestion, scalability
[Diagram: Topic A shown as a log of entries n through n+5, with Producer 1, Producer 2, Consumer 1, and Consumer 2; each consumer sits at its own offset (n+?).]
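A log-oriented queue can be sketched as an append-only buffer plus one read offset per consumer, so that reading never removes a message and consumers progress independently. The `TopicLog` class and its `append` and `poll` methods are hypothetical names for illustration; the sketch is single-threaded and in-memory, with none of the durability or partitioning a real system would need.

```scala
import scala.collection.mutable

// An append-only log for one topic. Reading does not remove messages;
// each consumer simply advances its own offset.
final class TopicLog[A] {
  private val log = mutable.ArrayBuffer.empty[A]
  private val offsets = mutable.Map.empty[String, Int].withDefaultValue(0)

  // Append a message and return its offset in the log.
  def append(msg: A): Int = { log += msg; log.size - 1 }

  // Return up to `max` unread messages for `consumer` and advance its offset.
  def poll(consumer: String, max: Int): List[A] = {
    val from = offsets(consumer)
    val batch = log.slice(from, from + max).toList
    offsets(consumer) = from + batch.size
    batch
  }
}
```

Two consumers polling the same log each see every message, at their own pace, which matches the separate "n+?" positions in the diagram.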
[Diagram: Service 1, Service 2, Service 3, Log & Other Files, Internet, and other Services as producers and consumers wired directly to one another: N * M links.]
[Diagram: the same producers and consumers: N + M links.]
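The link counts can be worked through with a small calculation. It assumes, as the N + M figure suggests, that every producer and every consumer holds a single link to one shared intermediary; the names below are hypothetical.

```scala
object Links {
  // Every producer wired directly to every consumer: N * M links.
  def pointToPoint(producers: Int, consumers: Int): Int = producers * consumers

  // Every producer and every consumer holds one link to a shared hub: N + M links.
  def viaHub(producers: Int, consumers: Int): Int = producers + consumers
}
```

With 4 producers and 3 consumers that is 12 direct links versus 7 through the hub, and the gap widens as either side grows.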
Reactive Streams
[Diagram: an event/data stream flowing into a consumer, with feedback flowing back from the consumer.]
• Unbounded queues eventually exhaust the heap.
• Bounded queues cause blocking or arbitrary dropping of events.
• Solution: back pressure, where the producer and consumer negotiate.
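The negotiation behind back pressure can be sketched as explicit demand: the consumer says how many elements it can take, and the producer never emits more than that. This mirrors the `request(n)` signal of the Reactive Streams `Subscription` interface, but the `BackPressured` class below is a hypothetical, single-threaded simulation and not that API.

```scala
import scala.collection.mutable

// Demand-driven hand-off in the spirit of Reactive Streams: the producer may
// emit only as many elements as the consumer has requested.
final class BackPressured[A](source: Iterator[A]) {
  private var demand = 0L
  val delivered = mutable.ArrayBuffer.empty[A]

  // Consumer side: signal capacity for `n` more elements.
  def request(n: Long): Unit = { demand += n; drain() }

  // Producer side: emit while there is both data and outstanding demand.
  private def drain(): Unit =
    while (demand > 0 && source.hasNext) {
      delivered += source.next()
      demand -= 1
    }
}
```

Even with an infinite source such as `Iterator.from(1)`, nothing is buffered beyond what was requested, so there is neither heap exhaustion nor arbitrary dropping.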
Reactive Streams over Actors.
Akka Streams
Putting It Together
lightbend.com/[email protected]