Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs...

47
Streaming Data Pipelines on Mesos: Lessons Learned Dean Wampler, Ph.D. VP of Fast Data Engineering, Lightbend

Transcript of Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs...

Page 1: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

Streaming Data Pipelines on Mesos: Lessons Learned

Dean Wampler, Ph.D. VP of Fast Data Engineering, Lightbend

Page 2: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

2

• Modern data applications: • What makes them different?

• Mesos as a platform • Specific insights

Themes

Page 3: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

3

Why we care? Lightbend Fast Data Platform

Page 4: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

4

Microservices Streaming

Machine Learning

Management

Management

Data Back Plane

StorageDeployment

Page 5: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

5

Page 6: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

6

Why not Hadoop?

Page 7: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

7

YARN

HDFS

MRjob#1

MRjob#2

Sparkjob#1

Sparkjob#2

Flume Sqoop

DBs

SlaveNode

DiskDiskDisk

NodeMgr

DataNode

Master

ResourceManager

NameNode

Page 8: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

8

YARN

HDFS

MRjob#1

MRjob#2

Sparkjob#1

Sparkjob#2

Flume Sqoop

DBs

SlaveNode

DiskDiskDisk

NodeMgr

DataNode

Master

ResourceManager

NameNode

What’s not to Like?

• YARN limitations • Batch orientation • Microservices?

Page 9: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

9

YARN

HDFS

MRjob#1

MRjob#2

Sparkjob#1

Sparkjob#2

Flume Sqoop

DBs

SlaveNode

DiskDiskDisk

NodeMgr

DataNode

Master

ResourceManager

NameNode

YARN Limitations• Only understands

running jobs like Spark, MapReduce

• Can’t even run HDFS services!

Page 10: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

10

YARN

HDFS

MRjob#1

MRjob#2

Sparkjob#1

Sparkjob#2

Flume Sqoop

DBs

SlaveNode

DiskDiskDisk

NodeMgr

DataNode

Master

ResourceManager

NameNode

YARN Limitations

• Slow adoption of container standards

• Too coarse-grained

Page 11: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

11

YARN

HDFS

MRjob#1

MRjob#2

Sparkjob#1

Sparkjob#2

Flume Sqoop

DBs

SlaveNode

DiskDiskDisk

NodeMgr

DataNode

Master

ResourceManager

NameNode

Batch Orientation

• Streaming services more ad hoc

• Must manage C*, Kafka, etc. separately

Page 12: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

12

YARN

HDFS

MRjob#1

MRjob#2

Sparkjob#1

Sparkjob#2

Flume Sqoop

DBs

SlaveNode

DiskDiskDisk

NodeMgr

DataNode

Master

ResourceManager

NameNode

Microservices?• Hadoop wants to

own the whole cluster

• No mixed application work loads

Page 13: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

13

Why Mesos?

Page 14: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

14

Page 15: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

15

Page 16: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

16

Page 17: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

17

Page 18: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

18

…Node NodeMesos Executor …Mesos Executor

master

Spark Executor

task task

task task

Spark Executor

task task

task task

Mesos Master

Spark Framework

Spark Driverobject MyApp { def main() { val sc = new SparkContext(…) … }}

Scheduler

Page 19: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

19

Streaming - Characteristics• Continuous

processing • Variable lifespans • Resilience • Scalability

Page 20: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

20

Continuous Processing

• Never ending input and output • Dynamic scaling on demand

(more in a moment…) • CNI support

Page 21: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

21

Variable Lifespans

• Apps live minutes to months! • Mesos can deploy and manage:

• Very short-lived containers • Long-running services

Page 22: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

22

Resilience

• Mesos’ built-in fault tolerance features make app resilience easier to implement.

• …

Page 23: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

23

Resilience

• … • But stateful streaming services

need to manage state persistence for recovery

Page 24: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

24

Scalability

• Mesos’ flexible scheduling model supports dynamic scaling on demand

Page 25: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

25

Characteristics of Particular Streaming Apps

Page 26: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

26

• Low latency? How low? • High volume? How high?

Streaming Tradeoffs (1/4)

Page 27: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

27

• Which kinds of data processing & analytics are required? • SQL? • Machine learning? • Simple filtering & transformations?

Streaming Tradeoffs (2/4)

Page 28: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

28

• How will this processing be done? • Individual processing of events? • Bulk processing of records?

Streaming Tradeoffs (3/4)

Page 29: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

29

• Which tools and data sources/sinks must interoperate with your streaming tool?

Streaming Tradeoffs (4/4)

Page 30: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

30

Characteristics of Streaming: How Several Tools Line Up

Page 31: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

• Low latency • Med. volume • ETL, “tables” • Data flow or per

event

31

Page 32: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

• Low latency • Med. volume • Complex flows • Complex Event

Processing

32

Page 33: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

33

MicroserviceMicroservice

Microservice

Microservice

ServiceActor1

Event

Event

Event

Event

Event

Event Router

Actor

ServiceActor2

SA13SA11

SA12

SA23

SA21SA22

• AS and KS are libraries (with some services)

• Run the apps as microservices

Page 34: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

• Med. latency • High volume • Data flows, SQL • En masse

processing

34

Page 35: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

• Low latency • High volume • Data flows,

correctness • En masse

processing35

Page 36: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

36

• Spark and Flink run services to which you submit “jobs”

• Jobs are partitioned into “tasks”

…Node NodeMesos Executor …Mesos Executor

master

Spark Executor

task task

task task

Spark Executor

task task

task task

Mesos Master

Spark Framework

Spark Driverobject MyApp { def main() { val sc = new SparkContext(…) … }}

Scheduler

Page 37: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

37

Architecture Trends…

Page 38: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

38

Big DataServices

Architectures Trend

Page 39: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

39

Big DataServicesMicroservices and Fast Data

New Architecture

Page 40: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

40

• Single responsibilities.

• Easy to evolve. • Fits the fine-grained

model of Mesos very well.

Browse

REST

AccountOrders

ShoppingCart

APIGateway

Inventory

Page 41: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

41

• Continuous processing

• Variable lifespans • Resilience • Scalability

Page 42: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

42

• Each data stream app (or μservice): • has one responsibility • ingests unending data (or messages)

Synergies…

Browse

REST

AccountOrders

ShoppingCart

APIGateway

Inventory

Page 43: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

43

• Each data stream app (or μservice): • must operate asynchronously • must offer never-ending service

Synergies…

Browse

REST

AccountOrders

ShoppingCart

APIGateway

Inventory

Page 44: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

44

• So, both 1. have similar design problems 2. dominated by data (at least eventually for microservices)

Synergies…

Browse

REST

AccountOrders

ShoppingCart

APIGateway

Inventory

Page 45: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

45

What’s Still Needed?

Page 46: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

46

• Stateful streaming apps use ad-hoc mechanisms to persist the state for resilience

State

Page 47: Mesoscon2017 - Streaming Data Pipelines on Mesos - …...Spark job #1 Spark job #2 Flume Sqoop DBs Slave Node DiskDiskDisk Node Mgr Data Node Master Resource Manager Name Node. 8 YARN

47

lightbend.com/fast-data-platform

Thank You!