Kostas Tzoumas, Stephan Ewen - Keynote - The maturing data streaming ecosystem and Apache Flink’s accelerated growth
Page 2:

Some practical information

Network name: Flink Forward 2016

Password: #flinkforward16

Twitter handle: @flinkforward

Hashtag: #ff16

Group photo today at 3.30 pm

All talks will be recorded and can be found on our YouTube channel “Apache Flink Berlin” after the conference

FlinkFest today at Palais starting at 6.10 pm

Attention: There are some last-minute changes to the program; please consult the online schedule

Page 3:

The Venue

Page 4:

A big thanks to our sponsors!

Page 5:

A big thanks to our program committee!

Tyler Akidau, Google

Stephan Ewen, data Artisans

Jamie Grier, data Artisans

Vasia Kalavri, KTH

Neha Narkhede, Confluent

Page 6:

A big thanks to our speakers!

Page 8:

Kostas Tzoumas, Stephan Ewen

Flink Forward, September 12, 2016

The data streaming ecosystem and Apache Flink®: present and future

Page 9:

data Artisans was founded by the original creators of Apache Flink®. Our goal is to make stream processing accessible to the enterprise

Contributing and helping the Flink community grow

Providing enterprise support and services

Page 10:

Streaming is a rapidly growing and maturing market category of its own

Streaming is the biggest change in data infrastructure (Flink Forward 2015)

Page 11:

The Flink community has been at the center of this journey, and there is innovation and convergence in all parts of the stack:

• message transport
• compute engine
• programming paradigm

Page 12:

Why? Streaming technology is enabling the obvious: continuous processing on data that is continuously produced

Hint: you already have streaming data

Page 13:

Data streaming adoption patterns

• Real-time products and business monitoring
• Robust continuous applications
• Decentralized architecture
• Unify real-time and historical data

Page 14:

Retail, e-commerce
• Better product recommendations
• Process monitoring
• Inventory management

Finance
• Differentiation via tech
• Push-based products
• Fraud detection

Telco, IoT, Infrastructure
• Infrastructure monitoring
• Anomaly detection

Internet & mobile
• Personalization
• User behavior monitoring
• Analytics

Page 15:

30 Flink applications in production for more than one year. 10 billion events (2 TB) processed daily

Complex jobs of > 30 operators running 24/7, processing 30 billion events daily, maintaining state of 100s of GB with exactly-once guarantees

Largest job has > 20 operators, runs on > 5000 vCores in a 1000-node cluster, and processes millions of events per second

Page 16:

What is Flink's unique role in the streaming data ecosystem?

Page 17:

Before Flink, users had to make hard choices between:

• Volume
• Latency
• Accuracy

Page 18:

Flink eliminates these tradeoffs

10s of millions of events per second for stateful applications

Sub-second latency, as low as single-digit milliseconds

Accurate computation results

Page 19:

A broader definition of accuracy: the results that I want when I want them

1. Accurate under failures and downtime
2. Accurate under out-of-order data
3. Results when you need them
4. Accurate modeling of the world

Page 20:

1. Failures and downtime
• Checkpoints & savepoints
• Exactly-once guarantees

2. Out of order and late data
• Event time support
• Watermarks

3. Results when you need them
• Low latency
• Triggers

4. Accurate modeling
• True streaming engine
• Sessions and flexible windows
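
To make points 2 and 4 concrete, here is a minimal sketch of event-time session windows driven by a watermark. It is plain Python, not Flink's API: the function name `sessionize`, the parameters `gap` and `max_delay`, and the watermark rule (largest timestamp seen minus a fixed bound) are all assumptions chosen for illustration.

```python
from typing import Iterable, List, Tuple

Session = Tuple[int, int, int]  # (first event time, last event time, event count)


def sessionize(events: Iterable[int], gap: int, max_delay: int) -> Tuple[List[Session], List[int]]:
    """Group event timestamps (given in arrival order) into event-time sessions.

    Returns (sessions, late): the sessions in the order they were emitted, and
    the timestamps that arrived behind the watermark and were dropped.
    """
    watermark = float("-inf")
    max_seen = float("-inf")
    open_sessions: List[List[int]] = []
    emitted: List[Session] = []
    late: List[int] = []

    for ts in events:
        if ts < watermark:
            # The watermark has already passed this event time: treat as late.
            late.append(ts)
            continue

        # Add the event as a one-element session, then merge neighbours that
        # are less than `gap` apart in event time (arrival order is irrelevant).
        open_sessions.append([ts, ts, 1])
        open_sessions.sort()
        merged: List[List[int]] = []
        for start, end, n in open_sessions:
            if merged and start - merged[-1][1] < gap:
                merged[-1][1] = max(merged[-1][1], end)
                merged[-1][2] += n
            else:
                merged.append([start, end, n])

        # Assumed watermark heuristic: largest timestamp seen minus a fixed bound.
        max_seen = max(max_seen, ts)
        watermark = max_seen - max_delay

        # A session is final once the watermark has passed its end plus the gap,
        # because no event that is still accepted can merge into it any more.
        open_sessions = []
        for s in merged:
            if s[1] + gap <= watermark:
                emitted.append((s[0], s[1], s[2]))
            else:
                open_sessions.append(s)

    # End of (bounded) input: flush whatever is still open.
    emitted.extend((s[0], s[1], s[2]) for s in open_sessions)
    return emitted, late


sessions, late = sessionize([1, 2, 10, 4, 11, 30, 3], gap=5, max_delay=8)
```

In this run the event at time 4 arrives after the event at time 10 but still joins the first session, while the event at time 3 arrives after the watermark has moved past it and is reported as late.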

Page 21:

5. Batch + streaming
• One engine
• Dedicated APIs

6. Reprocessing
• High throughput, event time support, and savepoints

7. Ecosystem
• Rich connector ecosystem and 3rd party packages

8. Community support
• One of the most active projects with over 200 contributors

flink run -s <savepoint> <job>

Page 22:

What are the next steps for Flink?

Page 23:

• Provide state of the art streaming capabilities (✔)
• Operate in the largest infrastructures of the world
• Open up to a wider set of enterprise users
• Broaden the scope of stream processing

Page 24:

Apache Flink today

The Apache Flink community has pushed the boundaries of open source stream processing.

Page 25:

Flink's unique combination of features

• Low latency
• High throughput
• Well-behaved flow control (back pressure)
• Consistency
• Works on real-time and historic data
• Performance
• Event Time
• APIs, Libraries
• Stateful Streaming
• Savepoints (replays, A/B testing, upgrades, versioning)
• Exactly-once semantics for fault tolerance
• Windows & user-defined state
• Flexible windows (time, count, session, roll-your-own)
• Complex Event Processing
• Fluent API
• Out-of-order events
• Fast and large out-of-core state

Page 26:

Flink v1.1

• Connectors
• Metric System
• (Stream) SQL
• Session Windows
• Library enhancements

Page 27:

Flink v1.1 + current threads

Flink v1.1:
• Connectors
• Session Windows
• (Stream) SQL
• Library enhancements
• Metric System

Current threads:
• Metrics & Visualization
• Dynamic Scaling
• Savepoint compatibility
• Checkpoints to savepoints
• More connectors
• Stream SQL Windows
• Large state Maintenance
• Fine grained recovery
• Side in-/outputs
• Window DSL
• Security
• Mesos & others
• Dynamic Resource Management
• Authentication
• Queryable State

Page 28:

Flink v1.1 + current threads

Group labels added to the items from the previous page:
• Operations
• Ecosystem
• Application Features
• Broader Audience

Page 30:

Flink v1.1 + current threads

More details in the talk "The Future of Apache Flink" (Monday, 11:00)

Page 31:

Security / Authentication

No unauthorized data access

Secured clusters with Kerberos-based authentication
• Kafka, ZooKeeper, HDFS, YARN, HBase, …

No unencrypted traffic between Flink processes
• RPC, Data Exchange, Web UI

Largely contributed by

Prevent malicious users from hooking into Flink jobs

See talk "Flink Security Enhancements" (Tuesday, 11.45)

Page 32:

Checkpoints / Savepoints

Recover a running job into a new job

Recover a running job onto a new cluster

Application state backwards compatibility
• Flink 1.0 made the APIs backwards compatible
• Now making the savepoints backwards compatible
• Applications can be moved to newer versions of Flink even when state backends or internals change

(Figure: versions v1.x, v1.y, v2.0)

Page 33:

Dynamic scaling

Changing load bears changing resource requirements
• Need to adjust parallelism of running streaming jobs

Re-scaling stateless operators is trivial

Re-scaling stateful operators is hard (windows, user state)
• Efficiently re-shard state

(Figure: workload and resources over time)

Re-scaling Flink jobs preserves exactly-once guarantees

See talk "Dynamic scaling: How Apache Flink adapts to changing workloads" (Tuesday, 14.45)
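
One way to re-shard keyed state efficiently can be sketched as follows. This is an illustration in plain Python, not code from the talk or from Flink: keys are hashed into a fixed number of "key groups", and each parallel instance owns a contiguous range of groups, so changing the parallelism moves whole groups rather than rehashing every key. The constant `MAX_KEY_GROUPS` and the functions `key_group`, `owner`, `shard`, and `rescale` are hypothetical names.

```python
import zlib
from collections import defaultdict
from typing import Dict, List

MAX_KEY_GROUPS = 128  # assumed fixed upper bound on parallelism

Shard = Dict[int, Dict[str, int]]  # key group -> {key: state value}


def key_group(key: str) -> int:
    """Map a key to its key group; stable across runs and independent of parallelism."""
    return zlib.crc32(key.encode("utf-8")) % MAX_KEY_GROUPS


def owner(group: int, parallelism: int) -> int:
    """Index of the parallel instance that owns a key group (contiguous ranges)."""
    return group * parallelism // MAX_KEY_GROUPS


def shard(state: Dict[str, int], parallelism: int) -> List[Shard]:
    """Distribute keyed state over `parallelism` instances, bucketed by key group."""
    shards: List[Shard] = [defaultdict(dict) for _ in range(parallelism)]
    for key, value in state.items():
        group = key_group(key)
        shards[owner(group, parallelism)][group][key] = value
    return shards


def rescale(shards: List[Shard], new_parallelism: int) -> List[Shard]:
    """Re-shard by moving whole key groups; individual keys are never rehashed."""
    new_shards: List[Shard] = [defaultdict(dict) for _ in range(new_parallelism)]
    for old in shards:
        for group, entries in old.items():
            new_shards[owner(group, new_parallelism)][group].update(entries)
    return new_shards


state = {f"user-{i}": i for i in range(1000)}
scaled = rescale(shard(state, 4), 6)  # scale out from 4 to 6 instances
```

Because every key belongs to exactly one group and every group to exactly one instance, no state entry is lost or duplicated by the move, which is the property a re-scaling step needs in order to keep exactly-once results.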

Page 35:

Cluster management

Series of improvements to seamlessly interoperate with various cluster managers
• YARN, Mesos, Docker, Standalone, …
• Proper isolation of jobs, clean support for multi-job sessions

Dynamic acquire/release of resources

Using mixed container sizes

Driven by

Mesos integration contributed by

and

See talk "Introducing Flink on Mesos" (Tuesday, 11.30)

See talk "Running Flink Everywhere" (Monday, 16.45)

Page 37:

Stream SQL

SQL is the standard high-level query language

A natural way to open up streaming to more people

Flink community working with users and with Apache Calcite to draft a new model

Problem: There is no Streaming SQL standard
• At least beyond the basic operations
• Challenging: Incorporate windows and time semantics

See talk "Streaming SQL" (Monday, 11:00)

See talk "Taking a look under the hood of Apache Flink’s relational APIs" (Monday, 16.45)

Page 38:

Looking further

Page 39:

Streaming and batch

The separation of batch and streaming …
… is quite artificial
… has been largely technology driven (not by use cases)

In fact, several talks here are about batch processing…

People are approaching Flink for batch processing as well

Page 42:

Streaming and batch

(Figure: a timeline of partitions labeled by hour: 2016-3-1 12:00 am, 2016-3-1 1:00 am, 2016-3-1 2:00 am, …, 2016-3-11 10:00 pm, 2016-3-11 11:00 pm, 2016-3-12 12:00 am, 2016-3-12 1:00 am, 2016-3-12 2:00 am, 2016-3-12 3:00 am, …. Three regions are marked: Stream (low latency), Stream (high latency), Batch (bounded stream).)
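
The idea that a batch is just a bounded stream can be sketched in a few lines. This is an illustration in plain Python rather than Flink's API, and it assumes timestamps arrive in order; the function name `hourly_counts` and the sample inputs are hypothetical. The same function produces a final result on a bounded input and incremental results on an unbounded one.

```python
from itertools import count, islice
from typing import Iterable, Iterator, Tuple

HOUR = 3600  # seconds per hourly partition


def hourly_counts(timestamps: Iterable[int]) -> Iterator[Tuple[int, int]]:
    """Yield (hour_start, event_count) for in-order timestamps given in seconds.

    An hour's count is emitted as soon as the first event of a later hour
    arrives, so the function also works on input that never ends.
    """
    current = None
    n = 0
    for ts in timestamps:
        hour = ts - ts % HOUR
        if current is not None and hour != current:
            yield current, n
            n = 0
        current = hour
        n += 1
    if current is not None:
        # Only reached when the input is bounded: emit the last partition.
        yield current, n


# Batch: a bounded input produces a complete, final result.
batch_result = list(hourly_counts([0, 10, 3599, 3600, 7200, 7300]))

# Streaming: an unbounded input (one event every 30 minutes, forever)
# produces results incrementally; here only the first three are taken.
unbounded = (i * 1800 for i in count())
stream_result = list(islice(hourly_counts(unbounded), 3))
```

The only difference between the two runs is whether the input ends; the processing logic is shared.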

Page 43:

Why use batch at all now?

… or Flink's DataSet API
… dedicated batch processors

• Cost of fault tolerance and accuracy
• Resource elasticity / efficiency
• Missing primitives (example: BSP iterations)
• Possible to add to DataStream API
• Deeper integration between batch and streaming techniques

Page 44:

Some batch proof points…

• TeraSort
• Relational Join
• Classic Batch Jobs
• Graph Processing
• Linear Algebra

Page 45:

State in stream processing

Stateless Streaming (Apache Storm)

Stateful Streaming (Apache Samza)

Accurate Stateful Streaming (Apache Flink)

State sizes in Flink today (my assessment): 10s of gigabytes per operator

How to scale this to many terabytes?
• Queryable State
• Data driven triggers over large state

Page 46:

Large-state streaming

How to scale the stream processor state?
… and maintain fast checkpoint intervals?
… and have very fast recovery on machine failures?

More and more database techniques coming into Flink

Page 47:

…in conclusion

1. Flink is running in some of the largest streaming setups
2. Community is working on adding many state-of-the-art operational features
3. Available to broader audiences, via Stream SQL
4. Streaming has even more potential to subsume batch and will hold more and more application state

Page 48:

Enjoy the conference!