
Transcript of First Flink Bay Area meetup

Page 1: First Flink Bay Area meetup

Introduction to Apache Flink™

Kostas Tzoumas (@kostas_tzoumas)

Page 2: First Flink Bay Area meetup


Flink is a stream processor with many faces

Streaming dataflow runtime

Page 3: First Flink Bay Area meetup

Example: transitive closure in the Scala DataSet API:

case class Path(from: Long, to: Long)
val tc = edges.iterate(10) { paths: DataSet[Path] =>
  val next = paths
    .join(edges)
    .where("to")
    .equalTo("from") { (path, edge) => Path(path.from, edge.to) }
    .union(paths)
    .distinct()
  next
}

[Figure: from program to execution. Pre-flight (client): optimizer, type extraction, and task scheduling turn the program into a dataflow graph with metadata. The JobManager deploys operators; TaskManagers track intermediate results. Example plan: Data Source (orders.tbl) -> Filter -> Map and Data Source (lineitem.tbl) are hash-partitioned [0] into a Hybrid Hash Join (build HT / probe), which feeds a sort-based GroupReduce via forward shipping.]

Page 4: First Flink Bay Area meetup

Flink's internal execution model


Page 5: First Flink Bay Area meetup

Flink execution model

A program is a DAG of operators

Operators = computation + state

Operators produce intermediate results = logical streams of records

Other operators can consume those

[Figure: operator DAG with map, join, and sum operators producing intermediate results ID1, ID2, ID3.]

Page 6: First Flink Bay Area meetup

A map-reduce job with Flink

[Figure: the JobManager holds the ExecutionGraph; TaskManager 1 and TaskManager 2 run map tasks M1 and M2, result partitions RP1 and RP2, and reduce tasks R1 and R2, with numbered deployment and data-exchange steps 1 through 5a/5b.]

Page 7: First Flink Bay Area meetup

One runtime for batch and streaming

Intermediate results can be:
• Pipelined and ephemeral: stream data shuffles
• Blocked and ephemeral: batch data shuffles
• Checkpointed: caching for recovery or reuse

Page 8: First Flink Bay Area meetup

Pipelining


Basic building block to “keep the data moving”

Note: pipelined systems do not usually transfer individual tuples, but buffers that batch several tuples!
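As a rough illustration (plain Java, not Flink's actual classes; `PipelinedChannel` and all names are hypothetical), a pipelined channel that ships fixed-size buffers of tuples rather than individual records might look like this:

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch: records are appended to a fixed-size buffer, and the whole
// buffer is shipped downstream once it fills up, amortizing per-record
// transfer overhead while the data keeps moving.
class PipelinedChannel<T> {
    private final int bufferSize;
    private final List<T> buffer = new ArrayList<>();
    private final List<List<T>> shipped = new ArrayList<>(); // stands in for the network

    PipelinedChannel(int bufferSize) { this.bufferSize = bufferSize; }

    void emit(T record) {
        buffer.add(record);
        if (buffer.size() == bufferSize) flush();
    }

    // A real engine would also flush on a timeout or at end-of-input,
    // so low-rate streams still see low latency.
    void flush() {
        if (!buffer.isEmpty()) {
            shipped.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    List<List<T>> shippedBuffers() { return shipped; }
}
```

Emitting seven records through a channel with buffer size 3 ships two full buffers plus one partial buffer on the final flush.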

Page 9: First Flink Bay Area meetup

Streaming fault tolerance

Ensure that operators see all events
• “At least once”
• Solved by replaying a stream from a checkpoint, e.g., from a past Kafka offset

Ensure that operators do not perform duplicate updates to their state
• “Exactly once”
• Several solutions
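The at-least-once half of this can be sketched with a toy source that remembers the offset of its last checkpoint and rewinds to it on recovery (hypothetical names; a plain list stands in for a Kafka partition):

```java
import java.util.List;

// Toy sketch of at-least-once replay: the source stores the offset of the
// last completed checkpoint; after a failure it re-reads from that offset,
// so every event is seen at least once (some possibly twice).
class ReplayableSource {
    private final List<String> log;       // stands in for a Kafka partition
    private int position = 0;             // current read offset
    private int checkpointedOffset = 0;   // offset stored with the last checkpoint

    ReplayableSource(List<String> log) { this.log = log; }

    String poll() { return position < log.size() ? log.get(position++) : null; }

    void checkpoint() { checkpointedOffset = position; }

    void recover() { position = checkpointedOffset; } // replay from checkpoint
}
```

After recovery, the event at the checkpointed offset is delivered again, which is exactly why "at least once" alone can cause duplicate state updates.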

Page 10: First Flink Bay Area meetup

Exactly once approaches

Discretized streams (Spark Streaming)
• Treat streaming as a series of small atomic computations
• “Fast track” to fault tolerance, but restricts the computational and programming model (e.g., state cannot be mutated across “mini-batches”, and window functions are correlated with the mini-batch size)

MillWheel (Google Cloud Dataflow)
• State updates and derived events are committed as an atomic transaction to a high-throughput transactional store
• Requires a very high-throughput transactional store

Chandy-Lamport distributed snapshots (Flink)

Page 11: First Flink Bay Area meetup

[Figure: the JobManager registers the checkpoint barrier on the master; replay will start from here.]

Page 12: First Flink Bay Area meetup

[Figure: barriers “push” prior events ahead of them (this assumes in-order delivery on individual channels); operators are shown with checkpointing starting, in progress, and finished.]

Page 13: First Flink Bay Area meetup

[Figure: the operator checkpoint takes a snapshot of its state after acknowledged data have updated that state. Checkpoints are currently one-off and synchronous; incremental and asynchronous checkpointing is work in progress.]

State backup is a pluggable mechanism: currently either the JobManager (for small state) or a file system (HDFS/Tachyon); in-memory grids are work in progress.

Page 14: First Flink Bay Area meetup

[Figure: state backup. State snapshots at the sinks signal the successful end of this checkpoint. On failure, recover the last checkpointed state and restart the sources from the last barrier; this guarantees at least once.]
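The barrier mechanism of the preceding slides can be sketched as a toy operator that snapshots its state only once a barrier for the checkpoint has arrived on every input channel (hypothetical names; real Flink additionally buffers post-barrier records on already-aligned channels, which this sketch omits):

```java
import java.util.HashSet;
import java.util.Set;

// Toy sketch of barrier alignment: an operator with several input channels
// takes its state snapshot only after the checkpoint barrier has been seen
// on all channels, so the snapshot reflects exactly the pre-barrier records.
class AligningOperator {
    private final int numChannels;
    private final Set<Integer> barrierSeen = new HashSet<>();
    private long state = 0;       // e.g. a running sum
    private Long snapshot = null; // state captured at the checkpoint

    AligningOperator(int numChannels) { this.numChannels = numChannels; }

    void onRecord(int channel, long value) {
        // Assumes in-order input with no post-barrier records on
        // already-aligned channels (a real operator buffers those).
        state += value;
    }

    void onBarrier(int channel) {
        barrierSeen.add(channel);
        if (barrierSeen.size() == numChannels) { // aligned on all inputs
            snapshot = state;                    // take the checkpoint
            barrierSeen.clear();
        }
    }

    Long snapshot() { return snapshot; }
    long state() { return state; }
}
```

Records arriving after the snapshot keep mutating the live state, while the snapshot stays fixed until the next checkpoint completes.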

Page 15: First Flink Bay Area meetup

Best of all worlds for streaming

Low latency
• Thanks to the pipelined engine

Exactly-once guarantees
• Variation of Chandy-Lamport

High throughput
• Controllable checkpointing overhead

Separates app logic from recovery
• Checkpointing interval is just a config parameter

Page 16: First Flink Bay Area meetup

Faces of a stream processor

• Stream processing
• Batch processing
• Machine Learning at scale
• Graph Analysis

Page 17: First Flink Bay Area meetup

Stream data analytics


Page 18: First Flink Bay Area meetup

case class Word(word: String, frequency: Int)

DataStream API (streaming):

val lines: DataStream[String] = env.fromSocketStream(...)

lines.flatMap { line => line.split(" ")
    .map(word => Word(word, 1)) }
  .window(Time.of(5, SECONDS)).every(Time.of(1, SECONDS))
  .groupBy("word").sum("frequency")
  .print()

DataSet API (batch):

val lines: DataSet[String] = env.readTextFile(...)

lines.flatMap { line => line.split(" ")
    .map(word => Word(word, 1)) }
  .groupBy("word").sum("frequency")
  .print()

Page 19: First Flink Bay Area meetup

Flink stack

[Figure: Flink stack at this point: DataStream (Java/Scala) on top of the streaming dataflow runtime.]

Page 20: First Flink Bay Area meetup


Batch data analytics

Page 21: First Flink Bay Area meetup

Batch is a special case of streaming

• Blocking work units are embedded in the streaming topology
• Lower-overhead fault tolerance via replaying intermediate results

[Figure: operator DAG with map, join, and sum operators producing intermediate results ID1, ID2, ID3.]
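Replaying a materialized intermediate result instead of re-running its producer can be sketched as follows (plain Java, hypothetical names):

```java
import java.util.List;
import java.util.function.Supplier;

// Toy sketch: a blocking intermediate result is materialized the first time
// it is consumed; a consumer that restarts after a failure replays the
// cached result instead of re-running the producing operator.
class MaterializedResult<T> {
    private final Supplier<List<T>> producer;
    private List<T> cached = null;
    int producerRuns = 0; // how often the upstream computation actually ran

    MaterializedResult(Supplier<List<T>> producer) { this.producer = producer; }

    List<T> consume() {
        if (cached == null) {      // first consumption: run the producer
            producerRuns++;
            cached = producer.get();
        }
        return cached;             // replay: no recomputation needed
    }
}
```

This is the lower-overhead recovery path: the restarted consumer re-reads data, but the upstream work is not repeated.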

Page 22: First Flink Bay Area meetup

Managed memory in Flink

[Figure: behavior as memory runs out.]

Page 23: First Flink Bay Area meetup

Cost-based optimizer


Page 24: First Flink Bay Area meetup

Flink stack

[Figure: Flink stack. Libraries: Table (on both APIs). APIs: DataSet (Java/Scala), DataStream (Java/Scala), and Hadoop M/R compatibility. Core: streaming dataflow runtime. Deployment: local, cluster (YARN, Tez), embedded.]

Page 25: First Flink Bay Area meetup


Iterative processing

Page 26: First Flink Bay Area meetup


FlinkML

API for ML pipelines inspired by scikit-learn

Collection of packaged algorithms
• SVM, Multiple Linear Regression, Optimization, ALS, ...

val trainingData: DataSet[LabeledVector] = ...
val testingData: DataSet[Vector] = ...

val scaler = StandardScaler()
val polyFeatures = PolynomialFeatures().setDegree(3)
val mlr = MultipleLinearRegression()

val pipeline = scaler.chainTransformer(polyFeatures).chainPredictor(mlr)

pipeline.fit(trainingData)

val predictions: DataSet[LabeledVector] = pipeline.predict(testingData)

Page 27: First Flink Bay Area meetup

Gelly

Graph API and library

Packaged algorithms
• PageRank, SSSP, Label Propagation, Community Detection, Connected Components

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

Graph<Long, Long, NullValue> graph = ...

DataSet<Vertex<Long, Long>> verticesWithCommunity =
    graph.run(new LabelPropagation<Long>(30)).getVertices();

verticesWithCommunity.print();

env.execute();

Page 28: First Flink Bay Area meetup

Iterative processing in Flink

Flink offers built-in iterations and delta iterations to execute ML and graph algorithms efficiently

[Figure: iterative dataflow with map, join, and sum operators and intermediate results ID1, ID2, ID3.]

Page 29: First Flink Bay Area meetup

Example: Matrix Factorization

Factorizing a matrix with 28 billion ratings for recommendations

More at: http://data-artisans.com/computing-recommendations-with-flink.html

Page 30: First Flink Bay Area meetup

The full stack

[Figure: the full Flink stack. Libraries and compatibility layers: Gelly, Table, ML, SAMOA (WiP), Dataflow (WiP), MRQL, Cascading (WiP), Storm compatibility (WiP), and Zeppelin integration. APIs: DataSet (Java/Scala), DataStream, and Hadoop M/R compatibility. Core: streaming dataflow runtime. Deployment: local, cluster (YARN, Tez), embedded.]

Page 31: First Flink Bay Area meetup

Closing


Page 32: First Flink Bay Area meetup

tl;dr: what was this about?

The case for Flink as a stream processor:
• Low latency
• High throughput
• Exactly once
• Easy-to-use APIs, library ecosystem
• Growing community

A stream processor that is great for batch analytics as well

Page 33: First Flink Bay Area meetup

Demo time


Page 34: First Flink Bay Area meetup

I Flink, do you?

If you find this exciting, get involved and start a discussion on Flink's mailing list, or stay tuned by subscribing to [email protected], following flink.apache.org/blog, and @ApacheFlink on Twitter.

Page 35: First Flink Bay Area meetup


flink-forward.org