HERACLITUS TEACHES Kafka Streams - JAX London · 2018-06-20 · 0.11 Exactly-once semantics 0.10...

@tlberglund Kafka Streams HERACLITUS TEACHES (AND LEARNS TO STOP CRYING!)


Page 1:

@tlberglund

Kafka Streams
HERACLITUS TEACHES
(AND LEARNS TO STOP CRYING!)

Page 2:

@tlberglund

Kafka Streams
HERACLITUS TEACHES
(AND LEARNS TO STOP CRYING!)

Page 3:

Heraclitus
• Lived 535-475 BC in Ephesus
• Wrestled with problems of metaphysics
• Struggled with depression
• Probably did not use Kafka Streams

Page 4:

Heraclitus

• Tension of opposites
• Fire
• All things change
• “No one steps into the same river twice.”

Page 5:
Page 6:
Page 7:

Kafka release timeline (axis: 2012 2013 2014 2015 2016 2017)
• 0.7 Cluster mirroring
• 0.8 Intra-cluster replication
• 0.9 Data integration (Connect API)
• 0.10 Data processing (Streams API)
• 0.11 Exactly-once semantics

Page 8:

As developers, we want to build APPS, not INFRASTRUCTURE

Page 9:

We want our apps to be:
• Scalable
• Elastic
• Fault-tolerant
• Stateful
• Distributed

Page 10:

Where do I put my compute?

Page 11:

Where do I put my state?

Page 12:

The actual question is: Where is my code?

Page 13:

The KAFKA STREAMS API is a JAVA API to build real-time applications TO POWER THE BUSINESS

Page 14:

<— Not running inside brokers!

Page 15:

Brokers? Still nope!

Page 16:

Before

Page 17:

Before

Page 18:

Before

Page 19:

After

Page 20:

After

Page 21:

After

Page 22:

This means you can DEPLOY your app ANYWHERE using whatever technology YOU WANT

Page 23:

Things Kafka Streams Does
• Runs everywhere
• Clustering done for you
• Exactly-once processing
• Event-time processing
• Integrated database
• Joins, windowing, aggregation
• S/M/L/XL/XXL/XXXL sizes
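
Exactly-once processing, one of the items on this slide, is switched on through configuration rather than code. A minimal sketch, assuming the 0.11-era setting names: the keys `application.id`, `bootstrap.servers`, and `processing.guarantee` are Kafka Streams configuration names, while the class name, the method name, and the example values are made up for illustration. Later releases added newer values for this setting, so check the documentation for the version in use.

```java
import java.util.Properties;

class ExactlyOnceConfig {

    // Builds the settings a Streams app would be started with.
    // appId and brokers are hypothetical example inputs.
    static Properties build(String appId, String brokers) {
        Properties props = new Properties();
        props.setProperty("application.id", appId);
        props.setProperty("bootstrap.servers", brokers);
        // The default guarantee is "at_least_once"; this opts in to exactly-once.
        props.setProperty("processing.guarantee", "exactly_once");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("ratings-app", "localhost:9092");
        System.out.println(props.getProperty("processing.guarantee"));
    }
}
```

In a real application these properties would be handed to the Kafka Streams runtime together with the topology; the sketch stops short of that so it runs with the JDK alone.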

Page 24:

An integration story?

For another time…

Page 25:

First, some API CONCEPTS

Page 26:

STREAMS are EVERYWHERE

Page 27:

TABLES are EVERYWHERE

Page 28:

Streams to Tables

Page 29:

Tables to Streams

Page 30:

Stream/Table Duality

Page 31:

Stream/Table Duality
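
The duality can be sketched without Kafka at all. The following is plain Java, not the Kafka Streams API, and every name in it is hypothetical: a stream is modelled as an ordered list of key/value change events, and a table as a map holding the latest value per key. Replaying the stream yields the table; walking the table yields a stream that rebuilds it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StreamTableDuality {

    // Stream -> table: replay the events in order, keeping the latest value per key.
    static Map<String, Long> toTable(List<Map.Entry<String, Long>> stream) {
        Map<String, Long> table = new LinkedHashMap<>();
        for (Map.Entry<String, Long> event : stream) {
            table.put(event.getKey(), event.getValue());
        }
        return table;
    }

    // Table -> stream: emit one change event per row of the table.
    static List<Map.Entry<String, Long>> toStream(Map<String, Long> table) {
        List<Map.Entry<String, Long>> stream = new ArrayList<>();
        table.forEach((key, value) -> stream.add(Map.entry(key, value)));
        return stream;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> changelog = List.of(
                Map.entry("alice", 1L),
                Map.entry("bob", 5L),
                Map.entry("alice", 3L)); // a later event overwrites alice's earlier value

        Map<String, Long> table = toTable(changelog);
        System.out.println(table); // prints {alice=3, bob=5}

        // Round trip: the table's own stream rebuilds the same table.
        System.out.println(toTable(toStream(table)).equals(table)); // prints true
    }
}
```

The sketch only captures a snapshot. A real table keeps emitting a change event every time a row is updated, which is what makes the conversion work continuously in both directions.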

Page 32:

KStream

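// Read the raw-ratings topic with Long keys and String values.
// (Pre-1.0 KStreamBuilder signature; with StreamsBuilder the serdes are passed via Consumed.with(...).)
KStream<Long, String> rawRatings =
    builder.stream(Serdes.Long(), Serdes.String(), "raw-ratings");

// Parse each text value into a Rating, then re-key the stream by movie ID.
KStream<Long, Rating> ratings = rawRatings
    .mapValues(text -> Parser.parseRating(text))
    .map((key, rating) -> new KeyValue<Long, Rating>(rating.getMovieId(), rating));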
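
The difference between `mapValues` (key untouched) and `map` (key may change) can be seen in a plain-Java analogue of the slide. This is a sketch using `java.util.stream`, not Kafka Streams. The `Rating` class here is a hypothetical stand-in, and the `movieId,rating` text format is an assumption, since the talk's `Parser.parseRating` is not shown.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class RekeyByMovie {

    // Hypothetical stand-in for the talk's Rating type.
    static final class Rating {
        final long movieId;
        final float rating;

        Rating(long movieId, float rating) {
            this.movieId = movieId;
            this.rating = rating;
        }
    }

    // Assumed input format "movieId,rating"; the real parser is not shown in the slides.
    static Rating parse(String text) {
        String[] parts = text.split(",");
        return new Rating(Long.parseLong(parts[0].trim()), Float.parseFloat(parts[1].trim()));
    }

    static List<Map.Entry<Long, Rating>> rekey(List<Map.Entry<Long, String>> raw) {
        return raw.stream()
                // Like mapValues: the value changes, the key stays the same.
                .map(e -> Map.entry(e.getKey(), parse(e.getValue())))
                // Like map: a new key (the movie ID) is taken from the value.
                .map(e -> Map.entry(e.getValue().movieId, e.getValue()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map.Entry<Long, Rating>> out =
                rekey(List.of(Map.entry(7L, "42,4.5"), Map.entry(8L, "42,3.0")));
        for (Map.Entry<Long, Rating> e : out) {
            System.out.println(e.getKey() + " -> " + e.getValue().rating);
        }
    }
}
```

Re-keying matters for the next slide: grouping and aggregating work per key, so the records have to be keyed by movie ID before ratings can be counted and summed per movie.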

Page 33:

KTable

// Keep only the numeric rating value.
KStream<Long, Float> numericalRatings = ratings.mapValues(rating -> rating.getRating());

// Group by the current key (movie ID) before aggregating.
KGroupedStream<Long, Float> ratingsByMovieId = numericalRatings.groupByKey();

// Count and sum the ratings per movie, then join the two tables to get the average.
KTable<Long, Long> ratingCount = ratingsByMovieId.count();
KTable<Long, Float> ratingSum = ratingsByMovieId.reduce((r1, r2) -> r1 + r2);
KTable<Long, Float> ratingAvg = ratingSum.join(ratingCount,
    (sum, count) -> sum.floatValue() / count.floatValue());
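
What the count, reduce, and join steps compute can be checked with ordinary maps. This is a plain-Java sketch, not the Kafka Streams API, and the class and method names are hypothetical: two maps play the roles of `ratingCount` and `ratingSum`, and the "join" combines their rows on the shared movie ID.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RatingAverages {

    // Returns the average rating per movie ID once every rating has been applied.
    static Map<Long, Float> average(List<Map.Entry<Long, Float>> ratings) {
        Map<Long, Long> count = new HashMap<>();  // plays the role of ratingCount
        Map<Long, Float> sum = new HashMap<>();   // plays the role of ratingSum

        for (Map.Entry<Long, Float> r : ratings) {
            count.merge(r.getKey(), 1L, Long::sum);           // like count()
            sum.merge(r.getKey(), r.getValue(), Float::sum);  // like reduce((r1, r2) -> r1 + r2)
        }

        // Like join: pair up the rows of both tables that share a key.
        Map<Long, Float> avg = new HashMap<>();
        sum.forEach((movieId, total) -> avg.put(movieId, total / count.get(movieId)));
        return avg;
    }

    public static void main(String[] args) {
        Map<Long, Float> avg = average(List.of(
                Map.entry(1L, 4.0f),
                Map.entry(1L, 2.0f),
                Map.entry(2L, 5.0f)));
        System.out.println(avg);
    }
}
```

The sketch only produces the final state. In the streaming version each incoming rating updates the count and sum tables, and the joined average table is updated as those change.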

Page 34:

DEMO

Page 35:

Remember, we want to build APPS, not INFRASTRUCTURE

Page 36:

Fault Tolerance

Page 37:

Elasticity

Page 38:

Elasticity

Economical at small and large scale

Page 39:

Shared State
Probably failing at life

Page 40:

Shared State
Adulation of peers

Page 41:

Shared State

Lower infrastructure costs…

Page 42:

THANK YOU!

@tlberglund