Introduction to Kafka Streams

Transcript
Page 1: Introduction to Kafka Streams

Kafka Streams: Stream Processing Made Simple with Kafka

1

Guozhang Wang Hadoop Summit, June 28, 2016

Page 2: Introduction to Kafka Streams

2

What is NOT Stream Processing?

Page 3: Introduction to Kafka Streams

3

Stream Processing isn’t (necessarily)

• Transient, approximate, lossy…

• .. something for which you must keep batch processing as a safety net

Page 4: Introduction to Kafka Streams

4

Page 5: Introduction to Kafka Streams

5

Page 6: Introduction to Kafka Streams

6

Page 7: Introduction to Kafka Streams

7

Page 8: Introduction to Kafka Streams

8

Stream Processing

• A different programming paradigm

• .. that brings computation to unbounded data

• .. with tradeoffs between latency / cost / correctness

Page 9: Introduction to Kafka Streams

9

Why Kafka in Stream Processing?

Page 10: Introduction to Kafka Streams

10

Kafka: Real-time Platforms

• Persistent Buffering

• Logical Ordering

• Scalable "source-of-truth"

Page 11: Introduction to Kafka Streams

11

Stream Processing with Kafka


Page 13: Introduction to Kafka Streams

13

Stream Processing with Kafka

• Option I: Do It Yourself!

while (isRunning) {
    // read some messages from Kafka
    inputMessages = consumer.poll();

    // do some processing...

    // send output messages back to Kafka
    producer.send(outputMessages);
}
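Spelled out, Option I looks roughly like the following; a minimal sketch, assuming a broker at localhost:9092, String-keyed/valued topics named "input" and "output", and the plain 0.10 consumer and producer clients (all names here are illustrative):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DiyStreamProcessor {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "diy-processor");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        consumer.subscribe(Collections.singletonList("input"));

        while (true) {
            // read some messages from Kafka
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                // do some processing... (here: upper-case the value)
                String result = record.value().toUpperCase();
                // send output messages back to Kafka
                producer.send(new ProducerRecord<>("output", record.key(), result));
            }
        }
    }
}

Even this toy version silently ignores ordering, state, and failure handling, which is exactly what the next slide is about.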


Page 15: Introduction to Kafka Streams

15

DIY Stream Processing is Hard

• Ordering

• Partitioning & Scalability

• Fault tolerance

• State Management

• Time, Window & Out-of-order Data

• Re-processing

Page 16: Introduction to Kafka Streams

16

Stream Processing with Kafka

• Option I: Do It Yourself!

• Option II: a full-fledged stream processing system

• Storm, Spark, Flink, Samza, ..


Page 19: Introduction to Kafka Streams

19

MapReduce Heritage?

• Config Management

• Resource Management

• Deployment

• etc..

Can I just use my own?!

Page 20: Introduction to Kafka Streams

20

Stream Processing with Kafka

• Option I: Do It Yourself!

• Option II: a full-fledged stream processing system

• Option III: a lightweight stream processing library

Page 21: Introduction to Kafka Streams

Kafka Streams

• In Apache Kafka since v0.10, May 2016

• Powerful yet easy-to-use stream processing library

• Event-at-a-time, stateful

• Windowing with out-of-order handling

• Highly scalable, distributed, fault tolerant

• and more..

21

Page 22: Introduction to Kafka Streams

22

Anywhere, anytime


Page 23: Introduction to Kafka Streams

23

Anywhere, anytime

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams</artifactId>
    <version>0.10.0.0</version>
</dependency>
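If you build with Gradle instead of Maven, the same artifact would be pulled in with (coordinates as above):

compile 'org.apache.kafka:kafka-streams:0.10.0.0'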

Page 24: Introduction to Kafka Streams

24

Anywhere, anytime

(diagram: deployment options ranging from "very uncool" to "very cool": war file, rsync, Puppet/Chef, YARN, Mesos, Docker, Kubernetes)

Page 25: Introduction to Kafka Streams

25

Simple is Beautiful

Page 26: Introduction to Kafka Streams

Kafka Streams DSL

26

public static void main(String[] args) {
    // specify the processing topology by first reading in a stream from a topic
    KStream<String, String> words = builder.stream("topic1");

    // count the words in this stream as an aggregated table
    KTable<String, Long> counts = words.countByKey("Counts");

    // write the result table to a new topic
    counts.to("topic2");

    // create a stream processing instance and start running it
    KafkaStreams streams = new KafkaStreams(builder, config);
    streams.start();
}
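Assembled into a complete program, the same example might look like this; a sketch against the 0.10.0 API shown above (the application id, broker address, and serde choices are illustrative):

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;

public class WordCountApp {
    public static void main(String[] args) {
        Properties cfg = new Properties();
        cfg.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-app");
        cfg.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        cfg.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        cfg.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        KStreamBuilder builder = new KStreamBuilder();

        // read a stream of words from a topic
        KStream<String, String> words = builder.stream("topic1");
        // count occurrences per key into an aggregated table
        KTable<String, Long> counts = words.countByKey("Counts");
        // write the continuously updated counts to a new topic
        counts.to(Serdes.String(), Serdes.Long(), "topic2");

        KafkaStreams streams = new KafkaStreams(builder, cfg);
        streams.start();
    }
}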


Page 32: Introduction to Kafka Streams

32

Native Kafka Integration

Properties cfg = new Properties();
cfg.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
cfg.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
cfg.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
cfg.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
cfg.put(KafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "registry:8081");

StreamsConfig config = new StreamsConfig(cfg);

KafkaStreams streams = new KafkaStreams(builder, config);



Page 35: Introduction to Kafka Streams

35

API, coding

“Full stack” evaluation

Operations, debugging, …

Simple is Beautiful

Page 36: Introduction to Kafka Streams

36

Key Idea:

Outsource hard problems to Kafka!

Page 37: Introduction to Kafka Streams

Kafka Concepts: the Log

(diagram: an append-only log of messages; the producer writes to the tail while Consumer1 reads at offset 7 and Consumer2 reads at offset 10)

Page 38: Introduction to Kafka Streams

Kafka Concepts: the Log

(diagram: producers writing to and consumers reading from partitioned topics, Topic 1 and Topic 2, spread across brokers)

Page 39: Introduction to Kafka Streams

39

Kafka Streams: Key Concepts

Page 40: Introduction to Kafka Streams

Stream and Records

40

(diagram: a stream is an ordered sequence of key-value records)


Page 42: Introduction to Kafka Streams

Processor Topology

42

(diagram: a processor topology of stream processors connected by streams)

Page 43: Introduction to Kafka Streams

Processor Topology

43

KStream<..> stream1 = builder.stream("topic1");
KStream<..> stream2 = builder.stream("topic2");
KStream<..> joined = stream1.leftJoin(stream2, ...);
KTable<..> aggregated = joined.aggregateByKey(...);
aggregated.to("topic3");


Page 48: Introduction to Kafka Streams

Processor Topology

48

Source Processors: builder.stream(...)

Sink Processor: aggregated.to(...)

Page 49: Introduction to Kafka Streams

Processor Topology

49

KStream<..> stream1 = builder.stream("topic1");
KStream<..> stream2 = builder.stream("topic2");
KStream<..> joined = stream1.leftJoin(stream2, ...);
KTable<..> aggregated = joined.aggregateByKey(...);
aggregated.to("topic3");

The same topology, expressed in the lower-level Processor API:

builder.addSource("Source1", "topic1")
       .addSource("Source2", "topic2")
       .addProcessor("Join", MyJoin::new, "Source1", "Source2")
       .addProcessor("Aggregate", MyAggregate::new, "Join")
       .addStateStore(Stores.persistent().build(), "Aggregate")
       .addSink("Sink", "topic3", "Aggregate");

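MyJoin and MyAggregate above are user-supplied processor classes. A minimal sketch of what such a stateful processor could look like, assuming the 0.10.0 Processor interface and a key-value store named "Counts" attached via addStateStore() (the class and store names are illustrative):

import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class MyAggregate implements Processor<String, String> {
    private KeyValueStore<String, Long> store;

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        // fetch the state store attached to this processor
        store = (KeyValueStore<String, Long>) context.getStateStore("Counts");
    }

    @Override
    public void process(String key, String value) {
        // update the running count for this key
        Long count = store.get(key);
        store.put(key, count == null ? 1L : count + 1);
    }

    @Override
    public void punctuate(long timestamp) {
        // optionally emit results periodically; a no-op in this sketch
    }

    @Override
    public void close() { }
}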

Page 53: Introduction to Kafka Streams

Processor Topology

53

(diagram: the processor topology runs inside the Kafka Streams application, reading from and writing back to Kafka)

Page 54: Introduction to Kafka Streams

Processor Topology

54

sink1.to("topic1");
source1 = builder.table("topic1");
source2 = sink1.through("topic2");


Page 58: Introduction to Kafka Streams

Processor Topology

58


Sub-Topology

Page 59: Introduction to Kafka Streams

Processor Topology

59

(diagram: to() and through() split the topology into sub-topologies connected through Kafka topics)


Page 63: Introduction to Kafka Streams

Stream Partitions and Tasks

63

(diagram: Kafka Topic A and Kafka Topic B, each with partitions P1 and P2)


Page 66: Introduction to Kafka Streams

Stream Partitions and Tasks

66

(diagram: partitions are grouped into stream tasks; Task1 and Task2 each own one partition of Kafka Topic A and Kafka Topic B)


Page 68: Introduction to Kafka Streams

Stream Threads

68

(diagram: a single application instance, MyApp.1, runs both Task1 and Task2)

Page 69: Introduction to Kafka Streams

Stream Threads

69

(diagram: a second instance, MyApp.2, joins and one task migrates to it, so each instance runs one task)


Page 72: Introduction to Kafka Streams

Stream Threads

72

(diagram: a third instance, MyApp.3, joins and takes over Task3)


Page 74: Introduction to Kafka Streams

Stream Threads

74

(diagram: within one instance, Thread1 runs Task1 and Task2 while Thread2 runs Task3 and Task4)

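How many threads each instance runs is plain configuration rather than cluster setup; a one-line sketch using the StreamsConfig constant (the value is illustrative):

cfg.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, "2"); // run two stream threads in this instance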

Page 78: Introduction to Kafka Streams

78

Stream Processing Hard Parts

• Ordering

• Partitioning & Scalability

• Fault tolerance

• State Management

• Time, Window & Out-of-order Data

• Re-processing

Page 79: Introduction to Kafka Streams

States in Stream Processing

79

• Stateless: filter, map

• Stateful: join, aggregate


Page 81: Introduction to Kafka Streams

States in Stream Processing

81

KStream<..> stream1 = builder.stream("topic1");
KStream<..> stream2 = builder.stream("topic2");
KStream<..> joined = stream1.leftJoin(stream2, ...);
KTable<..> aggregated = joined.aggregateByKey(...);   // State
aggregated.to("topic3");

Page 82: Introduction to Kafka Streams

82

States in Stream Processing

builder.addSource("Source1", "topic1")
       .addSource("Source2", "topic2")
       .addProcessor("Join", MyJoin::new, "Source1", "Source2")
       .addProcessor("Aggregate", MyAggregate::new, "Join")
       .addStateStore(Stores.persistent().build(), "Aggregate")   // State
       .addSink("Sink", "topic3", "Aggregate");
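The Stores.persistent().build() on the slide is shorthand; spelled out against the 0.10.0 Stores factory it would look roughly like this (the store name and serde choices are illustrative):

import org.apache.kafka.streams.processor.StateStoreSupplier;
import org.apache.kafka.streams.state.Stores;

StateStoreSupplier countStore = Stores.create("Counts")
    .withStringKeys()
    .withLongValues()
    .persistent()   // RocksDB-backed locally, changelogged to Kafka for fault tolerance
    .build();

builder.addStateStore(countStore, "Aggregate");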

Page 83: Introduction to Kafka Streams

States in Stream Processing

83

(diagram: each task, Task1 and Task2, maintains its own local state store)

Page 84: Introduction to Kafka Streams

It’s all about Time

• Event-time (when an event is created)

• Processing-time (when an event is processed)

84

Page 85: Introduction to Kafka Streams

85

Event-time:      1     2     3     4     5     6     7
Processing-time: 1999  2002  2005  1997  1980  1983  2015

(diagram: the Star Wars saga as out-of-order data; by event-time the episodes run The Phantom Menace, Attack of the Clones, Revenge of the Sith, A New Hope, The Empire Strikes Back, Return of the Jedi, The Force Awakens, but their release years arrive in a very different order)

Out-of-Order

Page 86: Introduction to Kafka Streams

Timestamp Extractor

86

// processing-time
public long extract(ConsumerRecord<Object, Object> record) {
    return System.currentTimeMillis();
}

// event-time
public long extract(ConsumerRecord<Object, Object> record) {
    return record.timestamp();
}


Page 89: Introduction to Kafka Streams

Timestamp Extractor

89

// event-time, embedded in the message payload
public long extract(ConsumerRecord<Object, Object> record) {
    return ((JsonNode) record.value()).get("timestamp").longValue();
}
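As a complete class, a custom extractor implements the TimestampExtractor interface and is registered through configuration; a sketch against the 0.10.0 interface (the payload field name "timestamp" and the class name are assumptions):

import com.fasterxml.jackson.databind.JsonNode;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.processor.TimestampExtractor;

public class PayloadTimestampExtractor implements TimestampExtractor {
    @Override
    public long extract(ConsumerRecord<Object, Object> record) {
        // event-time read from a field inside the JSON payload
        return ((JsonNode) record.value()).get("timestamp").longValue();
    }
}

// registered via:
// cfg.put(StreamsConfig.TIMESTAMP_EXTRACTOR_CLASS_CONFIG, PayloadTimestampExtractor.class.getName());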

Page 90: Introduction to Kafka Streams

Windowing

90

(diagram: windows laid over the time axis t, advancing as records arrive)

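In the DSL, windowing is an overload on the aggregation operators; a sketch against the 0.10.0 API, assuming 5-second tumbling windows over the word stream from earlier (names are illustrative):

import java.util.concurrent.TimeUnit;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

// count per key within each 5-second window; late (out-of-order) records
// are folded into the window their event-time says they belong to
KTable<Windowed<String>, Long> windowedCounts =
    words.countByKey(TimeWindows.of("Counts", TimeUnit.SECONDS.toMillis(5)), Serdes.String());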

Page 97: Introduction to Kafka Streams

97

Stream Processing Hard Parts

• Ordering

• Partitioning & Scalability

• Fault tolerance

• State Management

• Time, Window & Out-of-order Data

• Re-processing

Page 98: Introduction to Kafka Streams

Stream vs. Table?

98

KStream<..> stream1 = builder.stream("topic1");
KStream<..> stream2 = builder.stream("topic2");
KStream<..> joined = stream1.leftJoin(stream2, ...);
KTable<..> aggregated = joined.aggregateByKey(...);   // State
aggregated.to("topic3");

Page 99: Introduction to Kafka Streams

99

Tables ≈ Streams


Page 103: Introduction to Kafka Streams

The Stream-Table Duality

• A stream is a changelog of a table

• A table is a materialized view of a stream at a point in time

• Example: change data capture (CDC) of databases

103

Page 104: Introduction to Kafka Streams

KStream = data interpreted as a record stream

~ think: "append-only"

KTable = data interpreted as a changelog stream

~ a continuously updated materialized view

104
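The duality is visible directly in the API: the same topic can be read either way; a short sketch (the topic name is assumed):

// every record is an independent event: "append-only"
KStream<String, String> profileEvents = builder.stream("user-profiles");

// each record updates the latest value per key: a materialized view
KTable<String, String> profileTable = builder.table("user-profiles");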


Page 106: Introduction to Kafka Streams

106

alice: eggs, bob: lettuce, alice: milk    (KStream: user purchase history)
alice: lnkd, bob: googl, alice: msft      (KTable: user employment profile)

After the first record arrives (over time):
KStream reads "Alice bought eggs."
KTable reads "Alice is now at LinkedIn."

Page 107: Introduction to Kafka Streams

107

alice: eggs, bob: lettuce, alice: milk    (KStream: user purchase history)
alice: lnkd, bob: googl, alice: msft      (KTable: user employment profile)

After the second alice record arrives:
KStream reads "Alice bought eggs and milk."
KTable reads "Alice is now at Microsoft." (the earlier LinkedIn value is replaced)

Page 108: Introduction to Kafka Streams

108

Records over time: (alice, 2), (bob, 10), (alice, 3)

After (alice, 2):
KStream.aggregate() yields (key: Alice, value: 2)
KTable.aggregate()  yields (key: Alice, value: 2)

Page 109: Introduction to Kafka Streams

109

Records over time: (alice, 2), (bob, 10), (alice, 3)

After (alice, 3):
KStream.aggregate() yields (key: Alice, value: 2+3), since both records count
KTable.aggregate()  yields (key: Alice, value: 3), since the update replaces the old value 2

Page 110: Introduction to Kafka Streams

110

(diagram: KStream and KTable each support map(), filter(), join(), ...; reduce()/aggregate() turn a KStream into a KTable, and toStream() turns a KTable back into a KStream)

Page 111: Introduction to Kafka Streams

111

Updates Propagation in KTable

(diagram: an update flows from KStream stream1 / stream2 through the joined KStream into the aggregated KTable and its state store)


115

Stream Processing Hard Parts

• Ordering

• Partitioning & Scalability

• Fault tolerance

• State Management

• Time, Window & Out-of-order Data

• Re-processing

Page 116: Introduction to Kafka Streams

116

Remember?

Page 117: Introduction to Kafka Streams

Fault Tolerance

117

(diagram: each task's Process + State pair inside Kafka Streams has its local state continuously backed up to a changelog topic in Kafka; on failure, the task is reassigned through the group protocol and its state is restored from the changelog on another instance)




Page 125: Introduction to Kafka Streams

125

Stream Processing Hard Parts

• Ordering

• Partitioning & Scalability

• Fault tolerance

• State Management

• Time, Window & Out-of-order Data

• Re-processing

Simple is Beautiful

Page 126: Introduction to Kafka Streams

Ongoing Work (0.10+)

• Beyond Java APIs

• SQL support, Python client, etc.

• End-to-End Semantics (exactly-once)

• Queryable States

• ... and more

126

Page 127: Introduction to Kafka Streams

Queryable States

127

State

Real-time Analytics

select Count(*), Sum(*)
from "MyAgg"
where windowId > now() - 10;

Page 128: Introduction to Kafka Streams

128

But how do we get data in and out of Kafka?



Page 135: Introduction to Kafka Streams

Take-aways

• Stream Processing: a new programming paradigm

• Kafka Streams: stream processing made easy

135

THANKS!

Guozhang Wang | [email protected] | @guozhangwang

Visit Confluent at the Syncsort Booth (#1303), live demos @ 29th
Download Kafka Streams: www.confluent.io/product

Page 136: Introduction to Kafka Streams

136

We are Hiring!