Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building...

31
Deploy and use Apache Kafka on Kubernetes Jakub Scholz Bern, 20.6.2019

Transcript of Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building...

Page 1: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

Deploy and use Apache Kafka on Kubernetes

Jakub Scholz

Bern, 20.6.2019

Page 2: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

Jakub Scholz● Principal Software Engineer with the Red Hat

Messaging and IoT engineering team● Long-term messaging specialist● Former messaging team lead at Deutsche Börse

@scholzj

https://github.com/scholzj

https://www.linkedin.com/in/scholzj/

2

About me

Page 3: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

Web site: http://strimzi.io/

GitHub: https://github.com/strimzi

Twitter: @strimziio

3

Strimzi

● Open source project licensed under Apache License 2.0● Focuses on running Apache Kafka on Kubernetes and OpenShift:

○ Container images for Apache Kafka and Apache Zookeeper○ Operators for managing and configuring Kafka clusters, topics or users

● Provides Kubernetes-native experience for running Kafka on Kubernetes and OpenShift○ Doesn’t manage only Kafka brokers, but also users or topics

Page 4: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

4

Strimzi

● Just released: 0.12.0! What’s new?○ Support for Kubernetes 1.9 and 1.10 / OpenShift 3.9 and 3.10 dropped○ Reduced number of container images○ Custom Resources improvements

○ Upgraded from v1alpha1 to v1beta1○ Status subresource for the Kafka resource○ Improved handling of invalid fields in the Custom Resources

○ Improvements to persistent storage handling○ Adding / removing JBOD volumes○ Volume resizing

○ HTTP Bridge

Page 5: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

5

Demo:Kafka cluster status

Page 6: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

6

What do you use Kafka for?

Page 7: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

7

Using Kafka

● Many different use-cases can be addressed by Kafka○ Microservice integration / messaging○ Event sourcing○ Metrics and log collection○ Stream processing○ Website activity tracking○ Storage / Key-value database / Commit log

Page 8: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

8

Events

Value

Key

Olomouc

Jakub

Prague

Jakub

New York

Joe

Frankfurt

Jakub

Prague

Jakub

Bern

Joe

Page 9: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

9

Stream-Table duality

Value

Key

Olomouc

Jakub

Prague

Jakub

New York

Joe

Frankfurt

Jakub

Prague

Jakub

Bern

Joe

Key Value

New YorkJoe

OlomoucJakub PragueJakub FrankfurtJakub PragueJakub

BernJoe

Page 10: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

10

Compacted topics

Value

Key

London

Jack

Olomouc

Jakub

New York

Joe

Prague

Jakub

Boston

Joe

Bern

Joe

Prague

Jakub

Bern

Joe

Page 11: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

11

Demo:Address book

Page 12: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

12

But what if ...

… my app already uses a database?

Page 13: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

Using database with Kafka

M M M M M M MService

● Multiple connections to different services● What do you do when one of them is not available?● How do you make sure no data are lost?

Page 14: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

14

Connecting Kafka

Kafka Connect:Source

Connector

Kafka Connect:Sink

Connector

External System

External System

Page 15: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

15

Connecting Kafka

● Framework for integrating Kafka with other systems○ Source connectors for getting data into Kafka○ Sink connectors to ship data from Kafka somewhere else

● Many 3rd party connectors available○ Messaging systems (JMS, Amazon SNS/SQS, WebSphere MQ, ...)○ Databases / Change Data Capture (JDBC, Debezium, ...)○ Data Storage (Amazon S3, HDFS, Casandra, ...)○ and more

● You can use Connect API to write your own connectors○ Implement few simple methods to create your own connectors

Page 16: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

16

CDC

● Change Data Capture (CDC)● Captures the data directly from database

○ Doesn’t require your application to send / receive data to Kafka○ Instead a CDC application captures them from database

● Debezium is one of the implementations of this pattern○ Debezium connects to selected database, reads its transaction log and publishes it

as Kafka messages○ Supported databases are MySQL, PostgreSQL, MongoDB and SQL Server

○ Additional plugins might be needed to access the DB and its logs○ The Kafka messages can be send for example in JSON format○ https://debezium.io/

Page 17: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

Debezium

M M M M M M M

Service● With CDC / Debezium

○ Decouples the messaging layer form your application○ Make it easier to handle disruptions

Page 18: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

18

Debezium

Kafka Connect with Debezium

Kafka BrokersDatabase

Page 19: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

19

Debezium

● What makes CDC / Debezium so great?○ Makes it easy to integrate DB based applications into Kafka

○ No need to change your application○ No need to write data to DB and send to Kafka○ Easier to handle errors and (un)availability of DB or Kafka

○ Use cases (for example)○ Microservice integration○ Data replication

Page 20: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

20

Demo:Address book with DB

Page 21: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

21

HTTP Bridge

● HTTP Bridge is new component in 0.12.0● Still a bit work in progress● You can download it and use it on your own or deploy it using Strimzi operators● Can be used to access Kafka using HTTP REST API

○ Tries to maintain compatibility with the Confluent REST Proxy○ But implements only selected features○ Consumers are stateful, producers are stateless○ Supports JSON and Binary payloads (=> Anything encoded in Base64)

● AMQP Bridge (part of the same upstream code) is not exposed by operators!

Page 22: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

22

HTTP Bridge

● Use cases○ Applications which do not speak Kafka (e.g. because of language etc.)○ Web applications○ Access from application outside of of Kubernetes cluster / public internet○ IoT use cases (collecting telemetry etc.)

● Security and exposing○ Easier than with Kafka because it is just regular HTTP REST API○ We have chosen different approach then with Kafka○ Users can configure it on their own using things like Ingress, API Gateways, OAuth

Proxies etc.

Page 23: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

23

HTTP Bridge

1. Create a consumer instance2. Subscribe to topicsPoll for messagesCommit messagesClose consumer

Page 24: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

24

Demo:HTTP Bridge

Page 25: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

25

Now we have two topics … one with the emails and one with cities!?

Let’s use some stream processing to fix it!

Page 26: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

26

Kafka Streams

● Client library for building streaming applications○ Just a library, not a framework○ Perfect integration with Apache Kafka

● Provides a DSL for○ Data transformation○ Data aggregation○ Joining of streams○ Windowing

Page 27: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

27

Kafka Streams

Kafka Streams applicationAggregate

addresses with emails

Join

addressesemails

Transform Aggregate

Page 28: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

28

Demo:Kafka Streams

Page 29: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

29

Kafka Streams

● Running Kafka Streams on Kubernetes○ Stateless operations (such as transformations) are easy to run as a Deployment○ Stateful applications rely on local cache for performance

○ Can run as Deployments○ Often run as StatefulSets with persistent volumes○ When the pod crashes or restarts, it can quickly recover big part of the data

from the local cache => no need to re-read them from Kafka

Page 30: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

30

Slides and demo source codes:http://jsch.cz/bernmeetup2019

Page 31: Kubernetes Deploy and use Apache Kafka on - Puzzle...Kafka Streams Client library for building streaming applications Just a library, not a framework Perfect integration with Apache

THANK YOUplus.google.com/+RedHat

linkedin.com/company/red-hat

youtube.com/user/RedHatVideos

facebook.com/redhatinc

twitter.com/RedHatNews