Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka


Page 1: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH

Kafka Connect & Streams - the Ecosystem around Kafka

Guido Schmutz – 29.11.2017

@gschmutz guidoschmutz.wordpress.com

Page 2: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Guido Schmutz

Working at Trivadis for more than 20 years
Oracle ACE Director for Fusion Middleware and SOA
Consultant, Trainer, Software Architect for Java, Oracle, SOA and Big Data / Fast Data
Head of Trivadis Architecture Board
Technology Manager @ Trivadis

More than 30 years of software development experience

Contact: [email protected]
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz

Kafka Connect & Streams - the Ecosystem around Kafka

Page 3: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Our company.

Trivadis is a market leader in IT consulting, system integration, solution engineering and the provision of IT services focusing on and technologies in Switzerland, Germany, Austria and Denmark. We offer our services in the following strategic business fields:

OPERATION

Trivadis Services takes over the interacting operation of your IT systems.

Page 4: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

With over 600 specialists and IT experts in your region.

Locations: Basel, Bern, Brugg, Copenhagen, Düsseldorf, Frankfurt, Freiburg, Geneva, Hamburg, Lausanne, Munich, Stuttgart, Vienna, Zurich

14 Trivadis branches and more than 600 employees

200 Service Level Agreements

Over 4,000 training participants

Research and development budget: CHF 5.0 million

Financially self-supporting and sustainably profitable

Experience from more than 1,900 projects per year at over 800 customers

Page 5: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Agenda

1. What is Apache Kafka?
2. Kafka Connect
3. Kafka Streams
4. KSQL
5. Kafka and "Big Data" / "Fast Data" Ecosystem
6. Kafka in Software Architecture

Page 6: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo Example

Figure: data flow of the demo.
Truck-1, Truck-2, Truck-3 -> MQTT topic truck/nn/position -> mqtt-source -> truck_position -> detect_dangerous_driving -> dangerous_driving
TruckDriver -> jdbc-source -> trucking_driver
dangerous_driving + trucking_driver -> join_dangerous_driving_driver -> dangerous_driving_driver -> console consumer

Sample truck position message:
2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631

Sample driver row and the corresponding JSON message:
27,Walter,Ward,Y,24-JUL-85,2017-10-02 15:19:00
{"id":27,"firstName":"Walter","lastName":"Ward","available":"Y","birthdate":"24-JUL-85","last_update":1506923052012}

Page 7: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

What is Apache Kafka?


Page 8: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Apache Kafka History

Timeline, 2012 to 2018:

0.7: Cluster mirroring, data compression
0.8: Intra-cluster replication
0.9: Data Integration (Connect API)
0.10: Data Processing (Streams API)
0.11: Exactly Once Semantics, Performance Improvements
KSQL Developer Preview
1.0: JBOD Support, Support Java 9

Page 9: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Apache Kafka - Unix Analogy

$ cat < in.txt | grep "kafka" | tr a-z A-Z > out.txt

cat < in.txt: Kafka Connect API
grep, tr: Kafka Streams API, KSQL
> out.txt: Kafka Connect API
pipes (|): Kafka Core (Cluster)

Adapted from: Confluent

Page 10: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka High Level Architecture

The who is who
• Producers write data to brokers.
• Consumers read data from brokers.
• All this is distributed.

The data
• Data is stored in topics.
• Topics are split into partitions, which are replicated.

Figure: Kafka Cluster with Broker 1, Broker 2 and Broker 3 alongside a Zookeeper Ensemble; three Producers write to the cluster and three Consumers read from it.

Page 11: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka – Distributed Log at the Core

At the heart of Apache Kafka sits a distributed log

collection of messages, appended sequentially to a file

service ‘seeks’ to the position of the last message it read, then scans sequentially, reading messages in order

log-structured character makes Kafka well suited to performing the role of an Event Store in Event Sourcing

Figure: Event Hub log with offsets 01 to 22. Reads are a single seek & scan. Writes are append only.
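The seek-and-scan idea can be illustrated with a toy in-memory log. This is only a sketch under simplifying assumptions: the class name AppendOnlyLog and its methods are made up for illustration, and nothing here reflects how Kafka actually stores or serves data.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory log: an illustration of the idea, not Kafka's storage layer. */
final class AppendOnlyLog {
    private final List<String> messages = new ArrayList<>();

    /** Writes are append only; the returned offset is the message's position. */
    long append(String message) {
        messages.add(message);
        return messages.size() - 1L;
    }

    /** A read is one seek to the given offset followed by a sequential scan. */
    List<String> readFrom(long offset) {
        return new ArrayList<>(messages.subList((int) offset, messages.size()));
    }

    public static void main(String[] args) {
        AppendOnlyLog log = new AppendOnlyLog();
        for (int i = 0; i < 5; i++) {
            log.append("event-" + i);
        }
        // A service that has already processed offsets 0..2 resumes at offset 3.
        System.out.println(log.readFrom(3));
        // Rewinding to offset 0 replays the complete history.
        System.out.println(log.readFrom(0));
    }
}
```

Because the log is never modified in place, a reader only has to remember one number, its offset, to resume or to replay.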


Page 12: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Scale-Out Architecture

a topic consists of many partitions

producer load is balanced over all partitions

a consumer can consume with as many threads as there are partitions

Figure: Kafka Cluster with Broker 1, Broker 2 and Broker 3. Producers 1 to 3 write to it; Consumers 1 to 4 read from it, organized in Consumer Group 1 and Consumer Group 2.

Page 13: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Strong Ordering Guarantees

most business systems need strong ordering guarantees

messages that require relative ordering need to be sent to the same partition

supply same key for all messages that require a relative order

To maintain global ordering use a single partition topic
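Why the same key preserves relative order can be sketched as follows: the partition is a pure function of the key, so equal keys always land on the same partition. This is a simplified illustration. Kafka's default partitioner hashes the serialized key with murmur2; the sketch uses Arrays.hashCode as a stand-in, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative key-to-partition mapping; the hash is a stand-in, not Kafka's murmur2. */
final class KeyPartitioner {

    /** The same key and partition count always yield the same partition. */
    static int partitionFor(String key, int numPartitions) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        // Clear the sign bit so the modulo result is never negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        for (int i = 1; i <= 6; i++) {
            String key = "Key-" + i;
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
        // With a single partition every key maps to partition 0: global ordering.
        System.out.println("Key-1 -> partition " + partitionFor("Key-1", 1));
    }
}
```

Note that the mapping depends on the number of partitions, so changing the partition count of a topic changes where a key lands.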

Figure: Producer 1 sends messages with keys Key-1 to Key-6 to Broker 1, Broker 2 and Broker 3; Consumers 1 to 3 read them. Key-1 and Key-3 each appear twice.

Page 14: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Durable and Highly Available Messaging

Figure: two views of the same cluster, side by side. In each view, Producer 1 writes to Broker 1, Broker 2 and Broker 3, which hold replicas of partitions P0 and P1, and Consumer 1 and Consumer 2 read from them.

Page 15: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Durable and Highly Available Messaging (II)

Figure: two further views of the same cluster, side by side. In each view, Producer 1 writes to Broker 1, Broker 2 and Broker 3, which hold replicas of partitions P0 and P1, and Consumer 1 and Consumer 2 read from them.

Page 16: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Replay-ability – Logs never forget

by keeping events in a log, we have a version control system for our data

if you were to deploy a faulty program, the system might become corrupted, but it would always be recoverable

sequence of events provides an audit point, so that you can examine exactly what happened

rewind and replay events, once service is back and bug is fixed

Figure: Event Hub log with offsets 01 to 22; a Service (Logic, State) rewinds to an earlier offset and replays the events.

Page 17: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Hold Data for Long-Term – Data Retention

Figure: Producer 1 writing to Broker 1, Broker 2 and Broker 3.

1. Never

2. Time based (TTL): log.retention.{ms | minutes | hours}

3. Size based: log.retention.bytes

4. Log compaction based (older entries with the same key are removed, the latest entry per key is kept):

kafka-topics.sh --zookeeper zk:2181 \
    --create --topic customers \
    --replication-factor 1 \
    --partitions 1 \
    --config cleanup.policy=compact

Page 18: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Keep Topics in Compacted Form

Before compaction:

Offset  0   1   2   3   4   5   6   7   8   9   10  11
Key     K1  K2  K1  K1  K3  K2  K4  K5  K5  K2  K6  K2
Value   V1  V2  V3  V4  V5  V6  V7  V8  V9  V10 V11

After compaction:

Offset  3   4   6   8   9   10
Key     K1  K3  K4  K5  K2  K6
Value   V4  V5  V7  V9  V10 V11
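The effect of compaction can be reproduced with a small sketch that keeps only the latest entry per key. It is an illustration, not the broker's log cleaner: the class LogCompaction and its nested LogEntry type are hypothetical, and the input uses the eleven complete offset/key/value triples of the slide (offsets 0 to 10).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative compaction: keep only the most recent entry for every key. */
final class LogCompaction {

    static final class LogEntry {
        final long offset;
        final String key;
        final String value;

        LogEntry(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return offset + ":" + key + "=" + value;
        }
    }

    /** Expects entries in offset order; returns the survivors in offset order. */
    static List<LogEntry> compact(List<LogEntry> log) {
        Map<String, LogEntry> latest = new LinkedHashMap<>();
        for (LogEntry entry : log) {
            // Remove first so the re-inserted key moves to the end of the map.
            latest.remove(entry.key);
            latest.put(entry.key, entry);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        String[] keys = {"K1", "K2", "K1", "K1", "K3", "K2", "K4", "K5", "K5", "K2", "K6"};
        List<LogEntry> log = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            log.add(new LogEntry(i, keys[i], "V" + (i + 1)));
        }
        // prints [3:K1=V4, 4:K3=V5, 6:K4=V7, 8:K5=V9, 9:K2=V10, 10:K6=V11]
        System.out.println(compact(log));
    }
}
```

The surviving entries keep their original offsets, which is why the compacted log has gaps in its offset sequence.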

Page 19: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

How to get a Kafka environment

On Premises
• Bare Metal Installation
• Docker
• Mesos / Kubernetes
• Hadoop Distributions

Cloud
• Oracle Event Hub Cloud Service
• Azure HDInsight Kafka
• Confluent Cloud
• …

Page 20: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (I)

Figure: Truck-1, Truck-2 and Truck-3 -> Kafka topic truck position -> console consumer. Test data generator by Hortonworks.

Sample message:
2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631

Page 21: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (I) – Create Kafka Topic

$ kafka-topics --zookeeper zookeeper:2181 --create \
    --topic truck_position --partitions 8 --replication-factor 1

$ kafka-topics --zookeeper zookeeper:2181 --list
__consumer_offsets
_confluent-metrics
_schemas
docker-connect-configs
docker-connect-offsets
docker-connect-status
truck_position

Page 22: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (I) – Run Producer and Kafka-Console-Consumer


Page 23: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (I) – Java Producer to "truck_position"

Constructing a Kafka Producer

// broker address and serializers for key and value
private Properties kafkaProps = new Properties();
kafkaProps.put("bootstrap.servers", "broker-1:9092");
kafkaProps.put("key.serializer", "...StringSerializer");
kafkaProps.put("value.serializer", "...StringSerializer");

producer = new KafkaProducer<String, String>(kafkaProps);

// the driver id is the key, so all events of one driver go to the same partition
ProducerRecord<String, String> record =
    new ProducerRecord<>("truck_position", driverId, eventData);

try {
    // get() blocks until the send has completed
    metadata = producer.send(record).get();
} catch (Exception e) {
    e.printStackTrace();
}

Page 24: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (II) – devices send to MQTT instead of Kafka

Figure: Truck-1, Truck-2 and Truck-3 publish to the MQTT topic truck/nn/position.

Sample message:
2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631

Page 25: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (II) – devices send to MQTT instead of Kafka


Page 26: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (II) - devices send to MQTT instead of Kafka – how to get the data into Kafka?

Figure: Truck-1, Truck-2 and Truck-3 publish to the MQTT topic truck/nn/position. How the data gets from there into the Kafka topic truck position raw is still open (?).

Sample message:
2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631

Page 27: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka Connect


Page 28: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka Connect - Overview

Figure: a Source Connector feeds data into Kafka and a Sink Connector delivers data from Kafka.

Page 29: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka Connect – Single Message Transforms (SMT)

Simple Transformations for a single message

Defined as part of Kafka Connect
• Some useful transforms provided out-of-the-box
• Easily implement your own

Optionally deploy 1+ transforms with each connector
• Modify messages produced by source connector
• Modify messages sent to sink connectors

Makes it much easier to mix and match connectors

Some of the currently available transforms:
• InsertField
• ReplaceField
• MaskField
• ValueToKey
• ExtractField
• TimestampRouter
• RegexRouter
• SetSchemaMetadata
• Flatten
• TimestampConverter

Page 30: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka Connect – Many Connectors

60+ since first release (0.9+)

20+ from Confluent and Partners

Connector categories: Confluent supported Connectors, Certified Connectors, Community Connectors

Source: http://www.confluent.io/product/connectors

Page 31: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (III)

Figure: Truck-1, Truck-2 and Truck-3 -> MQTT topic truck/nn/position -> mqtt to kafka -> truck_position -> console consumer

Sample message:
2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631

Page 32: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (III) – Create MQTT Connect through REST API

#!/bin/bash
curl -X "POST" "http://192.168.69.138:8083/connectors" \
     -H "Content-Type: application/json" \
     -d $'{
  "name": "mqtt-source",
  "config": {
    "connector.class": "com.datamountaineer.streamreactor.connect.mqtt.source.MqttSourceConnector",
    "connect.mqtt.connection.timeout": "1000",
    "tasks.max": "1",
    "connect.mqtt.kcql": "INSERT INTO truck_position SELECT * FROM truck/+/position",
    "name": "MqttSourceConnector",
    "connect.mqtt.service.quality": "0",
    "connect.mqtt.client.id": "tm-mqtt-connect-01",
    "connect.mqtt.converter.throw.on.error": "true",
    "connect.mqtt.hosts": "tcp://mosquitto:1883"
  }
}'

Page 33: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (III) – Call REST API and Kafka Console Consumer


Page 34: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (III)

Figure: Truck-1, Truck-2 and Truck-3 -> MQTT topic truck/nn/position -> mqtt to kafka -> truck_position -> console consumer

Sample message:
2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631

What about some analytics?

Page 35: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka Streams


Page 36: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka Streams - Overview

• Designed as a simple and lightweight library in Apache Kafka
• No external dependencies on systems other than Apache Kafka
• Part of open source Apache Kafka, introduced in 0.10+
• Leverages Kafka as its internal messaging layer
• Supports fault-tolerant local state
• Event-at-a-time processing (not microbatch) with millisecond latency
• Windowing with out-of-order data using a Google DataFlow-like model

Page 37: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka Stream DSL and Processor Topology

KStream<Integer, String> stream1 = builder.stream("in-1");
KStream<Integer, String> stream2 = builder.stream("in-2");
KStream<Integer, String> joined = stream1.leftJoin(stream2, …);
KTable<> aggregated = joined.groupBy(…).count("store");
aggregated.to("out-1");

Figure: the resulting processor topology with source nodes 1 and 2, a left join (lj), an aggregation (a) backed by State, and a sink (t).


Page 39: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka Streams Cluster

Figure: a Processor Topology (source nodes 1 and 2, left join lj, aggregation a with State, sink t) connected to the topics input-1, input-2, store (changelog) and output in the Kafka Cluster.

Page 40: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Figure: Kafka Cluster with topics input-1 and input-2, each with Partition 0 to Partition 3. The Processor Topology runs in two instances, Kafka Streams 1 and Kafka Streams 2.

Page 41: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Figure: Kafka Cluster with topics input-1 and input-2, each with Partition 0 to Partition 3. The Processor Topology runs in four instances, Kafka Streams 1 to Kafka Streams 4.

Page 42: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Stream vs. Table

Event Stream (KStream)

2017-10-02T20:18:46  11,Normal,41.87,-87.67
2017-10-02T20:18:55  11,Normal,40.38,-89.17
2017-10-02T20:18:59  21,Normal,42.23,-91.78
2017-10-02T20:19:01  21,Normal,41.71,-91.32
2017-10-02T20:19:02  11,Normal,38.65,-90.2
2017-10-02T20:19:23  21,Normal,41.71,-91.32

State Stream / Change Log Stream (KTable)

11  2017-10-02T20:18:46,11,Normal,41.87,-87.67
11  2017-10-02T20:18:55,11,Normal,40.38,-89.17
21  2017-10-02T20:18:59,21,Normal,42.23,-91.78
21  2017-10-02T20:19:01,21,Normal,41.71,-91.32
11  2017-10-02T20:19:02,11,Normal,38.65,-90.2
21  2017-10-02T20:19:23,21,Normal,41.71,-91.32
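The difference between the two views can be sketched in plain Java: a stream keeps every event, while a table applies each keyed record as an upsert and holds only the latest value per key. The class StreamVsTable and its toTable method are hypothetical names for illustration; this is not the Kafka Streams API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: folds a keyed change log into a table of latest values. */
final class StreamVsTable {

    /** Each record overwrites the previous value for its key (upsert). */
    static Map<String, String> toTable(List<Map.Entry<String, String>> changeLog) {
        Map<String, String> table = new TreeMap<>();
        for (Map.Entry<String, String> record : changeLog) {
            table.put(record.getKey(), record.getValue());
        }
        return table;
    }

    public static void main(String[] args) {
        // The six keyed records from the slide, in arrival order.
        List<Map.Entry<String, String>> changeLog = new ArrayList<>();
        changeLog.add(new SimpleEntry<>("11", "2017-10-02T20:18:46,11,Normal,41.87,-87.67"));
        changeLog.add(new SimpleEntry<>("11", "2017-10-02T20:18:55,11,Normal,40.38,-89.17"));
        changeLog.add(new SimpleEntry<>("21", "2017-10-02T20:18:59,21,Normal,42.23,-91.78"));
        changeLog.add(new SimpleEntry<>("21", "2017-10-02T20:19:01,21,Normal,41.71,-91.32"));
        changeLog.add(new SimpleEntry<>("11", "2017-10-02T20:19:02,11,Normal,38.65,-90.2"));
        changeLog.add(new SimpleEntry<>("21", "2017-10-02T20:19:23,21,Normal,41.71,-91.32"));

        // Stream view: all six events. Table view: one row per key.
        System.out.println("stream size: " + changeLog.size());
        toTable(changeLog).forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

Read as a stream, the topic has six events; read as a table, it has two rows, one for key 11 and one for key 21, each holding the most recent position.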

Page 43: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka Streams: Key Features

• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
• Interactive queries
• Time model
• Windowing
• Supports late-arriving and out-of-order data
• Millisecond processing latency, no micro-batching
• At-least-once and exactly-once processing guarantees

Page 44: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (IV)

Figure: Truck-1, Truck-2 and Truck-3 -> MQTT topic truck/nn/position -> mqtt to kafka -> truck_position_s -> detect_dangerous_driving -> dangerous_driving -> console consumer

Sample message:
2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631

Page 45: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (IV) - Create Stream

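final Serde<String> stringSerde = Serdes.String();
final KStreamBuilder builder = new KStreamBuilder();

// read the raw position messages
KStream<String, String> source =
    builder.stream(stringSerde, stringSerde, "truck_position");

// parse each message into a TruckPosition
KStream<String, TruckPosition> positions =
    source.map((key, value) -> new KeyValue<>(key, TruckPosition.create(value)));

// keep only the positions accepted by filterNonNORMAL
KStream<String, TruckPosition> filtered =
    positions.filter(TruckPosition::filterNonNORMAL);

// write the original record of each remaining event to the output topic
filtered.map((key, value) -> new KeyValue<>(key, value._originalRecord))
    .to("dangerous_driving");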
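The slide does not show the TruckPosition class. A minimal stand-alone sketch of what the parse and filter steps might look like follows, without any Kafka dependency. It assumes the pipe-delimited layout of the sample message, with the event type as the sixth field (the field order used by the KSQL stream definition later in the deck). The class layout and method bodies are guesses for illustration, not the author's code, and the second input record is a made-up variation of the sample message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative parse-and-filter step; a guess at the logic, not the demo's actual code. */
final class TruckPositionFilter {

    static final class TruckPosition {
        final String eventType;
        final String originalRecord;

        private TruckPosition(String eventType, String originalRecord) {
            this.eventType = eventType;
            this.originalRecord = originalRecord;
        }

        /** Parses ts|truckid|driverid|routeid|routename|eventtype|lat|long|correlationid. */
        static TruckPosition create(String record) {
            String[] fields = record.split("\\|");
            if (fields.length != 9) {
                throw new IllegalArgumentException("expected 9 fields: " + record);
            }
            return new TruckPosition(fields[5], record);
        }

        /** True for every event whose type is not "Normal". */
        static boolean filterNonNormal(TruckPosition position) {
            return !"Normal".equals(position.eventType);
        }
    }

    /** Returns the original records of all non-normal events, in input order. */
    static List<String> dangerous(List<String> records) {
        List<String> result = new ArrayList<>();
        for (String record : records) {
            TruckPosition position = TruckPosition.create(record);
            if (TruckPosition.filterNonNormal(position)) {
                result.add(position.originalRecord);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String normal = "2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2"
                + "|Normal|38.65|90.21|5187297736652502631";
        // Made-up record: same as above but with a non-normal event type.
        String overspeed = normal.replace("|Normal|", "|Overspeed|");
        System.out.println(dangerous(Arrays.asList(normal, overspeed)));
    }
}
```

In the Kafka Streams version the same two steps are expressed as map and filter on a KStream, and the result is written to the dangerous_driving topic instead of being collected in a list.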


Page 46: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

KSQL


Page 47: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

KSQL: a Streaming SQL Engine for Apache Kafka

• Enables stream processing with zero coding required
• The simplest way to process streams of data in real-time
• Powered by Kafka and Kafka Streams: scalable, distributed, mature
• All you need is Kafka – no complex deployments
• Available as Developer preview!

• STREAM and TABLE as first-class citizens
• STREAM = data in motion
• TABLE = collected state of a stream
• Join STREAM and TABLE

Page 48: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

KSQL Deployment Models

Figure: the two KSQL deployment models, Standalone Mode and Cluster Mode.

Source: Confluent

Page 49: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (V)

Figure: data flow of the demo.
Truck-1, Truck-2, Truck-3 -> MQTT topic truck/nn/position -> mqtt-source -> truck_position -> detect_dangerous_driving -> dangerous_driving
TruckDriver -> jdbc-source -> trucking_driver
dangerous_driving + trucking_driver -> join_dangerous_driving_driver -> dangerous_driving_driver -> console consumer

Sample truck position message:
2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631

Sample driver row and the corresponding JSON message:
27,Walter,Ward,Y,24-JUL-85,2017-10-02 15:19:00
{"id":27,"firstName":"Walter","lastName":"Ward","available":"Y","birthdate":"24-JUL-85","last_update":1506923052012}

Page 50: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (V) - Start Kafka KSQL

$ docker-compose exec ksql-cli ksql-cli local --bootstrap-server broker-1:9092

=======================================
=      _  __ _____  ____  _           =
=     | |/ // ____|/ __ \| |          =
=     | ' /| (___ | |  | | |          =
=     |  <  \___ \| |  | | |          =
=     | . \ ____) | |__| | |____      =
=     |_|\_\_____/ \___\_\______|     =
=                                     =
=   Streaming SQL Engine for Kafka    =
=======================================

Copyright 2017 Confluent Inc.

CLI v0.1, Server v0.1 located at http://localhost:9098

Having trouble? Type 'help' (case-insensitive) for a rundown of how things work!

ksql>

Page 51: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (V) - Create Stream

ksql> CREATE STREAM dangerous_driving_s \
        (ts VARCHAR, \
         truckid VARCHAR, \
         driverid BIGINT, \
         routeid BIGINT, \
         routename VARCHAR, \
         eventtype VARCHAR, \
         latitude DOUBLE, \
         longitude DOUBLE, \
         correlationid VARCHAR) \
        WITH (kafka_topic='dangerous_driving', \
              value_format='DELIMITED');

 Message
----------------
 Stream created

Page 52: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (V) - Create Stream

ksql> describe dangerous_driving_s;

 Field         | Type
---------------------------------
 ROWTIME       | BIGINT
 ROWKEY        | VARCHAR(STRING)
 TS            | VARCHAR(STRING)
 TRUCKID       | VARCHAR(STRING)
 DRIVERID      | BIGINT
 ROUTEID       | BIGINT
 ROUTENAME     | VARCHAR(STRING)
 EVENTTYPE     | VARCHAR(STRING)
 LATITUDE      | DOUBLE
 LONGITUDE     | DOUBLE
 CORRELATIONID | VARCHAR(STRING)

Page 53: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (V) - Create Stream

ksql> SELECT * FROM dangerous_driving_s;

1511166635385 | 11 | 2017-11-20T09:30:35 | 83 | 11 | 371182829 | Memphis to Little Rock | Unsafe following distance | 41.11 | -88.42 | 7015935660104262142
1511166652725 | 11 | 2017-11-20T09:30:52 | 83 | 11 | 371182829 | Memphis to Little Rock | Lane Departure | 38.65 | -90.2 | 7015935660104262142
1511166667645 | 10 | 2017-11-20T09:31:07 | 77 | 10 | 160779139 | Des Moines to Chicago Route 2 | Overspeed | 37.09 | -94.23 | 7015935660104262142
1511166670385 | 11 | 2017-11-20T09:31:10 | 83 | 11 | 371182829 | Memphis to Little Rock | Lane Departure | 41.48 | -88.07 | 7015935660104262142
1511166674175 | 25 | 2017-11-20T09:31:14 | 64 | 25 | 1090292248 | Peoria to Ceder Rapids Route 2 | Unsafe following distance | 36.84 | -89.54 | 7015935660104262142
1511166686315 | 15 | 2017-11-20T09:31:26 | 90 | 15 | 1927624662 | Springfield to KC Via Columbia | Lane Departure | 35.19 | -90.04 | 7015935660104262142
1511166686925 | 11 | 2017-11-20T09:31:26 | 83 | 11 | 371182829 | Memphis to Little Rock | Unsafe following distance | 40.38 | -89.17 | 7015935660104262142

Page 54: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (V) – Create JDBC Connect through REST API

#!/bin/bash
curl -X "POST" "http://192.168.69.138:8083/connectors" \
     -H "Content-Type: application/json" \
     -d $'{
  "name": "jdbc-driver-source",
  "config": {
    "connector.class": "JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://db/sample?user=sample&password=sample",
    "mode": "timestamp",
    "timestamp.column.name": "last_update",
    "table.whitelist": "driver",
    "validate.non.null": "false",
    "topic.prefix": "trucking_",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "false",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "name": "jdbc-driver-source",
    "transforms": "createKey,extractInt",
    "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.createKey.fields": "id",
    "transforms.extractInt.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
    "transforms.extractInt.field": "id"
  }
}'

Page 55: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (V) – Create JDBC Connect through REST API


Page 56: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (V) - Create Table with Driver State

ksql> CREATE TABLE driver_t \
        (id BIGINT, \
         first_name VARCHAR, \
         last_name VARCHAR, \
         available VARCHAR) \
        WITH (kafka_topic='trucking_driver', \
              value_format='JSON');

 Message
----------------
 Table created

Page 57: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Demo (V) - Create Table with Driver State

ksql> CREATE STREAM dangerous_driving_and_driver_s \
        WITH (kafka_topic='dangerous_driving_and_driver_s', \
              value_format='JSON') \
        AS SELECT driverid, first_name, last_name, truckid, routeid, routename, eventtype \
        FROM dangerous_driving_s \
        LEFT JOIN driver_t \
        ON dangerous_driving_s.driverid = driver_t.id;

 Message
----------------------------
 Stream created and running

ksql> select * from dangerous_driving_and_driver_s;

1511173352906 | 21 | 21 | Lila | Page | 58 | 1594289134 | Memphis to Little Rock Route 2 | Unsafe tail distance
1511173353669 | 12 | 12 | Laurence | Lindsey | 93 | 1384345811 | Joplin to Kansas City | Lane Departure
1511173435385 | 11 | 11 | Micky | Isaacson | 22 | 1198242881 | Saint Louis to Chicago Route 2 | Unsafe tail distance

Page 58: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka and "Big Data" / "Fast Data" Ecosystem

Page 59: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka and the Big Data / Fast Data ecosystem

Kafka integrates with many popular products / frameworks

• Apache Spark Streaming

• Apache Flink

• Apache Storm

• Apache Apex

• Apache NiFi

• StreamSets

• Oracle Stream Analytics

• Oracle Service Bus

• Oracle GoldenGate

• Oracle Event Hub Cloud Service

• Debezium CDC

• …

Additional info: https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem

Page 60: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Kafka in Software Architecture


Page 61: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Traditional Big Data Architecture

Figure: architecture diagram.
Sources: Billing & Ordering, CRM / Profile, Marketing Campaigns
Ingestion: File Import / SQL Import
Big Data Cluster (Hadoop Cluster): Distributed Filesystem, Parallel Batch Processing (Machine Learning, Graph Algorithms, Natural Language Processing), SQL, Search, NoSQL
Consumers: BI Tools, Enterprise Data Warehouse, Search/Explore, Online & Mobile Apps

Page 62: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Event Hub – handle event stream data

Figure: architecture diagram.
Sources: Location, Social, Clickstream, Sensor Data, Weather Data, Mobile Apps, Call Center, Billing & Ordering, CRM / Profile, Marketing Campaigns
Ingestion: Event Hub, Data Flow
Big Data Cluster (Hadoop Cluster): Distributed Filesystem, Parallel Batch Processing (Machine Learning, Graph Algorithms, Natural Language Processing), SQL, Search, NoSQL
Consumers: BI Tools, Enterprise Data Warehouse, Search/Explore, Online & Mobile Apps

Page 63: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Event Hub – taking Velocity into account

Figure: architecture diagram.
Sources: Location, Social, Clickstream, Sensor Data, Weather Data, Mobile Apps, Call Center, Billing & Ordering, CRM / Profile, Marketing Campaigns
Ingestion: Event Hub, File Import / SQL Import
Batch Analytics, in the Big Data Cluster (Hadoop Cluster): Distributed Filesystem, Parallel Batch Processing, SQL, Search
Streaming Analytics: Stream Analytics, NoSQL, Reference / Models, Results
Consumers: Dashboard, BI Tools, Enterprise Data Warehouse, Search/Explore, Online & Mobile Apps

Page 64: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Event Hub – Asynchronous Microservice Architecture

Figure: architecture diagram.
Sources: Location, Social, Clickstream, Sensor Data, Weather Data, Mobile Apps, Call Center, Billing & Ordering, CRM / Profile, Marketing Campaigns
Ingestion: Event Hub, File Import / SQL Import
Big Data Cluster (Hadoop Cluster): Distributed Filesystem, Parallel Batch Processing, SQL, Search
Container: Microservice with API ({}), NoSQL, RDBMS
Consumers: BI Tools, Enterprise Data Warehouse, Search/Explore, Online & Mobile Apps

Page 65: Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka

Technology on its own won't help you. You need to know how to use it properly.