Kafka Connect & Kafka Streams/KSQL - powerful ecosystem around Kafka core

  • date post

    22-Jan-2018
  • Category

    Technology

Transcript of Kafka Connect & Kafka Streams/KSQL - powerful ecosystem around Kafka core

  1. BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH. Kafka Connect & Kafka Streams/KSQL: Powerful Ecosystem Around Kafka Core. Guido Schmutz, 5.12.2017. @gschmutz, guidoschmutz.wordpress.com
  2. Guido Schmutz. Working at Trivadis for more than 20 years. Oracle ACE Director for Fusion Middleware and SOA. Consultant, Trainer, Software Architect for Java, Oracle, SOA and Big Data / Fast Data. Head of Trivadis Architecture Board. Technology Manager @ Trivadis. More than 30 years of software development experience. Contact: guido.schmutz@trivadis.com Blog: http://guidoschmutz.wordpress.com Slideshare: http://www.slideshare.net/gschmutz Twitter: gschmutz
  3. Agenda: (1) What is Apache Kafka? (2) Kafka Connect (3) Kafka Streams (4) KSQL (5) Kafka and "Big Data" / "Fast Data" Ecosystem (6) Kafka in Software Architecture
  4. Demo Example. Diagram: Truck-1, Truck-2, Truck-3 -> MQTT topic truck/nn/position -> mqtt-source -> truck_position -> detect_dangerous_driving -> dangerous_driving; Truck Driver -> jdbc-source -> trucking_driver; join_dangerous_driving_driver -> dangerous_driving_driver -> console consumer.
     Sample truck message: 2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631
     Sample driver row: 27,Walter,Ward,Y,24-JUL-85,2017-10-02 15:19:00
     Sample driver JSON: {"id":27,"firstName":"Walter","lastName":"Ward","available":"Y","birthdate":"24-JUL-85","last_update":1506923052012}
  5. What is Apache Kafka?
  6. Apache Kafka History. Timeline 2012 to 2018: 0.7 (cluster mirroring, data compression); 0.8 (intra-cluster replication); 0.9 (Data Integration, Connect API); 0.10 (Data Processing, Streams API); 0.11 (Exactly Once Semantics, performance improvements); KSQL Developer Preview; 1.0 (JBOD support, Java 9 support)
  7. Apache Kafka - Unix Analogy: $ cat < in.txt | grep "kafka" | tr a-z A-Z > out.txt. Diagram labels: Kafka Connect API (reading in.txt), Kafka Streams API / KSQL (grep and tr), Kafka Connect API (writing out.txt), Kafka Core (Cluster) (the pipes). Adapted from: Confluent
  8. Apache Kafka: A Streaming Platform. High-Level Architecture; Distributed Log at the Core; Scale-Out Architecture; Logs do not (necessarily) forget
  9. How to get a Kafka environment. On Premises: Bare Metal Installation, Docker, Mesos / Kubernetes, Hadoop Distributions. Cloud: Oracle Event Hub Cloud Service, Azure HDInsight Kafka, Confluent Cloud
  10. Demo (I). Diagram: Truck-1, Truck-2, Truck-3 -> truck position -> console consumer. Test data generator by Hortonworks.
      Sample message: 2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631
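The sample message above is a pipe-delimited record with nine fields. A minimal sketch of how a consumer might split it, using only the Java standard library. The field names are assumptions for illustration: the slides do not name them, and only the third field can be inferred to be the driver id (27 matches the driver row shown on the Demo Example slide).

```java
// Hypothetical value class for one truck position message; field names are assumed.
final class TruckPosition {
    final String timestamp;
    final int truckId;      // assumed meaning of field 2
    final int driverId;     // field 3, matches the driver id 27 in the demo
    final long routeId;     // assumed meaning of field 4
    final String routeName;
    final String eventType;
    final double latitude;
    final double longitude;
    final String trailingId; // last field, meaning not stated on the slides

    private TruckPosition(String[] f) {
        timestamp = f[0];
        truckId = Integer.parseInt(f[1]);
        driverId = Integer.parseInt(f[2]);
        routeId = Long.parseLong(f[3]);
        routeName = f[4];
        eventType = f[5];
        latitude = Double.parseDouble(f[6]);
        longitude = Double.parseDouble(f[7]);
        trailingId = f[8];
    }

    // Splits on '|' and rejects records that do not have exactly nine fields.
    static TruckPosition parse(String line) {
        String[] fields = line.split("\\|", -1);
        if (fields.length != 9) {
            throw new IllegalArgumentException("expected 9 fields, got " + fields.length);
        }
        return new TruckPosition(fields);
    }

    public static void main(String[] args) {
        TruckPosition p = parse("2016-06-02 14:39:56.605|98|27|803014426|"
                + "Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631");
        System.out.println(p.driverId + " " + p.eventType + " " + p.latitude);
    }
}
```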
  11. Demo (I): Create Kafka Topic
      $ kafka-topics --zookeeper zookeeper:2181 --create --topic truck_position --partitions 8 --replication-factor 1
      $ kafka-topics --zookeeper zookeeper:2181 --list
      __consumer_offsets
      _confluent-metrics
      _schemas
      docker-connect-configs
      docker-connect-offsets
      docker-connect-status
      truck_position
  12. Demo (I): Run Producer and Kafka-Console-Consumer
  13. Demo (I): Java Producer to "truck_position". Constructing a Kafka Producer:
      private Properties kafkaProps = new Properties();
      kafkaProps.put("bootstrap.servers", "broker-1:9092");
      kafkaProps.put("key.serializer", "...StringSerializer");
      kafkaProps.put("value.serializer", "...StringSerializer");
      producer = new KafkaProducer(kafkaProps);
      // the driver id is the record key, so all events of one driver go to the same partition
      ProducerRecord record = new ProducerRecord("truck_position", driverId, eventData);
      try {
          // send() is asynchronous; get() blocks until the broker has acknowledged the record
          metadata = producer.send(record).get();
      } catch (Exception e) {
          e.printStackTrace(); // do not swallow send failures silently
      }
  14. Demo (II): devices send to MQTT instead of Kafka. Diagram: Truck-1, Truck-2, Truck-3 -> MQTT topic truck/nn/position.
      Sample message: 2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631
  15. Demo (II): devices send to MQTT instead of Kafka
  16. Demo (II): devices send to MQTT instead of Kafka. How to get the data into Kafka? Diagram: Truck-1, Truck-2, Truck-3 -> MQTT topic truck/nn/position -> ? -> truck position raw.
      Sample message: 2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631
  17. Kafka Connect
  18. Kafka Connect - Overview: Source Connector, Sink Connector
  19. Kafka Connect: Single Message Transforms (SMT). Simple transformations for a single message. Defined as part of Kafka Connect; some useful transforms are provided out of the box, and it is easy to implement your own. Optionally deploy 1+ transforms with each connector: modify messages produced by a source connector, or modify messages sent to sink connectors. Makes it much easier to mix and match connectors. Some of the currently available transforms: InsertField, ReplaceField, MaskField, ValueToKey, ExtractField, TimestampRouter, RegexRouter, SetSchemaMetadata, Flatten, TimestampConverter
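The idea of chaining one or more transforms per connector can be sketched without the Connect library. The code below is only an illustration of the concept, not the Kafka Connect `Transformation` interface: messages are modelled as plain maps, and `insertField` and `maskField` are hypothetical helpers loosely modelled on the InsertField and MaskField transforms named on the slide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Conceptual sketch: each transform takes one message and returns a modified copy.
final class SmtSketch {

    // Adds a static field to the message (loosely like InsertField).
    static UnaryOperator<Map<String, Object>> insertField(String name, Object value) {
        return msg -> {
            Map<String, Object> out = new LinkedHashMap<>(msg);
            out.put(name, value);
            return out;
        };
    }

    // Blanks out a field if present (loosely like MaskField; this sketch always uses "").
    static UnaryOperator<Map<String, Object>> maskField(String name) {
        return msg -> {
            Map<String, Object> out = new LinkedHashMap<>(msg);
            if (out.containsKey(name)) {
                out.put(name, "");
            }
            return out;
        };
    }

    // Applies the transforms in order, one message at a time.
    static Map<String, Object> applyChain(List<UnaryOperator<Map<String, Object>>> chain,
                                          Map<String, Object> msg) {
        Map<String, Object> current = msg;
        for (UnaryOperator<Map<String, Object>> transform : chain) {
            current = transform.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> driver = new LinkedHashMap<>();
        driver.put("id", 27);
        driver.put("firstName", "Walter");
        driver.put("birthdate", "24-JUL-85");

        List<UnaryOperator<Map<String, Object>>> chain = new ArrayList<>();
        chain.add(insertField("source", "jdbc"));
        chain.add(maskField("birthdate"));

        System.out.println(applyChain(chain, driver));
    }
}
```

Each transform returns a copy, so the original message is left untouched and transforms can be combined in any order.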
  20. Kafka Connect: Many Connectors. 60+ since first release (0.9+); 20+ from Confluent and Partners. Categories: Confluent supported Connectors, Certified Connectors, Community Connectors. Source: http://www.confluent.io/product/connectors
  21. Demo (III). Diagram: Truck-1, Truck-2, Truck-3 -> MQTT topic truck/nn/position -> mqtt to kafka -> truck_position -> console consumer.
      Sample message: 2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631
  22. Demo (III): Create MQTT Connect through REST API
      #!/bin/bash
      curl -X "POST" "http://192.168.69.138:8083/connectors" \
           -H "Content-Type: application/json" \
           -d $'{
        "name": "mqtt-source",
        "config": {
          "connector.class": "com.datamountaineer.streamreactor.connect.mqtt.source.MqttSourceConnector",
          "connect.mqtt.connection.timeout": "1000",
          "tasks.max": "1",
          "connect.mqtt.kcql": "INSERT INTO truck_position SELECT * FROM truck/+/position",
          "name": "MqttSourceConnector",
          "connect.mqtt.service.quality": "0",
          "connect.mqtt.client.id": "tm-mqtt-connect-01",
          "connect.mqtt.converter.throw.on.error": "true",
          "connect.mqtt.hosts": "tcp://mosquitto:1883"
        }
      }'
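The KCQL statement in this configuration reads from `truck/+/position`, where `+` is the MQTT single-level wildcard, so one connector covers the topics of all trucks. A simplified sketch of MQTT topic-filter matching follows. It is not the connector's or the broker's implementation, and it ignores special cases such as topics starting with `$`.

```java
// Simplified MQTT topic-filter matching: '+' matches exactly one level,
// '#' matches all remaining levels and must be the last level of the filter.
final class MqttTopicFilter {

    static boolean matches(String filter, String topic) {
        String[] f = filter.split("/", -1);
        String[] t = topic.split("/", -1);
        for (int i = 0; i < f.length; i++) {
            if (f[i].equals("#")) {
                return i == f.length - 1; // only valid as the final level
            }
            if (i >= t.length) {
                return false; // topic has fewer levels than the filter
            }
            if (!f[i].equals("+") && !f[i].equals(t[i])) {
                return false; // literal level differs
            }
        }
        return f.length == t.length; // no unmatched trailing topic levels
    }

    public static void main(String[] args) {
        System.out.println(matches("truck/+/position", "truck/98/position"));
        System.out.println(matches("truck/+/position", "truck/98/speed"));
    }
}
```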
  23. Demo (III): Call REST API and Kafka Console Consumer
  24. Demo (III). Diagram: Truck-1, Truck-2, Truck-3 -> MQTT topic truck/nn/position -> mqtt to kafka -> truck_position -> console consumer. What about some analytics?
      Sample message: 2016-06-02 14:39:56.605|98|27|803014426|Wichita to Little Rock Route 2|Normal|38.65|90.21|5187297736652502631
  25. Kafka Streams
  26. Kafka Streams - Overview. Designed as a simple and lightweight library in Apache Kafka; no external dependencies on systems other than Apache Kafka. Part of open source Apache Kafka, introduced in 0.10+. Leverages Kafka as its internal messaging layer. Supports fault-tolerant local state. Event-at-a-time processing (not microbatch) with millisecond latency. Windowing with out-of-order data using a Google DataFlow-like model
  27. Kafka Streams DSL and Processor Topology
      KStream stream1 = builder.stream("in-1");
      KStream stream2 = builder.stream("in-2");
      KStream joined = stream1.leftJoin(stream2, /* arguments elided on the slide */);
      KTable aggregated = joined.groupBy(/* arguments elided on the slide */).count("store");
      aggregated.to("out-1");
      Topology diagram nodes: 1 and 2 (sources), lj (left join), a (aggregate, backed by State), t (to)
  29. Kafka Streams Cluster and Processor Topology. Kafka Cluster topics: input-1, input-2, store (changelog), output. Topology nodes: 1, 2, lj, a, t, State
  30. Kafka Cluster and Processor Topology. input-1: Partition 0, Partition 1, Partition 2, Partition 3. input-2: Partition 0, Partition 1, Partition 2, Partition 3. Instances: Kafka Streams 1, Kafka Streams 2
  31. Kafka Cluster and Processor Topology. input-1: Partition 0, Partition 1, Partition 2, Partition 3. input-2: Partition 0, Partition 1, Partition 2, Partition 3. Instances: Kafka Streams 1, Kafka Streams 2, Kafka Streams 3, Kafka Streams 4
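The two slides above show the same four input partitions spread first over two and then over four Kafka Streams instances. A rough sketch of that scaling idea, assuming one unit of work per partition number (partition i of input-1 together with partition i of input-2) and a plain round-robin distribution. The round-robin rule is a simplification chosen for illustration; the real Kafka Streams assignment logic is more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: distributes partition numbers over instances round-robin.
final class TaskAssignmentSketch {

    // Returns instance index -> list of partition numbers handled by that instance.
    static Map<Integer, List<Integer>> assign(int partitions, int instances) {
        Map<Integer, List<Integer>> byInstance = new TreeMap<>();
        for (int i = 0; i < instances; i++) {
            byInstance.put(i, new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            byInstance.get(p % instances).add(p);
        }
        return byInstance;
    }

    public static void main(String[] args) {
        System.out.println(assign(4, 2)); // prints {0=[0, 2], 1=[1, 3]}
        System.out.println(assign(4, 4)); // prints {0=[0], 1=[1], 2=[2], 3=[3]}
    }
}
```

With four partitions, a fifth instance would receive an empty list in this sketch, which mirrors the general point that the partition count bounds the useful parallelism.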
  32. Stream vs. Table
      Event Stream (KStream):
        2017-10-02T20:18:46  11,Normal,41.87,-87.67
        2017-10-02T20:18:55  11,Normal,40.38,-89.17
        2017-10-02T20:18:59  21,Normal,42.23,-91.78
        2017-10-02T20:19:01  21,Normal,41.71,-91.32
        2017-10-02T20:19:02  11,Normal,38.65,-90.2
        2017-10-02T20:19:23  21,Normal,41.71,-91.32
      State Stream / Change Log Stream (KTable), key followed by value:
        11  2017-10-02T20:18:46,11,Normal,41.87,-87.67
        11  2017-10-02T20:18:55,11,Normal,40.38,-89.17
        21  2017-10-02T20:18:59,21,Normal,42.23,-91.78
        21  2017-10-02T20:19:01,21,Normal,41.71,-91.32
        11  2017-10-02T20:19:02,11,Normal,38.65,-90.2
        21  2017-10-02T20:19:23,21,Normal,41.71,-91.32
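The difference between the two views can be sketched with plain collections: an event stream keeps every record, while a table keeps only the latest value per key, so replaying the changelog rebuilds the table. This is a conceptual illustration with the standard library, not the KStream or KTable API; the keys and values are taken from the slide above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual sketch: a table is what remains after replaying a keyed changelog.
final class StreamVsTable {

    // Replays key/value records in order; later records overwrite earlier ones per key.
    static Map<String, String> toTable(List<String[]> changelog) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] record : changelog) {
            table.put(record[0], record[1]);
        }
        return table;
    }

    public static void main(String[] args) {
        List<String[]> changelog = new ArrayList<>();
        changelog.add(new String[] {"11", "2017-10-02T20:18:46,11,Normal,41.87,-87.67"});
        changelog.add(new String[] {"11", "2017-10-02T20:18:55,11,Normal,40.38,-89.17"});
        changelog.add(new String[] {"21", "2017-10-02T20:18:59,21,Normal,42.23,-91.78"});
        changelog.add(new String[] {"21", "2017-10-02T20:19:01,21,Normal,41.71,-91.32"});
        changelog.add(new String[] {"11", "2017-10-02T20:19:02,11,Normal,38.65,-90.2"});
        changelog.add(new String[] {"21", "2017-10-02T20:19:23,21,Normal,41.71,-91.32"});

        // The stream view has six records; the table view has one row per key.
        System.out.println("stream size: " + changelog.size());
        System.out.println("table: " + toTable(changelog));
    }
}
```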
  33. Kafka Streams: Key Features. Native, 100%-compatible Kafka integration; secure stream processing using Kafka's security features; elastic and highly scalable; fault-tolerant; stateful and stateless computations; interactive queries; time model; windowing; supports late-arriving and out-of-order data; millisecond processing latency, no micro-batching; At-least-onc