Apache Kafka: A Distributed Streaming Platform

Apache Kafka: A Distributed Streaming Platform, StreamProcessing.be - Belgium, Wednesday, 18th January 2017, <paolo@confluent.io>


Page 1: Apache kafka-a distributed streaming platform

Apache Kafka

A Distributed Streaming Platform

StreamProcessing.be - Belgium Wednesday, 18th January 2017

<paolo@confluent.io>

Page 2: Apache kafka-a distributed streaming platform

https://www.confluent.io/blog/stream-data-platform-1/

Industry shift from Big Data to Fast Data and Stream Processing

Page 3: Apache kafka-a distributed streaming platform

$ cat < in.txt | grep "apache" | tr a-z A-Z > out.txt

Apache Kafka APIs and UNIX analogy

Page 4: Apache kafka-a distributed streaming platform

$ cat < in.txt | grep "apache" | tr a-z A-Z > out.txt

Connect APIs

Apache Kafka APIs and UNIX analogy

Page 5: Apache kafka-a distributed streaming platform

$ cat < in.txt | grep "apache" | tr a-z A-Z > out.txt

Producer/Consumer APIs

Apache Kafka APIs and UNIX analogy

Page 6: Apache kafka-a distributed streaming platform

$ cat < in.txt | grep "apache" | tr a-z A-Z > out.txt

Streams APIs

Apache Kafka APIs and UNIX analogy
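The shell pipeline on these slides can be sketched in plain Java to make the analogy concrete. This is only an illustration using the standard library, not the Kafka Streams API: the class name PipelineSketch and the method process are hypothetical, and the in-memory list stands in for in.txt. One plausible reading of the analogy is that the filter and map steps below play the role the Streams APIs play for grep and tr, while reading the input and writing the output correspond to what the Connect and Producer/Consumer APIs handle.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class PipelineSketch {

    // Stands in for `grep "apache" | tr a-z A-Z`:
    // keep the lines containing "apache", then upper-case them.
    // Note: toUpperCase also affects non-ASCII letters, unlike `tr a-z A-Z`.
    static List<String> process(List<String> lines) {
        return lines.stream()
                .filter(line -> line.contains("apache"))
                .map(line -> line.toUpperCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical input; the slides read it from in.txt instead.
        List<String> in = Arrays.asList("apache kafka", "something else", "the apache way");
        // Prints the two matching lines in upper case.
        process(in).forEach(System.out::println);
    }
}
```

The processing logic is an ordinary function over records, which is the point of the analogy: the transformation stays the same regardless of where the data comes from or goes to.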

Page 7: Apache kafka-a distributed streaming platform

Streams APIs part of Apache Kafka

http://kafka.apache.org/documentation/streams

Page 8: Apache kafka-a distributed streaming platform

Build applications, not clusters

<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka-streams</artifactId>
  <version>0.10.1.1</version>
</dependency>

Page 9: Apache kafka-a distributed streaming platform

Spot the difference(s)

Page 10: Apache kafka-a distributed streaming platform

How do I run in production?

Page 11: Apache kafka-a distributed streaming platform

How do I run in production?

Like any other Java application...

Page 12: Apache kafka-a distributed streaming platform

How do I run in production?

Uncool Cool

Page 13: Apache kafka-a distributed streaming platform

Typical High Level Architecture

Page 14: Apache kafka-a distributed streaming platform

Typical High Level Architecture

Real-time Data

Ingestion

Page 15: Apache kafka-a distributed streaming platform

Typical High Level Architecture

Stream Processing

Storage

Real-time Data

Ingestion

Page 16: Apache kafka-a distributed streaming platform

Typical High Level Architecture

Data Publishing / Visualization

Stream Processing

Storage

Real-time Data

Ingestion

Page 17: Apache kafka-a distributed streaming platform

How many clusters do you count?

• NoSQL (Cassandra, HBase, Couchbase, MongoDB, …) or Elasticsearch, Solr

• Storm, Flink, Spark Streaming, Ignite, Akka Streams, Apex, …

• HDFS, NFS, Ceph, GlusterFS, Lustre, ...

• Apache Kafka

Page 18: Apache kafka-a distributed streaming platform

Simplicity is the ultimate sophistication

Apache Kafka Distributed Streaming Platform

Publish & Subscribe to streams of data like a messaging system

Store streams of data safely in a distributed replicated cluster

Process streams of data efficiently and in real-time

Node.js
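The first two capabilities on this slide (publish and subscribe, store) both rest on the idea of an append-only log that consumers read at their own pace. The sketch below shows that idea in a single process using only the Java standard library. It is an assumption-laden toy, not Kafka's implementation: the class LogSketch and its methods publish and poll are hypothetical names, and there is no partitioning, replication or persistence.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LogSketch {

    // The stored stream: records are only ever appended, never modified.
    private final List<String> records = new ArrayList<>();

    // Next offset to read, tracked separately for each consumer name.
    private final Map<String, Integer> offsets = new HashMap<>();

    // Publish: append a record and return the offset it was stored at.
    public int publish(String record) {
        records.add(record);
        return records.size() - 1;
    }

    // Subscribe: return every record this consumer has not seen yet,
    // then advance its offset to the end of the log.
    public List<String> poll(String consumer) {
        int from = offsets.getOrDefault(consumer, 0);
        List<String> batch = new ArrayList<>(records.subList(from, records.size()));
        offsets.put(consumer, records.size());
        return batch;
    }

    public static void main(String[] args) {
        LogSketch log = new LogSketch();
        log.publish("a");
        log.publish("b");
        System.out.println(log.poll("dashboard")); // dashboard reads a and b
        log.publish("c");
        System.out.println(log.poll("dashboard")); // dashboard reads only c
        System.out.println(log.poll("archiver"));  // a new consumer replays from the start
    }
}
```

Because the records stay in the log after being read, a consumer that joins later can still replay the whole stream, which is what separates this model from a queue that deletes messages on delivery.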

Page 19: Apache kafka-a distributed streaming platform

Apache Kafka and Streams APIs benefits

• Build applications, not clusters

• Native integration with Apache Kafka

• Elastic, fast, distributed, fault-tolerant, secure

• Scalable: S, M, L, XL, XXL

• Run everywhere: from containers to cloud

• Streams (with KStream) and tables (with KTable)

• Local state replicated to Kafka for fault-tolerance

• Windowing and event time semantics out of the box

• Supports late-arriving and out-of-order events
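The last two bullets can be illustrated with a small plain-Java sketch of event-time windowing. It assumes fixed-size (tumbling) windows and non-negative timestamps in milliseconds, and it does not use the Kafka Streams API: WindowSketch and countPerWindow are hypothetical names. The point is that each event is assigned to a window by the timestamp it carries, not by when it arrives, so an out-of-order event still lands in the right window.

```java
import java.util.Map;
import java.util.TreeMap;

public class WindowSketch {

    // Counts events per tumbling window, keyed by window start time (ms).
    // The window is derived from each event's own timestamp, so arrival
    // order does not matter. Timestamps are assumed to be non-negative.
    static Map<Long, Integer> countPerWindow(long[] eventTimesMs, long windowSizeMs) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long ts : eventTimesMs) {
            long windowStart = ts - (ts % windowSizeMs);
            counts.merge(windowStart, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Arrival order differs from event-time order: the events stamped
        // 2000 and 59000 arrive after an event stamped 61000.
        long[] arrivals = {1000L, 61000L, 2000L, 59000L, 62000L};
        // One-minute windows: three events fall in [0, 60000), two in [60000, 120000).
        System.out.println(countPerWindow(arrivals, 60000L));
    }
}
```

A real stream processor also has to decide how long to keep a window open for late arrivals; this sketch sidesteps that question by processing a finite batch.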

Page 20: Apache kafka-a distributed streaming platform

Apache Kafka adoption across the industry… everybody loves simplicity!

Page 21: Apache kafka-a distributed streaming platform

References

• http://kafka.apache.org/

• http://kafka.apache.org/documentation/streams

• http://docs.confluent.io/

• http://docs.confluent.io/current/streams/

• http://blog.confluent.io/

• http://github.com/confluentinc/examples

• http://github.com/apache/kafka/tree/trunk/streams

Page 22: Apache kafka-a distributed streaming platform

References

Page 23: Apache kafka-a distributed streaming platform

The easiest way to get started

https://www.confluent.io/download/

Page 24: Apache kafka-a distributed streaming platform

WE ♥ SIMPLICITY

Page 25: Apache kafka-a distributed streaming platform

YOUR FEEDBACK!

Page 26: Apache kafka-a distributed streaming platform

Discount code: kafcom17

Use the Apache Kafka community discount code to get $50 off

www.kafka-summit.org

Kafka Summit New York: May 8

Kafka Summit San Francisco: August 28
