Stream Processing with Kafka and Samza
Diego Pacheco @diego_pacheco Principal Software Architect
● Kafka: LinkedIn, 2011
● Implemented in Scala and Java
● Motivation: real-time data feeds
● Goals:
– Low latency
– High throughput
● Kafka at LinkedIn (2014):
– 300+ brokers
– 18k topics
– 140k partitions
– 220B messages per day
– 40TB inbound
– 160TB outbound
– Peak load: 3.25M messages/second
● Use cases: activity streams, offline log processing
NO JMS
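Kafka's partitioned model (next slides) rests on key-based partitioning: all messages with the same key land in the same partition, which is what makes per-key ordering possible at 3.25M messages/second. The sketch below illustrates the idea with a simple polynomial hash; Kafka's actual default partitioner uses murmur2, and `PartitionSketch`/`partitionFor` are illustrative names, not Kafka API.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch of Kafka-style key-based partitioning.
// Not Kafka's DefaultPartitioner (which uses murmur2); same idea only.
public class PartitionSketch {
    // Map a message key onto one of numPartitions partitions so that all
    // messages with the same key land in the same, ordered partition.
    static int partitionFor(String key, int numPartitions) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = 0;
        for (byte b : bytes) hash = 31 * hash + b;   // simple polynomial hash
        return (hash & 0x7fffffff) % numPartitions;  // mask sign bit, then mod
    }

    public static void main(String[] args) {
        int p1 = partitionFor("member-42", 8);
        int p2 = partitionFor("member-42", 8);
        // Same key, same partition: per-key ordering is preserved.
        System.out.println(p1 == p2 && p1 >= 0 && p1 < 8); // prints "true"
    }
}
```

Because partitioning is deterministic, consumers can scale out by partition while still seeing each key's messages in order.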
● Samza: LinkedIn, 2013
● Stream Processing with Save Points.
● Multi-tenancy: 1 Thread per container
● State is simple
– You handle logging and restoring
– Single-threaded programming
● Works with YARN
● Works well with Kafka
● Simple API: record-at-a-time.
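The "record-at-a-time" API and the one-thread-per-container model can be sketched as below. This is a simplified mimic, not Samza's real interface (which is `org.apache.samza.task.StreamTask` with `IncomingMessageEnvelope`, `MessageCollector`, and `TaskCoordinator` parameters); the page-view counting task is a hypothetical example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified mimic of Samza's record-at-a-time task API; names are
// illustrative, not the real org.apache.samza.task interfaces.
interface MessageCollector { void send(String stream, String key, String value); }

interface StreamTask {
    // Called once per incoming message. Each container runs its tasks on a
    // single thread, so the task body needs no locking.
    void process(String key, String value, MessageCollector collector);
}

public class PageViewCounterTask implements StreamTask {
    private final Map<String, Integer> counts = new HashMap<>();

    @Override
    public void process(String key, String value, MessageCollector collector) {
        int n = counts.merge(key, 1, Integer::sum); // running count per key
        collector.send("page-view-counts", key, String.valueOf(n));
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        StreamTask task = new PageViewCounterTask();
        MessageCollector collector = (stream, k, v) -> out.add(k + "=" + v);
        for (String page : new String[]{"/home", "/jobs", "/home"})
            task.process(page, "view", collector);
        System.out.println(out); // prints "[/home=1, /jobs=1, /home=2]"
    }
}
```

The single-threaded callback is what keeps state handling "simple": the task mutates its map freely, with no concurrency concerns.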
● Stream Processing
● Low Latency
● Async Processing
● Local state
● Stores data locally on disk, on the same machine where the container runs
– A great fit for stateful processing
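"You handle logging and restoring" (earlier slide) pairs with local state like this: every write also goes to a changelog stream, and after a restart the store is rebuilt by replaying that log. A minimal sketch, assuming an in-memory map standing in for the on-disk store and a list standing in for the Kafka changelog topic (real Samza provides a `KeyValueStore` abstraction; `ChangelogStore` here is illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of Samza-style local state with a changelog: every write is also
// appended to a log so the store can be rebuilt after a container restart.
// Illustrative only; not Samza's actual KeyValueStore API.
public class ChangelogStore {
    private final Map<String, String> store = new HashMap<>(); // local state
    private final List<String[]> changelog;                    // stands in for a Kafka topic

    ChangelogStore(List<String[]> changelog) { this.changelog = changelog; }

    void put(String key, String value) {
        store.put(key, value);
        changelog.add(new String[]{key, value}); // log the write
    }

    String get(String key) { return store.get(key); }

    // Restore: replay the changelog, in order, into a fresh store.
    static ChangelogStore restore(List<String[]> changelog) {
        ChangelogStore fresh = new ChangelogStore(new ArrayList<>(changelog));
        for (String[] entry : changelog) fresh.store.put(entry[0], entry[1]);
        return fresh;
    }

    public static void main(String[] args) {
        List<String[]> log = new ArrayList<>();
        ChangelogStore s = new ChangelogStore(log);
        s.put("user:1", "online");
        s.put("user:1", "offline");
        // Simulate a container restart: rebuild state from the changelog.
        ChangelogStore recovered = ChangelogStore.restore(log);
        System.out.println(recovered.get("user:1")); // prints "offline"
    }
}
```

Because the changelog lives in Kafka, recovery needs no remote database: the container replays its own partition of the log and picks up where it left off.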
● Tight Integration with Kafka
● Strong model for streams: ordered, highly available, partitioned, and durable (Kafka)
● Full feature set of Kafka
● Client-side joins
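A client-side join means enriching events against locally held state rather than querying a remote database: because both streams are partitioned on the same key, the matching state is guaranteed to be on the same machine. A hypothetical sketch (the click/profile data and `enrich` helper are illustrative, not from the talk):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a client-side stream-to-table join: clicks are enriched from
// a local profile store instead of a remote lookup. Illustrative names.
public class ClientSideJoin {
    // Enrich each "memberId:adId" click with the locally stored profile.
    static List<String> enrich(List<String> clicks, Map<String, String> profiles) {
        List<String> out = new ArrayList<>();
        for (String click : clicks) {
            String memberId = click.split(":")[0];
            // Join happens locally: co-partitioning by memberId guarantees
            // this container holds the profile for every click it receives.
            out.add(click + " -> " + profiles.getOrDefault(memberId, "unknown"));
        }
        return out;
    }

    public static void main(String[] args) {
        // Local state, kept current from a co-partitioned profile stream.
        Map<String, String> profiles = new HashMap<>();
        profiles.put("m42", "Diego, Architect");
        List<String> enriched = enrich(List.of("m42:ad-1", "m9:ad-2"), profiles);
        System.out.println(enriched); // prints "[m42:ad-1 -> Diego, Architect, m9:ad-2 -> unknown]"
    }
}
```

Keeping the join on the client removes a network round-trip per event, which is what makes the low-latency goal achievable.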
Thank You! Obrigado! (Portuguese for "Thank you!")