Stream Processing with Kafka and Samza



  • Date posted

    16-Feb-2017
  • Category

    Technology


Transcript of Stream Processing with Kafka and Samza

  • Stream Processing with Kafka and Samza

    Diego Pacheco @diego_pacheco Principal Software Architect

  • LinkedIn, 2011. Implemented with Scala and Java. Motivation: real-time data feeds. Goals: low latency, high throughput.

    Kafka at LinkedIn (2014): 300+ brokers, 18k topics, 140k partitions, 220B messages per day, 40 TB inbound, 160 TB outbound. Peak load: 3.25M messages/second.

    Use cases: activity stream, offline log processing.
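To put the 2014 figures in perspective, here is a small back-of-the-envelope calculation. It uses only the numbers on the slide. Treating the inbound and outbound volumes as covering the same period is an assumption, since the slide does not say so.

```python
# Figures copied from the slide.
MESSAGES_PER_DAY = 220e9      # 220B messages per day
PEAK_PER_SECOND = 3.25e6      # peak load: 3.25M messages/second
INBOUND_TB = 40               # 40 TB inbound
OUTBOUND_TB = 160             # 160 TB outbound

SECONDS_PER_DAY = 24 * 60 * 60

# Average rate implied by the daily total.
average_per_second = MESSAGES_PER_DAY / SECONDS_PER_DAY

# How far the peak sits above that average.
peak_to_average = PEAK_PER_SECOND / average_per_second

# Assumption: both volumes cover the same period, so the ratio
# approximates how many times each written byte is read.
fan_out = OUTBOUND_TB / INBOUND_TB

print(f"average: {average_per_second / 1e6:.2f}M messages/second")
print(f"peak / average: {peak_to_average:.2f}")
print(f"read fan-out: {fan_out:.0f}x")
```

Under these assumptions the average works out to roughly 2.55M messages/second, so the quoted peak is only about 1.3 times the average, and each byte written is read about four times.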

  • NO JMS

  • LinkedIn, 2013. Stream processing with save points. Multi-tenancy: 1 thread per container. State is simple.

    You handle logging and restoring. Single-threaded programming.

    Works with YARN. Works well with Kafka. Simple API. Record-like.
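One possible reading of "save points" and "logging and restoring" is that a task records every state update in a durable log and replays that log after a restart. The sketch below is a toy in-memory model of that idea in Python. It is not Samza's API (Samza tasks are written in Java), and the names `CountingTask`, `changelog`, `process`, and `restore` are hypothetical.

```python
from collections import defaultdict


class CountingTask:
    """Toy single-threaded task that counts events per key in local state."""

    def __init__(self, changelog):
        # A plain list stands in for a durable, append-only log.
        self.changelog = changelog
        # Local state owned by this task alone.
        self.counts = defaultdict(int)

    def process(self, key):
        # Messages arrive one at a time, so no locking is needed around the state.
        self.counts[key] += 1
        # Log the new value so the state can be rebuilt later.
        self.changelog.append((key, self.counts[key]))

    def restore(self):
        # Rebuild local state by replaying every logged update in order.
        self.counts.clear()
        for key, value in self.changelog:
            self.counts[key] = value


changelog = []
task = CountingTask(changelog)
for user in ["alice", "bob", "alice"]:
    task.process(user)

# Simulate a container restart: a fresh task restores from the same log.
recovered = CountingTask(changelog)
recovered.restore()
print(dict(recovered.counts))  # {'alice': 2, 'bob': 1}
```

Because each task handles one message at a time, the state needs no synchronization, which is one way to understand "State is simple" on the slide.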

  • Stream processing. Low latency. Async processing. Local state.

    Stores data locally on disk, on the same machine where the container runs.

    Awesome fit for stateful processing. Tight integration with Kafka. Strong model for streams: ordered, highly available, partitioned, and durable (Kafka). Full feature set of Kafka. Client-side join.
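The "ordered" and "partitioned" properties of the stream model can be illustrated with a toy in-memory log: a key is hashed to pick a partition, and each partition is append-only, so all events for one key keep their arrival order. This is a sketch, not Kafka's client API. `NUM_PARTITIONS`, `partition_for`, and `append` are hypothetical names, and `zlib.crc32` is used only as a simple deterministic stand-in for a real partitioner's hash function.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical partition count for this sketch


def partition_for(key: str) -> int:
    # A deterministic hash means the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def append(partitions, key, value):
    # Each partition is an append-only list; the list index acts as the offset.
    p = partition_for(key)
    partitions[p].append((key, value))
    return p, len(partitions[p]) - 1


partitions = [[] for _ in range(NUM_PARTITIONS)]
events = [
    ("alice", "login"),
    ("bob", "login"),
    ("alice", "view"),
    ("alice", "logout"),
]
for key, value in events:
    append(partitions, key, value)

# Every event for "alice" sits in one partition, in the order it was written.
alice_partition = partitions[partition_for("alice")]
alice_events = [value for key, value in alice_partition if key == "alice"]
print(alice_events)  # ['login', 'view', 'logout']
```

Ordering is guaranteed only within a partition. Events for different keys may land in different partitions, and the model makes no promise about their relative order.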

  • Stream Processing with Kafka and Samza

    Diego Pacheco @diego_pacheco Principal Software Architect

    Thank You! Obrigado!
