XebiConFr 15 - Kafka par la face nord

Posted on 09-Jan-2017


Kafka par la face nord (Kafka via the North Face)

Messaging System?

Jay Kreps, Neha Narkhede, Jun Rao

History

[Diagrams: successive architecture slides connecting WebApp, RelationalDB, NoSQLDB, DWH, Hadoop, ETL, ActiveMQ, Logs, Monitoring, and Search]

BIG MESS

Stream Data Platform

Features

● Distributed
● High throughput
● Large number of consumers
● Ad-hoc consumers
● Batch consumers
● Automatic recovery from broker failure

Distributed Commit Logs

[Diagram: an append-only log of records numbered 1 to 18, with 19 as the next position; labels: "1st record", "Next record written"]
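To make the commit-log idea concrete, here is a minimal in-memory sketch, not Kafka's implementation. The class name `CommitLog` and its methods are hypothetical, chosen for illustration. It assumes offsets start at 0 (as Kafka's do), whereas the slide numbers records from 1.

```python
class CommitLog:
    """Illustrative in-memory model of an append-only log (hypothetical, not Kafka code)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record at the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        return self._records[offset:offset + max_records]

    @property
    def next_offset(self):
        """Offset that the next appended record will receive."""
        return len(self._records)


log = CommitLog()
first = log.append("1st record")    # offset 0 in this sketch
second = log.append("next record")  # records are only ever added at the end
```

Reading never removes anything: any number of readers can call `read` with their own offset, which is the property the consumer slides rely on.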

Architecture

[Diagram: three producers write to a Kafka cluster made of several brokers and coordinated by Zookeeper; three consumers read from the cluster]

[Diagram: three producers write to a topic split into Partition #1, Partition #2, and Partition #3. Each partition is a log ordered from old to new; every record has an offset, and writes land at the next offset of the chosen partition (19, 16, and 17 respectively)]
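The partition diagram can be sketched as a topic holding several independent logs, with each write appended to exactly one of them. This is a simplified model under stated assumptions: the `Topic` class is hypothetical, and choosing the partition with `zlib.crc32` of a record key is a stand-in for whatever partitioner a real producer is configured with.

```python
import zlib


class Topic:
    """Illustrative topic made of independent append-only partitions (hypothetical)."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: hash the key to pick a partition; crc32 is only a stand-in.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def send(self, key, value):
        """Append value to the partition chosen for key; return (partition, offset)."""
        index = self.partition_for(key)
        self.partitions[index].append(value)
        return index, len(self.partitions[index]) - 1


topic = Topic("log", num_partitions=3)
p1, o1 = topic.send("user-42", "login")
p2, o2 = topic.send("user-42", "click")  # same key, so same partition, next offset
```

Offsets are only meaningful within one partition, so ordering is guaranteed per partition rather than across the whole topic.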

Consumer groups

[Diagram: Consumer group 1 and Consumer group 2 each read Partition #1, Partition #2, and Partition #3, ordered from old to new, at their own positions]

Group  Topic  Partition #  Offset
1      log    1            18
1      log    2            12
1      log    3            14
2      log    1            1
2      log    2            0
2      log    3            3

Topic storage

[Diagram: Partition #1 with records 1 to 18; the partition is stored as a directory, and each segment of the partition is a file]

app_log-2:
total 1864
-rw-r--r-- 1 root root      512 Oct 28 01:00 00000000000000208027.index
-rw-r--r-- 1 root root   337762 Oct 27 19:03 00000000000000208027.log
-rw-r--r-- 1 root root 10485760 Oct 28 19:24 00000000000000208739.index
-rw-r--r-- 1 root root  1553051 Oct 28 19:24 00000000000000208739.log

app_log-3:
total 1940
-rw-r--r-- 1 root root       48 Oct 27 07:02 00000000000000207555.index
-rw-r--r-- 1 root root    31360 Oct 27 04:05 00000000000000207555.log
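In the listing above, each segment file is named after the offset of its first record, zero-padded to 20 digits. Under that naming scheme, locating the segment that holds a given offset is a search over the sorted base offsets. The helper names below are hypothetical, and the sketch deliberately ignores the `.index` files and the end of the log.

```python
import bisect


def segment_file_name(base_offset, suffix=".log"):
    """Build a segment file name: base offset zero-padded to 20 digits."""
    return f"{base_offset:020d}{suffix}"


def find_segment(base_offsets, offset):
    """Return the base offset of the segment containing offset.

    base_offsets must be sorted in ascending order. This sketch does not
    check whether offset lies beyond the last record actually written.
    """
    index = bisect.bisect_right(base_offsets, offset) - 1
    if index < 0:
        raise ValueError("offset is older than the first retained segment")
    return base_offsets[index]


# Base offsets of the two segments listed for app_log-2.
bases = [208027, 208739]
name = segment_file_name(find_segment(bases, 208500))
```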

Topic clustering

[Diagrams: Partition #1, Partition #2, and Partition #3 are spread across the brokers; each partition has one Leader and two Replicas]
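The leader and replica roles, combined with the "automatic recovery from broker failure" feature listed earlier, can be pictured with the toy model below. It is a deliberately simplified sketch: the `PartitionAssignment` class and the broker names are hypothetical, and promoting the first remaining replica is an assumption made for illustration, not a description of Kafka's actual election logic.

```python
class PartitionAssignment:
    """Illustrative leader/replica bookkeeping for one partition (hypothetical)."""

    def __init__(self, leader, replicas):
        self.leader = leader
        self.replicas = list(replicas)

    def broker_failed(self, broker):
        """Update roles after a broker failure."""
        if broker == self.leader:
            if not self.replicas:
                raise RuntimeError("no replica left to promote")
            # Assumption: promote the first remaining replica to leader.
            self.leader = self.replicas.pop(0)
        elif broker in self.replicas:
            self.replicas.remove(broker)


partition = PartitionAssignment(leader="broker-1", replicas=["broker-2", "broker-3"])
partition.broker_failed("broker-1")  # the leader goes down, a replica takes over
```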

Jay Kreps, Neha Narkhede, Jun Rao

Kafka Enterprise Ready

[Timeline: 2011, 2012, 2014]

Usage

● User behaviour, click stream analysis
● Infrastructure monitoring and security
● Telemetry data from mobile/sensors
● IoT
● Log analysis
● ...

Used by

● LinkedIn: activity stream, metrics
● Netflix: real-time monitoring
● Twitter: real-time data pipeline
● Spotify: log delivery
● Loggly: log collection and processing
● Mozilla: telemetry data
● Airbnb, Square, Uber, Criteo, OVH ...

Pain Points

● You need to write code to use it (no ready-made producers and consumers)
● Not a JMS replacement
● No data transformations yet
● No encryption, authorization, or authentication yet (v0.9.0: KAFKA-2210, KAFKA-2211)

Hands-On: The Road

● Installation
● Zookeeper
● Brokers
● Topic
● Console Tools

Hands-On: The Trail

Kafka Producer (Java/Scala)

High Level Consumer (Java/Scala)

Hands-On: The North Face

“Simple” Consumer

Go!