Apache Kafka – PUC-Rio (endler/courses/RT-Analytics/transp/Kafka.pdf)

Transcript of Apache Kafka lecture slides

Page 1:

Apache Kafka

A distributed Publish/subscribe messaging system:

a middleware to reliably distribute data to consumers

Page 2:

Intro

• Originally developed by LinkedIn, and later turned into an Apache open source project

• Designed for processing real-time activity streams (e.g., log metrics collections)

• Written in Scala

• Features:
– Persistent messaging
– High throughput
– Supports both queue and Pub/Sub semantics
– Parallel data distribution
– Uses Zookeeper for forming a cluster of nodes

• http://kafka.apache.org

Page 3:

• A Kafka cluster consists of N Kafka brokers, which use Zookeeper to coordinate their states.

• Kafka brokers receive messages from Producers (push) and deliver messages to Consumers (pull)

• They are responsible for persisting the messages for a configurable period of time or up to a configurable amount of space

• Messages are persisted to append-only log files (sequential writes), and consumers read a range of these files (sequential reads).

(Figure: producers are push-based, consumers are pull-based)
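This push/pull split is visible directly in Kafka's standard Java client: producers call send() to push, consumers call poll() to pull. Below is a minimal sketch using the modern Java client (the 0.8-era API contemporary with these slides differs); the broker address, topic name and group id are placeholders:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class PushPullSketch {
        public static void main(String[] args) {
            Properties p = new Properties();
            p.put("bootstrap.servers", "localhost:9092");  // placeholder broker address
            p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            // Producer side: push a message to the brokers.
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
                producer.send(new ProducerRecord<>("events", "hello"));
            }

            Properties c = new Properties();
            c.put("bootstrap.servers", "localhost:9092");
            c.put("group.id", "demo-group");               // placeholder consumer group
            c.put("auto.offset.reset", "earliest");        // start from the oldest retained message
            c.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            c.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            // Consumer side: pull messages from the brokers.
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
                consumer.subscribe(List.of("events"));
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                for (ConsumerRecord<String, String> r : records)
                    System.out.printf("offset=%d value=%s%n", r.offset(), r.value());
            }
        }
    }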

Page 4:

Example  

Page 5:

Need for Persistent Queuing

Typical configuration of real-time stream processing. Reasons for data persistence:
• workers may fail during processing, and the data must then be distributed again
• the system must be flexible enough to handle bursts in traffic => the persistent queue becomes the data buffer until more workers are started

(Figure: a stream producer feeds a dispatcher, which distributes the items among several workers)

Page 6:

Interface for a Single-consumer Persistent Queue Server

Basic idea:
• When an item is read, it is not immediately removed.
• The consumer sends an explicit ack telling that processing of the item was successful. Otherwise, a failure is reported and the item is reassigned to a new worker.
• Only when an item gets an ack is it definitively removed from the queue.

But what if many applications need to consume the same stream? (e.g. a page-view data stream -> App1: analysis of page views over time, App2: analysis of unique visitors over time)

class Item {
    long id;
    byte[] item;
}

interface Queue {
    Item get();
    void ack(long id);
    void fail(long id);
}
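The slides leave the server side open; the following is a minimal in-memory sketch of one way to implement this interface (my illustration, not from the slides). Items handed out by get() are parked in a pending map until they are acked or failed:

    import java.util.ArrayDeque;
    import java.util.HashMap;

    class InMemoryAckQueue implements Queue {
        private final ArrayDeque<Item> ready = new ArrayDeque<>();
        private final HashMap<Long, Item> pending = new HashMap<>();

        void put(Item item) { ready.addLast(item); }

        public Item get() {
            Item item = ready.pollFirst();                 // handed out, but not yet gone:
            if (item != null) pending.put(item.id, item);  // parked until ack/fail
            return item;                                   // null if the queue is empty
        }

        public void ack(long id) { pending.remove(id); }   // now definitively removed

        public void fail(long id) {                        // processing failed:
            Item item = pending.remove(id);                // put the item back so it can
            if (item != null) ready.addFirst(item);        // be reassigned to another worker
        }
    }

A real server would also have to re-queue items whose consumer dies without ever calling fail(); that per-item bookkeeping is exactly the state the next slides argue should move to the clients.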

Page 7:

Handling Several Consuming Applications

Option A:
• Make all applications reside in the same codebase (and run in the same workers), or use a single "queue consumer"
• Disadvantage: lack of isolation and of parallelism

Option B:
• Maintain a separate queue for each consumer application
• Disadvantage: the load on the server is now proportional to the number of separate applications and to the frequency of incoming events

The best choice would be:
• a single queue where adding a consumer is simple and introduces a minimal increase in load

The main problem of a single-consumer queue is that the queue server has to keep track of whether an item has been successfully consumed or not. Main idea: why not shift this responsibility to the consuming client applications?

(Figure: Option A, a single queue consumer feeding App 1 … App N)

Page 8:

Multi-consumer Queue

Main idea:
• each consumer application keeps track of the consumed event objects
• it can request the event stream to be replayed from any point in the event stream history
• the queue server guarantees that a certain amount of the stream is always available (e.g. the items produced during the last 12 hours, or the freshest events within 50 GB)
• it also ensures that events are processed in the order in which they were produced (their order in the queue)

(Figure: a multi-consumer queue holding items 0–8. App1, whose last successfully consumed item is 4, gets 3 items from position 5 on; App2, whose last successfully consumed item is 1, gets 7 items from position 2 on.)
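A toy sketch of this idea (again my illustration, not from the slides): the server only exposes positional reads over an append-only list, and each application keeps its own cursor, so adding one more consumer costs the server almost nothing:

    import java.util.ArrayList;
    import java.util.List;

    class MultiConsumerLog {
        private final ArrayList<byte[]> items = new ArrayList<>();  // append-only, single-threaded toy
        void append(byte[] item) { items.add(item); }
        // Positional read: any consumer may (re)read any retained range.
        List<byte[]> read(int fromPosition, int maxItems) {
            int to = Math.min(items.size(), fromPosition + maxItems);
            return items.subList(fromPosition, to);
        }
    }

    class AppCursor {
        int lastConsumed = -1;  // per-application state, kept by the client
        List<byte[]> fetch(MultiConsumerLog log, int n) {
            List<byte[]> batch = log.read(lastConsumed + 1, n);
            lastConsumed += batch.size();  // advance only after successful processing
            return batch;
        }
    }

(Retention, i.e. dropping the oldest items, is omitted here; Kafka's version of it appears on the retention-policy slide.)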

Page 9:

Kafka Main Concepts
• Data distribution is organized into topics. Data contained within a topic is somehow related (e.g. it is parsed in the same way).

• Topic:: a category or feed name to which messages are published.

• Producer:: any program that can publish a message to a topic (in some given serialization method). It can also send a set of messages in a single publish "burst" request.

• Published messages are stored in a set of distributed servers (brokers), called a Kafka cluster.
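In the Java client, the publish "burst" corresponds to producer-side batching: records sent close together are grouped into one request per partition. A sketch (the property values are arbitrary examples and the topic name page-views is made up):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("linger.ms", "50");      // wait up to 50 ms so sends can accumulate
    props.put("batch.size", "65536");  // at most 64 KB per batch and partition
    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
        for (int i = 0; i < 1000; i++)
            producer.send(new ProducerRecord<>("page-views", "view-" + i));
    }   // close() flushes any batches still open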

Page 10:

Logs
Kafka is "inspired" by logs. Logs are/have:
• an append-only data structure
• sequential, strictly ordered
• sequential writes and sequential reads can be made very fast
• the log records when things happened (through the message's offset)
• allow deterministic replay of the history of events
• a log can be seen as a queue of things that still have to be processed

• In Kafka, a topic is a sort of append-only log

Page 11:

Kafka Main Concepts
• Each topic is like a parallel queue.
• Data in a topic is further divided into partitions (disjoint parallel slices), which are a means of parallelizing the consumption of messages
• Each partition is an ordered, immutable sequence of messages of the topic that is continually appended to: a log.
• Each partition is held by a Kafka broker
• Producers choose which message to assign to which partition within the topic. This can be done in a round-robin fashion to balance load, or according to some semantic partition function (say, based on some key in the message), as the sketch below shows.
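Both strategies map directly onto the Java client's ProducerRecord constructors (continuing the producer sketch from the earlier slide; topic and key names are invented):

    // No key: the client spreads records over the partitions
    // (round-robin, or "sticky" batches in newer client versions).
    producer.send(new ProducerRecord<>("page-views", "some-view-event"));

    // With a key: hash(key) picks the partition, so all events of
    // user-42 land in the same partition, in production order.
    producer.send(new ProducerRecord<>("page-views", "user-42", "some-view-event"));

    // An explicit partition number can also be given:
    producer.send(new ProducerRecord<>("page-views", 0, "user-42", "some-view-event"));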

Page 12:

Advantage of Partitions

• Split up the message stream into independent message streams

• Split the load among the consumers (the workers) => horizontal parallelism

• Strict ordering (append-only) within each partition

• No ordering between partitions (the producer may write into the partitions in any order) => no need to coordinate consumption

Page 13:

Main Concepts

• The partitions of a topic serve two purposes:
– They allow the log to scale beyond a size that fits on a single broker. Each individual partition must fit on the broker that hosts it, but a topic may have many partitions, so it can handle an arbitrary amount of data.
– They act as a unit of parallelism (each consumer instance receives data from a single partition), which gives a way of distributing the produced data items (messages) to different workers for better load balancing.
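For concreteness, this is how a topic with several partitions can be created with the Java admin client (a sketch; the topic name and the counts are placeholders, and the replication factor is explained on the later slides):

    import java.util.List;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;

    try (AdminClient admin = AdminClient.create(props)) {   // props as in the producer sketch
        // 6 partitions for parallelism, each replicated on 3 brokers
        admin.createTopics(List.of(new NewTopic("page-views", 6, (short) 3)))
             .all().get();   // blocks until the brokers confirm; throws on failure
    }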

Page 14:

Parallel Distribution through Partitions

• Producers select the partitions to send their messages to.
• A consumer always consumes messages from a single partition sequentially (through a pull request to the Kafka cluster), informing the message's offset
• If the consumer acknowledges a particular message offset, it implies that the consumer has consumed all prior messages (see the sketch below).

Source: Abhishek Sharma – Kafka Architecture, 2014
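In the Java consumer this acknowledgement is an offset commit: committing after a poll() declares every earlier offset in each partition consumed. A sketch reusing the consumer properties c from the first example, with auto-commit disabled (process() is a hypothetical handler):

    c.put("enable.auto.commit", "false");   // we commit offsets explicitly
    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
        consumer.subscribe(List.of("page-views"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> r : records)
                process(r);              // hypothetical per-message processing
            consumer.commitSync();       // "consumed up to here" for each partition:
                                         // implies all prior offsets were processed
        }
    }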

Page 15:

Main Concepts
• Each partition of a topic corresponds to a logical log.
• Every time a producer publishes a message to a partition, the broker simply appends the message to the end of the log.
• The messages in the partitions are each assigned an id number, which is its offset in the log.

(Figure: partitions with the sequential ids of messages)

Page 16:

Consumer Groups
In Big Data settings it is common to have several machines working together to consume/process data from a topic
• Consumer machines label themselves with a consumer group
• Each message published to a topic is delivered to one machine within each consumer group
– If all consumers are in one consumer group -> Kafka provides traditional queuing messaging semantics
– If each consumer has its own group, then Kafka provides broadcast-type Pub/Sub semantics (all messages get delivered to all the consumer machines)
– More commonly, each topic has several consumer groups (each with some machines)
• Kafka assigns each partition of the topic to exactly one consumer instance/machine in the group. Since there are several partitions, this balances the load over the consumer instances.
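On the client side this whole choice reduces to the group.id setting (group names here are invented):

    // Queue semantics: all instances share one group,
    // so each message is handled by exactly one of them.
    c.put("group.id", "pageview-analytics");

    // Pub/Sub semantics: every instance gets its own group,
    // so each one receives every message.
    c.put("group.id", "pageview-analytics-" + java.util.UUID.randomUUID());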

Page 17:

Page 18:

Log-Structured Storage
• Kafka maintains separate logs for each partition of each topic

• Append-only log mechanism similar to the write-ahead-log protocol

• Write-ahead log: a new message (being written) is only made available to consumers after it has been committed to the log. So no consumer will consume a message that might be lost in the event of a broker failure.

• This maximizes throughput while guaranteeing reliable message delivery (messages are held on several brokers)

Page 19:

Kafka Delivery Guarantees
• Messages sent by a producer to a particular topic partition will be appended in the order they are sent. That is, if a message M1 is sent by the same producer as a message M2, and M1 is sent first, then M1 will have a lower offset than M2 and appear earlier in the log.

• A consumer instance sees messages in the order they are stored in the log.

• A topic with replication factor N will tolerate up to N-1 server failures without losing any messages committed to the log.

Page 20:

Not a Pure Queuing System
• Kafka avoids the overhead of guaranteeing that messages are processed in the order in which they were received (as in ActiveMQ), and only guarantees ordering within each partition (for a same producer)

• There is no defined order of writes and reads to different partitions of a topic -> Kafka does not guarantee that different consumer instances will read messages in the same order

• This enhances parallelism

• Kafka also does not remove messages that have been delivered to consumers (it is the consumer's duty to keep track of the offset of the latest consumed message, as the sketch below shows)

• Messages are automatically removed after some time.
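Because delivery does not delete anything, a consumer can reposition itself at will. In the Java client this is seek() on a manually assigned partition (a sketch continuing the earlier consumer; topic name and offset are placeholders):

    import org.apache.kafka.common.TopicPartition;

    TopicPartition tp = new TopicPartition("page-views", 0);
    consumer.assign(List.of(tp));          // manual assignment, no group rebalancing
    consumer.seek(tp, 1234L);              // resume from a self-tracked offset ...
    consumer.seekToBeginning(List.of(tp)); // ... or replay everything still retained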

Page 21:

Replication
Kafka (from version 0.8 onwards) protects against broker failures by replicating data
• Each topic has a replication factor: each of the topic's partitions will have that number of synchronized replicas

• Replicas guarantee that committed messages won't be lost as long as at least one replica survives.

• One replica is designated as the leader
• The follower replicas fetch data from the leader
• The leader holds the list of "in-sync" replicas (the ISR, or in-sync replica set), i.e. brokers that have up-to-date logs of the partition(s).

Page 22:

Topics, Partitions and Replicas

Source: Michael G. Noll, Running a Multi-Broker Apache Kafka 0.8 Cluster on a Single Node, www.michael-noll.com

Page 23:

Replication
• Each partition of a topic has a leader partition that is replicated in follower partitions.

• The replicas are distributed among the different physical brokers (each follower on a different broker)

• When a message arrives at the leader partition, it is first appended to the leader's log and then forwarded to the follower partitions in the ISR. Only after each follower partition sends an acknowledgement is the message considered committed and made available to consumers for reading.

• The leader also occasionally sends a high watermark with the offset of the most recently committed messages, and propagates it to the follower partitions in the ISR.

Page 24:

Replication and ISRs

(Figure: a producer writes to topic my_topic, whose partitions 0, 1 and 2 are each replicated on brokers 100, 101 and 102.)

Topic: my_topic    Partitions: 3    Replicas: 3

Partition   Leader   ISR
0           100      101,102
1           101      100,102
2           102      101,100
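This leader/ISR layout can be inspected at runtime with the Java admin client (a sketch; the exact result-accessor names have shifted between client versions):

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.TopicDescription;
    import org.apache.kafka.common.TopicPartitionInfo;

    try (AdminClient admin = AdminClient.create(props)) {
        TopicDescription d = admin.describeTopics(List.of("my_topic"))
                                  .all().get().get("my_topic");
        for (TopicPartitionInfo pi : d.partitions())
            System.out.printf("partition=%d leader=%s isr=%s%n",
                              pi.partition(), pi.leader().id(), pi.isr());
    }

The bundled kafka-topics command with --describe prints the same information.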

Page 25:

Replication
• If a broker containing a particular partition fails, it is removed from the ISR, and the leader no longer waits for its acknowledgement

• Kafka's Zookeeper support is in charge of keeping the set of live brokers for a topic with replication factor > 1

• After a failed follower partition recovers, it examines the last known high watermark and copies all messages from the leader partition up to the current committed offset. Once this is finished, it is added back to the ISR.

Page 26:

Three Levels of Acknowledgement in the Producer API

Producer and consumer are replication-aware. Durability can be configured with the producer configuration (see the sketch below):
• None: the producer awaits no ack from the leader partition. Highest throughput, but messages may be lost

• Leader: the leader partition sends an ack as soon as it has received the message. This reduces performance a bit, but offers a reasonable level of durability

• All: the leader sends an ack only after the message has been committed (through acks from all follower partitions in the ISR). Much worse performance, but the message can be recovered as long as there are partitions in the ISR.

• If fewer than Min_ISR follower partitions are active, the producer gets feedback and starts to buffer
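In the Java producer these three levels are the values of the acks property (a sketch; min.insync.replicas is the broker/topic-side counterpart of the slide's Min_ISR):

    // Pick exactly one of the three levels:
    props.put("acks", "0");    // "None": fire and forget, highest throughput
    props.put("acks", "1");    // "Leader": ack once the leader has appended the message
    props.put("acks", "all");  // "All": ack only after the message is committed in the ISR

    // Topic/broker side: with min.insync.replicas=2, an acks=all producer
    // receives an error (and can buffer/retry) when fewer than 2 ISR
    // members are alive.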

Page 27:

Kafka Retention Policies

Kafka is not a long-term storage system (like HDFS), so data cannot be persisted indefinitely on the brokers. A retention policy determines how much of the newest data is maintained in each partition. There are several policies (see the configuration sketch below):
• Space-based:: keep the last X GB of messages
• Time-based:: keep the messages produced during the last 24 hours
• Key-based:: keep only the N latest messages of each key.
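These policies correspond to per-topic configuration entries, which can be set when the topic is created (a sketch; topic names and values are arbitrary examples, and note that Kafka's log compaction keeps at least the latest message per key):

    import java.util.Map;

    // Time-based retention: keep messages from the last 24 hours.
    // (Space-based would be "retention.bytes", e.g. "53687091200" for ~50 GB.)
    NewTopic pageViews = new NewTopic("page-views", 6, (short) 3)
        .configs(Map.of("retention.ms", "86400000"));

    // Key-based: log compaction keeps the latest message per key.
    NewTopic userProfiles = new NewTopic("user-profiles", 6, (short) 3)
        .configs(Map.of("cleanup.policy", "compact"));

    // Pass these to admin.createTopics(...) as in the earlier sketch.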

Page 28:

Coordination of Kafka Brokers

• Kafka producers, subscribers and brokers need to know which Kafka brokers are active and executing

• For each partition, brokers need to know which one is the leader and which ones are the followers

• This coordination is done through the Zookeeper service, a distributed coordination service.