Kafka to the Maxka - (Kafka Performance Tuning)

KAFKA TO THE MAXKA By Matt Andruff

Page 1: Kafka to the Maxka - (Kafka Performance Tuning)

KAFKA TO THE MAXKA By Matt Andruff

Page 2: Kafka to the Maxka - (Kafka Performance Tuning)

Kafka Performance Tuning

Page 3: Kafka to the Maxka - (Kafka Performance Tuning)

Welcome!

Matt Andruff - Hortonworks Practice lead @ Yoppworks

@MattAndruff

Page 4: Kafka to the Maxka - (Kafka Performance Tuning)

Because I get asked a lot... Yoppworks

Page 7: Kafka to the Maxka - (Kafka Performance Tuning)

Performance Tuning...

Page 8: Kafka to the Maxka - (Kafka Performance Tuning)

Agenda

• Performance tuning - Just some quick points
• What you can change
• Simple changes
• Kafka configuration changes
• Brief canned demo
• Beware: Kafka settings are not exciting for everyone
• Architectural changes

Page 9: Kafka to the Maxka - (Kafka Performance Tuning)

Performance Tuning

What do you need to make changes?

Page 10: Kafka to the Maxka - (Kafka Performance Tuning)

Performance tuning

There is no magic bullet.
Guesses are just guesses.
Empirical fact requires testing.

Requires hardware, SMEs, time, effort

It’s non-trivial to do performance testing.

Page 11: Kafka to the Maxka - (Kafka Performance Tuning)

Performance tuning

The better your load tests are, the better your tuning will be.
Garbage in, garbage out.

Page 13: Kafka to the Maxka - (Kafka Performance Tuning)

Performance tuning

The better your load tests are, the better your tuning will be.
Garbage in, garbage out.

Everyone (every client) is different.
Has a unique signature of data/hardware/topics.

Page 14: Kafka to the Maxka - (Kafka Performance Tuning)

Performance tuning

The better your load tests are, the better your tuning will be.
Garbage in, garbage out.

Every client is different.
Has a unique signature of data/hardware/topics.

Tune for bottlenecks found through testing.
Yes, there is always some low-hanging fruit.

Page 15: Kafka to the Maxka - (Kafka Performance Tuning)

Beyond Tuning

What your boss understands:

Page 16: Kafka to the Maxka - (Kafka Performance Tuning)

Beyond Tuning

What you understand:

Page 17: Kafka to the Maxka - (Kafka Performance Tuning)

First a minor detour to the OS

I promise to move fast but it can’t be ignored.

To be complete we need to cover some of the basics.

Page 18: Kafka to the Maxka - (Kafka Performance Tuning)

Which OS to use?

Page 19: Kafka to the Maxka - (Kafka Performance Tuning)

The basics

● noatime
  ○ Removes last access time from files
  ○ Saves a write on read.

Page 20: Kafka to the Maxka - (Kafka Performance Tuning)

The basics

● Ext4 is widely in use
● XFS has shown better performance metrics

https://kafka.apache.org/documentation.html#filesystems

Page 21: Kafka to the Maxka - (Kafka Performance Tuning)

JVM settings

export KAFKA_JVM_PERFORMANCE_OPTS='...'

Java 1.8:
-Xmx6g -Xms6g -XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80

Java 1.7 (beware of older versions):
-Xms4g -Xmx4g -XX:PermSize=48m -XX:MaxPermSize=48m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35

Page 22: Kafka to the Maxka - (Kafka Performance Tuning)

The basics

● File descriptor limits
  ○ Per broker: Partitions * segments + overhead
    ■ Watch this when you upgrade to 0.10
● Set vm.swappiness = 0

Page 23: Kafka to the Maxka - (Kafka Performance Tuning)

The basics

● Kafka data should be on its own disks
● If you encounter read/write issues, add more disks
● Each data folder you add to the config will be written to in round robin

Page 24: Kafka to the Maxka - (Kafka Performance Tuning)

Latest is the Greatest

● Have you upgraded to 0.10?
● Adds 8 bytes of timestamp
  ○ Not great for small messages.
● Broker no longer does decompression
  ○ Better performance when you use compression.
● File descriptor limits
  ○ Segment indexing changed

Page 25: Kafka to the Maxka - (Kafka Performance Tuning)

Defaults are your friends

Page 26: Kafka to the Maxka - (Kafka Performance Tuning)

Defaults are your friends

The default when you drive is to put on your seatbelt.
If you are going to change the default to not wearing a seatbelt, I hope you have thought through your choice.

Kafka’s defaults are set up to help keep you safe.
If you are going to change the default to something else, I hope you have thought through your choice.

Page 27: Kafka to the Maxka - (Kafka Performance Tuning)

The Producer

Page 28: Kafka to the Maxka - (Kafka Performance Tuning)

Default Example

Acks:

Setting    Description                                                         Risk of data loss   Performance
acks=0     No acknowledgment from the server at all. (Set it and forget it.)   Highest             Highest
acks=1     Leader completes write of data.                                     Medium              Medium
acks=all   The leader and all in-sync followers have written the data.         Lowest              Lowest

Page 30: Kafka to the Maxka - (Kafka Performance Tuning)

Definitions:

Latency: The length of time for one message to be processed.

Throughput: The number of messages processed per unit of time.

Batch:
• “Message 1” - Time 1 ← Worst latency
• “Message 2” - Time 2
• “Message 3” - Time 3 ← Best latency
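A quick sketch of why the first message in a batch gets the worst latency: it waits the longest for the batch to go out. The arrival times and send time below are made-up numbers, not measurements.

```python
def batch_latencies(arrival_times_ms, send_time_ms):
    """Return how long each message waited before its batch was sent."""
    return [send_time_ms - t for t in arrival_times_ms]


# Hypothetical arrival times for Message 1..3; the batch is sent at t = 30 ms.
arrivals = [0, 10, 20]
send_time_ms = 30
latencies = batch_latencies(arrivals, send_time_ms)
print(latencies)  # [30, 20, 10] -> Message 1 waited longest, Message 3 least

# Throughput for this one batch, in messages per second.
throughput = len(arrivals) * 1000 / send_time_ms
print(throughput)  # 100.0
```

Waiting longer lets more messages share one send (better throughput) but makes the earliest message wait longer (worse latency).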

Page 31: Kafka to the Maxka - (Kafka Performance Tuning)

Batch Management

[Diagram: a Producer holding batches of “data” messages (Batch - Partition 1 - TopicA, Batch - Partition 1 - TopicB), sent to a Partition and its Segment on the Broker]

Page 32: Kafka to the Maxka - (Kafka Performance Tuning)

Batch Management

batch.size - What is the maximum size of a batch? (Measured in bytes, not in number of messages.)

linger.ms - What is the maximum amount of time to wait before sending a batch?

Other:
- Same Broker Sending (Piggy Back)
- flush() or close() is called
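To make the two main triggers concrete, here is a toy simulation of a single partition's batching. It is a simplified model written for this deck, not the real producer's accumulator: it only knows "batch is full" and "linger expired", and it ignores the piggy-back and flush()/close() triggers. All message sizes and times are invented.

```python
def simulate_batches(messages, batch_size, linger_ms):
    """Toy model of producer batching for one partition.

    messages: list of (arrival_ms, size_bytes), sorted by arrival time.
    Returns a list of (send_time_ms, [message sizes], reason).
    """
    batches = []
    current, current_bytes, opened_at = [], 0, None
    for arrival, size in messages:
        # Linger expired before this message arrived: send what we have.
        if current and arrival >= opened_at + linger_ms:
            batches.append((opened_at + linger_ms, current, "linger"))
            current, current_bytes, opened_at = [], 0, None
        # This message would overflow the batch: send the current batch now.
        if current and current_bytes + size > batch_size:
            batches.append((arrival, current, "full"))
            current, current_bytes, opened_at = [], 0, None
        if not current:
            opened_at = arrival  # a new batch opens with this message
        current.append(size)
        current_bytes += size
    if current:
        # Nothing else arrives, so the last batch goes out when linger expires.
        batches.append((opened_at + linger_ms, current, "linger"))
    return batches


# Hypothetical traffic: three 6000-byte messages in a burst, then a late small one.
traffic = [(0, 6000), (1, 6000), (2, 6000), (50, 1000)]
result = simulate_batches(traffic, batch_size=16384, linger_ms=5)
for send_time, sizes, reason in result:
    print(send_time, sizes, reason)
```

With these numbers the first batch goes out because it is full, and the remaining two go out because linger expired first.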

Page 33: Kafka to the Maxka - (Kafka Performance Tuning)

Batch Management

[Diagram: Producer batches of “data” messages (Batch - Partition 1 - TopicA, Batch - Partition 1 - TopicB), each sent to its own partition Segment on the Broker (Partition 1 - TopicA, Partition 1 - TopicB)]

Page 34: Kafka to the Maxka - (Kafka Performance Tuning)

Batch Management

Default Message size is 2048 (If linger.ms is large)

Buffer.memory / Batch.size > Message size

33554432 / 16384 > 2048
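The arithmetic on this slide, as a sketch. One reading of the ratio (my interpretation; the slide labels it "Message size") is the number of full-size batches that fit in the producer's buffer.

```python
def batches_that_fit(buffer_memory, batch_bytes):
    """How many batches of batch_bytes fit in buffer_memory (both in bytes)."""
    return buffer_memory // batch_bytes


buffer_memory = 33554432  # bytes (32 MiB), the value on the slide
batch_size = 16384        # bytes (16 KiB), the value on the slide

print(batches_that_fit(buffer_memory, batch_size))  # 2048

# If linger.ms fires early and batches average, say, 8192 bytes (a made-up
# figure), the same arithmetic gives a larger count.
print(batches_that_fit(buffer_memory, 8192))  # 4096
```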

Page 35: Kafka to the Maxka - (Kafka Performance Tuning)

Batch Management

[Diagram: the Producer sends one Batch - Partition 1 - TopicA of “data” messages to a Partition Segment on the Broker]

Page 36: Kafka to the Maxka - (Kafka Performance Tuning)

Batch Management

Default Message size is 2048 (If linger.ms is small)

Buffer.memory / Batch.size > Message size

33554432 / (< 16384) > (>2048)

Page 37: Kafka to the Maxka - (Kafka Performance Tuning)

Batch Management

[Diagram: Producer batches of “data” messages (Batch - Partition 1 - TopicA, Batch - Partition 1 - TopicB) sent to partition Segments on the Broker, with two callouts:]

← Linger is triggering before batch is full.

← Using bigger messages to fill the batch

Page 38: Kafka to the Maxka - (Kafka Performance Tuning)

Batch Management

Tune your Batch.size/linger.ms

Larger batch.size + larger linger.ms = higher latency + higher throughput

Smaller batch.size + smaller linger.ms = lower latency + lower throughput

Once tuned, do not forget to size your buffer.memory

Page 39: Kafka to the Maxka - (Kafka Performance Tuning)

Compression

compression.type = none (the default)

Compression can improve performance because less data is transferred over the network, at the cost of additional CPU.

Generalization: Use snappy. You should do real performance tests.
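A rough way to get a feel for the size-versus-CPU trade-off on your own data before running a real test. This sketch uses zlib from the Python standard library as a stand-in codec (snappy is not in the standard library), and the sample records are invented, so treat the numbers it prints as illustrative only.

```python
import json
import time
import zlib


def measure(payload: bytes, level: int = 6):
    """Return (raw size, compressed size, seconds spent compressing)."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    return len(payload), len(compressed), elapsed


# Hypothetical batch of similar JSON records, joined as one payload.
records = [{"sensor": f"s{i % 10}", "temp": 20 + i % 5} for i in range(1000)]
payload = "\n".join(json.dumps(r) for r in records).encode("utf-8")

raw, packed, secs = measure(payload)
print(f"{raw} bytes -> {packed} bytes ({packed / raw:.1%}) in {secs * 1000:.2f} ms")
```

Repetitive data like this compresses very well; random or already-compressed payloads will not, which is why the real test matters.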

Page 40: Kafka to the Maxka - (Kafka Performance Tuning)

Batch Management

[Diagram: inside the Producer, the Serializer and Partitioner feed batches of “data” messages (Batch - Partition 1 - TopicA, Batch - Partition 1 - TopicB)]

Page 41: Kafka to the Maxka - (Kafka Performance Tuning)

Did we stick with the Defaults?

Custom Class written for performance?

● Partitioner
  ○ Create a custom key based on data - help prevent skew
● Serializer
  ○ Pluggable
● Interceptors
  ○ Allows manipulation of records into Kafka
  ○ Are they being used? Should they? How are they written?
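To see what key skew looks like, here is a small sketch that hashes keys onto partitions and compares a "hot" key against a composite key built from the data. The crc32 hash is a stand-in chosen for illustration, not the hash Kafka's default partitioner uses, and the customer/order keys are hypothetical.

```python
import zlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    # crc32 is a stand-in hash for illustration; it is NOT Kafka's default hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def skew(keys, num_partitions):
    """Return (messages per partition, busiest partition / average load)."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    loads = [counts.get(p, 0) for p in range(num_partitions)]
    return loads, max(loads) / (sum(loads) / num_partitions)


# Hypothetical traffic: one customer produces 90% of the messages.
hot_keys = ["customer-1"] * 900 + [f"customer-{i}" for i in range(2, 102)]
# Composite key: customer plus a per-message id spreads the hot customer out.
spread_keys = [f"{k}:{i}" for i, k in enumerate(hot_keys)]

hot_loads, hot_ratio = skew(hot_keys, 4)
spread_loads, spread_ratio = skew(spread_keys, 4)
print("customer key :", hot_loads, round(hot_ratio, 2))
print("composite key:", spread_loads, round(spread_ratio, 2))
```

The trade-off of a composite key is that messages for one customer no longer land on a single partition, so per-customer ordering is lost.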

Page 42: Kafka to the Maxka - (Kafka Performance Tuning)

Tuning

To tune performance you need to experiment with different settings.
Data and throughput are different with every project.
There is no one size fits all.

Luckily there is a tool to help test configurations.

Page 43: Kafka to the Maxka - (Kafka Performance Tuning)

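kafka-run-class.sh

bin/kafka-run-class.sh \
  org.apache.kafka.clients.tools.ProducerPerformance \
  test 50000000 100 -1 acks=1 \
  bootstrap.servers=esv4-hcl198.yoppworks.rules.com:9092 \
  buffer.memory=67108864 batch.size=8196

The positional arguments are the topic name (test), the number of records (50000000), the record size in bytes (100), and the target throughput in records per second (-1 means no throttling). Producer properties follow as key=value pairs.

Or use the shortcut:

bin/kafka-producer-perf-test.sh \
  test 50000000 100 -1 acks=1 \
  bootstrap.servers=esv4-hcl198.yoppworks.rules.com:9092 \
  buffer.memory=67108864 batch.size=8196

There is also one for the consumer:

bin/kafka-consumer-perf-test.sh \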

Page 44: Kafka to the Maxka - (Kafka Performance Tuning)

Time for a quick walkthrough

Page 45: Kafka to the Maxka - (Kafka Performance Tuning)

Monitoring

Ops Clarity
- Now owned by Lightbend
- Cadillac of monitoring.

Burrow
- A little resource heavy (Kafka client per partition)
- Health monitor has some false positives

Yahoo Kafka-manager

Confluent Control Center
- Confluent distro

Roll your own: Kafka JMX & MBeans

Page 46: Kafka to the Maxka - (Kafka Performance Tuning)

Where did they get the name Kafka?

My Guess

Putting Apache Kafka to Use for Event Streams, https://www.youtube.com/watch?v=el-SqcZLZlI

~ Jay Kreps

Page 50: Kafka to the Maxka - (Kafka Performance Tuning)

Where did they get the name Kafka?

“I thought that since Kafka was a system optimized for writing using a writer's name would make sense. I had taken a lot of lit classes in college and liked Franz Kafka. Plus the name sounded cool for an open source project.” ~ Jay Kreps

https://www.quora.com/What-is-the-relation-between-Kafka-the-writer-and-Apache-Kafka-the-distributed-messaging-system

Page 52: Kafka to the Maxka - (Kafka Performance Tuning)

The Broker

Page 53: Kafka to the Maxka - (Kafka Performance Tuning)

Broker Disk Usage

● What is your rate of growth, and when will you need to expand?

● Try to make sure the number of partitions you select covers that growth.

Page 54: Kafka to the Maxka - (Kafka Performance Tuning)

Broker Disk Usage

● log.retention.bytes
  ■ Default is unlimited (-1)
● log.retention.[time interval]
  ■ Default is 7 days (168 hours)
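A back-of-the-envelope sketch for how much disk time-based retention implies. It assumes a steady write rate, data spread evenly across brokers, and no size cap (log.retention.bytes left at -1); it ignores index files and any headroom you want to keep. The 10 MB/s, replication factor 3, and 6 brokers are made-up inputs.

```python
def retained_bytes_per_broker(write_mb_per_sec, retention_hours,
                              replication_factor, brokers):
    """Estimate bytes each broker holds once retention reaches steady state."""
    bytes_per_sec = write_mb_per_sec * 1024 * 1024
    total = bytes_per_sec * retention_hours * 3600 * replication_factor
    return total / brokers


# Hypothetical cluster: 10 MB/s in, default 168-hour retention,
# replication factor 3, 6 brokers.
per_broker = retained_bytes_per_broker(10, 168, 3, 6)
print(f"{per_broker / 1024 ** 4:.2f} TiB per broker")
```

Rerun it with your expected growth in write rate to see when the disks you have stop being enough.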

Page 55: Kafka to the Maxka - (Kafka Performance Tuning)

Broker

● num.io.threads
  ■ Default is 8 - should match physical disks

Page 56: Kafka to the Maxka - (Kafka Performance Tuning)

Beyond Tuning

How do we optimize writing:

Page 57: Kafka to the Maxka - (Kafka Performance Tuning)

Beyond Tuning

Measure the throughput:

Page 58: Kafka to the Maxka - (Kafka Performance Tuning)

Beyond Tuning

Page 59: Kafka to the Maxka - (Kafka Performance Tuning)

The Consumer

Page 60: Kafka to the Maxka - (Kafka Performance Tuning)

replica.high.watermark.checkpoint.interval.ms
- You might think that the high water mark ensures reliability. It also has implications for performance.
- Watch out for consumer lag
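Consumer lag is the gap between the newest offset in a partition and the offset the consumer group has committed. A minimal sketch of the calculation, using invented offsets (in practice your monitoring tool supplies them):

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag = log end offset - committed offset.

    A partition with no committed offset is treated as never consumed
    (committed offset 0); that is an assumption of this sketch.
    """
    return {p: end - committed_offsets.get(p, 0)
            for p, end in log_end_offsets.items()}


# Hypothetical offsets for a three-partition topic.
log_end = {0: 1500, 1: 1480, 2: 1510}
committed = {0: 1500, 1: 1200, 2: 1490}

lag = consumer_lag(log_end, committed)
print(lag)                # {0: 0, 1: 280, 2: 20}
print(sum(lag.values()))  # 300
```

A lag that keeps growing on one partition (partition 1 here) usually points at a slow consumer or a skewed key rather than a cluster-wide problem.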

Page 61: Kafka to the Maxka - (Kafka Performance Tuning)

Beyond Tuning

Page 62: Kafka to the Maxka - (Kafka Performance Tuning)

Beyond Tuning

The future: consumers’ ability to scale is constrained by the number of partitions.
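Why partitions cap consumer scaling: within one consumer group, each partition is read by at most one consumer, so extra consumers sit idle. The sketch below uses a plain round-robin hand-out to show the effect; it is a simplification, not Kafka's actual assignment logic, and the consumer names are hypothetical.

```python
def assign(partitions, consumers):
    """Hand partitions to consumers round-robin; each partition gets one owner."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


# Hypothetical group: 5 consumers reading a topic with only 3 partitions.
group = assign(partitions=[0, 1, 2], consumers=["c1", "c2", "c3", "c4", "c5"])
idle = [c for c, owned in group.items() if not owned]
print(group)
print("idle consumers:", idle)  # ['c4', 'c5']
```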

Page 63: Kafka to the Maxka - (Kafka Performance Tuning)

Beyond Tuning

> # of Partitions means:

> Level of parallelism
> # files open
    ( Partitions * Segment count * Replication) / Brokers ~= # of open files per machine
    Tens of thousands of files is manageable on appropriate hardware.
> Memory usage (Broker and Zookeeper)
> Leader fail over time (Can be mitigated by increased # brokers)
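The open-files estimate from this slide as a small calculation. The cluster numbers are made up. Since each segment is backed by more than one file on disk (a log file plus index files), treat the result as a rough lower bound when setting file descriptor limits.

```python
def open_files_per_broker(partitions, segments_per_partition, replication, brokers):
    """(Partitions * Segment count * Replication) / Brokers, per the slide."""
    return (partitions * segments_per_partition * replication) / brokers


# Hypothetical cluster: 2000 partitions, 10 segments each,
# replication factor 3, spread over 6 brokers.
estimate = open_files_per_broker(2000, 10, 3, 6)
print(estimate)  # 10000.0
```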

Page 64: Kafka to the Maxka - (Kafka Performance Tuning)

Beyond Tuning

How do I calculate the number of partitions to have on a broker?

What’s the rule of thumb to start testing at?

[# partitions per broker] = c x [# brokers] x [replication factor]

c ~ Your machine's awesomeness

c ~ Your appetite for risk

c ~ 100 a good safe starting point
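The rule of thumb above as a one-liner, so you can plug in your own cluster. The 6 brokers and replication factor 3 are example inputs; c defaults to the slide's safe starting point of 100.

```python
def partitions_starting_point(brokers, replication_factor, c=100):
    """c x [# brokers] x [replication factor], per the slide's rule of thumb."""
    return c * brokers * replication_factor


print(partitions_starting_point(brokers=6, replication_factor=3))        # 1800
print(partitions_starting_point(brokers=6, replication_factor=3, c=50))  # 900
```

This is only where to start testing; raise or lower c based on what your load tests show.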

Page 65: Kafka to the Maxka - (Kafka Performance Tuning)

Beyond Tuning

Can I move an existing partition around? I just added a new broker, and it’s not sharing the load.

Use: bin/kafka-reassign-partitions.sh

1) Create a JSON file of the topics you want to redistribute (topics.json).
2) Use kafka-reassign-partitions.sh … --generate to suggest a partition reassignment.
3) Copy the proposed assignment to a JSON file.
4) Use kafka-reassign-partitions.sh … --execute to start the redistribution process.
   a) Can take several hours, depending on data.
5) Use kafka-reassign-partitions.sh … --verify to check progress of the redistribution process.

Link to documentation from conference sponsor.

topics.json:
{"topics": [{"topic": "weather"}, {"topic": "sensors"}], "version":1}
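If you have more than a couple of topics, hand-writing that file gets error-prone. A small sketch that builds the same structure for step 1; the helper name is my own, and the topic names are the ones from the example.

```python
import json


def topics_to_move(topic_names):
    """Build the topics-to-move document in the shape shown on the slide."""
    return {"topics": [{"topic": name} for name in topic_names], "version": 1}


doc = topics_to_move(["weather", "sensors"])
text = json.dumps(doc)
print(text)  # save this output as topics.json
```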

Page 66: Kafka to the Maxka - (Kafka Performance Tuning)

Thanks!

Matt Andruff - Hortonworks Practice lead @ Yoppworks

@MattAndruff

I’m not an expert I just sound like one.