Updated July 2022
Apache Kafka performance is directly tied to system optimization and utilization. Here, we have compiled best practices for a high-volume, clustered, general use case.
Keep in mind that these recommendations are generalized; thorough Kafka monitoring will tell you which Kafka implementation best fits your custom use case.
Use Java 11. For security concerns, we recommend the latest patch.
-Xmx6g -Xms6g -XX:MetaspaceSize=96m -XX:+UseG1GC
-XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M
-XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80 -XX:+ExplicitGCInvokesConcurrent
Once the JVM heap size is determined, leave the rest of the RAM to the OS for page caching. Kafka relies on the page cache for both writes and reads.
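As a rough illustration of the heap versus page cache split, the sketch below groups the flags above into the two environment variables read by Kafka's startup scripts (KAFKA_HEAP_OPTS and KAFKA_JVM_PERFORMANCE_OPTS) and estimates what is left for the OS. The 32 GB host size and the helper name page_cache_gb are assumptions chosen for the example, not recommendations.

```python
# Sketch: group the JVM flags into the environment variables read by
# Kafka's startup scripts and estimate the RAM left for the OS page cache.
# The 32 GB host below is an assumed example value.

HEAP_GB = 6

heap_opts = f"-Xmx{HEAP_GB}g -Xms{HEAP_GB}g"
performance_opts = " ".join([
    "-XX:MetaspaceSize=96m",
    "-XX:+UseG1GC",
    "-XX:MaxGCPauseMillis=20",
    "-XX:InitiatingHeapOccupancyPercent=35",
    "-XX:G1HeapRegionSize=16M",
    "-XX:MinMetaspaceFreeRatio=50",
    "-XX:MaxMetaspaceFreeRatio=80",
    "-XX:+ExplicitGCInvokesConcurrent",
])


def page_cache_gb(total_ram_gb: int, heap_gb: int = HEAP_GB) -> int:
    """RAM not claimed by the broker heap (a rough upper bound for page cache)."""
    if heap_gb >= total_ram_gb:
        raise ValueError("heap must be smaller than physical RAM")
    return total_ram_gb - heap_gb


if __name__ == "__main__":
    print(f'export KAFKA_HEAP_OPTS="{heap_opts}"')
    print(f'export KAFKA_JVM_PERFORMANCE_OPTS="{performance_opts}"')
    print(f"Left for the OS on a 32 GB host: about {page_cache_gb(32)} GB")
```

The estimate ignores non-heap JVM memory and other processes on the box, so treat it as an upper bound.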
Kafka runs on any Unix system and has been tested on Linux and Solaris. We run the latest version of CentOS for reasons outside of Kafka.
File descriptor limits: Kafka uses file descriptors for log segments and open connections. We recommend at least 100,000 allowed file descriptors for the broker processes as a starting point.
Max socket buffer size: The socket buffer size can be increased to enable high-performance data transfer between data centers.
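One way to sanity-check the file descriptor recommendation is sketched below, using Python's standard resource module (Unix only). It reads the limit of the process that runs the script, so it only reflects the broker if it is run under the same user and service configuration; the helper name nofile_ok is hypothetical.

```python
import resource  # Unix-only standard library module

RECOMMENDED_NOFILE = 100_000  # starting point suggested in the text


def nofile_ok(soft_limit: int, recommended: int = RECOMMENDED_NOFILE) -> bool:
    """True if the soft open-file limit meets the recommendation."""
    # RLIM_INFINITY means "no limit", which always satisfies the check.
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= recommended


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard} ok={nofile_ok(soft)}")
```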
Disks And File System
Disk and file system usage is where we see people make the most mistakes.
Use only one drive or RAID array per partition! If you have multiple partitions per hard drive, use an SSD instead of HDD. Do not share the drive(s) dedicated to a partition with other applications or the operating system as other partitions or programs will disrupt sequential reads/writes.
Multiple drives can be configured using log.dirs in server.properties. Kafka assigns partitions in round-robin fashion to log.dirs directories.
Create alerts based on disk usage for each of your Kafka-dedicated drives.
We use RAID mainly because of the automatic recovery feature. Keep in mind that while a RAID array is rebuilding, the Kafka node will act as though it is down because disk I/O is dedicated to the rebuild.
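A disk usage alert of the kind described above could be sketched as follows, using shutil.disk_usage from the standard library. The 80% threshold, the function names, and the example path are assumptions; in practice this check would live in your monitoring system rather than a standalone script.

```python
import shutil

ALERT_THRESHOLD = 0.80  # assumed threshold; tune to your retention settings


def used_fraction(total_bytes: int, used_bytes: int) -> float:
    """Fraction of the drive that is in use."""
    return used_bytes / total_bytes


def drives_over_threshold(log_dirs, threshold=ALERT_THRESHOLD):
    """Return (path, used fraction) for every directory above the threshold."""
    alerts = []
    for path in log_dirs:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        fraction = used_fraction(usage.total, usage.used)
        if fraction > threshold:
            alerts.append((path, fraction))
    return alerts


if __name__ == "__main__":
    # "." is a stand-in; pass one mount point per Kafka-dedicated drive.
    for path, fraction in drives_over_threshold(["."]):
        print(f"ALERT {path}: {fraction:.0%} used")
```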
Log Flush Management
Use the default flush settings, which disable application fsync entirely. Done.
File System Selection
Use XFS; it has the most auto-tuning for Kafka already in place.
Do not co-locate Zookeeper on the same boxes as Kafka.
Ensure you allocate sufficient JVM heap. Monitoring will tell you the exact number, but start with 3GB.
Use JMX metrics to monitor the Zookeeper instance.
Use the Zookeeper that ships with the Kafka version you’re using, not an OS package.
Use your Zookeeper cluster only for Kafka.
Use monitoring to keep the Zookeeper cluster as small and simple as possible.
Partitions drive the parallelism of consumers. With a higher number of partitions, more consumers can be added, resulting in higher throughput. Base the number of partitions on the performance of your consumer and the rate of consumption needed. In other words, if your consumer handles 1,000 events per second (EPS) and you need to consume 20,000 EPS, go with 20 partitions.
More partitions can increase the end-to-end latency in Kafka, defined as the time from when a message is published by the producer to when it is read by the consumer.
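The sizing rule above is a simple division rounded up. A minimal sketch, with a hypothetical helper name and the rates from the example:

```python
import math


def partitions_needed(target_eps: float, consumer_eps: float) -> int:
    """Smallest partition count that lets consumers keep up with target_eps."""
    if target_eps <= 0 or consumer_eps <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(target_eps / consumer_eps)


print(partitions_needed(20_000, 1_000))  # 20, the example from the text
print(partitions_needed(20_000, 1_500))  # 14, rounded up from 13.33
```

Rounding up matters: a fractional result means one more partition is needed for the consumers to keep pace.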
Our posts on Creating a Kafka Topic and How Many Partitions are Needed? provide more detailed information on Kafka topics and partitions.
Kafka Broker Configs
Set the Kafka broker JVM by exporting KAFKA_HEAP_OPTS.
log.retention.hours – This setting controls when old messages in a topic are deleted. Take into consideration the available disk space and how long messages should remain available. We typically go with three days, or 72 hours.
message.max.bytes – The maximum size of a message the server can receive. Ensure that replica.fetch.max.bytes is equal to or greater than message.max.bytes.
delete.topic.enable – This allows users to delete a topic from Kafka. It defaults to true as of Kafka 1.0 (it was false in earlier releases). Topic deletion only works from Kafka 0.9 onwards.
unclean.leader.election.enable – This config defaults to false as of Kafka 0.11 (it was true in earlier releases). By turning it on you are choosing availability over durability.
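The relationships between these broker settings can be checked mechanically. The sketch below validates a settings dictionary and renders it as key=value lines in the style of server.properties. The helper names and the byte sizes are assumptions for illustration; unclean.leader.election.enable is the full name of the unclean leader election property.

```python
def check_broker_settings(settings: dict) -> list:
    """Return warnings for the broker settings discussed above."""
    warnings = []
    message_max = int(settings["message.max.bytes"])
    replica_fetch_max = int(settings["replica.fetch.max.bytes"])
    if replica_fetch_max < message_max:
        warnings.append("replica.fetch.max.bytes is smaller than message.max.bytes")
    if settings.get("unclean.leader.election.enable", "false") == "true":
        warnings.append("unclean leader election on: availability over durability")
    return warnings


def render_properties(settings: dict) -> str:
    """Render settings as key=value lines, as in server.properties."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


# Example values only; the byte sizes are assumptions, not recommendations.
broker_settings = {
    "log.retention.hours": "72",
    "message.max.bytes": "1048576",
    "replica.fetch.max.bytes": "1048576",
    "delete.topic.enable": "true",
    "unclean.leader.election.enable": "false",
}

if __name__ == "__main__":
    for warning in check_broker_settings(broker_settings):
        print("WARNING:", warning)
    print(render_properties(broker_settings))
```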
batch.size – The maximum size, in bytes, of a batch of events the producer builds per partition before sending to Kafka.
linger.ms – The time the producer waits for more events to fill a batch before sending to Kafka.
compression.type – Snappy is the highest-performing of the compression types that ship with Kafka (gzip, snappy, lz4 and, since Kafka 2.1, zstd).
max.in.flight.requests.per.connection – Change this if the order in which messages are received is not important.
acks – We recommend acks=all for reliability.
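To see how batch.size (measured in bytes) and linger.ms interact, the following sketch estimates which of the two is expected to trigger a send for a steady stream of messages to one partition. It is a simplified model that ignores compression and per-record overhead; the function names and the traffic figures are assumptions, while 16384 bytes is Kafka's default batch.size.

```python
def batch_trigger(msgs_per_sec: float, msg_bytes: int,
                  batch_size: int, linger_ms: float) -> str:
    """Name the setting expected to flush a batch first (simplified model)."""
    if msgs_per_sec <= 0 or msg_bytes <= 0:
        raise ValueError("rate and message size must be positive")
    # Time for one partition's batch to reach batch_size at a steady rate.
    fill_ms = batch_size / (msgs_per_sec * msg_bytes) * 1000
    return "batch.size" if fill_ms <= linger_ms else "linger.ms"


def bytes_when_linger_fires(msgs_per_sec: float, msg_bytes: int,
                            linger_ms: float) -> float:
    """Bytes accumulated in a batch by the time linger.ms expires."""
    return msgs_per_sec * msg_bytes * linger_ms / 1000


# Busy partition: the batch fills in about 8 ms, before linger.ms=20 expires.
print(batch_trigger(10_000, 200, 16_384, 20))
# Quiet partition: linger.ms fires first and the batch is small.
print(batch_trigger(100, 200, 16_384, 20))
print(bytes_when_linger_fires(100, 200, 20))
```

In the quiet case the batch holds only a few hundred bytes when linger.ms expires, which is the small-batch situation to watch for on low-traffic partitions.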
A producer thread that sends data to a single partition is faster than a producer that sends to multiple partitions.
Batch your data! We recommend performance testing to fine-tune, but use a batch size of at least 1 KB.
If the producer throughput maxes out and there is spare CPU and network capacity on the box, add more producer processes.
Avoid having linger.ms as the trigger for sending batched messages. Small batches of less than 1 KB will destroy performance. Try linger.ms=20 or the highest amount of latency you can handle.
Setting max.in.flight.requests.per.connection > 1 adds pipelining and increases throughput; however, it may cause out-of-order delivery when a retry occurs. Too many in-flight requests will reduce throughput.
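The out-of-order risk from pipelining can be illustrated with a toy model. The sketch below is not how the Kafka producer is implemented: it assumes batches are sent in fixed windows of max_in_flight, that a failed batch is retried only after the rest of its window has been written, and that idempotence is not enabled. The function and batch names are hypothetical.

```python
def delivery_order(batches, fails_once, max_in_flight):
    """Order in which batches land in the log under a simplified retry model.

    Batches are sent in windows of max_in_flight. A batch listed in
    fails_once is rejected on its first attempt and retried after the
    rest of its window has been written.
    """
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")
    log = []
    for start in range(0, len(batches), max_in_flight):
        window = batches[start:start + max_in_flight]
        retries = []
        for batch in window:
            if batch in fails_once:
                retries.append(batch)  # first attempt fails
            else:
                log.append(batch)      # written on the first attempt
        log.extend(retries)            # retried once the window drains
    return log


batches = ["b0", "b1", "b2", "b3"]
print(delivery_order(batches, {"b0"}, 1))  # ['b0', 'b1', 'b2', 'b3']
print(delivery_order(batches, {"b0"}, 2))  # ['b1', 'b0', 'b2', 'b3']
```

With one request in flight the retry of b0 completes before b1 is sent, so order is preserved; with two in flight, b1 is written while b0 is still being retried.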
On the consumer side, you’re going to see most performance increases by using better libraries or writing more efficient code.
Keep the number of consumers/consumer threads at or lower than the partition count. A consumer can read from many partitions, but within a consumer group a partition can only be read by one consumer, so any extra consumers sit idle.
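The effect of adding more consumers than partitions can be seen in a small sketch. It spreads partitions over the consumers of one group in round-robin style; this is an illustrative model with hypothetical function names, not the assignment strategy Kafka itself uses.

```python
def assign(partitions: int, consumers: int) -> dict:
    """Spread partition ids over the consumers of one group, round-robin style."""
    if consumers < 1:
        raise ValueError("need at least one consumer")
    assignment = {c: [] for c in range(consumers)}
    for p in range(partitions):
        assignment[p % consumers].append(p)
    return assignment


def idle_consumers(partitions: int, consumers: int) -> int:
    """Number of consumers left without any partition."""
    return sum(1 for owned in assign(partitions, consumers).values() if not owned)


print(assign(4, 2))          # {0: [0, 2], 1: [1, 3]}
print(idle_consumers(4, 6))  # 2
```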