Updated July 2022
To tune the performance of your Apache Kafka implementation, start by measuring latency and throughput. Latency is the measure of how long it takes Kafka to process a single event. Throughput is the measure of how many events arrive within a particular period of time.
To achieve the best balance of latency and throughput, tune your Producers, Brokers, and Consumers for the largest possible batch sizes for your use case. This will prevent the architecture from slowing down to an unmanageable level during peak times.
Below we outline Kafka Performance Tuning tips that we use with our clients in a range of industries from high-volume Fortune 100 Companies, to high-security government infrastructure, to customized start-up use cases.
Tuning Brokers
As discussed in a previous post, Kafka is a distributed system that runs as a cluster. Each node in a Kafka cluster is referred to as a Broker. Topics are partitioned and replicated across the Brokers in the cluster. These partitions allow users to parallelize topics, meaning data for any topic can be divided over multiple Brokers.
Since a topic can be divided into partitions over several machines, multiple consumers can read a topic in parallel. This architecture sets Kafka up for high message throughput.
In other words, the greater the parallelization the greater the throughput.
However, you typically don't want to use more partitions than needed, because increasing the partition count also increases the number of open files on the server and increases replication latency.
Below is the easy calculation we use to tune the number of partitions:
# Partitions = Desired Throughput / Partition Speed
You can estimate that a single partition for a single Kafka topic runs at 10 MB/s, conservatively.
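As a worked example of the calculation above, here is a minimal Python sketch. The function name partitions_needed and the 250 MB/s target are hypothetical, chosen only for illustration; the 10 MB/s default is the conservative per-partition estimate quoted above. Rounding up is an assumption on our part, so that the result never under-provisions the topic.

```python
import math


def partitions_needed(desired_mb_per_s: float, partition_mb_per_s: float = 10.0) -> int:
    """Apply: # Partitions = Desired Throughput / Partition Speed."""
    if desired_mb_per_s <= 0 or partition_mb_per_s <= 0:
        raise ValueError("throughput values must be positive")
    # Round up (an assumption) so the topic is not under-provisioned,
    # and never return fewer than one partition.
    return max(1, math.ceil(desired_mb_per_s / partition_mb_per_s))


# Hypothetical target: 250 MB/s at 10 MB/s per partition -> 25 partitions.
print(partitions_needed(250))
```

Treat the result as a starting point and confirm it with performance testing, since real per-partition speed depends on your hardware, message size, and replication settings.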
For more information about Kafka Brokers, check out our posts on Creating a Kafka Topic and How Many Partitions are Needed?
Tuning Producers
There are several optimizations to keep in mind for tuning Kafka Producers. First, a Producer thread that sends data to a single partition will be faster than a producer that sends to multiple partitions.
Second, batch your data! We recommend our clients have a minimum batch size of 1 KB, and we recommend performance testing to fine-tune the value.
Additionally, avoid letting linger.ms be the trigger that sends a batch before it has had time to fill. Batch sizes below 1 KB will significantly impair performance. Instead, try linger.ms=20, or the greatest latency your use case can tolerate, so that batches have time to fill.
Finally, if there is spare CPU and network capacity on the box after the Producer throughput maxes out, then add more Producer processes.
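The batching advice above can be captured as a small configuration helper. This is a sketch, not a definitive setup: batch.size and linger.ms are Kafka producer property names, but the helper functions producer_batch_config and to_properties are hypothetical, and the 16384-byte batch size is only an illustrative starting value to be replaced by whatever your performance tests support.

```python
def producer_batch_config(batch_size_bytes: int = 16_384, linger_ms: int = 20) -> dict:
    """Build batching settings keyed by Kafka producer property names."""
    # Enforce the 1 KB minimum batch size recommended in the text.
    if batch_size_bytes < 1024:
        raise ValueError("batch.size below 1 KB is expected to hurt throughput")
    if linger_ms < 0:
        raise ValueError("linger.ms cannot be negative")
    return {"batch.size": batch_size_bytes, "linger.ms": linger_ms}


def to_properties(config: dict) -> str:
    """Render the settings as lines suitable for a producer .properties file."""
    return "\n".join(f"{key}={value}" for key, value in sorted(config.items()))


# Illustrative values only; tune both against your own latency budget.
print(to_properties(producer_batch_config()))
```

The same dictionary could be passed to whichever Kafka client library you use, provided that library accepts these property names; check its documentation before relying on that.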
Tuning Consumers
Consumers can impair throughput on the far side of the pipeline. Keep in mind that a Consumer can read from many partitions. However, within a consumer group, a partition can only be read by one Consumer.
Therefore, tuning the Consumer for best performance means keeping the number of Consumers/Consumer threads at or lower than the partition count.
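To see why extra Consumers do not help, here is a simplified Python sketch that spreads partitions across the Consumers in one group. It uses a plain round-robin rule for illustration and is not Kafka's actual assignment logic; the function names assign_partitions and idle_consumers are hypothetical.

```python
from typing import List


def assign_partitions(partition_count: int, consumer_count: int) -> List[List[int]]:
    """Give each partition to exactly one consumer, round-robin."""
    if partition_count < 0 or consumer_count < 1:
        raise ValueError("need a non-negative partition count and at least one consumer")
    assignments: List[List[int]] = [[] for _ in range(consumer_count)]
    for partition in range(partition_count):
        # A partition is owned by one consumer only, never shared.
        assignments[partition % consumer_count].append(partition)
    return assignments


def idle_consumers(partition_count: int, consumer_count: int) -> int:
    """Count consumers that end up with no partitions to read."""
    return sum(1 for owned in assign_partitions(partition_count, consumer_count) if not owned)


# With 4 partitions and 6 consumers, 2 consumers have nothing to read.
print(assign_partitions(4, 6))
print(idle_consumers(4, 6))
```

Whatever assignment strategy is used, the constraint is the same: once every partition has an owner, any additional Consumer in the group sits idle.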
Further, we often see the greatest performance increases for our clients by writing more efficient code and using better libraries.
Kafka Performance Tuning Summary
- Optimize the number of Partitions using this simple equation
# Partitions = Desired Throughput / Partition Speed
where you can conservatively estimate a single partition for a single Kafka topic to run at 10 MB/s.
- A producer thread that sends data to a single partition will be faster than a producer that sends to multiple partitions.
- Batch your data! Use a minimum batch size of 1 KB.
- Keep the number of Consumers/Consumer threads at or lower than the partition count.
Interested in understanding Kafka better? Learn about how Kafka uses consumer groups for scaling event streaming.