Published February 2023
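Yes, Apache Kafka guarantees message order, with one qualification: the guarantee applies within a partition, so messages that share a key are delivered in the order they were written. In this article we walk through how Kafka achieves this, using the detailed figure below.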
Producers' Role in Kafka Ordering
Producers send data to Kafka. They are typically applications that generate messages. In this example, each message is sent with a key, depicted above as k# (k1, k2, k3, k4).
Let’s say you run a video streaming service. Each user has a specific username. That username could be set as the key.
The figure also shows that messages for user k4 are coming from two different producers. This can occur when the same video streaming account is used on two different devices, for example a smart TV and a laptop.
Partitions' Role in Kafka Ordering
Kafka stores messages within topics, and each topic has one or more partitions. A partition is the unit of parallelism in Kafka: a topic's partitions can be spread across multiple brokers. With three partitions, for example, up to three brokers on three separate computers can share the work.
Messages, depicted above as m# (m1, m2, …), associated with a specific key will always be sent to the same partition, as long as the number of partitions stays the same. This is an important aspect of guaranteeing order.
If messages with the same key were sent to different partitions, then ordering could be mixed up because the broker hosting one partition is slower than another, is further from the producer, or goes offline temporarily.
Sending messages associated with the same key to a single partition removes these potential causes of misordering.
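The risk described above can be illustrated with a small simulation. This is only a sketch that uses plain Python lists in place of real partitions, and the "bad setup" that alternates one key's messages between two partitions is a hypothetical used for contrast, not something Kafka does for keyed messages.

```python
# Four messages for the same key, in the order the producer sent them.
messages = ["m1", "m2", "m3", "m4"]

# Hypothetical bad setup: the key's messages alternate between two partitions.
split_partitions = [[], []]
for i, msg in enumerate(messages):
    split_partitions[i % 2].append(msg)

# Suppose partition 1 is slow, so everything in partition 0 is read first.
received = split_partitions[0] + split_partitions[1]
print(received)  # ['m1', 'm3', 'm2', 'm4']

# With a single partition per key, the log is read back in write order.
single_partition = list(messages)
in_order = list(single_partition)
print(in_order)  # ['m1', 'm2', 'm3', 'm4']
```

Each partition on its own is still in order; the misordering only appears when the two streams are combined.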
A partition can accept messages from multiple different keys. In our video streaming example, the data for multiple users can be saved on the same partition. Users k1 and k2 both store messages in Partition 1.
However, each message key goes to a single partition. In our example, user k4 will always have their messages sent to Partition 3, no matter which producer is sending the message.
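The key-to-partition rule can be sketched as a hash of the key taken modulo the number of partitions. The function name `partition_for` is made up for this illustration, and CRC32 is a stand-in hash: Kafka's default partitioner hashes the serialized key with murmur2, so the partition numbers printed here will not match a real cluster.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a deterministic hash."""
    # Assumption: CRC32 stands in for the murmur2 hash Kafka actually uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

NUM_PARTITIONS = 3

# Two producers (say, a smart TV and a laptop) send messages for user k4.
tv_partition = partition_for("k4", NUM_PARTITIONS)
laptop_partition = partition_for("k4", NUM_PARTITIONS)

# The partition depends only on the key, never on which producer sent it.
print(tv_partition == laptop_partition)  # True
```

Because the mapping is a pure function of the key and the partition count, every producer reaches the same answer without coordinating with the others.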
An important note on partitions: you cannot change the number of partitions and retain ordering. When the number of partitions changes, keys can be reassigned to different partitions, so new messages for a key may land on a different partition from its earlier messages.
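To see why the partition count matters, the sketch below maps the same keys with three and then four partitions. It assumes a simple hash-modulo rule with CRC32 as a stand-in for Kafka's murmur2 hash, and the `user-N` keys are invented for the example.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    # Assumption: CRC32 stands in for the murmur2 hash Kafka actually uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

keys = [f"user-{i}" for i in range(100)]

before = {k: partition_for(k, 3) for k in keys}  # topic with 3 partitions
after = {k: partition_for(k, 4) for k in keys}   # same topic grown to 4

# Keys whose new messages would land on a different partition than before.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys map to a different partition")
```

For any key in `moved`, older messages sit in one partition and newer ones in another, so nothing ties their relative order together anymore.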
For more information on Kafka partitions, check out our articles on How to Determine the Number of Partitions Needed and Load Balancing in Kafka.
Consumers' Role in Kafka Ordering
Consumers read messages from the partitions. Within a consumer group, each partition is read by only one consumer, but a consumer can read from multiple partitions.
For example, in the figure above Consumer B reads from both Partition 2 and Partition 3.
In other words, partitions can outnumber consumers. However, there is no benefit to having more consumers than partitions in the same consumer group, because the extra consumers would sit idle.
If a partition's messages were split across multiple consumers, then order couldn’t be guaranteed. Just like with partitions, one consumer could be faster than another or go offline temporarily. Because each partition is read by only one consumer, these potential ordering errors are removed and ordering is preserved.
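The one-consumer-per-partition rule can be sketched with a simple round-robin assignment. This is a simplified illustration, not one of Kafka's built-in assignment strategies, so the exact pairing may differ from the figure; the `assign` function and the consumer names are assumptions made for the example.

```python
from typing import Dict, List

def assign(partitions: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Give every partition exactly one owner, cycling through the consumers."""
    assignment: Dict[str, List[str]] = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment

partitions = ["P1", "P2", "P3"]

# Fewer consumers than partitions: one consumer reads two partitions.
print(assign(partitions, ["A", "B"]))
# {'A': ['P1', 'P3'], 'B': ['P2']}

# More consumers than partitions: the extra consumer gets nothing to read.
print(assign(partitions, ["A", "B", "C", "D"]))
# {'A': ['P1'], 'B': ['P2'], 'C': ['P3'], 'D': []}
```

In both cases no partition appears under two consumers, which is the property that keeps per-partition order intact.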
For more information on Kafka consumers, check out our articles on Kafka Consumer Basics and Kafka Consumer Offset.
Kafka Guarantees Order
Because messages with the same key are stored on a single partition and read by a single consumer, their order is guaranteed.
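Putting the pieces together, the following sketch simulates the whole path in memory: keyed messages are appended to partition logs, and each log is then read from front to back. It is a toy model under stated assumptions (CRC32 instead of Kafka's murmur2 hash, Python lists instead of brokers, and an invented arrival order), not a Kafka client.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3

def partition_for(key: str, num_partitions: int) -> int:
    # Assumption: CRC32 stands in for the murmur2 hash Kafka actually uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# (key, message) pairs in the order they arrive at Kafka.
arrivals = [("k1", "m1"), ("k4", "m1"), ("k2", "m1"),
            ("k4", "m2"), ("k1", "m2"), ("k4", "m3")]

# Each partition is an append-only log.
logs = [[] for _ in range(NUM_PARTITIONS)]
for key, msg in arrivals:
    logs[partition_for(key, NUM_PARTITIONS)].append((key, msg))

# Each log is read front to back by a single consumer. The partitions are
# visited in reverse here to show that cross-partition order does not matter.
seen = defaultdict(list)
for log in reversed(logs):
    for key, msg in log:
        seen[key].append(msg)

print(seen["k4"])  # ['m1', 'm2', 'm3']
```

Whatever partition each key hashes to, every key's messages come back in the order they arrived; only the relative order between different partitions is left undefined.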