Updated August 2023
Apache Kafka is a distributed messaging system that implements pieces of the two traditional messaging models, Shared Message Queues and Publish-Subscribe. Both Shared Message Queues and Publish-Subscribe models present limitations for handling high throughput use cases.
Apache Kafka provides fault tolerant, high throughput stream processing that can handle even the most complicated use cases with outstanding uptime. Some Kafka managed service providers, like Dattell, can offer 99.999% uptime guarantees because of Kafka’s robustness when appropriately implemented and optimized.
To understand Kafka, let’s first review how the two traditional messaging models work.
Traditional Messaging Models
How a Shared Message Queue Works:
In this first model, a shared message queue is used for all producers and consumers. Individual messages are sent to the queue by a producer and read only once by a single consumer.
Message queue models can be suitable for imperative programming because in that case messages are used like commands for consumers within the same domain.
Cons to a Shared Message Queue:
After a single consumer reads a message it is removed from the queue. This is problematic for event-driven programming where individual events/messages can drive multiple actions on the consumers’ ends, in which different actions pertain to different domains.
This limitation can be overcome by having multiple consumers connected to the shared queue. However, in that case the consumers must execute on the same functionality and reside in the same logical domain. You can probably understand now why traditional message queues are difficult to scale.
How a Publish-Subscribe Model Works:
In the pub/sub model the senders (publishers) and receivers (subscribers) of messages are decoupled. Publishers send messages to topics without knowledge of the message receiver(s), and subscribers receive messages without awareness of the publisher.
Subscribers receive messages only from the topics that they express interest in, and subscribers can receive messages from multiple partitions. Additionally, multiple subscribers can be subscribed to the same partition.
Cons to a Publish-Subscribe Model:
While this model lends itself to greater scalability than the traditional message queue model, scaling is still limited. For a subscriber to access messages from partitions it needs to individually subscribe to each partition.
Instability in message delivery can also arise because of the decoupling of the publishers and subscribers. For instance, subscribers may not be in sync with one another, which interferes with the processing of streams at scale.
Kafka Uses Consumer Groups & Broker Retention to Join the two Models
Apache Kafka improves on the pub/sub and message queue models by employing consumer groups and calling on brokers to retain messages. (Check out this post on optimizing Kafka brokers for more information.)
In Kafka, publishers are referred to as producers, and subscribers are referred to as consumers. As with the pub/sub model, the producers and consumers are decoupled from one another, not having awareness of the message sender/receiver.
Producers are responsible for choosing the partition within a topic to publish their message. This decision can either be accomplished in a round-robin format or based on semantics.
Kafka functions like a traditional message queue system when there is only one consumer group. However, when multiple consumer groups are subscribed to a topic, then it behaves like a pub/sub model with messages being received by multiple consumers.
The diagram below shows a simplified version of the Kafka model. For instance, it doesn’t show that partitions can be distributed over multiple servers within the cluster. Additionally, every partition is replicated across a number of servers to provide fault tolerance. This replication value is configurable.
Kafka uses consumer groups to increase scalability.
One or more consumers can belong to a consumer group, with all of them sharing a group id. If a consumer group has only one consumer, then it is referred to as an exclusive consumer. An exclusive consumer must subscribe to all of the topic partitions it needs. For consumer groups with more than one consumer, the workload is divided as evenly as possible.
Within a consumer group, only one of the consumers will subscribe to an individual partition. In other words, an individual message from a topic is received by a single consumer within a consumer group. Multiple consumer groups can be subscribed to read from a partition at different times.
If a consumer drops off or a new partition is added, then the consumer group rebalances the workload by shifting ownership of partitions between the remaining consumers. A rebalance within one consumer group does not have an effect on any other consumer groups.
The figure below shows how when Consumer 1 in Consumer Group A crashes, the consumer group rebalances the workload to have Consumer 2 receive messages from Partition 1.
Kafka brokers retain messages for a configurable period of time.
When a producer sends a message to Kafka it is appended to the end of the structured commit log for a particular partition. Rather than the messages disappearing after being consumed, they are retained for a predetermined period of time. The retention time is defined by log.retention.hours. For instance, if log retention is set to 7 days, then a message will be stored for consumption for 7 days after the message was published. On the 8th day it is discarded.
Even after the retention period, Kafka still retains metadata that keeps track of the position of a consumer in a log. This is referred to as the offset. More information on Kafka consumer offset is available here.
Additionally, Kafka can be set to keep a configurable size of log data known as size based retention. This parameter is defined by log.retention.bytes. It is important to note that Kafka’s performance is not affected by retaining data.
Log compaction can be employed to ensure that the Kafka cluster retains at minimum the last message key within a single topic partition. This functionality is useful for restoring the state after system failure or application crashes.
Finally, a note on ordering. For Kafka ordering is guaranteed within a partition, but not between partitions. If messages must remain ordered for an entire topic, then use only one partition for that topic.
A Final Note on Apache Kafka for Event Streaming
Kafka builds on the publication-subscribe and message queue models to provide scalable, robust event streaming. To achieve this performance, Kafka employs consumer groups and calls on brokers to retain messages for a configurable amount of time.
Have Kafka Questions?
Managed Kafka on your environment with 24/ 7 support.
Consulting support to implement, troubleshoot,
and optimize Kafka.