Updated July 2022
Kafka topics are the categories used to organize messages. Each topic has a name that is unique across the entire Kafka cluster.
Messages are sent to and read from specific topics. In other words, producers write data to topics, and consumers read data from topics.
Kafka topics are multi-subscriber. This means that a topic can have zero, one, or many consumers subscribed to the data written to it.
In Kafka, topics are partitioned and replicated across the brokers in a cluster. A broker is a single node in a Kafka cluster. Partitions are important because they let a topic be written to and read from in parallel, which enables high message throughput. See the figure below, taken from the Apache Kafka website, for a visual of how topics are split into partitions.
Each message in a partition is assigned an offset, a sequential number that identifies the message's position within that partition. Offsets are how Kafka keeps track of the messages in the different partitions of a topic. Our article on Kafka Consumer Offsets covers the fundamentals.
How to Create Kafka Topics
Kafka topics can be created either automatically or manually. It is best practice to manually create all input/output topics before starting an application, rather than relying on automatic topic creation. However, topics do not need to be manually created.
Creating topics automatically is the default setting. You can confirm if this is the case for your implementation by checking that the property auto.create.topics.enable is set to true. With this setting, a topic is created automatically when an application produces to, consumes from, or fetches metadata for a topic that does not yet exist.
For auto topic creation, it’s good practice to check num.partitions for the default number of partitions and default.replication.factor for the default number of replicas of the created topic.
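As a rough illustration of the check described above, the sketch below prints the three properties from a broker configuration file. The path config/server.properties is an assumption based on the layout of a standard Kafka download, and the helper name show_topic_defaults is hypothetical. A property that is missing from the file falls back to the broker's built-in default, so it is reported as not found rather than given a value.

```shell
#!/bin/sh
# Hypothetical helper: print the auto-creation settings found in a broker
# properties file. Properties absent from the file use the broker default.
show_topic_defaults() {
  config_file="$1"
  for prop in auto.create.topics.enable num.partitions default.replication.factor; do
    # Match uncommented "<prop>=<value>" lines; if repeated, keep the last one.
    value=$(grep -E "^[[:space:]]*${prop}[[:space:]]*=" "$config_file" 2>/dev/null \
      | tail -n 1 | cut -d= -f2- | tr -d '[:space:]')
    echo "${prop}=${value:-<not found in file>}"
  done
}

# Assumed location of the broker config; override with KAFKA_CONFIG if needed.
show_topic_defaults "${KAFKA_CONFIG:-config/server.properties}"
```

Run it from the Kafka installation directory, or set KAFKA_CONFIG to point at the file your broker actually loads.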
To create topics manually, run kafka-topics.sh and specify the topic name, replication factor, and any other relevant attributes. For example:
> bin/kafka-topics.sh --create
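A fuller invocation might look like the sketch below. The broker address localhost:9092, the topic name example-topic, and the partition and replication values are all placeholders rather than recommendations. It also assumes a Kafka version whose kafka-topics.sh accepts --bootstrap-server (2.2 or later); older releases used --zookeeper instead. The small run helper is hypothetical: it prints the command by default so you can review it, and executes it only when DRY_RUN=0.

```shell
#!/bin/sh
# Dry-run helper: prints the command by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Placeholder values; adjust them for your cluster.
BOOTSTRAP="localhost:9092"   # assumed broker address
TOPIC="example-topic"        # hypothetical topic name

# Create the topic with an explicit partition count and replication factor.
run bin/kafka-topics.sh --create \
  --bootstrap-server "$BOOTSTRAP" \
  --topic "$TOPIC" \
  --partitions 3 \
  --replication-factor 1
```

The replication factor cannot exceed the number of brokers in the cluster, so a single-broker test setup is limited to 1.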
Where are Kafka topics stored?
Kafka topics are stored as partitions that are distributed across the brokers in the cluster. This distributed approach is important for scaling. Client applications can read from multiple brokers at once because the topics are partitioned over several brokers.
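On disk, each broker keeps its share of a topic as one directory per partition inside its log directory, named in the form topic-partition (for example, orders-0). The sketch below lists those directories on a single broker. The default path /tmp/kafka-logs is an assumption; the real location is whatever log.dirs is set to in your broker configuration. The helper name list_partitions is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the partition directories in a broker log directory.
# Kafka names each one "<topic>-<partition>", e.g. "orders-0".
list_partitions() {
  log_dir="$1"
  for dir in "$log_dir"/*/; do
    [ -d "$dir" ] || continue            # the glob matched nothing
    name=$(basename "$dir")
    # Topic names may contain hyphens, so split on the last one.
    topic=${name%-*}
    partition=${name##*-}
    [ "$topic" != "$name" ] || continue  # no hyphen: not a partition directory
    case "$partition" in
      ''|*[!0-9]*) continue ;;           # suffix is not a partition number
    esac
    echo "topic=$topic partition=$partition"
  done
}

# Assumed log directory; override with KAFKA_LOG_DIR to match log.dirs.
list_partitions "${KAFKA_LOG_DIR:-/tmp/kafka-logs}"
```

Running this on each broker shows which partitions that broker hosts, including replicas of partitions led by other brokers.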
Partitioning is one of the most critical factors in Kafka optimization. We have an entire post dedicated to optimizing the number of partitions for your implementation.
How to View a List of Kafka Topics
There are two simple ways to list Kafka topics.
To view a list of Kafka topics, run the following command:
> bin/kafka-topics.sh --list
For a more granular view of the topics and partitions:
> bin/kafka-topics.sh --describe
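In practice both commands also need to know which cluster to talk to. The sketch below shows one way to write them out in full, assuming a broker at localhost:9092 and a hypothetical topic named example-topic, on a Kafka version that supports --bootstrap-server. As a precaution, the hypothetical run helper only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: prints the command by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

BOOTSTRAP="localhost:9092"   # assumed broker address
TOPIC="example-topic"        # hypothetical topic name

# List the names of all topics in the cluster.
run bin/kafka-topics.sh --list --bootstrap-server "$BOOTSTRAP"

# Describe a single topic: partition count, replication factor, and the
# leader, replicas, and in-sync replicas of each partition.
run bin/kafka-topics.sh --describe --topic "$TOPIC" --bootstrap-server "$BOOTSTRAP"
```

Omitting --topic from the describe command describes every topic in the cluster.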
How to Clear a Kafka Topic
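A Kafka topic can be cleared (also referred to as being cleaned or purged) by temporarily reducing its retention time.
For instance, if the retention time is 168 hours (one week), reduce it to one second. Once that second has passed, the existing messages become eligible for deletion, and the broker removes them the next time it checks retention. Afterward, restore the original retention time so that new messages are kept as intended.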
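One possible way to do this from the command line is with kafka-configs.sh and the topic-level retention.ms setting, as sketched below. The broker address and topic name are placeholders, and the example assumes a recent Kafka release in which kafka-configs.sh accepts --bootstrap-server for topic configuration. The hypothetical run helper prints each command instead of executing it unless DRY_RUN=0 is set, which is worth keeping in mind for a destructive operation like this one.

```shell
#!/bin/sh
# Dry-run helper: prints the command by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

BOOTSTRAP="localhost:9092"   # assumed broker address
TOPIC="example-topic"        # hypothetical topic name

# Step 1: drop retention to one second (1000 ms) so existing messages expire.
run bin/kafka-configs.sh --alter --bootstrap-server "$BOOTSTRAP" \
  --entity-type topics --entity-name "$TOPIC" \
  --add-config retention.ms=1000

# Step 2: after the broker has removed the old data, restore one week (168 hours).
ONE_WEEK_MS=$((168 * 60 * 60 * 1000))
run bin/kafka-configs.sh --alter --bootstrap-server "$BOOTSTRAP" \
  --entity-type topics --entity-name "$TOPIC" \
  --add-config "retention.ms=$ONE_WEEK_MS"
```

If the topic had no retention override to begin with, replacing step 2's --add-config with --delete-config retention.ms returns the topic to the broker-wide default instead.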