What is a Kafka Topic?

Updated July 2022

Kafka topics are the categories used to organize messages. Each topic has a name that is unique across the entire Kafka cluster.

Messages are sent to and read from specific topics.  In other words, producers write data to topics, and consumers read data from topics. 
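As a rough illustration of the producer and consumer roles, the sketch below wraps the console clients that ship with Kafka. The function names, the script paths, the broker address localhost:9092, and the topic name are assumptions for illustration, and the --bootstrap-server flag on the console producer assumes a reasonably recent Kafka release.

```shell
# Assumed script paths and broker address; adjust for your installation.
PRODUCER_CMD="${PRODUCER_CMD:-bin/kafka-console-producer.sh}"
CONSUMER_CMD="${CONSUMER_CMD:-bin/kafka-console-consumer.sh}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# Send each line of standard input to the topic as one message.
produce_lines() {
  "$PRODUCER_CMD" --bootstrap-server "$BOOTSTRAP" --topic "$1"
}

# Print the topic's messages, starting from the earliest one still retained.
# Runs until interrupted with Ctrl+C.
consume_from_start() {
  "$CONSUMER_CMD" --bootstrap-server "$BOOTSTRAP" --topic "$1" --from-beginning
}
```

For example, `printf 'hello\nworld\n' | produce_lines test` would write two messages to a topic named test, and `consume_from_start test` in another terminal would print them.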

Kafka topics are multi-subscriber.  This means that a topic can have zero, one, or multiple consumers subscribing to that topic and the data written to it.
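One way to see multi-subscriber behavior is to read the same topic from two different consumer groups. The sketch below assumes the console consumer script path, a broker at localhost:9092, and a hypothetical helper name; the group names in the usage note are placeholders.

```shell
# Assumed script path and broker address; adjust for your installation.
CONSUMER_CMD="${CONSUMER_CMD:-bin/kafka-console-consumer.sh}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# Read a topic as a member of the named consumer group.
# Arguments: topic name, consumer group name.
consume_as_group() {
  "$CONSUMER_CMD" --bootstrap-server "$BOOTSTRAP" \
    --topic "$1" --group "$2" --from-beginning
}
```

Running `consume_as_group test billing` and `consume_as_group test analytics` in separate terminals should show each group receiving every message written to the topic. Consumers that share a group name instead divide the topic's partitions among themselves.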

In Kafka, topics are partitioned and replicated across the brokers in a cluster. A broker is a single node in a Kafka cluster. Partitions are important because they allow a topic to be written to and read from in parallel, which enables high message throughput. See the figure below, taken from the Apache Kafka website, for a visual of how topics are split into partitions.

Figure: A Kafka topic split into partitions. Image source: https://kafka.apache.org/documentation/.

Each message in a partition is assigned an offset, a sequential number that identifies the message's position within that partition. Offsets are how messages are tracked across the different partitions of a topic. Our article on the fundamentals of Kafka Consumer Offsets covers this in more detail.
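To inspect offsets in practice, the kafka-consumer-groups.sh tool can report them per partition for a consumer group. This is a minimal sketch; the script path, broker address, helper name, and group name are assumptions.

```shell
# Assumed script path and broker address; adjust for your installation.
GROUPS_CMD="${GROUPS_CMD:-bin/kafka-consumer-groups.sh}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# Show, for every partition the group reads, the group's committed offset
# and the offset at the end of the partition's log.
# Argument: consumer group name.
show_group_offsets() {
  "$GROUPS_CMD" --bootstrap-server "$BOOTSTRAP" --describe --group "$1"
}
```

For a hypothetical group named billing, `show_group_offsets billing` prints one row per partition. The CURRENT-OFFSET, LOG-END-OFFSET, and LAG columns show how far the group has read and how many messages it has left to read.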

How to Create Kafka Topics

Kafka topics can be created either automatically or manually. It is best practice to manually create all input/output topics before starting an application, rather than relying on automatic topic creation. However, manual creation is not required.

Creating topics automatically is the default setting. You can confirm whether this is the case for your implementation by checking that the broker property auto.create.topics.enable is set to true. With this setting, topics are automatically created when applications produce to, consume from, or fetch metadata for a topic that does not yet exist.

For auto topic creation, it’s good practice to check num.partitions for the default number of partitions and default.replication.factor for the default number of replicas of the created topic.
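A quick way to check these three properties is to search the broker's configuration file for them. The sketch below assumes the file lives at config/server.properties, which may differ in your installation, and the helper name is hypothetical.

```shell
# Assumed path to the broker configuration file; adjust for your installation.
SERVER_PROPERTIES="${SERVER_PROPERTIES:-config/server.properties}"

# Print the auto-creation settings that are set explicitly in the file.
# A property that is not listed falls back to the broker default:
# auto.create.topics.enable=true, num.partitions=1, default.replication.factor=1.
# Optional argument: path to a properties file.
show_auto_create_settings() {
  grep -E '^[[:space:]]*(auto\.create\.topics\.enable|num\.partitions|default\.replication\.factor)[[:space:]]*=' \
    "${1:-$SERVER_PROPERTIES}"
}
```

Calling `show_auto_create_settings` with no argument reads the assumed default path. If it prints nothing, none of the three properties is overridden and the broker defaults apply.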

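To create topics manually, run kafka-topics.sh and specify the topic name, replication factor, number of partitions, and any other relevant attributes. The replication factor cannot exceed the number of brokers in the cluster, so a single local broker requires a replication factor of 1. For example:

> bin/kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 1 \
  --partitions 3 \
  --topic test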
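When topics are created manually before an application starts, it can help to make the creation step safe to repeat. The sketch below uses the --if-not-exists flag of kafka-topics.sh for that purpose; the helper name, script path, and broker address are assumptions.

```shell
# Assumed script path and broker address; adjust for your installation.
TOPICS_CMD="${TOPICS_CMD:-bin/kafka-topics.sh}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# Create the topic only if it does not already exist, so that re-running
# a setup script does not fail on topics created by an earlier run.
# Arguments: topic name, partition count, replication factor.
ensure_topic() {
  "$TOPICS_CMD" --create --if-not-exists \
    --bootstrap-server "$BOOTSTRAP" \
    --topic "$1" --partitions "$2" --replication-factor "$3"
}
```

For example, `ensure_topic test 3 1` would request a topic named test with three partitions and a single replica.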

Where are Kafka topics stored?

Kafka topics are stored on brokers.   And rather than being confined to a single broker, topics are partitioned (spread) over multiple brokers.

This distributed approach is important for scaling.  Client applications can read from multiple brokers at once because the topics are partitioned over several brokers.

Partitioning is one of the most critical components of Kafka optimization. We have an entire post dedicated to optimizing the number of partitions for your implementation.

How to View a List of Kafka Topics

There are two simple ways to list Kafka topics.

To view a list of topic names only, run the following command:

> bin/kafka-topics.sh --list \
  --bootstrap-server localhost:9092

For a more granular view of the topics and partitions:

> bin/kafka-topics.sh --describe \
  --bootstrap-server localhost:9092
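Both commands can be narrowed further. The sketch below hides Kafka's internal topics when listing and describes a single topic by name; the helper names, script path, and broker address are assumptions.

```shell
# Assumed script path and broker address; adjust for your installation.
TOPICS_CMD="${TOPICS_CMD:-bin/kafka-topics.sh}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# List topic names, leaving out internal topics such as __consumer_offsets.
list_user_topics() {
  "$TOPICS_CMD" --list --exclude-internal --bootstrap-server "$BOOTSTRAP"
}

# Show partition, leader, and replica details for one topic.
# Argument: topic name.
describe_topic() {
  "$TOPICS_CMD" --describe --bootstrap-server "$BOOTSTRAP" --topic "$1"
}
```

For example, `describe_topic test` limits the detailed view to the topic named test instead of printing every topic in the cluster.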

How to Clear a Kafka Topic

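A Kafka topic can be cleared (also referred to as being cleaned or purged) by temporarily reducing its retention time.

For instance, if the retention time is 168 hours (one week), reduce it to one second. Messages older than one second then become eligible for deletion, and the broker removes them the next time it runs its retention check (every five minutes by default, controlled by log.retention.check.interval.ms). Once the topic has been cleaned / purged / cleared, restore the original retention time.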
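The retention change can be made per topic with kafka-configs.sh by overriding retention.ms. This is a minimal sketch that assumes a recent Kafka release where topic configs can be altered through --bootstrap-server; the helper names, script path, and broker address are assumptions.

```shell
# Assumed script path and broker address; adjust for your installation.
CONFIGS_CMD="${CONFIGS_CMD:-bin/kafka-configs.sh}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# Set a topic-level retention of one second (1000 ms) so that existing
# messages become eligible for deletion.
# Argument: topic name.
shorten_retention() {
  "$CONFIGS_CMD" --bootstrap-server "$BOOTSTRAP" \
    --alter --entity-type topics --entity-name "$1" \
    --add-config retention.ms=1000
}

# Remove the topic-level override so the topic falls back to the
# broker-wide retention setting.
# Argument: topic name.
restore_retention() {
  "$CONFIGS_CMD" --bootstrap-server "$BOOTSTRAP" \
    --alter --entity-type topics --entity-name "$1" \
    --delete-config retention.ms
}
```

A typical sequence would be `shorten_retention test`, a wait for the messages to be removed, then `restore_retention test`. If the topic had its own retention.ms value beforehand, set that value again with --add-config rather than deleting the override.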

Published by

Dattell - Kafka & Elasticsearch Support