Monitoring is the key to ensuring Kafka uptime and maintaining peak performance. By reviewing disk performance, memory usage, CPU, network traffic, and load in real time, you can identify abnormal metrics or trends before a performance dip or outage occurs.
Furthermore, monitoring Kafka provides assurance to your users that all messages are correctly processed.
There are several programs available for monitoring Kafka. Some, such as Confluent, come at added cost; others are free because they are open source. If you are using Kafka, then you already understand the benefits of open source tools. If you need a refresher, check out our post on the benefits of open source tools.
When setting up Kafka for our clients, we always use the open source database Elasticsearch and its companion visualization tool Kibana for monitoring. Together they are cost effective, have an appealing user interface, and provide the functionality needed to build a comprehensive and easy-to-use monitoring tool.
Elasticsearch as an Open Source Monitoring Tool for Kafka
If you aren’t familiar with using Elasticsearch as a Kafka monitoring tool, let’s get acquainted.
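#1 Elasticsearch is FREE. Elasticsearch and its companion tool Kibana were released as open source under the Apache license and are free to download, use, and modify. Licensing terms have changed in more recent releases, so check the license for the version you deploy.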
#2 Elasticsearch is ADAPTABLE. Elasticsearch is a multifaceted tool that offers horizontally scalable data storage and fast data retrieval.
#3 Kibana provides attractive DASHBOARDS. Kibana is a sister product to Elasticsearch that delivers customizable and appealing graphics for building dashboards for the Kafka monitoring platform.
#4 Elasticsearch plays well with ALERTING. Setting up alerting for Kafka issues is straightforward, because several tools integrate with Elasticsearch to provide threshold-based and machine-learning-based alerting.
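To make the idea of threshold alerting concrete, here is a minimal sketch in Python. It is not the API of any particular alerting tool. The metric names, the limits, and the `find_breaches` helper are all hypothetical, and in a real setup the sample would come from a query against Elasticsearch rather than a hard-coded dictionary.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str       # hypothetical metric name
    max_value: float  # alert when the observed value exceeds this limit


def find_breaches(sample, thresholds):
    """Return (metric, value, limit) for every threshold the sample exceeds."""
    breaches = []
    for t in thresholds:
        value = sample.get(t.metric)
        # Metrics missing from the sample are skipped rather than treated as breaches
        if value is not None and value > t.max_value:
            breaches.append((t.metric, value, t.max_value))
    return breaches


# Illustrative limits only; these are assumptions, not recommended values
THRESHOLDS = [
    Threshold("disk_used_pct", 85.0),
    Threshold("under_replicated_partitions", 0),
]

sample = {"disk_used_pct": 91.2, "under_replicated_partitions": 0, "cpu_pct": 40.0}
for metric, value, limit in find_breaches(sample, THRESHOLDS):
    print(f"ALERT {metric}={value} exceeds {limit}")
```

A fixed limit like this is the simplest form of alerting. Machine-learning-based alerting replaces the static `max_value` with a baseline learned from historical data.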
Schematic of Information Flow for Open Source Kafka Monitoring
The figure below shows how information flows from the Producers to the Kafka cluster(s), and then to the Consumers and ZooKeeper. In the “normal” use of Kafka, information then flows from the Consumers to your database, shown here as Elasticsearch. For monitoring, information is instead sent from Kafka and ZooKeeper directly to the database, where it is used to track the performance of the Kafka cluster(s).
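As a rough illustration of the last hop in that flow, the sketch below formats metric readings as a request body for the Elasticsearch `_bulk` API, which expects newline-delimited JSON: one action line followed by one document line, with a trailing newline. The index name `kafka-metrics`, the field names, and both helper functions are assumptions made for this example, not part of any specific shipper.

```python
import json
from datetime import datetime, timezone


def metric_doc(host, name, value, ts=None):
    """Build one metric document; the field names here are illustrative."""
    ts = ts or datetime.now(timezone.utc)
    return {"@timestamp": ts.isoformat(), "host": host, "metric": name, "value": value}


def to_bulk_body(docs, index="kafka-metrics"):
    """Format documents as NDJSON for the Elasticsearch _bulk endpoint."""
    lines = []
    for doc in docs:
        # Each document is preceded by an action line naming the target index
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline
    return "\n".join(lines) + "\n"


docs = [
    metric_doc("broker-1", "cpu_pct", 40.0),
    metric_doc("broker-1", "disk_used_pct", 72.5),
]
print(to_bulk_body(docs))
```

The resulting string would be sent in a POST request to the cluster's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header. The network call is left out so the sketch stays runnable without a cluster.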
What metrics should be monitored to track Kafka performance and health?
Both the status of Kafka and the status of the operating system need to be recorded to evaluate system health and performance. In our article on Kafka Monitoring With Elasticsearch and Kibana, we dive into the details of which factors should be included and what “normal” looks like.
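For the operating system side, a snapshot can be taken with nothing but the Python standard library. This is a minimal sketch. The `os_snapshot` function and its field names are hypothetical, and it covers only a few host-level values. Kafka's own metrics are exposed separately by the brokers and are not collected here.

```python
import os
import shutil


def os_snapshot(path="/"):
    """Collect a few host-level readings for the volume that contains `path`."""
    usage = shutil.disk_usage(path)
    snapshot = {
        "disk_used_pct": round(100 * usage.used / usage.total, 1),
        "cpu_count": os.cpu_count(),
    }
    # os.getloadavg exists only on Unix-like systems and may fail to read the load
    if hasattr(os, "getloadavg"):
        try:
            snapshot["load_1m"] = os.getloadavg()[0]
        except OSError:
            pass
    return snapshot


print(os_snapshot())
```

Running this on a schedule on each broker host, and recording the result alongside the Kafka metrics, gives the two views of system health described above.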