Kafka on Kubernetes

Updated April 2022 More and more companies are coming to us specifically for assistance with deploying and managing Apache Kafka on Kubernetes.  With many teams already familiar with Kubernetes and its ability to orchestrate infrastructure, it can sometimes be the best choice to spin up Kafka servers on Kubernetes alongside their other applications.

What is Kafka Connect?

Updated March 2022 Kafka Connect is a free tool for efficiently moving data into and out of Apache Kafka.  Kafka Connect simplifies streaming data while also improving scalability and reliability. Features of Kafka Connect Standardizes integrations with Kafka.  Kafka Connect provides a shared framework for all Kafka connectors, which improves efficiency for connector development and

Kafka Uses Consumer Groups for Scaling Event Streaming

Updated November 2021 Apache Kafka is a distributed messaging system that implements pieces of the two traditional messaging models, Shared Message Queues and Publish-Subscribe.  Both Shared Message Queues and Publish-Subscribe models present limitations for handling high-throughput use cases.  Apache Kafka provides fault-tolerant, high-throughput stream processing that can handle even the most complicated

Kafka Case Studies

Updated February 2022 Apache Kafka's high throughput and high availability give it a wide range of applications.  Here we dive into eight Kafka case studies.  These accounts are taken from work our Kafka solutions architects / Kafka consultants have done in the field with our clients. Medical Manufacturing Company automating the drug manufacturing process with multiple machines needs

Kafka Definitions

Updated October 2021 Taking a break from Kafka optimization posts to get back to the basics of Apache Kafka and define fundamental Kafka concepts. Kafka Definitions:  A Primer for Apache Kafka Fundamentals Kafka Producer.  A Kafka producer is a standalone application, or addition to your application, that sends data to Kafka broker(s). Kafka Broker.  A

Kafka Consumer Optimization

Updated December 2021 Kafka Consumer's Role. The role of the Kafka consumer is to read data from Kafka.  Kafka consumer optimization can help avoid errors and increase performance of your application.   While the focus of this blog post is on the consumer, we will also review several broker configurations which affect the performance of consumers. Top

What is a Kafka Topic?

Updated April 2022 Kafka topics are the categories used to organize messages. Each topic has a name that is unique across the entire Kafka cluster. Messages are sent to and read from specific topics.  In other words, producers write data to topics, and consumers read data from topics.  Kafka topics are multi-subscriber.  This means that

Open Source Monitoring for Kafka

Updated December 2021 Monitoring is a critical component of ensuring Kafka uptime and maintaining peak performance.  Open source monitoring of disk performance, memory usage, CPU, network traffic, and load allows you to identify abnormal metrics in real time and address potential issues before a performance dip or outage occurs. In other words, monitoring Apache Kafka

Load Balancing With Kafka

Updated May 2022 What is Kafka load balancing? Load balancing with Kafka is a straightforward process and is handled by the Kafka producers by default.  While it isn't traditional load balancing, it does spread out the message load between partitions while preserving message ordering. Round-robin approach:  By default, producers choose the partition assignment for each

Kafka Use Cases

Updated April 2021 Apache Kafka is a high-throughput, open source message queue used by Fortune 100 companies, government entities, and startups alike. Part of Kafka's appeal is its wide array of use cases.  In this post we will outline several of Kafka's use cases, from event sourcing to tracking web activities to metrics and more.

Performance Tuning for Apache Kafka

For Apache Kafka performance tuning, measure latency and throughput for your Kafka implementation. Latency is the measure of how long it takes Kafka to process a single event. Throughput is the measure of how many events arrive within a particular period of time.
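
As a rough illustration of these two definitions, the sketch below computes average latency and throughput from recorded timestamps. It uses no Kafka client; the `EventTiming` record, its field names, and both helper functions are hypothetical names chosen for this example, and it assumes you already log when each event arrived and when it finished processing.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class EventTiming:
    """Hypothetical record of one event's timestamps, in seconds."""
    arrived_at: float
    processed_at: float


def average_latency(events: Sequence[EventTiming]) -> float:
    """Mean time taken to process a single event, in seconds."""
    if not events:
        raise ValueError("need at least one event")
    return sum(e.processed_at - e.arrived_at for e in events) / len(events)


def throughput(events: Sequence[EventTiming],
               window_start: float, window_end: float) -> float:
    """Events arriving per second within [window_start, window_end)."""
    if window_end <= window_start:
        raise ValueError("window must have positive length")
    arrived = sum(1 for e in events
                  if window_start <= e.arrived_at < window_end)
    return arrived / (window_end - window_start)


# Made-up sample timings, for illustration only.
sample = [
    EventTiming(arrived_at=0.0, processed_at=0.5),
    EventTiming(arrived_at=1.0, processed_at=1.25),
    EventTiming(arrived_at=2.0, processed_at=2.75),
    EventTiming(arrived_at=3.0, processed_at=3.5),
]
print(average_latency(sample))       # seconds per event
print(throughput(sample, 0.0, 2.0))  # events per second
```

Tracking both numbers together matters because they can move independently: a change that raises throughput may also raise per-event latency.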

Kafka Monitoring With Elasticsearch and Kibana

Monitoring Kafka cluster performance is crucial for diagnosing system issues and preventing future problems. We recommend using Elasticsearch for Kafka monitoring because Elasticsearch is free and highly versatile as a single source of truth throughout any organization.

Frequently Asked Questions: Apache Kafka

Our team is experienced with implementing and fixing Kafka on a wide range of systems for an even wider range of business needs. From our real-world experience with Kafka consulting, we have found that many new clients share common questions about the technology.
Here are some quick answers to those questions.

Kafka Optimization

Issues with Apache Kafka performance are directly tied to system optimization and utilization. Here, we compiled the best practices for a high volume, clustered, general use case.

When to Consider Physical and Logical Separation With Kafka

As companies scale, their data handling needs change, and systems that worked a year ago become overtaxed by the increase in message volume. One particular component of the data handling system, the cluster architecture, should be revisited.