Kafka Definitions

Taking a break from Kafka optimization posts to get back to the basics of Apache Kafka and define fundamental Kafka concepts.

Kafka Definitions:  A Primer for Apache Kafka Fundamentals

Kafka Producer.  A Kafka producer is a standalone application, or an addition to your application, that sends data to Kafka broker(s).

Kafka Broker.  A Kafka broker is the central Kafka server application.  It receives messages from producers, stores them, and serves them to consumers, acting as a message queue for your data.

Kafka Consumer.  A Kafka consumer is a standalone application, or an addition to your application, that pulls data from Kafka brokers and outputs the data to your desired location.

Kafka Topic.  Kafka topics are used to organize message feeds into categories.  Each topic has a name that is unique across the entire Kafka cluster.

Kafka Monitoring.  Kafka monitoring is key to ensuring Kafka uptime and maintaining peak performance.  Kafka monitoring should include tracking both Kafka status and operating system status.

Kafka Load Balancing.  Kafka load balancing is the way in which Kafka assigns each incoming message within a topic to one of that topic's partitions.
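To make partition assignment concrete, here is a minimal sketch in plain Python.  It assumes the common pattern where a message with a key always lands on the same partition (hash of the key, modulo the partition count) and a message without a key is spread across partitions in turn.  The class name SketchPartitioner is hypothetical, and zlib.crc32 is only a stand-in hash: Kafka's own default partitioner uses a murmur2 hash, and its handling of keyless messages varies by version.

```python
import itertools
import zlib


class SketchPartitioner:
    """Illustrative partitioner; NOT Kafka's actual implementation."""

    def __init__(self, num_partitions):
        if num_partitions < 1:
            raise ValueError("num_partitions must be at least 1")
        self.num_partitions = num_partitions
        # Endless 0, 1, ..., n-1, 0, 1, ... sequence for keyless messages.
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key=None):
        """Return the partition index for a message with the given key."""
        if key is None:
            # No key: spread messages evenly across partitions.
            return next(self._round_robin)
        # Keyed: the same key always maps to the same partition.
        # crc32 is a stand-in hash chosen for this sketch.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=3)
keyed = [partitioner.partition_for("user-42") for _ in range(3)]
unkeyed = [partitioner.partition_for() for _ in range(6)]
print("keyed:", keyed)
print("unkeyed:", unkeyed)
```

The design point is that keyed messages keep their relative order, because they all share one partition, while keyless messages are balanced for throughput.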

Kafka Consumer Offset.  Kafka consumer offset is a way of tracking a consumer's position in the sequential order of messages within each partition of a Kafka topic, so the consumer knows which message to read next.
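The idea of an offset can be sketched without a running cluster.  The example below models a single partition as an append-only list and a consumer that remembers the offset of the next message it should read.  PartitionLog and SketchConsumer are hypothetical names for this illustration, not part of any Kafka client library.

```python
class PartitionLog:
    """A single partition modeled as an append-only list of messages."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Store a message and return the offset it was written at."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read_from(self, offset, max_messages):
        """Return up to max_messages messages starting at offset."""
        return self._messages[offset:offset + max_messages]


class SketchConsumer:
    """Tracks its own position (offset) in one partition."""

    def __init__(self, log):
        self.log = log
        # Offset of the NEXT message to read, starting from the beginning.
        self.committed_offset = 0

    def poll(self, max_messages=2):
        """Read the next batch and advance the offset past it."""
        batch = self.log.read_from(self.committed_offset, max_messages)
        self.committed_offset += len(batch)
        return batch


log = PartitionLog()
for text in ["a", "b", "c"]:
    log.append(text)

consumer = SketchConsumer(log)
first_batch = consumer.poll()
second_batch = consumer.poll()
print(first_batch, second_batch, consumer.committed_offset)
```

Because the offset is stored separately from the messages, a consumer that restarts can resume from its last committed position instead of rereading the whole partition.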

Kafka Performance Tuning.  Kafka performance tuning is the process of optimizing the Kafka implementation to reduce system latency and maximize throughput.

Kafka Latency.  Kafka latency is the measure of how long it takes Kafka to process a single message.

Kafka Throughput.  Kafka throughput is the measure of how many messages, and how much data by message size, arrive within a particular period of time.
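As a quick illustration of the two sides of throughput, the sketch below computes both messages per second and bytes per second from a list of message sizes observed in a time window.  The function name and the sample numbers are made up for this example.

```python
def throughput(message_sizes_bytes, window_seconds):
    """Return (messages per second, bytes per second) for one window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    message_count = len(message_sizes_bytes)
    total_bytes = sum(message_sizes_bytes)
    return message_count / window_seconds, total_bytes / window_seconds


# Four messages of varying size observed over a 2-second window.
msgs_per_sec, bytes_per_sec = throughput([512, 1024, 2048, 512], window_seconds=2)
print(msgs_per_sec, bytes_per_sec)
```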

Kafka Partitions.  Kafka partitions allow data to be processed in parallel, either across multiple brokers or on a single broker.  The number of partitions needed depends on the desired throughput and partition speed.
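The relationship between desired throughput and partition speed can be turned into a rough estimate.  The sketch below assumes you have measured how much data a single partition can handle per second in your own environment and simply divides the target by that figure, rounding up.  The function name and the numbers are illustrative assumptions, and a real sizing exercise would also weigh consumer speed and headroom for growth.

```python
import math


def partitions_needed(target_mb_per_sec, per_partition_mb_per_sec):
    """Rough partition count: target throughput / per-partition speed."""
    if target_mb_per_sec <= 0 or per_partition_mb_per_sec <= 0:
        raise ValueError("throughput values must be positive")
    # Round up: a fraction of a partition still requires a whole one.
    return math.ceil(target_mb_per_sec / per_partition_mb_per_sec)


# Assumed figures: 100 MB/s target, 30 MB/s measured per partition.
estimate = partitions_needed(100, 30)
print(estimate)
```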

Kafka Data Separation.  Kafka data separation can be either physical or logical.  Implementing both forms of data separation will increase Kafka cluster performance, reduce cost, and reduce downtime.

Kafka as a Service.  Kafka as a Service provides 24/7 Kafka monitoring, support, and maintenance to maximize performance and uptime.  Kafka as a Service is typically provided by a team of engineers who have extensive experience with Kafka management.

Managed Kafka.  Managed Kafka is used interchangeably with Kafka as a Service.  Managed Kafka provides 24/7 Kafka monitoring, support, and maintenance to maximize performance and uptime.  Managed Kafka is typically provided by a team of engineers who have extensive experience with Kafka management.

Hosted Kafka.  Hosted Kafka is a type of managed Kafka where the service provider hosts its clients’ Kafka clusters in the service provider’s own environment.  Hosted Kafka tends to increase latency and cost more than when Kafka is run in a client’s own environment.

Kafka Uptime.  Kafka uptime is the measure of how much time Kafka is online over a prescribed period of time.  If the uptime is 100%, that means Kafka was fully online and didn’t lose any messages.  If uptime is 99% over a 24-hour period, then out of the last 24 hours Kafka was down for a little over 14 minutes (1% of 1,440 minutes is 14.4 minutes). A well-designed Kafka deployment should provide 24/7, reliable, fault-tolerant message collection and processing.
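The uptime arithmetic is easy to check for other percentages and periods.  This small sketch, with a function name chosen just for this example, converts an uptime percentage over a period in hours into minutes of downtime.

```python
def downtime_minutes(uptime_percent, period_hours):
    """Minutes of downtime implied by an uptime percentage over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    if period_hours < 0:
        raise ValueError("period_hours must not be negative")
    # Downtime share of the period, expressed in minutes.
    return (100 - uptime_percent) * period_hours * 60 / 100


# 99% uptime over 24 hours: a little over 14 minutes of downtime.
daily_downtime = downtime_minutes(99, 24)
print(round(daily_downtime, 1))
```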