When to Consider Physical and Logical Separation With Kafka

Kafka data can be separated in two ways: physically, by running separate clusters, and logically, by organizing data into topics within a cluster.

PHYSICAL SEPARATION

Physical separation of clusters is the best approach in four primary cases.

#1

If the message sizes coming to your system vary widely, from exceedingly small to exceedingly large, physical separation is best.

Messages are processed in the order they are received, so a single large message can add considerable delay for the smaller messages queued behind it.
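To make the delay concrete, here is a minimal back-of-the-envelope sketch. It assumes a single queue drained strictly in arrival order at a fixed throughput. The function name `queue_delays`, the message sizes, and the 100 MB/s figure are illustrative assumptions, not Kafka measurements.

```python
def queue_delays(sizes_bytes, throughput_bytes_per_s):
    """Return how long each message waits before its own transfer starts,
    assuming strict first-in, first-out processing at a fixed throughput."""
    delays = []
    elapsed = 0.0
    for size in sizes_bytes:
        delays.append(elapsed)
        elapsed += size / throughput_bytes_per_s
    return delays


# Hypothetical workload: one 500 MB message followed by three 1 KB messages.
THROUGHPUT = 100_000_000  # 100 MB/s, an assumed figure
sizes = [500_000_000, 1_000, 1_000, 1_000]
delays = queue_delays(sizes, THROUGHPUT)
for size, delay in zip(sizes, delays):
    print(f"{size:>11} bytes waits {delay:.5f} s")
```

In this simplified model every small message waits about five seconds behind the large one. On a cluster that only carries small messages, the same three messages would wait tens of microseconds at most.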


#2

Physical separation is also best when messages are consumed at intervals rather than continuously.

Kafka performs best when messages are consumed shortly after they are received, while they are still in memory. If a consumer wakes up only once per hour, Kafka must read the older messages from disk back into RAM, which degrades performance for the entire Kafka cluster.
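A rough way to judge whether an interval consumer will force disk reads is to compare the backlog that builds up between wake-ups with the memory available for caching recent data. The sketch below is a simple heuristic under assumed numbers. The function names, the 20 MB/s produce rate, and the 32 GB cache size are all hypothetical.

```python
def backlog_bytes(produce_rate_bytes_per_s, consumer_interval_s):
    """Bytes that accumulate between two wake-ups of an interval consumer."""
    return produce_rate_bytes_per_s * consumer_interval_s


def likely_reads_from_disk(backlog, cache_bytes):
    """Rough heuristic: if the backlog exceeds the memory available for
    caching recent log data, the oldest messages have probably been evicted
    and must be read back from disk."""
    return backlog > cache_bytes


# Hypothetical numbers: 20 MB/s produced, consumer wakes once per hour,
# 32 GB of memory assumed available for caching.
backlog = backlog_bytes(20_000_000, 3600)
print(backlog)  # 72000000000
print(likely_reads_from_disk(backlog, 32_000_000_000))  # True
```

With these assumed figures the hourly consumer faces a 72 GB backlog, more than twice the cache, so most of its reads would come from disk and compete with every other service on the cluster.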

 


#3

Exceedingly high overall bandwidth for a single service also merits physical separation.

If a specific subset of the messages in a cluster accounts for the majority of the bandwidth, reduce the overall risk to your message queue by creating a separate cluster dedicated to those high-bandwidth messages.

Auto-scaling operations also become easier when a cluster serves only a single service.
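One way to spot candidates for a dedicated cluster is to compute each service's share of total bandwidth. This is a minimal sketch: the function name `dominant_services`, the 50% threshold, and the service names and rates are assumptions chosen for illustration.

```python
def dominant_services(bandwidth_by_service, share_threshold=0.5):
    """Return the services whose share of total bandwidth meets the threshold."""
    total = sum(bandwidth_by_service.values())
    if total == 0:
        return []
    return sorted(
        name
        for name, bandwidth in bandwidth_by_service.items()
        if bandwidth / total >= share_threshold
    )


# Hypothetical per-service bandwidth in MB/s.
usage = {"clickstream": 900, "billing": 40, "audit": 60}
print(dominant_services(usage))  # ['clickstream']
```

Here the hypothetical `clickstream` service produces 90% of the traffic, so it would be the one to move to its own cluster.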

#4

The fourth case for a separate cluster is when you have critical messages that must be guaranteed.

As mentioned previously, auto-scaling operations are more robust when only accounting for a single service.

Additionally, physically separating your critical data protects it from unrelated, less important services: a bug in one of them cannot degrade the Kafka cluster that holds the critical messages.
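In application code, this kind of separation can be as simple as choosing the cluster address per service. The sketch below is an assumption-laden illustration: the host names, the service names, and the helper `producer_config` are hypothetical. The keys `bootstrap.servers`, `acks`, and `enable.idempotence` are standard Kafka producer settings, and choosing them for critical traffic is a suggestion rather than something the cases above prescribe.

```python
# Hypothetical bootstrap addresses; replace with your own clusters.
CLUSTERS = {
    "critical": "kafka-critical-1.example.internal:9092",
    "general": "kafka-general-1.example.internal:9092",
}

# Hypothetical set of services whose messages must not be lost.
CRITICAL_SERVICES = {"payments", "orders"}


def producer_config(service):
    """Build a producer configuration dict for the given service."""
    critical = service in CRITICAL_SERVICES
    config = {"bootstrap.servers": CLUSTERS["critical" if critical else "general"]}
    if critical:
        # Standard Kafka producer settings that favor durability over latency.
        config["acks"] = "all"
        config["enable.idempotence"] = True
    return config


print(producer_config("payments"))
print(producer_config("recommendations"))
```

Keeping the mapping in one place means a misbehaving non-critical service can only ever reach the general cluster.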

Drawbacks to Physical Separation

The drawback to implementing physically separated clusters is that there are now multiple clusters to monitor and alert on, and multiple clusters to expand and manage.

These perceived drawbacks are minimized by automating the majority of those tasks.

LOGICAL SEPARATION

Logical separation of data is a bit trickier than physical separation because careful thought must be given to how Kafka topics are created and which services use each topic.

Unless multiple services need the same stream of messages, all of the data a service needs should go into that service's own topic.

Keep the number of topics to a minimum to maintain high performance of the Kafka cluster.
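The topic guidelines above can be sketched as a small planning function. It is one possible reading of the rule, not a definitive layout: streams needed by several services get a shared topic, and everything else is grouped into a single topic per service to keep the topic count low. The function `plan_topics`, the `shared.` and `service.` naming prefixes, and the example services are all hypothetical.

```python
from collections import defaultdict


def plan_topics(streams_by_service):
    """Map each topic name to the set of streams it carries.

    Streams needed by more than one service get a shared topic; all other
    streams are grouped into one topic per service.
    """
    consumers = defaultdict(set)
    for service, streams in streams_by_service.items():
        for stream in streams:
            consumers[stream].add(service)

    topics = defaultdict(set)
    for stream, services in consumers.items():
        if len(services) > 1:
            topics[f"shared.{stream}"].add(stream)
        else:
            (service,) = services
            topics[f"service.{service}"].add(stream)
    return dict(topics)


# Hypothetical services and the streams each one needs.
needs = {
    "billing": {"invoices", "payments", "user-events"},
    "analytics": {"page-views", "user-events"},
}
plan = plan_topics(needs)
for topic in sorted(plan):
    print(topic, sorted(plan[topic]))
```

For this input the plan contains three topics: one per service and one shared topic for `user-events`, which both services read.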

Benefit of Logical Separation

The benefit of logical separation is that missing data is easily retrieved from Kafka because of the organization of information into topics.
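The idea can be illustrated with a toy in-memory model. This is not the Kafka client API; the class `TopicLog` and its methods are hypothetical stand-ins for a single-partition topic, where every record keeps its offset and can be re-read while it is retained.

```python
class TopicLog:
    """Toy append-only log standing in for a single-partition Kafka topic."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Store a record and return its offset."""
        self._records.append(value)
        return len(self._records) - 1

    def read_from(self, offset):
        """Return every retained record at or after the given offset."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        return self._records[offset:]


orders = TopicLog()
for value in ["order-1", "order-2", "order-3", "order-4"]:
    orders.append(value)

# Suppose a downstream service lost everything after offset 1.
print(orders.read_from(2))  # ['order-3', 'order-4']
```

Because the topic holds only one service's data, the replay returns exactly the records that service is missing, with nothing unrelated to filter out.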

Drawback to Logical Separation

The drawback to this approach is that, for best results, each topic should have its own dedicated disk to maintain sequential read and write performance.

Implementing both forms of physical and logical separation of data will increase the performance of your Kafka clusters, reduce cost, and reduce downtime.
