
How to Choose a Managed Kafka Service Provider

Published July 2022

Choosing a managed Kafka service provider can be difficult because the options can appear so different and yet so similar. Here we break down the 8 biggest factors to consider when comparing providers.

8 Considerations for Choosing a Managed Apache Kafka Provider

#1 Preventative maintenance

Preventative maintenance guided by real-time monitoring should be included with your managed Apache Kafka service.  Expect the provider to monitor for topic offset deltas, consumer group lag, disk cache hit percent, and several other important metrics.  Check out our Kafka monitoring post if you’re interested in learning more about monitoring.  
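To make one of these metrics concrete, here is a minimal sketch of how consumer group lag can be computed per partition: the latest offset written to the partition minus the offset the consumer group has committed. The function name, the topic name, and the offset values are all hypothetical and chosen for illustration. In practice the two offset maps would come from your Kafka client or monitoring tooling.

```python
from typing import Dict, Optional, Tuple

# A partition is identified here by (topic name, partition number).
TopicPartition = Tuple[str, int]


def consumer_group_lag(
    end_offsets: Dict[TopicPartition, int],
    committed_offsets: Dict[TopicPartition, Optional[int]],
) -> Dict[TopicPartition, int]:
    """Return per-partition lag: latest offset minus committed offset."""
    lag = {}
    for partition, end_offset in end_offsets.items():
        committed = committed_offsets.get(partition)
        if committed is None:
            # Simplifying assumption: with no committed offset, treat the
            # whole partition as unread (log assumed to start at offset 0).
            committed = 0
        # Clamp at zero in case the two snapshots were taken at different times.
        lag[partition] = max(end_offset - committed, 0)
    return lag


# Hypothetical offsets for a three-partition topic named "orders".
end_offsets = {("orders", 0): 1500, ("orders", 1): 1480, ("orders", 2): 1510}
committed = {("orders", 0): 1500, ("orders", 1): 1400}

lag_by_partition = consumer_group_lag(end_offsets, committed)
total_lag = sum(lag_by_partition.values())
print(lag_by_partition, total_lag)
```

A lag figure that keeps growing over time, rather than a single large value, is usually the signal worth alerting on.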

#2 Response times

For priority 1 issues, 30 minutes should be the guaranteed response time. Response times for non-emergency issues are also important. You don’t want to wait hours or days for input when you are working on a project or your Kafka cluster needs attention.

Check lower-priority response times as well when choosing a provider. For instance, at Dattell the longest possible wait for a non-production issue is 3 hours. That’s the longest our clients ever wait for a response from their dedicated engineer, no matter how trivial the issue.

#3 Dedicated engineer

Will the managed Kafka provider connect you with a dedicated engineer whom you can work with directly every day? That engineer should understand your use case, your team, and your business needs, and should be the first to respond in the event of a critical issue or downtime. Or will the provider connect you with whichever Kafka engineer happens to be available at the time you need help?

Consider which approach would provide better, faster, more personalized service.

#4 Uptime guarantee

Kafka should be the most reliable component of your data architecture.  That’s what makes it so valuable.  The goal of any managed service provider is to provide 100% uptime.  

Now, in the real world, issues can arise. What you want to ensure is that your contract includes a guarantee: if Kafka is down for more than an agreed amount of time, the provider compensates you for that downtime. You want your provider monetarily incentivized to keep Kafka running because that drives preventative maintenance, monitoring, alerting, upgrades, and all of the other work that prevents downtime.

Uptime guarantees are typically provided as percentages. Let’s translate those into time to better understand them. If a provider guarantees 99.95% uptime, there is no repercussion to the provider for up to about 22 minutes of downtime per month.

At Dattell, we guarantee 99.999% uptime, which corresponds to less than one minute of downtime per month before we are monetarily on the hook for that downtime. We work extremely hard at preventative maintenance, monitoring, and all of the important work that keeps Kafka running smoothly.
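The percentage-to-time translation above is simple arithmetic, sketched below. The function name is hypothetical, and a 30-day month is assumed; a real contract may define the measurement window differently.

```python
def allowed_downtime_minutes(uptime_percent: float, days_in_month: int = 30) -> float:
    """Minutes of downtime per month that an uptime guarantee still permits."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    minutes_in_month = days_in_month * 24 * 60  # 43,200 for a 30-day month
    return minutes_in_month * (100 - uptime_percent) / 100


# Compare a few common guarantee levels.
for sla in (99.9, 99.95, 99.99, 99.999):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% uptime allows {minutes:.2f} minutes of downtime per month")
```

Under that assumption, 99.95% works out to 21.6 minutes per month, and 99.999% to roughly 0.43 minutes, or about 26 seconds.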

#5 Cluster security

Choose a managed Kafka provider that will address all of your security goals.  Encryption, authorization, and authentication should all be standard.
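As a rough illustration of what encryption and authentication look like from the client side, the sketch below builds a client configuration that uses TLS for encryption in transit and SASL/SCRAM for authentication. It assumes a client based on librdkafka (for example the confluent-kafka Python package), whose property names are used as dictionary keys. The function name, broker address, credentials, and certificate path are hypothetical placeholders. Authorization is not shown because it is typically enforced on the brokers through ACLs rather than in client configuration.

```python
from typing import Dict


def secure_client_config(
    bootstrap_servers: str, username: str, password: str, ca_path: str
) -> Dict[str, str]:
    """Build a client config with TLS encryption and SASL/SCRAM authentication."""
    return {
        "bootstrap.servers": bootstrap_servers,
        # SASL_SSL = authenticate with SASL over a TLS-encrypted connection.
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "SCRAM-SHA-512",
        "sasl.username": username,
        "sasl.password": password,
        # CA certificate used to verify the brokers' TLS certificates.
        "ssl.ca.location": ca_path,
    }


# Hypothetical values; in practice load secrets from a secret store, not source code.
config = secure_client_config(
    "broker1.example.internal:9093",
    "analytics-service",
    "change-me",
    "/etc/kafka/ca.pem",
)
print(sorted(config))
```

With confluent-kafka installed, a dictionary like this could be passed to `confluent_kafka.Producer(config)` or merged into a consumer configuration.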

#6 Hosted or on your environment

Running Kafka in your environment offers many benefits over third-party hosted Kafka. Several providers, including Dattell, offer fully managed Kafka on your environment, whether on-prem or in the cloud.

We will talk about latency and pricing concerns in the sections that follow.  Here we want to focus on data security and ownership over your Kafka implementation.

There is less risk exposure when Kafka is running in your environment. When you choose hosted Kafka, your company’s data, one of its most sensitive assets, is handed over to a third party. This third party now makes security decisions, and you lose oversight and overall control. Additionally, an unknown number of their employees and contractors could have access to your data and your Kafka implementation, opening up a number of risks from data leaks to system downtime.

From a security standpoint, it’s best to have Kafka managed in your environment where you control who exactly has access to the VPN or cloud to interact with the cluster(s).

You also lose ownership of your Kafka implementation when you use a hosted provider. With a hosted Kafka offering, you do not own any piece of your Kafka implementation. If the host changes their service, pricing, security, response times, or anything else that affects your clusters, you might be able to walk away from the provider, but you’re walking away empty-handed. You need to start from scratch somewhere else.

Managed Kafka on your environment, in contrast, is Kafka built in your environment.  If you decide to part ways with your service provider, then you still have a Kafka cluster, one you own, and can either manage in-house or choose another managed Kafka provider.

#7 Latency considerations

If you run Kafka in your environment, alongside your infrastructure, then you can expect tens of milliseconds of latency. If you choose a third-party hosted option, then latency will increase to hundreds or even thousands of milliseconds.

#8 Cost structure 

Consider how the managed Kafka provider is making money and how they lose money. 

Firstly, as mentioned above, you want them to lose money for downtime.  They need skin in the game to incentivize preventative maintenance, manual configuration, and real-time monitoring to prevent downtime. 

Secondly, if they are hosting your data, then they are going to make more money if they wildly overprovision your cluster. Hosted Kafka may be the right path forward for you, but if you go that route, keep an eye out to avoid being overcharged.

Thirdly, look for a fixed-rate approach. You want to avoid being charged extra over the length of the contract. Variable charges make it hard to budget, and costs can become much higher unexpectedly.

Final Thoughts

Depending on your use case, data volume, team size, and other considerations, some of the factors discussed in this post will be more or less relevant. That’s why it’s good there are several different providers to choose from.

If you’re interested in learning more about Dattell’s Managed Kafka service, check out our Managed Kafka product page.

Have Kafka Questions?

Managed Kafka on your environment with 24/7 support.

Consulting support to implement, troubleshoot, and optimize Kafka.

Schedule a call with a Kafka solution architect.

Published by

Dattell - Kafka & Elasticsearch Support

Benefit from the experience of our Kafka, Pulsar, Elasticsearch, and OpenSearch expert services to help your team deploy and maintain high-performance platforms that scale. We support Kafka, Elasticsearch, and OpenSearch both on-prem and in the cloud, whether on standalone clusters or running within Kubernetes. We’ve saved our clients $100M+ over the past six years. Without our guidance, companies tend to overspend on hardware or purchase unnecessary licenses. We typically save clients several times more money than our fees cost, in addition to building, optimizing, and supporting fault-tolerant, highly available architectures.