There are several message queue programs to choose from: Kafka, RabbitMQ, ActiveMQ, ZeroMQ, Redis, among others. In this post we discuss the primary factors to consider when choosing a message broker, and we will focus on two of the most popular choices: Kafka and RabbitMQ.
Kafka is ideal for handling large volumes of homogeneous messages, such as logs or metrics, and it is the right choice for use cases with high throughput.
Kafka doesn’t come prepacked with a friendly graphical user interface. However, it is possible to create one using Kibana. See our post on monitoring Kafka for examples of what Kafka monitoring looks like with Kibana.
Replay Messages. One of the key highlights of Kafka is that it allows the user to replay messages. This is especially important for analytical use cases. For instance, if you are tracking device data for internet of things (IoT) sensors and discover an issue with your database not storing all data, then you can replay data to replace the missing information in the database.
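Conceptually, replay is possible because Kafka stores messages in an append-only log and each consumer tracks its own read offset, which it can rewind at any time. The following toy in-memory model (not the Kafka API, just an illustration of the idea) shows how rewinding the offset re-delivers old messages:

```python
class ToyLog:
    """Toy append-only log standing in for a Kafka topic partition."""
    def __init__(self):
        self.messages = []

    def append(self, msg):
        self.messages.append(msg)


class ToyConsumer:
    """The consumer, not the broker, tracks the read position,
    so rewinding it replays previously delivered messages."""
    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self):
        batch = self.log.messages[self.offset:]
        self.offset = len(self.log.messages)
        return batch

    def seek(self, offset):
        # Same spirit as a real Kafka consumer's seek(): reset the position.
        self.offset = offset


log = ToyLog()
for reading in ("t=1 temp=20.5", "t=2 temp=20.7", "t=3 temp=21.0"):
    log.append(reading)

consumer = ToyConsumer(log)
first_pass = consumer.poll()   # normal consumption
consumer.seek(0)               # discover the database is missing rows...
replayed = consumer.poll()     # ...and replay the stream from the start
```

In a real deployment the same effect is achieved by seeking a consumer back to an earlier offset (or to the beginning of the topic), within the topic's retention window.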
Big Data Consideration. Kafka also pairs well with big data systems such as Elasticsearch and Hadoop. We serve clients across industries whose throughput requirements are best suited to a Kafka message broker paired with a high-capacity backend. The two systems work well together for scalable data collection and storage.
Scaling Capability. Kafka allows topics to be split into partitions. This feature is important because it allows multiple consumers to handle a portion of the stream and makes horizontal scaling possible.
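The mechanism behind this is key-based partition assignment: messages with the same key always land in the same partition, preserving per-key ordering, while different keys spread across partitions that separate consumers can work in parallel. A simplified sketch (Kafka's default partitioner actually uses murmur2 hashing; CRC32 is used here only for illustration):

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition deterministically.
    Simplified stand-in for Kafka's default partitioner."""
    return zlib.crc32(key) % num_partitions


# The same key always maps to the same partition, so all readings
# from one IoT sensor stay ordered, while many sensors spread the
# load across partitions (and therefore across consumers).
p1 = partition_for(b"sensor-42", 6)
p2 = partition_for(b"sensor-42", 6)
```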
Batches. Kafka works best when messages are batched. In other words, its performance improves when it receives fewer, larger batches of data rather than many small messages. We recommend a minimum batch size of 100 bytes, and preferably 1-10 kilobytes. Larger batches add a small amount of latency, typically a few milliseconds depending on your implementation.
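The trade-off above is essentially what a Kafka producer's batch.size and linger.ms settings control: records accumulate in a buffer and are sent as one larger request. A toy buffer (illustration only, not producer internals) makes the behavior concrete:

```python
class BatchBuffer:
    """Toy producer-side buffer: accumulate records and flush them
    as one larger request once a size threshold is reached."""
    def __init__(self, max_bytes=10_240):
        self.max_bytes = max_bytes
        self.buffer = []
        self.buffered_bytes = 0
        self.flushed = []  # stands in for network sends to the broker

    def add(self, record: bytes):
        self.buffer.append(record)
        self.buffered_bytes += len(record)
        if self.buffered_bytes >= self.max_bytes:
            self.flush()

    def flush(self):
        if self.buffer:
            self.flushed.append(b"".join(self.buffer))  # one larger send
            self.buffer, self.buffered_bytes = [], 0


buf = BatchBuffer(max_bytes=1024)  # ~1 KB batches
for _ in range(1000):
    buf.add(b"x" * 100)  # 100-byte records
buf.flush()  # flush the tail, as a linger.ms expiry would
```

One thousand small records are delivered in far fewer network round trips, at the cost of the few milliseconds each record may wait in the buffer.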
Security. Earlier versions of Kafka did not include built-in security features, but that changed with Apache Kafka 0.9. Version 0.9 includes four primary security features. 1) Administrators have the ability to require client authentication through either Transport Layer Security (TLS) or Kerberos. 2) Administrators can use a Unix-like permissions system to control user access to data. 3) Messages can be securely sent across networks with encryption. 4) Administrators can now require authentication for communication between Kafka brokers and ZooKeeper.
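On the client side, these features surface as a handful of connection settings. The sketch below uses parameter names from the kafka-python client; the host and file paths are placeholders you would replace for your own cluster:

```python
# Hypothetical client settings exercising the security features above:
# TLS encryption on the wire, Kerberos (GSSAPI) authentication, and
# certificate verification of the broker. Placeholder values throughout.
secure_client_config = {
    "bootstrap_servers": "broker.example.com:9093",
    "security_protocol": "SASL_SSL",    # encrypt traffic + authenticate clients
    "sasl_mechanism": "GSSAPI",         # Kerberos authentication
    "ssl_cafile": "/etc/kafka/ca.pem",  # CA used to verify the broker's cert
}
```

These keys would typically be passed straight to a producer or consumer constructor; authorization (the Unix-like ACLs) is configured on the broker side rather than in the client.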
RabbitMQ can be a good choice for use cases with lower throughput. RabbitMQ is slower than Kafka but provides a capable REST API as well as an approachable graphical user interface (GUI) for monitoring queues.
Apache Cassandra is often used alongside RabbitMQ when administrators want access to stream history.
Discrete Messaging. RabbitMQ allows users to set sophisticated rules for message delivery. Options include security and conditional routing, among others.
Multiprotocol Support. RabbitMQ was originally built as an AMQP broker, and it also supports STOMP, MQTT, WebSockets, and others.
Flexibility. RabbitMQ supports point-to-point, request/reply, and publish/subscribe messaging. It also allows for complex routing to consumers.
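Much of that routing power comes from topic exchanges, which match a message's dot-separated routing key against binding patterns: "*" matches exactly one word and "#" matches zero or more. A toy matcher (an illustration of the matching rules, not the broker's implementation) shows how bindings select messages:

```python
import re


def topic_match(pattern: str, routing_key: str) -> bool:
    """Toy matcher for AMQP topic-exchange bindings:
    '*' matches exactly one dot-separated word,
    '#' matches zero or more words."""
    regex = pattern.replace(".", r"\.").replace("*", "[^.]+")
    regex = regex.replace(r"#\.", r"(.+\.)?")   # leading/mid '#'
    regex = regex.replace(r"\.#", r"(\..+)?")   # trailing '.#'
    regex = regex.replace("#", ".*")            # bare '#'
    return re.fullmatch(regex, routing_key) is not None


# A consumer bound to "sensor.*.temp" receives temperature readings
# from every room; one bound to "logs.#" receives all log traffic.
a = topic_match("sensor.*.temp", "sensor.kitchen.temp")     # matches
b = topic_match("sensor.*.temp", "sensor.kitchen.humidity") # does not
c = topic_match("logs.#", "logs.error.db")                  # matches
```

In a real deployment you would declare the exchange and bindings through a client such as pika, and the broker performs this matching for you.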
Security. RabbitMQ has a history of strong authentication and expression-based authorization.
Redis is another message broker option. Because it operates entirely in memory, Redis is faster than even Kafka. It works best when the destination can consume data far faster than the data is generated.
To summarize, if you’re looking for a message broker to handle high throughput and provide access to stream history, Kafka is likely the better choice. If you have complex routing needs and want a built-in GUI to monitor the broker, then RabbitMQ might be best for your application.
Dattell’s engineers specialize in data architecture, including the design, implementation, and support of message brokers. Reach out to us with any unanswered questions or unique use cases.
Data consulting and implementation services from Dattell provide STRATEGY, ENGINEERING, and PERSPECTIVE to support your organization’s data projects. Our services include custom Data Architecture, Business Analytics, Operational Intelligence, Centralized Reporting, Automation, and Machine Learning. Dattell specializes in Apache Kafka and the Elastic Stack for reliable data collection, storage, and real-time display.