What is Kafka Streams?


Published September 2023
Traditional batch processing systems, albeit effective for fixed datasets, are often ill-suited for real-time, high-throughput scenarios. Kafka Streams enables organizations to act on data as it arrives, making it particularly useful for applications that require immediate response. Kafka Streams is a client library for building real-time applications and microservices. It’s a part …

Comparing OpenSearch and Google Cloud Search


Published September 2023
The choice of search platform can significantly impact business operations. As experts in high-throughput implementations, we’ve witnessed the rise of various search platforms, each with its unique offerings. Today, we’ll explore OpenSearch and Google Cloud Search in depth, drawing parallels and highlighting differences.
Introducing OpenSearch & Cloud Search
OpenSearch: A relatively new …

Optimizing Kafka Brokers: Lessons From Managing Fortune 500 Implementations


Published August 2023
Optimizing Kafka broker performance will have a direct impact on your overall Kafka implementation. In this blog post, we cover what a Kafka broker does, why it’s essential, and how to optimize Kafka broker settings. We also share firsthand experiences optimizing Kafka brokers from our work with Fortune 500 companies. We encourage …

OpenSearch Cluster Optimization for Small, Medium, and Large Clusters


Published May 2023
Small, medium, and large OpenSearch clusters require different approaches for optimization. Dattell’s engineers are experts at designing, optimizing, and maintaining OpenSearch. Find out more about our OpenSearch support services.
Optimizing a Small OpenSearch Cluster
The minimum number of nodes for a small, highly available OpenSearch cluster is three (3). Three nodes might …

Enterprise Managed OpenSearch


Updated August 2023
Anyone in charge of their company’s data pipeline has the following five priorities in mind: reliability, security, speed, cost, and ownership. In this article we discuss how enterprise managed OpenSearch provides peace of mind, especially having someone to call when a cluster fails in the middle of the night. And we …

OpenSearch Shard Optimization


Updated August 2023
Optimizing shard size is an important part of achieving maximum performance from your OpenSearch cluster. OpenSearch shards enable parallelization of data processing both within a single node and across multiple OpenSearch nodes. OpenSearch automatically manages the allocation of shards across the nodes. However, choosing the number of shards needed is up to …

Vector Search for OpenSearch


Updated August 2023
OpenSearch includes a plugin for vector search. In this post, we introduce vector search and compare the different methods available. We will also point you in the right direction for example code. For personalized help, contact us to learn more about our OpenSearch support services.
What is vector search?
Here’s the …

Kafka vs Pulsar


Updated August 2023
Pulsar and Kafka achieve the same result. They both guarantee messages reach their intended destination(s). Yet, there are important differences between the two message queues. These differences can make one of the technologies a better fit, depending on your use case. In this post we cover 8 ways in which Apache Kafka …

Preparing for a Cloud Outage


Published August 2022
Nearly all of our clients and a majority of companies are using the cloud for at least a portion of their infrastructure. It’s important for companies to plan for cloud outages to minimize the damage they cause. In this post we will cover how to minimize damage and recover quickly after …

OpenSearch vs. Elasticsearch


Updated January 2023
With OpenSearch originating as a fork of Elasticsearch, the two can appear near-identical to the unacquainted. However, they are distinct, and become more so with each new update. Here we will discuss how the two search engines compare when it comes to security, licensing, core features, documentation, community support, dashboards, …

Elasticsearch Support Services FAQ


Published July 2022
Our team of engineers has been architecting, optimizing, and managing Elasticsearch for over 6 years. We’ve found that new clients often share the same questions about Elasticsearch support services. Below are a few of the most common questions they ask when they reach out. Let us …

Data Engineering Study


Published June 27, 2022
Data engineering is the field dedicated to building data infrastructure to ingest, process, and store large amounts of data. This is a quickly growing field, with both the number of jobs in data engineering and the number of tools on the market steadily increasing. Despite the popularity of data engineering as …

What is a Virtual CIO?


Published June 2022
Virtual CIOs provide the leadership and expertise to build, grow, and maintain reliable data architecture. They are often hired by midsized companies that are looking for a trusted authority to drive data architecture and the supporting team. Virtual CIOs are also referred to as vCIOs, fractional CIOs, part-time CIOs, and CIOs for …

What is OpenSearch?


Updated May 2022
OpenSearch is open source search and analytics software. It’s a community-led project with Amazon Web Services (AWS) leading the development. It was first created as a fork of Elasticsearch 7.10.2 and Kibana 7.10.2 in 2021. The OpenSearch search engine is simply referred to as OpenSearch, and the dashboard tool is …

Elasticsearch Basics: What it is, Licensing, Languages, and Getting Help


Updated March 2023
Elasticsearch is a distributed search and analytics engine. It is built on top of Apache Lucene. Elasticsearch was first released in 2010 by the company now known as Elastic. It was originally completely open source, but license changes have limited its usage. More on that below. Elasticsearch is part of a group …

4 Approaches to Data Backup


We outline the four primary ways of backing up data, along with their benefits and drawbacks, to help you decide which approach best meets your company’s needs.