
OpenSearch Cluster Optimization for Small, Medium, and Large Clusters

Published May 2023

Small, medium, and large OpenSearch clusters require different approaches for optimization.

Dattell’s engineers are experts at designing, optimizing, and maintaining OpenSearch. Find out more about our OpenSearch support services.

Optimizing a Small OpenSearch Cluster

The minimum number of nodes for a small, highly available OpenSearch cluster is three (3).

Three nodes might seem like one more than you need. After all, only two nodes are needed for fault tolerance.

However, two nodes aren’t enough for a distributed system.

Distributed systems, like OpenSearch, have the potential for a split-brain. The split-brain problem occurs when two different nodes both claim to be the master node.

Suppose that in a two-node system one node (Node A) drops out, and the second node (Node B) continues to ingest data. A split-brain occurs when Node A comes back online and claims to be the master, even though Node B holds the correct, up-to-date data.

In the two-node scenario, both Node A and Node B have 50% voting rights as to which node has the correct data.

A split-brain is avoided by adding a third node and requiring a quorum: at least two of the three nodes must agree on which node is the master.
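A minimal three-node setup can be sketched in opensearch.yml. The node names and hostnames below are illustrative, not prescriptive; note that recent OpenSearch releases call the master the "cluster manager":

```yaml
# opensearch.yml — sketch for one node (node-a) of a hypothetical three-node cluster.
cluster.name: my-cluster
node.name: node-a

# All three nodes hold data and are eligible for election as cluster manager.
node.roles: [ cluster_manager, data ]

# Seed hosts let the nodes discover each other.
discovery.seed_hosts: ["node-a.example.com", "node-b.example.com", "node-c.example.com"]

# Bootstrap the first election with all three manager-eligible nodes;
# a quorum of two out of three must agree, which prevents split-brain.
cluster.initial_cluster_manager_nodes: ["node-a", "node-b", "node-c"]
```

The other two nodes would carry the same settings with their own `node.name` values.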

Optimizing a Medium OpenSearch Cluster

As cluster size grows to around 10 nodes, it’s time to consider dedicating nodes to specific tasks. This approach allows the nodes to be optimized for their particular task.

Data Nodes (Hot/Warm)

Data nodes are optimized for storage space and search with less compute power. Data nodes can be further split into hot nodes and warm nodes. Hot nodes run on the fastest available hardware. Warm nodes run on cheaper hardware, such as spinning disks.
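One common way to separate hot and warm data nodes is a custom node attribute. The attribute name `temp` and its values below are a convention for illustration, not a built-in setting:

```yaml
# opensearch.yml — hot vs. warm data nodes via a custom node attribute.

# On a hot node (fast SSDs):
node.roles: [ data ]
node.attr.temp: hot

# On a warm node (cheaper spinning disks), the same two lines
# would instead read:
#   node.roles: [ data ]
#   node.attr.temp: warm
```

New indexes can then be pinned to the hot tier with the index setting `index.routing.allocation.require.temp: hot`, and later moved to warm nodes, for example by an Index State Management policy.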

Master Nodes

Master nodes are important for maintaining a consistent view of the cluster. Typically, you want to have three (3) master nodes. Scale the RAM on the master nodes to keep pace with the growth of the cluster, because every shard consumes a certain amount of memory in the cluster state that the master node(s) maintain. Learn more about OpenSearch shard optimization.
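A dedicated master node carries a single role so it stays free of data and ingest work; in current OpenSearch releases the role is named `cluster_manager`:

```yaml
# opensearch.yml — dedicated master ("cluster manager") node.
node.roles: [ cluster_manager ]
```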

Coordinator Nodes

Coordinator nodes route queries, fan their execution out across the data nodes, and merge the results, acting as load balancers for search traffic. Coordinator nodes can improve performance by reducing the burden on data nodes.
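A coordinating-only node is configured with an empty roles list:

```yaml
# opensearch.yml — coordinating-only node: an empty roles list means the
# node holds no data and is not master-eligible; it only routes requests
# and merges results from the data nodes.
node.roles: []
```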

Ingest Nodes

Ingest nodes can be used like a “lightweight” Logstash/Filebeat/Fluentd/etc. to run ETL transformations on newly indexed data.
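A dedicated ingest node is declared with the `ingest` role:

```yaml
# opensearch.yml — dedicated ingest node.
node.roles: [ ingest ]
```

An ingest pipeline, created through the `_ingest/pipeline` API, can then apply processors such as `rename`, `set`, or `grok` to documents before they are indexed, covering many of the transformations you might otherwise run in Logstash or Fluentd.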

Machine Learning Nodes

Machine learning nodes will be the most compute-heavy of any of the node types.
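With the ML Commons plugin, machine learning workloads can be isolated on their own nodes via the `ml` role (a sketch; check the plugin documentation for your version):

```yaml
# opensearch.yml — dedicated machine learning node for the ML Commons plugin.
node.roles: [ ml ]
```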

Optimizing a Large OpenSearch Cluster

Large clusters, in the range of 40 nodes, expose more teams to shared failures and are harder to manage.

Let’s imagine a situation where 20 teams are logging data in a single OpenSearch cluster. If one team sends bad data to that cluster, then all 20 teams will be affected when the failure occurs.

Another concern with these larger clusters is the shard count. Beyond roughly 400 TB of data, the shard count generally starts to introduce lag to the cluster, at which point you will want to start performance testing.

Splitting up the large cluster into a pair of smaller clusters will be the next step. Rest assured that you won’t lose searchability with multiple clusters. The security plugin that comes standard with your OpenSearch download includes cross-cluster search.
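Registering a remote cluster for cross-cluster search is a settings update on the local cluster. The cluster alias `cluster-b` and the hostname below are hypothetical:

```yaml
# Sketch of the request body for `PUT _cluster/settings` on the local
# cluster, registering a hypothetical remote cluster named "cluster-b".
persistent:
  cluster.remote:
    cluster-b:
      seeds:
        - node-1.cluster-b.example.com:9300
```

A single search can then span both clusters by targeting local and remote indexes together, e.g. `logs-*,cluster-b:logs-*`.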

Have OpenSearch Questions?

Managed OpenSearch in your environment with 24/7 support.

Consulting support to implement, troubleshoot, and optimize OpenSearch.

Schedule a call with an OpenSearch solution architect.

Published by

Dattell - Kafka & Elasticsearch Support

Benefit from the experience of our Kafka, Pulsar, Elasticsearch, and OpenSearch expert services to help your team deploy and maintain high-performance platforms that scale. We support Kafka, Elasticsearch, and OpenSearch both on-prem and in the cloud, whether on standalone clusters or running within Kubernetes. We’ve saved our clients $100M+ over the past six years. Without our guidance, companies tend to overspend on hardware or purchase unnecessary licenses. We typically save clients multiples more than our fees cost, in addition to building, optimizing, and supporting fault-tolerant, highly available architectures.