Elasticsearch Optimization for Small, Medium, and Large Clusters

The way nodes are organized in an Elasticsearch cluster changes depending on the size of the cluster.  For small, medium, and large Elasticsearch clusters there will be different approaches for optimization.

Dattell’s team of engineers is expert at designing, optimizing, and maintaining Elasticsearch implementations and supporting technologies.  Find out more about our Elasticsearch services here.

Optimizing a Small Elasticsearch Cluster

The most important thing to keep in mind when creating a small Elasticsearch cluster is that the minimum number of nodes is three (3).  At first, three might seem like one more than you need. After all, two nodes are all you need for fault tolerance.  However, two nodes aren’t enough for a distributed system that has to agree on a master by majority vote.

Distributed systems, like Elasticsearch, have the potential for a split brain problem, that is, two different nodes both claiming to be the master.  This can happen if two nodes are separated from one another, i.e. a partition occurs.


If one node (Node A) drops out of a two-node system, the second node (Node B) continues to ingest data.  The split brain occurs when Node A comes back online and claims to be the master while Node B holds the correct data. In the two-node scenario, Node A and Node B each have 50% of the vote on which one has the correct data, so neither can win.

In order to determine which data set is correct, a majority vote must occur.  If you have a third node and require at least two nodes to vote for a master, you eliminate the possibility of a split brain.
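The majority-vote arithmetic can be sketched in a few lines of Python. This is only an illustration of the counting argument above, not Elasticsearch’s actual election code, and the function names are hypothetical.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many nodes can be lost while a majority can still vote."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


if __name__ == "__main__":
    for nodes in (2, 3):
        print(
            f"{nodes} nodes: quorum={quorum(nodes)}, "
            f"tolerated failures={tolerated_failures(nodes)}"
        )
```

With two nodes the quorum is two, so losing either node leaves no majority. With three nodes the quorum is still two, so the cluster can lose one node and still elect a master.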

Optimizing a Medium-Sized Elasticsearch Cluster

As your cluster grows to about 8 to 12 nodes, it’s time to consider dedicating nodes to specific tasks. Doing this allows you to optimize each node for the task it performs.

Data Nodes (Hot/Warm) – Data nodes will be optimized for storage space and search with less compute power.  Further, data nodes can be split into hot nodes which are run on your fastest hardware and warm nodes that can be run on cheaper hardware like spinning disks.

Master Nodes – These nodes are important for maintaining a consistent view of the cluster. Typically you want to have three (3) master nodes.  Make sure to scale the RAM for the dedicated master nodes to keep pace with the growth of the cluster, because every shard takes up some memory on the master node(s).

Coordinator Nodes – Coordinator nodes can improve performance by reducing the burden on data nodes.  They are aptly named: they coordinate the execution of queries and load balance those queries across the cluster.

Ingest Nodes – Ingest nodes can be used like a “lightweight” Logstash to run ETL transformations on newly indexed data.

Machine Learning Nodes – These nodes will be the most compute heavy of any of the node types.
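As a rough illustration of dedicating nodes to tasks, the Python sketch below prints the node.roles line that would go in each node’s elasticsearch.yml. It assumes an Elasticsearch version that supports the node.roles setting (older releases use separate boolean settings instead), and the tier names on the left are hypothetical labels chosen for this example.

```python
# Hypothetical tier labels mapped to Elasticsearch node roles.
# An empty role list makes a node coordinating-only.
NODE_ROLES = {
    "master": ["master"],
    "hot_data": ["data_hot"],
    "warm_data": ["data_warm"],
    "coordinator": [],
    "ingest": ["ingest"],
    "machine_learning": ["ml"],
}


def roles_line(tier: str) -> str:
    """Return the elasticsearch.yml line for the given tier label."""
    roles = NODE_ROLES[tier]
    return f"node.roles: [{', '.join(roles)}]"


if __name__ == "__main__":
    for tier in NODE_ROLES:
        print(f"{tier:>16}  ->  {roles_line(tier)}")
```

Each node type then gets hardware sized for its job: fast disks for hot data, cheaper disks for warm data, and more CPU for machine learning.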

Optimizing a Large Elasticsearch Cluster

With medium-to-large clusters (in the range of 100 nodes), you will want to start performance testing.  Tools like Rally allow you to measure things like your ideal shard size, indexing time, CPU usage, and min/max throughput.

Once you reach a triple-digit number of nodes, the cluster gets harder to manage.  For instance, if you have 30 teams logging data to a single Elasticsearch cluster and one team sends bad data to that cluster, then all 30 teams will be affected when the failure occurs.

Furthermore, Elasticsearch clusters don’t scale well past 400 TB because at that point the shard count generally starts to introduce lag to the cluster.
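A back-of-the-envelope estimate shows how quickly the shard count grows at that scale. The sketch below assumes a target shard size of 50 GB and one replica per primary shard. Both values are illustrative assumptions, not figures from this article or a sizing recommendation.

```python
import math


def estimate_shards(total_tb: float, shard_size_gb: float, replicas: int) -> int:
    """Estimate total shards (primaries plus replicas) for a data volume.

    total_tb is the primary data volume in terabytes (1 TB = 1024 GB here).
    """
    primaries = math.ceil(total_tb * 1024 / shard_size_gb)
    return primaries * (1 + replicas)


if __name__ == "__main__":
    # Assumed inputs: 400 TB of primary data, 50 GB shards, 1 replica.
    print(estimate_shards(total_tb=400, shard_size_gb=50, replicas=1))
```

Under these assumptions the cluster would hold more than sixteen thousand shards, all of which the master nodes have to track.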

At this point you might want to consider how you can split up a large cluster into multiple smaller clusters.  If you are concerned about being able to search information across clusters when you split up your nodes, fret not. Cross-cluster search, which replaced tribe nodes, allows you to search across multiple Elasticsearch clusters.
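In a cross-cluster search request, each remote index is addressed as a cluster alias and an index name joined by a colon. The Python sketch below builds such a search path. The cluster aliases and index pattern are hypothetical, and the helper only assembles a string. It does not send a request or configure the remote clusters.

```python
from typing import Dict, List


def ccs_search_path(targets: Dict[str, List[str]]) -> str:
    """Build a _search path spanning indices on several remote clusters.

    targets maps a remote cluster alias to the index names or patterns
    to search on that cluster.
    """
    expressions = [
        f"{alias}:{index}"
        for alias, indices in targets.items()
        for index in indices
    ]
    return "/" + ",".join(expressions) + "/_search"


if __name__ == "__main__":
    # Hypothetical aliases for two smaller clusters split from one large one.
    print(ccs_search_path({"cluster_one": ["logs-*"], "cluster_two": ["logs-*"]}))
```

A single query sent to that path on one cluster would then cover the matching indices on both remote clusters, assuming they have been registered under those aliases.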

Dattell’s team of engineers is expert at designing, optimizing, and maintaining Elasticsearch implementations and supporting technologies.  Check out our Elasticsearch Consulting page to learn more about our services.


