Published July 2022
With companies revisiting their budgets to brace for a possible recession, now is the time to review your data storage costs and find places to reduce those fees without sacrificing performance. In this article we consolidate our top tips for saving money on data storage costs.
From the top, we want to highlight that the biggest, fastest cost savings for most companies come from moving off of paid, licensed database tools and onto an open source option such as OpenSearch or free Elasticsearch. We’ll talk more about that in Tip 5 below.
5 Ways to Save Money on Data Storage
#1 Be deliberate with performance tiers
Cloud storage providers offer multiple tiers for data storage. For instance, Google Cloud Platform offers Standard, Nearline, Coldline, and Archive. Standard storage is for data that is accessed regularly, i.e., hot data. Nearline is for data accessed less actively, in the range of once a month or less. Coldline is for data accessed roughly once a quarter or less, and Archive is for data accessed less than once a year.
The storage tiers are priced differently per GB stored, with the “hot data” or Standard tier being the most expensive.
The tradeoff is that pulling data from the Standard tier is free, while pulling data from the cooler tiers incurs a fee. Good tracking of your data usage shows where the lower storage price of a cooler tier outweighs the fee charged for each data pull.
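To make that tradeoff concrete, here is a minimal sketch of the break-even arithmetic. The per-GB prices are placeholders we made up for illustration, not any provider's actual rates, and the function names are our own; substitute the figures from your provider's current price sheet.

```python
def monthly_cost(stored_gb, retrieved_gb, storage_price_per_gb, retrieval_price_per_gb):
    """Monthly cost of one tier: storage charge plus retrieval charge."""
    return stored_gb * storage_price_per_gb + retrieved_gb * retrieval_price_per_gb


def break_even_retrieval_gb(stored_gb, hot_storage_price, cool_storage_price, cool_retrieval_price):
    """GB retrieved per month at which the cooler tier stops being cheaper."""
    storage_savings = stored_gb * (hot_storage_price - cool_storage_price)
    return storage_savings / cool_retrieval_price


# Placeholder prices in dollars per GB per month (assumptions, not real rates).
HOT_STORAGE = 0.020
COOL_STORAGE = 0.010
COOL_RETRIEVAL = 0.010  # the hot tier is assumed to have no retrieval fee

stored = 10_000  # GB kept in storage
for retrieved in (1_000, 10_000, 20_000):
    hot = monthly_cost(stored, retrieved, HOT_STORAGE, 0.0)
    cool = monthly_cost(stored, retrieved, COOL_STORAGE, COOL_RETRIEVAL)
    print(f"{retrieved:>6} GB retrieved: hot ${hot:.2f}, cool ${cool:.2f}")

print("Break-even retrieval (GB/month):",
      round(break_even_retrieval_gb(stored, HOT_STORAGE, COOL_STORAGE, COOL_RETRIEVAL)))
```

If your tracked monthly retrieval volume sits well below the break-even figure, the cooler tier is likely the cheaper home for that data. Real bills can also include minimum storage durations and operation charges, which this sketch leaves out.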
Amazon S3 offers an “Intelligent-Tiering” service for a fee. The promise is that it automatically moves data to the most cost-effective tier depending on how it’s being accessed at a particular time.
We are wary of recommending clients pay for a service that they can run on their own for free. In our experience it’s better to set up a monitoring system on your own, or hire a contractor to do it for you and pay once.
#2 Keep an up-to-date users list
Providers can charge by the number of users that have access to the service. Check monthly to ensure that you aren’t paying for users that no longer need access to any particular service or tool.
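A monthly check like this can be partly automated. The sketch below assumes you can export each user's last access date from the service in question; the user names, dates, and the 30-day idle limit are all hypothetical.

```python
from datetime import date, timedelta


def stale_users(last_access, today, max_idle_days=30):
    """Return users whose last access is older than the idle limit, sorted by name."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(user for user, last_seen in last_access.items() if last_seen < cutoff)


# Hypothetical export: user name -> date of last access.
last_access = {
    "alice": date(2022, 7, 1),
    "bob": date(2022, 3, 15),
    "carol": date(2022, 5, 30),
}

print(stale_users(last_access, today=date(2022, 7, 15)))  # ['bob', 'carol']
```

The output is a list of candidates to review, not a list to delete automatically; some accounts are legitimately idle for long stretches.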
#3 Remove data redundancies
Sometimes companies are storing the same data in multiple places without even being aware of it. Here we aren’t talking about deliberate backups. Backups are important and worth investing money in.
Where we see opportunities for saving money on storage costs is where data is being inadvertently stored multiple times. For instance, if a stream pulls information about the last three days, every day, then two days of redundant data are collected daily. Additionally, if you have a database that stores user account information and another more general database, you could remove the user account information from the general database to save space.
Looking for and removing redundant data will reduce the amount of data storage needed and the related costs.
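The overlapping-stream case is easy to picture with a small sketch. Here we assume each record carries a unique "id" field (a hypothetical name) and that three daily pulls each cover the previous three days; only records that have not been seen before are kept.

```python
def deduplicate(batches, key="id"):
    """Keep the first copy of each record across overlapping batches."""
    seen = set()
    unique = []
    for batch in batches:
        for record in batch:
            if record[key] not in seen:
                seen.add(record[key])
                unique.append(record)
    return unique


def daily_pull(last_day, window=3):
    """Simulate a pull covering the `window` days ending on `last_day` (one record per day)."""
    return [{"id": f"day-{d}", "day": d} for d in range(last_day - window + 1, last_day + 1)]


# Pulls on days 3, 4, and 5 each cover three days, so they overlap heavily.
batches = [daily_pull(3), daily_pull(4), daily_pull(5)]

total = sum(len(batch) for batch in batches)
unique = deduplicate(batches)
print(f"{total} records pulled, {len(unique)} unique")  # 9 records pulled, 5 unique
```

Where the source allows it, narrowing the pull window so it covers only new data avoids collecting the duplicates in the first place.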
#4 Assign a designated auditor
Assign a team member to periodically review storage fees. This applies whether you use cloud storage, a data center, or a mix of the two. The idea is to make it someone’s specific responsibility to be aware of the monthly fees, look for jumps in fees, and consider possible cost-saving solutions.
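Looking for jumps in fees can start as a simple month-over-month comparison. The sketch below uses made-up billing totals and an arbitrary 20% threshold; both are assumptions to adjust for your own invoices.

```python
def fee_jumps(monthly_fees, threshold=0.20):
    """Return (month, fractional increase) for each month whose fee rose more than `threshold`."""
    jumps = []
    for (_, prev_fee), (month, fee) in zip(monthly_fees, monthly_fees[1:]):
        if prev_fee > 0:
            change = (fee - prev_fee) / prev_fee
            if change > threshold:
                jumps.append((month, round(change, 2)))
    return jumps


# Hypothetical monthly storage bills in dollars.
fees = [
    ("2022-03", 1000.0),
    ("2022-04", 1050.0),
    ("2022-05", 1400.0),
    ("2022-06", 1380.0),
]

print(fee_jumps(fees))  # [('2022-05', 0.33)]
```

A flagged month is a prompt for the auditor to ask what changed, such as a new data stream, a tier change, or added users.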
#5 Move to a free database tool
All of the above can make a dent in your fees, but the biggest payoff comes from moving off of a licensed database.
Companies are spending hundreds of thousands to millions of dollars a year in licensing fees for their database tool. These tools might include Splunk, Oracle, or Elasticsearch, among others.
These licensing fees cover only the right to use the software. On top of those fees there can also be additional costs for support and data storage.
There are a number of reasons why companies are paying for database tools. Three common reasons are:
(1) At first the licensed tool was free or cheap because the amount of data was small, but over time data collection and the accompanying fees have increased markedly.
(2) There is concern about staff turnover, and having an external party own or take responsibility for the data storage tool provides a sense of security.
(3) A misperception persists that paying a license for a database means the tool offers better reliability or performance.
Let’s talk about each of these reasons one at a time.
Cost of paid licenses
The first one is simple but hard. It can be difficult to really consider how a fee has increased over time. The flip side of this is that offering a way out of that fee can make you a company hero.
To give an example, we helped a client move off of paid Elasticsearch and onto the free version of Elasticsearch, saving them over $10M in licensing fees each year. Our Elasticsearch experts helped them to put the components in place to move off of the paid service and trained their team to build internal expertise. Our team also provides ongoing support for the client to run preventative maintenance, monitoring and alerting, and troubleshooting.
Our one-time fee is less than 5% of the amount they save yearly, allowing them to effectively reduce their database costs by 95%. The resulting system is optimized to meet the client’s exact needs, and is thus more efficient and performs better than the clusters they were using at the start of the project.
Wanting the consistency of an external solution
With tech turnover at nearly 4x the rate of other industries, the concern for consistency in data systems is an important one. For instance, Google reports engineers stay an average of just 1.1 years.
There are ways to achieve consistency that are much more cost effective than paying a license fee.
Hiring an external company to manage a free tool costs a fraction of what licensing fees for Splunk, Oracle, or Elasticsearch do, while providing that same consistency.
Further, if you choose a managed service provider that is on the smaller side, then you will get a more personalized experience. For instance, we’ve seen firsthand that if you aren’t a Fortune 100 company, you can have trouble getting fast access to support from the big companies, even the big companies that charge licensing fees.
Free tools offer equivalent performance
It’s a common misperception that the cost of a tool is a direct reflection of its value. This correlation between cost and value is ingrained in us, and it can be disorienting to consider that it might not be true in every instance.
When it comes to database tools, there are fantastic free tools whose performance and reliability are just as good as those of paid tools. The free, open source tool OpenSearch and the free version of Elasticsearch are our favorites for their high performance and reliability.
For instance, we build and manage database systems that handle 800k+ documents per second using the free, open source tool OpenSearch.
Summarizing Data Storage Savings
There you have it: there are a number of ways to reduce data storage costs, and the biggest change you can make is moving to a free database tool such as the free version of Elasticsearch or OpenSearch. Our engineers are available to talk with you more about whether moving to a free tool is a good option for your use case and can provide an estimate for the cost savings.