Whenever we start a search consulting project from scratch, the obvious question is: which search engine to use? We’ve talked about Elasticsearch vs Solr before, but here we’ll compare Elasticsearch with its fork, OpenSearch. Chances are, if you need to decide between the two, you’ll be looking at a few dimensions:
- Features. Which engine does the job better?
- Community. What about future versions?
- License and governance. Can I use it now and in the future?
- Ethics and principles. Because, let’s be honest, we’re not using vim for its capabilities.
As both engines offer tons of features, let’s break them down into categories: common, competing and diverging.
Common functionality is – and will likely be forever – what comes from Lucene: everything from indexing and merging documents to similarities and filter caches is exposed by both Elasticsearch and OpenSearch. As both upgrade to newer versions of Lucene, they will inherit the same improvements, bugfixes and trade-offs.
Besides Lucene, there’s common functionality coming from Elasticsearch 7.10.2, on which OpenSearch was originally based. However, as the two engines evolve separately, more and more of this shared code will be replaced by changes made independently in each project.
Competing functionality is what started this fork in the first place (more on that later). Functionality such as authentication and authorization, index management and alerting was traditionally proprietary in Elasticsearch, so OpenSearch has implemented open-source alternatives. For example, while Elasticsearch has Index Lifecycle Management, OpenSearch has Index State Management. By and large, they do the same thing; the difference is in the details. These details change frequently, so you might want to check the current state of the particular features you’re interested in. In general terms, Elasticsearch feels more rounded and mature. For example, at the time of writing this, searchable snapshots had been around in Elasticsearch for a long time, while in OpenSearch they looked a bit alpha. But not all the functionality in Elasticsearch is free, i.e. covered by the Basic license. For example, index lifecycle management is free, but cross-cluster replication is not. Elastic publishes a complete and detailed list of what each subscription level includes.
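To make "the difference is in the details" more concrete, here is a minimal sketch of roughly the same retention rule (roll over, then delete) written for each engine. The retention values and the description text are made up for illustration, and the request paths in the comments reflect my understanding of the two APIs, so check the current documentation before relying on them.

```python
import json

# Hypothetical retention values, chosen only for illustration.
ROLLOVER_AGE = "7d"
DELETE_AGE = "30d"

# Elasticsearch ILM: a fixed set of named phases.
# Assumed to be sent as the body of PUT _ilm/policy/<policy-name>.
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {"actions": {"rollover": {"max_age": ROLLOVER_AGE}}},
            "delete": {"min_age": DELETE_AGE, "actions": {"delete": {}}},
        }
    }
}

# OpenSearch ISM: an explicit state machine with user-named states.
# Assumed to be sent as the body of PUT _plugins/_ism/policies/<policy-name>.
ism_policy = {
    "policy": {
        "description": "Roll over, then delete old indices",
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [{"rollover": {"min_index_age": ROLLOVER_AGE}}],
                "transitions": [
                    {
                        "state_name": "delete",
                        "conditions": {"min_index_age": DELETE_AGE},
                    }
                ],
            },
            {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
        ],
    }
}

if __name__ == "__main__":
    # Print both bodies so they can be compared side by side.
    print(json.dumps(ilm_policy, indent=2))
    print(json.dumps(ism_policy, indent=2))
```

The two bodies express the same intent with different shapes: ILM nests actions under predefined phases, while ISM lists states and the transitions between them. The age conditions are not necessarily measured from the same starting point in both engines, which is exactly the kind of detail worth verifying for your version.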
Finally, both Elasticsearch and OpenSearch come up with functionality that isn’t replicated by the other. For example, Elasticsearch has Time series data streams while OpenSearch has (re)introduced segment replication. Nothing major so far, but as the two projects continue to evolve, we can expect more and more of this. My general feeling is that Elasticsearch tends to push harder for the logs (and other time series) use-cases. Meanwhile, OpenSearch still does that, but is likely to get more contributions in the enterprise search use-cases, since many of the Elasticsearch counterparts (e.g. those related to machine learning) are not free.
What does this mean to me?
At the time of writing this, there’s no major difference in functionality that would likely lead you to choose one over the other. But that might change in the future, as the two projects diverge. This also means that if you’re planning to migrate from Elasticsearch to OpenSearch (or the other way around), you’d want to do this sooner rather than later, because as time goes by you’ll find more missing or incompatible features.
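If you run both engines during a migration, tooling often needs to know which one it is talking to. Below is a minimal sketch that classifies the JSON returned by the cluster's root endpoint (`GET /`). It assumes that OpenSearch reports `"distribution": "opensearch"` inside the `version` object and that Elasticsearch has no such field; the function name and the trimmed sample responses are hypothetical.

```python
def detect_engine(root_info: dict) -> str:
    """Guess the engine from the parsed response of GET / on a cluster.

    Assumption: OpenSearch sets version.distribution to "opensearch",
    while Elasticsearch omits that field. Anything with a version number
    but no distribution is treated as Elasticsearch (a simplification).
    """
    version = root_info.get("version", {})
    if version.get("distribution") == "opensearch":
        return "opensearch"
    if "number" in version:
        return "elasticsearch"
    return "unknown"


# Trimmed, hypothetical sample responses for illustration.
opensearch_info = {"version": {"distribution": "opensearch", "number": "2.11.0"}}
elasticsearch_info = {"version": {"number": "8.11.0", "build_flavor": "default"}}

if __name__ == "__main__":
    print(detect_engine(opensearch_info))
    print(detect_engine(elasticsearch_info))
```

A check like this lets a script pick the right API path (for example, ILM versus ISM) at runtime instead of hard-coding it per environment.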
At first, Elasticsearch appears to have more commits and contributors:
OpenSearch “inherited” the pre-fork commits, but notice how the number of contributions dropped after the fork in 2021 and is now slowly increasing:
But it’s not comparing apples-to-apples, because Elasticsearch’s GitHub repository includes X-Pack (with all plugins like SQL or machine learning), while for OpenSearch these are separate repositories.
Similarly, it’s hard to compare the number of forum posts: Elastic has one category for everything ELK-related, while OpenSearch has different categories for e.g. OpenSearch vs OpenSearch Dashboards (the fork of Kibana). If we add the numbers up, Elastic’s is about 5 times higher – still not a 100% accurate comparison, but the difference is clear enough.
Google Trends suggests that interest in Elasticsearch has been slowly dropping over the last year, while interest in OpenSearch has been slowly increasing. So maybe all the absolute numbers will get closer in the future:
What does this mean to me?
Both Elasticsearch and OpenSearch have active communities, so I wouldn’t worry about one of them dying or being poorly supported anytime soon. It’s just that, if you’re an OpenSearch user, you’re likely to refer to the (more complete) Elasticsearch documentation and its more diverse discussions to understand how things work. At least in the short term.
License and Governance
OpenSearch is Apache-licensed, which means that you can pretty much use it as you wish. Elasticsearch, on the other hand, is for the most part on the Elastic License, which implies that you can’t provide a substantial set of its functionality to others as a service.
In terms of governance, you can see the code and contribute to both OpenSearch and Elasticsearch in pretty much the same way. OpenSearch is stewarded by AWS, much like Elasticsearch is by Elastic. But the license difference also means that OpenSearch is more appealing to outside contributors.
What does this mean to me?
If your search engine’s functionality isn’t exposed to the outside world, as in the case of an E-commerce website, the license makes no practical difference. The exception is if your company has policies on which licenses you can use: OpenSearch’s Apache license will likely get an OK, which may not be true for Elasticsearch’s Elastic License.
If you want to build a SaaS that covers search in a significant way – say, a SaaS for searching in E-commerce sites – then OpenSearch is the safer choice (of course, a lawyer will know better).
Last but not least, if open governance is important to you, then have a look at Apache Solr. We already have a post on Elasticsearch vs Solr and one about OpenSearch vs Solr will come soon.
Ethics and Principles
So far, I have seen two extreme narratives. One is that Amazon is evil and abused the Elasticsearch trademark, while Elastic is good and stood up for the open-source community, forcing AWS to actually maintain the software on offer. The other is that Elastic is evil, abusing the trust of users and contributors alike, while AWS saved the day by forking Elasticsearch and continuing where Elastic failed to deliver.
A more nuanced point of view is that both companies had (and have) good intentions in contributing to the community, as well as their own interests, which sometimes conflicted with the views of many. Here’s a brief timeline so you can form your own judgements:
- 2010: Elasticsearch was released and Apache-licensed, written by Shay Banon.
- 2012: A company was founded around Elasticsearch, offering services (support, training, consulting) as well as continuing and stewarding the development of Elasticsearch, soon joined by Logstash and Kibana (forming ELK).
- 2014: Elastic (then Elasticsearch) started launching commercial addons (e.g. for monitoring) which were later combined in a package called X-Pack, while stating that the core of Elasticsearch will be Apache-licensed forever.
- 2015: AWS launched an Elasticsearch SaaS: Amazon Elasticsearch Service. Later, in 2019, Elastic sued AWS for trademark infringement. When OpenSearch was launched in 2021, Amazon Elasticsearch Service was renamed to Amazon OpenSearch Service.
- 2018: Elastic made the X-Pack code open, though not open-source: functionality is available based on license level, ranging from free to Platinum. Default distributions of Elasticsearch became Elastic-licensed, but you could still download the Apache-licensed Elasticsearch without X-Pack.
- 2019: AWS launched Open Distro for Elasticsearch – a collection of open-source tools that provide features like security, roughly overlapping with what X-Pack offered. Open Distro for Elasticsearch is now deprecated, replaced by OpenSearch.
- 2021: Elastic changed the license for new versions of Elasticsearch. The core, which used to be Apache-licensed, could now be compiled separately under SSPL, while the default distribution remained on the Elastic License, as it was before. This made 7.10.2 the last Apache-licensed Elasticsearch version.
- 2021: AWS released OpenSearch, a fork of Elasticsearch 7.10.2.
- 2021: AWS renamed Amazon Elasticsearch Service to Amazon OpenSearch Service, now offering OpenSearch and extra proprietary functionality, such as UltraWarm.
What does this mean to me?
Above are some distilled, hard facts. Feel free to draw your own conclusions and of course to check out more about what the community (communities?) have to say. At Sematext, we don’t have strong opinions on this particular matter. But stay away from vim vs emacs, that’s a completely different story. 🙂
So far we’ve assumed that you’ll want to manage your own Elasticsearch/OpenSearch. It’s worth noting that both Elastic and AWS offer managed versions of them, and there are some differences:
- Elastic Cloud is essentially a hosted Elasticsearch at the respective license level (from Standard to Enterprise), while Amazon OpenSearch Service is OpenSearch plus additional proprietary features.
- Elastic Cloud will work on other cloud providers besides AWS.
Whether you want a managed service or not is also a valid question. We’ve written a blog post on AWS Elasticsearch Service vs. Elasticsearch on EC2 a while ago, but the basic flow diagram described at the end still stands: it’s down to how much control you need/want as well as the cost when you scale.
Of course there’s no straight “this is better” answer, but I hope that with the above you can decide what’s best for your use-case. No matter what you choose: Elasticsearch or OpenSearch, managed or self-hosted, we’re here to provide expert, vendor-neutral and unbiased support via:
- Consulting: everything from designing and tuning clusters and their logging pipelines to search relevance to migrating from Elasticsearch to OpenSearch or the other way around.
- Production Support: we can help put out both Elasticsearch and OpenSearch fires 24/7 and we won’t charge you per node for it.
- Both OpenSearch Training and Elasticsearch Training. Courses range from generic introduction and operations to use-case-specific ones on enterprise search and logging.
- Sematext Cloud, so you can send detailed metrics and logs from both Elasticsearch and OpenSearch. Very useful especially when you migrate between the two!
- For log analysis and monitoring, Sematext Cloud can also be your fully-managed service. When it comes to logs, you’ll have an Elasticsearch/OpenSearch-like API both for reads and writes. And yes, this means you can bring your own pipeline (e.g. Logstash) and easily switch between self-hosted and managed. Click here to give it a spin: