By Chris Riley
What is the Elastic Stack?
Elasticsearch, Logstash, and Kibana, the trio better known as the Elastic Stack (or ELK, if you prefer a term that is now going out of style), make up a powerful set of tools for searching and analyzing data. Their power derives not just from their technical features, but also from the fact that Elastic Stack is an open source platform that anyone can download and set up anywhere.
Yet before you go downloading and installing it on your servers, consider an alternative approach to running ELK: using a fully hosted, cloud-based implementation. Although the DIY, on-premises approach for running Elastic Stack may save you a bit of money in the short term, in the long run, it’s likely to cost more than it’s worth.
The Benefits of Running Elastic in the Cloud
Wondering why? Here are five reasons why it’s better to run Elastic Stack in the cloud than on-premises.
Instant ELK Deployment with Minimal Effort
Being able to deploy and start using the stack with minimal time and effort may be one of the most obvious benefits of running it in the cloud. It’s also one of the most compelling reasons to choose that approach.
This is particularly true because installing Elastic Stack yourself is more than a trivial affair. You have to download and install each of the three core components (Elasticsearch, Kibana, and Logstash), plus any add-ons you want to use, separately. You also have to ensure that you install each of the components in the right order (for details on that order, see Elastic’s documentation).
Last but not least, the installation process also entails working from the command line (yes, even for those of you deploying on Windows, where the CLI went out of style circa 1995) if you use the official packages from Elastic.co. Although Elastic provides scripts that automate much of the process, there is still a chance that something could go wrong.
You could alternatively use a prebuilt package, such as Elastic’s Debian package for Elasticsearch, which will theoretically make installation easier. But you still have to worry about package management and dependency problems, or the risk that a configuration issue that is specific to your environment prevents Elastic Stack from installing or starting properly.
In contrast, when you run Elastic Stack in the cloud as a hosted service, there is very little that you need to do to get the platform up and running.
ELK Stack Architecture
The open-source Elastic Stack consists of the following parts:
- Elasticsearch as the core engine for search and analytics, made up of:
  - Master nodes to manage the Elasticsearch cluster
  - Ingest nodes for data transformation pipelines
  - Data nodes to store indexed data
  - Client (coordinating) nodes as load balancers for search queries
- Kibana server as the user interface for data visualization
- Logstash for data collection, transformation, and log shipping
- Beats as data collectors (e.g., Metricbeat, Filebeat)
The architecture of the ELK stack makes it possible to scale Elasticsearch by deploying more nodes for data ingestion or data storage.
On top of the open-source stack, Elastic provides commercial features for the Elastic Stack, formerly known as X-Pack. Whether to use commercial extensions is often a question of IT budget, so many people are interested in X-Pack / Elastic Features alternatives for their specific use case.
Logstash and Beats are not the only data collectors that work with Elasticsearch. A wide range of Logstash alternatives can be used as log shippers, some of which offer better performance or other advantages over Logstash.
Improving the Elastic Stack Scalability
Part of the core value of ELK is its ability to let you search massive volumes of data quickly. This is largely due to the architecture of the platform itself.
However, even the most optimized Elastic Stack code can perform only so well if the infrastructure on which it is running is limited in capacity. This is likely to happen if you run it on your own infrastructure and the platform consumes more resources than you have available.
In the cloud, you won’t run into this problem, because the infrastructure available to Elastic Stack is virtually limitless. Nor do you have to worry about expanding the host infrastructure if your deployment needs more capacity; your hosting provider will handle that for you.
ELK Performance Optimization & Monitoring
While the minimum system requirements of the ELK stack are typically light, it’s not only infrastructure limitations that could undercut the performance of your installation. The extent to which you fine-tune for performance is also a factor. There are steps you can and should take, such as freezing unused indices and optimizing shard size, which can have a significant impact on Elastic Stack performance.
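To make the shard-size point concrete, here is a minimal sketch of the arithmetic behind picking a primary shard count for an index. The function name and the 30 GB default target are assumptions made for illustration; a commonly cited rule of thumb is to keep shards in the tens of gigabytes, but the right target depends on your data and query patterns.

```python
import math


def suggested_primary_shards(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Return a primary shard count that keeps each shard at or below the target size.

    target_shard_gb is an assumed tuning knob, not an official default.
    """
    if index_size_gb < 0 or target_shard_gb <= 0:
        raise ValueError("index size must be >= 0 and target shard size must be > 0")
    # Every index needs at least one primary shard, even when it is nearly empty.
    return max(1, math.ceil(index_size_gb / target_shard_gb))


if __name__ == "__main__":
    # Hypothetical daily index sizes in GB.
    for size_gb in (5, 100, 900):
        print(size_gb, "GB ->", suggested_primary_shards(size_gb), "primary shard(s)")
```

The count only tells you how to split new indices; existing indices would still need to be reindexed, shrunk, or split to change their shard count.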
You may or may not have the skills to do this performance tuning in-house. And even if your team does have the necessary expertise, they may not have the time to manage the installation, optimize its performance, scale it up, and so on, on an ongoing basis.
For example, Sematext Cloud exposes an Elasticsearch API and includes seamlessly integrated Kibana. This means you don’t have to launch, configure, and manage Elasticsearch and Kibana yourself, which is time-consuming and requires expertise. Please note that this is different from “Elasticsearch in the Cloud” services that still require Elasticsearch knowledge, such as how to create the optimal index mapping for your use case, how to adequately size Elasticsearch node instances, etc. Sematext, on the other hand, is a fully managed Elastic Stack where the only thing you need to do is ship data, leaving nothing for you to “manage” because Sematext does it for you. This eliminates the need for Elasticsearch knowledge and the high cost of management and configuration time.
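As a rough illustration of what "shipping data" to an Elasticsearch-compatible API involves, the sketch below builds a request body in the newline-delimited JSON format that the Elasticsearch Bulk API expects. The index name and the sample events are hypothetical placeholders, and the sketch deliberately stops short of sending anything, since the endpoint and credentials depend on your provider.

```python
import json


def build_bulk_body(index_name: str, events: list) -> str:
    """Build an NDJSON body for the Elasticsearch Bulk API.

    Each event becomes two lines: an action line naming the target index,
    followed by the document itself. Every line ends with a newline,
    including the last one, as the Bulk API requires.
    """
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(event))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Hypothetical index name and log events, for illustration only.
    sample_events = [
        {"message": "user logged in", "severity": "info"},
        {"message": "disk usage high", "severity": "warning"},
    ]
    print(build_bulk_body("my-logs", sample_events), end="")
```

In practice a log shipper such as Logstash or Filebeat produces this kind of payload for you; the point is only that the data format is the same whether the receiving end is self-hosted or managed.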
If you run Elastic Stack in the cloud using Sematext Cloud, much of the performance tuning will be handled for you. There may still be some deployment-specific tweaks you can make, but the configuration will be optimized for performance by your hosting provider.
(Parenthetically, note that you could also install Elastic Stack in a cloud environment yourself, without using a fully managed service. Doing so would save you from having to set up and maintain infrastructure, but it would leave you with the burden of managing the installation yourself, which is not the case for a fully managed offering like Sematext Cloud.)
Elastic Stack Data Storage, Retention and Security
If you are relying on your own on-premises storage infrastructure, archiving data after it has been ingested can be challenging. Not only do you need sufficient storage resources, but you also need to develop a process for archiving data and removing it after retention periods have passed.
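A do-it-yourself retention process usually starts by working out which indices have aged past the retention period. The sketch below assumes daily indices named with a date suffix such as logstash-2024.01.31 (a common Logstash naming convention, but an assumption here); the function name and parameters are hypothetical. It only selects candidates, leaving the archive and delete steps to whatever tooling you use.

```python
from datetime import date, datetime, timedelta


def expired_indices(index_names, today, retention_days, prefix="logstash-"):
    """Return the date-suffixed indices that are older than the retention period."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not one of our daily indices
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, so leave the index alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


if __name__ == "__main__":
    # Hypothetical index names, for illustration only.
    names = ["logstash-2024.01.01", "logstash-2024.01.20", "logstash-2024.01.31", ".kibana"]
    print(expired_indices(names, today=date(2024, 1, 31), retention_days=7))
```

Skipping anything that does not match the naming pattern is a deliberate choice: a retention job that deletes data should only touch indices it positively recognizes.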
In a hosted Elastic Stack deployment, these tasks are simpler. You can typically archive to the cloud storage provided by the cloud vendor, or your host may offer automated archiving services as part of its platform. Sematext Logs, for example, automatically purges your old data, but can also archive logs in AWS S3 in real-time.
Keeping Elastic Stack Up-to-Date
Last but not least is the issue of updating Elastic Stack. If you set up and manage ELK yourself on your own infrastructure, it’s your responsibility to determine when updates are available to one or more of the platform components and apply them manually. Failure to keep the installation up-to-date could result not just in performance problems, but also security issues — which are not the type of thing you want to play around with.
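Part of that responsibility is simply knowing which components have fallen behind. As a minimal sketch, assuming you can collect each component's installed version yourself, the check below compares dotted version strings numerically. The function names and the version numbers in the example are illustrative only.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '7.17.3' into a tuple of ints, e.g. (7, 17, 3)."""
    return tuple(int(part) for part in version.split("."))


def outdated_components(installed: dict, latest: str) -> list:
    """Return the names of components whose installed version is older than `latest`."""
    latest_key = parse_version(latest)
    return sorted(
        name for name, version in installed.items()
        if parse_version(version) < latest_key
    )


if __name__ == "__main__":
    # Hypothetical inventory of installed versions.
    installed = {"elasticsearch": "7.17.3", "kibana": "7.9.0", "logstash": "7.17.3"}
    print(outdated_components(installed, latest="7.17.3"))
```

Comparing parsed tuples rather than raw strings matters here: as plain text, "7.9.0" sorts after "7.17.3", which would hide the outdated Kibana in this example.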
In contrast, most cloud-hosted Elasticsearch solutions will apply updates automatically, making updates a totally hands-off affair from the user’s perspective.
Total Cost of Ownership
All of the above drives home the point that total cost of ownership, or TCO, for a fully managed, cloud-based Elastic Stack service will be lower than that for an installation that you maintain yourself. Although fully managed Elastic Stack typically comes with a cost, it saves money in the long run by significantly reducing the amount of time your staff has to spend maintaining the installation. It can also save on infrastructure costs because the managed service will be fully optimized and kept up-to-date.
Sure, you can run ELK yourself, in your own data center. But there are several benefits of choosing instead to run Elastic Stack in the cloud. Not only is it easier to set up and manage, but data retention, performance optimization, and scalability are also handled automatically when you use a fully managed Elastic Stack service.
If you need any help with running ELK yourself, please reach out, because we provide:
- Consulting for Elasticsearch and Elastic Stack
- Production support for Elasticsearch and Elastic Stack
- Elasticsearch and Elastic Stack training classes (on site and remote, public and private). You can pick from a wide range of short (2h), use case focused online training classes to fit your exact needs: from Elasticsearch Fundamentals and Kibana and Logstash Fundamentals to Administering Elasticsearch. See all upcoming classes here.
If you’re looking for an Elastic Stack cloud provider, Sematext Logs offers managed ELK-as-a-service capabilities that eliminate the need for costly Elastic Stack expertise and management costs. Click here to learn more.
Good luck managing the Elastic Stack in production!
Chris Riley (@HoardingInfo) is a technologist who has spent 15 years helping organizations transition from traditional development practices to a modern set of culture, processes and tooling. In addition to being an industry analyst, he is a regular author, speaker, and evangelist in the areas of DevOps, BigData, and IT. Chris believes the biggest challenges faced in the tech market are not tools, but rather people and planning.