
Embracing Kubernetes Successfully

December 18, 2017


Kubernetes is a really hot topic at the moment. All major cloud providers have adopted it as a solution for deploying cloud native apps. Just a few weeks ago at re:Invent, AWS introduced EKS (Amazon Elastic Container Service for Kubernetes), a fully managed Kubernetes service.

That is a huge step because AWS is the biggest public cloud provider and most Kubernetes deployments are on AWS. Typically, the official kops tool is used to deploy and manage Kubernetes clusters there, which is fine because it is production ready. With Kubernetes gaining popularity, companies are pushing really hard to embrace it because it looks like it can solve a lot of common problems.

Should you really start the migration to Kubernetes, and what does it take to truly embrace it? That is the question I will try to answer in this post by sharing my experiences.

The problem

Earlier this year, we started containerizing our Sematext Cloud and Sematext Enterprise deployments and it really seemed straightforward. All you need to do is containerize all your applications, create a few Kubernetes configs for resources like Deployments, Services, etc., and you are ready to rock! Well, it is not that easy.

The main problem is that the things you use for your release management process today will probably not fit into the cloud native paradigm. Your application is not cloud native, you are probably not using CI/CD deployment pipelines, and health checks and monitoring tools for logs, metrics, and alerts are likely missing or problematic. Your application is probably really static and complicated to deploy. This complexity will not magically disappear with the migration to Kubernetes. Left as it is, it will just make things worse. Cloud native means decoupling the application from the OS, which is what containers do.

In the evolution of the software industry, first we had waterfall, then Agile, and now the approach to release management called DevOps. This is a really important piece of the puzzle and there is not one rule to follow here. Each team uses what works best for them and if you want to stick to your existing routine, please do. You just need to make sure that your routine will work for cloud native apps, meaning you will probably have to change something. It is not easy to switch the whole team to DevOps principles, so this can take some time. Be prepared for that.

If you want to check my take on DevOps in general, please read What is DevOps?

The solution

This is not exactly a solution that you need to blindly follow. It is here to give you an idea and to explain the processes and problems you might encounter along the way. It is an extensive topic and I will not go down into the details trying to explain everything I mention.

As the first step, get rid of all unused components and do some cleanup. After a few years of development, software gets really complex, and cleanup keeps getting postponed because there are always more important things to do: new features, production fixes, etc.

Second, implement health checks for Kubernetes readiness and liveness probes. That should not be that hard. You will mostly spend your time managing your configuration, and that is the hard part. My advice is to leverage ConfigMaps and Secrets and to stay away from environment variables. You can use some of them, but don’t build the whole configuration management on environment variables alone. Your app should really use the full potential of Kubernetes.
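To give an idea of how small the health check part can be, here is a minimal sketch using only the Python standard library. The paths /healthz and /ready, the port, and the STATE flag are assumptions made for illustration; they are not taken from our deployment.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application state. A real app would set "ready" to True
# once its dependencies (database, caches, ...) are reachable.
STATE = {"ready": False}


def probe_response(path, ready):
    """Return (status_code, body) for a probe request on the given path."""
    if path == "/healthz":  # liveness: the process is up and can answer
        return 200, {"status": "alive"}
    if path == "/ready":  # readiness: safe to receive traffic
        if ready:
            return 200, {"status": "ready"}
        return 503, {"status": "not ready"}
    return 404, {"error": "not found"}


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = probe_response(self.path, STATE["ready"])
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Block and serve probe requests; call this from the app's entry point."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

The liveness and readiness probes in the Pod spec would then point their httpGet checks at these two paths. Keeping them separate matters: a Pod that is alive but not ready is taken out of Service traffic instead of being restarted.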

Kubernetes apps talk to each other using Services. There are different types of Services in Kubernetes, and you can think of a Service as a load balancer. The Service name you define is your endpoint: http://service_name:port. If you are working with stateful apps, you will probably want to use a headless Service, which enables you to access a particular Kubernetes Pod. As you can see, Services in Kubernetes also solve the service discovery problem in a way. If you already use something like Consul for service discovery, congrats to you! You can stick with it and just deploy it on Kubernetes – check the Consul Helm chart for easy deployment.

From the Kubernetes perspective, stateless applications should be really easy to scale. Use the Deployment resource for them, because it also manages easy upgrades through rolling updates. Of course, your application needs to handle scaling without any problems, and that might require some code changes.

The main problem is stateful applications, like databases. Kubernetes has a StatefulSet resource for that kind of app, but it doesn’t know how a particular app should react when a new node is added, when a node fails, etc. This is what operations people usually handle when they manage such an app. Luckily, it is not that hard to write Kubernetes operators to do exactly that.

In short, an operator is a custom controller built around a Kubernetes custom resource definition, or CRD. You can write your own or use an existing one. The operator that we are using is the Elasticsearch operator, and we will happily contribute to that project. I’ve already made some pull requests to it.
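As a rough illustration of how these endpoints are formed, the sketch below builds the DNS names that Kubernetes assigns to a Service and to a single StatefulSet Pod behind a headless Service. The service names are hypothetical, and cluster.local is the default cluster domain, which a given cluster may override.

```python
def service_url(service, port, namespace="default",
                cluster_domain="cluster.local", scheme="http"):
    """Fully qualified URL of a regular (load-balanced) Service."""
    host = f"{service}.{namespace}.svc.{cluster_domain}"
    return f"{scheme}://{host}:{port}"


def stateful_pod_host(statefulset, ordinal, headless_service,
                      namespace="default", cluster_domain="cluster.local"):
    """DNS name of one StatefulSet Pod exposed through a headless Service."""
    pod_name = f"{statefulset}-{ordinal}"  # StatefulSet Pods are named <set>-<ordinal>
    return f"{pod_name}.{headless_service}.{namespace}.svc.{cluster_domain}"


# Hypothetical names, for illustration only.
print(service_url("search-api", 8080))
print(stateful_pod_host("es-data", 0, "es-discovery", namespace="logging"))
```

Inside the same namespace the short form http://service_name:port is enough; the fully qualified name is needed when calling across namespaces.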
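For a stateless app, a Deployment with a rolling update strategy might look roughly like the manifest generated below. This is a sketch, not our production config: the app name, image, port, and the maxUnavailable/maxSurge values are placeholders. The manifest is printed as JSON, which kubectl accepts in the same way as YAML.

```python
import json


def deployment(name, image, replicas=3, port=8080):
    """Build a Deployment manifest that upgrades Pods one at a time."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                # Never drop below the desired replica count during an upgrade,
                # and start at most one extra Pod at a time.
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


# Placeholder name and image.
print(json.dumps(deployment("web", "example/web:1.0.0"), indent=2))
```

Scaling is then a matter of changing replicas, provided the application itself keeps no local state.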
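The core of an operator is a reconcile step that compares the desired state with what is actually running and decides what to do about the difference. The toy function below only illustrates that idea. It does not talk to the Kubernetes API, it is not how the Elasticsearch operator is implemented, and the node names and action names are made up.

```python
def reconcile(desired_replicas, observed):
    """Return the actions needed to move the observed state toward the desired one.

    observed maps a node name to True (healthy) or False (failed).
    """
    actions = []

    # Failed nodes are replaced in place, which keeps the node count unchanged.
    for name, healthy in sorted(observed.items()):
        if not healthy:
            actions.append(("replace", name))

    diff = desired_replicas - len(observed)
    if diff > 0:
        # Too few nodes: add the missing ones.
        actions.extend(("add", f"new-{i}") for i in range(diff))
    elif diff < 0:
        # Too many nodes: remove the highest-named ones first.
        for name in sorted(observed, reverse=True)[:-diff]:
            actions.append(("remove", name))
    return actions


print(reconcile(3, {"es-0": True, "es-1": False}))
```

A real operator runs this kind of logic every time the custom resource or the cluster state changes, and the app-specific knowledge (how to join a node, how to drain one safely) lives in the actions.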

By now you have probably started connecting all the pieces together. Don’t forget about computing resources and about limiting them for your containers. There are two types of compute resource settings: requests, which specify the minimum amount of free resources a node must have for the Kubernetes scheduler to place a particular Pod on it, and limits, which are the maximum amount of computing resources that a Pod can use. This is really important, especially for Java apps. Please note that for Java apps you also need to adjust those limits according to heap memory requirements. My recommendation is to use Java version 8u131 or newer, which is aware of Docker CPU and memory limits.

Label your apps. Labels can really help later when you need to monitor your containers and apps, and this metadata is, in general, how you connect different resources through selectors, for example a Deployment and a Service.

When you start writing Kubernetes configuration files, you will be fine with them and think that maintaining them is not a big deal. However, when trying to implement a deployment pipeline, you will realize that using just a bunch of configuration files is a really bad idea. This is where Helm comes to the rescue. Helm is a tool for packaging Kubernetes apps, and I highly recommend using it for deployment pipelines. Features like support for multiple environments, dependencies, versioning, rollback, and different hooks (think of DB migrations) sum up what is good about Helm. The bad news is that you need to learn yet another tool, but I assure you it is easy to learn and worth your time.

You don’t need multiple clusters to have different environments. Just use different Kubernetes namespaces. For example, to bring up a new environment, just create a separate namespace, and when you are done with testing, just delete it. This way you save a lot of time and money. However, never forget about security. Unrestricted kubectl access is like being the root user of your cluster, so it is probably not a good idea to give it to everyone. Sometimes it is better to create a completely new cluster than to manage RBAC, which can be really complicated. RBAC is highly recommended, though.

So how will you version your container images, Helm packages, etc.? This is up to you, really. The most commonly used approach is to tag images with the commit ID and later with a release tag. At that point you need to store the Docker images somewhere, in a private or public registry. My recommendation is to use Docker Hub, which is probably also the most cost-efficient option. Once you have solved all of that, you need to create a deployment pipeline. You will probably use Jenkins to deploy everything, and it will act as one of your main DevOps tools.
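To make the relation between limits and heap concrete, here is a small sketch that derives a JVM heap size from the container memory limit and builds the matching resources section. The 75% heap fraction is an assumed rule of thumb that leaves room for non-heap memory, not a recommendation from this post, and the parser only handles the binary suffixes Ki, Mi, and Gi.

```python
_UNITS = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def parse_memory(quantity):
    """Convert a quantity such as '512Mi' or '2Gi' to bytes (binary suffixes only)."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


def container_settings(memory_request, memory_limit, cpu_request, cpu_limit,
                       heap_fraction=0.75):
    """Build the resources section and a heap flag that fits inside the limit."""
    heap_mb = int(parse_memory(memory_limit) * heap_fraction) // 1024 ** 2
    return {
        "resources": {
            "requests": {"memory": memory_request, "cpu": cpu_request},
            "limits": {"memory": memory_limit, "cpu": cpu_limit},
        },
        # Passed to the java command line by the container entry point.
        "jvm_args": [f"-Xmx{heap_mb}m"],
    }


print(container_settings("1Gi", "2Gi", "500m", "1"))
```

If the heap is sized above the memory limit, the container gets killed once it crosses the limit, which is why the two numbers need to be kept in sync.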
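The way an equality-based selector picks Pods by label can be sketched in a few lines. The Pod names and labels below are made up for illustration, and the function mimics the matching rule only; it does not call Kubernetes.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(selector, pods):
    """Names of the Pods whose labels satisfy the selector."""
    return [pod["name"] for pod in pods if matches(selector, pod["labels"])]


# Hypothetical Pods.
pods = [
    {"name": "web-1", "labels": {"app": "web", "env": "prod"}},
    {"name": "web-2", "labels": {"app": "web", "env": "test"}},
    {"name": "db-1", "labels": {"app": "db", "env": "prod"}},
]

print(select({"app": "web"}, pods))
print(select({"app": "web", "env": "prod"}, pods))
```

A Service with the selector app=web would send traffic to both web Pods, while a monitoring query on env=prod would pick a different set from the same metadata.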
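Support for multiple environments in Helm comes down to layering values files: chart defaults first, then per-environment overrides passed with -f. The sketch below imitates that layering with a recursive dictionary merge. It illustrates the idea only and is not Helm's own code; the keys and values are invented.

```python
import copy


def merge_values(base, override):
    """Return base with override applied on top; nested maps are merged key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical chart defaults and a production override.
defaults = {
    "replicas": 1,
    "image": {"repository": "example/web", "tag": "latest"},
    "resources": {"limits": {"memory": "512Mi"}},
}
production = {
    "replicas": 3,
    "image": {"tag": "1.4.2"},
}

print(merge_values(defaults, production))
```

The chart templates stay the same for every environment, and only the small override file differs, which is what makes a single deployment pipeline workable.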
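A throwaway environment with restricted access could be described roughly as follows: one Namespace, one read-only Role inside it, and a RoleBinding for a single user. The namespace, user name, role name, and the list of readable resources are assumptions for illustration.

```python
import json

RBAC_API = "rbac.authorization.k8s.io"


def environment_manifests(namespace, user, role_name="read-only"):
    """Namespace plus a namespaced read-only Role bound to one user."""
    return [
        {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": namespace}},
        {
            "apiVersion": f"{RBAC_API}/v1",
            "kind": "Role",
            "metadata": {"name": role_name, "namespace": namespace},
            "rules": [
                {
                    # "" is the core API group (pods, services).
                    "apiGroups": ["", "apps"],
                    "resources": ["pods", "services", "deployments"],
                    "verbs": ["get", "list", "watch"],
                }
            ],
        },
        {
            "apiVersion": f"{RBAC_API}/v1",
            "kind": "RoleBinding",
            "metadata": {"name": f"{user}-{role_name}", "namespace": namespace},
            "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_API}],
            "roleRef": {"kind": "Role", "name": role_name, "apiGroup": RBAC_API},
        },
    ]


# Hypothetical environment and user.
bundle = {"apiVersion": "v1", "kind": "List",
          "items": environment_manifests("feature-x", "jane")}
print(json.dumps(bundle, indent=2))
```

The output can be piped to kubectl apply -f -, and deleting the namespace afterwards removes the Role and RoleBinding along with everything else in it.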
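The tagging scheme described above could be sketched like this. The repository name, the 7-character short commit ID, and the release format are assumptions; use whatever convention your pipeline already follows.

```python
def image_ref(repository, commit_sha, release=None):
    """Image reference tagged with the release if there is one, else the short commit ID."""
    tag = release if release else commit_sha[:7]
    return f"{repository}:{tag}"


# Hypothetical repository and commit.
sha = "9fceb02d0ae598e95dc970b74767f19372d61af8"
print(image_ref("example/web", sha))           # build from a commit
print(image_ref("example/web", sha, "1.4.2"))  # same image promoted to a release
```

In a pipeline, every commit build would push the first form, and a release job would re-tag the already tested image with the second form instead of rebuilding it.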


Kubernetes Cheat Sheet

We’ve prepared a Kubernetes Cheat Sheet which puts all key Kubernetes commands (think kubectl) at your fingertips. It is organized in logical groups, from resource management (e.g. creating or listing pods, services, daemons), through viewing and finding resources, to monitoring and logging. Download yours.


Conclusion

Lately, the focus is on cloud native, not just Kubernetes itself. Kubernetes is just a tool. If you want to understand more about containers, you may download the Docker Commands Cheat Sheet.

Our migration to Kubernetes for Sematext Cloud and Sematext Enterprise is a work in progress. You need to think of it as a process. It is not a one-time job or a miracle, and it is never completely finished.

I hope this post gives you an idea of what it is really like to work on a migration to cloud native and Kubernetes, and what to expect along the way. It is hard and time-consuming, but it can be really beneficial to your company culture as well as your software. Scaling is no longer a problem, and infrastructure becomes just code, a software problem. Embrace Kubernetes – check out our guides to Kubernetes monitoring and Kubernetes logging – but be ready to face challenges, some of them covered in this post.
