2017 was a good year for Sematext in many ways. On the engineering side of the organization, we made a number of changes that had a positive impact and paved the way for a more productive 2018. The following is obviously not a complete list, and it’s focused purely on engineering changes. Here are 10 engineering changes we made in 2017:
1. From static infra to containers orchestrated with Kubernetes
We’ve been running in AWS since our very beginning. We run in multiple AWS regions (US and EU) and multiple Availability Zones in each of those regions. We wanted better utilization of resources, more automation, more self-healing, and lower costs. We drank the Cloud Native Kool-Aid, containerized all our components, and started using Kubernetes for orchestration, building Helm charts for everything. In the process, we also created the Sematext Docker Agent Helm Chart. Using containers and Kubernetes allowed us to make better use of the infrastructure, while also making it more resilient and automated. To further reduce our operating expenses, we’ve made full use of AWS spot instances by utilizing autospotting.
2. From Subversion to GitHub
Believe it or not, we were using internal Subversion until 2017. Some of us pretended we were working with git by using git svn, but deep down inside we knew we were dealing with svn. We were already making use of GitHub for our open-source projects, so one day we took all our code and moved it to private GitHub repos. Late bloomers? Yeah.
3. From releases to GitHub flow, Sprints, and CI/CD
For years we’ve been following a very loose Scrum-like development structure where we’d have 1-2 month long “Sprints” followed by a release. Can something that takes 1-2 months really be called a Sprint? Probably not. We observed that when our releases had a bunch of small changes, the releases would go smoothly. However, when changes were a little more involved, a release would require N people to be actively involved and take a few hours. Sometimes we’d also have unexpected glitches. Planned maintenance page, anyone? It became obvious that if we could have smaller, more frequent releases we’d have less trouble. So we switched to GitHub flow, automated everything we could, and never looked back. Along with that, we moved to 2-week Sprints. We now do Sprint planning every other Friday and have a mid-Sprint review with 5-minute-per-person demos each Friday. Builds and deployments to production are completely automated with Jenkins and Ansible and happen many times throughout the day.
4. From Java to Go: Agent
In the second half of 2017, we started introducing Go internally. We first used Go to replace our OS monitoring agent. Written in Java and running on top of the JVM, the old agent had too big a memory footprint for an agent.
5. From Java to Go & Kotlin: Backend
After our initial positive experience reimplementing our OS monitoring agent in Go, we’ve replaced a few backend components with Go as well. Benchmarking showed excellent performance at a fraction of the memory footprint! For those parts of the backend that needed some rewriting but had to keep running in the JVM, we’re now using Kotlin, which we find more expressive and productive than Java.
6. From JSP & JSTL to ReactJS and Redux
Back in 2016, we started replacing our old web app that made use of JSPs and JSTL with ReactJS and Redux. It was a steep learning curve and the reimplementation of the frontend took months, but we are extremely pleased with the result. Working with ReactJS and Redux feels cleaner, more productive, and all-around better. In the process, we’ve implemented and open-sourced Sematable, a versatile ReactJS/Redux data table component.
7. From HBase to ClickHouse
Sematext Monitoring used to store metrics in Apache HBase. For years we have been very happy HBase users and have also contributed to the project. It has been extremely stable and performant for us. However, we wanted to give our users more flexibility in terms of slicing and dicing their metrics and needed a new backend that could provide that while also handling the massive data volume. Because we have a lot of Elasticsearch expertise in-house and because Sematext Logs makes heavy use of Elasticsearch, we naturally wanted to give Elasticsearch a try. Unfortunately, at this time Elasticsearch is just not an efficient enough data store for metrics and analytical queries on metrics, while ClickHouse is designed to be a fast analytical database and seems like a perfect fit.
8. From Rsyslog to Logagent
While we love rsyslog for its speed and efficiency – so much so that we’ve written an Rsyslog eBook – for shipping our own logs to Sematext Cloud we’ve switched to Logagent. Written in Node.js, Logagent has a very small memory footprint, too, much smaller than e.g. Logstash, and has a very attractive pluggable architecture with a number of input, output, and processing plugins.
9. From Confluence to MkDocs
Being a SaaS provider ourselves, we’ve used Atlassian Cloud products for years and have been mostly happy customers. We still use them internally, but for Sematext Documentation we’ve switched to MkDocs, which gives us more flexibility around UI customization and also lets us make our Markdown docs repo public and open to comments, PRs, etc.
10. From Maven to Gradle
Our JVM codebase was split among a number of Maven projects with dependencies between them. We managed these dependencies manually through versioning. Our release cycles used to be 1-2 months, so it was not a big deal to switch versions manually from time to time. After we switched to continuous deployments, we had to make sure that dependent parts were rebuilt only when necessary. This led us to the process of merging many small Java projects into a bigger project. We could have left the projects in Maven and spent time on rebuilding and cleaning pom.xml files, or spent a similar amount of time switching to Gradle. Why Gradle? Gradle is a modern build automation tool which is much more flexible than Maven. The configuration is more human-friendly compared to Maven XML files, there is no need to implement MOJOs to customize behavior, and every customization can be implemented with simple Groovy code. In addition to that, we could benefit from the faster builds that Gradle makes possible. Gradle’s internal caching can significantly speed up builds. For instance, when there is a small change in one place of the project but none of the subprojects depends on this change, the build can be done in a matter of seconds instead of minutes.
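To make the "rebuild only when necessary" idea concrete, here is a minimal Python sketch. It is not how Gradle is implemented (Gradle tracks inputs and outputs per task), and the subproject names and the `projects_to_rebuild` helper are made up for illustration. It only shows the underlying reasoning: a change forces a rebuild of the changed subproject and of everything that depends on it, directly or transitively, and nothing else.

```python
from collections import defaultdict, deque


def projects_to_rebuild(depends_on, changed):
    """Return the changed subprojects plus all their transitive dependents."""
    # Invert the graph: for each project, which projects depend on it directly?
    dependents = defaultdict(set)
    for project, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(project)

    # Breadth-first walk from the changed projects along "is depended on by" edges.
    to_rebuild = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in to_rebuild:
                to_rebuild.add(dependent)
                queue.append(dependent)
    return to_rebuild


# Hypothetical multi-project layout; these names are assumptions, not real modules.
DEPENDS_ON = {
    "common": [],
    "agent-api": ["common"],
    "receiver": ["common", "agent-api"],
    "webapp": ["common"],
    "docs-tool": [],
}

if __name__ == "__main__":
    # A change nobody depends on: only that one subproject is rebuilt.
    print(sorted(projects_to_rebuild(DEPENDS_ON, ["docs-tool"])))
    # A change in a shared module: every dependent is rebuilt as well.
    print(sorted(projects_to_rebuild(DEPENDS_ON, ["common"])))
```

In this toy layout, a change to `docs-tool` touches one subproject, while a change to `common` touches four out of five, which is the difference between a build that takes seconds and one that takes minutes.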
Last, but not least, we are looking for talented engineers to join our distributed team. We have openings for frontend development, backend/full-stack development, agent development, and so on.