TL;DR: How does Java 9 work with Elasticsearch 6? It works well, but don’t expect miracles. Unless you’re using G1, in which case there are some miracles.
With Java 9 fresh out of the oven and Elasticsearch 6 still in the oven but almost fully baked, we thought we should take them out for a spin and find an answer to a couple of questions:
- do they actually work together, or is there some crazy error to be expected?
- if yes, what are the benefits of upgrading?
Out of the list of Java 9 features and improvements, two things stood out for us:
- compact strings, because they could reduce heap usage.
- G1 garbage collector improvements. Yes, we know that some advise against G1 on Lucene-based search engines such as Elasticsearch, but the argument there isn’t all that clear and we have had mostly good experiences with G1 for years now. On Java 9, G1 becomes the default collector and CMS – currently the Elasticsearch default – is deprecated (you will see a deprecation message in the logs).
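To make the compact-strings expectation concrete: with Java 9 (JEP 254), a String whose characters all fit in Latin-1 is backed by one byte per character instead of two. Below is a minimal sketch of that arithmetic. The class and method names are ours, the sample log line is made up, and the numbers only approximate the backing array payload (object headers and padding are ignored).

```java
final class CompactStringEstimate {

    /** True if every char fits in Latin-1 (<= 0xFF), the case Java 9 stores at 1 byte per char. */
    static boolean isLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    /** Approximate payload in bytes when the string is always stored as UTF-16 (Java 8). */
    static long utf16Bytes(String s) {
        return 2L * s.length();
    }

    /** Approximate payload in bytes with compact strings enabled (Java 9 default). */
    static long compactBytes(String s) {
        return isLatin1(s) ? s.length() : 2L * s.length();
    }

    public static void main(String[] args) {
        // Hypothetical firewall log line, for illustration only.
        String logLine = "action=DENY src=10.0.0.1 dst=10.0.0.2 port=443";
        System.out.println("UTF-16 payload:  " + utf16Bytes(logLine) + " bytes");
        System.out.println("Compact payload: " + compactBytes(logLine) + " bytes");
    }
}
```

The saving only materializes for data that actually lives in String objects, which is why we wanted to measure it rather than assume it.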
Test conditions: Java 9 w/ Elasticsearch 6
We compared Java HotSpot 1.8.0_144 with OpenJDK build 9+181 using Elasticsearch 6 beta 2 on a 4 vCPU, 8GB of RAM, SSD-backed physical box. The test was to index some data (10M firewall logs) at moderate throughput (~5K docs/s from an external Logstash – this ate about 50% of the CPU) while at the same time looping a query with an aggregation.
Using Sematext Monitoring for Elasticsearch, we captured pretty much all the Elasticsearch metrics known to mankind, from GC times to query latencies. There were two distinct runs: one with the default GC settings (CMS), and one with G1 enabled instead of CMS.
Since we knew there were no significant changes for CMS between Java 8 and 9, using it helped us pinpoint any non-GC related changes (like the influence of the new compact strings). Using G1 separately helped us see the G1-related changes.
Results: G1 garbage collection 3 times faster on Java 9
To our surprise, the CMS runs showed no measurable non-GC-related differences. All the metrics we checked were similar between Java 8 and Java 9, from indexing and query throughput and latencies to GC times, heap size variations, and CPU usage.
As an example, here’s a query latency graph (average and 99th percentile). If anything, the Java 9 result is slightly worse, especially when looking at the 99th percentile, though this can be attributed to noise.
With G1 it was mostly the same, too: query latencies were similar, and so was indexing throughput. There was, however, one big difference: GC times were much shorter on Java 9 while cleaning up the same amount of garbage. They were about 3 times shorter, for both total and average GC time.
In the image below, you can also see the extended graph, which includes the total collection time for CMS runs as well: GC times are the same between Java 8 and Java 9 for CMS, but there’s a clear difference with G1.
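For readers who want to check total and average GC time on their own JVM, the standard java.lang.management API exposes both ingredients. This is only a sketch: it is not how the graphs in this post were produced, and the class and helper names are ours. The average is simply the accumulated collection time divided by the number of collections.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcTimes {

    /** Average collection time in ms, or 0 when no collections have been recorded. */
    static double averageMillis(long totalMillis, long count) {
        return count > 0 ? (double) totalMillis / count : 0.0;
    }

    public static void main(String[] args) {
        // One bean per collector, e.g. young and old generation collectors.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            long total = gc.getCollectionTime();  // approximate accumulated ms, -1 if undefined
            System.out.printf("%s: collections=%d total=%dms avg=%.2fms%n",
                    gc.getName(), count, total, averageMillis(total, count));
        }
    }
}
```

Run inside the JVM you care about (or read the same beans over JMX); sampling the values over time gives the kind of per-interval totals shown in the graphs.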
Keen eyes will catch that GC times were better for CMS anyway, especially when compared with G1 on Java 8. That’s something to be expected with a 1GB heap, where the overhead of G1 (in terms of heap size) becomes significant. Also:
- we used default settings, which made CMS kick in at 75% old-generation occupancy (-XX:CMSInitiatingOccupancyFraction=75 in Elasticsearch’s jvm.options), while G1 would start working more aggressively, at 45% heap occupancy (the default for -XX:InitiatingHeapOccupancyPercent). Both limits are configurable, of course, but the defaults work better for CMS in this particular setup
- tests were short-term (only loading 10M documents). In the longer run, G1 might work better because, unlike CMS, G1 also compacts (defragments) the heap
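Since the outcome depends so much on which collector and which thresholds were in effect, it can help to confirm what a JVM was actually started with. The sketch below filters the JVM's startup arguments for collector and occupancy flags. The class name and the filtering rule are our own assumptions, and flags left at their defaults (such as G1's 45%) will not show up because they were never passed explicitly.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcFlagReport {

    /** Keeps only -XX arguments that select a collector or set an occupancy threshold. */
    static List<String> gcFlags(List<String> jvmArgs) {
        List<String> result = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-XX:") && (arg.endsWith("GC") || arg.contains("Occupancy"))) {
                result.add(arg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Arguments the running JVM was started with, e.g. those read from jvm.options.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        gcFlags(jvmArgs).forEach(System.out::println);
    }
}
```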
Conclusions
Why was there no significant difference in heap size? While we’d love to hear your opinion in the comments, our conclusion so far is that Elasticsearch, and the underlying Lucene, just aren’t using String objects that much. Looking at a few heap dumps, we saw less than 2% of heap occupied by String objects, meaning their compaction wouldn’t lead to any notable drop in memory usage. There are so few String objects because Lucene and Elasticsearch (and Solr, too) tend to rely more on byte arrays to represent data. That is, in effect, what Java 9 now does for String objects out of the box: compact strings store Latin-1 text in byte arrays at one byte per character.
With regard to garbage collection, there are a couple of open-ended questions as well:
- Is it premature to deprecate CMS? Since it works better for smaller heaps, why deprecate it? One could argue that G1 could have been optimized for this use-case (the default settings aren’t meant to take on CMS for 1GB heaps, especially when these particular CMS settings were tuned for Elasticsearch). Another argument could be that on modern hardware, for 1GB heaps, most use-cases don’t need a concurrent collector at all, and we could have just selected the Parallel collector.
- Should we finally use G1 for every large-scale Elasticsearch deployment? We suspect that, with proper tuning, G1 will work great performance-wise. However, the debate was always about stability. While we can always invoke the “time will tell” argument, what has your experience with G1 been?