Sematext JVM Profiler

Introducing On-demand Java Profiling

If you are running apps on top of the JVM and want to be able to profile them in production, on demand, without affecting your app’s performance and users, read on!  Screenshots, features, and other juicy stuff are further down.

Do you run any apps on the JVM?  How do you find bottlenecks in your apps once they are in production, so you can optimize them?  If they become slow, how do you find which part of code in your app is slow?

Maybe you look at metrics like Garbage Collection.  Maybe you run commands like jstat to see if various memory pools are full or if there are too many FGCs (Full Garbage Collections).  Or perhaps you run jstack or do kill -SIGQUIT PID (that is, kill -3 PID) and look at thread dumps?
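For reference, the kind of thread dump that jstack prints can also be produced from inside the JVM through the standard java.lang.management API. The following is only a minimal sketch of that idea; the class name ThreadDumpSketch is made up for illustration and this is not how SPM collects its data.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class, for illustration only.
class ThreadDumpSketch {

    /** Builds a jstack-like text dump of all live threads in this JVM. */
    static String dumpAllThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip monitor and synchronizer details to keep the dump cheap
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

This still has to run inside (or be attached to) each JVM you care about, which is exactly the kind of per-machine legwork described next.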

All of these are reasonable approaches… until your infrastructure grows and/or you get tired of running around, sshing to machines, running jstat and jstack or kill with sudo so you have sufficient rights to execute those commands, and so on.

Another way to tackle this is to simply have your standard profiler attached to the process, or try to attach it on the fly, but that tends to be difficult, requires more manual work, and we all know full-blown profilers typically slow down apps to the point where they could become unusable.  In short, such a profiling approach is not really suitable for production.

There’s got to be a better way, right?

Indeed, there is.  Meet SPM’s On-demand Java Profiler!
This low-impact profiler is designed to help you identify bottlenecks in your production environment without slowing down your apps.  It provides rich analysis of the running code, much like a typical profiler.

What Types of Apps can you Profile?

The SPM profiler can profile anything that runs on top of the JVM.  This means it can profile Java apps, Scala apps, even things like Clojure and Groovy.  You are not limited to profiling only your own apps – the apps you developed.  You can also use it to profile any of the other SPM Integrations that run on the JVM – you can profile your webapps running in Tomcat, Jetty, JBoss, or Glassfish, you can profile Solr or Elasticsearch, Spark, Kafka, Storm, and so on.

How to Profile your App with SPM

If your SPM agent version is 1.29.2 or newer, you’re set.  If you have an older version, update it first.

To profile an application, you:

  • select the SPM agent that monitors it,
  • choose how long you want to profile it,
  • … and SPM does the rest.

Start Java Profiler in SPM

The selected SPM agent will then start profiling.  When done, the profiler will show you things like time spent in various methods and in their children.  It will show you the call tree, which you can look at top-down or bottom-up, etc.  It is really quite useful: in our very first profiler test run we immediately identified a suboptimal piece of code in one of our own applications! Talk about instant gratification!
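To make the top-down and bottom-up views a bit more concrete, here is a rough sketch of how a set of sampled stack traces could be aggregated into both. The class and method names (CallTreeSketch, topDown, bottomUp) and the sample data are invented for this example, and the real profiler may well aggregate its samples differently.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical aggregation of stack samples; not SPM's actual implementation.
class CallTreeSketch {

    /** A node in the top-down tree: how many samples passed through this method. */
    static final class Node {
        final String method;
        int samples;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String method) {
            this.method = method;
        }
    }

    /** Top-down: merges samples (entry point first, leaf last) into one call tree. */
    static Node topDown(List<List<String>> samples) {
        Node root = new Node("<root>");
        for (List<String> stack : samples) {
            Node current = root;
            current.samples++;
            for (String method : stack) {
                current = current.children.computeIfAbsent(method, Node::new);
                current.samples++;
            }
        }
        return root;
    }

    /** Bottom-up: counts samples per leaf method, broken down by direct caller. */
    static Map<String, Map<String, Integer>> bottomUp(List<List<String>> samples) {
        Map<String, Map<String, Integer>> hot = new LinkedHashMap<>();
        for (List<String> stack : samples) {
            if (stack.isEmpty()) {
                continue;
            }
            String leaf = stack.get(stack.size() - 1);
            String caller = stack.size() > 1 ? stack.get(stack.size() - 2) : "<entry>";
            hot.computeIfAbsent(leaf, k -> new LinkedHashMap<>()).merge(caller, 1, Integer::sum);
        }
        return hot;
    }

    public static void main(String[] args) {
        List<List<String>> samples = List.of(
                List.of("main", "handle", "parse"),
                List.of("main", "handle", "parse"),
                List.of("main", "flush", "parse"),
                List.of("main", "handle", "render"));
        Node root = topDown(samples);
        System.out.println("samples through main: " + root.children.get("main").samples);
        System.out.println("hot methods by caller: " + bottomUp(samples));
    }
}
```

The top-down tree answers "where does time go, starting from the entry point?", while the bottom-up map answers "which leaf methods are hot, and who calls them?".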

Profiling Features and Benefits

With SPM’s On-demand profiler you can:

  • Profile anything that runs on the JVM, such as Java, Scala, Clojure, Groovy, etc.
  • Start the profiling on demand and without having to add anything to the Java command-line and without needing to restart the process/app you want to monitor.
  • Find the root cause of performance bottlenecks in your own or in third-party apps.
  • Capture stack traces and view aggregated call trees that let you drill down to a specific line in a method.
  • View the profiled call stack in top-down or bottom-up fashion.  The former shows a standard call tree from the entry point/method down to the leaf methods being called, while bottom-up surfaces the hottest methods and points out from where they are called.
  • See both CPU and Wall clock time.  The former shows how much CPU time is taken by each method, while Wall clock time shows how much real time each method took.
  • Exclude specific classes and methods from profiling through easy-to-use filtering.  This helps reduce the noise and volume of output, thus making it easier and faster to analyze.
  • Hide outliers – methods which are either not called frequently or are super fast and thus not worth looking at.
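The difference between CPU time and wall clock time is easy to see with a small experiment. The sketch below uses the standard ThreadMXBean API to measure both for a task that mostly sleeps; the class and method names are made up for illustration, and getCurrentThreadCpuTime() is assumed to be supported and enabled on your JVM (it returns -1 when CPU time measurement is disabled).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical demo class; it only illustrates CPU time vs. wall clock time.
class CpuVsWallClockSketch {

    /** Runs the task on the current thread and returns {wallNanos, cpuNanos}. */
    static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long wallStart = System.nanoTime();
        long cpuStart = threads.getCurrentThreadCpuTime();
        task.run();
        long cpuNanos = threads.getCurrentThreadCpuTime() - cpuStart;
        long wallNanos = System.nanoTime() - wallStart;
        return new long[] {wallNanos, cpuNanos};
    }

    /** Sleeps without forcing callers to handle InterruptedException. */
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // A sleeping (or I/O-bound) method: lots of wall clock time, almost no CPU time.
        long[] sleeping = measure(() -> sleepQuietly(200));
        System.out.printf("sleeping: wall=%d ms, cpu=%d ms%n",
                sleeping[0] / 1_000_000, sleeping[1] / 1_000_000);
    }
}
```

A method that waits on a lock, a socket, or a disk looks expensive in wall clock time but cheap in CPU time, which is why it helps to see both.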

View profiler results in SPM

We hope you like this new addition to SPM.  Got ideas how we could make it more useful for you?  Let us know via comments, email or @sematext.

Not using SPM yet? Check out the free 30-day trial by registering here. There’s no commitment and no credit card required.  We even offer On Premises SPM and Logsene packages in addition to SaaS if that’s more to your liking.  And, even better — combine SPM with Logsene to make the integration of performance metrics, logs, events and anomalies more robust for those looking for a single pane of glass.

Logstash Performance Monitoring (1.2.2 vs 1.2.1)

We’re using Logstash to collect some of our own logs here at Sematext, because it can easily forward events to Logsene via the elasticsearch_http output. And we recently upgraded to the latest version 1.2.2.

Since we’re obsessed with performance monitoring, the first question that came up was: did the upgrade make any difference in terms of load? So we did a test to find out.


Logstash runs on a JVM, so we’re already monitoring it with SPM for Java Apps.  So we just put it through a steady, moderate load of about 70 events per second for a while, running both 1.2.1 and 1.2.2, on the same machine.

The configuration remained the same for both versions:

  • a file input that was tailing a single file using the multiline codec
  • a couple of grok filters and a geoIP filter, that we’ll talk about in later posts
  • resulting events are fed to Logsene through the elasticsearch_http output, because Logsene exposes the Elasticsearch API


The biggest difference we’ve seen was in memory usage. The new version uses about 30% less memory:


Next, there was a significant difference in the amount of garbage collection going on. Again, some 30% difference in favor of 1.2.2:


We also had slightly less CPU usage. The difference wasn’t significant in our case because we didn’t have a lot of traffic: Logstash is installed on every host and there’s no “central” Logstash to process lots of data. I’m sure that if we had a more complex configuration and/or more traffic, we would have seen a much bigger drop in CPU usage.


Logstash is getting lighter and lighter, which seems to address its only criticism. You get less memory usage and less garbage collection, so we definitely recommend upgrading to 1.2.2. And if you want to monitor its usage, you can always use our SPM. Happy stashing!

Announcement: JVM Memory Pool Monitoring

Raise your hand if you’ve never experienced the dreaded OutOfMemoryError (aka OOM or OOME) while working with Java.  If you’ve raised your hand, count yourself lucky.  The vast majority of Java developers are nowhere near that lucky.  Today, SPM is making the lives of all Java developers a little better by adding JVM Memory Pool Monitoring reports alongside the existing JVM Heap, Threading, and Garbage Collection reports.  Note: you’ll need to get the new SPM client (version 1.16 or newer) to see these reports.


What are JVM Memory Pools (aka Spaces)

The super simplified version of this complex topic is that inside the JVM there are different memory pools (aka spaces) that the JVM uses for different types of objects.  The JVM uses some of these pools to store permanent bits, others for storing young objects, others for storing tenured objects, and so on.  Numerous blog posts and documents have been written on the topic of the inner workings of the JVM.  It’s a complex topic that is not easy to fully grok in one sitting.  To make the story more complex, different Garbage Collectors and certainly different JVM implementations manage objects differently.  Thus, which exact memory pools you see in SPM will depend on, and change with, the JVM you use, the garbage collection algorithm you select, and the garbage collection-related parameters you set.  And this is precisely one case where seeing these pools in SPM comes in reeeeeally handy – seeing how memory pool sizes and usage change as you try different Garbage Collectors or any other JVM parameter can be very informative, educational, and can bring insight into the workings of the JVM that lets you select parameters that are optimal for your particular application.
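If you are curious which pools your own JVM has, the standard java.lang.management API can list them. This is just a small sketch (the class name MemoryPoolListSketch is made up); the pool names it prints depend on your JVM and garbage collector, which is exactly the point made above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class MemoryPoolListSketch {

    /** Returns one line per memory pool: name, type (heap or non-heap), and its managers. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getName() + " [" + pool.getType() + "] managed by "
                    + String.join(", ", pool.getMemoryManagerNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

Try running it with different garbage collector flags and compare the output.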

The new JVM Memory Pools reports are available to all SPM users right away.  Let’s have a look at what these new reports look like and what information they provide.

Memory Pool Sizes

This report should be obvious to all Java developers who know how the JVM manages memory.  Here we see relative sizes of all memory spaces and their total size.  If you are troubleshooting performance of the JVM (i.e., any Java application) this is one of the key places to check first, in addition to looking at the Garbage Collection and Memory Pool Utilization reports (see further below).  In this graph we see a healthy sawtooth pattern clearly showing when major garbage collection kicked in.

JVM Memory Pool Size

Memory Pool Utilization

The Memory Pool Size graph is useful, but knowing the actual utilization of each pool may be even more valuable, especially if you are troubleshooting the OutOfMemoryError we mentioned earlier in the post.  Java OOME stack traces don’t often tell you much info about where the OOME happened.  The Memory Pool Utilization graph shows what percentage of each pool is being used over time.  When some of these Memory Pools approach 100% utilization and stay there, it’s time to worry.  When that happens, if you jump over to the Garbage Collection report in SPM you will probably see spikes there as well.  And if you then jump to your CPU report in SPM you will likely see increased CPU usage there, as the JVM keeps trying to free up some space in any pools that are (nearly) full.
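As a rough illustration of what "utilization" means here, the sketch below computes a used/limit percentage for each pool of the current JVM. It is an assumption-laden example, not SPM's formula: in particular, falling back to the committed size when a pool reports no maximum (getMax() returns -1) is a choice made for this sketch, and the class and method names are invented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical utilization calculation; SPM may compute its numbers differently.
class PoolUtilizationSketch {

    /** Plain percentage helper. */
    static double percent(long used, long limit) {
        return 100.0 * used / limit;
    }

    /** Returns pool name -> percentage of the pool currently in use. */
    static Map<String, Double> utilizationPercent() {
        Map<String, Double> result = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() is -1 when the pool has no defined maximum (assumption: use committed instead)
            long limit = usage.getMax() > 0 ? usage.getMax() : usage.getCommitted();
            if (limit > 0) {
                result.put(pool.getName(), percent(usage.getUsed(), limit));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        utilizationPercent().forEach(
                (name, pct) -> System.out.printf("%-30s %5.1f%%%n", name, pct));
    }
}
```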

JVM Memory Pool Utilization

Alerting for Memory Pool before OOME

If you see scary spikes or near 100% utilization of certain memory pools, your application may be in trouble.  Not dealing with this problem, whether through improving the application’s use of memory or giving the JVM more memory via -Xmx and related parameters, will likely result in a big bad OOME.  Nobody wants that to happen.  One way to keep an eye on this is via Alerts in SPM.  As you can see from the graphs, memory utilization naturally and typically varies quite a bit.   Thus, although SPM has nice Algolerts, to monitor utilization of JVM memory pools we recommend using standard threshold-based Alerts and getting alerted when utilization percentage is > N% for M minutes.  N and M are up to you, of course.
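The "> N% for M minutes" rule boils down to very little logic. The following sketch shows one way to express it; the class name and the record method are hypothetical and this is not SPM's alerting code. Timestamps are passed in explicitly so the logic is easy to test.

```java
// Hypothetical threshold alert: fires when utilization stays above N% for at least M.
class ThresholdAlertSketch {

    private final double thresholdPercent; // N
    private final long windowMillis;       // M, in milliseconds
    private long aboveSince = -1;          // when utilization first went above N, or -1

    ThresholdAlertSketch(double thresholdPercent, long windowMillis) {
        this.thresholdPercent = thresholdPercent;
        this.windowMillis = windowMillis;
    }

    /** Records one measurement and returns true if the alert should fire. */
    boolean record(long nowMillis, double utilizationPercent) {
        if (utilizationPercent <= thresholdPercent) {
            aboveSince = -1; // back under the threshold: start over
            return false;
        }
        if (aboveSince < 0) {
            aboveSince = nowMillis;
        }
        return nowMillis - aboveSince >= windowMillis;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        ThresholdAlertSketch alert = new ThresholdAlertSketch(90.0, 5 * minute);
        double[] readings = {95, 96, 97, 98, 99, 99, 70};
        for (int i = 0; i < readings.length; i++) {
            System.out.println("minute " + i + ": fire=" + alert.record(i * minute, readings[i]));
        }
    }
}
```

A short spike resets nothing and triggers nothing; only utilization that stays high for the whole window fires the alert, which suits the naturally spiky graphs shown above.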

Please tell us what you think – @sematext is always listening!  Is there something SPM doesn’t monitor that you would really like to monitor?  Please vote for tech to monitor!

Announcing Scalable Performance Monitoring (SPM) for JVM

Up until now, SPM existed in several flavors for monitoring Solr, HBase, ElasticSearch, and Sensei. Besides metrics specific to a particular system type, all these SPM flavors also monitor OS and JVM statistics.  But what if you want to monitor any Java application?  Say your custom Java application runs either in some container, application server, or from a command line?  You don’t really want to be forced to look at blank graphs that are really meant for stats from one of the above mentioned systems.  This was one of our own itches, and we figured we were not the only ones craving to scratch that itch, so we put together a flavor of SPM for monitoring just the JVM and (Operating) System metrics.

Now SPM lets you monitor OS and JVM performance metrics of any Java process through the following 5 reports, along with all other SPM functionality like integrated Alerts, email Subscriptions, etc.  If you are one of many existing SPM users these graphs should look very familiar.

JVM: heap, thread stats

We are not including it here, but the JVM report includes an additional and valuable Garbage Collection graph if you are using Java 7.

Garbage Collection: collection time & count

CPU & Memory: CPU stats breakdown, system load, memory stats breakdown, swap

Disk: I/O rates, disk space used & free

Network: I/O rates
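For a feel of where JVM numbers like these can come from, here is a small sketch that reads heap usage, thread count, garbage collection counts and times, and system load through the standard java.lang.management MXBeans. It only illustrates the kind of data involved; the class name and output format are invented, and SPM's agent collects far more than this (disk and network stats, for example, are not available through these beans).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical snapshot of a few JVM and OS metrics; not SPM's collector.
class JvmStatsSketch {

    static String snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        StringBuilder out = new StringBuilder();
        out.append("heap.used=").append(heap.getUsed()).append('\n');
        out.append("heap.committed=").append(heap.getCommitted()).append('\n');
        out.append("threads.live=")
           .append(ManagementFactory.getThreadMXBean().getThreadCount()).append('\n');
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time (ms) since JVM start; -1 means "undefined"
            out.append("gc[").append(gc.getName()).append("].count=")
               .append(gc.getCollectionCount()).append('\n');
            out.append("gc[").append(gc.getName()).append("].timeMs=")
               .append(gc.getCollectionTime()).append('\n');
        }
        // Negative when the platform does not provide a load average
        out.append("system.load=")
           .append(ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage())
           .append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```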

To start monitoring, one should have a valid Sematext Apps account, which you can get free of charge here. After that, define a new SPM JVM System, download the installation package, and proceed with the installation.  The whole process should take around 10 minutes (assuming you are not multitasking or suffer from ADD, that is).

The installation process is simple as always and is described on the installer download page. After the installation is done, the monitor is enabled in your Java process by adding a -javaagent argument to the command line of the Java process/application you want to monitor.

For example, if my application is com.sematext.Snoopy I could run it with SPM parameters as shown here:

java -javaagent:/spm/spm-monitor/lib/spm-monitor-jvm-1.6.0-withdeps.jar=/spm/spm-monitor/conf/spm-monitor-config-YourSystemTokenHere-default.xml com.sematext.Snoopy

After you are finished with the installation, the stats should start to appear in SPM after just a few minutes.
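If you have not used -javaagent before: the JVM loads the given jar, looks up the class named in its Premain-Class manifest attribute, and calls that class's premain method before your application's main, passing along whatever follows the = sign as a single string. The sketch below shows that general mechanism only. AgentSketch and configPath are hypothetical names, this is not the SPM monitor's code, and a real agent class would normally be public and packaged in a jar with the proper manifest.

```java
import java.lang.instrument.Instrumentation;

// Hypothetical minimal agent; shows the standard premain entry point only.
class AgentSketch {

    /** Called by the JVM before the application's main when started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        System.out.println("agent loaded, config file: " + configPath(agentArgs));
    }

    /** Everything after '=' in -javaagent:agent.jar=... arrives as one string. */
    static String configPath(String agentArgs) {
        if (agentArgs == null || agentArgs.trim().isEmpty()) {
            return "<none>";
        }
        return agentArgs.trim();
    }

    public static void main(String[] args) {
        // Simulates what the JVM would pass for the command line shown above.
        System.out.println(configPath(
                "/spm/spm-monitor/conf/spm-monitor-config-YourSystemTokenHere-default.xml"));
    }
}
```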

Happy monitoring!


ElasticSearch Cache Usage

We’ve been doing a ton of work with ElasticSearch. Not long ago, we had a few situations where ElasticSearch would “eat” all the JVM heap memory we give it.  It was so hungry, we could not feed it enough memory to keep it happy.  It was insatiable.  After some troubleshooting and looking at SPM for ElasticSearch (btw. we released a new version of the SPM agent earlier this week, so if you don’t have it, go grab agent v1.5.0) we figured out the cause – ElasticSearch default field cache setting was not quite right for our deployment. In this post we’ll share our experience on this topic, explain why this was happening and how to minimize the negative effect of large field caches.

ElasticSearch Cache Types

There are two types of caches in ElasticSearch whose behaviors you can control. The first cache is the filter cache. This cache is responsible for caching results of filters used in your queries. This is very handy, because after a filter is run once, ElasticSearch will subsequently use values stored in the filter cache, which saves precious disk I/O operations and thus speeds up query execution. There are two main implementations of filter cache in ElasticSearch:

  1. node filter cache (default)
  2. index filter cache

The node filter cache is an LRU cache, which means that the least recently used items will be evicted when the filter cache is full. Its size can be limited either to a percentage of the total memory allocated to the Java process or to an exact amount of memory. The second type of filter cache is the index filter cache. It is not recommended for use because you can’t predict (in most cases) how much memory it will use, since that depends on which shards are located on which node. In addition to that, you can’t control the amount of memory used by index filter cache, you can only set its expiration time and maximum number of entries in that cache.
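If LRU eviction is new to you, the idea fits in a few lines of plain Java. The sketch below is a generic LRU cache built on LinkedHashMap, not ElasticSearch's filter cache: it is bounded by number of entries rather than by memory, and the class name is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Generic LRU cache sketch; bounded by entry count, unlike ES's memory-based limit.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, least recently used first
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after every put; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("filter-a", "bitset-a");
        cache.put("filter-b", "bitset-b");
        cache.get("filter-a");             // touch a, so b becomes least recently used
        cache.put("filter-c", "bitset-c"); // cache is full: b gets evicted
        System.out.println(cache.keySet());
    }
}
```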

The second type of cache in ElasticSearch is field data cache. This cache is used for sorting and faceting in most cases. It loads all values from the field you sort or facet on and then provides calculations on the basis of loaded values. You can imagine that the cost of building such a cache for a large amount of data might be very high.  And it is.  Apart from the type (which can be either resident or soft) you can control two additional parameters of field data cache – the maximum number of entries in it and its expiration time.

The Defaults

The default implementation for the filter cache is the node filter cache, with its size set to a maximum of 20% of the memory allocated to the Java process. As you can imagine there is nothing to worry about – if the cache fills up, the least recently used entries will get evicted.  You can then consider adding more RAM to make the node filter cache bigger, or you must live with evictions. That’s perfectly acceptable.

On the other hand we have the default settings for ElasticSearch field data cache – it is a resident cache with unlimited size. Yes, unlimited. The cost of rebuilding this cache is very high and thus you must know how much memory it can use – you must control your queries and watch what you sort on and on which fields you do the faceting.

What Happens When You Don’t Control Your Cache Size?

This is what can happen when you don’t control your field data cache size:

As you can see on the above chart, field data cache jumped to more than 58 GB, which is enormous. Yes, we got OutOfMemory exceptions during that time.

What Can You Do?

There are actually three things you can do to make your field data cache use less memory:

Control its Size and Expiration Time

When using the default, resident field data cache type, you can set its size and expiration time. However, please remember that there are situations when you need the field data cache to hold values for that particular field you are sorting or faceting on. In order to change field data cache size, you specify the following property:

index.cache.field.max_size

It specifies the maximum number of entries in that cache per Lucene segment. It doesn’t limit the amount of memory field data cache can use, so you have to do some testing to ensure queries you are using won’t result in an OutOfMemory exception.

The other property you can set is the expiration time.  It defaults to -1, which means that cache entries never expire by default. In order to change that, you must set the following property:

index.cache.field.expire

So if, for example, you would like to have a maximum of 50k entries of field data cache per segment and if you would like to have those entries expired after 10 minutes, you would set the following property values in ElasticSearch configuration file:

index.cache.field.max_size: 50000
index.cache.field.expire: 10m
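To show what a "maximum size plus expiration time" policy means in practice, here is a small plain-Java sketch of a cache with both limits. It is not ElasticSearch's implementation: the class name is invented, entries here expire a fixed time after they were written (an assumption, the real setting may count from last access), and the oldest entry is dropped when the cache is full. A negative expiration means "never expire", mirroring the -1 default mentioned above.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Hypothetical size- and time-bounded cache; semantics are assumptions for illustration.
class BoundedExpiringCacheSketch<K, V> {

    private static final class Entry<V> {
        final V value;
        final long writtenAt;

        Entry(V value, long writtenAt) {
            this.value = value;
            this.writtenAt = writtenAt;
        }
    }

    private final int maxSize;
    private final long expireMillis; // negative = never expire
    private final LinkedHashMap<K, Entry<V>> entries = new LinkedHashMap<>();

    BoundedExpiringCacheSketch(int maxSize, long expireMillis) {
        this.maxSize = maxSize;
        this.expireMillis = expireMillis;
    }

    void put(K key, V value, long nowMillis) {
        entries.remove(key); // re-inserting moves the key to the "newest" position
        entries.put(key, new Entry<>(value, nowMillis));
        if (entries.size() > maxSize) {
            Iterator<K> oldest = entries.keySet().iterator();
            oldest.next();
            oldest.remove(); // drop the oldest entry to respect maxSize
        }
    }

    /** Returns the cached value, or null if it is missing or has expired. */
    V get(K key, long nowMillis) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (expireMillis >= 0 && nowMillis - entry.writtenAt >= expireMillis) {
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        // 50k entries, 10 minutes: the same numbers as in the configuration above
        BoundedExpiringCacheSketch<String, long[]> cache =
                new BoundedExpiringCacheSketch<>(50_000, 10 * 60_000L);
        cache.put("timestamp", new long[] {1L, 2L, 3L}, 0L);
        System.out.println("after 9 min:  " + (cache.get("timestamp", 9 * 60_000L) != null));
        System.out.println("after 10 min: " + (cache.get("timestamp", 10 * 60_000L) != null));
    }
}
```

Note that, just like the real settings, neither limit bounds memory directly: 50k small entries and 50k huge entries count the same.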

Change its Type

The other thing you can do is change field data cache type from the default resident to soft. Why does that matter? ElasticSearch uses Google Guava libraries to implement its cache. The soft type wraps cache values in soft references, which means that whenever memory is needed the garbage collector can clear those references, even though the cache still holds them. This means that when you start hitting the heap memory limit, the JVM won’t throw an OutOfMemory exception right away, but will first release those softly referenced values with the use of the garbage collector. More about soft references can be found at:

So in order to change the default field data cache type to soft you should add the following property to ElasticSearch configuration file:

index.cache.field.type: soft
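For readers who have not worked with soft references, here is a minimal sketch of a soft-reference cache using the JDK's java.lang.ref.SoftReference. It is not ElasticSearch's or Guava's code, and the class name and loader-based get method are invented for this example. The key behavior: a cached value can silently disappear when the JVM is short on memory, so the cache must be ready to rebuild it.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical soft-reference cache; illustrates the idea behind the "soft" cache type.
class SoftCacheSketch<K, V> {

    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    /** Returns the cached value, reloading it if it was never cached or was cleared by the GC. */
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = entries.get(key);
        V value = ref == null ? null : ref.get();
        if (value == null) {
            value = loader.apply(key); // expensive rebuild, e.g. loading all field values
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCacheSketch<String, long[]> cache = new SoftCacheSketch<>();
        long[] values = cache.get("price", field -> new long[1_000]);
        // While "values" is strongly reachable here, the entry cannot be cleared.
        System.out.println(values == cache.get("price", field -> new long[1_000]));
    }
}
```

The trade-off is the one described above: you avoid the OutOfMemory exception, but you pay with cache rebuilds and extra garbage collection work when memory gets tight.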

Change Your Data

The last thing you can do is the operation that requires much more effort than only changing ElasticSearch configuration – you may want to change your data. Look at your index structure, look at your queries and think. Maybe you can lowercase some string data and this way reduce the number of unique values in the field? Maybe you don’t need your dates to be precise down to a second, maybe it can be a minute or even an hour? Of course, when doing some faceting operations you can set the granularity, but the data will still be loaded into memory. So if there are parts of your data that can be changed in a way that will result in lower memory consumption, you should consider it. As a matter of fact, that is exactly what we did!
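Here is a small sketch of what such data changes could look like before indexing, with a quick count of how many unique values remain. The class, method names, and sample values are made up for illustration; which normalizations are acceptable depends entirely on your own queries.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical pre-indexing normalization to reduce the number of unique field values.
class FieldCardinalitySketch {

    /** Lowercases a term so "Berlin", "BERLIN" and "berlin" become one unique value. */
    static String normalizeTerm(String raw) {
        return raw.toLowerCase(Locale.ROOT);
    }

    /** Drops minutes, seconds and below, keeping hour precision only. */
    static Instant truncateToHour(Instant timestamp) {
        return timestamp.truncatedTo(ChronoUnit.HOURS);
    }

    static <T> int distinctCount(Collection<T> values) {
        return new HashSet<>(values).size();
    }

    public static void main(String[] args) {
        List<String> cities = List.of("Berlin", "BERLIN", "berlin", "Paris");
        List<Instant> times = List.of(
                Instant.parse("2012-07-01T10:15:30Z"),
                Instant.parse("2012-07-01T10:15:31Z"),
                Instant.parse("2012-07-01T10:59:59Z"));

        System.out.println("cities before: " + distinctCount(cities));
        System.out.println("cities after:  " + distinctCount(
                cities.stream().map(FieldCardinalitySketch::normalizeTerm)
                      .collect(Collectors.toList())));
        System.out.println("times before:  " + distinctCount(times));
        System.out.println("times after:   " + distinctCount(
                times.stream().map(FieldCardinalitySketch::truncateToHour)
                     .collect(Collectors.toList())));
    }
}
```

Fewer unique values per field means fewer values to load into the field data cache when you sort or facet on that field.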

Caches After Some Changes

After we made some changes in our ElasticSearch configuration/deployment, this is what the field data cache usage looked like:

As you can see, the cache dropped from 58 GB down to 37 GB.  Impressive drop!  After these changes we stopped running into OutOfMemory exception problems.


You have to remember that the default settings for field data cache in ElasticSearch may not be appropriate for you. Of course, that may not be the case in your deployment. You may not need sorting, apart from the default based on Lucene scoring, and you may not need faceting on fields with many unique terms. If that’s the case for you, don’t worry about field data cache. If you have enough memory for holding the field data for your facets and sorting then you also don’t need to change anything regarding the cache setup from the default ElasticSearch configuration. What you need to remember is to monitor your JVM heap memory usage and cache statistics, so you know what is happening in your cluster and can react before things get worse.

One More Thing

The charts you see in the post are taken from SPM (Scalable Performance Monitoring) for ElasticSearch.  SPM is currently free and, as you can see, we use it extensively in our client engagements.  If you give it a try, please let us know what you think and what else you would like to see in it.

@sematext (Like working with ElasticSearch?  We’re hiring!)

Solr Performance Monitoring with SPM

Originally delivered as Lightning Talk at Lucene Eurocon 2011 in Barcelona, this quick presentation shows how to use Sematext’s SPM service, currently free to use for unlimited time, to monitor Solr, OS, JVM, and more.

We built SPM because we wanted to have a good and easy to use tool to help us with Solr performance tuning during engagements with our numerous Solr customers.  We hope you find our Scalable Performance Monitoring service useful!  Please let us know if you have any sort of feedback, from SPM functionality and usability to its speed.  Enjoy!