
Understanding Memory Leaks in Java: Common Causes & How to Detect Them

March 10, 2022


There are multiple reasons why Java and other Java Virtual Machine-based languages are so popular among developers. A rich ecosystem with lots of open-source frameworks that can be easily incorporated and used is only one of them. The second, in my opinion, is automatic memory management with a powerful garbage collector. The Java garbage collector, or GC for short, takes care of cleaning up the unused bits and pieces. That means that every object that occupies heap memory and is no longer used will eventually be cleared. However, not everything is as bright as it may look at first sight. The whole Java Virtual Machine ecosystem, and thus Java, is susceptible to memory leaks as well. Let’s look into what a Java memory leak is, how to detect whether our software is suffering from one, and how to deal with it.

Definition: What Is a Memory Leak in Java

A memory leak is a situation where an object or objects are no longer used, but at the same time, they can’t be removed by the constantly working garbage collector. We can divide the objects that are in memory into two main categories:

  • Referenced objects are those that are still reachable from our application code, whether or not they will ever be used again.
  • Unreferenced objects are those that are no longer reachable from the application code.

The garbage collector will eventually remove unreferenced objects from the heap, making space for new ones, but it will not remove referenced objects, because it has to assume they are still needed. A leak occurs when objects that the application no longer needs stay referenced. Such objects make the used heap grow larger and larger and force the garbage collector to do more and more work. This will slow your application down or even crash it eventually with an OutOfMemoryError.

Symptoms of a Memory Leak

There are a few symptoms that can point you to suspect that your Java application is suffering from memory leaks. Let’s discuss the most common ones:

  • Java OutOfMemory errors when the application is running.
  • Performance degradation that is absent right after the application starts and appears only after it has been running for a longer time.
  • Increasing garbage collection times the longer the application runs.
  • Running out of connections.

What Causes Java Memory Leaks

It’s brutal, but I have to say it: we, the developers, are causing the memory leaks. They are caused by application code that is not properly written. Luckily, a few types of Java memory leaks are known quite well, and by paying attention when we write our Java code, we can avoid most of them.

Types of Memory Leaks

Let’s look at the most common causes of memory leaks.

Static Field Holding Object Reference

One of the simplest examples of a Java memory leak is objects referenced via static fields that are never cleared, for example, a static field holding a collection of objects that we never clear or throw away. Such behavior can be demonstrated with the following code:

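import java.util.ArrayList;
import java.util.List;

public class StaticReferenceLeak {
  // Static collection: it lives as long as the class is loaded, so its contents are never collected
  public static List<Integer> NUMBERS = new ArrayList<>();

  public void addBatch() {
    for (int i = 0; i < 100000; i++) {
      NUMBERS.add(i);
    }
  }

  public static void main(String[] args) throws Exception {
    for (int i = 0; i < 1000000; i++) {
      // The instance is discarded right away, but the data it added stays in NUMBERS
      (new StaticReferenceLeak()).addBatch();
      System.gc();
      Thread.sleep(10000);
    }
  }
}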

The addBatch method adds 100000 integers to the collection called NUMBERS. This is, of course, perfectly fine if we need that data. But in this case, we never delete it. Even though we create the StaticReferenceLeak object in the main method and don’t hold a reference to it, the garbage collector can’t clean up the memory. Instead, memory usage constantly grows.

If we couldn’t see the implementation details of the StaticReferenceLeak class, we would expect the memory used by the object to be released, but this is not the case because the NUMBERS collection is static. There would be no problem if it weren’t static, so be extra careful when using static variables.

How to avoid it: To prevent this type of Java memory leak, minimize the usage of static variables. Be extra careful if you must have them and, of course, remove the data from static collections when it is no longer needed.
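As a minimal sketch of the last point, assuming the static collection is only needed while a batch is being processed, the data can be cleared in a finally block once the work is done. The class name StaticCollectionCleanup and the sumBatch method are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class StaticCollectionCleanup {
  // Static buffer shared by all callers, as in the leaking example
  static final List<Integer> NUMBERS = new ArrayList<>();

  // Fills the buffer, computes a result, and always empties the buffer afterwards
  static long sumBatch(int size) {
    try {
      for (int i = 0; i < size; i++) {
        NUMBERS.add(i);
      }
      long sum = 0;
      for (int n : NUMBERS) {
        sum += n;
      }
      return sum;
    } finally {
      // Drop the references so the garbage collector can reclaim the Integer objects
      NUMBERS.clear();
    }
  }

  public static void main(String[] args) {
    System.out.println(sumBatch(100000));
    System.out.println(NUMBERS.size());
  }
}
```

After each call the collection is empty again, so repeated batches no longer accumulate on the heap. If the data is never shared between calls, a local variable instead of a static field is usually the simpler choice.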

Unclosed Resources

It isn’t rare to access resources located on remote servers, open files and process them, and so on. Such code requires opening a stream, connection, or file inside our code. But we have to remember that we are the ones responsible not only for opening the resource but also for closing it. Otherwise, our code can leak memory, eventually leading to an OutOfMemoryError. To illustrate the problem let’s have a look at the following example:

import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

public class UnclosedResources {
  public static void main(String[] args) throws Exception {
    for (int i = 0; i < 1000000; i++) {
      URL url = new URL("http://www.google.com");
      URLConnection conn = url.openConnection();
      // The stream is opened on every iteration but never closed
      InputStream is = conn.getInputStream();
      // rest of the code goes here
    }
  }
}

Each iteration of the above loop opens a new URLConnection and an InputStream that are never closed, which slowly exhausts resources such as memory and the underlying connections.

How to avoid it: Preventing the described situation is actually quite simple: either close the resource in a finally block or use the try-with-resources statement, available since Java 7.
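The try-with-resources approach can be sketched as follows. To keep the example runnable without network access, it uses a hypothetical TrackedResource class (not part of the original code) that counts how many instances are currently open. InputStream implements AutoCloseable too, so the same pattern applies to the stream from the example above.

```java
import java.util.concurrent.atomic.AtomicInteger;

class TryWithResourcesExample {
  static int process(int iterations) {
    int processed = 0;
    for (int i = 0; i < iterations; i++) {
      // close() is called automatically when the block exits, even if an exception is thrown
      try (TrackedResource resource = new TrackedResource()) {
        processed += resource.read().length();
      }
    }
    return processed;
  }

  public static void main(String[] args) {
    process(1000);
    System.out.println("Open resources: " + TrackedResource.OPEN.get());
  }
}

// Hypothetical resource that tracks the number of instances that are still open
class TrackedResource implements AutoCloseable {
  static final AtomicInteger OPEN = new AtomicInteger();

  TrackedResource() {
    OPEN.incrementAndGet();
  }

  String read() {
    return "data";
  }

  @Override
  public void close() {
    OPEN.decrementAndGet();
  }
}
```

Because every resource is closed at the end of its try block, the counter is back at zero after the loop, no matter how many iterations ran.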

Using Objects with Improper equals() and hashCode() Implementations

Another common example of a Java memory leak is using objects whose equals() and hashCode() methods are implemented improperly (or not at all) in collections that use hashing to check for duplicates. One example of such a collection is HashSet. To illustrate that problem, let’s have a look at the following example:

import java.util.HashSet;
import java.util.Set;

public class HashAndEqualsNotImplemented {
  public static void main(String[] args) {
    Set<Entry> set = new HashSet<>();
    for (int i = 0; i < 1000; i++) {
      set.add(new Entry("test"));
    }
    System.out.println(set.size());
  }
}

// No equals() or hashCode(): the identity-based versions from Object are used
class Entry {
  public String entry;

  public Entry(String entry) {
    this.entry = entry;
  }
}

Before we dig into the explanation, ask yourself a simple question: What number will the System.out.println(set.size()) call print? If your answer is 1000 then you are right. That is because Entry doesn’t override equals() and hashCode(), so the HashSet falls back to the identity-based implementations inherited from Object. Each instance of Entry is therefore added to the HashSet regardless of whether it is a duplicate from our perspective. In a long-running application, such an ever-growing collection can eventually lead to an OutOfMemoryError. If we alter our code with a proper implementation, it prints 1 as the size of our HashSet. To give you an example, here is the code with the equals() and hashCode() methods generated by JetBrains IntelliJ:

import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class HashAndEqualsNotImplemented {
  public static void main(String[] args) {
    Set<Entry> set = new HashSet<>();
    for (int i = 0; i < 1000; i++) {
      set.add(new Entry("test"));
    }
    System.out.println(set.size());
  }
}

class Entry {
  public String entry;

  public Entry(String entry) {
    this.entry = entry;
  }

  // Two entries are equal when their entry strings are equal
  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    Entry entry1 = (Entry) o;
    return Objects.equals(entry, entry1.entry);
  }

  // Equal entries must produce the same hash code
  @Override
  public int hashCode() {
    return Objects.hash(entry);
  }
}

How to avoid it: As a rule of thumb, properly implement the equals() and hashCode() methods when creating classes. Most modern IDEs will assist you in implementing them.

Inner Classes that Reference Outer Classes

A very interesting case in my opinion – the case of the inner, private class keeping the reference to its parent class. Consider the following scenario:

public class OuterClass {
  // some large arrays of values
  private InnerClass inner;
  public void create() {
    inner = new InnerClass();
    // do something with inner and keep it
  }
  class InnerClass {
    // some logic of the inner class
  }
}

Assume that the OuterClass contains references to a large number of objects that occupy a lot of memory. Even if the OuterClass instance itself is no longer used, it won’t be garbage collected as long as its InnerClass object is still referenced somewhere. That’s because every InnerClass object holds an implicit reference to the OuterClass instance that created it, which makes that instance ineligible for garbage collection.

How to avoid it: Consider the requirements of the inner class and whether it needs to access the data in the outer class. If not, making the inner class static will resolve the issue. You can also think about whether the inner, private class is actually needed in the first place; maybe a different architecture pattern can be used.
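A minimal sketch of the static variant, assuming the nested class only needs a small piece of the outer object’s data. The names OuterWithStaticNested and Worker are hypothetical and chosen for illustration only.

```java
class OuterWithStaticNested {
  // Stands in for the "large arrays of values" held by the outer class
  private final int[] largeData = new int[1_000_000];

  // static: instances carry no implicit reference to OuterWithStaticNested
  static class Worker {
    private final int limit;

    Worker(int limit) {
      this.limit = limit;
    }

    int limit() {
      return limit;
    }
  }

  Worker createWorker() {
    // Pass only the data the worker needs instead of the whole outer object
    return new Worker(largeData.length);
  }

  public static void main(String[] args) {
    Worker worker = new OuterWithStaticNested().createWorker();
    // The outer instance created above is no longer referenced and can be
    // collected, even though we still hold the worker.
    System.out.println(worker.limit());
  }
}
```

A static nested class can also be created without any outer instance at all, which is a quick way to confirm that it doesn’t depend on one.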

ThreadLocals

ThreadLocal is a construct in the Java world that lets each thread keep its own copy of a value, isolating that state to the current thread and thus achieving thread safety in some cases. You can keep information about the current user, the context of the execution bound to the user, or anything that requires isolation between threads.

The problem appears when you start thinking from a broader perspective. Modern application servers or servlet containers use thread pools to control the number of threads that can run concurrently, and thus reuse the same threads over and over again. The pooled threads are not garbage collected, since the pool itself constantly keeps references to them. This is not an issue with ThreadLocal itself but a general complication of the modern technology stack. You should expect that the values assigned to a ThreadLocal live as long as the thread does, so they need to be cleaned up explicitly. Otherwise, they keep occupying memory.

How to avoid it: First of all, clean up everything. ThreadLocal provides the remove() method, which removes the current thread’s value for the variable, effectively clearing the data. Consider calling remove() in a finally block so that the data is removed from memory even if an exception happens during the code execution.
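The cleanup described above can be sketched like this. The CURRENT_USER variable and the handleRequest method are hypothetical names that stand in for whatever per-thread context your application keeps.

```java
class ThreadLocalCleanup {
  // Hypothetical per-thread context, e.g. the user behind the current request
  static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

  static String handleRequest(String user) {
    CURRENT_USER.set(user);
    try {
      return "Handled request for " + CURRENT_USER.get();
    } finally {
      // Always clear the value so that a pooled thread does not keep it alive
      CURRENT_USER.remove();
    }
  }

  public static void main(String[] args) {
    System.out.println(handleRequest("alice"));
    // After remove(), get() falls back to the initial value, which is null here
    System.out.println(CURRENT_USER.get());
  }
}
```

With the finally block in place, the next task that runs on the same pooled thread starts with an empty ThreadLocal instead of inheriting stale data.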

Java Memory Leak Detection

There are multiple ways of diagnosing Java memory leaks, but a single one will not prevent or detect everything. You need to choose the ones that are good for your use case and can be used inside your software development cycle.

Verbose Garbage Collection

One of the simplest ways to know what is happening with your memory is by observing how the Java Virtual Machine garbage collection works. The longer it takes for the garbage collection to perform its job, the more likely your application is to have issues with the memory. It may not be a memory leak in the first place, but verbose garbage collection can help you identify that something is wrong. Turning on verbose garbage collection logging is easy: you just need to add the -verbose:gc parameter to your JVM startup parameters, and you’re set. On Java 9 and newer, the unified logging option -Xlog:gc* provides more detailed output.
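If changing the startup parameters is not an option, similar numbers can be read from inside the application. The following is a minimal sketch that uses the standard java.lang.management API. The GcStats class and its snapshot method are hypothetical names, and how you collect or store the values is up to you.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcStats {
  // Builds one line per collector: name, number of collections, accumulated time
  static List<String> snapshot() {
    List<String> lines = new ArrayList<>();
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      // Both getters return -1 when the value is undefined for a collector
      lines.add(gc.getName() + ": count=" + gc.getCollectionCount()
          + ", timeMs=" + gc.getCollectionTime());
    }
    return lines;
  }

  public static void main(String[] args) {
    snapshot().forEach(System.out::println);
  }
}
```

Taking such a snapshot periodically and comparing the values over time shows whether the collector is spending more and more time on its work. The collector names depend on the JVM and the garbage collector in use.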

Memory Profilers

Memory profilers, or more generally Java Virtual Machine profilers, such as Java VisualVM, YourKit, JProfiler, and Mission Control, are applications that give you deep insight into what is happening inside the Java Virtual Machine. One of their key features is memory analysis with insights into what objects are stored on the heap, what references them, what data is kept in memory, and so on. Profiling the application at runtime can point to issues, and memory profilers are very helpful for narrowing down the cause, as you can easily inspect the contents and object references.

VisualVM

Heap Dumps

While memory profilers are very nice and give a lot of insight into how your application uses the heap memory, it is not always possible to use a profiler, especially in a production environment. That’s where heap dumps come in handy. A heap dump is a snapshot of the heap memory of your Java Virtual Machine that can be generated on demand or, for example, when the application crashes with an OutOfMemoryError.

Source: www.yourkit.com

To enable a heap dump on an out-of-memory error, add the -XX:+HeapDumpOnOutOfMemoryError flag to your JVM startup parameters. Whenever your application throws an OutOfMemoryError, a heap dump will be generated. Once the heap dump is generated you will need a tool to analyze it. Most of the tools that can do memory profiling can also load a heap dump and provide the analysis. There are also others, like Eclipse MAT. Keep in mind that to open and analyze a large heap dump you may need an amount of memory similar to the heap size.

There is one more thing to remember: the time needed to dump the memory. While this may not be an issue for applications with smaller heap sizes, a JVM with a large heap may require a substantial amount of time to write it. You also need to be sure that you have enough free storage space to hold the heap dump. The free disk space should be higher than the maximum heap size of your application.
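A heap dump can also be triggered on demand from code. The sketch below assumes a HotSpot-based JVM, because it relies on the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean. The HeapDumper class and the temporary directory location are hypothetical choices made for the example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

class HeapDumper {
  // Writes a heap dump into a fresh temporary directory and returns its path
  static Path dumpToTempDir(boolean liveObjectsOnly) {
    try {
      // The target file must not exist yet, so resolve a new name in a new directory
      Path file = Files.createTempDirectory("heapdump").resolve("heap.hprof");
      HotSpotDiagnosticMXBean bean =
          ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
      // liveObjectsOnly = true dumps only objects that are still reachable
      bean.dumpHeap(file.toString(), liveObjectsOnly);
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path file = dumpToTempDir(true);
    System.out.println("Heap dump written to " + file
        + " (" + file.toFile().length() + " bytes)");
  }
}
```

The resulting .hprof file can be opened in the same tools as a dump produced by the JVM flag. The same caveats about dump time and disk space apply here.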

Code Reviews


Github Code Review

Reviewing and approving code before it is merged into the main branch is the manual way of detecting issues. Although not dedicated to memory leaks, having a person or a group of people review the code is beneficial in many ways. If approached and done right, it helps improve the general code quality and spot memory leaks that a human can detect by reading the code.

Code Benchmarking

Benchmarking your Java code’s performance before and after the changes is yet another way to potentially detect Java memory leaks and more. Comparing the performance of the code over a longer period of time helps detect potential performance degradation that can be caused by inefficient design and implementation, bugs, or memory leaks.

IDE Memory Leak Warnings

Some IDEs provide warnings for potential memory leaks when configured properly. For example, let’s take a look at Eclipse. We can take options like Resource leak and Potential resource leak and change them from Ignore to the Error level. In such a case, code that can potentially lead to memory leaks will be flagged by the IDE. Errors like that help you identify issues during development and are one of the ways to prevent leaks that can be detected by automatic code analysis.

Java Virtual Machine Monitoring Tools

Besides the tools and techniques that we’ve covered so far, there is an additional family of products that can help you identify issues with your applications: the observability tools. Commercial and open-source solutions, when integrated correctly, can give you and your company powerful insights into how your application is working and how it is using its resources, one of which is the heap memory. Observing the trends in the observability platform of your choice gives you information on how your application behaves. Slowdowns over time, growing memory usage, and higher garbage collector activity all point to potential memory leak problems. One such observability platform is Sematext Cloud, and the next section is dedicated to showing you how to use it to diagnose Java memory leaks. We also wrote an article about the Java monitoring tools if you want to see the best options available out there.

Find and Analyze Java Memory Leaks with Sematext

Sematext Cloud provides two main capabilities when you need a great tool for memory leak analysis.

Sematext Cloud JVM Monitoring

The first one is JVM Monitoring, which provides insight into what is happening inside your JVM in real time. The view of the memory usage, JVM pool size, JVM pool utilization, and heap memory helps you understand the patterns. When looking at these JVM metrics over a longer period of time you can easily spot memory growing over time, which is a potential sign of a memory leak happening inside your application. JVM Monitoring also gives a view of the basic garbage collector metrics, which is also extremely useful.

Sematext Cloud JVM Garbage Collector Logs Integration

When talking about garbage collectors, Sematext Cloud provides the Garbage Collector log integration, which parses the garbage collector logs coming from your Java Virtual Machine and gives you invaluable insights into the garbage collector’s work without the need to analyze the files yourself. You’ll be able to see detailed information about the working of the garbage collector, such as timings or how much memory was used before and after collection. Such information can help in spotting potential memory leaks over time, as more and more time is spent collecting garbage while less and less memory is reclaimed.

Conclusion

Dealing with memory leaks in your Java applications requires knowledge, experience, and care when writing code. But even with thoughtful coding and effective code reviews, issues can happen, and you should be able to quickly and efficiently narrow them down so that they don’t affect your users. Sematext Cloud is a perfect tool for monitoring your Java Virtual Machine applications, one to handle it all. You get insights into the workings of your application, like memory, garbage collector, JVM threads, and so on. In addition, Sematext Cloud gives you an option to ship your garbage collector logs to get invaluable insights that help you identify potential issues with memory. If you are running a JVM application yourself, Sematext Cloud has a 14-day free trial for you to try all of its features.
