Memory Leak Due To Mutable Keys in Java Collections

Java Collections classes (such as Map, List and Set) are widely used in our applications. When the keys of a hash-based collection are not handled properly, the result can be a memory leak. In this post, let’s discuss how an incorrectly handled HashMap key results in OutOfMemoryError. We will also discuss how to diagnose such problems effectively and fix them.

HashMap Memory Leak

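Below is a sample program that simulates a memory leak in a HashMap caused by a mutated key (the ‘java.util.HashMap’ and ‘java.util.Map’ imports are left out so that the line numbers match the discussion that follows):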

01: public class OOMMutableKey {
02:    
03:    static class User {
04:    	
05:        String name;
06:
07:        User(String name) {
08:            this.name = name;
09:        }
10:
11:        @Override
12:        public int hashCode() {
13:            return name.hashCode();
14:        }
15:
16:        @Override
17:        public boolean equals(Object obj) {
18:            return obj instanceof User && name.equals(((User) obj).name);
19:        }
20:    }
21:
22:    public static void main(String[] args) {
23:        
24:        Map<User, String> map = new HashMap<>();
25:        int count = 0;
26:
27:        while (true) {
28:            // Step 1: Create a key
29:            User user = new User("Jack" + count);
30:            map.put(user, "Engineer");
31:
32:            // Step 2: Change the key *after* insertion
33:            user.name = "Jack & Jill" + count;
34:
35:            // Step 3: Try to remove using the mutated key
36:            map.remove(new User("Jack" + count)); // does not remove the record
37:            map.remove(new User("Jack & Jill" + count)); // does not remove the record either
38:            
39:            if (++count % 100_000 == 0) {
40:                System.out.println("Map size (leaked): " + map.size());
41:            }
42:        }
43:    }
44: }

Before continuing to read, please take a moment to review the above program closely.

  • In line #3, the ‘User’ class is defined, with ‘name’ (line #5) as its only instance variable. The class has legitimate ‘hashCode()’ and ‘equals()’ implementations based on the ‘name’ variable.
  • In line #27, the program enters an infinite loop (i.e. ‘while (true)’) and creates a new ‘User’ object on every iteration.
  • In line #29, the ‘name’ variable of the ‘User’ object is set to ‘JackX’, where X is the current value of ‘count’.
  • In line #30, the ‘User’ object is added to the ‘HashMap’ as a key.
  • In line #33, the ‘name’ of the user object is changed to ‘Jack & JillX’. Basically, the key of the ‘HashMap’ is mutated (i.e. changed) after insertion.
  • In line #36, the program tries to remove the ‘JackX’ user record, and in line #37 it tries to remove the ‘Jack & JillX’ user record from the ‘HashMap’. But both removals silently fail, i.e. the user object is not removed from the ‘HashMap’. Thus, when the program is executed, the HashMap keeps growing with user records and eventually results in ‘java.lang.OutOfMemoryError: Java heap space’.

Why Does a Mutable Key Result in OutOfMemoryError?

Fig: HashMap Implementation

In order to understand why the above program results in OutOfMemoryError, we need to understand how HashMap is implemented. In a nutshell:

  1. HashMap internally contains an array of buckets. Each bucket holds a list of records.
  2. HashMap uses the ‘hashCode()’ method of the key object to determine in which bucket the record should be stored. Once the bucket is determined, the record is placed in the list of that bucket.
  3. When we use the ‘get()’ or ‘remove()’ method, HashMap uses the same ‘hashCode()’ method of the key object to determine the bucket in which the record should be searched. Once the bucket is determined, the ‘equals()’ method is invoked against the keys of the records in that bucket’s list to find the matching record.
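The bucket selection described in step 2 can be sketched as follows. This is a minimal illustration modeled on what OpenJDK's HashMap does in Java 8 and later (spread the hash bits, then mask by the table length); the class and method names are hypothetical, and the real implementation also resizes the table and may convert long lists into trees.

```java
public class BucketIndexSketch {

    // Mirrors the bit-spreading step in OpenJDK's HashMap (Java 8+):
    // the upper 16 bits are XORed into the lower 16 bits.
    public static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // The table length is always a power of two, so a bit mask replaces modulo.
    public static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key.hashCode());
    }

    public static void main(String[] args) {
        int tableLength = 16; // default initial capacity of HashMap
        System.out.println("Jack1        -> bucket " + bucketIndex("Jack1", tableLength));
        System.out.println("Jack & Jill1 -> bucket " + bucketIndex("Jack & Jill1", tableLength));
    }
}
```

The point to take away is that the bucket is computed from the key's hash at the moment of the call, so a key whose hash changes later is looked up in a place that has nothing to do with where it was stored.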

Equipped with this knowledge, let’s discuss what happens when the ‘Jack1’ record is inserted into the ‘HashMap’. Based on the ‘hashCode()’ implementation in the User object, let’s say the ‘Jack1’ record gets inserted into the list in bucket #1. Once the record is stored, the name of that same key object is changed to ‘Jack & Jill1’. The HashMap is not notified of this change, so the record stays in bucket #1, but its key now contains ‘Jack & Jill1’ and not ‘Jack1’.

Now let’s answer the question: why doesn’t ‘map.remove(new User("Jack" + count))’ remove the record?

  • Based on the ‘hashCode()’ value of this new ‘Jack1’ user object, HashMap determines that the record should be in bucket #1.
  • HashMap then invokes ‘equals()’ against the keys present in the list of bucket #1. ‘equals()’ returns ‘false’, because the name of the user object stored in the list is now ‘Jack & Jill1’ and not ‘Jack1’.

Now let’s answer the question: why doesn’t ‘map.remove(new User("Jack & Jill" + count))’ remove the record?

  • The ‘hashCode()’ implementation returns a different value for ‘Jack & Jill1’, which will most likely cause the HashMap to look up the record in a different bucket, let’s say bucket #3.
  • Since the record is not present in bucket #3, it is not removed from the HashMap. Even if both names happened to map to the same bucket, HashMap also compares the hash value it cached when the record was inserted, and that value no longer matches.
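The two failed removals can be reproduced with a single record. The sketch below reuses the shape of the ‘User’ class from the sample program; the class name ‘MutatedKeyLookupDemo’ and the helper method are hypothetical, chosen only for illustration. It also tries to remove the record using the very same (mutated) key object, which fails for the same reason as the ‘Jack & Jill1’ lookup.

```java
import java.util.HashMap;
import java.util.Map;

public class MutatedKeyLookupDemo {

    // Same shape as the User class above: hashCode/equals depend on a mutable field.
    static class User {
        String name;

        User(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof User && name.equals(((User) obj).name);
        }
    }

    // Inserts one record, mutates its key, then tries three ways to remove it.
    public static int sizeAfterFailedRemovals() {
        Map<User, String> map = new HashMap<>();
        User user = new User("Jack1");
        map.put(user, "Engineer");

        user.name = "Jack & Jill1"; // key mutated after insertion

        map.remove(new User("Jack1"));        // right bucket, but equals() is false
        map.remove(new User("Jack & Jill1")); // equals() would be true, but the hash differs
        map.remove(user);                     // same object, still the wrong hash
        return map.size();
    }

    public static void main(String[] args) {
        // The record is still in the map after all three attempts.
        System.out.println("Map size after removals: " + sizeAfterFailedRemovals());
    }
}
```

The record is still reachable by iterating over the map (for example through ‘entrySet()’), which is why the garbage collector cannot reclaim it even though no key-based lookup can find it.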

Tricky isn’t it? 😊

How to Diagnose a Memory Leak Created by a Mutable Key?

You want to follow the steps highlighted in this post to diagnose the OutOfMemoryError: Java Heap Space. In a nutshell, you need to do the following:

1. Capture Heap Dump: You need to capture a heap dump from the application right before the JVM throws OutOfMemoryError. In this post, 8 options to capture the heap dump are discussed. You might choose the option that fits your needs. My favorite option is to pass the ‘-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<FILE_PATH_LOCATION>’ JVM arguments to your application at the time of startup. Example:

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/tmp/heapdump.hprof

When you pass the above arguments, JVM will generate a heap dump and write it to ‘/opt/tmp/heapdump.hprof’ file whenever OutOfMemoryError is thrown.
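If you would rather trigger a dump on demand from inside the application than wait for the error, HotSpot-based JVMs expose a management bean for that purpose. The following is a minimal sketch, assuming a HotSpot JVM where ‘com.sun.management.HotSpotDiagnosticMXBean’ is available; the class name ‘HeapDumpSketch’ and the temporary file location are hypothetical and chosen only for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeapDumpSketch {

    // Writes a heap dump of the current JVM to the given path.
    // The target file must not already exist; recent JDKs also expect a ".hprof" suffix.
    public static void dumpHeap(Path target, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toString(), liveObjectsOnly);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical location: a fresh temporary directory.
        Path dir = Files.createTempDirectory("heapdump-sketch");
        Path target = dir.resolve("manual.hprof");
        dumpHeap(target, true); // true = dump only objects that are still reachable
        System.out.println("Heap dump written to " + target);
    }
}
```

For a leak like the one in this post, the JVM arguments above remain the simpler choice, because they capture the heap at the moment the error occurs without any code change.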

2. Analyze Heap Dump: Once a heap dump is captured, you need to analyze it. In the next section, we will discuss how to do heap dump analysis.

Heap Dump Analysis

Heap Dumps can be analyzed through various heap dump analysis tools such as HeapHero, JHat, JVisualVM… Here let’s analyze the heap dump captured from this program using the HeapHero tool.

Fig: HeapHero flags memory leak using ML algorithm

The HeapHero tool uses Machine Learning algorithms internally to detect whether any memory leak patterns are occurring in the heap dump. Above is the screenshot from the heap dump analysis report, flagging a warning that the ‘main’ thread’s local variables are occupying 99.92% of memory and that most of the objects are held by one instance of ‘HashMap’. It’s a strong indication that the application is suffering from a memory leak and that it originates from the ‘java.util.HashMap’ object.

Fig: Largest Objects section highlights ‘main’ Thread

The ‘Largest Objects’ section in the HeapHero analysis report shows the top memory-consuming objects (refer to the above screenshot). Here, you can clearly notice that the ‘main’ thread is occupying 99.92% of memory.

Fig: Outgoing Reference section of ‘main’ Thread

The tool also lets you drill down into an object to investigate its contents. When you drill down into the ‘main’ Thread object reported in the ‘Largest Objects’ section, you can see all its child objects. From the above figure, you can see it contains 3.38 million User records. Basically, these are the objects that got added and never removed from the HashMap. Thus, the tool helps you pinpoint the memory-leaking object and its origin, which makes troubleshooting a lot easier.

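How to Fix Mutable Key Memory Leaks?

You can declare the field that the key’s ‘hashCode()’ and ‘equals()’ depend on as ‘final’, so that it cannot be changed once it’s initialized. With this change, the assignment in line #33 no longer compiles. Example: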

03:    static class User {
04:   
05:        final String name;
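A complete version of the fixed key might look like the sketch below. It assumes the same ‘User’ shape as the sample program; the class name ‘ImmutableKeyDemo’, the helper method and the null check are additions made for illustration and are not part of the original program.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ImmutableKeyDemo {

    // Immutable key: the field used by hashCode/equals can never change.
    static final class User {
        private final String name;

        User(String name) {
            this.name = Objects.requireNonNull(name, "name");
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof User && name.equals(((User) obj).name);
        }
    }

    // Inserts one record and removes it through an equal, separately created key.
    public static int sizeAfterRemoval() {
        Map<User, String> map = new HashMap<>();
        map.put(new User("Jack1"), "Engineer");
        map.remove(new User("Jack1")); // same hash, equals() is true: record is removed
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("Map size after removal: " + sizeAfterRemoval());
    }
}
```

Because ‘String’ is itself immutable, making the field ‘final’ is enough here. A key that holds a mutable object (such as a ‘List’) in a ‘final’ field could still change its hash and would need a defensive copy or an immutable type instead.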

Conclusion

From this post, we can understand that a mutated key in a collection has the potential to bring down the entire application. Thus, by not mutating keys and by using tools like HeapHero for faster root cause analysis, you can protect your applications from hard-to-detect outages.
