Decoding Java Memory Analyzer Reports: A Step-by-Step Guide for Developers

Fun fact: in the 1960s, it was not uncommon for a programmer to spend half an hour figuring out how to save two bytes of memory.

Memory was expensive. By comparison, a programmer’s time was cheap.

These days, memory is cheap, and many developers cheerfully forget about memory considerations. Let the garbage collector sort it out, and if it doesn’t, it only costs a few dollars to add more RAM, right? This mentality works for a while – until it doesn’t.

Sooner or later, applications written with this mindset are likely to crash production, degrade performance or result in skyrocketing cloud computing costs.

Learning to effectively use a Java memory analyzer, and paying attention to what really goes on in memory, can save you from being the Grinch whose application stole Christmas.

The Java Memory Analyzer Toolbox

Although it’s simple enough to take a heap dump from a Java program, the dump is a large binary file, and it’s not practical to analyze it manually.

Luckily, with the right tools, memory analysis is simple. The first, and most indispensable, tool is a heap dump analyzer. The most comprehensive of these include HeapHero and Eclipse MAT. Oracle’s Java Mission Control can also analyze heap dumps, but mainly concentrates on memory wastage.

It’s also worth having a simple Java profiler that allows you to ‘look inside’ a running JVM. VisualVM is very popular, since it’s free, lightweight and simple to use. A garbage collection (GC) log analyzer, such as GCeasy, is very useful, since it allows us to analyze GC behavior over time. Lastly, the JDK provides a range of tools, such as jcmd, that help us gain insights into how well our application is running.

In this article, we’ll be using examples taken from HeapHero reports.

When is Memory Analysis Useful for Developers?

Often, memory analyzers are seen as a troubleshooting aid to detect the cause of otherwise inexplicable system crashes. In fact, tools such as HeapHero can be useful during all phases of the project lifecycle. Let’s look at some of the use cases for developers.

  • Planning, prototyping and feasibility tests: A memory analyzer takes the guesswork out of calculating memory requirements for a proposed solution.
  • Comparing coding strategies: Analyzing memory usage lets us pick the most efficient solution.
  • Reducing memory wastage: Since more and more applications are destined for small devices, and memory usage has a direct impact on cloud computing costs, memory-saving really does matter.
  • Testing: A memory analyzer can help us spot memory leaks before they hit production.
  • Performance labs: A Java memory analyzer lets us check that specifications are being met, accurately predict optimum configurations, and find memory-related performance issues.
  • Troubleshooting: When issues are referred back from production, a memory analyzer lets us solve memory problems quickly.

A Quick Look at JVM Internals

Before we look at how to analyze a heap dump, let’s do a quick recap of some of the concepts relating to Java memory management.

The JVM runtime memory is primarily divided into two areas: heap memory and native memory, as shown in the diagram below.

Fig: JVM Memory Model

Heap memory is shared by all the threads in the application, and it’s used for storing objects. It’s managed by the JVM, and cleaned regularly by the garbage collector. A heap dump is a snapshot of this portion of memory.

Native memory is managed by the operating system. The JVM uses it for structures such as stack spaces, class metadata and fast I/O buffers. For more information, see JVM Explained in 10 minutes.

The garbage collector (GC) runs inside the JVM and works to clear unused objects from memory. It first identifies GC roots, which are items known to be in use. These include variables belonging to active methods in the stack, and static variables. From these, it identifies their children: objects that they hold references to. It continues to work recursively through the graph of references, marking every object it can reach as being in use. It then clears all unmarked objects from memory.

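To be eligible for GC, an object must no longer be reachable from any GC root. This typically happens when the last variable referring to it goes out of scope (for example, when the method that defined it completes), or when that variable is explicitly set to null or reassigned.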
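The marking step described above can be modelled in a few lines. This is only a toy sketch: objects are represented by names and references by a map, and the object names are made up for illustration. A real collector works on the actual heap and is far more sophisticated.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the mark phase: objects are names, references are edges. */
final class MarkPhaseSketch {

    /** Returns every object reachable from the given roots by following references. */
    static Set<String> mark(Map<String, List<String>> references, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (marked.add(current)) {
                // First visit: queue everything this object refers to.
                pending.addAll(references.getOrDefault(current, List.of()));
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        // Hypothetical heap: 'orphan' refers to 'orphanChild', but no root reaches either.
        Map<String, List<String>> heap = Map.of(
                "staticCache", List.of("entry1", "entry2"),
                "localVar", List.of("entry2"),
                "orphan", List.of("orphanChild"));

        Set<String> live = mark(heap, List.of("staticCache", "localVar"));
        System.out.println("Live objects: " + live);
        System.out.println("orphan is collectable: " + !live.contains("orphan"));
    }
}
```

Anything that never ends up in the marked set is garbage, even if other garbage objects still refer to it.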

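Once an object is found to be unreachable, what happens next depends on whether its class overrides finalize(). If it does, the object is added to a finalizer queue, and the method is executed before the object is removed from memory. If it doesn’t, its memory can be reclaimed straight away. Note that finalize() has been deprecated since Java 9.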

Static variables remain in memory until the class they belong to is unloaded, which is something that seldom happens. If they reference other objects, which in turn reference more objects, this can result in a considerable amount of unnecessarily retained memory. The memory occupied by an object itself is known as its shallow heap size, whereas its retained heap size is the memory that would be freed if the object were garbage collected: the object itself, plus everything that’s reachable only through it.
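To make the difference between shallow and retained size concrete, here is a minimal sketch using a toy object graph. The object names and byte counts are invented, and the calculation simply compares what is reachable with and without the target object. Heap analyzers typically derive the same figure from a dominator tree instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy retained-size calculation over a hand-built object graph. */
final class RetainedSizeSketch {

    /** Objects reachable from the roots, never passing through 'excluded' (null excludes nothing). */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots, String excluded) {
        Set<String> seen = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (current.equals(excluded) || !seen.add(current)) {
                continue;
            }
            pending.addAll(refs.getOrDefault(current, List.of()));
        }
        return seen;
    }

    /** Retained size: the bytes that become unreachable once 'target' is taken out of the graph. */
    static long retainedSize(Map<String, List<String>> refs, Map<String, Long> shallow,
                             List<String> roots, String target) {
        Set<String> freed = reachable(refs, roots, null);
        freed.removeAll(reachable(refs, roots, target));
        long total = 0;
        for (String name : freed) {
            total += shallow.get(name);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical graph: 'itemShared' is also held by 'session', so 'listA' does not retain it.
        Map<String, List<String>> refs = Map.of(
                "cache", List.of("listA"),
                "listA", List.of("itemShared", "itemOwned"),
                "session", List.of("itemShared"));
        Map<String, Long> shallow = Map.of(
                "cache", 16L, "listA", 24L, "itemShared", 100L, "itemOwned", 200L, "session", 16L);
        List<String> roots = List.of("cache", "session");

        System.out.println("Shallow size of listA:  " + shallow.get("listA"));                          // 24
        System.out.println("Retained size of listA: " + retainedSize(refs, shallow, roots, "listA"));   // 224
    }
}
```

Here listA retains itself and itemOwned, but not itemShared, which stays alive through session.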

Java Heap Dumps

Heap dumps are a snapshot of the current contents of heap memory. There are several ways to take a heap dump, including the JDK’s jcmd and jmap tools and the -XX:+HeapDumpOnOutOfMemoryError JVM option. If you’re working on Android, there are different procedures for dumping the heap.
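A dump can also be triggered from inside the application. The sketch below assumes a HotSpot-based JVM, since it relies on the com.sun.management.HotSpotDiagnosticMXBean interface. The class name, helper method and output file name are illustrative choices, not anything prescribed by the tools in this article.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: trigger a heap dump programmatically on a HotSpot JVM. */
final class HeapDumpSketch {

    /** Writes a heap dump to 'target'; liveOnly = true dumps only reachable objects. */
    static Path dumpHeap(Path target, boolean liveOnly) throws IOException {
        // dumpHeap refuses to overwrite an existing file, so remove any stale dump first.
        Files.deleteIfExists(target);
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        diagnostics.dumpHeap(target.toAbsolutePath().toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Recent JDKs expect the file name to end in .hprof.
        Path dump = dumpHeap(Path.of("app-heap.hprof"), true);
        System.out.println("Wrote " + Files.size(dump) + " bytes to " + dump.toAbsolutePath());
    }
}
```

The resulting .hprof file can then be loaded into a heap dump analyzer. Bear in mind that dumping pauses the application, and the file can be roughly as large as the used heap.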

By examining a heap dump, we can answer questions such as:

  • What’s using all the space?
  • How much space does Object X use? And how much does it retain?
  • What is the total used heap size?
  • How much memory does Strategy A use compared to Strategy B?
  • Is there a memory leak? Where?
  • How can I save space?
  • Why is Object X not being garbage collected?
  • What are the contents of Object Y?

Exploring a Heap Dump Using a Java Memory Analyzer

We’ll take examples from HeapHero, which includes all the standard memory analysis features and a few extras.

The HeapHero report begins with a problem-detection report, which can often be a good indication of where to start looking for memory-related issues.

Fig: HeapHero Problem Detection Report

Next is an overview.

Fig: HeapHero Overview

This gives us useful statistics, invaluable for comparing different coding strategies, checking during testing to make sure a new version has not introduced memory problems, and providing a basis for specifying memory requirements.

The next section is possibly the most-used part of the report: the interactive largest object report.

Fig: Interactive Largest Object Report

This is a list of objects in memory, sorted from largest to smallest, with the option to change the sort order. If an application has memory problems, the culprit can almost always be found among the three or four largest objects. From this list, we can navigate through the dominator tree to the objects each entry retains, and view their contents. We can also navigate in the opposite direction, towards the GC roots, to see what is preventing this object from being garbage collected. You may like to watch this video to see how this procedure can be used to find a memory leak.

The class histogram is a list of classes used in the application, showing the total shallow and retained heap by class.

Fig: Class Histogram

It lets us see at a glance where most of the memory is being used, and comparing histograms over time gives important indications of where to look for memory leaks.
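Comparing histograms amounts to a per-class subtraction. As a rough sketch, assume each histogram has already been reduced to a map from class name to total bytes. The class names and figures below are made up, and this is not how HeapHero itself exposes the data.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sketch: find the classes whose footprint grew between two histogram snapshots. */
final class HistogramDiffSketch {

    /** Returns growth in bytes per class, keeping only the classes that grew. */
    static Map<String, Long> growth(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> grown = new TreeMap<>();
        for (Map.Entry<String, Long> entry : after.entrySet()) {
            // A class missing from the first snapshot counts as starting from zero.
            long delta = entry.getValue() - before.getOrDefault(entry.getKey(), 0L);
            if (delta > 0) {
                grown.put(entry.getKey(), delta);
            }
        }
        return grown;
    }

    public static void main(String[] args) {
        // Hypothetical totals (bytes per class) from two dumps taken some time apart.
        Map<String, Long> morning = Map.of("java.lang.String", 4_000L, "com.example.Order", 1_000L);
        Map<String, Long> evening = Map.of("java.lang.String", 4_100L, "com.example.Order", 9_000L);

        growth(morning, evening).forEach((className, bytes) ->
                System.out.println(className + " grew by " + bytes + " bytes"));
    }
}
```

A class that keeps growing across several snapshots under steady load is a good first suspect for a leak.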

HeapHero also has a limited ability to explore threads that are currently running, and their status. For thread-related issues, a thread dump analyzer such as fastThread provides more information.

Next is the duplicate class report, which can be an excellent debugging tool for classloader or dependency issues. Web servers may by design load more than one copy of a class to preserve independence between applications, but in general, a class should be loaded only once.

The GC roots report allows us to see how much memory is being retained under each of the GC roots, and to explore the dominator tree from each root to see where the memory is being used.

An unreachable objects report gives a good indication of whether GC is working efficiently to clear unused objects. If not, we can explore further using a GC log analyzer such as GCeasy.

HeapHero gives us the option for deeper analysis using OQL (Object Query Language), an SQL-like language.

Not all memory problems are caused by bugs. Most developers are surprised at the amount of memory that’s simply wasted by poor coding practices. HeapHero gives a full breakdown of how much memory is wasted by inefficient collections, duplicate strings, boxed numbers and other coding issues. As an example, the image below shows a breakdown of duplicate strings, and who is holding them.

Fig: Duplicate Strings Report
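Duplicate strings usually arise when the same value is read or built many times, with each copy becoming a separate object. One possible remedy, sketched below, is to route such values through a small canonicalizing map so that equal strings share one instance. The class and method names are illustrative, and String.intern() is a built-in alternative with different trade-offs.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: keep a single shared instance for each distinct string value. */
final class DuplicateStringSketch {

    private final Map<String, String> canonical = new HashMap<>();

    /** Returns the first instance seen for this value, so later duplicates can be discarded. */
    String dedupe(String value) {
        String existing = canonical.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with the same contents,
        // much like values parsed repeatedly from a file or database.
        String first = new String("EUR");
        String second = new String("EUR");
        System.out.println(first == second); // false: two separate objects on the heap

        DuplicateStringSketch pool = new DuplicateStringSketch();
        System.out.println(pool.dedupe(first) == pool.dedupe(second)); // true: one shared instance
    }
}
```

Note that the map itself holds on to every distinct value, so this only pays off when the same values recur often.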

A little-known cause of memory performance issues is objects that are slow to finalize. This can happen if an object’s finalize() method has to wait for resources, and it can prevent all objects behind it in the finalizer queue from being garbage collected.

It’s always worth checking HeapHero’s report of objects awaiting finalization to prevent this kind of memory leak.

Fig: Objects Waiting for Finalization
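One way to keep objects out of the finalizer queue altogether is to release resources explicitly rather than in finalize(). This is a general Java practice rather than something the report prescribes. The sketch below uses try-with-resources, and the Connection class is a hypothetical stand-in for any resource holder.

```java
/** Sketch: deterministic cleanup with AutoCloseable instead of finalize(). */
final class ExplicitCleanupSketch {

    /** Hypothetical resource holder; a real one would wrap a socket, file handle or similar. */
    static final class Connection implements AutoCloseable {
        private boolean open = true;

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            // Runs as soon as the try block exits, not whenever the GC gets round to it.
            open = false;
        }
    }

    /** Uses a connection and returns it, so callers can confirm it was closed on exit. */
    static Connection useConnection() {
        Connection used;
        try (Connection connection = new Connection()) {
            used = connection;
            // ... work with the connection here ...
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println("Still open after use? " + useConnection().isOpen()); // false
    }
}
```

Because the class has no finalize() method, its instances never wait behind a slow finalizer before their memory can be reclaimed.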

Conclusion

Mastering a heap dump analyzer such as HeapHero takes your developer skills to a new level.

The heap dump contains a wealth of information that can help you develop memory-efficient, bug-free code, and troubleshoot memory problems effectively.

With the right tool, heap dump analysis is not difficult. 
