Java Heap Generations: A Deep Dive into Eden, Survivor Spaces, and Object Promotion

Is it really worth understanding what goes on inside the JVM: garbage collection, memory management and all the other behind-the-scenes activities? Doesn’t it just happen automatically?

For smaller, non-critical systems, we can often get by with just letting the JVM get on with it, at least for a while. But if we want to write cutting-edge applications that use memory efficiently, and keep them running smoothly in production, we need to understand the application’s working environment. This allows us to code more efficiently, configure the JVM wisely, and troubleshoot faster when problems occur.

In this article, we’ll look at one of the most important techniques the JVM uses to manage memory: Generational Garbage Collection (GC). Take a deep breath and get ready for a JVM Young vs Old Generation deep dive.

The JVM Memory Model

Why do we need to understand the JVM memory model? 

Java developers and troubleshooters alike will be familiar with applications that crash with the unfriendly error message: java.lang.OutOfMemoryError. In fact, there are 9 different types of OutOfMemoryError. Diagnosing and fixing the root cause is much quicker if we know the ‘geography’ of Java memory.

We can visualize JVM memory like this:

Fig: Java Memory Regions

At the highest level, it’s divided into Heap Memory and Native Memory, often referred to as Heap and Non-heap. In this article, we’ll concentrate on heap memory, which is managed by the Garbage Collector. The heap is the memory space where all objects created by the application are stored. Non-heap is used for various purposes: storing class definitions, storing a stack for each thread, fast input-output buffers etc. It’s managed by the operating system. For more information, it’s worth watching JVM Explained in 10 minutes.
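As a quick way to see this split on a running JVM, the standard java.lang.management API reports heap and non-heap usage separately. The sketch below is a minimal illustration; the class name MemoryRegions and the helper toMiB are made up for this example. Note that the non-heap figure only covers areas the JVM itself tracks (such as Metaspace and the code cache), so it will not account for everything the process uses natively.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class MemoryRegions {
    // Formats a byte count as whole mebibytes (hypothetical helper for this sketch).
    static String toMiB(long bytes) {
        return (bytes / (1024 * 1024)) + " MiB";
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        System.out.println("Heap used:     " + toMiB(heap.getUsed()));
        // getMax() returns -1 when the maximum is undefined.
        System.out.println("Heap max:      "
                + (heap.getMax() < 0 ? "undefined" : toMiB(heap.getMax())));
        System.out.println("Non-heap used: " + toMiB(nonHeap.getUsed()));
    }
}
```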

Young vs Old Generation: The Theory Behind Generational GC

In computer science, the weak generational hypothesis postulates that most objects die young. In most applications, though not all, this is true. Think about an online sales system. 

At the beginning, it creates several objects that need to remain in memory as long as the application is alive. These might include configurations, company details, database connections, tax rates and more. These objects won’t ‘die’, but there are relatively few of them.

When a customer logs in, objects related to the customer will be created, but they won’t be needed once the customer logs out. Other objects may only be needed while the system is processing a single line item.

Generational GC was developed to exploit this observation and make memory management more efficient. It makes sense to keep newly-created objects in a small area that’s cleaned frequently. Long-lived objects are promoted to a different area, which is cleaned less frequently, since these objects probably won’t die soon. Most garbage collectors, therefore, divide the heap into the Young Generation (YG) and Old (or Tenured) Generation (OG), as shown in the Java Memory Regions diagram above.
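To make the two lifetimes concrete, here is a small sketch of the online sales example. Everything in it is hypothetical: the class SalesLifetimes, the record LineItem, the TAX_RATES map and the rates themselves are invented purely for illustration, and records assume Java 16 or later.

```java
import java.util.List;
import java.util.Map;

class SalesLifetimes {
    // Long-lived: created once at class initialization and reachable
    // from a static field for as long as the class stays loaded.
    static final Map<String, Double> TAX_RATES =
            Map.of("standard", 0.20, "reduced", 0.05);

    // Short-lived: line items are typically needed only while one order is processed.
    record LineItem(String taxBand, double netPrice) {}

    // Computes the gross total of an order using the long-lived tax rates.
    static double orderTotal(List<LineItem> items) {
        double total = 0.0;
        for (LineItem item : items) {
            double rate = TAX_RATES.get(item.taxBand());
            total += item.netPrice() * (1.0 + rate);
        }
        return total;
    }

    public static void main(String[] args) {
        // The basket and its items become unreachable once main returns,
        // so they would normally die in the Young Generation.
        List<LineItem> basket = List.of(
                new LineItem("standard", 100.0),
                new LineItem("reduced", 40.0));
        System.out.println("Order total: " + orderTotal(basket));
    }
}
```

In a real system, objects like TAX_RATES would eventually be promoted to the Old Generation, while most LineItem instances would never leave the Young Generation.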

Understanding the Heap: JVM Young vs Old Generation Deep Dive

Actually, in modern GC algorithms, the heap is broken down even further:

Fig: Memory Areas Making up the Heap in Generational GC

The first phase of GC consists of identifying and marking objects that are still in use. It begins by working through parent/child relationships starting from GC roots. GC roots are items known to be still in use, primarily stack frames and static variables. Any unmarked objects are then eligible for removal. The actual removal happens in a later phase.
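The marking idea can be sketched as a simple graph traversal. This is a toy model, not how any real collector is implemented: the Node class and the mark method are hypothetical, and a plain list of roots stands in for stack frames and static variables.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MarkPhaseSketch {
    // A toy object that can hold references to other toy objects.
    static final class Node {
        final String name;
        final List<Node> references = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Returns every node reachable from the given roots.
    // Anything not in the returned set would be eligible for removal.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            // add() returns false if the node was already marked,
            // which also protects against cycles in the graph.
            if (marked.add(node)) {
                pending.addAll(node.references);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node child = new Node("child");
        Node orphan = new Node("orphan");
        root.references.add(child);
        orphan.references.add(child); // pointing at a live object does not keep the orphan alive

        Set<Node> live = mark(List.of(root));
        System.out.println("orphan marked? " + live.contains(orphan));
    }
}
```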

Eden space is used for creating new objects. S0 and S1, known as survivor spaces, are used alternately for holding objects that have survived at least one cycle of GC. We’ll refer to them as Current and Previous Survivor Spaces. The Old Generation is used for storing objects that have survived long enough to be promoted out of the Young Generation.
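One way to see these areas on a live JVM is to list the heap memory pools exposed by the java.lang.management API. This is a minimal sketch; the class name HeapPools is invented for the example. The pool names are not standardized and depend on the collector in use (with G1 on HotSpot, for instance, they include "G1 Eden Space", "G1 Survivor Space" and "G1 Old Gen"), and some collectors do not expose separate Eden and survivor pools at all.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class HeapPools {
    // Returns the names of the memory pools that the running JVM reports as heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : heapPoolNames()) {
            System.out.println(name);
        }
    }
}
```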

In generational GC, the workflow is as follows:

  • All new objects are created in Eden.
  • When Eden fills up to a configured percentage, a minor GC event is triggered, which:
    • Moves all objects with live references from Eden to the current survivor space, and gives them an age of 0;
    • Clears Eden space completely;
    • Moves all objects with live references from the previous survivor space to the current survivor space, increasing their age by 1;
    • Clears the previous survivor space;
    • Moves all objects from the current survivor space whose age is above a configured threshold to the Old Generation;
    • Switches the current survivor space to be the previous, and vice versa.
  • When the Old Generation fills up to a configured percentage, a major GC event is triggered, which:
    • Removes all objects with no live references from the Old Generation;
    • Compacts the Old Generation.

This works best if the YG is small in relation to the OG, so that the more frequent minor GC events are very fast.
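The minor GC steps listed above can be modeled in a few lines of Java. This is a deliberately simplified toy, not a description of how HotSpot is implemented: the class MinorGcSketch, the live flag (which stands in for reachability from GC roots) and the threshold value of 1 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class MinorGcSketch {
    static final class Obj {
        final String name;
        boolean live = true; // stands in for "still reachable from a GC root"
        int age = 0;
        Obj(String name) { this.name = name; }
    }

    static final int TENURING_THRESHOLD = 1; // hypothetical value for this sketch

    final List<Obj> eden = new ArrayList<>();
    List<Obj> previousSurvivor = new ArrayList<>();
    List<Obj> currentSurvivor = new ArrayList<>();
    final List<Obj> oldGen = new ArrayList<>();

    // All new objects are created in Eden.
    Obj allocate(String name) {
        Obj obj = new Obj(name);
        eden.add(obj);
        return obj;
    }

    void minorGc() {
        // Move live objects from Eden to the current survivor space with age 0, then clear Eden.
        for (Obj obj : eden) {
            if (obj.live) { obj.age = 0; currentSurvivor.add(obj); }
        }
        eden.clear();

        // Move live objects from the previous survivor space, increasing their age by 1.
        for (Obj obj : previousSurvivor) {
            if (obj.live) { obj.age++; currentSurvivor.add(obj); }
        }
        previousSurvivor.clear();

        // Promote objects whose age is above the threshold to the Old Generation.
        Iterator<Obj> it = currentSurvivor.iterator();
        while (it.hasNext()) {
            Obj obj = it.next();
            if (obj.age > TENURING_THRESHOLD) { oldGen.add(obj); it.remove(); }
        }

        // Swap the roles of the two survivor spaces.
        List<Obj> swap = previousSurvivor;
        previousSurvivor = currentSurvivor;
        currentSurvivor = swap;
    }

    public static void main(String[] args) {
        MinorGcSketch heap = new MinorGcSketch();
        Obj config = heap.allocate("config");
        heap.allocate("lineItem").live = false; // dies before the first collection

        for (int cycle = 1; cycle <= 3; cycle++) {
            heap.minorGc();
            System.out.println("After GC " + cycle + ": promoted = " + heap.oldGen.contains(config));
        }
    }
}
```

In this model, an object that stays live reaches age 2 on its third minor GC and is promoted at that point, while the dead object is discarded when Eden is cleared.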

Monitoring To Prevent Heap Space Issues Affecting Production

All of these activities should happen behind the scenes, with very little impact on the efficiency of the application. Until one day, they don’t.

If the GC is finding it hard to clear enough memory to honor new requests, it works harder and harder, until GC events eventually run back-to-back. From the users’ point of view, they’ll see intermittently poor response times, long pauses and dropped sessions. The system may even crash with an OutOfMemoryError. All of this makes the IT department highly unpopular.

Luckily, we don’t have to wait until the system slows down to diagnose impending problems. The early warning signs are all there if we know where to look. To detect GC problems before our users start complaining, we should:

  • Make sure GC logs are always enabled in production;
  • Monitor them regularly using a GC log analyzer, such as GCeasy or GCViewer. Look for warning signs as discussed in this video: How to Read GC Logs.
  • Early warning signs include:
    • Throughput drops (Throughput is the percentage of time an application spends doing actual work, as opposed to time spent doing GC).
    • Latency increases (Latency is the length of time for which all application threads are paused during critical GC operations).
    • Unhealthy memory usage patterns are evident.
  • If any of these issues are developing, proactively investigate the heap to determine the cause. Heap dump analyzers such as HeapHero and Eclipse MAT take the pain out of this task.
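For a rough in-process view of throughput as defined above, the standard GarbageCollectorMXBean API exposes cumulative collection counts and times. The sketch below is only an approximation under stated assumptions: the class GcThroughput and the method throughputPercent are invented for this example, and the reported collection time is accumulated elapsed time per collector, which for concurrent collectors is not the same as pause time. It is not a substitute for analyzing GC logs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcThroughput {
    // Throughput = percentage of elapsed time NOT spent in GC.
    static double throughputPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 100.0; // nothing measured yet
        }
        return 100.0 * (uptimeMillis - gcMillis) / uptimeMillis;
    }

    public static void main(String[] args) {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 if the collector does not report it
            if (time > 0) {
                gcMillis += time;
            }
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + time + " ms");
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("Approximate throughput: %.2f%%%n",
                throughputPercent(gcMillis, uptime));
    }
}
```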

Configuring Generational GC: Young Generation, Old Generation, Thresholds and Algorithms

If GC is not performing as it should, there are several JVM arguments that we can tweak to get better performance. A word of warning: before we start tweaking, we should:

  • Make sure the problem is not caused by a memory leak, wasted memory or a shortage of actual RAM in the device or container. If it is, no amount of configuration changes will help.
  • Try the changes on the test system first if possible, and run load tests comparing results with different settings.
  • Save the original settings so we can always go back to them.
  • Start with no configuration settings at all. This gives us a baseline for benchmarking, and also eliminates the possibility that the original settings were actually making the problem worse.
  • Don’t change everything all at once: make small tweaks and test the results.
  • Monitor the new settings using GC logs before we commit to using them. 

Here are some command line arguments we can use to tune generational GC performance:

Configuration Argument              Purpose
-Xms<size>                          Initial heap size
-Xmx<size>                          Maximum heap size
-Xmn<size>                          Total size of the young generation
-XX:NewRatio=<N>                    Ratio of old generation to young generation
-XX:SurvivorRatio=<N>               Ratio of Eden to a single survivor space
-XX:MaxTenuringThreshold=<N>        Maximum number of GC cycles an object survives before promotion to the old generation
-XX:MaxGCPauseMillis=<N>            Target maximum pause time (Parallel and G1GC only)
-XX:GCTimeRatio=<N>                 Target ratio of GC time to application time (Parallel and G1GC only)
-XX:+ZGenerational                  Use generational GC in ZGC
-XX:ShenandoahGCMode=generational   Use generational GC in Shenandoah
-XX:MaxRAMPercentage=<N>            Set the maximum heap size as a percentage of available memory
-XX:InitialRAMPercentage=<N>        Set the initial heap size as a percentage of available memory
-XX:+Use<GC name>                   Select the GC algorithm
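To see how the two ratio settings interact, the arithmetic can be worked through in a short sketch. It assumes the classic HotSpot layout with a young generation made up of Eden plus two equally sized survivor spaces, so that young = heap / (NewRatio + 1) and each survivor space = young / (SurvivorRatio + 2). The class GenerationSizing and its methods are invented for this example, the input values are arbitrary, and a real JVM aligns and may adaptively resize these areas (G1 in particular sizes its generations dynamically), so the numbers are indicative only.

```java
class GenerationSizing {
    // Young generation size when old:young = newRatio:1.
    static long youngMiB(long heapMiB, int newRatio) {
        return heapMiB / (newRatio + 1);
    }

    // Size of ONE survivor space when eden:survivor = survivorRatio:1
    // and the young generation holds Eden plus two survivor spaces.
    static long survivorMiB(long youngMiB, int survivorRatio) {
        return youngMiB / (survivorRatio + 2);
    }

    // Eden is whatever remains of the young generation after both survivor spaces.
    static long edenMiB(long youngMiB, int survivorRatio) {
        return youngMiB - 2 * survivorMiB(youngMiB, survivorRatio);
    }

    public static void main(String[] args) {
        long heap = 3000;      // as if started with -Xmx3000m
        int newRatio = 2;      // -XX:NewRatio=2
        int survivorRatio = 8; // -XX:SurvivorRatio=8

        long young = youngMiB(heap, newRatio);
        System.out.println("Old generation:   " + (heap - young) + " MiB");
        System.out.println("Young generation: " + young + " MiB");
        System.out.println("Eden:             " + edenMiB(young, survivorRatio) + " MiB");
        System.out.println("Each survivor:    " + survivorMiB(young, survivorRatio) + " MiB");
    }
}
```

With these inputs the sketch gives a 2000 MiB old generation and a 1000 MiB young generation, split into 800 MiB of Eden and two 100 MiB survivor spaces.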

Conclusion

This article has taken a deep dive into the JVM’s Young and Old Generations and the inner workings of generational garbage collection.

We’ve looked at how it works, and how to monitor and configure it to keep meeting performance targets.

Understanding this concept helps us to plan, code, monitor and troubleshoot effectively.
