Exploring Java Heap Dumps

I have now given a talk on analyzing heaps programmatically at both Oracle Code One and Devoxx Belgium. I will also be giving a new variant at Java2Days in a couple of weeks. The slides can be found here. In this blog entry, I summarize the presentation and provide starter code (on GitHub).

Analyzing Java heap dumps has traditionally been hard. While there are many commercial and open source profilers on the market, they are generic. They don’t know your application and can’t easily answer specific questions related to your data model. Real heap dumps from production applications can be huge (gigabytes), and locating data model problems in them is usually impossible. Navigating through thousands of classes and their references in a profiler GUI isn’t efficient or reproducible, and it isn’t feasible for anything but the simplest problems.

The solution to the challenges of analyzing heap dumps is the NetBeans Profiler API. The Profiler API is a sub-component of the full profiler, minus the UI. It enables you to write code that analyzes your heap programmatically. Doing so lets you automate the process and ask questions that are impossible to answer by digging through heaps in a GUI profiler tool. Finding bugs by hand in production heap dumps with millions of objects simply isn’t possible.

The NetBeans Profiler API is one of the golden nuggets buried in the IDE. It provides a simple API that lets you easily read heap dumps and analyze them yourself. It doesn’t depend upon anything else within the IDE and can be used standalone. No knowledge of NetBeans or heap dumps is required. With it you can iterate over millions of objects and find duplicate object graphs. You can reconstruct parts of the data model and check whether the data model created in memory matches the design.

What types of problems can you troubleshoot using this API that you cannot with a GUI profiler tool?

  • Corrupted object graphs
  • Buggy clone methods
  • Duplicate data
  • Duplicate singletons
  • Resource leaks
  • Renegade code
  • Heap size disconnect

Corrupted Object Graphs

What your documentation and whiteboard diagrams say your data model should be may not be borne out by what actually gets created. The larger the data model, the more chances there are for it to get “corrupted.” By corrupted I don’t mean an instantiation that will crash, but one that may not be what you intended or what the logic was expecting to operate on. For example, an instance of an object may be stored as a member variable in one object and as an entry in a list in another object. Through a coding bug, two instances are created instead, depending on the code path, leading to unexpected behavior.

Buggy Clone Method

Clone methods can introduce all sorts of problems if not implemented correctly. Perhaps your design called for deep cloning, but one of the objects only does a shallow clone. Some objects then end up being shared between two instances, and the resulting behavior becomes baffling when threading is thrown into the mix.
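
One way to check for this in a heap dump is to collect the IDs of every object reachable from the original and from the clone, then look for overlap. The following is a minimal sketch of that comparison only. It assumes the two ID sets have already been gathered from the heap, and the class and method names are made up for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of the NetBeans Profiler API.
final class CloneChecker {
    /**
     * Returns the instance IDs that appear in both object graphs.
     * Each input set is assumed to hold the IDs reachable from one root.
     * The inputs are not modified.
     */
    static Set<Long> sharedIds(Set<Long> originalGraph, Set<Long> cloneGraph) {
        Set<Long> shared = new TreeSet<>(originalGraph);
        shared.retainAll(cloneGraph); // keep only IDs present in both graphs
        return shared;
    }
}
```

An empty result suggests the clone was deep. A non-empty result is not automatically a bug, since immutable objects are often shared on purpose, but it gives you a short list of instances to inspect.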

Duplicate Data

Your application is supposed to have only one copy of some types of data in memory. Due to buggy logic or JPA caching, the data model is actually duplicated. A user edits data on one screen, goes back to a summary screen, and the original data is still shown, or it updates only after a refresh.
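
Once you can read a business key (a primary key, a name, and so on) from each instance, finding duplicates becomes a grouping problem. Here is a minimal sketch, assuming you have already built a map from instance ID to business key while walking the heap. The names `DuplicateFinder` and `duplicates` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the NetBeans Profiler API.
final class DuplicateFinder {
    /**
     * Groups instance IDs by business key and returns only the keys
     * that are held by more than one instance.
     */
    static Map<String, List<Long>> duplicates(Map<Long, String> keyByInstanceId) {
        Map<String, List<Long>> idsByKey = new TreeMap<>();
        // Iterate in instance ID order so the report is stable between runs.
        for (Map.Entry<Long, String> e : new TreeMap<>(keyByInstanceId).entrySet()) {
            if (e.getValue() == null) {
                continue; // no key could be read for this instance
            }
            idsByKey.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        // Drop every key that only one instance claims.
        idsByKey.values().removeIf(ids -> ids.size() < 2);
        return idsByKey;
    }
}
```

Each entry in the result is a piece of data that exists more than once in memory, together with the instance IDs you can look up in the heap.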

Duplicate Singletons

For a Java EE application, you annotate your POJO with @Singleton, expecting it to be instantiated just once. However, another developer accidentally instantiates it (not understanding the annotation) and voilà, your singleton isn’t a singleton. Suddenly there is data corruption that just doesn’t make sense.

Resource Leaks

For unexplained reasons, your application sometimes drains a connection pool, resulting in errors that only become evident once code throws exceptions after the pool’s timeouts are exceeded. It turns out that a bean isn’t correctly closing its resources when an exception is thrown. But because a resource adapter holds a reference back to the bean, the bean doesn’t get garbage collected and the connection is never released back to the pool.

Renegade Code

Consider an application that is deployed to and undeployed from an application server. While the app server may have “removed” the application from its list and no longer be servicing it, the application may still be partially “alive” because it started a thread that the container doesn’t know about or control. The code is still running in the JVM doing something. Using this API, you could develop a tool that identifies application classes running inside the JVM that aren’t connected to a deployed application.

Heap Size Disconnect

Your application requires gigabytes of memory to run. While the application isn’t leaking memory, something doesn’t smell right. Sure, it is a large application and the data model is complicated, but the minimum heap size is a multiple of the actual data the user is generating. Other than gross object counts, how is the heap split between different features, and how do you quantify and analyze this?
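
One way to start answering that question is to roll per-class sizes up into per-feature totals by package. The sketch below assumes you have already produced a map of fully qualified class name to total bytes from the heap. The package names and the `HeapBreakdown` class are placeholders, not anything from a real application.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the NetBeans Profiler API.
final class HeapBreakdown {
    /**
     * Sums bytes per feature package. A class is counted under the first
     * package in featurePackages that contains it, otherwise under "other".
     */
    static Map<String, Long> bytesByFeature(Map<String, Long> bytesByClass,
                                            List<String> featurePackages) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> e : bytesByClass.entrySet()) {
            String bucket = "other";
            for (String pkg : featurePackages) {
                // The trailing dot avoids matching com.example.bill against com.example.billing.
                if (e.getKey().startsWith(pkg + ".")) {
                    bucket = pkg;
                    break;
                }
            }
            totals.merge(bucket, e.getValue(), Long::sum);
        }
        return totals;
    }
}
```

Note that shallow sizes grouped this way leave out shared JDK objects such as strings and collections, which land in "other", so treat the numbers as a first approximation.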

Developers are probably struggling with many of these bugs every day but don’t realize it. Some might blame bugs in the JVM or application container. Maybe a restart fixes the problem. These types of bugs are much harder to find, may not result in a fatal application outage, and may not be easy for an end user to spot. It is much easier to increase -Xmx than it is to figure out why an application uses so much memory.

NetBeans Profiler API

The NetBeans profiler API is found under the following path in the source code repository:

netbeans/profiler/lib.profiler/src/org/netbeans/lib/profiler/heap

You can check out the code from git:

https://github.com/apache/incubator-netbeans.git

There are three approaches to getting this API:

  1. Copy the code out of the heap directory (it is completely self-contained)
  2. Create a NetBeans module application and depend upon the profiler module
  3. Pull from Maven**

** When preparing my talk I didn’t realize NetBeans had its own Nexus repository with the artifacts published to it.

A couple of things to note:

  1. The Profiler API isn’t dependent on anything else in NetBeans
  2. The code is currently pre-generics, so it will run on older runtimes such as Java 5
  3. The code is compatible with Java 9, 10, and 11

In a Maven project, to include the code you will need the following:

[code lang='xml']

<repositories>
  <repository>
    <id>netbeans</id>
    <name>netbeans</name>
    <url>http://bits.netbeans.org/maven2</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <!-- This will need to be updated now that NetBeans is part of Apache; works as of 11/12/2018 -->
    <groupId>org.netbeans.modules</groupId>
    <artifactId>org-netbeans-lib-profiler</artifactId>
    <version>RELEASE802</version>
  </dependency>
</dependencies>

[/code]

Note, the Maven repository above will most likely change soon. Now that NetBeans is part of Apache, NetBeans artifacts will be moving to Maven Central.

Once you have the dependency, opening a heap is trivial:

[code lang='java']
Heap heap = HeapFactory.createHeap(new File("my-app-heap.hprof"));
[/code]

Now you can iterate over the contents of the heap. You can tackle your problem by either starting from GC roots or going after specific classes. The approach you will take depends upon your data model and the problem you are trying to troubleshoot.

The Heap interface includes the following methods:

  • getJavaClassByName(String fqn) : JavaClass
  • getAllClasses() : List
  • getBiggestObjectsByRetainedSize(int number) : List
  • getGCRoots() : Collection
  • getInstanceByID(long instanceId) : Instance
  • getJavaClassByID(long javaclassId) : JavaClass
  • getJavaClassesByRegExp(String regexp) : Collection
  • getSummary() : HeapSummary
  • getSystemProperties() : Properties

With this API, you need to understand two important objects:

  • JavaClass – this is analogous to java.lang.Class. For each type in a heap dump there will be one JavaClass instance. From the JavaClass you can get the list of instances.
  • Instance – this represents an instance of a JavaClass. You can use this object to drill into member variables etc. With an instance, you can find out who refers to that instance.

Each instance has an associated ID that is unique within a heap dump. When recursing through the object graph, you can use this ID to make sure you don’t get stuck in a loop.
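
The idea can be sketched as follows. To keep the example runnable without the NetBeans jar, `HeapNode` and `SimpleNode` are hypothetical stand-ins for a heap object and its outgoing references. They are not Profiler API types, and with the real API you would have to collect each instance's references yourself. The part that matters is the visited set keyed by instance ID.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a heap object: an ID plus outgoing references.
interface HeapNode {
    long getInstanceId();
    List<HeapNode> references();
}

// Trivial implementation used only to try the sketch out.
final class SimpleNode implements HeapNode {
    final long id;
    final List<HeapNode> refs = new ArrayList<>();
    SimpleNode(long id) { this.id = id; }
    public long getInstanceId() { return id; }
    public List<HeapNode> references() { return refs; }
}

final class GraphWalker {
    /** Returns the IDs of every object reachable from start, visiting each once. */
    static Set<Long> reachableIds(HeapNode start) {
        Set<Long> visited = new HashSet<>();
        Deque<HeapNode> pending = new ArrayDeque<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            HeapNode node = pending.pop();
            // add() returns false if the ID was already seen, which breaks cycles.
            if (!visited.add(node.getInstanceId())) {
                continue;
            }
            for (HeapNode next : node.references()) {
                if (next != null) {
                    pending.push(next);
                }
            }
        }
        return visited;
    }
}
```

The sketch uses an explicit stack rather than recursion, since object graphs in real heaps can be deep enough to overflow the call stack.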

Since the current code is pre-generics, you will have to do instanceof and casting.

Simple example to get data for a specific class:
[code lang='java']
JavaClass officeClass = heap.getJavaClassByName("net.cuprak.sample.office.Office");
// getInstances() returns a raw List, so each element has to be cast.
List instances = officeClass.getInstances();
for (Object obj : instances) {
    Instance instance = (Instance) obj;
    System.out.println(instance.getValueOfField("name"));
}
[/code]

Note: You don’t need the classpath of the application that generated the heap dump in order to analyze it.

In the code sample above, I printed out the name field. However, that won’t work as expected. Although the “name” field is a String in my application, in the heap dump that field points to a String object rather than to readable text. You then have to understand the data structure that makes up a String. If you look at the JDK source code of the String class, you’ll see that the characters are stored in an array called value. The code to convert a String object into something that you can read is as follows:

[code lang='java']
static String processString(Instance instance) {
    // Read the field once; for a String this is the backing array.
    Object value = instance.getValueOfField("value");
    if (value instanceof PrimitiveArrayInstance) {
        List entries = ((PrimitiveArrayInstance) value).getValues();
        StringBuilder builder = new StringBuilder();
        for (Object obj : entries) {
            if (obj instanceof Character) {
                builder.append(((Character) obj).charValue());
            } else if (obj instanceof Integer) {
                // A numeric character code: unbox it instead of casting to String.
                builder.append((char) ((Integer) obj).intValue());
            }
        }
        return builder.toString();
    }
    // Not a primitive array: fall back to the value's string form.
    return value != null ? value.toString() : "null";
}
[/code]
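
Keep in mind that the layout of String depends on the JDK that produced the dump. From JDK 9 onward, String stores its characters in a byte[] called value alongside a byte field called coder (0 for Latin-1, 1 for UTF-16). The following is a minimal sketch of decoding that layout. It assumes you have already copied the array contents out of the heap into a byte[], and the byte order for UTF-16 is passed in as a parameter because it follows the platform that wrote the dump. The class name is made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the NetBeans Profiler API.
final class CompactStringDecoder {
    static final byte LATIN1 = 0; // coder value for one byte per character
    static final byte UTF16 = 1;  // coder value for two bytes per character

    /**
     * Decodes the bytes of a JDK 9+ String.
     * littleEndian is an assumption about the machine that wrote the dump
     * (true for x86, for example).
     */
    static String decode(byte[] value, byte coder, boolean littleEndian) {
        Charset charset;
        if (coder == LATIN1) {
            charset = StandardCharsets.ISO_8859_1;
        } else {
            charset = littleEndian ? StandardCharsets.UTF_16LE : StandardCharsets.UTF_16BE;
        }
        return new String(value, charset);
    }
}
```

For example, `CompactStringDecoder.decode(new byte[] {72, 105}, (byte) 0, true)` yields "Hi".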

If you have a member variable that is an ArrayList, you’ll have to write a utility method that extracts the list items from its internals.
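
An ArrayList keeps its items in an Object[] field called elementData and tracks how many slots are in use in an int field called size. A utility method therefore needs to read both and ignore the unused tail of the array. Below is a minimal sketch of that logic. `InstanceView` and `ObjectArrayView` are hypothetical stand-ins that mirror only `getValueOfField` on Instance and the value list of an object array, so the example compiles without the NetBeans jar. `MapInstance` and `FakeArray` exist purely to exercise it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the heap types; only the methods used here are modeled.
interface InstanceView {
    Object getValueOfField(String name);
}

interface ObjectArrayView extends InstanceView {
    List<InstanceView> getValues(); // one entry per array slot, null if empty
}

final class ArrayListReader {
    /** Returns the live elements of a dumped ArrayList, in order. */
    static List<InstanceView> elements(InstanceView arrayList) {
        Object data = arrayList.getValueOfField("elementData");
        Object sizeValue = arrayList.getValueOfField("size");
        List<InstanceView> result = new ArrayList<>();
        if (!(data instanceof ObjectArrayView) || !(sizeValue instanceof Integer)) {
            return result; // not the layout we expect, so report nothing
        }
        List<InstanceView> slots = ((ObjectArrayView) data).getValues();
        int size = ((Integer) sizeValue).intValue();
        // The backing array is usually larger than size; skip the spare capacity.
        int count = Math.min(size, slots.size());
        for (int i = 0; i < count; i++) {
            result.add(slots.get(i));
        }
        return result;
    }
}

// Test doubles used only to try the sketch out.
class MapInstance implements InstanceView {
    final Map<String, Object> fields = new HashMap<>();
    public Object getValueOfField(String name) { return fields.get(name); }
}

class FakeArray extends MapInstance implements ObjectArrayView {
    private final List<InstanceView> values;
    FakeArray(List<InstanceView> values) { this.values = values; }
    public List<InstanceView> getValues() { return values; }
}
```

Other collections need the same treatment with their own internals, so it is worth checking the JDK source for the version that produced the dump.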

Summary

As you can see, it is trivial to parse Java heap dumps and extract data. With this knowledge, you can now mine your heap dumps for bugs!
I have put together a demo application to get you up and running:
https://github.com/rcuprak/heapdemo
More details on this API can be found in the recorded presentation from either conference.
