
Finding Java memory leaks in WebSphere Application Server, Part 1: Using Hprof and IBM heapdumps

Summary

This paper explains how to diagnose and find the source of Java™ memory leaks on IBM® WebSphere® Application Servers 3.5, 4.01 and 5.0 in the distributed and z/OS environments. It presents a general methodology, explains how to gather diagnostic data, describes tools for analyzing the data and gives examples of use.

Table of contents

• Summary
• Introduction
  • Who should read this paper
  • How to read this paper
• Java memory leaks
• Methodology
  • Identify the symptoms
  • Turn on verbose garbage collection
  • Turn on profiling
  • Analyze the trace
• Sun and IBM heapdump differences
• Profiling the heap with Hprof
  • Enabling Hprof output
    • WebSphere Application Server 3.5 and 4.01 simple configuration for z/OS
    • WebSphere Application Server 4.01 J2EE Server for z/OS
    • WebSphere Application Server 3.5 distributed (version 3.5.6 and beyond)
    • WebSphere Application Server 4.x distributed
    • WebSphere Application Server 5.0 distributed
  • Default behavior
  • Performance considerations
  • Creating and reading heapdumps using Hprof
  • Hprof advanced usage
    • Understanding the JVMPI
    • Hprof options
  • Understanding Hprof output
  • Using the HAT tool to interpret heapdumps
• Profiling the heap with IBM heapdumps
  • Is it IBM_HEAPDUMP or IBM_HEAP_DUMP?
  • Enabling IBM heapdumps on WebSphere Application Server 5.0 distributed
  • Enabling IBM heapdumps on WebSphere Application Server 4.x distributed
  • Invoking IBM heapdumps
    • Invoking with a signal
      • On distributed Unix, AIX and Linux
      • On Windows
    • Invoking automatically on an OutOfMemoryError
    • Invoking through special code
  • Reading IBM_heapdump output
  • Interpreting IBM_heapdump output using the HeapRoots tool
    • Memory and performance problems with HeapRoots
    • Using HeapRoots in interactive mode
    • An example of using HeapRoots to diagnose a memory leak
    • HeapRoots Summary
  • Analyzing IBM dumps interactively with HeapWizard
    • Using HeapWizard
    • An example of heap analysis
    • HeapWizard Summary
• Conclusion
• About the authors

Introduction

This paper is the first in a two-part series exploring how to diagnose Java memory leaks on various IBM platforms. Due to the wide diversity of tools and techniques available, the authors have been forced to make some drastic choices about which ones are presented in these papers. Tools and techniques that are not presented within the scope of the paper are by no means invalid.

    Who should read this paper

    The intended readers are the developers and troubleshooters of Java applications or J2EE application servers.

    How to read this paper

    The paper discusses the various diagnostic tools and the types of files on which they operate. It offers examples of tool usage and output. If a new version of the tool is available by the time you read this, you might find that the output or the usage is slightly different.

    The authors have tried to make it easy to jump right to the part of the paper that might specifically interest a reader. This means that some information is repeated.

    Sometimes, a problem has been identified that is still not resolved at publication time. In this case, the problem is mentioned in a note.

    The paper discusses each platform separately when needed. For the purpose of this document, distributed means a platform other than z/OS.

    Memory problem resolution is a technology still in its infancy. The approach taken and the tools available vary considerably depending on the Java Virtual Machine (JVM) origin (Sun or IBM), the operating system platform and the problem type (performance, constraint, or leak). This document walks you through many of the resources available today. Be sure to contact your IBM Support representative and check WebSphere newsgroups periodically for updates to existing tools and to learn about emerging problem resolution technology.

A great reference for debugging aids and techniques for IBM JDKs can be found at http://www.ibm.com/developerworks/java/jdk/diagnosis/.

Java memory leaks

Most people think of memory problems in terms of memory leaks, and their objective is to locate the leaking object. However, there are actually four different categories of memory problems with similar or overlapping symptoms, but different causes and solutions: performance, resource constraints, Java heap leaks and native memory leaks. Performance problems are usually associated with excessive object creation and deletion, long delays in garbage collection, excessive operating system page swapping, and more. Resource constraints are caused by either not enough memory or memory too fragmented to allocate a large object -- this can be native or Java heap, but is usually Java heap-related.

    Java heap leaks are the classic Java memory leak, in which Java objects are continuously created without being released. This is usually caused by latent object references. Native memory leaks are associated with any continuously growing memory utilization that is outside the Java heap, such as allocations made by JNI code, drivers or even JVM allocations. This document will limit its scope to Java heap leaks.

    Methodology

    The methodology for finding Java memory leaks is straightforward:

• Identify the symptoms
• Turn on verbose garbage collection
• Turn on profiling
• Analyze the trace

    Identify the symptoms

While Java contains a built-in garbage collector to collect objects that are no longer used, there is still a possibility that a program can leak memory. A memory leak is a possible suspect when:

    • The Java process, for example, the WebSphere Application Server (Application Server), unpredictably slows down, freezes, or stops, even when Administrative Server (3.5 and 4.0 only) and Web Server are fine.

    • The size of memory used by the Java process, such as Application Server, steadily rises and is not kept steady by garbage collection.

Eventually the Java process throws a java.lang.OutOfMemoryError, which is a clear indicator that memory resources have been exhausted.

You need to distinguish normal memory exhaustion from a leak. If a Java application requests more storage than the runtime heap offers, it can be because of poor design. For instance, if an application creates multiple copies of an image or loads a file into an array, it will run out of storage when the image or file is very large. This is normal resource exhaustion: the application is working as designed, even though the design is boneheaded.

    But if an application steadily increases its memory utilization while processing the same kind of data, you might have a memory leak.

Turn on verbose garbage collection

    One of the quickest ways to assert that a memory leak is occurring is to use verbose garbage collection. Memory constraint problems can usually be identified by examining patterns in the verbosegc output. Java heap leak resolution involves using the verbosegc output along with some JVM-specific techniques.

    The java command provides an input argument, -verbosegc, which generates a trace each time the garbage collection (GC) process is started. That is, as memory is being garbage-collected, summary reports are printed to standard error. The output shows the JVM trying to allocate memory as needed.

    Typical output looks as follows:
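The GC trace itself is not reproduced here; as an illustration only, a stanza from an IBM 1.3 JVM verbosegc trace looks roughly like the following (the exact layout varies by JVM vendor and level, and these numbers are invented for the example):

```
<AF[32]: Allocation Failure. need 8208 bytes, 2765 ms since last AF>
<AF[32]: managing allocation failure, action=1 (1895696/19725304)>
  <GC(33): GC cycle started Wed Apr 02 07:13:12 2003
  <GC(33): freed 11736792 bytes, 69% free (13632488/19725304), in 68 ms>
  <GC(33): mark: 57 ms, sweep: 11 ms, compact: 0 ms>
  <GC(33): refs: soft 0 (age >= 32), weak 0, final 2, phantom 0>
<AF[32]: completed in 71 ms>
```

Each `<AF...>` stanza records one allocation failure; the pair of numbers in parentheses is free bytes over total heap bytes.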

Each block (or stanza) in this GC trace file is numbered in increasing order. An "Allocation Failure" by itself is normal. To make sense of the trace, look at successive Allocation Failure stanzas for the freed memory, in bytes and as a percentage, decreasing over time while the total heap size (here, 19725304 bytes) increases. These are typical signs of memory depletion. Once this is verified, heapdumps, which we'll describe later, should be generated to isolate the cause of the leak.

Memory fragmentation can also occur. When the heap is almost full and an allocation request cannot be satisfied, a GC is triggered. However, the freed memory fragments may not be contiguous. As a result, an Out of Memory condition can occur even though the heap reports enough total free space. Figure 1 illustrates this case.

    Figure 1: A situation where a fragmented heap cannot satisfy an allocation request

Turn on profiling

    The different JVMs offer ways to generate trace files that reflect the heap activity. This is called profiling the heap. This is not to be confused with other types of profiling, such as performance measurements, which are outside the scope of this paper.

    Heap profiling produces a trace file, generally quite large, that contains information about the type and size of objects allocated in the heap. Once this trace has been collected, you can analyze it to pinpoint the problem.

    Analyze the trace

    This paper focuses on two different heap trace formats, the Sun format known as Hprof and the IBM format known as IBM heapdump. The tools applicable to each format are different, but the idea is the same: find a block of objects in the heap that should not be there, and determine if these objects accumulate instead of being cleaned up. Of particular interest are transient objects that are known to be allocated each time a certain event is triggered in the Java application. The presence of many instances of objects that ought to exist only in small quantities generally indicates an application bug.

Finally, solving memory leaks requires the application programmers to review their code and fix the leak. Telling the coders what kind of objects are leaking is a great help and considerably speeds up the correction of the program. A common bug is a reference to a transient object that is involuntarily kept in another object, preventing the garbage collector from deleting the transient object and reclaiming its space. This paper won't venture into the vast territories of Java debugging.
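As a minimal illustration of such a latent reference, consider a handler that stores per-request data in a static collection and never removes it. This is a hypothetical sketch (the class, method, and field names are invented for the example), not code from any WebSphere application:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a latent object reference: each request's
// transient data is added to a static cache that is never cleared, so the
// garbage collector can never reclaim it and the heap grows steadily.
public class LeakExample {

    // Reachable forever through the static field -- the root of the leak.
    private static final List<byte[]> cache = new ArrayList<byte[]>();

    public static void handleRequest(int requestId) {
        byte[] requestData = new byte[1024]; // transient per-request data
        cache.add(requestData);              // bug: reference kept after the request ends
    }

    public static int referencedEntries() {
        return cache.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequest(i);
        }
        // Every request leaves its data live, even though it is no longer needed.
        System.out.println("entries still referenced: " + referencedEntries());
    }
}
```

In a heap profile of such a program, the byte arrays allocated in handleRequest would keep accumulating while the program processes the same kind of data, which is exactly the pattern the heap traces below are meant to expose.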

Sun and IBM heapdump differences

All 1.2 or later JDKs (except for iSeries) have a built-in standard feature, called "Hprof." This feature writes a map of memory usage, or "heapdump," to a file which can be manually inspected or fed into a software tool for analysis. Hprof dumps display objects, their addresses and sizes, and Java stacks, which show where in code the objects were allocated. This document explains how to use Hprof to create and interpret heapdumps in the following sections.

IBM JDKs, which power most versions of WebSphere Application Server (including all JDKs for WebSphere Application Server 5.01 and later, except for iSeries), include an additional utility for creating an IBM-specific heapdump format. The IBM heapdump format is not compatible with the Hprof format: you cannot feed an IBM heapdump to an Hprof tool such as the Heap Analysis Tool (HAT), and conversely, you cannot feed an Hprof dump to an IBM heapdump tool. You must choose the format (Hprof or IBM) of the heapdump generated by your JVM.

    IBM heapdumps, like Hprof dumps, list allocated objects and their addresses and sizes. Unlike Hprof dumps, IBM heapdumps do not show where in the Java code objects are allocated, or relate objects that were allocated at the same time or in the same Java method. Like Hprof dumps, they show object relationships, that is, who has pointers to whom. This can be important, since it is these relationships which prevent objects from being garbage-collected. The section Profiling the heap with IBM heapdumps explains how to generate and interpret IBM heapdumps.

In general, Hprof dumps are more direct in pinpointing memory leak issues, because they show where in the code objects are allocated, and show combined memory size and occurrences for all objects allocated in the same code location. However, enabling Hprof causes significant performance degradation, which may make it unacceptable in a production environment and may even skew test results, whereas IBM heapdumps require little runtime overhead.

Profiling the heap with Hprof

The easiest way to obtain Hprof output is to add the -Xrunhprof option to the Java command line. For example:

    java -Xrunhprof -X... -D.... com.mycompany.mypackage.myclass

    The section below explains how to add the -Xrunhprof option to the Application Server JVM for the different levels of the Application Server and the various platforms.

    Enabling Hprof output

    The following sections describe how to enable Hprof output on the various platforms.

    WebSphere Application Server 3.5 and 4.01 simple configuration for z/OS

Locate the application server's was.conf file.

1. Add this parameter to was.conf:

   appserver.java.extraparm=-Xrunhprof:file=/some_writable_directory/name_of_output_file

   Note: You must use the file option on z/OS. The application server will not write the Hprof output to a default working directory.

2. Restart the application server.

    Note: Currently, this option does not produce the expected output, whether you send a kill -s SIGQUIT to the PID of the address space, or shut down WebSphere Application Server normally. This is being debugged by WebSphere and Java support.

    WebSphere Application Server 4.01 J2EE Server for z/OS

1. Start the System Management console and connect to the target system.
2. Start a new conversation.
3. Expand the tree until you reach the target J2EE server.
4. Right-click on the J2EE server and select Modify.
5. Scroll down on the right panel until you reach the environment variables section.
6. Within the environment variables, scroll down until you see a blank entry.
7. Double-click inside this entry to display a dialog box for creating a new environment variable.
8. Enter the following environment variable:

   Name: JVM_EXTRA_OPTIONS, value: -Xrunhprof

9. Click OK, then Save.
10. Validate, commit, and complete the tasks, then activate the conversation.

    Notes:

• This procedure will write the java.hprof.txt file to your /tmp directory, not your working directory.
• If you have more than one option to pass to the JVM, use the WAS_JAVA_OPTIONS environment variable documented in APAR PQ59164.

    Note: Currently, this option does not produce the expected output when sending a kill -s SIGQUIT to the PID of the address space. This is being investigated through PMR 09839,999. This option works if a normal shutdown of the J2EE server takes place via the MVS STOP command.

    WebSphere Application Server 3.5 distributed (version 3.5.6 and beyond)

    Locate the application server property page for each application server using the Admin Console:

1. Start the Admin Console.
2. Expand WebSphere Administrative Domain by clicking on the +.
3. Expand the target host.
4. Select the target server (for example, MyServer) and stop it.
5. In the right pane of the General tab, update the Command line arguments field with -Xrunhprof.
6. Optionally, add specifications to the Hprof option, as follows:

   -Xrunhprof:depth=n,file=filename.txt

   where n = the depth of Java stack traces you wish to view. A small depth (such as 5) will make the heapdump smaller and easier to manage.

7. Click Apply.
8. Restart the application server.
9. Trigger a thread dump using the following commands:

   On Windows: DrAdmin -port portnumber -dumpThreads
   On AIX: kill -3 JavaProcessPID

    WebSphere Application Server 4.x distributed

1. Start the Admin Console.
2. Expand WebSphere Administrative Domain by clicking on the +.
3. Expand Nodes.
4. Expand the node that hosts the problem application server.
5. Expand Application Servers.
6. Select the problem application server.
7. Click the JVM Settings tab.
8. Click the Advanced JVM Settings tab.
9. Check Run Hprof.
10. Enter specifications in the Hprof arguments field, as follows:

    depth=n,file=filename.txt

    where n = the depth of Java stack traces you wish to view. A small depth, such as 5, will make the heapdump smaller and easier to manage.

11. Click OK, then Apply.

    Note: If the application server is a member of a server group, the Hprof settings will be disabled. The steps in this case are the same, except that the settings must be made on the server group properties instead of on the application server.

12. Restart the problem application server, and trigger a thread dump using the following commands:

    On Windows: DrAdmin -port portnumber -dumpThreads
    On AIX: kill -3 JavaProcessPID

    If the absolute path was not given as part of the heap file name, look for it in the install_root/bin directory.

    WebSphere Application Server 5.0 distributed

1. Start the Admin Console and navigate to Servers => Application Servers => server_name => Configuration tab => Process Definition => Java Virtual Machine.
2. Check Run Hprof and specify format=a in the Hprof Arguments box below it.
3. Save the configuration and restart the application server.
4. Run the application that causes the leak.
5. Using the wsadmin command, get a handle to the problem application server as follows:

   wsadmin>set jvm [$AdminControl completeObjectName type=JVM,process=server_name,*]

6. Generate a thread dump (which also triggers Hprof, if enabled):

   wsadmin>$AdminControl invoke $jvm dumpThreads

The file will be written to the default working directory (for example, c:\winnt\system32 on Windows), unless it is overridden in the Admin Console => target server => General tab => Working directory field.

    Default behavior

    A header is written to java.hprof.txt when the Java process starts. A complete memory profile output, or heapdump, is appended when the Java process exits.

    Incremental heapdumps are added if a signal is sent to the running Java process; for example, kill -3 JavaProcessPID. Alternately, this signal can be delivered interactively if the running process has an associated console by typing Ctrl+\ (on Solaris) or Ctrl+Break (on Windows).

    On WebSphere Application Server 3.5 or 4.x distributed, you can also use the DrAdmin command:

    DrAdmin -port portnumber -dumpThreads

    If interactive mode is not practical, as is the case in server or daemon processes, the kill -3 option is the best alternative. The -Xrunhprof:net=host:port option is more difficult to use, because it requires a client process running on the specified host and listening on the specified port prior to starting the Java process. You can get an example of such a client from http://www.javasoft.com.
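In case a ready-made client is not at hand, the listening side can be sketched in a few lines of Java. This is an illustrative sketch only (the class name, default port, and output file name are invented), not the client distributed by Sun: it accepts one connection and copies whatever the profiled JVM sends into a local file.

```java
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Minimal sketch of a listener for -Xrunhprof:net=host:port output.
// It must already be running when the profiled JVM starts.
public class HprofListener {
    public static void main(String[] args) throws Exception {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 7777;
        try (ServerSocket server = new ServerSocket(port);
             Socket client = server.accept();                      // wait for the JVM to connect
             InputStream in = client.getInputStream();
             FileOutputStream out = new FileOutputStream("java.hprof.net")) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {                     // copy the stream to disk
                out.write(buf, 0, n);
            }
        }
    }
}
```

The resulting file can then be analyzed like any other Hprof output file.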

    There are other ways to deliver a signal interrupt to a running application server.

    Note: Every time DrAdmin is called with the -dumpThreads option, a complete heapdump is appended to the output file.

    Performance considerations

It is our experience that enabling heap profiling slows down a Java process significantly. At the time of this writing, no quantification had been made to evaluate the magnitude of this performance degradation. A good practice is to reproduce a memory leak condition in a test environment. Take the performance degradation caused by enabling heap profiling into consideration, and measure it in pre-production before you enable profiling on a production system.

    WebSphere Application Server 3.5 running on Win2000 showed a 2.2X performance degradation in an application server, when the default -Xrunhprof option was added. The base measurement was performed on a 1 GHz / 1 GB RAM ThinkPad. 1000 URL getRequests issued against a servlet that performed EJB queries took 19 seconds. When Hprof was enabled, the response time went up to 43 seconds.

    WebSphere Application Server 5.0 running on the same set-up as above showed a base response time of 15 seconds. When Hprof was enabled, the response time degraded to 46 seconds, roughly a 3X increase. WebSphere Application Server 4.x running on Windows showed a 1.5X degradation in response time.

    WebSphere Application Server 3.5 running on z/OS showed a 1.75X performance degradation. A WebSphere Application Server 4.01 basic configuration running on z/OS showed a 1.91X performance degradation. WebSphere Application Server 4.01 J2EE server running on z/OS showed a 1.25X performance degradation.

    Creating and reading heapdumps using Hprof

    Let's examine the Hprof command options and see how to use its output to find memory leaks.

    Begin by looking at the statistical report toward the bottom of the heapdump, with the section titled SITES BEGIN:

SITES BEGIN (ordered by live bytes) Wed Apr 02 07:13:47 2003
          percent          live          alloc'ed  stack class
 rank   self  accum     bytes objs     bytes objs trace name
    1 99.76% 99.76% 125829168   12 125829168   12  2695 [C
    2  0.03% 99.79%     32776    2     32776    2  1366 [C
    3  0.01% 99.80%     16392    2     16392    2  1364 [B
    4  0.01% 99.81%     16032  102     16240  112     1 [C
...
SITES END

    The self column shows what percentage of live bytes are due to that row, and accum shows the percentage of memory due to all rows up to and including that row (which are presented in descending order by amount of memory allocated). Therefore, in the table above, 99.76% of all live, non-collectable, bytes are due to 12 character arrays, which were allocated at stack trace #2695. 0.03% of live bytes were due to character arrays allocated at stack trace #1366.

    Live versus alloc'ed distinguishes non-collectable, currently live objects from all objects ever allocated, whether currently live or previously garbage-collected, at a particular call site. Look for rows that have live values nearly equal to alloc'ed values, which is a sign that an allocation site is responsible for a leak. Problems with memory leaks often surface in the SITES section. As seen above, one or two sites are often responsible for the vast majority of the total allocated memory.

    Once a suspicious site has been found, look at its number in the stack trace column. Then go to the TRACE section in the file and look for that trace number. It will list the methods responsible for the allocation.

The difficulty with this information is that low-level objects, such as character arrays, float to the top. The really helpful objects, such as the heads of a leaking data structure, are far fewer in number and will be somewhere far down the list. Locating them is sometimes difficult. This is where third-party vendor tools come in handy.

Hprof advanced usage

The following sections are paraphrased from the IBM Developer Kit and Runtime Environment, Java Technology Edition, Version 1.3.1, Diagnostics Guide, SC34-6200-00, written by the IBM Java Technology Center in Hursley and Bangalore.

    Understanding the JVMPI

    The Java Virtual Machine Profiling Interface (JVMPI) is a two-way interface that allows communication between the JVM and a profiler. JVMPI enables third parties to develop profiling tools based on this interface. The interface contains mechanisms that enable the profiling agent to notify the JVM about the kinds of information it wants to receive, as well as a means of receiving the relevant notifications. Several tools are based on this interface, such as Jprobe, OptimizeIt, TrueTime, and Quantify. These are all third-party commercial tools, so IBM cannot make any guarantees or recommendations regarding their use. The Hprof profiling agent is based on this interface.

    Hprof options

    The command java -Xrunhprof:help displays the options available for Hprof:

Option Name and Value    Description                Default
---------------------    -----------                -------
heap=dump|sites|all      heap profiling             all
cpu=samples|times|old    CPU usage                  off
monitor=y|n              monitor contention         n
format=a|b               ascii or binary output     a
file=filename            write data to file         java.hprof (for binary), java.hprof.txt (for ascii)
net=host:port            send data over a socket    write to file
depth=size               stack trace depth          4
cutoff=value             output cutoff point        0.0001
lineno=y|n               line number in traces?     y
thread=y|n               thread in traces?          n
doe=y|n                  dump on exit?              y

    Example:

java -Xrunhprof:cpu=samples,file=log.txt,depth=3 FooClass

    Here is a detailed description of these options:

heap=dump|sites|all This option helps in the analysis of memory usage. It tells Hprof to generate stack traces, from which you can see where memory was allocated. If you use the heap=dump option, you get a dump of all live objects in the heap. With heap=sites, you get a sorted list of sites with the most heavily allocated objects at the top. The default, heap=all, gives both.

    cpu=samples|times|old The cpu option outputs information that is useful in determining where the CPU spends most of its time. If cpu is set to samples, the JVM pauses execution and identifies which method call is active. If the sampling rate is high enough, you get a good picture of where your program spends most of its time. If cpu is set to times, you receive precise measurements of how many times each method was called and how long each execution took. Although this is more accurate, it slows down the program.

    The cpu=old option provides an output format that is backward-compatible with an older version of the tool.

    monitor=y|n The monitor option can help you understand how synchronization affects the performance of your application. Monitors are used to implement thread synchronization, so getting information on monitors can tell you how much time different threads are spending when trying to access resources that are already locked. Hprof also gives you a snapshot of the monitors in use. This is very useful for detecting deadlocks.

    format=a|b The default is for the output file to be in ASCII format. Set format to b if you want to specify a binary format, which is required for some utilities such as the Heap Analysis Tool.

    file=filename The file option lets you change the name of the output file. The default name for an ASCII file is java.hprof.txt. The default name for a binary file is java.hprof.

    net=host:port To send the output over the network rather than to a local file, use the net option.

    depth=size The depth option indicates the number of method frames to display in a stack trace (the default is 4).

    thread=y|n If you set the thread option to y, the thread ID is printed beside each trace. This option is useful if it is not clear which thread is associated with which trace. This can be an issue in a multi-threaded application.

    doe=y|n The default behavior is to write profile information to the output file when the application exits. To collect the profiling data during execution, set doe (dump on exit) to n and use one of the methods (wsadmin, DrAdmin or kill -3) described in section Enabling Hprof output.

    Understanding Hprof output

    The top of the output file contains general header information such as an explanation of the options, copyright, and disclaimers. A summary of each thread appears next.

    You can see the output after using Hprof with a simple program, as shown below. This test program creates and runs two threads for a short time. From the output, you can see that the two threads called respectively apples and oranges were created after the system-generated main thread. Both threads end before the main thread. The address, identifier, name, and thread group name are displayed for each thread. You can see the order in which threads start and finish.

THREAD START (obj=11199050, id = 1, name="Signal dispatcher", group="system")
THREAD START (obj=111a2120, id = 2, name="Reference Handler", group="system")
THREAD START (obj=111ad910, id = 3, name="Finalizer", group="system")
THREAD START (obj=8b87a0, id = 4, name="main", group="main")
THREAD END (id = 4)
THREAD START (obj=11262d18, id = 5, name="Thread-0", group="main")
THREAD START (obj=112e9250, id = 6, name="apples", group="main")
THREAD START (obj=112e9998, id = 7, name="oranges", group="main")
THREAD END (id = 6)
THREAD END (id = 7)
THREAD END (id = 5)

    The trace output section contains regular stack trace information. The depth of each trace can be set, and each trace has a unique ID:

TRACE 5:
 java/util/Locale.toLowerCase(Locale.java:1188)
 java/util/Locale.convertOldISOCodes(Locale.java:1226)
 java/util/Locale.<init>(Locale.java:273)
 java/util/Locale.<init>(Locale.java:200)

A trace contains a number of frames, and each frame contains the class name, method name, file name, and line number. In the example above, you can see that line 1188 of Locale.java (which is in the toLowerCase method) has been called from the convertOldISOCodes() method in the same class. These traces are useful in following the execution path of your program. If you set the monitor option, a monitor dump gives output that looks like this:

MONITOR DUMP BEGIN
 THREAD 8, trace 1, status: R
 THREAD 4, trace 5, status: CW
 THREAD 2, trace 6, status: CW
 THREAD 1, trace 1, status: R
 MONITOR java/lang/ref/Reference$Lock(811bd50) unowned
  waiting to be notified: thread 2
 MONITOR java/lang/ref/ReferenceQueue$Lock(8134710) unowned
  waiting to be notified: thread 4
 RAW MONITOR "_hprof_dump_lock"(0x806d7d0)
  owner: thread 8, entry count: 1
 RAW MONITOR "Monitor Cache lock"(0x8058c50)
  owner: thread 8, entry count: 1
 RAW MONITOR "Monitor Registry lock"(0x8058d10)
  owner: thread 8, entry count: 1
 RAW MONITOR "Thread queue lock"(0x8058bc8)
  owner: thread 8, entry count: 1
MONITOR DUMP END
MONITOR TIME BEGIN (total = 0 ms) Thu Aug 29 16:41:59 2002
MONITOR TIME END

    The first part of the monitor dump contains a list of threads, including the trace entry that identifies the code the thread executed. There is also a thread status for each thread where:

• R = Runnable
• S = Suspended
• CW = Condition Wait
• MW = Monitor Wait

    Next is a list of monitors along with their owners and an indication of whether there are any threads waiting on them.

    The Heapdump is the next section of the output file. This is a list of different areas of memory and shows how they are allocated:

CLS 1123edb0 (name=java/lang/StringBuffer, trace=1318)
 super 111504e8
 constant[25] 8abd48
 constant[32] 1123edb0
 constant[33] 111504e8
 constant[34] 8aad38
 constant[115] 1118cdc8
CLS 111ecff8 (name=java/util/Locale, trace=1130)
 super 111504e8
 constant[2] 1117a5b0
 constant[17] 1124d600
 constant[24] 111fc338
 constant[26] 8abd48
 constant[30] 111fc2d0
 constant[34] 111fc3a0
 constant[59] 111ecff8
 constant[74] 111504e8
 constant[102] 1124d668
 ...
CLS 111504e8 (name=java/lang/Object, trace=1)
 constant[18] 111504e8

    CLS tells you that memory is being allocated for a class. The hexadecimal number following it is the actual address where that memory is allocated.

    Next is the class name, followed by a trace reference. Use this to cross-reference the trace output and see when this is called. If you refer back to the particular trace, you can get the actual line number of code that led to the creation of this object.

The addresses of the constants in this class are also displayed and, in the above example, so is the address of the class definition for the superclass. Both classes are children of the same superclass (with address 111504e8). Looking further, you can see that class's definition and name, which in this case turns out to be the Object class (the class that every class inherits from). The JVM loads the entire superclass hierarchy before it can use a subclass; thus, class definitions for all superclasses are always present. There are also entries for Objects (OBJ) and Arrays (ARR):

OBJ 111a9e78 (sz=60, trace=1, class=java/lang/Thread@8b0c38)
    name                          111afbf8
    group                         111af978
    contextClassLoader            1128fa50
    inheritedAccessControlContext 111aa2f0
    threadLocals                  111bea08
    inheritableThreadLocals       111bea08
ARR 8bb978 (sz=4, trace=2, nelems=0, elem type=java/io/ObjectStreamField@8bac80)
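The superclass-loading rule mentioned above can be seen directly with a toy example (the class names Parent and Child are invented for illustration): the JVM runs a superclass's static initializer before its subclass's, which is why a heapdump always contains the full superclass chain.

```java
// InitOrder: an illustrative sketch (not from the paper) showing that the
// JVM initializes a superclass before its subclass is first used.
public class InitOrder {
    static final StringBuilder LOG = new StringBuilder();

    static class Parent {
        static { LOG.append("Parent;"); }  // runs first
    }

    static class Child extends Parent {
        static { LOG.append("Child;"); }   // runs only after Parent's initializer
    }

    public static String load() {
        new Child();                       // triggers Parent init, then Child init
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(load());        // prints Parent;Child;
    }
}
```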

    If you set the heap option to sites or all (dump and sites), you also get a list of each area of storage allocated by your code. The sites that allocate the most memory are at the top:

SITES BEGIN (ordered by live bytes) Thu Aug 29 16:30:31 2002
          percent          live        alloc'ed   stack class
 rank   self  accum     bytes objs   bytes  objs  trace name
    1 18.18% 18.18%     32776    2   32776     2   1332 [C
    2  9.09% 27.27%     16392    2   16392     2   1330 [B
    3  8.80% 36.08%     15864   92   15912    94      1 [C
    4  4.48% 40.55%      8068    1    8068     1     31 [S
    5  4.04% 44.59%      7288    4    7288     4   1130 [C
    6  3.12% 47.71%      5616   36    5616    36      1
    7  2.51% 50.22%      4524   29    4524    29      1 java/lang/Class
    8  2.05% 52.27%      3692    1    3692     1    806 [L;
    9  2.01% 54.28%      3624   90    3832    94     77 [C
   10  1.40% 55.68%      2532    1    2532     1     32 [I
   11  1.37% 57.05%      2468    3    2468     3   1323 [C
   12  1.31% 58.36%      2356    1    2356     1   1324 [C
   13  1.14% 59.50%      2052    1    2052     1     95 [B
   14  1.02% 60.52%      1840   92    1880    94      1 java/lang/String
   15  1.00% 61.52%      1800   90    1880    94     77 java/lang/String
   16  0.64% 62.15%      1152   10    1152    10   1390 [C
   17  0.57% 62.72%      1028    1    1028     1     30 [B
   18  0.52% 63.24%       936    6     936     6      4
   19  0.45% 63.70%       820   41     820    41     79 java/util/Hashtable$Entry

    The following table identifies the class name that appears in the rightmost column for each type that can be allocated:

Type signature    Data type

[Z                boolean
[B                byte
[C                char
[S                short
[I                int
[J                long
[F                float
[D                double
[L                object array

    In this example, Trace 1332 allocated 18.18% of the total allocated memory. This works out to be 32776 bytes.
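These type signatures are the JVM's standard array descriptors, and you can reproduce them from ordinary Java code. The following sketch (an illustration, not part of Hprof) shows that getClass().getName() returns the same strings that appear in the SITES table, except that Hprof prints class names with '/' separators rather than '.':

```java
// ArrayDescriptors: an illustrative sketch showing where the "[C", "[B",
// "[I" names in the SITES table come from -- they are the JVM's internal
// array type descriptors, also returned by getClass().getName().
public class ArrayDescriptors {
    public static String descriptorOf(Object array) {
        return array.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(descriptorOf(new char[4]));   // [C
        System.out.println(descriptorOf(new byte[4]));   // [B
        System.out.println(descriptorOf(new int[4]));    // [I
        System.out.println(descriptorOf(new String[1])); // [Ljava.lang.String;
    }
}
```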

    The cpu option gives profiling information on the CPU. If cpu is set to samples, you get output containing the results of periodic samples during execution of the code. At each sample, the code path being executed is recorded and you get a report such as this:

CPU SAMPLES BEGIN (total = 714) Fri Aug 30 15:37:16 2002
rank   self  accum   count trace method
   1 76.28% 76.28%     501    77 MyThread2.bigMethod
   2  6.92% 83.20%      47    75 MyThread2.smallMethod
...
CPU SAMPLES END

    You can see that the bigMethod() was responsible for 76.28% of the CPU execution time and was being executed 501 times out of the 714 samples. If you use the trace IDs, you can see the exact route that led to this method being called.

    Using the HAT tool to interpret heapdumps

    HAT is an interactive tool from Sun that helps you interpret heapdumps. It is available at developer.java.sun.com/developer/onlineTraining/Programming/JDCBook/perf3.html#profile. HAT analyzes binary heapdumps produced by Hprof running with the format=b option.

The binary Hprof output file is passed to the HAT program as an input argument, along with a free TCP/IP port number. HAT runs on any Java 2 JDK and displays the details of the heapdump in a browser, which you point at http://localhost:port, where port is the free TCP/IP port chosen above. HAT lets you interactively follow objects back to their source and find instances of specific classes. Additionally, HAT allows you to view and follow all Java root (static) objects and the objects they hold.

Profiling the heap with IBM heapdumps

With the IBM JRE, you can enable high-performance heap profiling and obtain heapdumps. This involves setting the process environment variable IBM_HEAPDUMP to true. The performance degradation is minimal, generally no more than a 1-5% impact on response time.

    IBM_HEAPDUMP is supported on all IBM JDKs (1.1.8 and up) on distributed platforms, which means that it is supported by the JDKs supplied with WebSphere Application Server 3.0.2.4 and above on the Windows, AIX and Linux platforms, as well as all WebSphere Application Server 4.0x and 5.0x versions.

    Is it IBM_HEAPDUMP or IBM_HEAP_DUMP?

Either. To activate the heapdump feature, you must set either the IBM_HEAPDUMP or the IBM_HEAP_DUMP environment variable to true before you start the JVM.

For releases of SDK 1.3.1 prior to SR3, you are required to set the variable to true. For more recent releases, any value will work, as long as the variable is set. If you are in doubt, just set IBM_HEAPDUMP to true; that works in every case.
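The activation rule above can be summarized as a small predicate. This is an illustrative sketch with an invented class and method name, not IBM code; it simply encodes the two cases described in the text:

```java
// HeapdumpEnvCheck: an illustrative sketch of the activation rule above.
// On SDK 1.3.1 pre-SR3 the variable must literally be "true"; on later
// releases any value activates the feature, as long as the variable is set.
public class HeapdumpEnvCheck {
    public static boolean activates(String envValue, boolean preSr3) {
        if (envValue == null) {
            return false;                     // variable not set at all
        }
        return preSr3 ? envValue.equals("true") : true;
    }

    public static void main(String[] args) {
        System.out.println(activates("true", true));  // true on any release
        System.out.println(activates("1", true));     // false pre-SR3
        System.out.println(activates("1", false));    // true on later releases
    }
}
```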

    When a heapdump is written to the heapdump file, you will typically see the following in the standard error (stderr) stream:

Writing Heap dump ....
Written Heap dump to somepath.YYYYMMDD.HHMMSS.PID.txt

    Enabling IBM heapdumps on WebSphere Application Server 5.0 distributed

    To enable IBM heapdump output on Application Server 5.0 distributed, do the following:

1. Start the Admin Console.
2. Navigate to Servers => Application Servers => server_name => [Configuration Tab] => Process Definition => Environment Entries.
3. Select New.
4. In the Name field, enter IBM_HEAPDUMP. In the Value field, enter true.
5. Save the configuration.
6. Restart the application server.

    To obtain an incremental IBM heapdump, use the wsadmin command as follows:

1. From the bin directory, type wsadmin.
2. Under wsadmin, type:

set jvm [$AdminControl completeObjectName type=JVM,process=server1,*]
$AdminControl invoke $jvm dumpThreads

where server1 is the name of the application server. The dump file will be created in the WebSphere Application Server install directory under the name javacore.yyyymmdd.mmmmmm.nnnn.txt.

    Enabling IBM heapdumps on WebSphere Application Server 4.x distributed

    To enable IBM heapdump output on Application Server 4.x distributed, do the following:

1. Start the Admin Console.
2. Expand the WebSphere Administrative Domain by clicking on the +.
3. Expand Nodes.
4. Expand the node that hosts the problem application server.
5. Expand Application Servers.
6. Select the problem application server.
7. Click the General tab and select Environment.
8. Click Add.
9. In the Name field, enter IBM_HEAPDUMP. In the Value field, enter true.
10. Click Apply and restart the application server.

    To obtain an incremental IBM heapdump, use the DrAdmin command as follows:

    1. Look in the application server log file for DrAdmin port. For example:

    [02.02.19 15:17:07:073 CST] 59edf8e DrAdminServer I WSVR0053I: DrAdmin available on port 1793

2. Wait for the problem to occur, if possible.
3. On the command line, type:

    DrAdmin -serverPort 1793 -dumpThreads

    4. In WebSphere_root\bin or c:\winnt\System32, look for a javacorexxx.yyyyyyyyy.txt file with a timestamp matching the date and time when DrAdmin was executed.

    Invoking IBM heapdumps

    This section describes other ways of invoking IBM heapdumps.

    Invoking with a signal

    Another way you can invoke the IBM heapdump mechanism is by sending a signal to the JVM. This is similar to the way you can trigger a Java Hprof dump, as described in Profiling the heap with Hprof.

    On distributed UNIX, AIX and Linux

    On a Unix, AIX, or Linux system, send a SIGQUIT to the JVM by running the command:

    kill -QUIT JVM_Process_ID

    where JVM_Process_ID is the process ID of the JVM. The JVM will temporarily stop processing and invoke the heapdump mechanism.

  • On Windows

On Windows, you can send the equivalent break signal by pressing CTRL+Break in the JVM's console window. However, most Java processes, especially those for WebSphere Application Servers, are background processes; therefore, you should use the WebSphere Application Server tools to invoke the heapdumps:

• WebSphere Application Server v4: use DrAdmin
• WebSphere Application Server v5: use wsadmin

    Invoking automatically on an OutOfMemoryError

    You can arrange for an IBM heapdump to be generated automatically when an OutOfMemoryError condition is encountered in the JVM by setting the IBM_HEAPDUMP_OUTOFMEMORY environment variable to true before launching the JVM.

    If you have enabled verbosegc in the JVM, you should also see the message "totally out of heap space" right before the OutOfMemoryError appears and the IBM Heapdump is triggered. Here is an example of what you will see:

Writing Heap dump ....
Written Heap dump to D:\Code\memtest\heapdump.20021104.163757.2312.txt
java.lang.OutOfMemoryError

    This feature works independently of the IBM_HEAPDUMP or IBM_HEAP_DUMP variables. It is available on the following JVM versions:

• Java 1.3: JDK 1.3.1 SR3 and above
• Java 1.4: JDK 1.4.1 and above

    Invoking through special code

    You can invoke the heapdump mechanism with Java code. This is easily done by invoking the static method HeapDump() in com.ibm.jvm.Dump. In the following example, the code first calls System.gc() to invoke the garbage collector to clean the unreferenced objects off the heap and then calls HeapDump() to generate a dump:

    System.gc(); com.ibm.jvm.Dump.HeapDump();

    The code invokes the garbage collector before the heapdump because on older JVMs, an IBM heapdump also dumps objects that are ready to be garbage-collected. This pollutes the dump with useless objects. Starting with JVM 1.4.1, the garbage collection is automatically invoked right before the heapdump, which means that the heapdump never contains garbage-collectable objects.

    This feature is available on the following JVM versions:

• Java 1.3: JDK 1.3.1 SR3 and above
• Java 1.4: JDK 1.4.1 and above

    Reading IBM heapdump output

Unlike an Hprof-type heapdump, an IBM heapdump file is simple and homogeneous: except for one statement at the beginning of the file and one at the end, the entire content consists of one- or two-line entries, all in the same format.

    The first line of the heapdump identifies the dumping JVM, as in this example:

    // Version: J2RE 1.3.1 IBM Windows 32 build cn131-20021107

    The last line displays totals of each category of allocation in the dump: classes, objects, arrays of primitives, and arrays of objects:

    // EOF: Total: 212945 Classes: 5141 Objects: 142218 ObjectArrays: 16193 PrimitiveArrays: 49393 //

    The body of the dump consists of entries in the following form:

hex-address1 [size] type object-name
    hex-address2 hex-address3 hex-address4 ... hex-addressN

    where:

• hex-address1 is the location within the JVM of the allocated object.
• size is the amount of memory the object represents, in bytes. This is only the memory allocated to represent the object itself; it does not include memory used by other objects to which it refers.
• type is the meta-class of the object: class, object, array of objects, or array of primitives.
• object-name is the name of the class, or the name of the object, that was allocated.
• The next hex addresses (if any), indented on the next line, represent the locations of objects pointed to by this object.

    For example:

0x007ede18 [256] class java/lang/Package
    0x00a24190 0x00a24200 0x00a242c8

    means that the class java/lang/Package was loaded at location 0x007ede18, that it takes up 256 bytes, and that this class object holds references to three other objects. You can search the heapdump for details on these other objects. Searching on the address 0x00a24190, you'll find the following entry:

0x00a24190 [56] java/util/HashMap
    0x00a24158

    In other words, the class java/lang/Package holds a reference to an object of class java/util/HashMap, which takes up 56 bytes. This object in turn has a reference to another object at location 0x00a24158, which takes up an additional amount of memory.
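The entry format just described is regular enough to parse mechanically. Here is a toy parser for the one- or two-line entry shape (an illustrative sketch; the class and method names are invented, and real heapdumps have details this ignores):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// HeapdumpEntry: a toy parser for the entry shape described above:
//   hex-address [size] type object-name
//       hex-address hex-address ...       (optional line of references)
// This is an illustrative sketch, not IBM's format specification.
public class HeapdumpEntry {
    private static final Pattern HEADER =
        Pattern.compile("(0x[0-9a-fA-F]+)\\s+\\[(\\d+)\\]\\s+(.+)");

    public final String address;             // location of the object
    public final long size;                  // bytes for the object itself
    public final String name;                // meta-class plus class name
    public final List<String> children = new ArrayList<>();

    private HeapdumpEntry(String address, long size, String name) {
        this.address = address;
        this.size = size;
        this.name = name;
    }

    // Parse a header line plus an optional indented line of referenced addresses.
    public static HeapdumpEntry parse(String header, String refsLine) {
        Matcher m = HEADER.matcher(header.trim());
        if (!m.matches()) throw new IllegalArgumentException("bad entry: " + header);
        HeapdumpEntry e =
            new HeapdumpEntry(m.group(1), Long.parseLong(m.group(2)), m.group(3));
        if (refsLine != null && !refsLine.trim().isEmpty()) {
            for (String addr : refsLine.trim().split("\\s+")) e.children.add(addr);
        }
        return e;
    }
}
```

For example, parsing the java/lang/Package entry above yields size 256 and three child addresses.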

    Would removing the class java/lang/Package, through garbage collection, have removed this particular instance of java/util/HashMap from memory? Not necessarily. In order to determine this, you would have to search the dump for all objects that contained references to the HashMap's location of 0x00a24190.
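That parent search amounts to inverting the reference graph: the dump records only forward references, so finding all referrers of an address requires a pass over every entry. A toy sketch, assuming the dump has already been parsed into a forward-reference map (all names here are invented for illustration):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// ReverseRefs: a toy sketch of the parent search described above. A
// heapdump lists forward references only, so deciding whether an object
// would become unreachable requires inverting the graph to find ALL of
// its referrers, not just the one you happen to be looking at.
public class ReverseRefs {
    // children: address -> list of addresses that object references
    public static Map<String, List<String>> invert(Map<String, List<String>> children) {
        Map<String, List<String>> parents = new HashMap<>();
        for (Map.Entry<String, List<String>> e : children.entrySet()) {
            for (String child : e.getValue()) {
                parents.computeIfAbsent(child, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        return parents;
    }
}
```

With the example entries above, only if 0x007ede18 turned out to be the sole parent of 0x00a24190 would collecting java/lang/Package free that HashMap.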

    Examples of the other kinds of objects represented in IBM heapdumps are arrays of primitives:

    0x00a24570 [128] primitive array

    and arrays of objects:

    0x00a24158 [56] array of java/util/HashMap$Entry

Here are a few notes about the IBM heapdump files:

• There is no particular ordering of the entries, such as time allocated, size, or object relationship.
• There is no combining of data by object type. Each allocation of a String object, for example, is shown separately.
• There is no indication of where, in the Java code, an allocation occurred.
• IBM heapdumps do not contain GCRoot information. This means that:
  • Objects ready for garbage collection will be included in the dump but not identified as such.
  • Objects referenced only by local methods and by JNI global references will show up as independent trees (generally thousands more than really exist).

Summary: Unlike Sun heapdumps, IBM heapdumps do not map memory allocation to Java code, nor do they include tables that clearly show which parts of the code are responsible for taking up the biggest chunks of memory. You'd have to be a cyborg, or a seriously disturbed individual, to read the raw file and understand which objects were related to a memory problem. On the plus side, however, IBM heapdumps impose a much lighter performance burden on the JVM and may therefore be more acceptable for diagnosing memory problems in a production environment.

    But, you may ask, if you can't interpret them, what good are they? You need a tool to interpret them. That tool is HeapRoots, discussed in the following section.

    Interpreting IBM heapdump output using the HeapRoots tool

    The HeapRoots tool is a post-processor program that reads a dump produced with the IBM_HEAPDUMP option, collates information based on object size, occurrences, and links, and enables you to view the information in various ways. HeapRoots reads only heaps created using the IBM_HEAPDUMP option on the JVM; it does not process Hprof-type heapdumps.

    To obtain the HeapRoots tool, go to http://www.alphaworks.ibm.com/tech/heaproots. Keep in mind that HeapRoots is provided strictly as-is.

Once you have the HeapRoots JAR file, you can invoke it at any command prompt as follows (Java version 1.2 or later must be in the path):

    java -jar HRnn.jar heapdumpfilename

where HRnn is the name of the HeapRoots jar file, which depends on the version. The output is large enough that you will want to redirect it to a file or (on UNIX) pipe it to more or a similar utility.

    A fragment of HeapRoots output follows:

0x007ec118 [1,364,912/29,455] class java/util/jar/JarFile
  2 children smaller than 1,048,576 total size/desc: 97,888/1,974
  0x00951650 [1,266,856/27,480] java/util/Vector
    0x02f066c0 [1,266,824/27,479] array of java/lang/Object
      367 children smaller than 1,048,576 total size/desc: 1,265,288/27,112
0x0094fc60 [28,612,016/507,751] array of com/ibm/ws/classloader/ReloadableClassLoader$CacheEntry
  6 children smaller than 1,048,576 total size/desc: 336/6
  0x01fd6fe8 [28,611,968/507,750] com/ibm/ws/classloader/ReloadableClassLoader$CacheEntry
    2 children smaller than 1,048,576 total size/desc: 184/1
    0x034a9b10 [28,611,936/507,749] com/ibm/ws/classloader/JarClassLoader
      10 children smaller than 1,048,576 total size/desc: 4,184/61
      0x034a9a10 [28,607,696/507,679] com/ibm/ws/classloader/CompoundClassLoader
        15 children smaller than 1,048,576 total size/desc: 263,632/3,087

This fragment shows two root objects, that is, objects with no parents: no other object holds a reference to them. They are the JarFile class and an array containing instances of the com/ibm/ws/classloader/ReloadableClassLoader$CacheEntry class. The JarFile class is the parent (counting all of its descendants) of 29,455 objects, taking over 1 MB of memory.

The array of CacheEntry objects is responsible for 507,751 objects and more than 28 MB of memory. It points to seven objects directly, but of these, six fall under the 1 MB threshold; in fact, the collective size of those six objects, including their descendants, is only 336 bytes. The seventh object is an instance of com/ibm/ws/classloader/ReloadableClassLoader$CacheEntry, no doubt an element of the array, and it is obviously responsible for the bulk of the memory held by its parent. It has three immediate children, of which one is, in turn, responsible for the rest of its parent's subtree. The numbers on the left help you keep track of how far into a root object's sub-tree you are.

    By default, root objects are displayed in order by address rather than size.

    If you were to browse the entire contents of the HeapRoots output, you would have to go quite far (at least in this case) to find what most of that 28 MB is being used for -- even deeper than the 64 levels to which HeapRoots by default limits its display! The HeapRoots command gives you options to refine what you want to see and how you want to see it. You can see these options by entering the command without a file name:

HeapRoots version 2.0.0
Usage: java -jar HRnn.jar [opts] heapdumpfilename
opts are:
 -e encoding - file encoding (use ISO8859_1 for ASCII on OS/390)
 -t size     - set threshold of object size, default 1048576
 -d depth    - set max depth for output, root-depth=0, default 64
 -a address  - only dump object at specified address
 -i          - interactive use
 -v          - verbose mode

    You could iteratively display the output and re-enter the HeapRoots command with these options to get to the heart of a memory problem. However, a better, faster approach is to use the interactive mode of HeapRoots, described below, to refine the data.

    Memory and performance problems with HeapRoots

    HeapRoots itself uses a large amount of memory in order to construct its model of memory usage. The precise amount of memory it uses depends upon the data in the input heapdump. It makes sense to use HeapRoots on a machine with a large amount of main memory. If you still experience very slow performance, or HeapRoots exits with java.lang.OutOfMemoryError exceptions, try adding the -XmxN option to the invocation of HeapRoots, where N is the amount of memory, in megabytes, to give to the JVM running HeapRoots. A rule of thumb given by the HeapRoots author is to specify about 80% of the main memory amount of the machine on which you are running HeapRoots. For example, on a machine with 512 MB of main memory:

    java -Xmx400m -jar Heaproot200.jar heapdump.txt
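The 80% rule of thumb can be sketched as a one-liner (illustrative only; the class and method names are invented):

```java
// XmxRule: a minimal sketch of the rule of thumb above -- give the JVM
// running HeapRoots roughly 80% of the machine's physical memory.
public class XmxRule {
    public static long suggestedXmxMb(long physicalMemoryMb) {
        return physicalMemoryMb * 80 / 100;
    }

    public static void main(String[] args) {
        // On a 512 MB machine this suggests about 409 MB, consistent
        // with the -Xmx400m example above.
        System.out.println(suggestedXmxMb(512));
    }
}
```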

    For more details on HeapRoots memory usage and troubleshooting in general, see the HeapRoots reference listed at the end of this topic.

Using HeapRoots in interactive mode

    The interactive mode of HeapRoots enables the tool to build a tree of heap objects in memory, and enables you to query that tree quickly in various ways without reloading the raw heap file. It does not mean you can interactively explore a running JVM's heap; you need to specify an existing heapdump file you want to process.

    To launch the HeapRoots command prompt and load a heapdump file, enter the HeapRoots command with the -i option:

    java -jar heaproots.jar OutputFile.txt -i

where heaproots.jar is the name of the HeapRoots jar file on your system and OutputFile.txt is the heapdump file that you want to process.

    You will be rewarded with a set of statistics about your heap:

Comments :
  // Version: J2RE 1.3.1 IBM Windows 32 build cn131-20021107
  // EOF: Total: 639922 Classes: 7908 Objects: 453837 ObjectArrays: 62321 PrimitiveArrays: 115856 //
# Objects         : 639,922
# Refs            : 988,571
# Unresolved refs : 33
Heap usage        : 36,790,984
Est. Heap size    : 397,090,048
Extra stats       : unavailable before processing
Memory Usage      : 23/30 mb

    Est. Heap size is the total difference between the lowest and highest address in the dumped JVM's memory. Since an Application Server JVM typically uses multiple heaps in separate locations, this figure is liable to be several times the real amount of Application Server memory usage and should be ignored. Memory Usage denotes the memory used by the HeapRoots tool itself.

    The first command you should execute after launching HeapRoots is the p command. This command has two effects:

• It causes HeapRoots to construct an object tree in memory for further analysis.
• It creates a compressed state file containing the tree information, so that the next time you launch HeapRoots against this dump, the tree will not have to be reconstructed. (HeapRoots looks for the state file first when it is launched.)

    Now you can filter your view of the dump in various ways. To see a basic picture of object trees for each root object, similar to what you would see when dumping the output non-interactively, execute the d command. To see a list of available commands, type help:

> help
Help:
 oa/os/on/ot/od  show info on objects by addr/size/name/total size/descendants
 ts/tc           show info on types by total size/count
 gs/gc           show gaps by size/count
 i               show info on a single object
 p               process
 d               dump from roots/single object
 stats           show stats
 save            save state for quick reload
 clear           clear processed data
 return          repeat last command
 x               exit
Enter: o[a,s,n,t,d], t[s,c], g[s,c], i, p, d, x or help for more info

    Where do you start when trying to interpret a heapdump? A good first step is to view root objects, and their total tree sizes only. To do this, use the ot command.

> ot
Enter name to filter on or '-' for no filterting [-]
Enter combination of types to show, R for Roots, A for Artificial Roots, N for Non-Roots [R,N,A] RA
Enter address range in format 'M','L-U','-U' or 'L-' [0x00000000-0xfffffff8]
Enter range of lines to print in format 'M','L-U','-U' or 'L-' [1-25]

    Addr.      Size  Root-Owner  Subtree Size  Descend.  Name
  ---------------------------------------------------------------------------------
R 0x0094fc60     48  -             28,612,016   507,751  array of com/ibm/ws/classloader/ReloadableClassLoader$CacheEntry
R 0x007ec118    256  -              1,364,912    29,455  class java/util/jar/JarFile
R 0x007e5e18    256  -                806,928    15,020  class com/ibm/jvm/ExtendedSystem
R 0x007e5d18    256  -                209,512     2,595  class java/lang/System
R 0x13d00b18    256  -                105,824       120  class com/ibm/rmi/iiop/CDROutputStream
R 0x01526308     32  -                 66,128       478  java/util/HashMap$Entry
R 0x05228760     32  -                 65,744       466  java/util/HashMap$Entry
R 0x02ee35b0     32  -                 65,360       440  java/util/HashMap$Entry
R 0x007e9018    256  -                 53,504     1,558  class sun/io/CharacterEncoding
A 0x04a47228     72  -                 51,312        56  com/ibm/ws/management/connector/soap/SOAPConnection
R 0x029d91b8     24  -                 51,288     1,791  array of java/util/ResourceBundle
R 0x0131d628     64  -                 49,600     1,079  array of java/lang/Object
R 0x114c2318    256  -                 43,592       659  class java/util/TimeZoneData
A 0x00a83a48     56  -                 40,488     1,026  java/util/HashMap
R 0x14353a18    256  -                 37,368       851  class java/beans/Introspector
R 0x02fd0070     32  -                 30,376         2  java/io/StringWriter
R 0x030ed600     32  -                 28,560         1  java/lang/String
R 0x038604c0 28,528  -                 28,528         0  primitive array
R 0x114c8f18    256  -                 28,120       604  class java/net/URLConnection
R 0x030f8730     40  -                 23,312         5  org/apache/soap/transport/TransportMessage
R 0x038c8020 22,960  -                 22,960         0  primitive array
R 0x03854648 22,960  -                 22,960         0  primitive array
A 0x009236c0     32  -                 22,408       703  java/lang/ref/Finalizer
R 0x02b50dd8     24  -                 17,296         1  java/lang/StringBuffer
R 0x02cfdff8     24  -                 17,256         1  java/lang/StringBuffer
(27051 matches but only displayed up to 25.)
Matched objects    : 27,051 / 639,922
Total Size         : 1,882,320 / 36,790,984
Total Subtree Size : 36,790,984
Total Descendants  : 612,871
Enter: o[a,s,n,t,d], t[s,c], g[s,c], i, p, d, x or help for more info
>

In this example, you entered ot. You accepted the default values except for the types to show, which you modified from R,N,A to R,A. This means that you want to see Root and Artificial Root objects only. (Note: An Artificial Root is not a root object in the strictest sense, since another object has a reference to it, but it is one that HeapRoots nonetheless detects as the holder of an object tree. It may be, for example, that it is only referred to by a backward pointer from one of its own children.) This results in a list of the roots of the 25 largest object trees. In this example you can see that the array of com/ibm/ws/classloader/ReloadableClassLoader$CacheEntry that you saw in the initial dump is indeed far and away the root of the biggest object tree. To dig further into this object tree only, you can use the d command again, but this time only show the descendants of a single root:

Enter: o[a,s,n,t,d], t[s,c], g[s,c], i, p, d, x or help for more info
> d
Enter threshold [1048576]
Enter max depth or -ve for unlimited [64]
Enter 0x to dump from one address or any value for all roots [0x0094fc60] 0x0094fc60

However, you may find that even dumping one object tree down to its leaves presents too much information to be useful. To see quickly which sub-trees are responsible for the greatest amount of memory, limit the depth (as here, to 4):

Enter: o[a,s,n,t,d], t[s,c], g[s,c], i, p, d, x or help for more info
> d
Enter threshold [1048576]
Enter max depth or -ve for unlimited [64] 4
Enter 0x to dump from one address or any value for all roots [0x0094fc60] 0x0094fc60
threshold is 1048576 bytes
max depth is 4 levels
Dumping object at 0x0094fc60

0x0094fc60 [28,612,016/507,751] array of com/ibm/ws/classloader/ReloadableClassLoader$CacheEntry
  6 children smaller than 1,048,576 total size/desc: 336/6
  0x01fd6fe8 [28,611,968/507,750] com/ibm/ws/classloader/ReloadableClassLoader$CacheEntry
    2 children smaller than 1,048,576 total size/desc: 184/1
    0x034a9b10 [28,611,936/507,749] com/ibm/ws/classloader/JarClassLoader
      10 children smaller than 1,048,576 total size/desc: 4,184/61
      0x034a9a10 [28,607,696/507,679] com/ibm/ws/classloader/CompoundClassLoader
        15 children smaller than 1,048,576 total size/desc: 263,632/3,087
        0x014bfdb0 [28,343,952/504,577] com/ibm/ws/classloader/ExtJarClassLoader
          15 children smaller than 1,048,576 total size/desc: 17,280/90
        0x014bfdb0 [28,343,952/504,577] com/ibm/ws/classloader/ExtJarClassLoader
        0x014bfdb0 [28,343,952/504,577] com/ibm/ws/classloader/ExtJarClassLoader
        0x014bfdb0 [28,343,952/504,577] com/ibm/ws/classloader/ExtJarClassLoader

    From this dump, you can see that the majority of memory appears to belong to an instance of class com/ibm/ws/classloader/ExtJarClassLoader at address 0x14bfdb0. The fact that it appears multiple times and at different levels presumably means that there are multiple references to the same object. At this point you can further investigate the heap by running the d command again, this time starting with the ExtJarClassLoader object's address:

Enter: o[a,s,n,t,d], t[s,c], g[s,c], i, p, d, x or help for more info
> d
Enter threshold [1048576]
Enter max depth or -ve for unlimited [4]
Enter 0x to dump from one address or any value for all roots [0x0094fc60] 0x14bfdb0
threshold is 1048576 bytes
max depth is 4 levels
Dumping object at 0x014bfdb0
(Root is 0x0094fc60)

0x014bfdb0 com/ibm/ws/classloader/ExtJarClassLoader
  15 children smaller than 1,048,576 total size/desc: 17,280/90
  0x00911818 com/ibm/ws/classloader/ProtectionClassLoader
    6 children smaller than 1,048,576 total size/desc: 568/6
    0x00913fc0 com/ibm/ws/bootstrap/ExtClassLoader
      14 children smaller than 1,048,576 total size/desc: 618,448/11,130
      0x00ba0e18 java/util/Vector
        0x03d82440 array of java/lang/Object
          5948 children smaller than 1,048,576 total size/desc: 5,321,216/69,051
          0x00913fc0 com/ibm/ws/bootstrap/ExtClassLoader
          0x00911818 com/ibm/ws/classloader/ProtectionClassLoader

There is an array of Objects at 0x03d82440. It is holding on to 5948 small objects -- each smaller than 1 MB -- that together account for over 5 MB of memory, almost one fifth of the 28 MB sub-tree.

What would it take for this object array to get garbage-collected? You can use the i command, or single object dump, to see a list of everything holding references to this object, as well as the references this object has to others. Because of the way that HeapRoots displays output, listing the parent addresses after all the child addresses, you need to list all the references, or at least the last few, to make sure that you see the parents listed. A range of n- means "from line n forward." You knew from a previous command that this object had 5,950 children and one parent, so here you are asking to see the last six children plus the parent:

Enter: o[a,s,n,t,d], t[s,c], g[s,c], i, p, d, x or help for more info
> i
Enter 0x for object to show info on [NONE] 0x03d82440
Enter range of lines to print in format 'M','L-U','-U' or 'L-' [5940-] 5945-
(Displaying from match 5945.)
REFERENCES FROM / CHILDREN of 0x03d82440
    Addr.    Size  Name
------------------------------------------------------------------
0x0090f510   256   class com/ibm/ws/exception/RuntimeWarning
0x0090f610   256   class com/ibm/ws/exception/ConfigurationWarning
0x0090f910   256   class com/ibm/ws/exception/ConfigurationError
0x0090f810   256   class com/ibm/ws/exception/WsException
0x0090f710   256   class com/ibm/ws/exception/WsNestedException
0x0090fa10   256   class com/ibm/ws/runtime/WsServer
REFERENCES TO / PARENTS of 0x03d82440
    Addr.    Size  Name
------------------------------------------------------------------
0x00ba0e18    32   java/util/Vector
Total refs        : 5,951
Parents, Children : 1 , 5,950
Root Type         : N
Root-Owner        : 0x0094fc60
Total size        : 27,843,448
Descendants       : 494,139
Size              : 40,976 / 36,790,984

    To see where the rest of the memory is going, you can dive further into the tree by running the d command against this address, repeating the process until you see another single object (or array) that accounts for a major chunk of memory--in other words, where the amount of the total size for an object drops dramatically beyond that object's own level.

    You may not be able to discover a single object or array that accounts for a large amount of memory. For example, a memory leak may be caused by allocation of the same kind of object, over time, under different parents. This means that the leaking, non-garbage-collected objects are scattered throughout the JVM heap.

To look for this kind of leak, use the ts and tc commands to see which kinds of objects, taken together, account for the most memory and the largest number of objects, respectively:

> ts
Enter name to filter on or '-' for no filterting [-]
Enter range of lines to print in format 'M','L-U','-U' or 'L-' [1-25]
Approximate matches ...
  Count       Size  Name
------------------------------------------------------------------
115,856 14,669,960  primitive array
115,806  3,705,792  java/lang/String
 71,704  2,294,528  java/util/HashMap$Entry
 19,608  1,739,088  array of java/util/HashMap$Entry
 25,508  1,459,288  array of java/lang/Object
 34,286  1,097,152  java/util/Hashtable$Entry
 39,387    945,288  com/ibm/ejs/util/Bucket
 15,835    886,760  java/util/HashMap
 14,156    679,488  com/ibm/etools/emf/ref/impl/RefBaseObjectHelperImpl
  5,694    545,312  array of java/util/Hashtable$Entry
  5,506    220,240  com/ibm/etools/emf/ref/impl/FastOwnedListImpl
  9,137    219,288  java/util/ArrayList
  4,344    208,512  java/util/Hashtable
  6,164    197,248  com/ibm/ejs/util/cache/Bucket
 10,707    171,312  java/lang/Integer
      9    157,728  array of com/ibm/ejs/util/Bucket
  2,278    145,792  org/apache/struts/util/FastHashMap
  2,180    139,520  com/ibm/websphere/pmi/stat/CountStatisticImpl
  5,523    132,552  java/util/jar/Attributes$Name
     33    132,424  array of com/ibm/disthub/impl/util/FastHashtableEntry
  3,043    121,720  com/ibm/etools/emf/ecore/impl/ENamedElementImpl
  1,233    118,368  com/ibm/websphere/pmi/stat/TimeStatisticImpl
  7,232    115,712  java/lang/Object
    852    102,240  com/ibm/etools/emf/ecore/impl/EAttributeImpl
  2,438     84,216  array of java/lang/String
(10864 matches but only displayed up to 25.)
Matched types : 10,864 / 10,864
Usage count   : 639,922 / 639,922
Total size    : 36,790,984 / 36,790,984

    If a memory leak is occurring, you will often see one object type that accounts for the vast majority of the objects, the memory, or both. If that type is a Java or WebSphere base class, you may need help from WebSphere technical support to understand the source of the problem. However, if the leaking objects belong to an application class, the leak is likely caused by application code. In this case, after identifying the class, you can go back through the object tree produced by the d command to find instances, then use the i command to find all of their parents, as discussed above. At that point you will still need to contact your application's developers to have them determine and correct the source of the leak in their code, but this information will give them a big head start.
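As a hypothetical illustration of this scattered pattern (the class and method names below are invented for the example, not taken from WebSphere), consider code where each request allocates a small, ordinary-looking map that a long-lived list quietly retains. No single object is large, yet one type comes to dominate the ts counts:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ScatteredLeak {
    // Long-lived collection: everything added here stays reachable forever.
    static final List<Map<String, String>> sessions = new ArrayList<>();

    // Each call creates a *different* small parent (a new HashMap), so the
    // leaked entries end up scattered all over the heap.
    static void handleRequest(int id) {
        Map<String, String> audit = new HashMap<>();
        audit.put("req-" + id, "payload-" + id);
        sessions.add(audit);   // retained: nothing ever removes it
    }

    // Total number of retained entries across all parents.
    static int leakedEntryCount() {
        int n = 0;
        for (Map<String, String> m : sessions) n += m.size();
        return n;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) handleRequest(i);
        System.out.println("retained audit entries: " + leakedEntryCount());
    }
}
```

In a ts summary, a run like this would show up not as one giant object but as an unusually high count for the entry and map types.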

    An example of using HeapRoots to diagnose a memory leak

    Let's take a look at what HeapRoots tells us in the case of an actual memory leak happening within WebSphere. In this scenario, memory usage sharply and steadily rises when a specific JSP is called. Even after the JSP request returns or is stopped, the memory usage stays at a high level, and is not relieved by garbage collection -- a clear indication of a memory leak.

    Trigger a heapdump while memory usage is high, and then use HeapRoots in the interactive mode to diagnose the problem. Let's show objects starting with the roots to an arbitrary depth of 5:

    Enter: o[a,s,n,t,d], t[s,c], g[s,c], i, p, d, x or help for more info

    > d
    Enter threshold [1048576]
    Enter max depth or -ve for unlimited [5]
    Enter 0x to dump from one address or any value for all roots [NONE]
    threshold is 1048576 bytes
    max depth is 5 levels
    Dumping roots

    0x00565d18 [8,903,424/154,290] class java/lang/System
      4 children smaller than 1,048,576 total size/desc: 67,560/293
      0x007c6218 [8,873,032/154,268] com/ibm/ejs/security/SecurityManager
        0x006941d8 [8,873,008/154,267] java/lang/Thread
          3 children smaller than 1,048,576 total size/desc: 104/0
          0x007c0a48 [8,872,888/154,265] java/util/HashMap
            0x007c0a08 [8,872,832/154,264] array of java/util/HashMap$Entry
              3 children smaller than 1,048,576 total size/desc: 1,256/21
              0x010d7170 [8,871,512/154,239] java/util/HashMap$Entry
                1 children smaller than 1,048,576 total size/desc: 40/1
                0x00693fc0 [7,630,200/118,950] com/ibm/ws/bootstrap/ExtClassLoader
                  14 children smaller than 1,048,576 total size/desc: 1,526,848/30,723
                  0x009440b0 [6,112,512/88,215] java/util/Vector
                    0x0151f648 [6,112,480/88,214] array of java/lang/Object
                      3925 children smaller than 1,048,576 total size/desc: 3,679,208/39,002
                      0x0069f880 [2,024,640/34,702] java/lang/ThreadGroup
                        3 children smaller than 1,048,576 total size/desc: 289,088/5,091
                        0x00edc638 [1,735,496/29,607] array of java/lang/ThreadGroup
                          0x00edc658 [1,735,464/29,606] java/lang/ThreadGroup
                            2 children smaller than 1,048,576 total size/desc: 14,344/275
    0x016283b0 [514,060,576/1,285] array of [LMemEater;
      1285 children smaller than 1,048,576 total size/desc: 514,020,560/0

    You can see that there are only two root objects: the System class, and an array of arrays of something called MemEater objects. (Don't expect class names to be this descriptive!) The total tree size of the MemEater array is many times larger than that of the System class and its descendants. Interestingly, the MemEater array object has no large (greater than 1 MB) descendants; the sheer number of its children accounts for its size.

    Let's look at memory another way: what are the largest objects? Let's use the os option to look at the ten largest objects, not counting their descendants:

    Enter: o[a,s,n,t,d], t[s,c], g[s,c], i, p, d, x or help for more info
    > os
    Enter name to filter on or '-' for no filterting [-]
    Enter combination of types to show, R for Roots, A for Artificial Roots, N for Non-Roots [R,N,A]
    Enter address range in format 'M','L-U','-U' or 'L-' [0x00000000-0xfffffff8]
    Enter range of lines to print in format 'M','L-U','-U' or 'L-' [10-] -10

      Addr.         Size  Root-Owner  Subtree Size  Descend.  Name
    ---------------------------------------------------------------------------------
    N 0x0295d520  400,016  0x016283b0      400,016         0  array of MemEater
    N 0x204b1e00  400,016  0x016283b0      400,016         0  array of MemEater
    N 0x029befb0  400,016  0x016283b0      400,016         0  array of MemEater
    N 0x0e9a20a0  400,016  0x016283b0      400,016         0  array of MemEater
    N 0x0ea03b30  400,016  0x016283b0      400,016         0  array of MemEater
    N 0x0f9a7c40  400,016  0x016283b0      400,016         0  array of MemEater
    N 0x0fa096d0  400,016  0x016283b0      400,016         0  array of MemEater
    N 0x0fe9d590  400,016  0x016283b0      400,016         0  array of MemEater
    N 0x2038ce50  400,016  0x016283b0      400,016         0  array of MemEater
    N 0x0feff020  400,016  0x016283b0      400,016         0  array of MemEater
    (185166 matches but only displayed up to 10.)

    Matched objects    :       185,166 / 185,166
    Total Size         :   524,926,312 / 524,926,312
    Total Subtree Size : 1,674,611,560
    Total Descendants  :    10,982,256

    So the ten largest single objects are arrays of MemEater objects, and you can see from the address in the Root-Owner column that they do indeed belong to the monster array of arrays you saw earlier.

    So now you have a strong hint that an array of arrays of MemEater objects is consuming most of your memory. The next step is one that HeapRoots cannot take for you -- determining what Java code is causing this to happen. In this case, you know that memory grows steeply when you access a certain JSP file. Looking at the JSP, you see the following:

    Very Simple JSP (Bad JSP2) -- a page whose only significant content is a scriptlet invoking the static method MemEater.eatMem().

    If you happen to have the source code for MemEater.java, you can look at the MemEater.eatMem() method:

    public class MemEater {
        static String leakString =
              " .,,; \n"
            + " LjLfji \n"
            + " E;,itGDt \n"
            + " :;tttt... iLDDLjLGiiii \n"
            + " tGjGtifLfLLtit; iiDLjiijffGi \n"
            + " GjtjiGGDLjjjffGDi iGDfftijfDDii \n"
            + " iEEDKEi,ifffjfLLLGtt iiDLftiijLDDi \n"
            + " .; .,tffjtfLLDti iDDDt,,;jGDi \n"
            + " .;Dt ..:tGGjjjfLDEti DDi,;;i;Gi \n"
            + " :L;L;. iDGjjttjDEt KDj,;fj.iK \n"
            + " ,L Gt. tEGfttjfDEj KtLGjtjfLK \n"
            + " ,jff,. t;fftjj;t ELLfGGLftD \n"
            + " ... ttGjtjfGi EWKDEEEDDD \n"
            + " tDGftfDEWKi;;ijDi \n"
            + " ijtDKWGjtLL;,;,jEiK \n"
            + " KDt,,;;jGDj,;;,jEiK \n"
            + " tGfjjti;i#Kt,;;iLjKt \n"
            + " tELttjjDEDfjLLLGfGEt \n"
            + " :L: ttELjtfEfD#WjWt;;tLDt \n"
            + " .jGf: ttELftfKKiWt,;,::tGDt \n"
            + " ji if ttLLtfWf,;:,:,ifE \n"
            + " t;,ij. ttLGGGfjjfjjGt \n"
            + " .,t;. ttEGEEt \n";

        static MemEater[][] ra;   // class (static) variable -- a GC root

        String member;

        public MemEater(String st) {
            this.member = new String(leakString);  // every instance copies the big string
        }

        public static void eatMem() {
            ra = new MemEater[10000][100000];
            for (int i = 0; i < 10000; i++) {
                for (int j = 0; j < 100000; j++) {
                    ra[i][j] = new MemEater(leakString);
                }
            }
        }
    }

    This is a much simplified version of the kind of code that can cause a memory leak. You can see that the method called by the JSP, eatMem(), creates a two-dimensional array containing 10000*100000 MemEater objects, each with its own copy of a long string. And the array is a class (static) variable -- it will stay in memory until the MemEater class itself is garbage collected, if ever!
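For contrast, here is a hedged sketch of how such a leak is commonly fixed (MemEaterFixed and build() are illustrative names, not part of the original sample): keep the large structure in a local variable and return it to the caller, so that no class variable pins it in the heap:

```java
public class MemEaterFixed {
    // Minimal stand-in for the MemEater class in the article.
    static class MemEater {
        final String member;
        MemEater(String s) { member = s; }
    }

    // Unlike the original eatMem(), the array is built in a local variable
    // and handed to the caller, so no static field keeps it reachable.
    static MemEater[][] build(int rows, int cols) {
        MemEater[][] local = new MemEater[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                local[i][j] = new MemEater("x");
            }
        }
        return local;
    }

    public static void main(String[] args) {
        MemEater[][] a = build(10, 10);
        System.out.println("built " + (a.length * a[0].length) + " objects");
        a = null; // the whole structure is now eligible for garbage collection
    }
}
```

Once the caller drops its reference, the entire array and its contents can be collected on the next garbage collection cycle.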

    Summary of HeapRoots

    The HeapRoots tool constructs a snapshot of a JVM's memory space, builds a tree representing the relationship of objects in memory, and gives you various ways of viewing that information to aid you in determining which objects are not being freed. It works on dumps created by the IBM_HEAPDUMP environment variable, which is only recognized by IBM-brand JVMs. Unlike heapdumps created by the standard Hprof-type heapdump, IBM heapdumps store no information about where in Java code objects are being allocated. It is up to you and your application developer to make that connection. The advantage is that it takes very little overhead to enable IBM heapdumps, so it is practical for diagnosing problems in performance-sensitive production environments. A good approach in a production environment would be to try to diagnose a problem using the IBM heapdump approach, and then enable Hprof dumps if you are unable to determine the problem's cause.

    Analyzing IBM dumps interactively with HeapWizard

    The HeapWizard tool provides a convenient GUI for navigating and interpreting IBM heapdumps. Like HeapRoots, HeapWizard is a post-processor application that works upon dumps created by the IBM_HEAPDUMP option of IBM JVMs. It does not work with Hprof heapdumps. HeapWizard is available at ftp://ftp.software.ibm.com/software/websphere/info/tools/heapwizard/HeapWizard.jar.

    HeapWizard.jar is an executable jar file. To start HeapWizard, use the following command:

    java -Xms128M -Xmx512M -jar HeapWizard.jar

    Using HeapWizard

    HeapWizard reads IBM heapdump files created using the IBM_HEAPDUMP environment variable. Once the HeapWizard application starts, select File => Open and browse to an IBM heapdump file. Once you have opened the file, you will see a window containing a log of summary information created by HeapWizard as it constructs an object tree (see Figure 2).

    Figure 2: HeapWizard analysis summary

    Close this window and bring focus back to the main HeapWizard pane. Double-click the HeapDump icon to see the Classes by Size and Objects by Size views (see Figure 3).

    Figure 3: The main panel, tree view

    The Classes by Size tree lists the classes whose instances are responsible for occupying the most memory. The classes are sorted in descending order by the total size, including descendants, of root objects of that class. In the following example, the class java.lang.String comprises 44,352 total objects in memory, which along with their children occupy 5,441,832 bytes of memory. Of those 44,352 objects, the ones that are roots account for a cumulative 2,158,000 bytes.

    Some entries begin with the word class and have a count of 1. These entries represent memory held by the class objects themselves, not by their instances (see Figure 4).
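This distinction can be illustrated with a small, hypothetical class (StaticHolder is an invented name): memory reachable only through a static field belongs to the class object, which is what those count-1 class entries represent:

```java
public class StaticHolder {
    // Reachable only through the class object itself; a heap analyzer
    // attributes this memory to the single count-1 "class" entry.
    static final byte[] CLASS_LEVEL_BUFFER = new byte[1024];

    // Per-instance state, attributed to each instance in the class summary.
    int instanceField;

    public static void main(String[] args) {
        System.out.println("class-held bytes: " + CLASS_LEVEL_BUFFER.length);
    }
}
```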

    Figure 4: The classes by size view

    As you have already observed, a memory leak is often evidenced by a single class whose instance count far exceeds that of the next class in the list.

    The Objects by Size view lists individual objects in order by the amount of memory, including descendants, that they occupy. In the following example, a single instance of java.lang.ref.Finalizer$FinalizerThread occupies, with its children, 1,903,272 bytes. By itself the object uses a mere 72 bytes (see Figure 5).

    Figure 5: The object tree view

    Double-clicking an object in the tree displays a sub-tree of its immediate children, again listed in order by total size. Double-clicking the "fattest" child and grandchild in this tree, for example, produces Figure 6:

    Figure 6: Expanding an object in the heap tree

    In this example, you can see that the FinalizerThread object's largest child is an instance of com.ibm.CORBA.iiop.ClientDelegate. Almost all of the ClientDelegate object's memory is in turn held by an attribute of type com.ibm.CORBA.iiop.ORB. Each of these objects is by itself quite small. You could keep expanding the biggest children until you reach a single object that is itself very large, or that is the immediate parent of a large collection of objects.
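That expand-the-fattest-child strategy can be sketched in plain Java (Node and fattestPath are invented names; real heap analysis would run over a dump rather than live objects): starting from a root, repeatedly descend into the child whose subtree is largest:

```java
import java.util.ArrayList;
import java.util.List;

public class FattestPath {
    // Toy stand-in for a heap object with a retained-size tree.
    static class Node {
        final String name;
        final long selfSize;
        final List<Node> children = new ArrayList<>();
        Node(String name, long selfSize) { this.name = name; this.selfSize = selfSize; }
        long subtreeSize() {
            long total = selfSize;
            for (Node c : children) total += c.subtreeSize();
            return total;
        }
    }

    // Follow the largest child at each level until a leaf is reached;
    // the resulting path points at where the memory actually lives.
    static List<String> fattestPath(Node root) {
        List<String> path = new ArrayList<>();
        Node cur = root;
        while (cur != null) {
            path.add(cur.name);
            Node biggest = null;
            for (Node c : cur.children) {
                if (biggest == null || c.subtreeSize() > biggest.subtreeSize()) {
                    biggest = c;
                }
            }
            cur = biggest;
        }
        return path;
    }
}
```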

    HeapWizard also provides a command line interface. The readme file contained in the HeapWizard jar file lists the invocation syntax and provides some examples.

    An example of heap analysis

    So what should you expect to see in the case of an actual leak? Taking the same scenario as in our earlier discussion of the HeapRoots tool, generating an IBM heapdump, and opening it with HeapWizard, you see the Class tree shown in Figure 7.

    Figure 7: Class tree

    A single root object, the MemEater class, is the parent of by far the greatest amount of memory. This makes sense: remember that its eatMem() method stores the objects it creates in a static attribute -- one that will not be removed unless and until the class itself is removed from memory.

    If you look at the Objects tree, you again see that the MemEater class object is the biggest single holder of memory, although by itself it occupies only 256 bytes. If you expand the MemEater class object, you see only two elements: a String object (not shown) occupying a total of 32 bytes, and an array of 1000-element arrays of MemEater instances -- the two-dimensional static array you saw earlier. This two-dimensional array occupies about 52 megabytes of memory -- almost all of the memory in our bloated MemEater class! You can see that each of its elements occupies about 53 kilobytes of memory (see Figure 8).

    Figure 8: Expanding the largest object in the object by size view

    HeapWizard Summary

    HeapWizard and HeapRoots are alternative tools for analyzing an IBM heapdump to determine the source of a memory leak. Some users may prefer the interactive GUI provided by HeapWizard. Both tools have the advantage of using as input the lightweight, low-overhead IBM heapdump; both share the disadvantages of not working with standard Hprof dumps, and of not associating leaking objects with Java code, since that information is not stored in IBM heapdumps. As with HeapRoots, a reasonable approach is to first diagnose a memory leak by generating an IBM heapdump and analyzing it with HeapWizard. If that is unsuccessful, the next step is to enable and generate Hprof dumps, and then analyze them manually or with HAT, as described above.

    Conclusion

    Memory leaks are among the most difficult Java application problems to resolve, because their symptoms are varied and difficult to reproduce reliably. However, we have outlined a step-by-step approach that will help you discover memory leaks and pinpoint their sources. Our aim is to make fixing memory leaks more of a science and less of an art.

    Part 1 of this paper has concentrated on techniques for investigating memory leaks in WebSphere Application Server for AIX and Windows. Part 2 will focus on the z/OS platform.

    About the authors

    The authors of this paper are:

    Steve Eaton (IBM Austin, TX) Steve Eaton has been part of the WebSphere Application Server technical support team for three years.

    Frederic Mora (IBM Poughkeepsie, NY) Frederic has been involved in development for ten years and in WebSphere testing for three years. He now provides support to WebSphere on zSeries customers.

    Hany Salem (IBM Austin, TX) Hany Salem is the lead serviceability architect for WebSphere Application Server.

    Acknowledgments

    This paper benefited greatly from the help of several people. The authors would like to extend their thanks to:

    Michel Betancourt (IBM Raleigh, NC) Michel graduated from Florida International University two years ago and has been supporting WebSphere Application Server ever since.

    Jim Cunningham (IBM Poughkeepsie, NY) Jim is a performance analyst working on WebSphere for z/OS. He has worked on WebSphere performance for the past five years.

    Phillip Helm (IBM Raleigh, NC) Phil is the team lead for WebSphere Application Server for z/OS Level 2. He has been in support and service for IBM HTTP Server and WebSphere Application Server for z/OS for 4 years.

    Keith Kopycinski (IBM Poughkeepsie, NY) Keith comes from WebSphere Development. He is a technical lead on the WebSphere Level 2 support team and is responsible for identifying and resolving serviceability issues for WebSphere on z/OS.

    Arun Kumar (IBM Austin, TX) Arun has been with the WebSphere Development and Service organization for a number of years. He works on enhancing the Problem Determination and Serviceability characteristics of WebSphere.

    David Screen (IBM Hursley, UK) Dave joined IBM in the Java Technology Centre Service team about 18 months ago. He currently works in the Process Automation (Build) team. He develops HeapRoots in his free time because getting to program some Java is fun.

    Ron Verbruggen (IBM Raleigh, NC) Ron is a WebSphere Senior Software Engineer. He specializes in WebSphere Serviceability and Support.