Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria...

42
Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu

Transcript of Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria...

Page 1: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

Department of Computer Sciences

Cork: Dynamic Memory Leak Detection with Garbage Collection

Maria JumpKathryn S. McKinley

{mjump,mckinley}@cs.utexas.edu

Page 2: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 2

Department of Computer Sciences

Best case : increases GC workload Worst case: systematic heap growth

causes crash after days of execution

A memory leak in a garbage-collected language occurs when a program inadvertently maintains references to objects that it no longer needs, preventing the collector from reclaiming space.

Cork accurately pinpoints systematicheap growth completely online

Page 3: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 3

Department of Computer Sciences

Cork’s Solution

1. Summarize heap growth by calculating type points-from graph Piggybacks on full-heap object scan Summarizes the heap by type

2. Interpret the summarization using differencing

3. Generate debugging reports Candidate Report Slice Report Allocation Site Report

Page 4: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 4

Department of Computer Sciences

3

Type Points-From Graph

Heap

3

4

2

2

1

4

11

1

TPFG

=instance =type

=HashTable =Queue =PQueue =Company =People

Page 5: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 5

Department of Computer Sciences

12

Differencing TPFGs

1

2

1

2

2

3

2

1

TPFGi+1

TPFGi+2

1

4

1

1

2

1

2

33

4

TPFGi

1

1

1

1

1

1

1

1

1

Page 6: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 6

Department of Computer Sciences

1

4

11

4

1

1

1

4 Rank growing nodes

Rank all growing nodes

Designate node as a candidate if

Finding Growth (SRT)

0it

s

S

sprr i

iii

tttt *

1

Rt

threstir

Page 7: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 7

Department of Computer Sciences

0

1

2

3

4

com

pre

ss

jess

raytr

ace db

javac

mpegaudio

mtr

t

jack

pseudojb

b

SP

EC

jbb

antl

r

blo

at

fop

jyth

on

pm

d

ps

xala

n

Reported Candidates

# o

f C

an

did

ate

s

SRT

jess

SP

EC

jbb

fop

Page 8: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 8

Department of Computer Sciences

1

4

11

4

1

1

1

4 Find nodes that are growing

Rank all growing nodes

Designate node as a candidate if

Finding Growth (RRT)

1*)1(

ii tt vfv

Rt

threstir

1

ii tt vv

1*1

Qprriii ttt

Page 9: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 9

Department of Computer Sciences

0

1

2

3

4

5

com

pre

ss

jess

raytr

ace db

javac

mpegaudio

mtr

t

jack

pseudojb

b

SP

EC

jbb

antl

r

blo

at

fop

jyth

on

pm

d

ps

xala

n

Reported Candidates

# o

f C

an

did

ate

s

SRT

RRTje

ss

SP

EC

jbb

fop

Page 10: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 10

Department of Computer Sciences

Finding Data Structure

Type is not enough Growing edges identify the data structure Rank edges

Calculate a slice from each candidate Set of all paths (n0…nn) such that “Sees” beyond non-candidate nodes

1

11

11

4 14

14

01

kk nnr

Page 11: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 11

Department of Computer Sciences

Implementation and Methodology

Jikes RVM with MMTk Benchmarks:

SPECjvm98, DaCapo, SPECjbb2000 Eclipse 3.1.2

Garbage collector Generational with 4MB bounded nursery

For performance, report application only Replay compilation 2nd run methodology

Page 12: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 12

Department of Computer Sciences

Efficiency and Scalability

Node/type data stored in type information block (TIB) adding 5 words 1 word for type volume and edge list pointer

for each of the previous 4 collections 1 word for # of phases (p)

Edge data stored in lists Prune parts of TPFG that are non-growing

Page 13: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 13

Department of Computer Sciences

Space Overhead

jess Eclipse Geomean

# of types

bm+VM 1744 3365 1747

TPFG avg 318 667 334

TPFG max 319 775 346

# of edges

TPFG avg 844 4090 904

TPFG max 861 7585 1142

% pruned 66% 42% 60%

Increased Alloc % 0.094% 0.167% 0.233%

19%

2.7X

0.233%

Page 14: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 14

Department of Computer Sciences

Time OverheadN

orm

aliz

ed T

ota

l T

ime

Heap Size Relative to Minimum

Page 15: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 15

Department of Computer Sciences

Benchmarks on Cork Cork identified:

Systematic heap growth Growing types Growing data structure

Analysis: fop – application design jess – memory leak SPECjbb2000 – memory leak

0

1

2

3

4

5

com

pre

ss

jess

rayt

race d

b

java

c

mpegaudio

mtr

t

jack

pse

udojb

b

SP

EC

jbb

antl

r

blo

at

fop

jyth

on

pm

d ps

xala

n

SP

EC

jbb

fop

jess

Page 16: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 16

Department of Computer Sciences

SPECjbb2000H

eap

Occ

up

ancy

(M

B)

Time (MB of allocation)

Page 17: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 17

Department of Computer Sciences

Slice Diagram: SPECjbb2000

Order

Orderline

Date

NewOrder

Object[]

longBTreeNode

longBTreelongStaticBTree

Types: 1663 (71)Nodes: 318Edges: 904

Candidate

Non-candidate

Page 18: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 18

Department of Computer Sciences

SPECjbb2000H

eap

Occ

up

ancy

(M

B)

Time (MB of allocation)

Page 19: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 19

Department of Computer Sciences

Eclipse 3.1.2 on Cork

IDE Big, complex, and open-source Bug repository details known memory

leaks and how to reproduce them #115789: Memory Leak Comparing 2 source trees or jar files Manually repeat while running Cork

Page 20: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 20

Department of Computer Sciences

Eclipse 115789H

eap

Occ

up

ancy

(M

B)

Time (MB of allocation)

Page 21: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 21

Department of Computer Sciences

Slice Diagram: Eclipse 115789

Path

Folder File

ResourceCompareInput$FilteredBufferedResourceNode

ArrayList

Object[]

ListenerList

RuleBasedCollator

ResourceCompareInput$MyDiffNode

HashMap$HashEntry

HashMap$HashEntry[]

HashMap

HashMap$HashIterator

ResourceCompareInput

ElementTree$ChildIDsCache

ElementTree

Types: 3365 (1773)Nodes: 667Edges: 4090

Candidate

Non-candidate

Page 22: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 22

Department of Computer Sciences

Eclipse 115789H

eap

Occ

up

ancy

(M

B)

Time (MB of allocation)

Page 23: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 23

Department of Computer Sciences

Slice Diagram: Eclipse 115789

Path

Folder File

ResourceCompareInput$FilteredBufferedResourceNode

ArrayList

Object[]

ListenerList

RuleBasedCollator

ResourceCompareInput$MyDiffNode

HashMap$HashEntry

HashMap$HashEntry[]

HashMap

HashMap$HashIterator

ResourceCompareInput

ElementTree$ChildIDsCache

ElementTree

Types: 3365 (1773)Nodes: 667Edges: 4090

Candidate

Non-candidate

Page 24: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 24

Department of Computer Sciences

Eclipse 115789H

eap

Occ

up

ancy

(M

B)

Time (MB of allocation)

Page 25: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 25

Department of Computer Sciences

Cork’s Contributions Very low-overhead technique

<0.5% space overhead ~2% time overhead

Accurately identifies Systematic heap growth Data structure containing the growth

First mechanism for detecting memory leaks in production systems

Page 26: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 26

Department of Computer Sciences

Thank You!

[email protected]://www.cs.utexas.edu/~mjump

Page 27: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 27

Department of Computer Sciences

Second Run Methodology

Replay compilation Profiling runs chooses hot methods Deterministically applies optimizing compiler Mixture of optimized & unoptimized code

Measure 2nd run First run applies replay compilation Turn off compilation Flush compiler objects from heap Measure second run

Page 28: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 28

Department of Computer Sciences

Gartner Report predicts that by 2010,80% of all new software will be in Java or C#

C++ Java

Execution efficiency Developer productivity

Trusts the programmer Protects the programmer

Arbitrary memoryaccess possible

Memory accessonly through objects

Can arbitrarily override types Type safety

Procedural or object-oriented Object-oriented

Operator overloading Meaning of operators immutable

Powerful capabilities of languageFeature-rich, easy-to-use

standard library

Explicit memory controlAutomatic memory

management

[Wikipedia: Comparison of Java and C++, Dec 2006]

Page 29: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 29

Department of Computer Sciences

Panacea for Bugs?

PMD, FindBugs, JLint, … ESC/Java, Bandera, … HPROF, JProbe, HAT, Leakbot, …

Microsoft reports that, even in C#, 75% ofdevelopment time is spent in debugging

Provide a good start Programs still ship with memory

and semantic errors

Page 30: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 30

Department of Computer Sciences

My Research Focus

PROBLEM:Dynamically detect statistical and anomalous per-object behavior5in production systems Low overhead and high accuracy

SOLUTION: Exploit GC and underlying runtime system Focus only on interesting objects Find ways to summarize object properties

Page 31: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 31

Department of Computer Sciences

Outline

Motivation: Programs have bugs Cork: Dynamic Memory Leak Detection for

Garbage-Collected Languages Summarize using a type points-from graph Interpret the summarization Find memory leaks with Cork

How to focus only on interesting objects Heap summarization with focus Conclusions and future work

Page 32: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 32

Department of Computer Sciences

Memory-Related Bugs with GC Lost Pointer : lose pointer to memory

before freeing

Dangling Pointer : de-referencing pointer to memory previously freed

Unnecessary Reference : keeping pointer to memory no longer needed

Objects are live, can not reclaim

Object is live

Reclaims automatically

Page 33: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 33

Department of Computer Sciences

Heap Occupancy GraphH

eap

Occ

up

ancy

(M

B)

Time (MB of allocation)

Page 34: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 34

Department of Computer Sciences

Related Work Offline Techniques:

Static analysis [Heine et al. 03]

Heap differencing [JProbe, DePauw et al. 98, 99, 00]

Allocation and/or usage tracking [OptimizeIt, Rationale, Purify, HAT, HPROF, Shaham et al. 00]

Online Techniques: Leakbot (partially online) [Mitchell et al. 03]

Adaptive usage tracking [Chilimbi et al. 04, Bond et al. 06]

Cork accurately pinpoints systematicheap growth completely online

Page 35: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 35

Department of Computer Sciences

Outline

Motivation: Programs have bugs Cork: Dynamic Memory Leak Detection for

Garbage-Collected Languages Summarize using a type points-from graph Interpret the summarization Find memory leaks with Cork

How to focus only on interesting objects Heap summarization with focus Conclusions and future work

Page 36: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 36

Department of Computer Sciences

What do we know?

Objects have special properties Lifetime, allocation site, last-use site, calling

context, thread usage, etc. Tracking individual object properties is

useful for debugging Can use dynamic object sampling to

gather fine-grained object statisticsat very low overhead [Jump et al. 04]

Page 37: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 37

Department of Computer Sciences

Dynamic Object Sampling

Tag objects with special properties One bit in the header indicates a tag Sample tag encodes object properties

Examples: Allocation site Last-use site Lifetime Which data structure

Page 38: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 38

Department of Computer Sciences

For example, modify a bump-pointer allocator

Dynamic Object Sampling

Sample Tag

Page 39: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 39

Department of Computer Sciences

During Garbage CollectionGather object statistics

Piggyback on object scanning

SAMPLE TAG FOUND!1. Examine tag2. Collect statistics

survivors

Page 40: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 40

Department of Computer Sciences

Focus DOS Overhead Sampling every object

12% space overhead 6-7% time overhead

What is interesting depends application Memory leak detection … candidate types Malformed data structures … nodes Dynamic pretenuring … random sampling

Focus only on 6% of objects 0.8% space overhead 2-3% time overhead

6%

Page 41: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 41

Department of Computer Sciences

DOS in Cork Encode allocation site and lifetime for

candidates <1.3% space overhead, ~4% time overhead Find specific allocation sites causing growth

Future work Encode last-use site in sample tag Requires read/write barrier for candidates Will overhead still be low enough for use in

production systems?

Page 42: Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu.

11-Dec-2006 UMCP 42

Department of Computer Sciences

Conclusions

Developed synergistic two techniques Dynamic object sampling Points-from graphs

See detailed object characteristics in high-level summarizations

Unique ways to debug software in production systems