Tolerating Memory Leaks

75
Tolerating Memory Tolerating Memory Leaks Leaks Michael D. Bond Kathryn S. McKinley

description

Tolerating Memory Leaks. Michael D. Bond Kathryn S. McKinley. Bugs in Deployed Software. Deployed software fails Different environment and inputs  different behaviors Greater complexity & reliance. Bugs in Deployed Software. Deployed software fails - PowerPoint PPT Presentation

Transcript of Tolerating Memory Leaks

Page 1: Tolerating Memory Leaks

Tolerating Memory LeaksTolerating Memory Leaks

Michael D. Bond Kathryn S. McKinley

Page 2: Tolerating Memory Leaks

Bugs in Deployed Bugs in Deployed SoftwareSoftwareDeployed software fails

◦Different environment and inputs different behaviors

Greater complexity & reliance

Page 3: Tolerating Memory Leaks

Bugs in Deployed Bugs in Deployed SoftwareSoftwareDeployed software fails

◦Different environment and inputs different behaviors

Greater complexity & reliance

Memory leaks are a real problemFixing leaks is hard

Page 4: Tolerating Memory Leaks

Memory Leaks in Deployed Memory Leaks in Deployed SystemsSystemsMemory leaks are a real problem

◦Managed languages do not eliminate them

Page 5: Tolerating Memory Leaks

Memory Leaks in Deployed Memory Leaks in Deployed SystemsSystemsMemory leaks are a real problem

◦Managed languages do not eliminate them

ReachableDead

Page 6: Tolerating Memory Leaks

Memory Leaks in Deployed Memory Leaks in Deployed SystemsSystemsMemory leaks are a real problem

◦Managed languages do not eliminate them

ReachableDead

Page 7: Tolerating Memory Leaks

Memory Leaks in Deployed Memory Leaks in Deployed SystemsSystemsMemory leaks are a real problem

◦Managed languages do not eliminate them

Reachable

Dead

Page 8: Tolerating Memory Leaks

Memory Leaks in Deployed Memory Leaks in Deployed SystemsSystemsMemory leaks are a real problem

◦Managed languages do not eliminate them

Reach

able

Dead

Page 9: Tolerating Memory Leaks

Memory Leaks in Deployed Memory Leaks in Deployed SystemsSystemsMemory leaks are a real problem

◦Managed languages do not eliminate them

◦Slow & crash real programs

Dead

Page 10: Tolerating Memory Leaks

Memory Leaks in Deployed Memory Leaks in Deployed SystemsSystemsMemory leaks are a real problem

◦Managed languages do not eliminate them

◦Slow & crash real programs◦Unacceptable for some applications

Page 11: Tolerating Memory Leaks

Memory Leaks in Deployed Memory Leaks in Deployed SystemsSystemsMemory leaks are a real problem

◦Managed languages do not eliminate them

◦Slow & crash real programs◦Unacceptable for some applications

Fixing leaks is hard◦Leaks take time to materialize◦Failure far from cause

Page 12: Tolerating Memory Leaks

ExampleExample

http://www.codeproject.com/KB/showcase/IfOnlyWedUsedANTSProfiler.aspx

Driverless truck◦10,000 lines of C#

Leak: past obstacles remained reachable

No immediate symptoms“This problem was pernicious because it only showed up after 40 minutes to an hour of driving around and collecting obstacles.”

◦Quick “fix”: after 40 minutes, stop & rebootEnvironment sensitive

◦More obstacles in deployment: failed in 28 minutes

Page 13: Tolerating Memory Leaks

ExampleExample

http://www.codeproject.com/KB/showcase/IfOnlyWedUsedANTSProfiler.aspx

Driverless truck◦10,000 lines of C#

Leak: past obstacles remained reachable

No immediate symptoms“This problem was pernicious because it only showed up after 40 minutes to an hour of driving around and collecting obstacles.”

Quick “fix”: restart after 40 minutesEnvironment sensitive

◦More obstacles in deployment◦Failed in 28 minutes

Page 14: Tolerating Memory Leaks

ExampleExample

http://www.codeproject.com/KB/showcase/IfOnlyWedUsedANTSProfiler.aspx

Driverless truck◦10,000 lines of C#

Leak: past obstacles remained reachable

No immediate symptoms“This problem was pernicious because it only showed up after 40 minutes to an hour of driving around and collecting obstacles.”

Quick “fix”: restart after 40 minutesEnvironment sensitive

◦More obstacles in deployment◦Failed in 28 minutes

Page 15: Tolerating Memory Leaks

ExampleExampleDriverless truck

◦10,000 lines of C#Leak: past obstacles remained

reachableNo immediate symptoms

“This problem was pernicious because it only showed up after 40 minutes to an hour of driving around and collecting obstacles.”

Quick “fix”: restart after 40 minutesEnvironment sensitive

◦More obstacles in deployment◦Unresponsive after 28 minutes

http://www.codeproject.com/KB/showcase/IfOnlyWedUsedANTSProfiler.aspx

Page 16: Tolerating Memory Leaks

Uncertainty in Deployed Uncertainty in Deployed SoftwareSoftwareUnknown leaks; unexpected

failuresOnline leak diagnosis helps

◦Too late to help failing systems

Page 17: Tolerating Memory Leaks

Uncertainty in Deployed Uncertainty in Deployed SoftwareSoftwareUnknown leaks; unexpected

failuresOnline leak diagnosis helps

◦Too late to help failing systemsAlso tolerate leaks

Page 18: Tolerating Memory Leaks

Uncertainty in Deployed Uncertainty in Deployed SoftwareSoftwareUnknown leaks; unexpected

failuresOnline leak diagnosis helps

◦Too late to help failing systemsAlso tolerate leaks

Page 19: Tolerating Memory Leaks

Uncertainty in Deployed Uncertainty in Deployed SoftwareSoftwareUnknown leaks; unexpected

failuresOnline leak diagnosis helps

◦Too late to help failing systemsAlso tolerate leaks

Page 20: Tolerating Memory Leaks

Uncertainty in Deployed Uncertainty in Deployed SoftwareSoftwareUnknown leaks; unexpected

failuresOnline leak diagnosis helps

◦Too late to help failing systemsAlso tolerate leaks

Page 21: Tolerating Memory Leaks

Predicting the FuturePredicting the FutureDead objects not used againHighly stale objects likely

leaked

Reachable

Dead

Page 22: Tolerating Memory Leaks

Predicting the FuturePredicting the FutureDead objects not used againHighly stale objects likely leaked

[Chilimbi & Hauswirth ’04]

[Qin et al. ’05]

[Bond & McKinley ’06]

Reachable

Dead

Page 23: Tolerating Memory Leaks

Tolerating Leaks with Tolerating Leaks with MeltMeltMove highly stale objects to

disk◦Much larger than memory◦Time & space proportional to live

memory◦Preserve semantics

Stale objects

In-use objects

Stale objects

Page 24: Tolerating Memory Leaks

Sounds like Paging!Sounds like Paging!

Stale objects

In-use objects

Stale objects

Page 25: Tolerating Memory Leaks

Sounds like Paging!Sounds like Paging!Paging insufficient for managed

languages◦Need object granularity

◦GC’s working set is all reachable objects

Page 26: Tolerating Memory Leaks

Sounds like Paging!Sounds like Paging!Paging insufficient for managed

languages◦Need object granularity

◦GC’s working set is all reachable objects

Bookmarking collection [Hertz et al. ’05]

Page 27: Tolerating Memory Leaks

Challenge #1: How does Challenge #1: How does Melt identify stale objects?Melt identify stale objects?

roots

Page 28: Tolerating Memory Leaks

rootsGC:for all fields a.f a.f |= 0x1;

Challenge #1: How does Challenge #1: How does Melt identify stale objects?Melt identify stale objects?

Page 29: Tolerating Memory Leaks

rootsGC:for all fields a.f a.f |= 0x1;

Challenge #1: How does Challenge #1: How does Melt identify stale objects?Melt identify stale objects?

Page 30: Tolerating Memory Leaks

rootsGC:for all fields a.f a.f |= 0x1;

Application:b = a.f;if (b & 0x1) { b &= ~0x1; a.f = b; [atomic]}

Challenge #1: How does Challenge #1: How does Melt identify stale objects?Melt identify stale objects?

Page 31: Tolerating Memory Leaks

rootsGC:for all fields a.f a.f |= 0x1;

Application:b = a.f;if (b & 0x1) { b &= ~0x1; a.f = b; [atomic]}

Challenge #1: How does Challenge #1: How does Melt identify stale objects?Melt identify stale objects?

Page 32: Tolerating Memory Leaks

rootsGC:for all fields a.f a.f |= 0x1;

Application:b = a.f;if (b & 0x1) { b &= ~0x1; a.f = b; [atomic]}

Challenge #1: How does Challenge #1: How does Melt identify stale objects?Melt identify stale objects?

Page 33: Tolerating Memory Leaks

rootsGC:for all fields a.f a.f |= 0x1;

Application:b = a.f;if (b & 0x1) { b &= ~0x1; a.f = b; [atomic]}

Challenge #1: How does Challenge #1: How does Melt identify stale objects?Melt identify stale objects?

Page 34: Tolerating Memory Leaks

Stale SpaceStale Space

stale space

roots

in-use space

Page 35: Tolerating Memory Leaks

Stale SpaceStale Space

roots

in-use space stale space

Page 36: Tolerating Memory Leaks

Challenge #2Challenge #2

roots

in-use space stale space

How does Melt maintain pointers?

Page 37: Tolerating Memory Leaks

Stub-Scion PairsStub-Scion Pairs

roots

in-use space stale space

scionspace

Page 38: Tolerating Memory Leaks

Stub-Scion PairsStub-Scion Pairs

roots

in-use space stale space

scionspaceB BscionB Bscion

sciontable

Page 39: Tolerating Memory Leaks

Stub-Scion PairsStub-Scion Pairs

roots

in-use space stale space

scionspaceB BscionB Bscion

sciontable

?

Page 40: Tolerating Memory Leaks

Scion-Referenced Object Scion-Referenced Object Becomes StaleBecomes Stale

roots

in-use space stale space

scionspaceB BscionB Bscion

sciontable

Page 41: Tolerating Memory Leaks

Scion-Referenced Object Scion-Referenced Object Becomes StaleBecomes Stale

roots

in-use space stale space

scionspace

sciontable

Page 42: Tolerating Memory Leaks

roots

in-use space stale space

scionspace

sciontable

Challenge #3Challenge #3What if program accesses highly stale object?

Page 43: Tolerating Memory Leaks

Application Accesses Stale Application Accesses Stale ObjectObject

roots

in-use space stale space

scionspace

sciontable

Page 44: Tolerating Memory Leaks

Application Accesses Stale Application Accesses Stale ObjectObject

roots

in-use space stale space

scionspaceC CscionC Cscion

sciontable

Page 45: Tolerating Memory Leaks

Application Accesses Stale Application Accesses Stale ObjectObject

roots

in-use space stale space

scionspaceC CscionC Cscion

sciontable

Page 46: Tolerating Memory Leaks

Application Accesses Stale Application Accesses Stale ObjectObject

roots

in-use space stale space

scionspaceC CscionC Cscion

sciontable

Page 47: Tolerating Memory Leaks

ImplementationImplementationIntegrated into Jikes RVM 2.9.2

◦Works with any tracing collector◦Evaluation uses generational copying

collector

Page 48: Tolerating Memory Leaks

ImplementationImplementationIntegrated into Jikes RVM 2.9.2

◦Works with any tracing collector◦Evaluation uses generational copying

collector

64-bit

120 GB

32-bit

2 GB

Page 49: Tolerating Memory Leaks

ImplementationImplementationIntegrated into Jikes RVM 2.9.2

◦Works with any tracing collector◦Evaluation uses generational copying

collector

64-bit

120 GB

32-bit

2 GB

Page 50: Tolerating Memory Leaks

Performance EvaluationPerformance EvaluationMethodology

◦DaCapo, SPECjbb2000, SPECjvm98◦Dual-core Pentium 4◦Deterministic execution (replay)

Results◦6% overhead (read barriers)◦Stress test: still 6% overhead

Speedups in tight heaps (reduced GC workload)

Page 51: Tolerating Memory Leaks

Tolerating LeaksTolerating Leaks

Page 52: Tolerating Memory Leaks

Tolerating LeaksTolerating Leaks

Page 53: Tolerating Memory Leaks

Tolerating LeaksTolerating Leaks

Page 54: Tolerating Memory Leaks

Tolerating LeaksTolerating Leaks

Page 55: Tolerating Memory Leaks

Tolerating LeaksTolerating Leaks

Page 56: Tolerating Memory Leaks

Tolerating LeaksTolerating Leaks

Page 57: Tolerating Memory Leaks

Tolerating LeaksTolerating Leaks

Page 58: Tolerating Memory Leaks

Tolerating LeaksTolerating Leaks

Page 59: Tolerating Memory Leaks

Eclipse Diff: Reachable Eclipse Diff: Reachable MemoryMemory

Page 60: Tolerating Memory Leaks

Eclipse Diff: Reachable Eclipse Diff: Reachable MemoryMemory

Page 61: Tolerating Memory Leaks
Page 62: Tolerating Memory Leaks

Eclipse Diff: PerformanceEclipse Diff: Performance

Page 63: Tolerating Memory Leaks

Eclipse Diff: PerformanceEclipse Diff: Performance

Page 64: Tolerating Memory Leaks

Eclipse Diff: PerformanceEclipse Diff: Performance

Page 65: Tolerating Memory Leaks
Page 66: Tolerating Memory Leaks

Managed [LeakSurvivor, Tang et al. ’08] [Panacea, Goldstein et al. ’07, Breitgand et al. ’07]◦Don’t guarantee time & space

proportional to live memory

Native [Cyclic memory allocation, Nguyen & Rinard ’07]

[Plug, Novark et al. ’08]◦Different challenges & opportunities◦Less coverage or change semantics

Orthogonal persistence & distributed GC◦Barriers, swizzling, object faulting, stub-scion

pairs

Related WorkRelated Work

Page 67: Tolerating Memory Leaks

ConclusionConclusionFinding bugs before deployment is hard

Page 68: Tolerating Memory Leaks

ConclusionConclusionFinding bugs before deployment

is hard

Online diagnosis helps developers

Help users in meantime

Tolerate leaks with Melt: illusion of fix

Stale objects

Page 69: Tolerating Memory Leaks

ConclusionConclusionFinding bugs before deployment

is hard

Online diagnosis helps developers

Help users in meantime

Tolerate leaks with Melt: illusion of fix◦Time & space proportional to live

memory◦Preserve semantics

Page 70: Tolerating Memory Leaks

ConclusionConclusionFinding bugs before deployment is

hard

Online diagnosis helps developersHelp users in meantime

Tolerate leaks with Melt: illusion of fix◦Time & space proportional to live

memory◦Preserve semantics

Buys developers time to fix leaks

Page 71: Tolerating Memory Leaks

BackupBackup

Page 72: Tolerating Memory Leaks

Triggering MeltTriggering Melt

Heap not nearly full

Heap full ornearly full

Heap full ornearly full

Start

Expected heapfullness

Heap notnearly full

Aftermarking

Unexpectedheap fullness

Back

Page 73: Tolerating Memory Leaks

ConclusionConclusionFinding bugs before deployment

is hard

Online diagnosis helps developers

To help users in meantime, tolerate bugs

Tolerate leaks with Melt: illusion of fix

Stale objects

Page 74: Tolerating Memory Leaks

Related Work: Tolerating Related Work: Tolerating BugsBugsNondeterministic errors

[Atom-Aid] [DieHard] [Grace] [Rx]

◦Memory corruption: perturb layout◦Concurrency bugs: perturb

schedulingGeneral bugs

◦Ignore failing operations [FOC]

◦Need higher level, more proactive approaches

Page 75: Tolerating Memory Leaks

Melt’s GC OverheadMelt’s GC Overhead

Back