Unified Theory of Garbage Collection
A Unified Theory of Garbage Collection
David F. Bacon Perry Cheng V.T. Rajan
IBM Watson Research Center
Seminar Talk by Yoshimi Takano, ETH Zurich
A Unified Theory of Garbage Collection (Bacon et al.) Seminar Talk by Yoshimi Takano 1
“He who loves practice without theory is like the sailor who boards ship without a rudder and compass and never knows where he may cast.”
– Leonardo da Vinci
Summary
- Tracing and reference counting are duals
- All high-performance garbage collectors are hybrids of tracing and reference counting
- This taxonomy can be used
  - To develop a uniform cost model
  - As an algorithm design framework
  - To generate collectors dynamically ...
Outline
Introduction
  Garbage Collection
  Motivation
Duality of Tracing and Reference Counting
  Qualitative Comparison
  Abstract Garbage Collection
  Convergence
Collection as Tracing and Reference Counting
  Single Heap
  Split Heap
Uniform Cost Model
Garbage Collection and Liveness (Recap)
- Automatic storage reclamation of unreachable objects
- Roots:
  - Globals
  - Locals in stack frames
[Figure: object graph with roots; objects reachable from the roots are live, all others are dead]
Picking a Garbage Collector for your VM
- Lots and lots of garbage collector algorithms

State of the Art
- Implement n algorithms
- Measure and compare for m benchmarks
- Use algorithm with best mean performance

Problems
- Limited exploration of design space (“no compass”)
- Static selection can sacrifice performance
[Slide from OOPSLA presentation]
Two Fundamental Garbage Collection Techniques
Tracing [McCarthy, 1960]
- Stop the world
- Trace forward from roots
- Everything touched is live, all else is garbage

Reference Counting [Collins, 1960]
- Each object has a count of incoming pointers
- Adjust count in case of mutations (write barrier)
- When the counter reaches zero, the object is garbage and the count of all its children is decremented
Diametrical Opposites?
                       Tracing   Reference Counting
Collection Style       Batch     Incremental
Pause Times            Long      Short
Real Time?             No        Yes
Delayed Reclamation?   Yes       No
Cost per Mutation      None      High
Collects Cycles?       Yes       No
[Table from paper]
How Different Really?
- Both types have been implemented by the authors
- Very different starting point
- But with optimizations, similarities increase:
  - Both trace roots
  - Both are semi-incremental
  - Both have floating garbage
  - Both have write barriers
- Why?
Abstract Garbage Collection
Definition
An object graph is a triple G = (V, E, R) with
  V  the set of vertices (objects)
  E  the multiset of directed edges (pointers)
  R  the multiset of roots

Multiset notation: [a, b] ⊎ [b] = [a, b, b]

Definition
A function ρ : V → ℕ₀ is a reference count function for an object graph G = (V, E, R) iff

  $\forall x \in V : \rho(x) = \left|[(u, x) \in E : \rho(u) > 0]\right| + \mathbf{1}_{x \in R}$

- “# in-edges from vertices with a non-zero RC (+1)”

[Figure: object graph with vertex set V and roots R]
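The definition can be checked mechanically. The following is a minimal sketch, not taken from the talk: the function name `is_reference_count_function` and the four-object example graph are hypothetical, and the roots are treated as a plain set so that the root term is the indicator written on the slide.

```python
def is_reference_count_function(vertices, edges, roots, rho):
    """Return True iff rho satisfies the defining equation for every vertex.

    edges is a list of (u, x) pairs (a multiset of pointers), roots is a
    collection of vertices, rho maps each vertex to a non-negative integer.
    """
    root_set = set(roots)  # assumption: each root is listed once
    for x in vertices:
        # in-edges whose source has a non-zero reference count
        counted = sum(1 for (u, target) in edges if target == x and rho[u] > 0)
        expected = counted + (1 if x in root_set else 0)
        if rho[x] != expected:
            return False
    return True


# Hypothetical graph: root a -> b, plus an unreachable cycle c <-> d.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("c", "d"), ("d", "c")]
R = ["a"]

cycle_dead = {"a": 1, "b": 1, "c": 0, "d": 0}   # cycle counted as garbage
cycle_kept = {"a": 1, "b": 1, "c": 1, "d": 1}   # cycle keeps itself alive
```

Both `cycle_dead` and `cycle_kept` satisfy the equation for this graph, which illustrates that a graph can admit more than one reference count function.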
Abstract Garbage Collection, cont’d
Definition
A garbage collection algorithm takes an object graph G as input and computes a reference count function ρ for G.
Objects x with ρ(x) = 0 are then reclaimed.

- Common abstract model, where any algorithm computes reference counts ρ
- For a given object graph, there can be many such functions ρ, as will be seen later
Tracing Revisited
Let’s consider a version of tracing that computes reference counts instead of simply setting mark bits:

  initialize-for-tracing():
    W ← R

  scan-by-tracing():
    while W ≠ ∅
      remove w from W
      ρ(w) ← ρ(w) + 1
      if ρ(w) = 1
        for each x ∈ children(w)
          W ← W ⊎ [x]

[Figure: object graph with roots, live and dead objects; before tracing, every reference count is 0]
[Pseudo-code snippets from paper]
[Figure: the example graph after tracing; the live objects carry reference counts 2, 2, 1, 1, 4 and 1, the dead objects remain at 0]
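The counting variant of tracing can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the name `scan_by_tracing`, the edge-list representation and the example graph are assumptions, and the work list W is a plain Python list used as a multiset.

```python
from collections import defaultdict


def scan_by_tracing(edges, roots):
    """Compute reference counts by tracing forward from the roots."""
    children = defaultdict(list)
    for u, x in edges:
        children[u].append(x)

    rho = defaultdict(int)      # every count starts at 0
    work = list(roots)          # initialize-for-tracing: W <- R
    while work:
        w = work.pop()          # remove w from W
        rho[w] += 1
        if rho[w] == 1:         # first visit: enqueue one entry per out-edge
            work.extend(children[w])
    return dict(rho)            # vertices that are absent have count 0


# Hypothetical graph: a is the only root; d and e form an unreachable cycle
# that also points into the live part of the graph.
E = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "b"),
     ("d", "c"), ("d", "e"), ("e", "d")]
counts = scan_by_tracing(E, ["a"])
```

Only the live objects `a`, `b` and `c` receive a count; the edge from the dead object `d` into `c` is never traversed, so it does not contribute to `c`'s count.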
Reference Counting Revisited
Let’s consider a version of RC in which the decrement operations are batched instead of performed immediately:

  mutate(old, new):
    W ← W ⊎ [old]
    ρ(new) ← ρ(new) + 1

  scan-by-counting():
    while W ≠ ∅
      remove w from W
      ρ(w) ← ρ(w) − 1
      if ρ(w) = 0
        for each x ∈ children(w)
          W ← W ⊎ [x]

[Figure: object graph with anti-roots, dead objects and cyclic garbage, annotated with the reference counts before collection]
[Figure: the example graph after scan-by-counting; dead objects reachable from the anti-roots drop to count 0, while the objects marked as cyclic keep a non-zero count]
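A batched reference counter along these lines might look as follows. This is a sketch under stated assumptions: the class `CountingHeap`, the extra `holder` argument (the object whose field is overwritten, needed here only to keep the toy heap's child lists up to date) and the pseudo-object `"ROOT"` are not part of the slide's pseudo-code.

```python
from collections import defaultdict


class CountingHeap:
    def __init__(self):
        self.rho = defaultdict(int)        # reference counts
        self.children = defaultdict(list)  # out-edges per object
        self.work = []                     # W: batched decrements (anti-roots)

    def mutate(self, holder, old, new):
        """Overwrite a pointer in holder: increment now, decrement later."""
        if old is not None:
            self.children[holder].remove(old)
            self.work.append(old)          # W <- W + [old]
        if new is not None:
            self.children[holder].append(new)
            self.rho[new] += 1

    def scan_by_counting(self):
        """Apply the batched decrements and return the reclaimed objects."""
        freed = []
        while self.work:
            w = self.work.pop()
            self.rho[w] -= 1
            if self.rho[w] == 0:
                freed.append(w)
                # a dead object's children lose one reference each
                self.work.extend(self.children.pop(w, []))
        return freed


heap = CountingHeap()
heap.mutate("ROOT", None, "a")   # root -> a -> b
heap.mutate("a", None, "b")
heap.mutate("ROOT", None, "x")   # root -> x <-> y (a cycle)
heap.mutate("x", None, "y")
heap.mutate("y", None, "x")
heap.mutate("ROOT", "a", None)   # drop both root pointers
heap.mutate("ROOT", "x", None)
freed = heap.scan_by_counting()
```

After the scan, `a` and `b` are reclaimed, whereas `x` and `y` keep a count of 1 each: the traversal starts at the anti-roots and visits dead objects, but it cannot reclaim the cycle.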
Not So Different After All. . .
Tracing:

  initialize-for-tracing():
    W ← R

  scan-by-tracing():
    while W ≠ ∅
      remove w from W
      ρ(w) ← ρ(w) + 1
      if ρ(w) = 1
        for each x ∈ children(w)
          W ← W ⊎ [x]

Reference counting:

  mutate(old, new):
    W ← W ⊎ [old]
    ρ(new) ← ρ(new) + 1

  scan-by-counting():
    while W ≠ ∅
      remove w from W
      ρ(w) ← ρ(w) − 1
      if ρ(w) = 0
        for each x ∈ children(w)
          W ← W ⊎ [x]
Duality
                    Tracing           Reference Counting
Starting Point      Roots             Anti-roots
Graph Traversal     Fwd. from roots   Fwd. from anti-roots
Objects Traversed   Live              Dead
Initial RC          Low (zero)        High
RC Reconstruction   Addition          Subtraction

[Figure: the example graph annotated with the reference counts computed by tracing and by reference counting]
[Table from paper]
Collection as Tracing and Reference Counting
Tracing/Counting Hybrids
Fundamentals
- Division of storage:
  - Single heap (= 1)
  - Split heap (= 2)
  - Multi-heap (> 2)
- Assignment of either tracing or reference counting to the different divisions

Trade-offs
- Remaining choices are implementation details and space-time trade-offs
Single Heap Algorithms
- Root references vs. intra-heap references
[Schematic: roots pointing into the heap]
[Schematics from paper]
Algorithm 1: Tracing
- Both root and intra-heap references are traced
[Schematic: root references T (traced), intra-heap references T (traced)]
Algorithm 2: Reference Counting
- Both root and intra-heap references are counted
[Schematic: root references C (counted), intra-heap references C (counted)]
Algo. 3: Deferred Reference Counting [Deutsch/Bobrow, ’76]
- To avoid high mutation overhead, root references are not counted (i.e. the write barrier ignores root pointers)
- Objects with reference count 0 are maintained in a zero count table (ZCT)
- Root references are traced at collection time

  mutate(old, new):
    if ¬is-root-pointer
      . . .

[Schematic: root references T (traced), intra-heap references C (counted), plus the ZCT]
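The three points above can be sketched as a toy collector. This is only an illustration under simplifying assumptions: the class `DeferredRC` and all of its method names are hypothetical, the roots are modelled as a set of directly referenced objects, and decrements are applied immediately rather than batched.

```python
from collections import defaultdict


class DeferredRC:
    def __init__(self):
        self.rho = defaultdict(int)        # counts intra-heap references only
        self.children = defaultdict(list)
        self.roots = set()                 # root referents, never counted
        self.zct = set()                   # objects whose heap count is 0

    def allocate(self, obj):
        self.rho[obj] = 0
        self.zct.add(obj)                  # new objects start in the ZCT

    def write_root(self, old, new):
        # Root pointer writes skip the write barrier entirely.
        self.roots.discard(old)
        if new is not None:
            self.roots.add(new)

    def write_heap(self, holder, old, new):
        # Write barrier for intra-heap pointers.
        if new is not None:
            self.children[holder].append(new)
            self.rho[new] += 1
            self.zct.discard(new)
        if old is not None:
            self.children[holder].remove(old)
            self.rho[old] -= 1
            if self.rho[old] == 0:
                self.zct.add(old)

    def collect(self):
        # "Trace" the roots: ZCT entries referenced by a root survive.
        pending = [obj for obj in self.zct if obj not in self.roots]
        freed = []
        while pending:
            obj = pending.pop()
            self.zct.discard(obj)
            freed.append(obj)
            for child in self.children.pop(obj, []):
                self.rho[child] -= 1
                if self.rho[child] == 0:
                    self.zct.add(child)
                    if child not in self.roots:
                        pending.append(child)
            del self.rho[obj]
        return freed


gc = DeferredRC()
for name in ("a", "b", "c"):
    gc.allocate(name)
gc.write_root(None, "a")         # a is held by a root only
gc.write_heap("a", None, "b")    # a -> b is a counted heap reference
first = gc.collect()             # c has count 0 and no root reference
gc.write_root("a", None)
second = gc.collect()            # now a dies, and b with it
```

Object `a` stays in the ZCT for its whole life because only a root refers to it; it is the root scan at collection time that keeps it alive.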
Single Heap Collector Family
Roots   Heap   Collector
T       T      (Pure) Tracing [McCarthy, 1960]
C       T      “Partial Tracing”
C       C      (Pure) Reference Counting [Collins, 1960]
T       C      Deferred Reference Counting [Deutsch/Bobrow, 1976]

(T = traced, C = counted)
Generational Garbage Collection [Ungar, 1984]
- Heap is split up in 2 regions: a nursery and a mature space
- Collect nursery independently
- Nursery objects pointed to by mature references are maintained in a remembered set (RS) by a write barrier, i.e. reference counted
[Schematic: Roots, Nursery, Mature space and RS, with references labelled T (traced) or C (counted)]
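A nursery collection with a remembered set could be sketched like this. The function names `write_barrier` and `collect_nursery` and the tiny example heap are assumptions made for illustration; the remembered set is a plain set (in effect a one-bit count), and entries are never removed, so overwritten mature pointers would leave floating garbage.

```python
def write_barrier(remembered_set, nursery, holder, new):
    """Record nursery objects that become referenced from the mature space."""
    if new in nursery and holder not in nursery:
        remembered_set.add(new)


def collect_nursery(nursery, children, roots, remembered_set):
    """Trace only inside the nursery, from the roots and the remembered set."""
    work = [obj for obj in list(roots) + list(remembered_set) if obj in nursery]
    live = set()
    while work:
        obj = work.pop()
        if obj in live:
            continue
        live.add(obj)
        # references leaving the nursery are not followed
        work.extend(c for c in children.get(obj, []) if c in nursery)
    return nursery - live        # dead nursery objects


# Hypothetical heap: mature object m points to n1, a root points to n2,
# and nothing points to n3.
nursery = {"n1", "n2", "n3"}
children = {"m": ["n1"], "n1": [], "n2": []}
rs = set()
write_barrier(rs, nursery, "m", "n1")
dead = collect_nursery(nursery, children, ["m", "n2"], rs)
```

The mature object `m` is never scanned during the nursery collection; `n1` survives only because the write barrier recorded it.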
Generational Traced-Root Collector Family
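[Schematics: four split-heap configurations with traced roots; each reference class is labelled T (traced) or C (counted). The labels are listed below in the order in which they appear on the slide.]

Collector                                                Labels
Generational [Ungar, 1984]                               T T T T C
“Redundant Reference Counting”                           T T C C C
Ulterior Reference Counting [Blackburn/McKinley, 2003]   T T T C C
“Inferior Reference Counting”                            T T C T C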
Uniform Cost Model
Enabling Quantitative Comparison
- Characterize object graph and program
  - Number of objects
  - Allocation rate
  - Mutation rate
  - etc.
- Develop space/time cost formulas for each collector
  - Don’t “cheat” by ignoring collector metadata
  - Coefficients c_i for each parameter are left unspecified
  - See paper for details
- Simple example:

  $\text{time-per-collection}_{\text{Tracing}} = c_1 |R| + c_2 |V_{\text{live}}| + c_3 |E_{\text{live}}| + c_4 |V|$
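The example formula translates directly into code. In this sketch the function name and all numeric values are made up for illustration; the talk leaves the coefficients unspecified, so the ones below are arbitrary placeholders and not measurements.

```python
def tracing_time_per_collection(coeffs, num_roots, live_vertices,
                                live_edges, total_vertices):
    """Evaluate c1*|R| + c2*|V_live| + c3*|E_live| + c4*|V|."""
    c1, c2, c3, c4 = coeffs
    return (c1 * num_roots
            + c2 * live_vertices
            + c3 * live_edges
            + c4 * total_vertices)


# Placeholder coefficients (arbitrary units) and a hypothetical heap shape.
cost = tracing_time_per_collection(
    coeffs=(1, 2, 1, 1),
    num_roots=10,
    live_vertices=100,
    live_edges=300,
    total_vertices=1000,
)
```

With these placeholder values the four terms contribute 10, 200, 300 and 1000, so the term proportional to |V| dominates even though only a tenth of the objects are live.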
Conclusion
Benefits
- Deeper theoretical insight into garbage collection
- Design of collectors can be made more methodical
- May help enable dynamic construction of collectors tuned to particular applications
Conclusion, cont’d
Future Work/Outlook
- Refine (unrealistic) assumptions:
  - Fixed-size objects (no fragmentation)
  - No concurrent collectors
  - Application in steady state
- Take allocation cost and locality issues into account
- Measure coefficients for cost parameters
- Theory looks promising, but practical relevance still needs to emerge
“Theory without practice cannot survive and dies as quickly as it lives.”
– Leonardo da Vinci
Sources
- David F. Bacon, Perry Cheng, V.T. Rajan: A Unified Theory of Garbage Collection (paper and presentation at OOPSLA 2004, Vancouver)
- Paul R. Wilson: Uniprocessor Garbage Collection Techniques
Fix-point Formulation
Definition
A function ρ : V → ℕ₀ is a reference count function for an object graph G = (V, E, R) iff

  $\forall x \in V : \rho(x) = \left|[(u, x) \in E : \rho(u) > 0]\right| + \mathbf{1}_{x \in R}$

- “# in-edges from vertices with a non-zero RC + const.”
- $\rho = \underbrace{\lambda x.\ \left|[(u, x) \in E : \rho(u) > 0]\right| + \mathbf{1}_{x \in R}}_{=:\ F(\rho)} \implies \rho \text{ is a fix-point}$

[Figure: object graph with vertex set V and roots R]
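Because F only ever counts more edges when more sources are non-zero, it is monotone, and repeated application from a suitable starting point reaches a fix-point. The sketch below is an illustration with hypothetical names (`apply_F`, `iterate_to_fix_point`) and a made-up graph: starting from all zeros gives the smallest solution, starting from the full in-degree gives the largest.

```python
def apply_F(vertices, edges, roots, rho):
    """One application of F: recount in-edges from non-zero sources."""
    root_set = set(roots)
    return {
        x: sum(1 for (u, t) in edges if t == x and rho[u] > 0)
           + (1 if x in root_set else 0)
        for x in vertices
    }


def iterate_to_fix_point(vertices, edges, roots, start):
    """Apply F repeatedly until rho = F(rho)."""
    rho = dict(start)
    while True:
        nxt = apply_F(vertices, edges, roots, rho)
        if nxt == rho:
            return rho
        rho = nxt


# Hypothetical graph: root a -> b, plus an unreachable cycle c <-> d.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("c", "d"), ("d", "c")]
R = ["a"]

bottom = {x: 0 for x in V}
top = {x: sum(1 for (_, t) in E if t == x) + (1 if x in R else 0) for x in V}

least = iterate_to_fix_point(V, E, R, bottom)
greatest = iterate_to_fix_point(V, E, R, top)
```

The two results differ exactly on the cycle: the least fix-point gives `c` and `d` a count of 0, as tracing would, while the greatest fix-point leaves them at 1, as plain reference counting would.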
Algorithm 4: Partial Tracing
- New (inefficient?) algorithm
- Only root references are counted
- Intra-heap references are traced, starting from the dynamically maintained root set R

  mutate(old, new):
    if is-root-pointer
      R ← R ⊎ [new]
      R ← R − [old]

[Schematic: root references C (counted), intra-heap references T (traced)]
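A toy version of this scheme keeps R as a multiset that the barrier updates on every root write, and traces from it on demand. The class `PartialTracing`, its method names and the example heap are hypothetical; `collections.Counter` stands in for the multiset.

```python
from collections import Counter


class PartialTracing:
    def __init__(self, children):
        self.children = children   # intra-heap pointers, never counted
        self.roots = Counter()     # R as a multiset, kept up to date

    def mutate_root(self, old, new):
        """Barrier for root pointer writes: R <- R + [new], R <- R - [old]."""
        if new is not None:
            self.roots[new] += 1
        if old is not None:
            self.roots[old] -= 1
            if self.roots[old] == 0:
                del self.roots[old]

    def live_objects(self):
        """Trace intra-heap references starting from the maintained root set."""
        work = list(self.roots)    # distinct root referents
        live = set()
        while work:
            obj = work.pop()
            if obj not in live:
                live.add(obj)
                work.extend(self.children.get(obj, []))
        return live


pt = PartialTracing({"a": ["b"], "b": [], "c": []})
pt.mutate_root(None, "a")   # two root slots point to a
pt.mutate_root(None, "a")
pt.mutate_root(None, "c")
pt.mutate_root("c", None)   # c loses its only root reference
pt.mutate_root("a", None)   # a keeps one root reference
live = pt.live_objects()
```

No stack scan is needed at collection time, at the price of a barrier on every root write, which is presumably why the slide flags the algorithm as possibly inefficient.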
Multi-Heap Collectors: Train Algorithm [Hudson/Moss, 1992]
[Schematic: Roots, Train 1 and Train 2, with references labelled T (traced) or C (counted)]
Trade-Offs
- Using semi-spaces with a copying collector (linear space-time trade-off: half the heap space vs. sweep time)
- Traversal (recursive or with pointer reversals?)
- Memory compaction
- Implementation of remembered sets
Cycle Collection

Backup Tracing
- Occasionally perform a tracing collection

Trial Deletion
- Wanted: vertex set S having no live external in-edges
- Candidate vertex x, S := x*
- Subtract internal references and remove vertices with external count > 0

[Figure: the example graph with the reference counts left by reference counting; the garbage cycle still has non-zero counts]
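Trial deletion can be sketched as a small fixed-point computation. This is a simplified illustration, not a production cycle collector: the function name `trial_deletion` and the example heap are hypothetical, x* is assumed to mean everything reachable from the candidate x, and `rho` is assumed to hold ordinary reference counts that include root references.

```python
def trial_deletion(candidate, children, rho):
    """Return the part of candidate's closure that has no external in-edges."""
    # S := x*, everything reachable from the candidate (itself included)
    S, work = set(), [candidate]
    while work:
        v = work.pop()
        if v not in S:
            S.add(v)
            work.extend(children.get(v, []))

    changed = True
    while changed:
        changed = False
        # count references that originate inside S
        internal = {v: 0 for v in S}
        for u in S:
            for v in children.get(u, []):
                if v in S:
                    internal[v] += 1
        # a positive remainder means someone outside S still points to v
        for v in list(S):
            if rho[v] - internal[v] > 0:
                S.remove(v)
                changed = True
    return S


# Hypothetical heap: x <-> y is a garbage cycle, y also points to z,
# and z is additionally referenced by a root.
children = {"x": ["y"], "y": ["x", "z"], "z": []}
rho = {"x": 1, "y": 1, "z": 2}
garbage = trial_deletion("x", children, rho)
```

Here `z` is dropped from S in the first round because one of its two references comes from outside, and the remaining set `{x, y}` is the cycle that plain reference counting could not reclaim.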