project 3 final - CMU 15-721 :: Advanced Database...

Post on 03-May-2018

GARBAGE COLLECTION

SAURABH KADEKODI RAJAT KATEJA TIANYUAN DING

GOALS

PROBLEM STATEMENT

▸ To implement efficient garbage collection in Peloton

GOALS (REVISED)

▸ 75% - implement basic tuple recycling using vacuum

▸ 100% - implement epoch-based co-operative GC

▸ 112.5% - lock-free implementations, GetMemoryFootprint()

▸ 125% - DDL garbage collection (deferred after discussion)

METHODOLOGY

WITHOUT GC

[Figure sequence: a tile group with empty tile slots, shown across successive transactions]

▸ Txn: Add new tuple

▸ Txn: Update tuple 3 - old slot labelled "Cannot mark this slot free yet (MVCC)"; an empty tile slot is labelled "Can use this slot…"

▸ Txn: Update tuple 1 - a second empty tile slot is used

▸ Final frame labelled "Unused Slots"
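
The frames above can be mimicked with a small sketch. It assumes a tile group is a fixed array of slots and that an MVCC update writes the new version into the next empty slot while the old version stays in place. The class and method names are hypothetical and written in Python for brevity; they are not Peloton's C++ API.

```python
class TileGroup:
    """Toy fixed-size tile group; a hypothetical stand-in, not Peloton's class."""

    def __init__(self, capacity):
        self.slots = [None] * capacity   # None marks an empty tile slot
        self.next_empty = 0              # slots are only ever handed out in order
        self.latest = {}                 # tuple key -> slot holding its newest version

    def _allocate(self):
        if self.next_empty == len(self.slots):
            raise MemoryError("tile group full; a real system would add a tile group")
        slot = self.next_empty
        self.next_empty += 1
        return slot

    def insert(self, key, value):
        slot = self._allocate()
        self.slots[slot] = (key, value)
        self.latest[key] = slot
        return slot

    def update(self, key, value):
        # MVCC: the old version may still be visible to a concurrent reader,
        # so it stays where it is and the new version takes a fresh slot.
        return self.insert(key, value)

    def dead_slots(self):
        """Slots holding a superseded version; without GC they are never reused."""
        live = set(self.latest.values())
        return [s for s in range(self.next_empty) if s not in live]


tg = TileGroup(capacity=6)
for key in (1, 2, 3, 4):
    tg.insert(key, "v0")
tg.update(3, "v1")   # takes a new slot; the old version of tuple 3 stays behind
tg.update(1, "v1")   # takes a new slot; the old version of tuple 1 stays behind
unused = tg.dead_slots()
```

In this toy run the tile group is full after two updates even though two of its six slots hold versions nobody will need again, which is the situation the "Unused Slots" frame points at.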

METHODOLOGY

WITH GC

[Figure sequence: a tile group with tuple slots numbered 1-6 and empty tile slots, shown next to two lists labelled "Actually Free" and "Possibly Free"]

▸ Add to possibly_free_list

▸ Move to actually_free_list

▸ Use recycled tuple slot
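
A minimal sketch of the three steps above, in Python for brevity. Only the names possibly_free_list and actually_free_list come from the slides. The rule for moving a slot between the lists is an assumption: a slot is treated as safe once the oldest active transaction started at or after the commit that superseded it, and commit timestamps are assumed to increase monotonically. Everything else (RecyclingTable, write, collect) is hypothetical.

```python
from collections import deque


class RecyclingTable:
    """Toy slot recycler; names other than the two lists are hypothetical."""

    def __init__(self):
        self.slots = []                       # slot index -> (key, value)
        self.latest = {}                      # key -> slot of newest version
        self.possibly_free_list = deque()     # (slot, commit_ts of superseding txn)
        self.actually_free_list = deque()     # slots safe to overwrite

    def _allocate(self):
        if self.actually_free_list:           # "Use recycled tuple slot"
            return self.actually_free_list.popleft()
        self.slots.append(None)
        return len(self.slots) - 1

    def write(self, key, value, commit_ts):
        slot = self._allocate()
        self.slots[slot] = (key, value)
        old = self.latest.get(key)
        if old is not None:                   # "Add to possibly_free_list"
            self.possibly_free_list.append((old, commit_ts))
        self.latest[key] = slot
        return slot

    def collect(self, oldest_active_ts):
        """Move slots that no active transaction can still read
        ("Move to actually_free_list"). Assumes FIFO order matches commit order."""
        moved = 0
        while (self.possibly_free_list
               and self.possibly_free_list[0][1] <= oldest_active_ts):
            slot, _ = self.possibly_free_list.popleft()
            self.actually_free_list.append(slot)
            moved += 1
        return moved


table = RecyclingTable()
for ts, key in enumerate((1, 2, 3, 4), start=1):
    table.write(key, "v0", commit_ts=ts)      # fills slots 0..3
table.write(3, "v1", commit_ts=5)             # new slot; old slot is possibly free
moved = table.collect(oldest_active_ts=5)     # old slot becomes actually free
reused = table.write(1, "v1", commit_ts=6)    # lands in the recycled slot
```

The second update reuses the slot freed by the first one, so the table stops growing as long as collection keeps up with updates.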

METHODOLOGY

MEMORY FOOTPRINT

[Figure: the tile group (tuple slots 1-6 and empty tile slots) shown together with the "Actually Free" and "Possibly Free" lists]

CONTRIBUTIONS

GC MODES

▸ Off (default)

▸ Vacuum

▸ Naïve co-operative

▸ Epoch based co-operative

WORKLOAD

▸ YCSB - 80% updates, 20% reads, 10 terminals, 100000 tuples and 100 sec runtime

▸ YCSB - 50% updates, 50% inserts, 10 terminals, 100000 tuples and 100 sec runtime

CONTRIBUTIONS

VACUUM

▸ Separate thread started as Peloton bootstraps

▸ Periodically moves elements from possibly_free_list to actually_free_list

VACUUM TRADEOFFS

▸ Vacuum thread period vs GC efficiency

▸ # Recycled per vacuum invocation vs GC efficiency
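
The two knobs above can be illustrated with a small background-thread sketch in Python. It is not Peloton's implementation: the class, the oldest_active_ts hook, and the use of a plain lock are assumptions, and the unit of the sleep period is not stated on the slides (seconds are assumed here).

```python
import threading
from collections import deque


class Vacuum:
    """Toy vacuum; period_s and max_recycled mirror the two tradeoff knobs."""

    def __init__(self, period_s, max_recycled, oldest_active_ts):
        self.period_s = period_s
        self.max_recycled = max_recycled
        self.oldest_active_ts = oldest_active_ts   # callable; hypothetical hook
        self.possibly_free_list = deque()          # (slot, commit_ts)
        self.actually_free_list = deque()
        self._lock = threading.Lock()              # plain lock, not lock-free
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def retire(self, slot, commit_ts):
        with self._lock:
            self.possibly_free_list.append((slot, commit_ts))

    def vacuum_once(self):
        """One invocation: move at most max_recycled reclaimable slots."""
        horizon = self.oldest_active_ts()
        moved = 0
        with self._lock:
            while (moved < self.max_recycled and self.possibly_free_list
                   and self.possibly_free_list[0][1] <= horizon):
                slot, _ = self.possibly_free_list.popleft()
                self.actually_free_list.append(slot)
                moved += 1
        return moved

    def _run(self):
        # Event.wait returns False on timeout and True once stop() sets the event.
        while not self._stop.wait(self.period_s):
            self.vacuum_once()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


vac = Vacuum(period_s=0.01, max_recycled=2, oldest_active_ts=lambda: 10)
for slot in range(5):
    vac.retire(slot, commit_ts=slot + 1)
first = vac.vacuum_once()     # bounded by max_recycled
second = vac.vacuum_once()
third = vac.vacuum_once()     # only one slot is left
vac.start()                   # from here on the thread vacuums every period_s
vac.stop()
```

A shorter period or a larger batch reclaims memory sooner but makes the vacuum thread contend with workers more often, which is the tradeoff the bullets name.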

CONTRIBUTIONS

VACUUM TRADEOFFS MEASURED

[Chart: Throughput (rps, axis 0-13000) and Memory Usage (MB, axis 8-11) for vacuum configurations V-<sleep>-<# recycled>: V-1-1000, V-2-1000, V-5-1000, V-5-2000, V-5-3000, V-5-5000]

CONTRIBUTIONS

NAÏVE CO-OPERATIVE (COOPERATIVE)

[Figure sequence; the only surviving labels are "PERFORM GC" and "PER"]

CONTRIBUTIONS

COOPERATIVE TRADEOFFS MEASURED

[Chart: Throughput (rps, axis 0-30000) and Memory Usage (MB, axis 0-9) for naïve configurations N-<# recycled>: N-1, N-10, N-25, N-50, N-100, N-1000, N-5000]
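
The slide text for this mode did not survive extraction, so the following Python sketch is only one plausible reading of the N-<# recycled> label: each worker, on commit, reclaims up to N slots itself instead of leaving the work to a vacuum thread. The class, the logical clock, and the visibility rule are all assumptions.

```python
from collections import deque


class CooperativeGC:
    """Toy cooperative GC: workers reclaim slots themselves, no vacuum thread."""

    def __init__(self, max_recycled):
        self.max_recycled = max_recycled      # the N in N-<# recycled> (assumed meaning)
        self.possibly_free_list = deque()     # (slot, commit_ts)
        self.actually_free_list = deque()
        self.active = {}                      # txn_id -> begin timestamp
        self.clock = 0                        # logical clock, hypothetical

    def begin(self, txn_id):
        self.clock += 1
        self.active[txn_id] = self.clock

    def commit(self, txn_id, retired_slots):
        self.clock += 1
        for slot in retired_slots:
            self.possibly_free_list.append((slot, self.clock))
        del self.active[txn_id]
        return self.perform_gc()              # the committing worker does the GC

    def perform_gc(self):
        # With no active transaction, everything retired so far is reclaimable.
        horizon = min(self.active.values(), default=self.clock)
        moved = 0
        while (moved < self.max_recycled and self.possibly_free_list
               and self.possibly_free_list[0][1] <= horizon):
            slot, _ = self.possibly_free_list.popleft()
            self.actually_free_list.append(slot)
            moved += 1
        return moved


coop = CooperativeGC(max_recycled=2)
coop.begin("reader")                          # long-running transaction
coop.begin("writer")
blocked = coop.commit("writer", [7, 8, 9])    # reader may still see the old versions
after_reader = coop.commit("reader", [])      # nothing active; capped at N = 2
coop.begin("next")
leftover = coop.commit("next", [])            # picks up the remaining slot
```

Under this reading, a small N keeps each commit cheap but lets garbage pile up, while a large N reclaims memory faster at the cost of longer commits.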

CONTRIBUTIONS

EPOCH CO-OPERATIVE (EPOCH)

[Figure sequence: a timeline of epochs labelled "e", with transactions marked "join" and "leave"; the final frame carries the labels "PERFORM GC", "PER" and "FOR"]
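
The join and leave labels match the usual shape of epoch-based reclamation, sketched below in Python. This is a generic illustration, not the authors' design: the rule that an epoch's garbage is reclaimed once no transaction remains in that epoch or any earlier one is an assumption, all names are hypothetical, and the E-<# epochs> parameter from the measurements is not modeled.

```python
from collections import defaultdict


class EpochManager:
    """Toy epoch-based reclamation; all method names are hypothetical."""

    def __init__(self):
        self.current = 0
        self.members = defaultdict(int)       # epoch -> transactions still inside
        self.garbage = defaultdict(list)      # epoch -> slots retired during it
        self.actually_free_list = []

    def join(self):
        self.members[self.current] += 1
        return self.current                   # caller keeps this to leave later

    def leave(self, epoch, retired_slots=()):
        # Retired slots are tagged with the current epoch, not the join epoch:
        # any transaction that joined up to now may have seen the old versions.
        self.garbage[self.current].extend(retired_slots)
        self.members[epoch] -= 1

    def advance(self):
        self.current += 1

    def perform_gc(self):
        """Reclaim garbage of every epoch older than the oldest occupied epoch."""
        occupied = [e for e, n in self.members.items() if n > 0]
        safe_before = min(occupied, default=self.current)
        moved = 0
        for epoch in sorted(e for e in self.garbage if e < safe_before):
            slots = self.garbage.pop(epoch)
            self.actually_free_list.extend(slots)
            moved += len(slots)
        return moved


em = EpochManager()
reader = em.join()                      # joins epoch 0 and stays
writer = em.join()                      # joins epoch 0
em.leave(writer, [2])                   # slot 2 retired in epoch 0
em.advance()                            # epoch 1
writer2 = em.join()
em.leave(writer2, [0])                  # slot 0 retired in epoch 1
em.advance()                            # epoch 2
while_reader_active = em.perform_gc()   # reader still in epoch 0: nothing is safe
em.leave(reader)
after_reader_left = em.perform_gc()     # epochs 0 and 1 are now both reclaimable
```

Compared with tracking every retired slot against transaction timestamps, the check here is per epoch, so a single counter comparison releases a whole batch of slots.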

CONTRIBUTIONS

EPOCH TRADEOFFS MEASURED

[Chart: Throughput (rps, axis 0-7000) and Memory Usage (MB, axis 0-14) for epoch configurations E-<# epochs>: E-1, E-5, E-10, E-INF]

CONTRIBUTIONS

OVERALL GC COMPARISON

[Chart: Throughput (rps, axis 0-30000) and Memory Usage (MB, axis 0-20) for the best configuration of each mode: OFF, VACUUM, NAÏVE, EPOCH]

CONTRIBUTIONS

CACHE MISSES MATTER AND THEY DON'T!

[Chart: Cache Misses (%, axis 0-18) for the best configuration of each mode: OFF, VACUUM, NAÏVE, EPOCH]

PROBLEMS

ENTIRE TABLE TRUNCATIONS

▸ possibly_free_list may end up with an absurdly large number of free slots

▸ Has to be handled as a special case

RECYCLING ACROSS TILE GROUPS

▸ Current recycling is at table granularity - i.e. across tile groups

▸ Different tile group schemas in the same table may become problematic

▸ Tradeoff: recycling granularity vs # tuples recycled

QUESTIONS?

Thank You…