1/20
SIGMOD 2011 Turbocharging the DBMS Buffer Pool using an SSD
Microsoft Jim Gray Systems Lab & University of Wisconsin, Madison
Turbocharging the DBMS Buffer Pool using an SSD
Jaeyoung Do, Donghui Zhang, Jignesh M. Patel, David J. DeWitt, Jeffrey F. Naughton, Alan Halverson
2/20
Memory Hierarchy
[Diagram labels: DRAM, HDD (Disk), SSD, "??", Cache]
For over three decades: DRAM and disk (HDD).
Now, a disruptive change: SSD. Fast random I/Os, but expensive.
SSD wisdom:
- Store hot data.
- Store data with random-I/O access.
3/20
Take Home Message
• Use an SSD to extend the Buffer Pool.
• Implemented in Microsoft SQL Server 2008 R2.
• Evaluated with TPC-C, E, and H.
• Up to 9X speedup.
4/20
Prior Art
• [Holloway09] A. L. Holloway. Chapter 4: Extending the Buffer Pool with a Solid State Disk. In Adapting Database Storage for New Hardware, UW-Madison Ph.D. thesis, 2009.
• [KV09] Koltsidas and Viglas. The Case for Flash-Aware Multi-Level Caching. University of Edinburgh Technical Report, 2009.
• [KVSZ10] B. M. Khessib, K. Vaid, S. Sankar, and C. Zhang. Using Solid State Drives as a Mid-Tier Cache in Enterprise Database OLTP Applications. In TPCTC’10.
• [CMB+10] M. Canim, G. A. Mihaila, B. Bhattacharjee, K. A. Ross, and C. A. Lang. SSD Bufferpool Extensions for Database Systems. In VLDB’10.
State of the art: Temperature-Aware Caching (TAC)
5/20
Research Issues
• Page flow
• SSD admission policy
• SSD replacement policy
• Implication on checkpoint
6/20
Implemented Designs
• Temperature-Aware Caching (TAC)
• Dual-Write (DW)
• Lazy-Cleaning (LC)
7/20
Page Flow
BP operations: read, evict, read, modify, evict
[Diagram: three panels (TAC, Dual-Write, Lazy-Cleaning), each showing the buffer pool, the disk, and the SSD BP, with clean pages marked C.]
TAC writes a clean page to the SSD right after reading it from the disk.
8/20
Page Flow
BP operations: read, evict, read, modify, evict
[Diagram: three panels (TAC, Dual-Write, Lazy-Cleaning), each showing the buffer pool, the disk, and the SSD BP, with clean pages marked C.]
DW and LC write a clean page to the SSD upon eviction from the BP.
9/20
Page Flow
BP operations: read, evict, read, modify, evict
[Diagram: three panels (TAC, Dual-Write, Lazy-Cleaning), each showing the buffer pool, the disk, and the SSD BP, with clean pages marked C.]
Read from the SSD: same for all three designs.
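The read and clean-eviction paths described on the last three slides can be sketched as below. This is a minimal illustration, not SQL Server code: `read_page`, `evict_clean`, and the dict-based `bp`, `ssd`, and `disk` stores are hypothetical names, and the admission and replacement policies are ignored (every page is assumed to be admitted to the SSD).

```python
def read_page(design, page_id, bp, ssd, disk):
    """Return (data, source), where source is 'bp', 'ssd', or 'disk'.

    design is one of "TAC", "DW", "LC"; bp, ssd, and disk are dicts that map
    a page id to its contents (hypothetical stand-ins for the real stores).
    """
    if page_id in bp:
        return bp[page_id], "bp"
    if page_id in ssd:
        # Reading from the SSD works the same way in all three designs.
        bp[page_id] = ssd[page_id]
        return bp[page_id], "ssd"
    bp[page_id] = disk[page_id]
    if design == "TAC":
        # TAC copies the clean page to the SSD right after the disk read.
        ssd[page_id] = bp[page_id]
    return bp[page_id], "disk"


def evict_clean(design, page_id, bp, ssd):
    """Evict a clean page from the in-memory buffer pool."""
    data = bp.pop(page_id)
    if design in ("DW", "LC"):
        # DW and LC copy the clean page to the SSD only at eviction time.
        ssd[page_id] = data
```

In this model a read, evict, read sequence ends with an SSD hit in every design; the designs differ only in when the SSD copy is created.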
10/20
Page Flow
BP operations: read, evict, read, modify, evict
[Diagram: three panels (TAC, Dual-Write, Lazy-Cleaning), each showing the buffer pool, the disk, and the SSD BP, with pages marked C, D, and I.]
Upon dirtying a page, TAC does not reclaim the SSD frame.
11/20
Page Flow
BP operations: read, evict, read, modify, evict
[Diagram: three panels (TAC, Dual-Write, Lazy-Cleaning), each showing the buffer pool, the disk, and the SSD BP, with pages marked C, D, and I; the Lazy-Cleaning panel is labeled "Lazy cleaning".]
Upon evicting a dirty page:
- TAC and DW are write-through;
- LC is write-back.
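The dirty-eviction rule can be sketched as follows. This is a simplified model under stated assumptions: write-through is modeled as writing the page to both the disk and the SSD, write-back as writing the SSD only, and the names `evict_dirty`, `lazy_clean`, and `ssd_dirty` are hypothetical, not taken from the implementation.

```python
def evict_dirty(design, page_id, data, disk, ssd, ssd_dirty):
    """Handle eviction of a dirty page from the in-memory buffer pool.

    design is one of "TAC", "DW", "LC". disk and ssd are dicts mapping a
    page id to its contents; ssd_dirty is the set of page ids whose SSD
    copy is newer than the disk copy.
    """
    if design in ("TAC", "DW"):
        # Write-through (modeled): the disk is updated at eviction time,
        # so every page in the SSD stays clean.
        disk[page_id] = data
        ssd[page_id] = data
    else:
        # Write-back: only the SSD is written now; the disk copy is stale
        # until the lazy cleaner gets to this page.
        ssd[page_id] = data
        ssd_dirty.add(page_id)


def lazy_clean(disk, ssd, ssd_dirty, max_pages):
    """Background cleaner for LC: copy up to max_pages dirty SSD pages to disk."""
    cleaned = 0
    while ssd_dirty and cleaned < max_pages:
        page_id = ssd_dirty.pop()
        disk[page_id] = ssd[page_id]
        cleaned += 1
    return cleaned
```

The trade-off in this model is that LC avoids a disk write on the eviction path, at the cost of tracking dirty SSD pages that must be cleaned later.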
12/20
SSD Admission/Replacement Policies
• TAC
  – Admission: if warmer than the coldest SSD page.
  – Replacement: the coldest page.
• DW/LC
  – Admission: if loaded from disk using a random I/O.
  – Replacement: LRU-2.
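The two policy pairs above can be sketched as below. This is an illustration under assumptions: how a page's temperature is computed is not covered on the slide, so it is taken as an input, and the function names (`tac_admit`, `dw_lc_should_admit`, `lru2_victim`) and the dict-based bookkeeping are hypothetical.

```python
import math


def tac_admit(ssd_temps, capacity, page_id, temperature):
    """TAC-style admission and replacement.

    ssd_temps maps each cached page id to its temperature.
    Returns (admitted, victim), where victim is the evicted page id or None.
    """
    if page_id in ssd_temps or len(ssd_temps) < capacity:
        ssd_temps[page_id] = temperature
        return True, None
    coldest = min(ssd_temps, key=ssd_temps.get)
    if temperature <= ssd_temps[coldest]:
        return False, None  # not warmer than the coldest SSD page
    del ssd_temps[coldest]  # the replacement victim is the coldest page
    ssd_temps[page_id] = temperature
    return True, coldest


def dw_lc_should_admit(loaded_with_random_io):
    """DW/LC admission: only pages loaded from disk with a random I/O qualify."""
    return loaded_with_random_io


def lru2_victim(history):
    """LRU-2 replacement: evict the page whose second-most-recent access is oldest.

    history maps each page id to its access timestamps, oldest first. Pages
    with a single access count as infinitely old; ties fall back to the
    most recent access.
    """
    def second_last(page_id):
        times = history[page_id]
        return times[-2] if len(times) >= 2 else -math.inf

    return min(history, key=lambda p: (second_last(p), history[p][-1]))
```

For example, with `history = {"a": [1, 9], "b": [5, 6]}`, LRU-2 picks "a" even though it was touched most recently, because its second-most-recent access is older than that of "b".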
13/20
Implication on Checkpoint
• TAC/DW
  – No change, because every page in the SSD is clean.
• LC
  – Needs change, to handle the dirty pages in the SSD.
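One way to read the LC requirement is that a checkpoint must also bring the dirty SSD pages to disk. The sketch below assumes exactly that; it is one possible handling, not necessarily what was implemented. The name `checkpoint` and its arguments are hypothetical, and a page is assumed never to be dirty in both the memory buffer pool and the SSD at once.

```python
def checkpoint(bp_dirty, ssd, ssd_dirty, disk):
    """Flush dirty pages so that the disk is up to date; return the page count.

    bp_dirty maps page id -> contents for dirty pages in the memory buffer pool.
    ssd maps page id -> contents for pages cached in the SSD.
    ssd_dirty is the set of page ids whose SSD copy is newer than the disk copy;
    it is always empty for TAC and DW, so the second loop is a no-op there.
    """
    flushed = 0
    # Standard step: flush dirty pages from the memory buffer pool.
    for page_id, data in bp_dirty.items():
        disk[page_id] = data
        flushed += 1
    bp_dirty.clear()
    # Extra step for LC (assumed): flush dirty pages held in the SSD.
    for page_id in sorted(ssd_dirty):
        disk[page_id] = ssd[page_id]
        flushed += 1
    ssd_dirty.clear()
    return flushed
```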
14/20
Experimental Setup
Configuration:
Machine     HP ProLiant DL180 G6 Server
Processor   Intel® Xeon® L5520 2.27GHz (dual quad core)
Memory      20GB
Disks       8X SATA 7200RPM 1TB
SSD         140GB Fusion ioDrive 160 SLC
OS          Windows Server 2008 R2
DBMS        SQL Server 2008 R2
15/20
TPC-C
[Figure: speedup relative to noSSD (y-axis, 0 to 10) for TAC, DW, and LC at database sizes of 100GB, 200GB, and 400GB.]
Q: Why is LC so good?
A: Because TPC-C is update intensive. In LC, dirty pages in the SSD are frequently re-referenced: 83% of the SSD references are to dirty SSD pages.
LC is 9X better than noSSD, or 5X better than DW/TAC.
16/20
TPC-E
[Figure: speedup relative to noSSD (y-axis, 0 to 10) for TAC, DW, and LC at database sizes of 100GB, 200GB, and 400GB.]
Q: Why do the three designs have similar speedups?
A: Because TPC-E is read intensive.
Q: Why does the highest speedup occur for the 200GB database?
A: For 400GB, a smaller fraction of the data is cached in the SSD; for 100GB, a larger fraction of the data is cached in the memory BP.
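The answer above can be checked with a rough calculation. The sketch below assumes that all 20GB of memory and all 140GB of the SSD from the experimental setup are available for caching and that the SSD holds only pages that are not already in memory; both are simplifications, and `cache_coverage` is a hypothetical helper.

```python
def cache_coverage(db_gb, memory_gb=20.0, ssd_gb=140.0):
    """Return (fraction of the database that fits in memory,
    additional fraction that fits in the SSD)."""
    in_memory = min(1.0, memory_gb / db_gb)
    in_ssd = min(1.0 - in_memory, ssd_gb / db_gb)
    return in_memory, in_ssd


for size_gb in (100, 200, 400):
    mem, ssd = cache_coverage(size_gb)
    print(f"{size_gb}GB database: memory {mem:.0%}, SSD {ssd:.0%}")
```

Under these assumptions the SSD covers 70% of the 200GB database but only 35% of the 400GB one, while memory alone already covers 20% of the 100GB database, which leaves less for the SSD to add.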
17/20
TPC-H
[Figure: speedup relative to noSSD (y-axis, 0 to 10) for TAC, DW, and LC at database sizes of 45GB and 160GB.]
Q: Why are the speedups smaller than in C or E?
A: Because most I/Os are sequential.
For random I/Os: Fusion is 10X faster.
For sequential I/Os: the 8 disks are 1.4X faster.
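A back-of-the-envelope Amdahl-style estimate shows why a mostly sequential workload gains little. This model is not from the talk: it assumes the workload is I/O bound, that sequential I/Os stay on the disks at unchanged speed, and that only the random-I/O share of the time is sped up by the 10X factor quoted above. The random fractions used below are hypothetical.

```python
def io_speedup(random_fraction, random_gain=10.0):
    """Amdahl-style bound on the overall speedup.

    random_fraction: share of the baseline I/O time spent on random I/Os
                     (a hypothetical workload parameter).
    random_gain:     how much faster the SSD serves random I/Os.
    """
    if not 0.0 <= random_fraction <= 1.0:
        raise ValueError("random_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - random_fraction) + random_fraction / random_gain)


for fraction in (0.9, 0.5, 0.2):
    print(f"random share {fraction:.0%}: speedup bound {io_speedup(fraction):.2f}X")
```

With 90% of the time in random I/Os the bound is about 5.3X, but with 20% it drops to about 1.2X, which is consistent in direction with the smaller TPC-H speedups.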
18/20
Disks are the Bottleneck
[Figure: two panels, 8 Disks and SSD, each plotting read and write I/O bandwidth (MB/s, 0 to 50) against time (hours, 0 to 9.1). Annotations: "capacity reached!" and "about half capacity".]
I/O traffic to the disks and SSD, for TPC-E 200GB.
As long as disks are the bottleneck, using less expensive SSDs may be good enough.
19/20
Long Ramp-up Time
[Figure: TPC-E (200GB) throughput, tpsE (transactions/sec, 0 to 120), against time (hours, 0 to 9.6) for TAC, DW, and LC.]
Q: Why does ramp-up take 10 hours?
A: Because the SSD is being filled slowly, gated by the random read speed of the disks.
If restarts are frequent, restarting from the SSD may reduce ramp-up time.
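The order of magnitude of the ramp-up time can be estimated from the disks' random read rate. The sketch below is a rough lower bound, not a measurement: the 8KB page size is SQL Server's page size, but the per-disk figure of 100 random reads per second is an assumed rule of thumb for 7200RPM drives, and `fill_hours` and `admit_fraction` are hypothetical names.

```python
def fill_hours(ssd_gb, num_disks, iops_per_disk, page_kb=8, admit_fraction=1.0):
    """Hours needed to fill the SSD when it is fed only by random disk reads.

    admit_fraction is the share of disk reads that bring a new page into
    the SSD (1.0 is the most optimistic case).
    """
    pages_to_fill = ssd_gb * 1024 * 1024 / page_kb
    pages_per_second = num_disks * iops_per_disk * admit_fraction
    return pages_to_fill / pages_per_second / 3600


# Assumed: 140GB SSD, 8 disks, about 100 random reads per second per disk.
print(f"{fill_hours(140, 8, 100):.1f} hours")
```

Even with every read admitted, the estimate is already several hours, so a ramp-up on the order of 10 hours is plausible once re-reads and pages that are not admitted are taken into account.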
20/20
Conclusions
• SSD buffer pool extension is a good idea.
  – We observed a 9X speedup (OLTP) and a 3X speedup (DSS).
• The choice of design depends on the update frequency.
  – For update-intensive (TPC-C) workloads: LC wins.
  – For read-intensive (TPC-E or H) workloads: DW/LC/TAC have similar performance.
• Mid-range SSDs may be good enough.
  – With 8 disks, only half of FusionIO’s bandwidth is used.
• Caution: ramp-up time may be long.
  – If restarts are frequent, the DBMS should restart from the SSD.
21/20
Backup Slides
22/20
Architectural Change
[Diagram, before: Buffer Manager (with the BP), I/O Manager, Disk.]
[Diagram, after: Buffer Manager (with the BP), SSD Manager (with the SSD BP), I/O Manager, Disk.]
23/20
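Data Structures
[Diagram, on the SSD: the SSD Buffer Pool.]
[Diagram, in memory: the SSD Buffer Table (entries marked C or D), the SSD Free List, the SSD Hash Table, and the SSD Heap Array, which is split into a Clean Heap and a Dirty Heap.]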
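A minimal in-memory model of these structures might look as follows. Only the structure names come from the slide; the fields, the `SsdManager` class, and its methods are hypothetical, and the heaps are keyed by last-use time for simplicity rather than by the LRU-2 or temperature orderings used in the actual designs. Callers are assumed to call `lookup` before `insert`, so a page is never inserted twice.

```python
import heapq
from dataclasses import dataclass


@dataclass
class SsdFrame:
    page_id: int = -1    # -1 marks a free frame
    dirty: bool = False
    last_use: int = 0


class SsdManager:
    def __init__(self, num_frames):
        # SSD buffer table: one record per frame of the SSD buffer pool.
        self.buffer_table = [SsdFrame() for _ in range(num_frames)]
        # SSD free list: frame numbers that hold no page.
        self.free_list = list(range(num_frames))
        # SSD hash table: page id -> frame number.
        self.hash_table = {}
        # SSD heap array, split into a clean heap and a dirty heap.
        self.clean_heap = []
        self.dirty_heap = []
        self.clock = 0

    def lookup(self, page_id):
        """Return the frame number holding page_id, or None."""
        return self.hash_table.get(page_id)

    def insert(self, page_id, dirty=False):
        """Place a page in a free frame; return its frame number, or None if full."""
        if not self.free_list:
            return None
        self.clock += 1
        frame = self.free_list.pop()
        self.buffer_table[frame] = SsdFrame(page_id, dirty, self.clock)
        self.hash_table[page_id] = frame
        heap = self.dirty_heap if dirty else self.clean_heap
        heapq.heappush(heap, (self.clock, frame))
        return frame

    def evict_clean(self):
        """Free the oldest clean frame; return the evicted page id, or None."""
        if not self.clean_heap:
            return None
        _, frame = heapq.heappop(self.clean_heap)
        page_id = self.buffer_table[frame].page_id
        del self.hash_table[page_id]
        self.buffer_table[frame] = SsdFrame()
        self.free_list.append(frame)
        return page_id
```

Keeping clean and dirty frames in separate heaps lets this model pick a clean replacement victim without scanning past dirty pages, which would first have to be written to disk.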
24/20
Further Issues
• Aggressive filling
• SSD throttle control
• Multi-page I/O request
• Asynchronous I/O handling
• SSD partitioning
• Gather write