Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @...

53
Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod 2008 paper OLTP through the looking glass, and what we found there

Transcript of Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @...

Page 1: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Authors:

Stavros Harizopoulos @ HP

Daniel J. Abadi @ Yale

Samuel Madden @ MIT

Michael Stonebraker @ MIT

Supervisor: Dr Benjamin Kao

Presenter: For

Sigmod 2008 paper

OLTP through the looking glass, and what we found there

Page 2: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

What is Online Transaction Processing(OLTP)?

• OLTP, refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing (wiki)

• OLTP database was optimized for computer technology of late 1970s, 30 years ago

2B3L

Page 3: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Main feature of OLTP

• Buffer Management to facilitate data transfer between memory and disk

• B-Tree for on-disk data storage• Logging for recovery• Locking to support concurrency • Latching for accessing shared data structure

2B3L

Page 4: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Motivation of the studies• Is the OLTP database optimized nowadays,

given the hardware advancement?• Request from outside the DB community for

alternative DB architecture

Page 5: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Motivation of studiesHardware advancement

30 Years ago Nowadays

DB cost In millions Few thousands

Storage size DB size >> Memory

Memory >DB size

Processing time for most of the transactions

\ In microseconds

Page 6: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Motivation of studies Request from outside the DB community

• “database-like” storage system proposal from Operating System and networking conference – varying forms of

• concurrency, • consistency, • reliability, • replication, • queryability

Page 7: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Trends in OLTP – (1/5) Cluster Computing

Page 8: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Trend in OLTP-(2/5) Memory resident Databases

Buffer

Page 9: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Trend in OLTP-(2/5) Memory resident Databases

• Data doesn’t grow as fast as memory size

Page 10: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Trend in OLTP-(3/5)Single Threading in OLTP System

Page 11: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

• A step backward from multithread to single thread ?

• Why multithreading?– Prevent idle of CPU while waiting data from disk– Prevent long-running transactions from blocking

short transaction

• Not valid for memory resident DB– No disk wait– Long-running transactions run in ware house

Trend in OLTP-(3/5)Single Threading in OLTP System

Page 12: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

• What about multi processors ?– Achieve shared-nothing processor by virtual

machine

• What about network disk?– Feasible to partition transaction to run in “single-

site”

Trend in OLTP-(3/5)Single Threading in OLTP System

Page 13: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Trend in OLTP-(4/5) High Availability vs. Logging

• 24x7 service achieved by using multiple set of hardware.

• Perform recovery by copying missing states from other database replicas.

• Log for recovery can be avoided

Page 14: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Trend in OLTP-(4/5) High Availability vs. Logging

Production Server Standby Server

Recovery from24x7 service

Page 15: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Trend in OLTP-(5/5)Transaction Variants

• Why transaction variants?– 2 phase commit protocol harm large scale

distributed DB system performance– 2 phase commit involves commit-request and

commit phase which involves all server to participate.

Page 16: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Trend in OLTP-(5/5)Transaction Variants

• Trade consistency with performance• Eventual consistency, all writes propagate

among the database servers.

Page 17: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Trend in OLTP-(5/5)Transaction Variants

• Eventual consistency example: AmazonSx, Sy, Sz are different servers

D1, D2 add item to cartD3 delete an item from cartD4 add another item to cart

Page 18: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Research groups interested in new DB architecture

• Amazon, • HP, • NYU, • MIT

Page 19: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Trend in OLTP - Summary

Eventual consistency

Page 20: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

DBMS modification by OLTP trends

• (1) memory resident DB can get rid of buffer mgt

Buffer

Page 21: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

DBMS modification by OLTP trends

• (2) single thread can avoid locking and latching

Page 22: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

DBMS modification by OLTP trends

• (3) Cluster computing avoid locking, instead of single processor multithread, each processor is responsible for each own thread

Page 23: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

DBMS modification by OLTP trends

• (4) high availability can avoid using log for recovery purpose

Production Server Standby Server

Recovery from

Page 24: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

DBMS modification by OLTP trends

• (5) transaction less avoid book keeping, i.e. logging

Page 25: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Case Study of DBMS, Shore

• Shore was developed at the University of Wisconsin in the early 1990’s

• It was designed to be a typed, persistent object system borrowing from both file system and object-oriented database technologies

http://www.cs.wisc.edu/shore

Page 26: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Basic components of Shore

2B3L

Page 27: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Removing Shore’s components

Page 28: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Remove Shore’s logging(1)

• To increase log buffer size so that it will not flush to disk

Page 29: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Remove Shore’s locking(2)

• To configure Lock Manager always granting lock

• To remove codes that handle ungranted lock request

Page 30: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Remove Shore’s latching(3)

• To add if-else statement to avoid request for latch

• To replace the original latching intensive B-tree with a latch free B-tree

Page 31: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Remove Shore’s Buffer Mgt (4)

• To replace Buffer Mgt by directly invoking malloc for memory allocation

Page 32: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Shore after components removal

Page 33: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Bench mark for comparison (TPC-C)

• TPC-C is industry standard used to measure ecommerce performance

• TPC-C is designed to represent any industry that must manage, sell, or distribute a product or service

• Vendors includes Microsoft, Oracle, IBM, Sybase, Sun, HP, DELL etc.

http://www.tpc.org/tpcc/default.asp

Page 34: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Bench mark for comparison

• 1 warehouse(~100M) serves 10 districts, and each district serve 3000 customers.

Page 35: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Bench mark for comparison

• 5 concurrent transactions in TPC-C– New Order Transaction– Payment – Deliver Order– Check status of Order– Monitor Stock Level of warehouse

Page 36: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Experiment setup and measurement

• Single-core Pentium 4, 3.2GHz, with 1MB L2 cache, hyper threading disabled, 1GB RAM, running Linux 2.6.

• 40,000 transactions of types New Order Transaction and Payment are run

• Results measured in – 1) Throughput (Time/ # Transaction completed) – 2) Instruction count

Page 37: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Results after removing the components (in throughput)

• Memory resident Shore DB provided 640 transactions per second.

• Stripped-down Shore DB provided 12,700 transactions per second.

• The Stripped-down Shore DB gave a 20 times improvement in throughput

Page 38: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Results after removing the components (in # instruction)

• Instruction of useful work is only <2% of a memory resident DB

Page 39: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Effect of removing different components for payment (1/6)

Page 40: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Effect of removing different components for payment (2/6)

Page 41: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Effect of removing different components for payment (3/6)

Page 42: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Effect of removing different components for payment (4/6)

Page 43: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Effect of removing different components for payment (5/6)

Page 44: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Effect of removing different components for payment (6/6)

Page 45: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Effect of removing different components for both order

Page 46: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Instruction and cycle comparsion

Page 47: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Implication for future OLTP engineConcurrency Control

• Single threaded transaction allow concurrency control to be turned off

• But many DBMS applications are not sufficiently well behave for Single threaded transaction.

• Dynamic locking was experimentally the best concurrency control with disk.

• What concurrency control protocol is best?

Page 48: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Implication for future OLTP engineMulti-core support

• Virtualization, each core is a single-threaded machine

• Intra-query parallelism, each processor running part of a single query

Page 49: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Implication for future OLTP engineReplication Management

• Active-passive replication scheme with log• Replica may not be consistent with the primary

unless on two-phase commit protocol• Log is required

• Active-active replication scheme with transactions

• Two-phase commit introduce large latency for distributed replication

Page 50: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Implication for future OLTP engineWeak Consistency

• Eventual Consistency - Data is not immediately propagated across all nodes.

• To study the effect of different degree of workload to the consistency level of data

Page 51: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Implication for future OLTP engine Cache conscious B-Trees

• Cache misses in the B-tree code may well be the new bottleneck for the stripped down system.

Page 52: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Conclusion

• Most significant overhead contributors are buffer management and locking operation, followed by logging and latching.

• A fully stripped down system’s performance is orders of magnitude better than an unmodified system.

Page 53: Authors: Stavros Harizopoulos @ HP Daniel J. Abadi @ Yale Samuel Madden @ MIT Michael Stonebraker @ MIT Supervisor: Dr Benjamin Kao Presenter: For Sigmod.

Thank you !