Advanced Topics in Databases Hadi Amiri Abolfazl AleAhmad Summer 1385.
Advanced Topics in Databases
Hadi Amiri
Abolfazl AleAhmad
Summer 1385
Content
- Why Main Memory Databases?
- Performance Evaluation of MMDB:
  - PERST DB
  - TPC Performance Benchmark
  - Implementation
  - Experimental Results
- MARS MMDBMS
- Conclusion
- References
About Main Memory Databases
- Main Memory Databases (MMDB): the idea of having entire databases reside in main memory.
- In a Main Memory DBMS (MMDBMS) there is no central role for I/O management.
- An MMDB uses physical memory as primary storage and a disk subsystem for backup.
- Motivations:
  - Requirement of short access/response time
  - Telecommunication applications
  - Applications handling high data traffic, e.g. routers
  - Real-time applications
About Main Memory Databases (Cont.)
- Traditional database systems rely on the disk subsystem to retrieve and update data, and use an offline storage device such as magnetic tape for backup.
- Advantages of MMDBs:
  - Memory-resident databases can achieve significant performance improvements over conventional database systems by eliminating the need for I/O to perform database applications.
  - Processing time and throughput rates should improve due to the elimination of I/O overhead.
Our Project: Performance Evaluation of MMDB: PERST DB
- A highly efficient main memory database system with real-time capabilities and a convenient C# interface
- Optimized for applications with a read-dominated access pattern
- Supports transactions, online backup and automatic recovery after a system crash
- Can also be used with databases whose size exceeds the physical memory of the system
Our Project: Performance Evaluation of MMDB: PERST (Cont.)
- Database tables are constructed using information about application classes
- A SQL-like query language is used to specify queries
- Table rows are considered object instances, and the table is the class of these objects
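The mapping described above (a table is a class, a row is an instance) can be illustrated with a minimal Python sketch. This is not the PERST API: the `Account` class, the `Table` helper and the predicate-based `select` are hypothetical names chosen only for illustration, and a Python predicate stands in for the SQL-like query language.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical application class; each instance plays the role of a row.
    account_id: int
    branch_id: int
    balance: float


class Table:
    """In-memory 'table' whose rows are instances of a single class."""

    def __init__(self, row_class):
        self.row_class = row_class
        self.rows = []

    def insert(self, **fields):
        # Building the instance is what "inserting a row" means here.
        row = self.row_class(**fields)
        self.rows.append(row)
        return row

    def select(self, predicate):
        # Stand-in for a WHERE clause: a Python predicate evaluated per row.
        return [row for row in self.rows if predicate(row)]


accounts = Table(Account)
accounts.insert(account_id=1, branch_id=10, balance=250.0)
accounts.insert(account_id=2, branch_id=20, balance=75.0)
accounts.insert(account_id=3, branch_id=10, balance=0.0)

rich = accounts.select(lambda a: a.balance > 50)
print([a.account_id for a in rich])
```

Because the schema is derived from the class, adding a field to `Account` is all it takes to add a column in this sketch.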
Our Project: Performance Evaluation of MMDB: TPC Performance Benchmark
- Defined in 1989…
- The benchmark simulates a typical bank application by a single type of transaction that models cash withdrawal and deposit at a bank teller
- The transaction updates several relations such as the bank balance and the customer’s balance
- The benchmark also incorporates communication with terminals to model the end-to-end performance of the system realistically
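The single transaction type described above can be sketched as follows. This is an illustrative approximation, not the benchmark specification: the two-table schema (`branch`, `account`), the column names and the `teller_transaction` function are assumptions, and Python's standard `sqlite3` module with an in-memory database is used only as a convenient stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch  (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE account (id INTEGER PRIMARY KEY, branch_id INTEGER NOT NULL,
                          balance INTEGER NOT NULL);
    INSERT INTO branch  VALUES (1, 1000);
    INSERT INTO account VALUES (7, 1, 1000);
""")


def teller_transaction(conn, account_id, delta):
    """Apply a deposit (delta > 0) or withdrawal (delta < 0) atomically."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT branch_id, balance FROM account WHERE id = ?",
            (account_id,),
        ).fetchone()
        if row is None:
            raise KeyError(f"no such account: {account_id}")
        branch_id, balance = row
        if balance + delta < 0:
            raise ValueError("insufficient funds")
        # Both relations are updated inside the same transaction.
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                     (delta, account_id))
        conn.execute("UPDATE branch SET balance = balance + ? WHERE id = ?",
                     (delta, branch_id))


teller_transaction(conn, 7, -300)  # withdrawal
teller_transaction(conn, 7, 50)    # deposit
print(conn.execute("SELECT balance FROM account WHERE id = 7").fetchone()[0])
print(conn.execute("SELECT balance FROM branch WHERE id = 1").fetchone()[0])
```

Keeping both updates in one transaction is the point of the sketch: the customer's balance and the bank's balance can never drift apart.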
Our Project: Performance Evaluation of MMDB: Implementation
- We implemented another module that works in conjunction with the PERST modules. This module:
  - first creates both databases,
  - then allows the user to fill them,
  - and then runs some “manipulations” on them
- Visual C# .NET interface in conjunction with PERST
- Evaluation queries include select, insert, update and delete for a variable number of table records and updates, both for the main memory database and for MS-SQL
Our Project: Performance Evaluation of MMDB: Implementation (Cont.)
- The recorded time is measured from the beginning of an “existence check” (which checks whether the randomly generated source/destination account/branch exists) to the end of the query execution
- The selective query is designed so that it selects half of the branches
- Performed on a Pentium 2.8 GHz system with 512 MB of RAM, running Windows XP
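The measurement rule above (start the clock at the existence check, stop it when the query finishes) can be sketched as below. This is a hedged illustration, not the authors' C# module: the dictionary-based account store, the `timed_transfer` and `select_half_of_branches` names, and the choice to draw ids from twice the valid range (so that some checks fail) are all assumptions.

```python
import random
import time


def timed_transfer(accounts, rng):
    """Time one transfer, starting the clock at the existence check."""
    n = len(accounts)
    # Random ids are drawn from a wider range so some of them do not exist.
    src = rng.randrange(2 * n)
    dst = rng.randrange(2 * n)
    start = time.perf_counter()
    found = src in accounts and dst in accounts  # the "existence check"
    if found:
        accounts[src] -= 1
        accounts[dst] += 1
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return found, elapsed_ms


def select_half_of_branches(branches):
    """Selective query: keep the branches whose id is in the lower half."""
    cutoff = len(branches) // 2
    return [b for b in branches if b < cutoff]


rng = random.Random(42)
accounts = {i: 100 for i in range(1000)}
timings = [timed_transfer(accounts, rng) for _ in range(10_000)]
total_ms = sum(ms for _, ms in timings)
print(f"{len(timings)} transfers in {total_ms:.2f} ms")
```

Timing the existence check together with the update means failed lookups are also counted, which matches the description of the recorded time.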
Experimental Results
- We produced the following result diagrams:
  - Insert time measurements
  - Update time measurements
  - Query time measurements
Experimental Results: Insert time measurements
- PERST insertion time is about 10,000 times shorter than MS-SQL’s
[Figure: Insert Time. x-axis: # of records (0 to 120000); y-axis: mSec (logarithmic ticks, 1 to 1000000); series: MS-SQL, PERST]
Experimental Results: Update time measurements
- PERST has an approximately steady response time
- MS-SQL also reaches a final steady state, but its execution time is much longer than PERST’s
[Figure: Update Time. x-axis: # of records (0 to 120000); y-axis: mSec (log, 1 to 100000); series as labelled in the legend: FastDB, MS-SQL]
Experimental Results: Query time measurements
- PERST has a steadier response time than MS-SQL, and the most important difference is the noticeable ratio between their execution times
[Figure: Query Time. x-axis: # of records (0 to 120000); y-axis: mSec (log, 1 to 1000); series: PERST, MS-SQL]
PerstDB Performance
[Figure: PerstDB Performance. x-axis: Scale (1, 2, 3, 4, 8, 15, 20, 25, 30, 35, 40); y-axis: Transactions per Second (0 to 50000)]
FastDB Performance
[Figure: FastDB Performance. x-axis: Scale (1, 2, 3, 4, 8, 15, 20, 25, 30, 35, 40); y-axis: Transactions per Second (0 to 8000)]
Comparison of processor utilization in FastDB and PERST
[Figures: two plots of the \Processor(_Total)\% Processor Time counter (y-axis 0 to 120) over time, one spanning 23:27:40 to 23:33:46 and one spanning 10:41:41 to 10:50:41; panel labels: PERST Scale60, FastDB Scale30]
Conclusion
- The results of our evaluation tests match completely what we expected.
- PERST is an object-oriented database optimized for applications with a read-dominated access pattern.
- Our performance results indicate that eliminating the overhead of transferring database files to the buffer pool and back makes PERST work significantly faster than a traditional database.
Conclusion (Cont.)
- MMDB fits the requirements of today’s real-time applications more efficiently.
- Main memory has a short response time, and its decreasing cost makes it affordable and suitable for real-time applications.
- A Main Memory DBMS manages in-memory data and ensures ACID properties on it.
MARS MMDB System: Case Study
MARS MMDB System
- MARS (MAin memory Recoverable database with Stable log) is a main memory database system.
- It assumes that the entire database resides in a volatile main memory (MM).
- Its backup copy is kept in an archive memory (AM) residing on secondary storage.
MARS MMDB System
- At least 20,000 transactions are executed
- The size of a transaction is determined by the number of operations it executes, which is distributed uniformly between 5 and 30
- Concurrency control: two-phase locking on the memory pages
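The workload parameters above can be sketched in a few lines. This is an assumption-laden illustration, not the MARS simulator: the page count `NUM_PAGES`, exclusive-only locks and the "abort on conflict instead of waiting" policy are choices made here for brevity, and `LockTable`, `make_transaction` and `run_transaction` are hypothetical names.

```python
import random

NUM_PAGES = 100  # assumed number of memory pages; not given in the slides


def make_transaction(rng):
    """Return the pages touched by one transaction of 5 to 30 operations."""
    size = rng.randint(5, 30)  # uniform, both ends inclusive
    return [rng.randrange(NUM_PAGES) for _ in range(size)]


class LockTable:
    """Exclusive page locks, at most one owning transaction per page."""

    def __init__(self):
        self.owner = {}  # page -> id of the transaction holding the lock

    def acquire(self, txn_id, page):
        holder = self.owner.setdefault(page, txn_id)
        return holder == txn_id  # False means another transaction holds it

    def release_all(self, txn_id):
        for page in [p for p, t in self.owner.items() if t == txn_id]:
            del self.owner[page]


def run_transaction(txn_id, pages, locks):
    """Two-phase locking: take every lock first, release only at the end."""
    for page in pages:  # growing phase
        if not locks.acquire(txn_id, page):
            locks.release_all(txn_id)  # conflict: abort, keep nothing locked
            return False
    # ... the operations on the locked pages would run here ...
    locks.release_all(txn_id)  # shrinking phase, all at once at commit
    return True


rng = random.Random(0)
workload = [make_transaction(rng) for _ in range(20_000)]
locks = LockTable()
committed = sum(run_transaction(i, pages, locks)
                for i, pages in enumerate(workload))
print(committed, "of", len(workload), "transactions committed")
```

Run one transaction at a time as here, no conflicts arise; the abort path matters once transactions interleave.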
Immediate and Deferred Update
- With immediate update, modified pages may be propagated to the primary database at any time. Hence, we need to make failure recovery possible.
- With deferred update, modified data is kept in the log until a successful completion of the transaction performing the updates is assured. Since no dirty pages are propagated to disk, only after images (AFIMs) need to be logged for REDO purposes.
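The deferred-update rule can be made concrete with a small sketch. It is a simplified model under stated assumptions, not the MARS implementation: the log is a Python list standing in for the stable log, records are tuples, and `DeferredUpdateStore` and its methods are hypothetical names.

```python
class DeferredUpdateStore:
    """Deferred update: writes reach the database only after commit."""

    def __init__(self):
        self.database = {}  # primary copy; never holds uncommitted data
        self.log = []       # stand-in for the stable log (after images only)

    def write(self, txn, key, value):
        # Only the after image (AFIM) is logged; no before image is needed
        # because the database itself is not touched yet.
        self.log.append(("write", txn, key, value))

    def commit(self, txn):
        self.log.append(("commit", txn))
        for record in self.log:
            if record[0] == "write" and record[1] == txn:
                _, _, key, value = record
                self.database[key] = value

    def abort(self, txn):
        # Nothing to undo: the transaction never modified the database.
        self.log.append(("abort", txn))


def recover(log):
    """Rebuild the database after a crash by REDOing committed writes."""
    committed = {record[1] for record in log if record[0] == "commit"}
    database = {}
    for record in log:
        if record[0] == "write" and record[1] in committed:
            _, _, key, value = record
            database[key] = value
    return database


store = DeferredUpdateStore()
store.write("T1", "x", 1)
store.write("T2", "x", 2)  # T2 never commits
store.write("T2", "y", 5)
store.commit("T1")
print(store.database)      # state before the crash
print(recover(store.log))  # state rebuilt from the log alone
```

Recovery needs no UNDO pass here because uncommitted after images are simply skipped, which is the property the slide attributes to deferred update.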
System Resource: Dynamic Parameters
System Resource: Static Parameters
Etc. …
Before Crash: Transaction arrival rate vs. transaction response time
Recovery Overhead
Abort Situation
References
- PERST: a main-memory object-oriented database system, http://www.mcobject.com/, August 2006.
- Inseon Lee, Heon Y. Yeon, Taesoon Park, “A New Approach for Distributed Main Memory Database Systems: A Causal Commit Protocol”, IEICE Trans. Inf. & Syst., Vol. E87, No. 1, January 2004.
- Nicholas Carriero, Michael V. Osier, Kei-Hoi Cheung, Peter Masiar, Perry L. Miller, Kevin White, Martin Schultz, “Exploring the Use of Main Memory Database (MMDB) Technology for the Analysis of Gene Expression Microarray Data”, Technical report, April 2004.
- Stefan Manegold, “Understanding, Modeling, and Improving Main-Memory Database Performance”, November 2002.
- Philip Bohannon, Peter McIlroy, Rajeev Rastogi, “Main Memory Index Structures with Fixed-Size Partial Keys”, ACM SIGMOD 2001, May 21-24, Santa Barbara, California, USA.
- S. Manegold, P. A. Boncz, and M. L. Kersten, “Optimizing Main-Memory Join on Modern Hardware”, IEEE Transactions on Knowledge and Data Engineering (TKDE), 14(4):709–730, July 2002.
- J. Rao and K. A. Ross, “Making B+-Trees Cache Conscious in Main Memory”, In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 475–486, Dallas, TX, USA, May 2000.
References (Cont.)
- Tobin J. Lehman, Michael J. Carey, “A Study of Index Structures for Main Memory Database Management Systems”, Proceedings of the Twelfth International Conference on Very Large Data Bases, Kyoto, August 1986.
- Rajeev Rastogi, S. Seshadri, Philip Bohannon, Dennis Leinbaugh, “Logical and Physical Versioning in Main Memory Databases”, Proceedings of the 23rd VLDB Conference, Athens, Greece, 1997.
- Hector Garcia-Molina, Kenneth Salem, “Main Memory Database Systems: An Overview”, IEEE, 1992.
- Piyush Burte, Boanerges Aleman-Meza, D. Brent Weatherly, Rong Wu, “Transaction Management for a Main-Memory Database”.
- Tobin J. Lehman, Michael J. Carey, “Query Processing in Main Memory Database Management Systems”, ACM, 1986.
- Margaret H. Eich, “Main Memory Database Recovery”, IEEE, 1986.
- J. Baulier, P. Bohannon, S. Gogate, C. Gupta, S. Haldar, S. Joshi, A. Khivesera, H. F. Korth, P. McIlroy, J. Miller, P. P. S. Narayan, M. Nemeth, R. Rastogi, S. Seshadri, A. Silberschatz, S. Sudarshan, M. Wilder, and C. Wei, “DataBlitz Storage Manager: Main-Memory Database Performance for Critical Applications”, In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 519–520, Philadelphia, PA, USA, June 1999.