Recovery in Main Memory Databases
Le Gruenwald, Jing Huang, Margaret H. Dunham et al.
Engineering Intelligent Systems, Vol.4, No. 3, September 1996
이 인선 97/08/21
Introduction
General MMDB Architecture
– Main Memory (MM)
    the database resides in RAM
– Stable Memory (SM)
    optional nonvolatile memory used to hold log buffers (log tail)
    avoids I/O actions when transactions are committed
    essential to performance
– Archive Memory (AM)
    holds a backup of the entire database
Focus on logging, checkpointing, reloading
MMDB Logging(1)
– physical logging
    the state of the database modified by an operation is logged
    it is recommended for MMDB systems
– logical logging
    contains descriptions of higher-level operations and records the state transition of the database
    the idempotent property does not hold
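The idempotence contrast above can be sketched in a few lines. This is a minimal illustration, not taken from the paper: the dict-based database, the record layouts, and the function names `redo_physical` and `redo_logical` are all assumptions made for the example.

```python
# Hypothetical in-memory database: a dict mapping item names to values.

def redo_physical(db, record):
    """A physical record stores the after image, so redo simply installs it."""
    item, after_image = record
    db[item] = after_image

def redo_logical(db, record):
    """A logical record stores an operation (here: add a delta) to re-execute."""
    item, delta = record
    db[item] += delta

physical_db = {"x": 10}
logical_db = {"x": 10}

# The same update (x: 10 -> 15) is logged in both styles and then redone twice,
# as could happen when recovery is restarted after a second crash.
for _ in range(2):
    redo_physical(physical_db, ("x", 15))
    redo_logical(logical_db, ("x", 5))

print(physical_db["x"])  # 15: repeating the physical redo is harmless
print(logical_db["x"])   # 20: the second logical redo applied the delta again
```

Because a physical redo can be repeated safely, recovery does not need to know whether an update had already reached the checkpointed copy, which is what makes it convenient alongside fuzzy checkpointing.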
MMDB Logging(2)
Logging rules
– Write Ahead Rule
    undo-log data must be written to nonvolatile memory prior to the update in the database
– Commit rule
    if a DBMS allows a transaction to commit, its redo-log data must be ensured in nonvolatile storage
– Logging After Writing (LAW)
    the after image of an updated item should be written to the log after its corresponding update is propagated to the database
    simplifies log processing in an MMDB with fuzzy checkpointing
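The ordering these three rules impose can be sketched as follows. This is one possible reading, under stated assumptions: `StableLog` is a hypothetical stand-in for a nonvolatile log buffer, `Database` and its methods are invented for the example, and "propagated to the database" is taken to mean the in-memory update.

```python
class StableLog:
    """Hypothetical stand-in for a nonvolatile log buffer."""
    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)


class Database:
    def __init__(self, log):
        self.data = {}
        self.log = log

    def update(self, txn, item, new_value):
        old_value = self.data.get(item)
        # Write Ahead Rule: the undo data reaches the stable log first.
        self.log.append(("undo", txn, item, old_value))
        self.data[item] = new_value
        # Logging After Writing: the after image is logged after the update.
        self.log.append(("redo", txn, item, new_value))

    def commit(self, txn):
        # Commit rule: every redo record of txn is already in the stable log,
        # so only the commit record remains to be appended.
        self.log.append(("commit", txn))


log = StableLog()
db = Database(log)
db.update("T1", "x", 1)
db.commit("T1")
kinds = [record[0] for record in log.records]
print(kinds)  # ['undo', 'redo', 'commit']
```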
MMDB Logging(3)
MMDB logging differs from DRDB (disk-resident database) logging in three ways
– a nonvolatile log buffer should be used to satisfy WAL without requiring I/O prior to transaction commit
– physical logging is recommended, as it is easier to use with fuzzy checkpointing
– to reduce the amount of log needed to redo transactions after a system failure, the LAW policy should be followed
Checkpointing DRDB
Commit-consistent checkpointing
– periodically stop processing transactions
– flush all dirty cache slots and mark the log
Cache-consistent checkpointing
Fuzzy checkpointing
– only flushes those dirty slots that have not been flushed since before the previous checkpoint
– normal replacement activity will flush most cache slots that were dirty since before the previous checkpoint
– the checkpoint won’t have much flushing to do and won’t delay active transactions for very long
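The fuzzy checkpointing rule above can be sketched with a per-slot "dirty since" marker. This is a minimal sketch under stated assumptions: the `Cache` class, its method names, and the epoch counter are hypothetical and only illustrate which slots a checkpoint has to flush.

```python
class Cache:
    def __init__(self):
        self.epoch = 0         # number of checkpoints taken so far
        self.dirty_since = {}  # slot -> epoch in which it first became dirty
        self.flushed = []      # (slot, reason) pairs, kept for illustration

    def write(self, slot):
        # Record only the first time the slot became dirty.
        self.dirty_since.setdefault(slot, self.epoch)

    def replace(self, slot):
        # Normal replacement flushes a dirty slot and leaves it clean.
        if self.dirty_since.pop(slot, None) is not None:
            self.flushed.append((slot, "replacement"))

    def fuzzy_checkpoint(self):
        # Flush only slots that have been dirty since before the previous checkpoint.
        old = [s for s, e in self.dirty_since.items() if e < self.epoch]
        for slot in old:
            del self.dirty_since[slot]
            self.flushed.append((slot, "checkpoint"))
        self.epoch += 1
        return old


cache = Cache()
cache.write("A")
cache.write("B")
first = cache.fuzzy_checkpoint()   # nothing is old enough yet
cache.write("C")
cache.replace("A")                 # replacement flushes A before the checkpoint
second = cache.fuzzy_checkpoint()  # only B is still dirty from before
print(first, second)               # [] ['B']
```

Slot C stays dirty after the second checkpoint; it would be flushed by the next one unless replacement gets to it first.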
Checkpointing MMDBs(1)
Focuses on low interference with normal transactions and supporting efficient recovery
Fuzzy checkpointing
– Hagmann first suggested using fuzzy checkpointing for MMDBs
    “A crash recovery scheme for a memory-resident database system”, IEEE Transactions on Computers, Vol. C-35, No. 9, September 1986
    the checkpointer does not need to obtain locks on the data items to be checkpointed
    the database is dumped in sections
    after dumping a section, the checkpointer writes a log record to the log
    a section must not overwrite its previous image (sliding monoplexed backups)
LAW with fuzzy checkpointing
Checkpointing MMDBs(2)
– Salem and Garcia-Molina, “Checkpointing memory-resident databases” (’89)
    compared the fuzzy checkpointing scheme with two non-fuzzy checkpointing schemes
    fuzzy checkpointing is the most efficient one
    ping-pong scheme
        – each dirty page is flushed twice
– Lin and Dunham, “Segmented fuzzy checkpointing for main memory databases” (’94)
    checkpoints one segment at a time in a round-robin fashion
    automatically changes the segment boundaries based on the distribution of update operations
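The round-robin part of segmented checkpointing can be sketched as below. This is a simplified illustration, not Lin and Dunham's algorithm: segment boundaries are fixed here (the automatic boundary adjustment is omitted), and `make_checkpointer` and the dict-based segments are hypothetical names.

```python
from itertools import cycle

def make_checkpointer(segments, backup):
    """Return a function that checkpoints the next segment on each call."""
    order = cycle(range(len(segments)))

    def checkpoint_next():
        i = next(order)
        backup[i] = dict(segments[i])  # copy one segment only
        return i

    return checkpoint_next


segments = [{"a": 1}, {"b": 2}, {"c": 3}]
backup = {}
step = make_checkpointer(segments, backup)

visited = [step() for _ in range(4)]
print(visited)  # [0, 1, 2, 0]
```

Each call touches a single segment, so the work done per checkpoint stays bounded by the segment size rather than the database size.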
Checkpointing MMDBs(3)
[Figure: redo log size in segmented fuzzy checkpointing]
– Li et al., “Checkpointing and recovery in partitioned main memory databases” (’95)
    the database is divided into partitions, each of which has its own log disks
    the time to recover from a system failure is reduced
Checkpointing MMDBs(4)
Non-Fuzzy Checkpointing
– overhead comes from locking the checkpointed objects to ensure transaction consistency or action consistency
– Lehman and Carey, “A recovery algorithm for a high-performance memory-resident database system” (’87)
    transaction-consistent (at relation level) scheme
    no need to maintain undo-log records in nonvolatile storage
    checkpointing increases data contention with normal transactions
Checkpointing MMDBs(5)
– Salem and Garcia-Molina, “Checkpointing memory-resident databases” (’89)
    discuss two non-fuzzy checkpointing approaches
        – the first (black and white) one aborts some update transactions
        – the second (Copy-On-Update) one requires some update transactions to store the original values of the data items to be updated
        – both have a severe impact on system performance
– Jagadish et al., “Recovering from main-memory lapses” (’93)
    propose an action-consistent checkpointing scheme
    the undo logs of active transactions are first written to the log, and then dirty pages are flushed to disk
    during normal processing, the redo logs of committed transactions are written to the log
    ping-pong update
    this approach was originally used in Dali
Checkpointing MMDBs(6)
Log-driven checkpointing
– applies the log to a previous dump to generate a new dump
– originally used to generate a remote backup of the database
– is adopted in “Incremental recovery in main memory database systems” (’92)
– with the high transaction processing rate in MMDBs, the size of the log can increase rapidly
– it is quite inefficient compared to fuzzy checkpointing
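The idea of producing a new dump from an old dump plus the log can be sketched as follows. This is a minimal sketch, assuming a physical redo log of (item, after image) pairs; the function name `log_driven_checkpoint` and the dict-based dump are hypothetical.

```python
def log_driven_checkpoint(previous_dump, log):
    """Build a new dump by replaying the redo log over the previous dump."""
    new_dump = dict(previous_dump)  # the previous dump itself is left untouched
    for item, after_image in log:
        new_dump[item] = after_image
    return new_dump


old_dump = {"x": 1, "y": 2}
redo_log = [("x", 5), ("z", 7), ("x", 9)]

new_dump = log_driven_checkpoint(old_dump, redo_log)
print(new_dump)  # {'x': 9, 'y': 2, 'z': 7}
```

The loop visits every log record, so the cost grows with the length of the log, which is the inefficiency noted above when the transaction rate is high.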
MMDB Reloading(1)
Issues
– occurrence frequency of the reload process
    on average, a system failure occurs once every few weeks
    media failure, MM page faults
– when the system should resume its execution after a failure
    28.43 minutes are needed to recover a 1 GB DB [?]
    if the system is not available at all during recovery, many transactions will be backlogged
– reload prioritization
    reload priority can be determined based on access frequency, transaction deadline (“MMDB reload algorithms”), or temporal data interval from real-time applications [?]
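Reload prioritization by access frequency can be sketched with a priority queue. This is only an illustration of the ordering idea: the function `reload_order`, the page names, and the access counts are made up for the example and do not come from the paper.

```python
import heapq

def reload_order(access_counts):
    """Return page ids ordered from most to least frequently accessed."""
    # heapq is a min-heap, so counts are negated to pop the hottest page first.
    heap = [(-count, page) for page, count in access_counts.items()]
    heapq.heapify(heap)
    order = []
    while heap:
        _, page = heapq.heappop(heap)
        order.append(page)
    return order


counts = {"archive": 2, "catalog": 90, "orders": 40}
order = reload_order(counts)
print(order)  # ['catalog', 'orders', 'archive']
```

A deadline-based policy could reuse the same structure with the earliest deadline as the heap key instead of the negated access count.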
MMDB Reloading(2)
Existing reload schemes
– simple reloading
    the system cannot be brought online until the entire database is memory-resident
– concurrent reloading
    Gruenwald
        – “MMDB reload algorithms” (’91)
        – two processors (RP & DP), nonvolatile shadow memory (SM), and a dual address translation mechanism in the MARS system
        – ordered reload with prioritization / smart reload / frequency reload
        – the differences lie in the structure of AM, utilization of data access frequency, reload prioritization, and reload granularity
        – the frequency reload yields the best transaction response time and system throughput
MMDB Reloading(3)
Lehman
– “A recovery algorithm for a high-performance”
– after the system catalogs and their indices are reloaded, regular transaction processing is allowed to resume
Levy and Silberschatz
– “Incremental recovery in main memory database systems” (’92)
– resumes transaction processing immediately after a system failure and recovers pages individually according to the demand of post-crash transactions
– stale/fresh marking technique
– to implement page-based recovery, log records must be grouped together on a page basis during normal operation
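On-demand page recovery with stale/fresh marking can be sketched as below. This is a rough illustration of the idea rather than Levy and Silberschatz's design: the class `RecoveringDatabase`, the per-page log layout, and the dict-based archive are assumptions made for the example.

```python
class RecoveringDatabase:
    def __init__(self, archive, page_logs):
        self.archive = archive      # page id -> backup copy of the page
        self.page_logs = page_logs  # page id -> list of (item, after_image)
        self.memory = {}            # pages currently resident in main memory
        self.fresh = set()          # pages not in this set are considered stale

    def _recover(self, page):
        # Reload the backup copy, then replay only this page's log records.
        contents = dict(self.archive.get(page, {}))
        for item, after_image in self.page_logs.get(page, []):
            contents[item] = after_image
        self.memory[page] = contents
        self.fresh.add(page)

    def read(self, page, item):
        # A post-crash transaction triggers recovery of the page it touches.
        if page not in self.fresh:
            self._recover(page)
        return self.memory[page][item]


archive = {"p1": {"x": 1}, "p2": {"y": 2}}
page_logs = {"p1": [("x", 5)]}
db = RecoveringDatabase(archive, page_logs)

value = db.read("p1", "x")
print(value)             # 5
print(sorted(db.fresh))  # ['p1']
```

Page p2 is never touched, so it stays stale and is not loaded. Grouping log records per page is what lets `_recover` avoid scanning the whole log.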
Recovery with Existing MMDB Systems(1)
Dali from AT&T
– the original recovery manager was implemented according to “Recovering from main-memory lapses” (’93)
    logging only redo records during normal execution
    segment-level action-consistent checkpoints
    the checkpointer writes the relevant parts of the undo log to disk
    recovery has only a single pass over the log
    requires no special hardware to preserve the data
– tests led to a restructuring of its recovery manager
    “Multi-level recovery in the Dali storage manager” (’95)
    multi-level logging, post-commit actions, dirty page detection, and fuzzy checkpoints
Recovery with Existing MMDB Systems(2)
Fast Path
– supports both memory-resident data and disk-resident data
– performs updates to memory-resident data at commit time
– no undo operations are required when a failure occurs
– group commit is adopted
– a transaction-consistent backup copy of the database is refreshed during system shutdown or at infrequent checkpoints
– two backup databases with ping-pong backups
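Why updating at commit time removes the need for undo can be sketched as follows. This is a generic deferred-update illustration, not Fast Path's actual interface: the class `DeferredUpdateTxn` and its methods are hypothetical.

```python
class DeferredUpdateTxn:
    def __init__(self, db):
        self.db = db
        self.pending = {}  # private write set, invisible to the database

    def write(self, item, value):
        self.pending[item] = value

    def read(self, item):
        # A transaction sees its own pending writes first.
        return self.pending.get(item, self.db.get(item))

    def commit(self):
        # The database is modified only here, at commit time.
        self.db.update(self.pending)
        self.pending = {}

    def abort(self):
        # Nothing was applied, so there is nothing to undo.
        self.pending = {}


db = {"x": 1}

t1 = DeferredUpdateTxn(db)
t1.write("x", 2)
t1.abort()
after_abort = db["x"]

t2 = DeferredUpdateTxn(db)
t2.write("x", 3)
t2.commit()
after_commit = db["x"]

print(after_abort, after_commit)  # 1 3
```

A crash before commit is equivalent to the abort path: the database never saw the uncommitted values, so recovery only needs redo information.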
Recovery with Existing MMDB Systems(3)
Two real-time system examples
    NEC Real-Time DBMS
    Stone RTDB
– NEC RTDBMS has several features to ensure high throughput and accurate predictability
    no page faults
    in-memory log buffer is nonvolatile
    physical logging using deferred update
    fuzzy checkpointing
    no real-time characteristics such as transaction deadline and criticalness are utilized in the recovery components
Summary and Conclusion
– discussed 3 logging rules
    a nonvolatile log buffer should be used to satisfy WAL without requiring I/O prior to transaction commit
    LAW should be followed to reduce the amount of log needed to redo transactions after a system failure
– described three groups of checkpointing
– identified 3 issues about reloading
    data should be prioritized for reload purposes
– future research
    investigate how real-time requirements such as transaction deadlines and temporal data intervals can be incorporated into MMDB recovery
A crash recovery scheme for a memory-resident database system
Robert B. Hagmann
IEEE Transactions on Computers, Vol. C-35, No. 9, September 1986
Overview
Presents a method of doing recovery that uses the existing techniques of fuzzy dumps and log compression
Design requirement
– small system example
    2 pages/transaction * 100 transactions/s * 3600 s/h * 8 h = 5,760,000 pages written to the log
– transactions must be short
– checkpointed periodically, every five minutes
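The log-volume figure in the small system example can be checked directly. The per-checkpoint number is not on the slide; it is derived here from the same rates and the five-minute checkpoint interval, and the variable names are chosen for the example.

```python
pages_per_txn = 2
txns_per_second = 100
seconds_per_hour = 3600
hours = 8

pages_per_second = pages_per_txn * txns_per_second
pages_per_day = pages_per_second * seconds_per_hour * hours

# Log pages produced between two checkpoints taken five minutes apart.
pages_per_checkpoint_interval = pages_per_second * 5 * 60

print(pages_per_day)                  # 5760000
print(pages_per_checkpoint_interval)  # 60000
```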
Overview(2)
– the principal requirement of the system is “fast” recovery from a system crash
    critical factor: transfer rate of the disk
    can be improved by using several parallel processors
Design overview
– fuzzy dump
    simply a copy of the database taken without any synchronization
– if a DBMS uses nonvolatile storage, some log compression can occur
– else precommitting and group commits can be used to increase performance
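Group commit, mentioned above as a fallback when no nonvolatile storage is available, can be sketched as batching commit records so that several transactions share one log write. This is a simplified illustration: the class `GroupCommitLog` is hypothetical, and a real system would also flush on a timeout, which is omitted here.

```python
class GroupCommitLog:
    def __init__(self, group_size):
        self.group_size = group_size
        self.buffer = []   # commit records waiting in volatile memory
        self.disk = []     # commit records that have reached the log disk
        self.io_count = 0  # number of log writes performed

    def commit(self, txn):
        self.buffer.append(txn)
        if len(self.buffer) >= self.group_size:
            self.flush()

    def flush(self):
        # One write makes the whole group durable.
        if self.buffer:
            self.disk.extend(self.buffer)
            self.buffer = []
            self.io_count += 1


log = GroupCommitLog(group_size=3)
for txn in range(7):
    log.commit(txn)
log.flush()  # force out the last, partially filled group

print(log.io_count)  # 3 writes for 7 transactions
```

The trade-off is latency: a transaction is not durable until its group has been flushed.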
overview
Design details