Introduction to Database Systems
CSE 444
Lectures 9-10
Transactions: recovery
CSE 444 - Summer 2010

Outline
• We are starting to look at DBMS internals
• Next pair of lectures: transactions & recovery
– Disks 13.2
– Undo logging 17.2
– Redo logging 17.3
– Redo/undo 17.4

The Mechanics of Disk
Mechanical characteristics:
• Rotation speed (5400 RPM)
• Number of platters (1-30)
• Number of tracks (<=10000)
• Number of bytes/track (10^5)
Unit of read or write: disk block
Once in memory: page
Typically: 4k or 8k or 16k
[Figure: disk drive diagram labeling cylinder, spindle, disk head, tracks, sector, platters, arm movement, arm assembly]

RAID
Several disks that work in parallel
• Redundancy: use parity to recover from disk failure
• Speed: read from several disks at once
Various configurations (called levels):
• RAID 1 = mirror
• RAID 4 = n disks + 1 parity disk
• RAID 5 = n+1 disks, assign parity blocks round robin
• RAID 6 = “Hamming codes”
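
The parity idea behind RAID 4 and RAID 5 can be sketched in a few lines. This is a minimal illustration, assuming byte-wise XOR parity over equal-length blocks; the helper names `parity_block` and `recover_block` and the block contents are made up for the example.

```python
from functools import reduce

def parity_block(blocks):
    """XOR equal-length blocks byte by byte to get the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def recover_block(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors plus parity."""
    return parity_block(surviving_blocks + [parity])

# Three data disks with made-up block contents, plus one parity disk.
data = [b"disk0 ok", b"disk1 ok", b"disk2 ok"]
parity = parity_block(data)

# Pretend disk 1 failed: rebuild its block from disks 0 and 2 and the parity.
rebuilt = recover_block([data[0], data[2]], parity)
print(rebuilt)  # b'disk1 ok'
```

XOR works here because XOR-ing the parity with every surviving block cancels them out and leaves only the missing one. It recovers from one failed disk, not two.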

Disk Access Characteristics
• Disk latency = time between when command is issued and when data is in memory
• Disk latency = seek time + rotational latency
– Seek time = time for the head to reach cylinder
• 10ms – 40ms
– Rotational latency = time for the sector to rotate
• Rotation time = 10ms
• Average latency = 10ms/2
• Transfer rate = typically 40MB/s
• Disks read/write one block at a time
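
As a rough worked example, the figures above can be combined into the time to read one block. This is a sketch under stated assumptions: the function name is hypothetical, 1 MB is taken as 10^6 bytes, and the 8k block is one of the typical page sizes mentioned earlier.

```python
def block_read_time_ms(seek_ms, rotation_ms, block_bytes, transfer_mb_per_s):
    """Seek time + average rotational latency (half a rotation) + transfer time."""
    rotational_ms = rotation_ms / 2
    transfer_ms = block_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + rotational_ms + transfer_ms

# Slide figures: seek 10-40 ms, rotation time 10 ms, 40 MB/s transfer.
best = block_read_time_ms(10, 10, 8 * 1024, 40)
worst = block_read_time_ms(40, 10, 8 * 1024, 40)
print(f"{best:.2f} ms to {worst:.2f} ms")  # 15.20 ms to 45.20 ms
```

The transfer of a single block takes about 0.2 ms, so the cost of a random read is dominated by seek time and rotational latency.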

Storage Latency: How Far Away is the Data?
[Figure: storage hierarchy, with relative latency and an equivalent distance and travel time]
Registers: 1 (My Head, 1 min)
On Chip Cache: 2 (This Room)
On Board Cache: 10 (This Building, 10 min)
Memory: 100 (Olympia, 1.5 hr)
Disk: 10^6 (Pluto, 2 Years)
Tape / Optical Robot: 10^9 (Andromeda, 2,000 Years)
© 2004 Jim Gray, Microsoft Corporation
7/8/2010 © 2007 Gribble, Lazowska, Levy, Zahorjan

Buffer Management in a DBMS
[Figure: page requests from higher levels READ and WRITE the buffer pool in main memory, which holds disk pages and free frames; pages are INPUT from and OUTPUT to the DB on disk; choice of frame dictated by replacement policy]
• Data must be in RAM for DBMS to operate on it!
• Table of <frame#, pageid> pairs is maintained

Buffer Manager
• Enables higher layers of the DBMS to assume that needed data is in main memory
• Needs to decide on page replacement policy
– LRU, clock algorithm, or other
• Both work well in OS, but not always in DB

Least Recently Used (LRU)
• Order pages by the time of last access
• Always replace the least recently accessed
P5, P2, P8, P4, P1, P9, P6, P3, P7
Access P6
P6, P5, P2, P8, P4, P1, P9, P3, P7
LRU is expensive (why?); the clock algorithm is a good approximation
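
Both policies can be sketched briefly. The `LRUOrder` class below reproduces the P6 example, and `clock_victim` shows the usual second-chance form of the clock algorithm with one reference bit per frame. The names are hypothetical and pinned pages are not modeled; this is an illustration, not how any particular DBMS implements its buffer pool.

```python
from collections import OrderedDict

class LRUOrder:
    """Tracks pages from most recently used (front) to least recently used (back)."""
    def __init__(self, pages):
        self.pages = OrderedDict((p, None) for p in pages)

    def access(self, page):
        self.pages[page] = None
        self.pages.move_to_end(page, last=False)  # move to the front

    def victim(self):
        return next(reversed(self.pages))  # least recently used page

    def order(self):
        return list(self.pages)

def clock_victim(ref_bits, hand):
    """Sweep from `hand`, clearing reference bits, until a frame with bit 0 is found.
    Returns (victim_frame, new_hand_position)."""
    n = len(ref_bits)
    while ref_bits[hand]:
        ref_bits[hand] = 0          # give this frame a second chance
        hand = (hand + 1) % n
    return hand, (hand + 1) % n

lru = LRUOrder(["P5", "P2", "P8", "P4", "P1", "P9", "P6", "P3", "P7"])
lru.access("P6")
print(lru.order())   # ['P6', 'P5', 'P2', 'P8', 'P4', 'P1', 'P9', 'P3', 'P7']
print(lru.victim())  # P7

print(clock_victim([1, 1, 0, 1], 0))  # (2, 3)
```

LRU has to reorder its structure on every page access, while the clock algorithm only sets a bit on access and does its work when a victim is needed.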

Buffer Manager
• Why not use the OS for the task?
• Reason 1: Correctness
– DBMS needs fine-grained control for transactions
– Needs to force pages to disk for recovery purposes
• Reason 2: Performance
– DBMS may be able to anticipate access patterns
– Hence, may also be able to perform prefetching
– May select better page replacement policy
– May want to pin pages in the buffer

Transaction Management and the Buffer Manager
Transaction manager operates on buffer pool
• Recovery: ‘log-file write-ahead’, then careful policy about which pages to force to disk
• Concurrency control: locks at the page level, multiversion concurrency control

Transaction Management
Two parts:
• Recovery from crashes: ACID
• Concurrency control: ACID
Both operate on the buffer pool

Problem Illustration
Client 1:
START TRANSACTION
INSERT INTO SmallProduct(name, price)
SELECT pname, price
FROM Product
WHERE price <= 0.99

DELETE FROM Product
WHERE price <= 0.99
Crash !
COMMIT
What do we do now?

Recovery
| Type of Crash | Prevention |
|---|---|
| Wrong data entry | Constraints and data cleaning |
| Disk crashes | Redundancy: e.g. RAID, archive |
| Fire, theft, bankruptcy… | Buy insurance, change jobs… |
| System failures | DATABASE RECOVERY |

Main Idea for Recovery
• Each transaction has internal state
• When system crashes, internal state is lost
– Don’t know which parts executed and which didn’t
– Need ability to undo and redo
• Remedy: use a log
– File that records every single action of all running transactions
– After a crash, transaction manager reads the log to find out exactly what each transaction did or did not do

Transactions
• Assumption: db composed of elements
– Usually 1 element = 1 block
– Can be smaller (=1 record) or larger (=1 relation)
• Assumption: each transaction reads/writes some elements

Primitive Operations of Transactions
• READ(X,t)
– copy element X to transaction local variable t
• WRITE(X,t)
– copy transaction local variable t to element X
• INPUT(X)
– read element X to memory buffer
• OUTPUT(X)
– write element X to disk
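
The four primitives can be modeled with two dictionaries, one for the disk and one for the buffer pool. This is a toy sketch: the `Store` class and its attribute names are hypothetical, and having READ trigger INPUT on demand is an assumption made for convenience.

```python
class Store:
    """Toy model: `disk` and `mem` (buffer pool) map element names to values."""
    def __init__(self, disk):
        self.disk = dict(disk)
        self.mem = {}
        self.local = {}  # transaction-local variables

    def INPUT(self, x):
        self.mem[x] = self.disk[x]      # disk -> buffer

    def READ(self, x, t):
        if x not in self.mem:           # bring the element in on demand
            self.INPUT(x)
        self.local[t] = self.mem[x]     # buffer -> local variable

    def WRITE(self, x, t):
        self.mem[x] = self.local[t]     # local variable -> buffer

    def OUTPUT(self, x):
        self.disk[x] = self.mem[x]      # buffer -> disk

s = Store({"A": 8, "B": 8})
for x in ("A", "B"):
    s.READ(x, "t")
    s.local["t"] *= 2
    s.WRITE(x, "t")
print(s.mem, s.disk)  # {'A': 16, 'B': 16} {'A': 8, 'B': 8}
s.OUTPUT("A")
s.OUTPUT("B")
print(s.disk)         # {'A': 16, 'B': 16}
```

Note that WRITE changes only the buffer copy; the disk keeps the old value until OUTPUT runs.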

Example
START TRANSACTION
READ(A,t);
t := t*2;
WRITE(A,t);
READ(B,t);
t := t*2;
WRITE(B,t);
COMMIT;
Atomicity: BOTH A and B are multiplied by 2

READ(A,t); t := t*2; WRITE(A,t); READ(B,t); t := t*2; WRITE(B,t);
Columns: t = transaction; Mem A, Mem B = buffer pool; Disk A, Disk B = disk

| Action | t | Mem A | Mem B | Disk A | Disk B |
|---|---|---|---|---|---|
| INPUT(A) | | 8 | | 8 | 8 |
| READ(A,t) | 8 | 8 | | 8 | 8 |
| t:=t*2 | 16 | 8 | | 8 | 8 |
| WRITE(A,t) | 16 | 16 | | 8 | 8 |
| INPUT(B) | 16 | 16 | 8 | 8 | 8 |
| READ(B,t) | 8 | 16 | 8 | 8 | 8 |
| t:=t*2 | 16 | 16 | 8 | 8 | 8 |
| WRITE(B,t) | 16 | 16 | 16 | 8 | 8 |
| OUTPUT(A) | 16 | 16 | 16 | 16 | 8 |
| OUTPUT(B) | 16 | 16 | 16 | 16 | 16 |

| Action | t | Mem A | Mem B | Disk A | Disk B |
|---|---|---|---|---|---|
| INPUT(A) | | 8 | | 8 | 8 |
| READ(A,t) | 8 | 8 | | 8 | 8 |
| t:=t*2 | 16 | 8 | | 8 | 8 |
| WRITE(A,t) | 16 | 16 | | 8 | 8 |
| INPUT(B) | 16 | 16 | 8 | 8 | 8 |
| READ(B,t) | 8 | 16 | 8 | 8 | 8 |
| t:=t*2 | 16 | 16 | 8 | 8 | 8 |
| WRITE(B,t) | 16 | 16 | 16 | 8 | 8 |
| OUTPUT(A) | 16 | 16 | 16 | 16 | 8 |
| Crash ! | | | | | |
| OUTPUT(B) | 16 | 16 | 16 | 16 | 16 |

Crash occurs after OUTPUT(A), before OUTPUT(B)
We lose atomicity
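
The loss of atomicity can be reproduced by replaying the trace and stopping early. This is a small sketch with hypothetical names; it keeps only the disk contents after the crash, since the buffer pool and the local variable t are lost.

```python
def run_until_crash(actions, crash_after):
    """Run the first `crash_after` actions on a toy disk/buffer model.
    Returns the disk state, which is all that survives the crash."""
    disk, mem, t = {"A": 8, "B": 8}, {}, None
    for kind, x in actions[:crash_after]:
        if kind == "INPUT":
            mem[x] = disk[x]
        elif kind == "READ":
            t = mem[x]
        elif kind == "DOUBLE":
            t = t * 2
        elif kind == "WRITE":
            mem[x] = t
        elif kind == "OUTPUT":
            disk[x] = mem[x]
    return disk

trace = [("INPUT", "A"), ("READ", "A"), ("DOUBLE", None), ("WRITE", "A"),
         ("INPUT", "B"), ("READ", "B"), ("DOUBLE", None), ("WRITE", "B"),
         ("OUTPUT", "A"), ("OUTPUT", "B")]

print(run_until_crash(trace, 9))   # {'A': 16, 'B': 8}   crash after OUTPUT(A)
print(run_until_crash(trace, 10))  # {'A': 16, 'B': 16}  no crash
```

After the crash the disk holds A=16 and B=8: only half of the transaction is visible, and nothing on disk says which half.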

Buffer Manager Policies
• STEAL or NO-STEAL
– Can an update made by an uncommitted transaction overwrite the most recent committed value of a data item on disk?
• FORCE or NO-FORCE
– Should all updates of a transaction be forced to disk before the transaction commits?
• Easiest for recovery: NO-STEAL/FORCE
• Highest performance: STEAL/NO-FORCE

The Log
• Log = append-only file containing log records
• Multiple transactions run concurrently, log records are interleaved
• After a system crash, use log to:
– Redo some transactions that did commit
– Undo other transactions that did not commit
• Three kinds of logs: undo, redo, undo/redo

Undo Logging
Log records
• <START T>
– Transaction T has begun
• <COMMIT T>
– T has committed
• <ABORT T>
– T has aborted
• <T,X,v> -- Update record
– T has updated element X, and its old value was v

| Action | t | Mem A | Mem B | Disk A | Disk B | Log |
|---|---|---|---|---|---|---|
| | | | | | | <START T> |
| INPUT(A) | | 8 | | 8 | 8 | |
| READ(A,t) | 8 | 8 | | 8 | 8 | |
| t:=t*2 | 16 | 8 | | 8 | 8 | |
| WRITE(A,t) | 16 | 16 | | 8 | 8 | <T,A,8> |
| INPUT(B) | 16 | 16 | 8 | 8 | 8 | |
| READ(B,t) | 8 | 16 | 8 | 8 | 8 | |
| t:=t*2 | 16 | 16 | 8 | 8 | 8 | |
| WRITE(B,t) | 16 | 16 | 16 | 8 | 8 | <T,B,8> |
| OUTPUT(A) | 16 | 16 | 16 | 16 | 8 | |
| OUTPUT(B) | 16 | 16 | 16 | 16 | 16 | |
| COMMIT | | | | | | <COMMIT T> |

| Action | t | Mem A | Mem B | Disk A | Disk B | Log |
|---|---|---|---|---|---|---|
| | | | | | | <START T> |
| INPUT(A) | | 8 | | 8 | 8 | |
| READ(A,t) | 8 | 8 | | 8 | 8 | |
| t:=t*2 | 16 | 8 | | 8 | 8 | |
| WRITE(A,t) | 16 | 16 | | 8 | 8 | <T,A,8> |
| INPUT(B) | 16 | 16 | 8 | 8 | 8 | |
| READ(B,t) | 8 | 16 | 8 | 8 | 8 | |
| t:=t*2 | 16 | 16 | 8 | 8 | 8 | |
| WRITE(B,t) | 16 | 16 | 16 | 8 | 8 | <T,B,8> |
| OUTPUT(A) | 16 | 16 | 16 | 16 | 8 | |
| OUTPUT(B) | 16 | 16 | 16 | 16 | 16 | |
| Crash ! | | | | | | |
| COMMIT | | | | | | <COMMIT T> |

WHAT DO WE DO?

| Action | t | Mem A | Mem B | Disk A | Disk B | Log |
|---|---|---|---|---|---|---|
| | | | | | | <START T> |
| INPUT(A) | | 8 | | 8 | 8 | |
| READ(A,t) | 8 | 8 | | 8 | 8 | |
| t:=t*2 | 16 | 8 | | 8 | 8 | |
| WRITE(A,t) | 16 | 16 | | 8 | 8 | <T,A,8> |
| INPUT(B) | 16 | 16 | 8 | 8 | 8 | |
| READ(B,t) | 8 | 16 | 8 | 8 | 8 | |
| t:=t*2 | 16 | 16 | 8 | 8 | 8 | |
| WRITE(B,t) | 16 | 16 | 16 | 8 | 8 | <T,B,8> |
| OUTPUT(A) | 16 | 16 | 16 | 16 | 8 | |
| OUTPUT(B) | 16 | 16 | 16 | 16 | 16 | |
| COMMIT | | | | | | <COMMIT T> |
| Crash ! | | | | | | |

WHAT DO WE DO?

After Crash
• In the first example:
– We UNDO both changes: A=8, B=8
– The transaction is atomic, since none of its actions has been executed
• In the second example:
– We don’t undo anything
– The transaction is atomic, since both its actions have been executed

Undo-Logging Rules
U1: If T modifies X, then <T,X,v> must be written to disk before OUTPUT(X)
U2: If T commits, then OUTPUT(X) must be written to disk before <COMMIT T>
• Hence: OUTPUTs are done early, before the transaction commits

| Action | t | Mem A | Mem B | Disk A | Disk B | Log |
|---|---|---|---|---|---|---|
| | | | | | | <START T> |
| INPUT(A) | | 8 | | 8 | 8 | |
| READ(A,t) | 8 | 8 | | 8 | 8 | |
| t:=t*2 | 16 | 8 | | 8 | 8 | |
| WRITE(A,t) | 16 | 16 | | 8 | 8 | <T,A,8> |
| INPUT(B) | 16 | 16 | 8 | 8 | 8 | |
| READ(B,t) | 8 | 16 | 8 | 8 | 8 | |
| t:=t*2 | 16 | 16 | 8 | 8 | 8 | |
| WRITE(B,t) | 16 | 16 | 16 | 8 | 8 | <T,B,8> |
| OUTPUT(A) | 16 | 16 | 16 | 16 | 8 | |
| OUTPUT(B) | 16 | 16 | 16 | 16 | 16 | |
| COMMIT | | | | | | <COMMIT T> |

Recovery with Undo Log
After a system crash, run recovery manager
• Idea 1. Decide for each transaction T whether it is completed or not
– <START T>….<COMMIT T>…. = yes
– <START T>….<ABORT T>……. = yes
– <START T>……………………… = no
• Idea 2. Undo all modifications by incomplete transactions

Recovery with Undo Log
Recovery manager:
• Read log from the end; cases:
<COMMIT T>: mark T as completed
<ABORT T>: mark T as completed
<T,X,v>: if T is not completed
    then write X=v to disk
    else ignore
<START T>: ignore
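
The case analysis above translates almost directly into code. This is a minimal sketch, assuming log records are encoded as tuples such as `("UPDATE", "T", "A", 8)`; that encoding and the function name are illustrative choices, not part of the lecture.

```python
def undo_recover(log, disk):
    """Scan the undo log from the end and restore the old values written by
    transactions that have no COMMIT or ABORT record."""
    completed = set()
    for record in reversed(log):
        kind = record[0]
        if kind in ("COMMIT", "ABORT"):
            completed.add(record[1])
        elif kind == "UPDATE":
            _, txn, element, old_value = record
            if txn not in completed:
                disk[element] = old_value  # write X = v to disk
        # START records are ignored
    return disk

log = [("START", "T"), ("UPDATE", "T", "A", 8), ("UPDATE", "T", "B", 8)]

# Crash before <COMMIT T> reached the log: both changes are undone.
print(undo_recover(log, {"A": 16, "B": 16}))                    # {'A': 8, 'B': 8}

# Crash after <COMMIT T>: nothing is undone.
print(undo_recover(log + [("COMMIT", "T")], {"A": 16, "B": 16}))  # {'A': 16, 'B': 16}
```

Running `undo_recover` a second time on its own result changes nothing, because it only writes old values that are already in place.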

Recovery with Undo Log
…
…
<T6,X6,v6>
…
…
<START T5>
<START T4>
<T1,X1,v1>
<T5,X5,v5>
<T4,X4,v4>
<COMMIT T5>
<T3,X3,v3>
<T2,X2,v2>
crash
Question 1: Which updates are undone?
Question 2: How far back do we need to read in the log?
Question 3: What happens if there is a second crash during recovery?

Recovery with Undo Log
• Note: all undo commands are idempotent
– If we perform them a second time, no harm done
– E.g. if there is a system crash during recovery, simply restart recovery from scratch

Recovery with Undo Log
When do we stop reading the log?
• We cannot stop until we reach the beginning of the log file
• This is impractical
Instead: use checkpointing

Checkpointing
Checkpoint the database periodically
• Stop accepting new transactions
• Wait until all current transactions complete
• Flush log to disk
• Write a <CKPT> log record, flush
• Resume transactions

Undo Recovery with Checkpointing
…
…
<T9,X9,v9>
…
…
(all completed)
[above the checkpoint: other transactions]
<CKPT>
[below the checkpoint: transactions T2, T3, T4, T5]
<START T2>
<START T3>
<START T5>
<START T4>
<T1,X1,v1>
<T5,X5,v5>
<T4,X4,v4>
<COMMIT T5>
<T3,X3,v3>
<T2,X2,v2>
During recovery, can stop at first <CKPT>
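
Stopping at the checkpoint is a one-line change to the backward scan. The sketch below assumes the same tuple encoding of log records as a teaching device, with `("CKPT",)` standing for a `<CKPT>` record; the log contents and the function name are made up.

```python
def undo_recover_ckpt(log, disk):
    """Backward undo scan that stops at the first <CKPT> it meets.
    Returns the repaired disk and the number of records read."""
    completed = set()
    scanned = 0
    for record in reversed(log):
        scanned += 1
        kind = record[0]
        if kind == "CKPT":
            break  # everything before a quiescent checkpoint is complete
        if kind in ("COMMIT", "ABORT"):
            completed.add(record[1])
        elif kind == "UPDATE":
            _, txn, element, old_value = record
            if txn not in completed:
                disk[element] = old_value
    return disk, scanned

log = [("START", "T9"), ("UPDATE", "T9", "X9", 90), ("COMMIT", "T9"),
       ("CKPT",),
       ("START", "T2"), ("START", "T5"), ("UPDATE", "T5", "X5", 50),
       ("COMMIT", "T5"), ("UPDATE", "T2", "X2", 20)]

disk, scanned = undo_recover_ckpt(log, {"X9": 99, "X5": 55, "X2": 22})
print(disk)     # {'X9': 99, 'X5': 55, 'X2': 20}
print(scanned)  # 6
```

Only T2 is incomplete, so only X2 is restored, and the three records before the checkpoint are never read.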

Nonquiescent Checkpointing
• Problem with checkpointing: database freezes during checkpoint
• Would like to checkpoint while database is operational
• Idea: nonquiescent checkpointing
Quiescent = being quiet, still, or at rest; inactive
Non-quiescent = allowing transactions to be active

Nonquiescent Checkpointing
• Write a <START CKPT(T1,…,Tk)> where T1,…,Tk are all active transactions.
• Continue normal operation
• When all of T1,…,Tk have completed, write <END CKPT>.

Undo Recovery with Nonquiescent Checkpointing
…
…
[earlier transactions plus T4, T5, T6]
…
<START CKPT T4, T5, T6>
…
…
[T4, T5, T6, plus later transactions]
…
<END CKPT>
…
…
[later transactions]
…
During recovery, can stop at first <CKPT>
Q: Do we need <END CKPT>?
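
One way to see what <END CKPT> buys is to write out the backward scan for both situations. The sketch below follows the usual textbook treatment, stated here as an assumption since the slide only poses the question: if <END CKPT> is seen first, the scan can stop at the matching <START CKPT>; if the crash hit in the middle of a checkpoint, the scan continues back to the <START> of every listed transaction that is still incomplete. The tuple encoding and names are illustrative.

```python
def undo_recover_nq(log, disk):
    """Backward undo scan for a log with nonquiescent checkpoints."""
    completed = set()
    pending = None      # incomplete transactions listed in an unfinished checkpoint
    seen_end = False
    for record in reversed(log):
        kind = record[0]
        if kind in ("COMMIT", "ABORT"):
            completed.add(record[1])
        elif kind == "UPDATE":
            _, txn, element, old_value = record
            if txn not in completed:
                disk[element] = old_value
        elif kind == "END_CKPT":
            seen_end = True
        elif kind == "START_CKPT":
            if seen_end:
                break   # checkpoint finished: nothing older can be incomplete
            pending = set(record[1]) - completed
            if not pending:
                break
        elif kind == "START" and pending is not None:
            pending.discard(record[1])
            if not pending:
                break   # reached the start of the oldest incomplete transaction
    return disk

log = [("START", "T4"), ("UPDATE", "T4", "X4", 4),
       ("START_CKPT", ["T4"]),
       ("START", "T7"), ("UPDATE", "T7", "X7", 7),
       ("COMMIT", "T4"),
       ("END_CKPT",),
       ("UPDATE", "T7", "Y7", 70)]

# Crash after <END CKPT>: only T7 is undone, and the scan stops at <START CKPT>.
print(undo_recover_nq(log, {"X4": 40, "X7": 77, "Y7": 700}))
# {'X4': 40, 'X7': 7, 'Y7': 70}

# Crash during the checkpoint: T4 is still incomplete, so the scan goes back to <START T4>.
print(undo_recover_nq(log[:5], {"X4": 40, "X7": 77, "Y7": 700}))
# {'X4': 4, 'X7': 7, 'Y7': 700}
```

In this sketch, meeting <END CKPT> first is what tells the scan it may stop at <START CKPT> without looking for older <START> records.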

Implementing ROLLBACK
• Recall: a transaction can end in COMMIT or ROLLBACK
• Idea: use the undo-log to implement ROLLBACK
• How?
– LSN = Log Sequence Number
– Log entries for the same transaction are linked, using the LSNs
– Read log in reverse, using LSN pointers
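
A sketch of the LSN chain: each record stores the LSN of the same transaction's previous record, so ROLLBACK can walk one transaction's entries without reading the others. The record layout, the `UndoLog` class, and using the list index as the LSN are assumptions made for illustration.

```python
class UndoLog:
    """Append-only undo log whose records carry the LSN of the same
    transaction's previous record."""
    def __init__(self):
        self.records = []   # index in this list = LSN
        self.last_lsn = {}  # txn -> LSN of its most recent record

    def append(self, txn, kind, element=None, old_value=None):
        lsn = len(self.records)
        prev_lsn = self.last_lsn.get(txn)
        self.records.append((txn, kind, element, old_value, prev_lsn))
        self.last_lsn[txn] = lsn
        return lsn

    def rollback(self, txn, store):
        """Follow the LSN pointers backward, restoring old values."""
        lsn = self.last_lsn.get(txn)
        while lsn is not None:
            _, kind, element, old_value, prev_lsn = self.records[lsn]
            if kind == "UPDATE":
                store[element] = old_value
            lsn = prev_lsn
        self.append(txn, "ABORT")

log = UndoLog()
store = {"A": 8, "B": 8}
log.append("T1", "START")
log.append("T2", "START")
log.append("T1", "UPDATE", "A", store["A"])
store["A"] = 16
log.append("T2", "UPDATE", "B", store["B"])
store["B"] = 99
log.rollback("T1", store)
print(store)  # {'A': 8, 'B': 99}
```

T2's interleaved update to B is untouched, since the walk visits only records linked to T1.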

Undo Logging Critique
• Works!
• But….
– Requires physical OUTPUT before transaction can commit
• Can cause unnecessary I/O ops if more updates will be done on the same buffer page soon
• What if two transactions share the same buffer page and only one is ready to commit? (this one is subtle – more later…)

Redo Logging
Log records
• <START T> = transaction T has begun
• <COMMIT T> = T has committed
• <ABORT T> = T has aborted
• <T,X,v> = T has updated element X, and its new value is v

| Action | t | Mem A | Mem B | Disk A | Disk B | Log |
|---|---|---|---|---|---|---|
| | | | | | | <START T> |
| READ(A,t) | 8 | 8 | | 8 | 8 | |
| t:=t*2 | 16 | 8 | | 8 | 8 | |
| WRITE(A,t) | 16 | 16 | | 8 | 8 | <T,A,16> |
| READ(B,t) | 8 | 16 | 8 | 8 | 8 | |
| t:=t*2 | 16 | 16 | 8 | 8 | 8 | |
| WRITE(B,t) | 16 | 16 | 16 | 8 | 8 | <T,B,16> |
| | | | | | | <COMMIT T> |
| OUTPUT(A) | 16 | 16 | 16 | 16 | 8 | |
| OUTPUT(B) | 16 | 16 | 16 | 16 | 16 | |
![Page 52: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/52.jpg)
Redo-Logging Rules
R1: If T modifies X, then both <T,X,v> and <COMMIT T> must be written to disk before OUTPUT(X)
• Hence: OUTPUTs are done late
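A minimal sketch of how rule R1 could be enforced, assuming a toy in-memory model. The class `RedoLogManager`, its method names, and the tuple encoding of log records are hypothetical and chosen only for illustration; they are not from the lecture.

```python
class RedoLogManager:
    """Toy buffer manager that refuses OUTPUT(X) unless rule R1 holds.

    Assumed (hypothetical) log record encoding:
      ("START", T), ("UPDATE", T, X, v), ("COMMIT", T)
    """

    def __init__(self, disk):
        self.disk = dict(disk)   # element -> value on disk
        self.mem = {}            # element -> value in the buffer pool
        self.log_tail = []       # log records still in memory
        self.log_disk = []       # log records already flushed to disk
        self.writers = {}        # element -> transactions that modified it

    def start(self, t):
        self.log_tail.append(("START", t))

    def write(self, t, x, v):
        # WRITE(X, v): change the buffer copy and append <T,X,v> to the log.
        self.mem[x] = v
        self.writers.setdefault(x, set()).add(t)
        self.log_tail.append(("UPDATE", t, x, v))

    def flush_log(self):
        # The log is flushed in order, so earlier records reach disk first.
        self.log_disk.extend(self.log_tail)
        self.log_tail.clear()

    def commit(self, t):
        self.log_tail.append(("COMMIT", t))
        self.flush_log()

    def output(self, x):
        # R1: <T,X,v> and <COMMIT T> must both be on disk before OUTPUT(X).
        for t in self.writers.get(x, ()):
            has_update = any(
                r[0] == "UPDATE" and r[1] == t and r[2] == x
                for r in self.log_disk
            )
            has_commit = ("COMMIT", t) in self.log_disk
            if not (has_update and has_commit):
                raise RuntimeError(f"R1 violated: OUTPUT({x}) before <COMMIT {t}>")
        self.disk[x] = self.mem[x]
```

In this model, calling `output("A")` before `commit("T")` raises an error, which is exactly why OUTPUTs end up being done late under redo logging.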
![Page 53: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/53.jpg)
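| Action | t | Mem A | Mem B | Disk A | Disk B | Log |
|---|---|---|---|---|---|---|
| | | | | | | <START T> |
| READ(A,t) | 8 | 8 | | 8 | 8 | |
| t:=t*2 | 16 | 8 | | 8 | 8 | |
| WRITE(A,t) | 16 | 16 | | 8 | 8 | <T,A,16> |
| READ(B,t) | 8 | 16 | 8 | 8 | 8 | |
| t:=t*2 | 16 | 16 | 8 | 8 | 8 | |
| WRITE(B,t) | 16 | 16 | 16 | 8 | 8 | <T,B,16> |
| | | | | | | <COMMIT T> |
| OUTPUT(A) | 16 | 16 | 16 | 16 | 8 | |
| OUTPUT(B) | 16 | 16 | 16 | 16 | 16 | |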
![Page 54: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/54.jpg)
Recovery with Redo Log
After a system crash, run the recovery manager
• Step 1. Decide for each transaction T whether it is completed or not
  – <START T>….<COMMIT T>…. = yes
  – <START T>….<ABORT T>……. = yes
  – <START T>……………………… = no
• Step 2. Read the log from the beginning, redo all updates of committed transactions
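The two steps above can be sketched as follows. This is an illustrative model only: the function name `redo_recover` and the tuple encoding of log records (`("START", T)`, `("UPDATE", T, X, v)`, `("COMMIT", T)`, `("ABORT", T)`) are assumptions, not part of the lecture.

```python
def redo_recover(log, disk):
    """Return the disk state after redo recovery over `log`.

    `log` is a list of tuples in the assumed encoding; `disk` maps
    element names to their values on disk at the time of the crash.
    """
    # Step 1: find the committed transactions. Aborted and incomplete
    # transactions never reached disk under rule R1, so they are skipped.
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}

    # Step 2: scan from the beginning and reapply committed updates.
    recovered = dict(disk)
    for rec in log:
        if rec[0] == "UPDATE":
            _, t, x, v = rec
            if t in committed:
                recovered[x] = v
    return recovered
```

Because each record stores the new value rather than a delta, reapplying an update that already reached disk is harmless, so recovery can itself be rerun after a second crash.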
![Page 55: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/55.jpg)
Recovery with Redo Log
<START T1>
<T1,X1,v1>
<START T2>
<T2,X2,v2>
<START T3>
<T1,X3,v3>
<COMMIT T2>
<T3,X4,v4>
<T1,X5,v5>
…
![Page 56: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/56.jpg)
Nonquiescent Checkpointing
• Write a <START CKPT(T1,…,Tk)> where T1,…,Tk are all active transactions
• Flush to disk all blocks of committed transactions (dirty blocks), while continuing normal operation
• When all blocks have been written, write <END CKPT>
![Page 57: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/57.jpg)
Redo Recovery with Nonquiescent Checkpointing

…
<START T1>
…
<COMMIT T1>  (all OUTPUTs of T1 are known to be on disk)
…
<START T4>
…
<START CKPT T4, T5, T6>
…
<END CKPT>
…
<START CKPT T9, T10>  (cannot use)
…

• Step 1: look for the last <END CKPT>
• Step 2: redo from the earliest start of T4, T5, T6, ignoring transactions committed earlier
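One way to picture these two steps in code, under stated assumptions: log records are tuples (`("START", T)`, `("UPDATE", T, X, v)`, `("COMMIT", T)`, `("START_CKPT", (T1, …))`, `("END_CKPT",)`), checkpoints do not overlap, and the log is well formed. The function names are hypothetical.

```python
def find_checkpoint(log):
    """Index of the <START CKPT> matching the last <END CKPT>, or None."""
    ends = [i for i, r in enumerate(log) if r[0] == "END_CKPT"]
    if not ends:
        return None  # no completed checkpoint in the log
    last_end = ends[-1]
    # Assumption: checkpoints do not overlap, so the matching start is the
    # closest <START CKPT> before the last <END CKPT>.
    return max(i for i in range(last_end) if log[i][0] == "START_CKPT")


def redo_recover_ckpt(log, disk):
    """Redo recovery that starts at the last completed checkpoint."""
    ckpt = find_checkpoint(log)
    if ckpt is None:
        begin, skip = 0, set()  # fall back to scanning the whole log
    else:
        active = set(log[ckpt][1])
        starts = [
            i for i in range(ckpt)
            if log[i][0] == "START" and log[i][1] in active
        ]
        # Step 2: begin at the earliest start of the active transactions.
        begin = min(starts, default=ckpt)
        # Transactions committed before <START CKPT> were flushed by the
        # checkpoint, so their updates are ignored.
        skip = {r[1] for r in log[:ckpt] if r[0] == "COMMIT"}

    committed = {r[1] for r in log if r[0] == "COMMIT"}
    recovered = dict(disk)
    for rec in log[begin:]:
        if rec[0] == "UPDATE" and rec[1] in committed and rec[1] not in skip:
            recovered[rec[2]] = rec[3]
    return recovered
```

A trailing `<START CKPT …>` with no `<END CKPT>` after it is never selected by `find_checkpoint`, which mirrors the "cannot use" annotation in the figure.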
![Page 58: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/58.jpg)
Comparison Undo/Redo
• Undo logging (Steal/Force):
  – OUTPUT must be done early
  – If <COMMIT T> is seen, T definitely has written all its data to disk (hence, don’t need to redo); inefficient
• Redo logging (No-Steal/No-Force):
  – OUTPUT must be done late
  – If <COMMIT T> is not seen, T definitely has not written any of its data to disk (hence there is no dirty data on disk, no need to undo); inflexible
• Would like more flexibility on when to OUTPUT: undo/redo logging (next) (Steal/No-Force)
![Page 59: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/59.jpg)
Undo/Redo Logging
Log records, only one change
• <T,X,u,v> = T has updated element X, its old value was u, and its new value is v
![Page 60: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/60.jpg)
Undo/Redo-Logging Rule
UR1: If T modifies X, then <T,X,u,v> must be written to disk before OUTPUT(X)
Note: we are free to OUTPUT early or late relative to <COMMIT T>
![Page 61: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/61.jpg)
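| Action | t | Mem A | Mem B | Disk A | Disk B | Log |
|---|---|---|---|---|---|---|
| | | | | | | <START T> |
| READ(A,t) | 8 | 8 | | 8 | 8 | |
| t:=t*2 | 16 | 8 | | 8 | 8 | |
| WRITE(A,t) | 16 | 16 | | 8 | 8 | <T,A,8,16> |
| READ(B,t) | 8 | 16 | 8 | 8 | 8 | |
| t:=t*2 | 16 | 16 | 8 | 8 | 8 | |
| WRITE(B,t) | 16 | 16 | 16 | 8 | 8 | <T,B,8,16> |
| OUTPUT(A) | 16 | 16 | 16 | 16 | 8 | |
| | | | | | | <COMMIT T> |
| OUTPUT(B) | 16 | 16 | 16 | 16 | 16 | |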
Can OUTPUT whenever we want: before/after COMMIT
![Page 62: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/62.jpg)
Recovery with Undo/Redo Log
After a system crash, run the recovery manager
• Redo all committed transactions, top-down
• Undo all uncommitted transactions, bottom-up
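A rough sketch of these two passes, assuming log records are encoded as tuples (`("START", T)`, `("UPDATE", T, X, old, new)`, `("COMMIT", T)`). The function name `undo_redo_recover` and this encoding are illustrative assumptions; the passes run in the order listed on the slide.

```python
def undo_redo_recover(log, disk):
    """Return the disk state after undo/redo recovery over `log`."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    recovered = dict(disk)

    # Redo pass: top-down, reapply the new value of every committed update.
    for rec in log:
        if rec[0] == "UPDATE" and rec[1] in committed:
            _, _t, x, _old, new = rec
            recovered[x] = new

    # Undo pass: bottom-up, restore the old value of every uncommitted update.
    for rec in reversed(log):
        if rec[0] == "UPDATE" and rec[1] not in committed:
            _, _t, x, old, _new = rec
            recovered[x] = old

    return recovered
```

Applied to the example table above: if the crash happens after OUTPUT(A) but before `<COMMIT T>` reaches the log, the undo pass resets A from 16 to 8; if it happens after `<COMMIT T>` but before OUTPUT(B), the redo pass sets B to 16.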
![Page 63: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/63.jpg)
Recovery with Undo/Redo Log
<START T1>
<T1,X1,v1>
<START T2>
<T2,X2,v2>
<START T3>
<T1,X3,v3>
<COMMIT T2>
<T3,X4,v4>
<T1,X5,v5>
…
![Page 64: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/64.jpg)
Granularity of the Log
• Physical logging: element = physical page
• Logical logging: element = data record
• What are the pros and cons?
![Page 65: Introduction to Database Systems CSE 444 … · Pl tt Unit of read or write: disk block Platters Arm movement disk block Once in memory: page Typically: 4k or 8k or 16k Arm assembly](https://reader030.fdocuments.net/reader030/viewer/2022040415/5f32f3543833a442b976adc3/html5/thumbnails/65.jpg)
Granularity of the Log
• Modern DBMS:
• Physical logging for the REDO part
  – Efficiency
• Logical logging for the UNDO part
  – For ROLLBACKs