Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent...
Transcript of Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent...
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
1
Chapter 11Transaction ManagementConcurrent and Consistent Data Access
Architecture and Implementation of Database SystemsSummer 2014
Torsten GrustWilhelm-Schickard-Institut für Informatik
Universität Tübingen
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
2
The “Hello World” of Transaction Management
• My bank issued me a debit card to access my account.
• Every once in a while, I’d use it at an ATM to draw somemoney from my account, causing the ATM to perform atransaction in the bank’s database.
Example (ATM transaction)
1 bal ← read_bal (acct_no) ;2 bal ← bal − 100¤ ;3 write_bal (acct_no, bal) ;
• My account is properly updated to reflect the new balance.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
3
Concurrent Access
The problem is: My wife has a card for the very same account, too.
⇒ We might end up using our cards at different ATMs at thesame time, i.e., concurrently.
Example (Concurrent ATM transactions)
me
my wife DB state
bal ← read (acct) ;
1200bal ← read (acct) ; 1200
bal ← bal − 100 ;
1200bal ← bal − 200 ; 1200
write (acct, bal) ;
1100write (acct, bal) ; 1000
• The first update was lost during this execution. Lucky me!
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
3
Concurrent Access
The problem is: My wife has a card for the very same account, too.
⇒ We might end up using our cards at different ATMs at thesame time, i.e., concurrently.
Example (Concurrent ATM transactions)
me my wife
DB state
bal ← read (acct) ;
1200
bal ← read (acct) ;
1200
bal ← bal − 100 ;
1200
bal ← bal − 200 ;
1200
write (acct, bal) ;
1100
write (acct, bal) ;
1000
• The first update was lost during this execution. Lucky me!
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
3
Concurrent Access
The problem is: My wife has a card for the very same account, too.
⇒ We might end up using our cards at different ATMs at thesame time, i.e., concurrently.
Example (Concurrent ATM transactions)
me my wife DB state
bal ← read (acct) ; 1200bal ← read (acct) ; 1200
bal ← bal − 100 ; 1200bal ← bal − 200 ; 1200
write (acct, bal) ; 1100write (acct, bal) ; 1000
• The first update was lost during this execution. Lucky me!
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
4
If the Plug is Pulled . . .
• This time, I want to transfer money over to another account.
Example (Money transfer transaction)
// Subtract money from source (checking) account
1 chk_bal ← read_bal (chk_acct_no) ;2 chk_bal ← chk_bal − 500¤ ;3 write_bal (chk_acct_no, chk_bal) ;
// Credit money to the target (savings) account
4 sav_bal ← read_bal (sav_acct_no) ;5 sav_bal ← sav_bal + 500¤ ;6 7 write_bal (sav_acct_no, sav_bal) ;
• Before the transaction gets to step 7, its execution isinterrupted/cancelled (power outage, disk failure, softwarebug, . . . ). My money is lost /.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
5
ACID Properties
To prevent these (and many other) effects from happening, aDBMS guarantees the following transaction properties:
AtomicityA Either all or none of the updates in a databasetransaction are applied.
ConsistencyC Every transaction brings the database from oneconsistent state to another. (While the transactionexecutes, the database state may be temporarilyinconsistent.)
IsolationI A transaction must not see any effect from othertransactions that run in parallel.
DurabilityD The effects of a successful transaction remainpersistent and may not be undone for systemreasons.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
6
Concurrency Control
data files, indices, . . .
Disk Space Manager
Buffer Manager
Files and Access Methods
Operator Evaluator Optimizer
Executor Parser
RecoveryManager
Lock Manager
TransactionManager
DBMS
Database
SQL Commands
Web Forms Applications SQL Interface
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
7
Anomalies: Lost Update
• We already saw an example of the lost update anomaly onslide 3:
The effects of one transaction are lost due to anuncontrolled overwrite performed by the secondtransaction.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
8
Anomalies: Inconsistent Read
Reconsider the money transfer example (slide 4), expressed inSQL syntax:
Example
Transaction 1 Transaction 21 UPDATE Accounts2 SET balance = balance - 5003 WHERE customer = 19044 AND account_type = ’C’;
1 SELECT SUM(balance)2 FROM Accounts3 WHERE customer = 1904;
5 UPDATE Accounts6 SET balance = balance + 5007 WHERE customer = 19048 AND account_type = ’S’;
• Transaction 2 sees a temporary, inconsistent database state.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
9
Anomalies: Dirty Read
At a different day, my wife and me again end up in front of anATM at roughly the same time. This time, my transaction iscancelled (aborted):
Example
me my wife DB state
bal ← read (acct) ; 1200bal ← bal − 100 ; 1200write (acct, bal) ; 1100
bal ← read (acct) ; 1100bal ← bal − 200 ; 1100
abort ; 1200write (acct, bal) ; 900
• My wife’s transaction has already read the modified accountbalance before my transaction was rolled back (i.e., itseffects are undone).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
9
Anomalies: Dirty Read
At a different day, my wife and me again end up in front of anATM at roughly the same time. This time, my transaction iscancelled (aborted):
Example
me my wife DB state
bal ← read (acct) ; 1200bal ← bal − 100 ; 1200write (acct, bal) ; 1100
bal ← read (acct) ; 1100bal ← bal − 200 ; 1100
abort ; 1200
write (acct, bal) ; 900
• My wife’s transaction has already read the modified accountbalance before my transaction was rolled back (i.e., itseffects are undone).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
9
Anomalies: Dirty Read
At a different day, my wife and me again end up in front of anATM at roughly the same time. This time, my transaction iscancelled (aborted):
Example
me my wife DB state
bal ← read (acct) ; 1200bal ← bal − 100 ; 1200write (acct, bal) ; 1100
bal ← read (acct) ; 1100bal ← bal − 200 ; 1100
abort ; 1200write (acct, bal) ; 900
• My wife’s transaction has already read the modified accountbalance before my transaction was rolled back (i.e., itseffects are undone).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
10
Concurrent Execution
• The scheduler decides the execution order of concurrentdatabase accesses.
The transaction scheduler
Client 1 Client 2 Client 3
Scheduler
Access and Storage Layer
32
1
2
1
32
1
1
2
1
1
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
11
Database Objects and Accesses
• We now assume a slightly simplified model of databaseaccess:
1 A database consists of a number of named objects. In agiven database state, each object has a value.
2 Transactions access an object o using the twooperations read o and write o.
• In a relational DBMS we have that
object ≡ attribute .
This defines the granularity of our discussion. Otherpossible granularities:
object ≡ row, object ≡ table .
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
12
Transactions
Database transaction
A database transaction T is a (strictly ordered) sequence ofsteps. Each step is a pair of an access operation applied to anobject.
• Transaction T = 〈s1, . . . , sn〉• Step si = (ai, ei)
• Access operation ai ∈ {r(ead), w(rite)}The length of a transaction T is its number of steps |T| = n.
We could write the money transfer transaction as
T = 〈 (read, Checking), (write, Checking),(read, Saving), (write, Saving) 〉
32
1
or, more concisely,
T = 〈r(C),w(C), r(S),w(S)〉 .
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
13
Schedules
Schedule
A schedule S for a given set of transactions T = {T1, . . . , Tn} is anarbitrary sequence of execution steps
S(k) = (Tj, ai, ei) k = 1 . . .m ,2
1
1
such that
1 S contains all steps of all transactions and nothing else and
2 the order among steps in each transaction Tj is preserved:
(ap, ep) < (aq, eq) in Tj ⇒ (Tj, ap, ep) < (Tj, aq, eq) in S
(read “<” as: occurs before).
We sometimes writeS = 〈r1(B), r2(B),w1(B),w2(B)〉
to abbreviateS(1) = (T1, read, B) S(3) = (T1, write, B)S(2) = (T2, read, B) S(4) = (T2, write, B)
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
14
Serial Execution
Serial execution
One particular schedule is serial execution.
• A schedule S is serial iff, for each contained transaction Tj, allits steps are adjacent (no interleaving of transactions andthus no concurrency).
Briefly:
S = Tπ1, Tπ2, . . . , Tπn (for some permutation π of 1, . . . , n)
Consider again the ATM example from slide 3.
• S = 〈r1(B), r2(B),w1(B),w2(B)〉• This is a schedule, but it is not serial.
2
2
1
1
If my wife had gone to the bank one hour later (initiatingtransaction T2), the schedule probably would have been serial.
• S = 〈r1(B),w1(B), r2(B),w2(B)〉2
1
2
1
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
15
Correctness of Serial Execution
• Anomalies such as the “lost update” problem on slide 3 canonly occur in multi-user mode.
• If all transactions were fully executed one after another (noconcurrency), no anomalies would occur.
⇒ Any serial execution is correct.
• Disallowing concurrent access, however, is not practical.
Correctness criterion
Allow concurrent executions if their overall effect is equivalentto an (arbitrary) serial execution.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
16
Conflicts
What does it mean for a schedule S to be equivalent to anotherschedule S′?
• Sometimes, we may be able to reorder steps in a schedule.• We must not change the order among steps of any
transaction Tj (↗ slide 13).• Rearranging operations must not lead to a different
result.
• Two operations (Ti, a, e) and (Tj, a′, e′) are said to be inconflict (Ti, a, e) = (Tj, a′, e′) if their order of executionmatters.
• When reordering a schedule, we must not change therelative order of such operations.
• Any schedule S′ that can be obtained this way from S is saidto be conflict equivalent to S.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
17
Conflicts
Based on our read/write model, we can come up with a moremachine-friendly definition of a conflict.
Conflicting operations
Two operations (Ti, a, e) and (Tj, a′, e′) are in conflict (=) in S if
1 they belong to two different transactions (Ti 6= Tj), and
2 they access the same database object, i.e., e = e′, and
3 at least one of them is a write operation.
• This inspires the following conflict matrix:read write
read ×write × ×
• Conflict relation ≺S:
(Ti, a, e) ≺S (Tj, a′, e′):=
(Ti, a, e) = (Tj, a′, e′) ∧ (Ti, a, e) occurs before (Tj, a′, e′) in S
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
18
Conflict Serializability
Conflict serializability
A schedule S is conflict serializable iff it is conflict equivalent tosome serial schedule S′.
• The execution of a conflict-serializable S schedule iscorrect.
• Note: S does not have to be a serial schedule.�
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
19
Serializability: Example
Example (Three schedules Si for two transactions T1,2, with S2 serial)
Schedule S1
T1 T2
read Awrite A
read Awrite A
read Bwrite B
read Bwrite B
Schedule S2
T1 T2
read Awrite Aread Bwrite B
read Awrite Aread Bwrite B
Schedule S3
T1 T2
read Awrite A
read Awrite Aread Bwrite B
read Bwrite B
• Conflict relations:
(T1, r, A) ≺S1 (T2, w, A), (T1, r, B) ≺S1 (T2, w, B),(T1, w, A) ≺S1 (T2, r, A), (T1, w, B) ≺S1 (T2, r, B),(T1, w, A) ≺S1 (T2, w, A), (T1, w, B) ≺S1 (T2, w, B)
(Note: ≺S2 = ≺S1 )
(T1, r, A) ≺S3 (T2, w, A), (T2, r, B) ≺S3 (T1, w, B),(T1, w, A) ≺S3 (T2, r, A), (T2, w, B) ≺S3 (T1, r, B),(T1, w, A) ≺S3 (T2, w, A), (T2, w, B) ≺S3 (T1, w, B)
⇒ S1serializable
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
20
The Conflict Graph
• The serializability idea comes with an effective test for thecorrectness of a schedule S based on its conflict graph G(S)(also: serialization graph):
• The nodes of G(S) are all transactions Ti in S.• There is an edge Ti → Tj iff S contains operations(Ti, a, e) and (Tj, a′, e′) such that (Ti, a, e) ≺S (Tj, a′, e′)
(read: in a conflict equivalent serial schedule, Ti mustoccur before Tj).
• S is conflict serializable iff G(S) is acyclic.
An equivalent serial schedule for S may be immediatelyobtained by sorting G(S) topologically.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
21
Serialization Graph
Example (ATM transactions (↗ slide 3))
• S = 〈r1(A), r2(A),w1(A),w2(A)〉• Conflict relation:(T1, r,A) ≺S (T2, w,A)(T2, r,A) ≺S (T1, w,A)(T1, w,A) ≺S (T2, w,A)
T1
T2
⇒ not serializable
Example (Two money transfers (↗ slide 4))
• S = 〈r1(C),w1(C), r2(C),w2(C), r1(S),w1(S), r2(S),w2(S)〉• Conflict relation:(T1, r, C) ≺S (T2, w, C)(T1, w, C) ≺S (T2, r, C)(T1, w, C) ≺S (T2, w, C)
...
T1
T2
⇒ serializable
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
21
Serialization Graph
Example (ATM transactions (↗ slide 3))
• S = 〈r1(A), r2(A),w1(A),w2(A)〉• Conflict relation:(T1, r,A) ≺S (T2, w,A)(T2, r,A) ≺S (T1, w,A)(T1, w,A) ≺S (T2, w,A)
T1
T2
⇒ not serializable
Example (Two money transfers (↗ slide 4))
• S = 〈r1(C),w1(C), r2(C),w2(C), r1(S),w1(S), r2(S),w2(S)〉• Conflict relation:(T1, r, C) ≺S (T2, w, C)(T1, w, C) ≺S (T2, r, C)(T1, w, C) ≺S (T2, w, C)
...
T1
T2
⇒ serializable
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
21
Serialization Graph
Example (ATM transactions (↗ slide 3))
• S = 〈r1(A), r2(A),w1(A),w2(A)〉• Conflict relation:(T1, r,A) ≺S (T2, w,A)(T2, r,A) ≺S (T1, w,A)(T1, w,A) ≺S (T2, w,A)
T1
T2
⇒ not serializable
Example (Two money transfers (↗ slide 4))
• S = 〈r1(C),w1(C), r2(C),w2(C), r1(S),w1(S), r2(S),w2(S)〉• Conflict relation:(T1, r, C) ≺S (T2, w, C)(T1, w, C) ≺S (T2, r, C)(T1, w, C) ≺S (T2, w, C)
...
T1
T2
⇒ serializable
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
22
Query Scheduling
Can we build a scheduler that always emits a serializableschedule?
Idea:
• Require each transaction toobtain a lock before itaccesses a data object o:
Locking and unlocking of o
1 lock o ;2 . . . access o . . . ;3 unlock o ;
• This prevents concurrentaccess to o.
Client 1 Client 2 Client 3
Scheduler
Access and Storage Layer
32
1
2
1
32
1
2
1
1
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
23
Locking
• If a lock cannot be granted (e.g., because another transactionT ′ already holds a conflicting lock) the requestingtransaction T gets blocked.
• The scheduler suspends execution of the blockedtransaction T .
• Once T ′ releases its lock, it may be granted to T , whoseexecution is then resumed.
⇒ Since other transactions can continue execution while T isblocked, locks can be used to control the relative order ofoperations.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
24
Locking and Scheduling
Example (Locking and scheduling)
• Consider twotransactions T1,2:
T1
lock Awrite Alock Bunlock Awrite Bunlock B
T2
lock Awrite Alock Bwrite Bwrite Aunlock Awrite Bunlock B
• Two valid schedules (respecting lock andunlock calls) are:
Schedule S1
T1 T2
lock Awrite Alock Bunlock A
lock Awrite A
write Bunlock B
lock Bwrite Bwrite Aunlock Awrite Bunlock B
Schedule S2
T1 T2
lock Awrite Alock Bwrite Bwrite Aunlock A
lock Awrite A
write Bunlock B
lock Bwrite Bunlock Bunlock A
• Note: Both schedules S1,2 are serializable. Are we done yet?
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
25
Locking and Serializability
Example (Proper locking does not guarantee serializability yet)
Even if we adhere to a properly nested lock/unlock discipline,the scheduler might still yield non-serializiable schedules:
Schedule S1T1 T2lock Alock Cwrite Awrite Cunlock A
lock Awrite Alock Bunlock Awrite Bunlock B
unlock Clock Bwrite Bunlock B
lock Cwrite Cunlock C
� What is the conflict graph of thisschedule?
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
26
ATM Transaction with Locking
Example (Two concurrent ATM transactions with locking)
Transaction 1 Transaction 2 DB state
lock (acct) ;
1200read (acct) ;
unlock (acct) ;lock (acct) ;
read (acct) ;
unlock (acct) ;lock (acct) ;
write (acct) ; 1100
unlock (acct) ;lock (acct) ;
write (acct) ; 1000
unlock (acct) ;
⇒ Again: on its own, proper locking does not guaranteeserializability yet.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
26
ATM Transaction with Locking
Example (Two concurrent ATM transactions with locking)
Transaction 1 Transaction 2 DB state
lock (acct) ; 1200read (acct) ;unlock (acct) ;
lock (acct) ;read (acct) ;unlock (acct) ;
lock (acct) ;write (acct) ; 1100unlock (acct) ;
lock (acct) ;write (acct) ; 1000unlock (acct) ;
⇒ Again: on its own, proper locking does not guaranteeserializability yet.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
27
Two-Phase Locking (2PL)
The two-phase locking protocol poses an additional restrictionon how transactions have to be written:
Definition (Two-Phase Locking)
• Once a transaction has released any lock (i.e., performed thefirst unlock), it must not acquire any new lock:
lock phase release phase
lock point
# of locks held
time
• Two-phase locking is the concurrency control protocol usedin database systems today.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
28
Again: ATM Transaction
Example (Two concurrent ATM transactions with locking, ¬ 2PL)
Transaction 1 Transaction 2 DB state
lock (acct) ; 1200read (acct) ;unlock (acct) ;
lock (acct) ;read (acct) ;unlock (acct) ;
lock (acct) ;
write (acct) ; 1100unlock (acct) ;
lock (acct) ;
write (acct) ; 1000unlock (acct) ;
These locks violate the 2PL principle.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
28
Again: ATM Transaction
Example (Two concurrent ATM transactions with locking, ¬ 2PL)
Transaction 1 Transaction 2 DB state
lock (acct) ; 1200read (acct) ;unlock (acct) ;
lock (acct) ;read (acct) ;unlock (acct) ;
lock (acct) ; write (acct) ; 1100unlock (acct) ;
lock (acct) ; write (acct) ; 1000unlock (acct) ;
These locks violate the 2PL principle.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
29
A 2PL-Compliant ATM Transaction
• To comply with the two-phase locking protocol, the ATMtransaction must not acquire any new locks after a first lockhas been released:
A 2PL-compliant ATM withdrawal transaction
1 lock (acct) ;2 bal ← read_bal (acct) ;3 bal ← bal − 100¤ ;4 write_bal (acct, bal) ;5 unlock (acct) ;
lock phase
unlock phase
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
30
Resulting Schedule
Example
Transaction 1 Transaction 2 DB state
lock (acct) ; 1200
read (acct) ;
lock (acct) ;
write (acct) ; 1100
unlock (acct) ;
lock (acct) ;
read (acct) ;
write (acct) ; 900
unlock (acct) ;
Transactionblocked
• Theorem: The use of 2PL-compliant locking always leads toa correct and serializable schedule or to a deadlock.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
30
Resulting Schedule
Example
Transaction 1 Transaction 2 DB state
lock (acct) ; 1200
read (acct) ;
lock (acct) ;
write (acct) ; 1100
unlock (acct) ;
lock (acct) ;
read (acct) ;
write (acct) ; 900
unlock (acct) ;
Transactionblocked
• Theorem: The use of 2PL-compliant locking always leads toa correct and serializable schedule or to a deadlock.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
31
Lock Modes
• We saw earlier that two read operations do not conflict witheach other.
• Systems typically use different types of locks (lock modes) toallow read operations to run concurrently.
• read locks or shared locks: mode S• write locks or exclusive locks: mode X
• Locks are only in conflict if at least one of them is an X lock:
Shared vs. exclusive lock compatibility
shared (S) exclusive (X)shared (S) ×
exclusive (X) × ×
• It is a safe operation in two-phase locking to (try to) converta shared lock into an exclusive lock during the lockphase (lock upgrade)⇒ improved concurrency.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
32
Deadlocks
• Like many lock-based protocols, two-phase locking has therisk of deadlock situations:
Example (Proper schedule with locking)
Transaction 1 Transaction 2
lock (A) ;... lock (B) ;
do something...
... do something
lock (B) ;...
[ wait for T2 to release lock ] lock (A) ;[ wait for T1 to release lock ]
• Both transactions would wait for each other indefinitely.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
33
Deadlock Handling
• Deadlock detection:
1 The system maintains a waits-for graph, where an edgeT1 → T2 indicates that T1 is blocked by a lock held by T2.
2 Periodically, the system tests for cycles in the graph.3 If a cycle is detected, the deadlock is resolved by
aborting one or more transactions.4 Selecting the victim is a challenge:
• Aborting young transactions may lead tostarvation: the same transaction may be cancelledagain and again.
• Aborting an old transaction may cause a lot ofcomputational investment to be thrown away (butthe undo costs may be high).
• Deadlock prevention:Define an ordering� on all database objects. If A� B,then order the lock operations in all transactions in thesame way (lock(A) before lock(B)).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
33
Deadlock Handling
• Deadlock detection:
1 The system maintains a waits-for graph, where an edgeT1 → T2 indicates that T1 is blocked by a lock held by T2.
2 Periodically, the system tests for cycles in the graph.3 If a cycle is detected, the deadlock is resolved by
aborting one or more transactions.4 Selecting the victim is a challenge:
• Aborting young transactions may lead tostarvation: the same transaction may be cancelledagain and again.
• Aborting an old transaction may cause a lot ofcomputational investment to be thrown away (butthe undo costs may be high).
• Deadlock prevention:Define an ordering� on all database objects. If A� B,then order the lock operations in all transactions in thesame way (lock(A) before lock(B)).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
34
Deadlock Handling
Other common technique:
• Deadlock detection via timeout:Let a transaction T block on a lock request only until atimeout occurs (counter expires). On expiration, assume thata deadlock has occurred and abort T .
Timeout-based deadlock detection
db2 => GET DATABASE CONFIGURATION;
.
.
.Interval for checking deadlock (ms) (DLCHKTIME) = 10000Lock timeout (sec) (LOCKTIMEOUT) = 30
.
.
.
• Also: lock-less optimistic concurrency control (↗ slide 42).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
35
Variants of Two-Phase Locking
• The two-phase locking discipline does not prescribe exactlywhen locks have to be acquired and released.
• Two possible variants:
Preclaiming and strict 2PL
“lock phase” release phase
locksheld
time
Preclaiming 2PL
lock phase “release phase”
locksheld
time
Strict 2PL
� What could motivate either variant?
1 Preclaiming 2PL:
2 Strict 2PL:
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
36
Cascading Rollbacks
Consider three transactions:
Transations T1,2,3, T1 fails later on
abort ;w(x)
r(x)
r(x)
T1
T2
T3
timet2t1
• When transaction T1 aborts, transactions T2 and T3 havealready read data written by T1 (↗ dirty read, slide 9)
• T2 and T3 need to be rolled back, too (cascading roll back).
• T2 and T3 cannot commit until the fate of T1 is known.
⇒ Strict 2PL can avoid cascading roll backs altogether. (How?)
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
37
Granularity of Locking
The granularity of locking is a trade-off:
database level
tablespace level1
table level
page level
row-level high concurrency
low concurrency
high overhead
low overhead
⇒ Idea: multi-granularity locking.
1An DB2 tablespace represents a collection of tables that share a physicalstorage location.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
38
Multi-Granularity Locking
• Decide the granularity of locks held for each transaction(depending on the characteristics of the transaction):
• For example, aquire a row lock for
Row-selecting query (C_CUSTKEY is key)
1 SELECT * Q12 FROM CUSTOMERS3 WHERE C_CUSTKEY = 42
and a table lock for
Table scan query
1 SELECT * Q22 FROM CUSTOMERS
• How do such transactions know about each others’ locks?
• Note that locking is performance-critical. Q2 does notwant to do an extensive search for row-level conflicts.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
39
Intention Locks
Databases use an additional type of locks: intention locks.
• Lock mode intention share: IS
• Lock mode intention exclusive: IX
Extended conflict matrix
S X IS IX
S × ×X × × × ×IS ×IX × ×
• A lock I� on a coarser level of granularity means that thereis some � lock on a lower level.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
40
Intention Locks
Multi-granularity locking protocol
1 Before a granule g can be locked in � ∈ {S, X} mode, thetransaction has to obtain an I� lock on all coarsergranularities that contain g.
2 If all intention locks could be granted, the transaction canlock granule g in the announced � mode.
Example (Multi-granularity locking)
Query Q1 (↗ slide 38) would, e.g.,
• obtain an IS lock on table CUSTOMERS
(also on the containing tablespace and database) and
• obtain an S lock on the row(s) with C_CUSTKEY = 42.
Query Q2 would place an
• S lock on table CUSTOMERS
(and an IS lock on tablespace and database).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
41
Detecting Conflicts
• Now suppose an updating query comes in:
Update request
1 UPDATE CUSTOMERS Q32 SET NAME = ’Seven Teen’3 WHERE C_CUSTKEY = 17
• Q3 will want to place
• an IX lock on table CUSTOMER (and all coarser granules)and
• an X lock on the row holding customer 17.
As such it is• compatible with Q1
(there is no conflict between IX and IS on the tablelevel),
• but incompatible with Q2
(the table-level S lock held by Q2 is in conflict with Q3’sIX lock request).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
42
Optimistic Concurrency Control
• Up to here, the approach to concurrency control has beenpessimistic:
• Assume that transactions will conflict and thus protectdatabase objects by locks and lock protocols.
• The converse is a optimistic concurrency control approach:• Hope for the best and let transactions freely proceed
with their read/write operations.• Only just before updates are to be committed to the
database, perform a check to see whether conflictsindeed did not happen.
• Rationale: Non-serializable conflicts are not that frequent.Save the locking overhead in the majority of cases andonly invest effort if really required.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
43
Optimistic Concurrency Control
Under optimistic concurrency control, transactions proceed inthree phases:
Optimistic concurrency control
1 Read Phase. Execute transaction, but do not write data backto disk immediately. Instead, collect updates in thetransaction’s private workspace.
2 Validation Phase. When the transaction wants to commit,test whether its execution was correct (only acceptableconflicts happened). If it is not, abort the transaction.
3 Write Phase. Transfer data from private workspace intodatabase.
• Note: Phases 2 and 3 need to be performed in anon-interruptible critical section (thus also called theval-write phase).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
44
Validating Transactions
Validation is typically implemented by looking at transaction Tj’s
• Read Set RS(Tj) (attributes read by Tj) and
• Write Set WS(Tj) (attributes written by Tj).
Optimistic Concurrency Control
Compare Tj against all committed transactions Ti.Check succeeds if
Ti committed before Tj startedor
(RS(Tj) ∪WS(Tj)) ∩WS(Ti) = ∅ .
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
45
Optimistic Concurrency Control in IBM DB2
• DB2 V9.5 provides SQL-level constructs that enable databaseapplications to implement optimistic concurrency control:
• RID(r): return row identifier for row r,• ROW CHANGE TOKEN FOR r: unique number reflecting
the time row r has last been updated.
Optimistic concurrency control
1 db2 => SELECT * FROM EMPLOYEES2
3 ID NAME DEPT SALARY4 ----------- ---- ---- -----------5 1 Alex DE 3006 2 Bert DE 1007 3 Cora DE 2008 4 Drew US 2009 5 Erik US 400
10
11 db2 => SELECT E.NAME, E.SALARY,12 RID(E) AS RID, ROW CHANGE TOKEN FOR E AS TOKEN13 FROM EMPLOYEES E14 WHERE E.NAME = ’Erik’15
16 NAME SALARY RID TOKEN17 ---- ----------- -------------------- --------------------------18 Erik 400 16777224 74904229642240
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
46
Optimistic Concurrency Control in IBM DB2
Optimistic concurrency control
• The ’Erik’ row belongs to our read set. Save its RID and TOKEN
values to perform validation later.
• (. . . Time passes . . . ) Now try to save our changes to the row.Perform BOCC-style validation:
1 db2 => UPDATE EMPLOYEES E2 SET E.SALARY = 4503 WHERE RID(E) = 16777224 -- identify row4 AND ROW CHANGE TOKEN FOR E = 74904229642240 -- row changed?5
6 SQL0100W No row was found for FETCH, UPDATE or DELETE; or the7 result of a query is an empty table. SQLSTATE=020008
9 db2 => SELECT E.ID, E.NAME10 RID(E) AS RID, ROW CHANGE TOKEN FOR E AS TOKEN11 FROM EMPLOYEES E12
13 ID NAME RID TOKEN14 ----------- ---- -------------------- --------------------15 1 Alex 16777220 7490422964224016 2 Bert 16777221 7490422964224017 3 Cora 16777222 7490422964224018 4 Drew 16777223 7490422964224019 5 Erik 16777224 141378732653941710
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
47
Multi-Version Concurrency Control
Looking back at the concurrency control strategies discussed upto this point, we have seen
1 Wait Mechanisms, i.e., locks and the associated two-phaselocking protocol,
2 Rollback Mechanisms, i.e., a conditional write phase thatmakes it trivial to take back any changes made by atransaction.
We now add
3 Timestamp Mechanisms that use a DBMS-wide clock toorder transactions and decide visibility of rows.
The resulting Multi-Version Concurrency Control (MVCC)protocol is lock-less but comes with a space overhead (thatrequires garbage collection of rows).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
47
Multi-Version Concurrency Control
Looking back at the concurrency control strategies discussed upto this point, we have seen
1 Wait Mechanisms, i.e., locks and the associated two-phaselocking protocol,
2 Rollback Mechanisms, i.e., a conditional write phase thatmakes it trivial to take back any changes made by atransaction.
We now add
3 Timestamp Mechanisms that use a DBMS-wide clock toorder transactions and decide visibility of rows.
The resulting Multi-Version Concurrency Control (MVCC)protocol is lock-less but comes with a space overhead (thatrequires garbage collection of rows).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
48
MVCC: Timestamps
• In MVCC, each transaction Ti is assigned a timestamp ti thatrepresents the point in time when Ti started.
• Can implement timestamps based on• actual system clock (resolution, uniqueness, portability
across OSs?),or
• a DBMS-internal counter used to assign transaction IDs(xid).
• Timestamp requirements:1 unique: ti 6= tj if i 6= j,2 ordered: ti < tj if Ti has started before Tj.
MVCC: Semantics
Under MVCC, a transaction Ti operates on the consistent state ofthe database that was current at time ti.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
48
MVCC: Timestamps
• In MVCC, each transaction Ti is assigned a timestamp ti thatrepresents the point in time when Ti started.
• Can implement timestamps based on• actual system clock (resolution, uniqueness, portability
across OSs?),or
• a DBMS-internal counter used to assign transaction IDs(xid).
• Timestamp requirements:1 unique: ti 6= tj if i 6= j,2 ordered: ti < tj if Ti has started before Tj.
MVCC: Semantics
Under MVCC, a transaction Ti operates on the consistent state ofthe database that was current at time ti.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
49
MVCC: Versions and Snapshots
In a concurrent DBMS, operations can conflict if they write thesame database object (recall relation =). Thus:
MVCC: Versions
Under MVCC, multiple versions of the same database objectmay exist at one time. Different transactions may read/writedifferent (not necessarily the most recent) object versions.
MVCC uses snapshots to identify exactly which version of eachdatabase object are visible to a transaction:
MVCC: Snapshot
To take a snapshot, gather the following information:
• the highest xid of all committed transactions,
• a list of xids of all transactions currently executing.
Typically, a snapshot is taken at the time the transaction starts.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
49
MVCC: Versions and Snapshots
In a concurrent DBMS, operations can conflict if they write thesame database object (recall relation =). Thus:
MVCC: Versions
Under MVCC, multiple versions of the same database objectmay exist at one time. Different transactions may read/writedifferent (not necessarily the most recent) object versions.
MVCC uses snapshots to identify exactly which version of eachdatabase object are visible to a transaction:
MVCC: Snapshot
To take a snapshot, gather the following information:
• the highest xid of all committed transactions,
• a list of xids of all transactions currently executing.
Typically, a snapshot is taken at the time the transaction starts.
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
50
MVCC: Row Timestamps
Database Object ≡ Row
PostgreSQL implements MVCC at a granularity of rows: multipleversions of the same row may exist. Adopt this model in whatfollows.
• To help decide whether a particular row version is includedin (or excluded from) a snapshot, attach to each row versiontwo virtual/hidden attributes:
1 xmin: the xid of the transaction that created this row,2 xmax: the xid of the transaction that deleted this row.
• Row updates are modellled as the two-step operation rowdeletion, then creation.
MVCC: No Physical Deletion!
Note: 2 implies that rows are not actually physically deleted.Instead, their xmax attribute is modified to record thedeleting/updating transaction (⇒ eventual row garbage).
�
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
50
MVCC: Row Timestamps
Database Object ≡ Row
PostgreSQL implements MVCC at a granularity of rows: multipleversions of the same row may exist. Adopt this model in whatfollows.
• To help decide whether a particular row version is includedin (or excluded from) a snapshot, attach to each row versiontwo virtual/hidden attributes:
1 xmin: the xid of the transaction that created this row,2 xmax: the xid of the transaction that deleted this row.
• Row updates are modellled as the two-step operation rowdeletion, then creation.
MVCC: No Physical Deletion!
Note: 2 implies that rows are not actually physically deleted.Instead, their xmax attribute is modified to record thedeleting/updating transaction (⇒ eventual row garbage).
�
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
51
MVCC: Row Visibility (1)
Example (Decide Row Visibility)
• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 — · · ·
X
2xmin xmax · · · data · · ·
50 — · · ·
3xmin xmax · · · data · · ·110 — · · ·
2For simplicity: assume that all other xids have committed (and not rolledback their work).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
51
MVCC: Row Visibility (1)
Example (Decide Row Visibility)
• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 — · · ·
X
2xmin xmax · · · data · · ·
50 — · · ·
3xmin xmax · · · data · · ·110 — · · ·
2For simplicity: assume that all other xids have committed (and not rolledback their work).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
51
MVCC: Row Visibility (1)
Example (Decide Row Visibility)
• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 — · · · X
2xmin xmax · · · data · · ·
50 — · · ·
3xmin xmax · · · data · · ·110 — · · ·
2For simplicity: assume that all other xids have committed (and not rolledback their work).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
51
MVCC: Row Visibility (1)
Example (Decide Row Visibility)
• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 — · · · X
2xmin xmax · · · data · · ·
50 — · · ·
3xmin xmax · · · data · · ·110 — · · ·
2For simplicity: assume that all other xids have committed (and not rolledback their work).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
51
MVCC: Row Visibility (1)
Example (Decide Row Visibility)
• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 — · · · X
2xmin xmax · · · data · · ·
50 — · · ·
3xmin xmax · · · data · · ·110 — · · ·
2For simplicity: assume that all other xids have committed (and not rolledback their work).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
51
MVCC: Row Visibility (1)
Example (Decide Row Visibility)
• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 — · · · X
2xmin xmax · · · data · · ·
50 — · · ·
3xmin xmax · · · data · · ·110 — · · ·
2For simplicity: assume that all other xids have committed (and not rolledback their work).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
51
MVCC: Row Visibility (1)
Example (Decide Row Visibility)
• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 — · · · X
2xmin xmax · · · data · · ·
50 — · · ·
3xmin xmax · · · data · · ·110 — · · ·
2For simplicity: assume that all other xids have committed (and not rolledback their work).
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
52
MVCC: Row Visibility (2)
Example (Decide Row Visibility)
• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 80 · · ·
2xmin xmax · · · data · · ·
30 75 · · ·
X
3xmin xmax · · · data · · ·
30 110 · · ·
X
• Given the current state of the system, may row 1 beconsidered garbage that can be collected?
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
52
MVCC: Row Visibility (2)
Example (Decide Row Visibility)
• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 80 · · ·
2xmin xmax · · · data · · ·
30 75 · · ·
X
3xmin xmax · · · data · · ·
30 110 · · ·
X
• Given the current state of the system, may row 1 beconsidered garbage that can be collected?
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
52
MVCC: Row Visibility (2)
Example (Decide Row Visibility)
• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 80 · · ·
2xmin xmax · · · data · · ·
30 75 · · ·
X
3xmin xmax · · · data · · ·
30 110 · · ·
X
• Given the current state of the system, may row 1 beconsidered garbage that can be collected?
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
52
MVCC: Row Visibility (2)
Example (Decide Row Visibility)
• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 80 · · ·
2xmin xmax · · · data · · ·
30 75 · · · X
3xmin xmax · · · data · · ·
30 110 · · ·
X
• Given the current state of the system, may row 1 beconsidered garbage that can be collected?
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
52
MVCC: Row Visibility (2)
Example (Decide Row Visibility)
• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 80 · · ·
2xmin xmax · · · data · · ·
30 75 · · · X
3xmin xmax · · · data · · ·
30 110 · · ·
X
• Given the current state of the system, may row 1 beconsidered garbage that can be collected?
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
52
MVCC: Row Visibility (2)
Example (Decide Row Visibility)
• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 80 · · ·
2xmin xmax · · · data · · ·
30 75 · · · X
3xmin xmax · · · data · · ·
30 110 · · · X
• Given the current state of the system, may row 1 beconsidered garbage that can be collected?
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
52
MVCC: Row Visibility (2)
Example (Decide Row Visibility)
• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid
, [25, 50, 75]︸ ︷︷ ︸currently active xids
〉
• Row visible in snapshot?
1xmin xmax · · · data · · ·
30 80 · · ·
2xmin xmax · · · data · · ·
30 75 · · · X
3xmin xmax · · · data · · ·
30 110 · · · X
• Given the current state of the system, may row 1 beconsidered garbage that can be collected?
TransactionManagement
Torsten Grust
ACID Properties
Anomalies
The Scheduler
Serializability
Query Scheduling
Locking
Two-Phase Locking
OptimisticConcurrency Protocol
Multi-VersionConcurrency Control
53
MVCC: Garbage Collection
• The creation of new row versions during UPDATE (rather thanreplacing the existing row) requires the reclamation ofstorage space used by old row versions.
• Delay such row garbage collection until the old versions areguaranteed to be invisible to all current and futuretransactions.
Delayed Row Garbage Collection
Exactly when is it safe to declare a row as garbage and mark it forcollection?
Row Cleanup
• On-demand cleanup of a single page when page isaccessed during SELECT, UPDATE, DELETE.
• Bulk cleanup by scheduled auto-vacuum process or via anexplicit VACUUM command.