Distributed Transactions
What is a transaction? A sequence of server operations that must be carried out atomically.
What are the ACID properties? Atomicity, Consistency, Isolation, Durability.
What is a distributed transaction? A transaction that involves objects managed by multiple servers communicating with one another.
Transactions
[Figure: a transaction as a sequence of server operations on shared variables, ending in Commit / Abort, with a permanent record.]
Concurrency control
The goal of concurrency control is to guarantee that when multiple transactions are executed concurrently, the net effect is equivalent to executing them in some serial order. This is the essence of the serializability property.
Example 1
T1 starts (20) W(x:=1) [OK] R(x) [OK] T1 commits
T2 starts (30) W(x:=2) [OK] T2 commits
T3 starts (40) W(x:=3) [OK] R(x) T3 commits
This is serializable. Think of other examples too.
Example 2
T1 starts (20) W(x:=1) [OK] R(x) [NO] T1 aborts
T2 starts (30) W(x:=2) [OK] R(x) T2 commits
T3 starts (40) W(x:=3) [OK] T3 commits
This is not serializable.
Question
Transaction 1: Raise the Q score of all GRE candidates from Iowa City by 10 points.
Transaction 2: Raise the Q score of all students whose id ends with 035 by 5 points.
Can we run these concurrently? Explain.
Pitfalls in concurrency control
Dirty read, lost update, premature write
Lost update
Initially, B = $1000
Amy’s transaction          Bob’s transaction
1 Load B into local        4 Load B into local
2 Add $250 to local        5 Add $250 to local
3 Store local to B         6 Store local to B
What if the interleaving is 1 4 2 5 3 6? The final value of B is $1250, although it should have been $1500.
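The lost update can be replayed step by step. The following is a minimal sketch, not a real transaction system: the function name `run_schedule` and the dictionaries `shared` and `local` are made up for illustration, and the six numbered steps are the ones listed for Amy and Bob.

```python
def run_schedule(order, initial_balance=1000):
    """Replay the numbered steps of both transactions in the given order."""
    shared = {"B": initial_balance}   # the shared account balance
    local = {"Amy": 0, "Bob": 0}      # each transaction's private copy

    def load(who):
        local[who] = shared["B"]

    def add(who):
        local[who] += 250

    def store(who):
        shared["B"] = local[who]

    # Step numbers as in the example: 1-3 belong to Amy, 4-6 to Bob.
    steps = {
        1: (load, "Amy"), 2: (add, "Amy"), 3: (store, "Amy"),
        4: (load, "Bob"), 5: (add, "Bob"), 6: (store, "Bob"),
    }
    for n in order:
        action, who = steps[n]
        action(who)
    return shared["B"]


serial = run_schedule([1, 2, 3, 4, 5, 6])        # Amy runs first, then Bob
interleaved = run_schedule([1, 4, 2, 5, 3, 6])   # the interleaving in question
print(serial, interleaved)
```

In the interleaved run both transactions load $1000 before either one stores, so the second store overwrites the first and one deposit is lost.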
Dirty read
Initially, B = $1000
Amy’s transaction          Bob’s transaction
1 Load B into local        4 Load B into local
2 Add $250 to local        5 Add $250 to local
3 Store local to B         6 Store local to B
ABORT                      COMMIT
Execute the actions in the sequence 1 2 3 4 5 6; Amy’s transaction then aborts while Bob’s commits. Bob has read a value written by Amy’s uncommitted transaction, so the final result is $1500, although it should have been $1250.
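The dirty read can be traced with plain variables. This is a sketch under a simplifying assumption: no recovery mechanism undoes Amy's store, so the abort is modelled only by the flag `amy_committed`. All variable names are hypothetical.

```python
INITIAL = 1000
B = INITIAL

# Amy's transaction, steps 1-3 (not yet committed)
amy_local = B           # 1 load
amy_local += 250        # 2 add
B = amy_local           # 3 store an uncommitted value

# Bob's transaction, steps 4-6
bob_local = B           # 4 load: a dirty read of Amy's uncommitted write
bob_local += 250        # 5 add
B = bob_local           # 6 store

amy_committed = False   # Amy aborts
bob_committed = True    # Bob commits

# Only committed transactions should contribute their $250.
expected = INITIAL + 250 * (amy_committed + bob_committed)
print(B, expected)
```

The two printed values differ because Bob's result was computed from a value that, after the abort, should never have existed.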
Premature write
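Initially, B = 0
Amy’s transaction          Bob’s transaction
1 B := $500                2 B := $1000
4 ABORT                    3 COMMIT
When Amy’s transaction aborts, B is restored to the value it had before her write, so B changes to 0 and Bob’s committed write is lost. This could have been avoided if the second transaction had postponed its commit until the first transaction committed or aborted.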
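The effect of the abort can be sketched as follows. The code assumes one particular recovery scheme, in which an abort restores the before-image of the aborting transaction's write; the class `Account` and its methods are hypothetical names chosen for this illustration.

```python
class Account:
    """Assumed recovery scheme: abort restores the before-image of a write."""

    def __init__(self, value):
        self.value = value
        self.before_image = {}   # transaction name -> value it overwrote

    def write(self, txn, new_value):
        self.before_image[txn] = self.value   # remember what this write replaced
        self.value = new_value

    def commit(self, txn):
        self.before_image.pop(txn, None)      # the write becomes permanent

    def abort(self, txn):
        self.value = self.before_image.pop(txn)   # undo by restoring the old value


b = Account(0)
b.write("Amy", 500)    # 1
b.write("Bob", 1000)   # 2: premature write over Amy's uncommitted value
b.commit("Bob")        # 3
b.abort("Amy")         # 4: restores Amy's before-image, which is 0
print(b.value)
```

Under this scheme Bob's committed $1000 disappears, which is the anomaly described above.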
Locks
Locks are commonly used to implement serializability of concurrent transactions. Two operations on a shared object are in conflict when at least one of them is a write operation. Each transaction must acquire the corresponding exclusive lock before executing an action.
Locks can be fine-grained. Note that there is no conflict between two reads.
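The conflict rule can be written as a small predicate. This is a sketch: the representation of an operation as a `(kind, object)` pair with kind `"R"` or `"W"` is an assumption made for the example.

```python
def in_conflict(op1, op2):
    """Two operations conflict if they touch the same object and at least one is a write."""
    kind1, obj1 = op1
    kind2, obj2 = op2
    return obj1 == obj2 and "W" in (kind1, kind2)


print(in_conflict(("R", "x"), ("R", "x")))   # two reads of the same object
print(in_conflict(("R", "x"), ("W", "x")))   # a read and a write of the same object
print(in_conflict(("W", "x"), ("W", "y")))   # writes to different objects
```

Comparing per object is what makes the locking fine-grained: operations on different objects never conflict, whatever their kind.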
Serializability
The serialization graph is a directed graph (V, E), where V is the set of transactions and E is the set of directed edges between transactions. A directed edge from a transaction Tj to a transaction Tk (Tj → Tk) implies that Tk acquired a lock only after Tj released the corresponding lock.
Serializability theorem
For a set of concurrent transactions, the serializability property holds if and only if the corresponding serialization graph is acyclic.
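Checking the condition of the theorem amounts to cycle detection in a directed graph. The sketch below uses a depth-first search; the function name `is_acyclic` and the edge representation as `(Tj, Tk)` pairs are assumptions made for illustration.

```python
def is_acyclic(transactions, edges):
    """Return True if the directed graph (transactions, edges) contains no cycle."""
    successors = {t: [] for t in transactions}
    for tj, tk in edges:
        successors[tj].append(tk)

    WHITE, GREY, BLACK = 0, 1, 2           # unvisited, on the current path, finished
    colour = {t: WHITE for t in transactions}

    def visit(t):
        colour[t] = GREY
        for nxt in successors[t]:
            if colour[nxt] == GREY:        # back edge: a cycle exists
                return False
            if colour[nxt] == WHITE and not visit(nxt):
                return False
        colour[t] = BLACK
        return True

    return all(visit(t) for t in transactions if colour[t] == WHITE)


txns = ["T1", "T2", "T3"]
print(is_acyclic(txns, [("T1", "T2"), ("T2", "T3")]))
print(is_acyclic(txns, [("T1", "T2"), ("T2", "T3"), ("T3", "T1")]))
```

The first graph corresponds to the serial order T1, T2, T3; the second contains the cycle T1 → T2 → T3 → T1, so no equivalent serial order exists.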
Two-phase locking (2PL)
Phase 1. Acquire all locks needed to execute the transaction. The locks will be acquired one after another, and this phase is called the growing phase or acquisition phase.
Phase 2. Release all locks acquired so far. This is called the shrinking phase or the release phase.
[Figure: two-phase locking. Locks are acquired during the growing phase and released during the shrinking phase.]
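The two-phase rule itself is easy to enforce in code. The sketch below only tracks the discipline (which locks a transaction holds and whether it has started releasing); it does not implement real mutual exclusion between transactions. The class name `TwoPhaseTransaction` and its methods are hypothetical.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and rejects any acquire after a release."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # becomes True at the first release

    def acquire(self, lock_name):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot acquire {lock_name!r} after a release"
            )
        self.held.add(lock_name)

    def release(self, lock_name):
        self.held.remove(lock_name)
        self.shrinking = True    # the growing phase is over


t = TwoPhaseTransaction("T1")
t.acquire("x")          # growing phase
t.acquire("y")
t.release("x")          # shrinking phase begins
try:
    t.acquire("z")      # violates two-phase locking
except RuntimeError as err:
    print(err)
```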
2PL
Theorem. 2PL guarantees serializability.
Proof. Suppose that the theorem is not correct. Then the serialization graph must contain a cycle
… Tj → Tk → … → Tm → Tj …
This implies that Tj must have released a lock (that was later acquired by Tk) and then acquired a lock (that was released by Tm). However, this violates the condition of two-phase locking, which rules out acquiring any lock once a lock has been released.
Atomic Commit Protocols
The initiator of a transaction is called the coordinator, and the remaining servers are participants.
[Figure: network of servers S1, S2, S3. Servers may crash.]
Requirements of Atomic Commit Protocols
Termination. All non-faulty servers must eventually reach an irrevocable decision.
Agreement. If any server decides to commit, then every server must have voted to commit.
Validity. If all servers vote commit and there is no failure, then all servers must commit.
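The three requirements can be phrased as checks on a finished run. This is a sketch under stated assumptions: votes and decisions are recorded in dictionaries keyed by server name, an undecided server has decision `None`, and "eventually" is approximated by looking at a final snapshot. The function name `check_outcome` is hypothetical.

```python
def check_outcome(votes, decisions, crashed=frozenset()):
    """Evaluate termination, agreement and validity on one recorded run.

    votes:     server -> "commit" or "abort"
    decisions: server -> "commit", "abort", or None if undecided
    crashed:   set of faulty servers
    """
    all_voted_commit = all(v == "commit" for v in votes.values())
    someone_committed = any(d == "commit" for d in decisions.values())

    # Termination: every non-faulty server has reached a decision.
    termination = all(
        decisions.get(s) is not None for s in votes if s not in crashed
    )
    # Agreement: a commit decision implies that every server voted commit.
    agreement = (not someone_committed) or all_voted_commit
    # Validity: unanimous commit votes and no failure imply that all commit.
    validity = (not (all_voted_commit and not crashed)) or all(
        decisions.get(s) == "commit" for s in votes
    )
    return {"termination": termination, "agreement": agreement, "validity": validity}


good = check_outcome(
    votes={"S1": "commit", "S2": "commit", "S3": "commit"},
    decisions={"S1": "commit", "S2": "commit", "S3": "commit"},
)
bad = check_outcome(
    votes={"S1": "commit", "S2": "abort", "S3": "commit"},
    decisions={"S1": "commit", "S2": "abort", "S3": "abort"},
)
print(good)
print(bad)
```

In the second run S1 decides to commit although S2 voted abort, so the agreement check fails.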
One-phase Commit
[Figure: a client, a coordinator server, and three participant servers; the coordinator sends Commit to the participants.]
If a participant deadlocks or faces a local problem, the coordinator may never find out about it. One-phase commit is too simplistic.
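The weakness can be seen in a toy model. The sketch below is not a networked implementation: the class `Participant`, its `can_commit` flag (which stands in for a deadlock or local problem), and the function `one_phase_commit` are all hypothetical names for illustration.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit   # False models a deadlock or local problem
        self.state = "active"

    def on_commit(self):
        # There is no reply channel: a problem here is never reported back.
        self.state = "committed" if self.can_commit else "stuck"


def one_phase_commit(participants):
    """The coordinator tells every participant to commit without collecting votes."""
    for p in participants:
        p.on_commit()
    return "commit"   # decided without knowing any participant's status


servers = [Participant("S1"), Participant("S2", can_commit=False), Participant("S3")]
decision = one_phase_commit(servers)
print(decision, [(p.name, p.state) for p in servers])
```

The coordinator reports a commit even though one participant never committed, which is why a protocol needs a round in which participants can report their status before the decision is made.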