15-440/15-640 Distributed Systems Concurrency Control

68
Concurrency Control 15-440/15-640 Distributed Systems Lecture #10, Thursday September 30 th

Transcript of 15-440/15-640 Distributed Systems Concurrency Control

Page 1: 15-440/15-640 Distributed Systems Concurrency Control

Concurrency Control15-440/15-640 Distributed Systems

Lecture #10, Thursday September 30th

Page 2: 15-440/15-640 Distributed Systems Concurrency Control

P1 – Checkpoint done, Part A (due 10/8), Part B (10/15)

Updated Piazza policy to allow private posts• General idea still: please use public posts • Only for H/W questions that may reveal answer, we can make public • Not for detailed project debugging questions => please go to OH

HW2 – released Oct 3rd – Due in a week Oct 10th. Get started early! • Short timeline, need to release solutions for Midterm

Midterm -1 – In class, during class time, on Tuesday Oct 12th• Please arrive early (10:00 or 10:05) so that we can start on time!

AnnouncementsO U T L I N E

Page 3: 15-440/15-640 Distributed Systems Concurrency Control

Consistency for multiple-objects, multiple-servers

Part I: Single Server Case• (not covered well in book)• Two Phase Locking

Part II: Distributed Transactions• Two Phase Commit (Tanenbaum 8.5)

Today's LectureO U T L I N E

Page 4: 15-440/15-640 Distributed Systems Concurrency Control

Consistency for multiple-objects, multiple-servers

Part I: Single Server Case• (not covered well in book)• Two Phase Locking

Part II: Distributed Transactions• Two Phase Commit (Tanenbaum 8.5)

Today's LectureO U T L I N E

Page 5: 15-440/15-640 Distributed Systems Concurrency Control

a) Ignore failures in first half• Concurrency is our main concern

b) To deal with failures• Assume a form of logging, where every machine writes information

down before operating on it, to recover from simple failures. Recover after failure.

• Next lecture: Logging and Crash Recovery

Assumptions for Today

Page 6: 15-440/15-640 Distributed Systems Concurrency Control

In the context of concurrency, what have we been focused on avoiding?

S O F A R I N T H I S C O U R S E ,

Page 7: 15-440/15-640 Distributed Systems Concurrency Control

In the context of concurrency, what have we been focused on avoiding?

S O F A R I N T H I S C O U R S E ,

Generally so far, we have been focused

on various ways to ensure safe

concurrent access to a single resource.Unsafe concurrent access to shared resources.e.g., race conditions.

Page 8: 15-440/15-640 Distributed Systems Concurrency Control

Unsafe concurrent access to shared resources.e.g., race conditions.

In the context of concurrency, what have we been focused on avoiding?

S O F A R I N T H I S C O U R S E ,

Generally so far, we have been focused

on various ways to ensure safe

concurrent access to a single resource.

Going forward, we want to handle composite operations– we want to build concurrent systems which act on multiple distinct objects!

Page 9: 15-440/15-640 Distributed Systems Concurrency Control

Imagine a banking application with 3 functions that can access account information:

If each of these 3 functions were implemented in a thread-safe way, we should be OK, right?

Nope.

Composite OperationsC O N T E X T

Let’s start with an example of composite operations.

getAccount(i) deposit(j, v) withdraw(i, v)

Page 10: 15-440/15-640 Distributed Systems Concurrency Control

Imagine a banking application with 3 functions that can access account information:

If each of these 3 functions were implemented in a thread-safe way, we should be OK, right?

Nope.

Composite OperationsC O N T E X T

Let’s start with an example of composite operations.

getAccount(i) deposit(j, v) withdraw(i, v)

// thread 1: transfer 100// from savings->currentwithdraw(savings, 100)deposit(current, 100)

// thread 2: check balances = getBalance(savings)c = getBalance(current)tot = s + c

Assuming these 2 threads are concurrently executing… What can happen?

Page 11: 15-440/15-640 Distributed Systems Concurrency Control

Two issues…

Problems with Composite OperationsC O M P O S I T E O P E R AT I O N S

Insufficient Isolation• Individual operations each being atomic is not enough• Instead, we want the deposit & withdraw making up the transfer to

happen as one atomic operation…

1

2 Fault Tolerance• In the real-word, programs (or systems) can fail• Need to make sure we can recover safely if failure happens

(particularly– no corrupted data!)

Page 12: 15-440/15-640 Distributed Systems Concurrency Control

Transactions!E N T E R :

Yep! The database kind.

Page 13: 15-440/15-640 Distributed Systems Concurrency Control

Goal:Want programmer to be able to specify that a set of operations should happen atomically.

TransactionsB A C K G R O U N D

For example:

// transfer amt from A -> Btransaction {if (getBalance(A) > amt) {withdraw(A, amt)deposit(B, amt)return true

} else return false}

There are exactly two possibilities for execution:

A transaction either1. executes correctly (it commits),

or2. has no effect at all (it aborts)

Importantly: this always happens regardless of other transactions, or system crashes!

Page 14: 15-440/15-640 Distributed Systems Concurrency Control

ACID PropertiesT R A N S A C T I O N S

Q: Anybody familiar with the ACID acronym?

Transactions are defined as having the 4 ACID properties.

Page 15: 15-440/15-640 Distributed Systems Concurrency Control

Atomicity: Each transaction completes in its entirely, or is aborted. If aborted, should not have effect on the shared global state. Example: Update account balance on multiple servers

Consistency: Each transaction preserves a set of invariants about global state. (Nature of invariants is system dependent). Example: in a bank system, law of conservation of $$

Isolation: Also means serializability. Each transaction executes as if it were the only one with the ability to read/write shared global state.

Durability: Once a transaction has been completed, or “committed” there is no going back. In other words there is no “undo”.

ACID PropertiesT R A N S A C T I O N S

Page 16: 15-440/15-640 Distributed Systems Concurrency Control

Atomicity: Each transaction completes in its entirely, or is aborted. If aborted, should not have effect on the shared global state.Example: Update account balance on multiple servers

Consistency: Each transaction preserves a set of invariants about global state. (Nature of invariants is system dependent). Example: in a bank system, law of conservation of $$

Isolation: Also means serializability. Each transaction executes as if it were the only one with the ability to read/write shared global state.

Durability: Once a transaction has been completed, or “committed” there is no going back. In other words there is no “undo”.

ACID PropertiesT R A N S A C T I O N S

Page 17: 15-440/15-640 Distributed Systems Concurrency Control

Atomicity: Each transaction completes in its entirely, or is aborted. If aborted, should not have effect on the shared global state.Example: Update account balance on multiple servers

Consistency: Each transaction preserves a set of invariants about global state. (Nature of invariants is system dependent). Example: in a bank system, law of conservation of $$

Isolation: Also means serializability. Each transaction executes as if it were the only one with the ability to read/write shared global state.

Durability: Once a transaction has been completed, or “committed” there is no going back. In other words there is no “undo”.

ACID PropertiesT R A N S A C T I O N S

Page 18: 15-440/15-640 Distributed Systems Concurrency Control

Atomicity: Each transaction completes in its entirely, or is aborted. If aborted, should not have effect on the shared global state.Example: Update account balance on multiple servers

Consistency: Each transaction preserves a set of invariants about global state. (Nature of invariants is system dependent). Example: in a bank system, law of conservation of $$

Isolation: Also means serializability. Each transaction executes as if it were the only one with the ability to read/write shared global state.

Durability: Once a transaction has been completed, or “committed” there is no going back. In other words there is no “undo”.

ACID PropertiesT R A N S A C T I O N S

Page 19: 15-440/15-640 Distributed Systems Concurrency Control

Atomicity: Each transaction completes in its entirely, or is aborted. If aborted, should not have effect on the shared global state.Example: Update account balance on multiple servers

Consistency: Each transaction preserves a set of invariants about global state. (Nature of invariants is system dependent). Example: in a bank system, law of conservation of $$

Isolation: Also means serializability. Each transaction executes as if it were the only one with the ability to read/write shared global state.

Durability: Once a transaction has been completed, or “committed” there is no going back. In other words there is no “undo”.

ACID PropertiesT R A N S A C T I O N S

Page 20: 15-440/15-640 Distributed Systems Concurrency Control

ACID PropertiesT R A N S A C T I O N S

Q: Anybody familiar with the ACID acronym?

Also noteworthy:Transactions can be nested.

And, terminologically:“Atomic Operations” ⇒ Atomicity + Isolation

Transactions are defined as having the 4 ACID properties.

Page 21: 15-440/15-640 Distributed Systems Concurrency Control

Consistency for multiple-objects, multiple-servers

Part I: Single Server Case• (not covered well in book)• Two Phase Locking

Part II: Distributed Transactions• Two Phase Commit (Tanenbaum 8.5)

Today's LectureO U T L I N E

Page 22: 15-440/15-640 Distributed Systems Concurrency Control

Array Bal[i] stores balance of Account “i”

A Transaction Example: Bank E X A M P L E

xfer(i, j, v): if withdraw(i, v):

deposit(j, v) else

abort

withdraw(i, v): b = Bal[i] // Readif b >= v // Test

Bal[i] = b-v // Write return true

else return false

deposit(j, v): Bal[j] += v

Goal:Implement: xfer, withdraw, deposit

Page 23: 15-440/15-640 Distributed Systems Concurrency Control

A Transaction Example: Bank E X A M P L E

xfer(i, j, v): if withdraw(i, v):

deposit(j, v) else

abort

withdraw(i, v): b = Bal[i] // Readif b >= v // Test

Bal[i] = b-v // Write return true

else return false

deposit(j, v): Bal[j] += v

Imagine the following scenario:Bal[x] = 100, Bal[y] = Bal[z] = 02 transactions:- T1: xfer(x,y,60)- T2: xfer(x,z,70)

Q: What will happen if these 2 transactions follow ACID properties and happen in some serial order?

Page 24: 15-440/15-640 Distributed Systems Concurrency Control

A Transaction Example: Bank E X A M P L E

xfer(i, j, v): if withdraw(i, v):

deposit(j, v) else

abort

withdraw(i, v): b = Bal[i] // Readif b >= v // Test

Bal[i] = b-v // Write return true

else return false

deposit(j, v): Bal[j] += v

Imagine the following scenario:Bal[x] = 100, Bal[y] = Bal[z] = 02 transactions:- T1: xfer(x,y,60)- T2: xfer(x,z,70)

T1; T2:- T1 Succeeds; T2 Fails. Bal[x]=40, Bal[y]=60

T2; T1:- T2 Succeeds. T1 Fails. Bal[x]=30, Bal[z]=70

Page 25: 15-440/15-640 Distributed Systems Concurrency Control

A Transaction Example: Bank E X A M P L E

xfer(i, j, v): if withdraw(i, v):

deposit(j, v) else

abort

withdraw(i, v): b = Bal[i] // Readif b >= v // Test

Bal[i] = b-v // Write return true

else return false

deposit(j, v): Bal[j] += v

Imagine the following scenario:Bal[x] = 100, Bal[y] = Bal[z] = 02 transactions:- T1: xfer(x,y,60)- T2: xfer(x,z,70)

Q: What if we weren’t careful about

ACID properties? Is there a race

condition?

Page 26: 15-440/15-640 Distributed Systems Concurrency Control

A Transaction Example: Bank E X A M P L E

xfer(i, j, v): if withdraw(i, v):

deposit(j, v) else

abort

withdraw(i, v): b = Bal[i] // Readif b >= v // Test

Bal[i] = b-v // Write return true

else return false

deposit(j, v): Bal[j] += v

Imagine the following scenario:Bal[x] = 100, Bal[y] = Bal[z] = 02 transactions:- T1: xfer(x,y,60)- T2: xfer(x,z,70)

Q: What if we weren’t careful about

ACID properties? Is there a race

condition?Updating Bal[x] with Read/Write

interleaving of T1,T2

ACID VIOLATION: NOT ISOLATED, NOT DURABLE

Bal[x] = 30 or 40; Bal[y] = 60; Bal [z] = 70

Page 27: 15-440/15-640 Distributed Systems Concurrency Control

A Transaction Example: Bank E X A M P L E

xfer(i, j, v): if withdraw(i, v):

deposit(j, v) else

abort

withdraw(i, v): b = Bal[i] // Readif b >= v // Test

Bal[i] = b-v // Write return true

else return false

deposit(j, v): Bal[j] += v

Imagine the following scenario:Bal[x] = 100, Bal[y] = Bal[z] = 02 transactions:- T1: xfer(x,y,60)- T2: xfer(x,z,70)

Updating Bal[x] with Read/Write interleaving of T1,T2

Bal[x] = 30 or 40; Bal[y] = 60; Bal [z] = 70

ACID VIOLATION: NOT ISOLATED, NOT DURABLE

Page 28: 15-440/15-640 Distributed Systems Concurrency Control

A Transaction Example: Bank E X A M P L E

xfer(i, j, v): if withdraw(i, v):

deposit(j, v) else

abort

withdraw(i, v): b = Bal[i] // Readif b >= v // Test

Bal[i] = b-v // Write return true

else return false

deposit(j, v): Bal[j] += v

Imagine the following scenario:Bal[x] = 100, Bal[y] = Bal[z] = 02 transactions:- T1: xfer(x,y,60)- T2: xfer(x,z,70)

Bal[x] = 30 or 40; Bal[y] = 60; Bal [z] = 70

ACID VIOLATION: NOT ISOLATED, NOT DURABLE

sumbalance(i, j, k): return Bal[i]+Bal[j]+ Bal[k]

For Consistency, we can implement sumbalance()

Updating Bal[x] with Read/Write interleaving of T1,T2

Page 29: 15-440/15-640 Distributed Systems Concurrency Control

A Transaction Example: Bank E X A M P L E

xfer(i, j, v): if withdraw(i, v):

deposit(j, v) else

abort

withdraw(i, v): b = Bal[i] // Readif b >= v // Test

Bal[i] = b-v // Write return true

else return false

deposit(j, v): Bal[j] += v

Imagine the following scenario:Bal[x] = 100, Bal[y] = Bal[z] = 02 transactions:- T1: xfer(x,y,60)- T2: xfer(x,z,70)

sumbalance(i, j, k): return Bal[i]+Bal[j]+ Bal[k]

For Consistency, we can implement sumbalance()

Q: Do you see anything wrong with this?

Page 30: 15-440/15-640 Distributed Systems Concurrency Control

A Transaction Example: Bank E X A M P L E

xfer(i, j, v): if withdraw(i, v):

deposit(j, v) else

abort

withdraw(i, v): b = Bal[i] // Readif b >= v // Test

Bal[i] = b-v // Write return true

else return false

deposit(j, v): Bal[j] += v

Imagine the following scenario:Bal[x] = 100, Bal[y] = Bal[z] = 02 transactions:- T1: xfer(x,y,60)- T2: xfer(x,z,70)

sumbalance(i, j, k): return Bal[i]+Bal[j]+ Bal[k]

For Consistency, we can implement sumbalance()

Q: Do you see anything wrong with this?State invariant sumbalance=100 violated!

We created $$

Page 31: 15-440/15-640 Distributed Systems Concurrency Control

Implementing Transaction with LocksE X A M P L E

xfer(i, j, v): lock()if withdraw(i, v):

deposit(j, v) else

abortunlock()

We can use locks to wrap xfer.

xfer(i, j, v):lock(i)if withdraw(i, v):

unlock(i)lock(j)deposit(j, v)unlock(j)

else unlock(i)abort

Q: However, is this a good approach? Sequential bottleneck due to global lock!

Q: Is there some other solution?

Q: Is this fixed now?

Nope. Consistency violation.Releasing lock i early before another transaction can cause consistency violation.

Page 32: 15-440/15-640 Distributed Systems Concurrency Control

Implementing Transaction with LocksE X A M P L E

xfer(i, j, v):lock(i)if withdraw(i, v):

lock(j)deposit(j, v)unlock(i)unlock(j)

else unlock(i)abort

Potential fix:Release locks when updateof all state variables complete.

Q: Does this work?

Nope. Deadlock!Bal[x]=Bal[y]=100 xfer(x,y,40) and xfer(y,x,30)

Page 33: 15-440/15-640 Distributed Systems Concurrency Control

Implementing Transaction with LocksE X A M P L E

xfer(i, j, v):lock(min(i,j); lock(max (i,j)) if withdraw(i, v):

deposit(j, v)unlock(i); unlock(j)

else unlock(i); unlock(j)abort

General rule: Always acquire locks according to some consistent global order

This works

This is also the motivation for 2-phase locking!

Why does this version work?We can visualize what’s happening by representing the state of the locks in a directed acyclic graph called a “waits-for” graph.

Page 34: 15-440/15-640 Distributed Systems Concurrency Control

Acquiring Locks in a Unique OrderW E K N O W : O R D E R O F L O C K A C Q U I S I T I O N I S I M P O R TA N T

Pi Pj

Pk

Consider “wait-for” graph for state of locks.

• Vertices represent transactions

• Edge from vertex Pi to vertex Pjif transaction Pi is waiting for lock held by transaction Pj.

What does a cycle mean?

Page 35: 15-440/15-640 Distributed Systems Concurrency Control

Acquiring Locks in a Unique OrderW E K N O W : O R D E R O F L O C K A C Q U I S I T I O N I S I M P O R TA N T

Pi Pj

Pk

Consider “wait-for” graph for state of locks.

• Vertices represent transactions

• Edge from vertex Pi to vertex Pjif transaction Pi is waiting for lock held by transaction Pj.

Can a cycle occur if we acquire locks in a unique order?

Page 36: 15-440/15-640 Distributed Systems Concurrency Control

Acquiring Locks in a Unique OrderW E K N O W : O R D E R O F L O C K A C Q U I S I T I O N I S I M P O R TA N T

Pi Pj

Pky

x

z

Consider “wait-for” graph for state of locks.

• Vertices represent transactions

• Edge from vertex Pi to vertex Pjif transaction Pi is waiting for lock held by transaction Pj.

Can a cycle occur if we acquire locks in a unique order?

No. Label edges with a lock ID.

– For any cycle, there must be some pair of edges (i, j), (j, k) labeled with values x & y. As k holds y, but waits for x: y < x

– Transaction j is holding lock x and it wants lock y, so y > x

– Implies that j is not acquiring its lock in proper order.

Page 37: 15-440/15-640 Distributed Systems Concurrency Control

Acquiring Locks in a Unique OrderW E K N O W : O R D E R O F L O C K A C Q U I S I T I O N I S I M P O R TA N T

Pi Pj

Pky

x

z

Consider “wait-for” graph for state of locks.

• Vertices represent transactions

• Edge from vertex Pi to vertex Pjif transaction Pi is waiting for lock held by transaction Pj.

Can a cycle occur if we acquire locks in a unique order?

No. Label edges with a lock ID.

– For any cycle, there must be some pair of edges (i, j), (j, k) labeled with values x & y. As k holds y, but waits for x: y < x

– Transaction j is holding lock x and it wants lock y, so y > x

– Implies that j is not acquiring its lock in proper order.

This general scheme is known as

two-phase locking.(Specifically, strong strict two-phase locking.)

Page 38: 15-440/15-640 Distributed Systems Concurrency Control

Two-Phase Locking (2PL)E N T E R :

Page 39: 15-440/15-640 Distributed Systems Concurrency Control

General 2-phase locking • Phase 1: Acquire or Escalate Locks (e.g. read => write)• Phase 2: Release or de-escalate lock

Strict 2-phase locking • Phase 1: (same as before.) Acquire or Escalate Locks (e.g. read => write) • Phase 2: Release WRITE lock at end of transaction only

Strong Strict 2-phase locking • Phase 1: (same as before) Acquire or Escalate Locks (e.g. read => write) • Phase 2: Release ALL locks at end of transaction only. • Most common version, required for ACID properties!

2-Phase Locking Variants TA X O N O M Y

There are several kinds of 2PL.

Page 40: 15-440/15-640 Distributed Systems Concurrency Control

Why not always use strong-strict 2-phase locking?• A transaction may not know the locks it needs in advance

2-Phase Locking S H O U L D W E A L W AY S U S E I T ?

if Bal(heather) < 100: x = find_richest_prof() transfer_from(x, heather)

•Ways to handle deadlocks• Lock manager builds a “waits-for” graph. On finding a cycle, choose offending transaction and force abort

• Use timeouts: Transactions should be short. If hit time limit, find transaction waiting for a lock and force abort.

Page 41: 15-440/15-640 Distributed Systems Concurrency Control

For reliability, transactions are split into two phases:

Phase 1: Preparation: • Determine what has to be done, how it will change state, without

actually altering it.• Generate Lock set “L” • Generate List of Updates “U”

Phase 2: Commit or Abort:• Everything OK, then update global state • Transaction cannot be completed, leave global state as is• In either case, RELEASE ALL LOCKS

Transactions: Split Into 2 PhasesR E T U R N I N G T O T R A N S A C T I O N S

Returning our attention to transactions…

Page 42: 15-440/15-640 Distributed Systems Concurrency Control

2PL Applied to Banking ExampleE X A M P L E

xfer(i, j, v):L={i,j} //Locks U=[] //List of Updatesbegin(L) //Begin transaction, Acquire locksbi = Bal[i]bj = Bal[j]if bi >= v:

Append(U,Bal[i] <- bi – v)Append(U, Bal[j] <- bj + v) commit(U,L)

else abort(L)

commit(U,L):Perform all updates in U Release all locks in L

Q: So, what would “commit” and ”abort” look like?

abort(L): Release all locks in L

Page 43: 15-440/15-640 Distributed Systems Concurrency Control

Consistency for multiple-objects, multiple-servers

Part I: Single Server Case• (not covered well in book)• Two Phase Locking

Part II: Distributed Transactions• Two Phase Commit (Tanenbaum 8.5)

Today's LectureO U T L I N E

Page 44: 15-440/15-640 Distributed Systems Concurrency Control

Partition databases across multiple machines for scalability(E.g., machine 1 responsible for account m, machine 2 responsible for account n)

Transactions often touch more than one data partition.

How do we guarantee that all of the partitions commit the transactions or none commit the transactions?• Transferring money from m to n.• Requirement: both machines do it, or neither.

Distributed Transactions?B R I N G I N G T R A N S A C T I O N S T O D I S T R I B U T E D S Y S T E M S

21

account m account n

Page 45: 15-440/15-640 Distributed Systems Concurrency Control

Multiple servers (S1, S2, S3, …), each holding some objects which can be read and written within client transactionsMultiple concurrent clients (C1, C2, …) who perform transactions that interact with one or more servers• E.g. T1 reads x, z from S1, writes a on S2, reads + writes j on S3• E.g. T2 reads i, j from S3, then writes z on S1A successful commit implies agreement at all servers

Modeling Distributed TransactionsE X A M P L E O F W H AT W E ’ R E S T R I V I N G T O W A R D S

i = 2j = 4

a = 7b = 8c = 1

x = 5y = 0z = 3

S1

S2

S3

C1 C2

T1 transaction {if (x<2) aborta:= zj:= j + 1

}

T2 transaction {z:= (i+j)

}

Page 46: 15-440/15-640 Distributed Systems Concurrency Control

• Similar idea as before, but: • State spread across servers (maybe even WAN) • Single transaction should be able to read/modify global state while

maintaining ACID properties.• Failures!

• Overall Idea:• Client initiates transaction. • Makes use of “coordinator”• All other relevant servers operate as “participants” • Coordinator assigns unique transaction ID (TxID)

• Distributed Commit• 2-phase commit protocol (2PC)

Enabling Distributed Transactions B R I N G I N G T R A N S A C T I O N S T O D I S T R I B U T E D S Y S T E M S

Page 47: 15-440/15-640 Distributed Systems Concurrency Control

A simple distributed commit protocol:• Coordinator tells other processes that are

involved (called participants) whether or not to locally perform the operation in question.

• This is called the one-phase commit protocol.

Even without failures, a lot can go wrong, for example:• Account m on Server 2 has only $90• Account m doesn’t exist!

Distributed CommitA S I M P L E S C H E M E

Coordinator

Server 1 Server 2

Client0: start

1: m, +$100 1: j, -$100

2: done

Q: What part of ACID does this violate?

Page 48: 15-440/15-640 Distributed Systems Concurrency Control

Phase 1: Prepare & Vote• Participants figure out all state changes • Each determines if it can complete the transaction• Communicate with coordinator

Phase 2: Commit• Coordinator broadcasts to participants: COMMIT / ABORT• If COMMIT, participants make respective state changes

2-Phase CommitO V E R V I E W

Page 49: 15-440/15-640 Distributed Systems Concurrency Control

Implementing 2-Phase Commit2 P C

Coordinator

Server 1 Server 2

Client0: start

Implemented as a set of messages.

Page 50: 15-440/15-640 Distributed Systems Concurrency Control

Implementing 2-Phase Commit2 P C

Coordinator

Server 1 Server 2

Client

1A: CanCommit? 1A: CanCommit?

Implemented as a set of messages.

Messages in first phase • 1A: Coordinator sends CanCommit? to participants

Page 51: 15-440/15-640 Distributed Systems Concurrency Control

Implementing 2-Phase Commit2 P C

Coordinator

Server 1 Server 2

Client

1B: VoteCommit 1B: VoteCommit

Implemented as a set of messages.

Messages in first phase • 1A: Coordinator sends CanCommit? to participants• 1B Participants respond: VoteCommit or VoteAbort

Page 52: 15-440/15-640 Distributed Systems Concurrency Control

Implementing 2-Phase Commit2 P C

Coordinator

Server 1 Server 2

Client

2A: DoCommit 2A: DoCommit

Implemented as a set of messages.

Messages in first phase • 1A: Coordinator sends CanCommit? to participants• 1B Participants respond: VoteCommit or VoteAbort

Messages in the second phase• 2A: All VotedCommit: , Coord sends DoCommit• If any VoteAbort: abort transaction. Coordinator sends DoAbort to everyone => release locks

Page 53: 15-440/15-640 Distributed Systems Concurrency Control

Implemented as a set of messages.

Messages in first phase • 1A: Coordinator sends CanCommit? to participants• 1B Participants respond: VoteCommit or VoteAbort

Messages in the second phase• 2A: All VotedCommit: , Coord sends DoCommit• If any VoteAbort: abort transaction. Coordinator sends DoAbort to everyone => release locks

Implementing 2-Phase Commit2 P C

Coordinator

Server 1 Server 2

Client3: done

Page 54: 15-440/15-640 Distributed Systems Concurrency Control

Implementing 2-Phase CommitA N O T H E R V I E W

time

S1

S2

S3

yes

C

yes

yes

ack

ack

ack

doCommit(TxID)canCommit(TxID)?

ack = haveCommitted(TxID,P)yes = voteCommit(TxID,P)

Page 55: 15-440/15-640 Distributed Systems Concurrency Control

Bank Account m at Server 1, n at Server 2.

BankingE X A M P L E

L={m} Begin(L) // Acq. Locks U=[] //List of Updates b=Bal[m]if b >= v:

Append(U, Bal[m] <- b – v)vote commit

elsevote abort

L={n} Begin(L) // Acq. LocksU=[] //List of Updatesb=Bal[n]Append(U, Bal[n] <- b + v)vote commit

Server 1 implements transactionServer 2 implements transaction

Server 2 can assume that the account of m has enough money, otherwise whole transaction will abort.

Q: What about locking?

21

account m account n

Page 56: 15-440/15-640 Distributed Systems Concurrency Control

Correctness• Neither can commit unless both agreed to commit

Performance• 3N messages per transaction

How to handle failure?• Timeouts à performance bad in case of failure!

Properties of 2-Phase Commit2 P C

Page 57: 15-440/15-640 Distributed Systems Concurrency Control

Distributed deadlock • Cyclic dependency of locks by transactions across servers

• In 2PC this can happen if participants unable to respond to voting request e.g. still waiting on a lock on its local resource.

• Handled with a timeout. Participants times out, then votes to abort. Retry transaction again. • Addresses the deadlock concern • However, danger of LIVELOCK – keep trying!

Deadlocks and Livelocks 2 P C

Q: How can this happen?

Q: How can we fix this?

Page 58: 15-440/15-640 Distributed Systems Concurrency Control

Coordinator times out after CanCommit?• Hasn’t sent any commit messages, safely abort• Conservative. Why?• Preserve correctness, sacrifice performance

Participant times out after VoteAbort• Can safely abort unilaterally.• Why?

Timeout and Failure Cases (1)2 P C

Page 59: 15-440/15-640 Distributed Systems Concurrency Control

Participant times out after VoteCommit•Are unilateral decisions possible? Commit, Abort?•Participant could wait forever

Solution: ask another participant (gossip protocol)• Learn coordinator’s decision: do the same• Assumption: non-Byzantine failure model

• Other participant hasn’t voted: abort is safe. Why?• Coordinator has not made decision

• No reply or other participant also VoteCommit: wait• 2PC is “blocking protocol” à 3PC in book.

Timeout and Failure Cases (2)2 P C

Page 60: 15-440/15-640 Distributed Systems Concurrency Control

2PC widely used in practice

In transactional APIs for databases and cloud platforms

Logging and Crash Recovery• Crucial to handle crashes / reboots

(next lecture)• Very powerful and resilient when paired with

RAID (3 lectures from now)

2 Phase Commit in Practice

NDB Cluster

Page 61: 15-440/15-640 Distributed Systems Concurrency Control

• Distributed consistency management• ACID Properties desirable • Single Server case: use locks + 2-phase locking (strict 2PL, strong strict

2PL), transactional support for locks• Multiple server distributed case: use 2-phase commit for distributed

transactions. Need a coordinator to manage messages from participants• 2PC can become a performance bottleneck

Summary

Page 62: 15-440/15-640 Distributed Systems Concurrency Control

Overview:•2PC notation from the Book•Terminology used by messages different, but essentially the protocol is the same•Pointers to 3PC (fully described in the book)

Additional Material

Page 63: 15-440/15-640 Distributed Systems Concurrency Control

Two-Phase Commit (1)C o o r d i n a t o r / P a r t i c i p a n t c a n b e b l o c k e d i n 3 s t a t e s :

Participant: Waiting in INIT state for VOTE_REQUEST

Coordinator: Blocked in WAIT state, listening for votes

Participant: blocked in READY state, waiting for global vote

(a) The finite state machine for the coordinator in 2PC. (b) The finite state machine for a participant.

Page 64: 15-440/15-640 Distributed Systems Concurrency Control

•What if a “READY” participant does not receive the global commit? Can’t just abort => figure out what message a co-ordinator may have sent.•Approach: ask other partcipants• Take actions on response on any of the participants• E.g. P is in READY state, asks other “Q” participants

Two-Phase Commit (2)

What happens if everyone is in ”READY” state?

Page 65: 15-440/15-640 Distributed Systems Concurrency Control

• For recovery, must save state to persistent storage (e.g. log), to restart/recover after failure. • Participant (INIT): Safe to local abort, inform Coordinator• Participant (READY): Contact others• Coordinator (WAIT): Retransmit VOTE_REQ• Coordinator (WAIT/Decision): Retransmit VOTE_COMMIT

Two-Phase Commit (3)

Page 66: 15-440/15-640 Distributed Systems Concurrency Control

2PC: Actions by Coordinator

Why do we have the ”write to LOG” statements?

Page 67: 15-440/15-640 Distributed Systems Concurrency Control

2PC: Actions by Participant

Wait for REQUESTS

If Decision to COMMITLOG=>Send=> Wait

No response? Ask others

If global decision,COMMIT OR ABORT

Else, Local DecisionLOG => send

Page 68: 15-440/15-640 Distributed Systems Concurrency Control

2PC: Handling Decision Request

Note, participant can only help others if it has reached a global decision and committed it to its log.

What if everyone has received VOTE_REQ, and Co-ordinator crashes?