EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

50
EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS

Transcript of EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Page 1: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

EE324 INTRO. TO DISTRIBUTED SYSTEMSLECTURE 13 TRANSACTIONS

Page 2: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Midterm

Midterm grading will take about a week and a half. Assignment 3 will be out. Thursday there will be a in-class session to prepare you

for the assignment.

Page 3: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Last lecture

Distributed mutex

Page 4: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Lamport’s Shared Priority Queue

Each process i locally maintains Qi (its own ver-sion of the priority Q)

To execute critical section, you must have replies from all other processes AND your request must be at the front of Qi

When you have all replies: All other processes are aware of your request (because the

request happens before response) You are aware of any earlier requests (assume messages

from the same process are not reordered)

Page 5: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Lamport’s Shared Priority Queue

To enter critical section at process i :Stamp your request with the cur-rent time T Add request to Qi Broadcast REQUEST(T) to all processes Wait for all replies and for T to reach front of Qi

To leave Pop head of Qi, Broadcast RELEASE to all processes

On receipt of REQUEST(T’) from process j: Add T’ to Qi If waiting for REPLY from j for an earlier request T, wait until j replies to you Otherwise REPLY

• On receipt of RELEASEPop head of Qi

Page 6: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Shared priority queue

Node1: time Action

40 (start)

41 Recv <15,3>

42 Reply to <15,3>

Node2: time Action

11 (start)

12 Recv <15,3>

13 Reply to <15,3>

Node3: time Action

14 (start)

15 Request <15,3>

Q: <15,3>

Q: <15,3>Q: <15,3>

Page 7: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Shared priority queue

Node1: time Action

40 (start)

41 Recv <15,3>

42 Reply to <15,3>

Node2: time Action

11 (start)

12 Recv <15,3>

13 Reply to <15,3>

Node3: time Action

14 (start)

15 Request <15,3>

43 Recv reply 1

44 Recv reply 2

45 Run critical sec-tion

Q: <15,3>

Q: <15,3>

Q: <15,3>

Page 8: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Shared priority queueNode1: time Action

40 (start)

41 Recv <15,3>

42 Reply to <15,3>

43 Requet <43,1>

Node2: time Action

11 (start)

16 Recv <15,3>

17 Reply to <15,3>

18 Request <18,2>

Node3: time Action

14 (start)

15 Request <15,3>

43 Recv reply 1

44 Recv reply 2

45 Run critical sec-tion

46 Recv <43,1>

Reply

48 Recv <18,2>

Q: <15,3>, <43,1>

Q: <15,3>, <18,2>, <45,1>

Q: <15,3>, <18,2>

Page 9: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Shared priority queueNode1: time Action

40 (start)

41 Recv <15,3>

42 Reply to <15,3>

43 Request <43,1>Node2: time Action

11 (start)

16 Recv <15,3>

17 Reply to <15,3>

18 Request <18,2>

50 Recv reply from 1

51 Recv <43,1>

Delay reply be-cause <18,2> is my earlier re-quest which 1 didn’t reply

Node3: time Action

14 (start)

15 Request <15,3>

43 Recv reply 1

44 Recv reply 2

45 Run critical sec-tion

46 Recv <43,1>

47 Reply to 1

48 Recv <18,2>

49 Reply to 2

Q: <15,3>, <43,1>

Q: <15,3>, <18,2>, <45,1>

Q: <15,3>, <18,2>, <43,1>

Page 10: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Shared priority queueNode1: time Action

40 (start)

41 Recv <15,3>

42 Reply to <15,3>

43 Request <43,1>

Recv <18,2>

Reply to 1 <18,2>

Node2: time Action

11 (start)

16 Recv <15,3>

17 Reply to <15,3>

18 Request <18,2>

50 Recv reply from 3

51 Recv <43,1>

Recv reply from 1 <18,2>

Node3: time Action

14 (start)

15 Request <15,3>

43 Recv reply 1

44 Recv reply 2

45 Run critical sec-tion

46 Recv <43,1>

47 Reply to 1

48 Recv <18,2>

49 Reply to 2

Q: <15,3>, <18,2> <43,1>

Q: <15,3>, <18,2>, <43,1>

Q: <15,3>, <18,2>, <43,1>

Page 11: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Shared Queue approach

Everyone eventually sees the same ordering Ordered by Lamport’s clock.

Disadvantages: Very unreliable Any process failure halts progress 3(N-1) messages per entry/exit

Advantages: Fair, Short synchronization delay

Page 12: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Lamport’s Shared Priority Queue

Advantages: Fair Short synchronization delay

Disadvantages: Very unreliable (Any process failure halts

progress) 3(N-1) messages per entry/exit

Page 13: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Today

We want to look at distributed transactions, but first we need to understand transactions in a single machine.

Page 14: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

14

Today's Lecture

Reading CDK5 16.2~.4 Transaction basics Locking and deadlock in transactions

Page 15: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Transactions

A group of operations often represent a unit of “work”. Fundamental abstraction to group operations into a

single unit of work begin: begins the transaction commit: attempts to complete the transaction rollback / abort: aborts the transaction

Page 16: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

16

Transactions

A transaction is a sequence of server operations that is guaranteed by the server to be atomic in the pres-ence of multiple clients and server crashes. Free from interference by operations being performed on

behalf of other concurrent clients Either all of the operations must be completed successfully

or they must have no effect at all in the presence of server crashes

Page 17: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

17

Transactions – The ACID Properties

The four desirable properties for reliable handling of concurrent transactions. (The alternative defini-tion of transactions.)

Atomicity: “All or Nothing” Consistency: Each transaction, if executed by itself, main-

tains the correctness of the database. Isolation (Serializability): each transaction runs as if alone Durability: once a transaction is done, it stays done. Can-

not be undone.

Page 18: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

18

Bank Operations

deposit(amount)deposit amount in the account

withdraw(amount)withdraw amount from the account

getBalance() -> amountreturn the balance of the account

setBalance(amount)set the balance of the account to amount

Operations of the Account interface bool xfer(Account src, Account dest, long x) { Transaction t = begin(); if (src.getBalance() >= x) { src.setBalance(src.getBalance() – x); dest.setBalance(dest.getBalance() + x); return t.commit(); } t.abort(); return FALSE;}

A client’s banking transaction

Page 19: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

19

The transactional model

Applications are coded in a stylized way: begin transaction Perform a series of read, update operations Terminate by commit or abort.

Terminology The application is the transaction manager The data manager is presented with operations from

concurrently active transactions It schedules them in an interleaved but serializable or-

der

Page 20: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

20

Transaction and Data Managers

Transactions

readupdate

read

update

transactions are stateful: transaction “knows” about database contents and updates

Data (and Lock) Managers

Page 21: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

21

Transaction life histories

openTransaction() trans; starts a new transaction and delivers a unique TID trans. This identifier will be used in the other operations

in the transaction. closeTransaction(trans) (commit, abort);

ends a transaction: a commit return value indicates that the transaction has committed; an abort return value indicates that it has aborted.

abortTransaction(trans); aborts the transaction.

Successful Aborted by client Aborted by server

openTransaction openTransaction openTransactionoperation operation operation operation operation operation

server abortstransaction

operation operation operation ERRORreported to client

closeTransaction abortTransaction

Page 22: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

22

Transactional Execution Log

As the transaction runs, it creates a history of its actions. Suppose we were to write down the sequence of opera-tions it performs.

Data manager does this, one by one This yields a “schedule”

Operations and order they executed Can infer order in which transactions ran

Scheduling is called “concurrency control”

Page 23: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Instructor’s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 © Pearson Education 2012

Figure 16.5The lost update problem

Transaction T :

balance = b.getBalance();b.setBalance(balance*1.1);a.withdraw(balance/10)

Transaction U:

balance = b.getBalance();b.setBalance(balance*1.1);c.withdraw(balance/10)

balance = b.getBalance(); $200

balance = b.getBalance(); $200

b.setBalance(balance*1.1); $220

b.setBalance(balance*1.1); $220

a.withdraw(balance/10) $80

c.withdraw(balance/10) $280

Page 24: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Instructor’s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 © Pearson Education 2012

Figure 16.6The inconsistent retrievals problem

Transaction V: a.withdraw(100)b.deposit(100)

Transaction W:

aBranch.branchTotal()

a.withdraw(100); $100

total = a.getBalance() $100

total = total+b.getBalance() $300

total = total+c.getBalance()

b.deposit(100) $300

Page 25: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

25

Concurrency control

Motivation: without concurrency control, we have lost up-dates, inconsistent retrievals, etc.

Concurrency control schemes are designed to allow two or more transactions to be executed correctly while main-taining serial equivalence Serial Equivalence is correctness criterion

Schedule produced by concurrency control scheme should be equivalent to a serial schedule in which transactions are exe-cuted one after the other

Schemes: locking, optimistic concurrency control, time-stamp based concurrency control

Page 26: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

26

Serially Equivalent Interleaving

Means that effect of the interleaved execution is indistin-guishable from some possible serial execution of the committed transactions

For example: T1 and T2 are interleaved but it “looks like” T2 ran before T1

Idea is that transactions can be coded to be correct if run in isolation, and yet will run correctly when executed con-currently (and hence gain a speedup)

Page 27: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

27

Need for serially equivalent interleaving

Data manager interleaves operations to improve concurrency

DB: R1(X) R2(X) W2(X) R1(Y) W1(X) W2(Y) commit1 commit2

T1: R1(X) R1(Y) W1(X) commit1

T2: R2(X) W2(X) W2(Y) commit2

Page 28: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

28

Need for serially equivalent interleaving

Problem: transactions may “interfere”. Here, T2 changes x, hence T1 should have either run first (read and write) or after (reading the changed value).

Unsafe! Not serially equivalent

DB: R1(X) R2(X) W2(X) R1(Y) W1(X) W2(Y) commit2 commit1

T1: R1(X) R1(Y) W1(X) commit1

T2: R2(X) W2(X) W2(Y) commit2

Page 29: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

29

Serially equivalent interleaving

Data manager interleaves operations to improve concurrency but schedules them so that it looks as if one transaction ran at a time. This schedule “looks” like T2 ran first.

DB: R2(X) W2(X) R1(X) W1(X) W2(Y) R1(Y) commit2 commit1

T1: R1(X) R1(Y) W1(X) commit1

T2: R2(X) W2(X) W2(Y) commit2

Page 30: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Conflicting operations

A pair of operations conflicts when their combined ef-fect depends on the ordering.

Read and write operation conflict rules

Operations of differenttransactions

Conflict Reason

read read No Because the effect of a pair of read operations

does not depend on the order in which they are

executed

read write Yes Because the effect of a read and a write operation

depends on the order of their execution

write write Yes Because the effect of a pair of write operations

depends on the order of their execution

Page 31: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Serial equivalence property

For two transactions to be serially equivalent, it is nec-essary and sufficient that all pairs of conflicting opera-tions of the two transactions be executed in the same order at all of the objects they both access.

Page 32: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Recovery from abort

Servers must record all the effects of committed trans-actions and non of the effects of aborted transactions.

Aborted transactions can cause “dirty reads” and “premature writes”.

Page 33: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

33

A dirty read when transaction T aborts

Transaction T:

a.getBalance()a.setBalance(balance + 10)

Transaction U:

a.getBalance()a.setBalance(balance + 20)

balance = a.getBalance() $100

a.setBalance(balance + 10) $110

balance = a.getBalance() $110

a.setBalance(balance + 20) $130

commit transaction

abort transaction

uses result of uncommitted transaction!

Page 34: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

34

Today's Lecture

Transaction basics

Locking and deadlock

Page 35: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

35

Schemes for Concurrency control

Locking Server attempts to gain an exclusive ‘lock’ that is about to

be used by one of its operations in a transaction. Can use different lock types (read/write for example) Two-phase locking

Optimistic concurrency control Time-stamp based concurrency control

Page 36: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

36

What about the locks?

Unlike other kinds of distributed systems, trans-actional systems typically lock the data they ac-cess

They obtain these locks as they run: Before accessing “x” get a lock on “x” Usually we assume that the application knows enough to get

the right kind of lock. It is not good to get a read lock if you’ll later need to update the object

In clever applications, one lock will often cover many objects

Page 37: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

37

Locking rule

Suppose that transaction T will access object x. We need to know that first, T gets a lock that “covers” x

What does coverage entail? We need to know that if any other transaction T’ tries to

access x it will attempt to get the same lock

Page 38: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

38

Examples of lock coverage

We could have one lock per object … or one lock for the whole database (a global lock) … or one lock for a category of objects

In a tree, we could have one lock for the whole tree associated with the root

In a table we could have one lock for row, or one for each col-umn, or one for the whole table

All transactions must use the same rules! And if you will update the object, the lock must be a

“write” lock, not a “read” lock

Page 39: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Global lock?

Only let one transaction run at a time Poor solution Performance issues.

bool xfer(Account src, Account dest, long x) { lock(); if (src.getBalance() >= x) {

src.setBalance(src.getBalance() – x);dest.setBalance(dest.getBalance() + x);unlock();return TRUE;

} unlock(); return FALSE;}

Page 40: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Per-Object Locking

Other transactions can execute concurrently, as long as they don’t read or write the src or dest accounts

bool xfer(Account src, Account dest, long x) { lock(src); if (src.getBalance() >= x) {

src.setBalance(src.getBalance() – x);unlock(src);lock(dest);dest.setBalance(dest.getBalance() + x);unlock(dest);return TRUE;

} unlock(src); return FALSE;}

See any problem?

Page 41: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

Read/Write locks

We can use different type of locks to increase concur-rency.

Read/write locks. Need to respect the conflict rule.

Page 42: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

42

Read/Write locks: Lock compatibility

For one object Lock requested read write

Lock already set none OK OK

read OK wait

write wait wait

Operation Conflict rules:1. If a transaction T has already performed a read operation on a

particular object, then a concurrent transaction U must not writethat object until T commits or aborts

2. If a transaction T has already performed a read operation on a particular object, then a concurrent transaction U must not reador write that object until T commits or aborts

Page 43: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

43

Strict Two-Phase Locking

Strict two-phase locking.

Automatically release all locks upon commit or abort.

Page 44: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

44

Why does strict 2PL imply serializability?

Suppose that T’ will perform an operation that conflicts with an operation that T has done: T’ will update data item X that T read or updated T updated item Y and T’ will read or update it

T must have had a lock on X/Y that conflicts with the lock that T’ wants

T won’t release it until it commits or aborts So T’ will wait until T commits or aborts

Page 45: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

45

Use of locks in strict two-phase locking

1. When an operation accesses an object within a transaction:(a) If the object is not already locked, it is locked and the operation proceeds.(b) If the object has a conflicting lock set by another transaction, the transaction

must wait until it is unlocked.(c) If the object has a non-conflicting lock set by another transaction, the lock is

shared and the operation proceeds.(d) If the object has already been locked in the same transaction, the lock will be

promoted if necessary and the operation proceeds. (Where promotion is pre-vented by a conflicting lock, rule (b) is used.)

Lock promotion: getting a more exclusive lock (e.g., read write lock)2. When a transaction is committed or aborted, the server unlocks all objects it locked

for the transaction.

Page 46: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

46

Deadlock with write locks

Transaction T Transaction U

Operations Locks Operations Locks

a.deposit(100); write lock A

b.deposit(200) write lock B

b.withdraw(100)waits for U’s a.withdraw(200); waits for T’s

lock on B lock on A

Page 47: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

47

Dealing with Deadlock in two-phase locking

Deadlock prevention Acquire all needed locks in a single atomic operation Acquire locks in a particular order Often impractical in practice: transactions may not

know which lock they may need in the future

Page 48: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

48

Dealing with Deadlock in two-phase locking

Deadlock detection Keep graph of locks held. Check for cycles periodically

or each time an edge is added Cycles can be eliminated by aborting transactions

Timeouts (“ignoring”) Aborting transactions when time expires Most transactions are short.

Long-lived ones are probably deadlocked, so abort and retry.

Page 49: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

49

Deadlock detection: The wait-for graph

B

A

Waits for

Held by

Held by

T UU T

Waits for

Page 50: EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS.

50

Timeouts

Transaction T Transaction U

Operations Locks Operations Locks

a.deposit(100); write lock A

b.deposit(200) write lock B

b.withdraw(100)

waits for U’s a.withdraw(200); waits for T’s

lock on B lock on A

(timeout elapses)

T’s lock on A becomes vulnerable,

unlock A, abort T

a.withdraw(200); write locks A

unlock A, B