CSC 536 Lecture 3
Transcript of CSC 536 Lecture 3
Outline
Akka example: mapreduce
Distributed transactions
MapReduce Framework: Motivation
Want to process lots of data ( > 1 TB)
Want to parallelize the job across hundreds or thousands of commodity CPUs connected by a commodity network
Want to make this easy and reusable
Example Uses at Google
Pagerank, wordcount, distributed grep, distributed sort, web link-graph reversal, term-vector per host, web access log stats, inverted index construction, document clustering, machine learning, statistical machine translation, …
Programming Model
Users implement interface of two functions:
mapper(in_key, in_value) -> list((out_key, intermediate_value))
reducer(out_key, intermediate_values list) -> (out_key, out_value)
Map phase
Records from the data source are fed into the mapper function as (key, value) pairs
- (filename, content) (goal: wordcount)
- (web page URL, web page content) (goal: web link-graph reversal)
mapper produces one or more intermediate (output key, intermediate value) pairs from the input
- (word, 1)
- (link URL, web page URL)
Reduce phase
After the Map phase is over, all the intermediate values for a given output key are combined together into a list
(“hello”, 1), (“hello”, 1), (“hello”, 1) -> (“hello”, [1,1,1])
This is done by the intermediate aggregation step of MapReduce.
reducer function combines those intermediate values into one or more final values for that same output key
(“hello”, [1,1,1]) -> (“hello”, 3)
Data flow (diagram): input key/value pairs are read from data stores 1…n and fed to parallel map tasks; each map task emits intermediate (key, values…) pairs. A barrier then aggregates intermediate values by output key. Parallel reduce tasks consume the (key, intermediate values) pairs and produce the final values for each key.
Parallelism
mapper functions run in parallel, creating different intermediate values from different input data sets
reducer functions also run in parallel, each working on a different output key
All values are processed independently
MapReduce example: wordcount
Problem: Count the number of occurrences of words in a set of files
Input to any MapReduce job: A set of (input_key, input_value) pairs
In wordcount: (input_key, input_value) = (filename, content)
```python
filenames = ["a.txt", "b.txt", "c.txt"]
content = {}
for filename in filenames:
    f = open(filename)
    content[filename] = f.read()
    f.close()
```
MapReduce example: wordcount
The content of the input files
a.txt:
The quick brown fox jumped over the lazy grey dogs.
b.txt:
That's one small step for a man, one giant leap for mankind.
c.txt:
Mary had a little lamb, Its fleece was white as snow; And everywhere that Mary went, The lamb was sure to go.
MapReduce example: wordcount
Map phase: Function mapper is applied to every (filename, content) pair
mapper moves through the words in the file; for each word it encounters, it returns the intermediate key and value
(word, 1)
A call to mapper("a.txt", content["a.txt"]) returns:
[('the', 1), ('quick', 1), ('brown', 1), ('fox', 1), ('jumped', 1), ('over', 1), ('the', 1), ('lazy', 1), ('grey', 1), ('dogs', 1)]
The output of the Map phase is the concatenation of the lists for mapper("a.txt", content["a.txt"]), mapper("b.txt", content["b.txt"]), and mapper("c.txt", content["c.txt"])
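The lecture does not show the body of mapper itself; a minimal sketch consistent with the sample output might look like this (lowercasing and stripping punctuation so that "That's" becomes "thats", as in the listed output):

```python
import string

def mapper(in_key, in_value):
    # in_key (the filename) is unused for wordcount.
    # Lowercase and strip punctuation, then emit (word, 1) per word.
    cleaned = in_value.lower().translate(str.maketrans('', '', string.punctuation))
    return [(word, 1) for word in cleaned.split()]
```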
MapReduce example: wordcount
The output of the Map phase
[('the', 1), ('quick', 1), ('brown', 1), ('fox', 1), ('jumped', 1), ('over', 1), ('the', 1), ('lazy', 1), ('grey', 1), ('dogs', 1), ('mary', 1), ('had', 1), ('a', 1), ('little', 1), ('lamb', 1), ('its', 1), ('fleece', 1), ('was', 1), ('white', 1), ('as', 1), ('snow', 1), ('and', 1), ('everywhere', 1), ('that', 1), ('mary', 1), ('went', 1), ('the', 1), ('lamb', 1), ('was', 1), ('sure', 1), ('to', 1), ('go', 1),('thats', 1), ('one', 1), ('small', 1), ('step', 1),('for', 1), ('a', 1), ('man', 1), ('one', 1),('giant', 1), ('leap', 1), ('for', 1), ('mankind', 1)]
MapReduce example: wordcount
The Map phase of MapReduce is logically trivial. But when the input dictionary has, say, 10 billion keys, and those keys point to files held on thousands of different machines, implementing the Map phase is actually quite non-trivial.
The MapReduce library should handle: knowing which files are stored on what machines; making sure that machine failures don’t affect the computation; making efficient use of the network; and storing the output in a usable form.
The programmer only writes the mapper function; the MapReduce framework takes care of everything else.
MapReduce example: wordcount
In preparation for the Reduce phase, the MapReduce library groups together all the intermediate values which have the same key to obtain this intermediate dictionary:
{'and': [1], 'fox': [1], 'over': [1], 'one': [1, 1], 'as': [1], 'go': [1], 'its': [1], 'lamb': [1, 1], 'giant': [1], 'for': [1, 1], 'jumped': [1], 'had': [1], 'snow': [1], 'to': [1], 'leap': [1], 'white': [1], 'was': [1, 1], 'mary': [1, 1], 'brown': [1], 'lazy': [1], 'sure': [1], 'that': [1], 'little': [1], 'small': [1], 'step': [1], 'everywhere': [1], 'mankind': [1], 'went': [1], 'man': [1], 'a': [1, 1], 'fleece': [1], 'grey': [1], 'dogs': [1], 'quick': [1], 'the': [1, 1, 1], 'thats': [1]}
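The grouping the library performs here can be sketched in a few lines of Python (a toy single-machine version of the aggregation step, not the distributed implementation):

```python
from collections import defaultdict

def group_intermediate(pairs):
    # Group all intermediate (key, value) pairs by key, mirroring the
    # aggregation step the MapReduce library performs before Reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)
```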
MapReduce example: wordcount
In the Reduce phase, a programmer-defined function
reducer(out_key, intermediate_value_list)
is applied to each entry in the intermediate dictionary.
For wordcount, reducer sums up the list of intermediate values, and returns both out_key and the sum as the output.
```python
def reducer(out_key, intermediate_value_list):
    return (out_key, sum(intermediate_value_list))
```
MapReduce example: wordcount
The output from the Reduce phase, and from the complete MapReduce computation, is:
[('and', 1), ('fox', 1), ('over', 1), ('one', 2), ('as', 1), ('go', 1), ('its', 1), ('lamb', 2), ('giant', 1), ('for', 2), ('jumped', 1), ('had', 1), ('snow', 1), ('to', 1), ('leap', 1), ('white', 1), ('was', 2), ('mary', 2), ('brown', 1), ('lazy', 1), ('sure', 1), ('that', 1), ('little', 1), ('small', 1), ('step', 1), ('everywhere', 1), ('mankind', 1), ('went', 1), ('man', 1), ('a', 2), ('fleece', 1), ('grey', 1), ('dogs', 1), ('quick', 1), ('the', 3), ('thats', 1)]
MapReduce example: wordcount
Map and Reduce can be done in parallel... but how is the grouping step that takes place between the Map phase and the Reduce phase done? For the reducer functions to work in parallel, we need to ensure that all the intermediate values corresponding to the same key get sent to the same machine.
The general idea: Imagine you’ve got 1000 machines that you’re going to use to run reduce on. As the mapper functions compute the output keys and intermediate value lists, they compute hash(out_key) mod 1000 for some hash function. This number identifies the machine in the cluster that the corresponding reducer will be run on, and the resulting output key and value list is then sent to that machine.
Because every machine running mapper uses the same hash function, value lists corresponding to the same output key all end up at the same machine.
Furthermore, by using a hash we ensure that the output keys end up pretty evenly spread over the machines in the cluster.
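A sketch of this partitioning step (1000 reducers as in the slide; hashlib is used because Python's built-in hash() is randomized per process, and partitioning requires every mapper to agree on the hash):

```python
import hashlib

def partition(out_key, num_reducers=1000):
    # A stable hash: every mapper must send the same key to the same
    # reducer machine, so the hash function must be process-independent.
    digest = hashlib.md5(out_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_reducers
```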
mapreduce example
project mapreduce in lecture 3 code
MapReduce optimizations
- Locality
- Fault tolerance
- Time optimization
- Bandwidth optimization
Locality
Master program divvies up tasks based on location of data: it tries to have mapper tasks on the same machine as the physical file data, or at least the same rack.
mapper task inputs are divided into 64 MB blocks (same size as Google File System chunks).
Redundancy for Fault Tolerance
Master detects worker failures via periodic heartbeats:
- Re-executes completed & in-progress mapper tasks
- Re-executes in-progress reducer tasks
Redundancy for time optimization
Reduce phase can’t start until Map phase is complete
Slow workers significantly lengthen completion time:
- A single slow disk controller can rate-limit the whole process
- Other jobs consuming resources on the machine
- Bad disks with soft errors transfer data very slowly
- Weird things: processor caches disabled
Solution: near the end of a phase, spawn backup copies of tasks. Whichever one finishes first "wins".
Effect: dramatically shortens job completion time.
Bandwidth Optimizations
“Aggregator” function can run on same machine as a mapper function
Causes a mini-reduce phase to occur before the real Reduce phase, to save bandwidth
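For wordcount, such an aggregator (often called a combiner) might be sketched as follows; it pre-sums counts on the mapper's machine so fewer (word, 1) pairs cross the network:

```python
from collections import defaultdict

def combine(mapper_output):
    # Mini-reduce on the mapper's machine: sum counts per word locally
    # before anything is sent over the network to the real reducers.
    partial = defaultdict(int)
    for word, count in mapper_output:
        partial[word] += count
    return sorted(partial.items())
```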
Distributed Transactions
Distributed transactions
Transactions, like mutual exclusion, protect shared data against simultaneous access by several concurrent processes.
Transactions allow a process to access and modify multiple data items as a single atomic transaction.
If the process backs out halfway during the transaction, everything is restored to the point just before the transaction started.
Distributed transactions: example 1
A customer dials into her bank web account and does the following:
Withdraws amount x from account 1.
Deposits amount x to account 2.
If telephone connection is broken after the first step but before the second, what happens?
Either both or neither should be completed.
Requires special primitives provided by the distributed system.
The Transaction Model
Examples of primitives for transactions
| Primitive | Description |
|---|---|
| BEGIN_TRANSACTION | Mark the start of a transaction |
| END_TRANSACTION | Terminate the transaction and try to commit |
| ABORT_TRANSACTION | Kill the transaction and restore the old values |
| READ | Read data from a file, a table, or otherwise |
| WRITE | Write data to a file, a table, or otherwise |
Distributed transactions: example 2
a) Transaction to reserve three flights commits
b) Transaction aborts when third flight is unavailable

(a)
```
BEGIN_TRANSACTION
    reserve WP -> JFK;
    reserve JFK -> Nairobi;
    reserve Nairobi -> Malindi;
END_TRANSACTION
```

(b)
```
BEGIN_TRANSACTION
    reserve WP -> JFK;
    reserve JFK -> Nairobi;
    reserve Nairobi -> Malindi full =>
ABORT_TRANSACTION
```
ACID
Transactions are:
Atomic: to the outside world, the transaction happens indivisibly.
Consistent: the transaction does not violate system invariants.
Isolated (or serializable): concurrent transactions do not interfere with each other.
Durable: once a transaction commits, the changes are permanent.
Flat, nested and distributed transactions
a) A nested transaction
b) A distributed transaction
Implementation of distributed transactions
For simplicity, we consider transactions on a file system.
Note that if each process executing a transaction just updates the file in place, transactions will not be atomic, and changes will not vanish if the transaction aborts.
Other methods required.
Atomicity
If each process executing a transaction just updates the file in place, transactions will not be atomic, and changes will not vanish if the transaction aborts.
Solution 1: Private Workspace
a) The file index and disk blocks for a three-block file
b) The situation after a transaction has modified block 0 and appended block 3
c) After committing
Solution 2: Writeahead Log
(a) A transaction
(b)–(d) The log before each statement is executed

(a)
```
x = 0;
y = 0;
BEGIN_TRANSACTION;
    x = x + 1;
    y = y + 2;
    x = y * y;
END_TRANSACTION;
```

Log:
(b) [x = 0/1]
(c) [x = 0/1] [y = 0/2]
(d) [x = 0/1] [y = 0/2] [x = 0/4]
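A toy write-ahead log in Python (a sketch, not the slide's exact log format: each entry records the key, the value at the time of the write, and the new value, and an abort replays the log backwards):

```python
def run_transaction(store, statements):
    # statements: functions mapping the current store to a
    # (key, new_value) pair, e.g. lambda s: ('x', s['x'] + 1).
    log = []
    for stmt in statements:
        key, new = stmt(store)
        log.append((key, store[key], new))   # write-ahead: log first
        store[key] = new                     # then update in place
    return log

def abort(store, log):
    # Roll back by replaying the log backwards, restoring old values.
    for key, old, _ in reversed(log):
        store[key] = old
```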
Concurrency control (1)
We just learned how to achieve atomicity; we will learn about durability when discussing fault tolerance
Need to handle consistency and isolation
Concurrency control allows several transactions to be executed simultaneously, while making sure that the data is left in a consistent state
This is done by scheduling operations on data in an order whereby the final result is the same as if all transactions had run sequentially
Concurrency control (2)
General organization of managers for handling transactions
Concurrency control (3)
General organization of managers for handling distributed transactions.
Serializability
The main issue in concurrency control is the scheduling of conflicting operations: operations on the same data item, at least one of which is a write operation.
Read/Write operations can be synchronized using:
- Mutual exclusion mechanisms, or
- Scheduling using timestamps
Pessimistic/optimistic concurrency control
The lost update problem
Transaction T:
balance = b.getBalance();
b.setBalance(balance*1.1);
a.withdraw(balance/10)

Transaction U:
balance = b.getBalance();
b.setBalance(balance*1.1);
c.withdraw(balance/10)

Interleaving:
T: balance = b.getBalance();    $200
U: balance = b.getBalance();    $200
T: b.setBalance(balance*1.1);   $220
U: b.setBalance(balance*1.1);   $220
T: a.withdraw(balance/10)       $80
U: c.withdraw(balance/10)       $280

Accounts a, b, and c start with $100, $200, and $300, respectively; T's update to b is lost.
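The interleaving can be replayed by hand in Python to show the lost update (a sketch; account objects are modeled as a plain dict):

```python
def lost_update():
    # Both transactions read b before either writes it, so T's
    # update to b is silently overwritten by U's.
    accounts = {'a': 100.0, 'b': 200.0, 'c': 300.0}
    bal_T = accounts['b']            # T: balance = b.getBalance()   ($200)
    bal_U = accounts['b']            # U: balance = b.getBalance()   ($200)
    accounts['b'] = bal_T * 1.1      # T: b.setBalance(balance*1.1)  ($220)
    accounts['b'] = bal_U * 1.1      # U: b.setBalance(balance*1.1)  ($220, T's update lost)
    accounts['a'] -= bal_T / 10      # T: a.withdraw(balance/10)     ($80)
    accounts['c'] -= bal_U / 10      # U: c.withdraw(balance/10)     ($280)
    return accounts
```

Had the transactions run serially, b would end at $242, not $220.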
The inconsistent retrievals problem
Transaction V:
a.withdraw(100)
b.deposit(100)

Transaction W:
aBranch.branchTotal()

Interleaving:
V: a.withdraw(100);                 $100
W: total = a.getBalance()           $100
W: total = total + b.getBalance()   $300
W: total = total + c.getBalance()   ...
V: b.deposit(100)                   $300

Accounts a and b start with $200 each.
A serialized interleaving of T and U
Transaction T:
balance = b.getBalance()
b.setBalance(balance*1.1)
a.withdraw(balance/10)

Transaction U:
balance = b.getBalance()
b.setBalance(balance*1.1)
c.withdraw(balance/10)

Interleaving:
T: balance = b.getBalance()     $200
T: b.setBalance(balance*1.1)    $220
U: balance = b.getBalance()     $220
U: b.setBalance(balance*1.1)    $242
T: a.withdraw(balance/10)       $80
U: c.withdraw(balance/10)       $278
A serialized interleaving of V and W
Transaction V:
a.withdraw(100);
b.deposit(100)

Transaction W:
aBranch.branchTotal()

Interleaving:
V: a.withdraw(100);                 $100
V: b.deposit(100)                   $300
W: total = a.getBalance()           $100
W: total = total + b.getBalance()   $400
W: total = total + c.getBalance()   ...
Read and write operation conflict rules
| Operations of different transactions | Conflict | Reason |
|---|---|---|
| read, read | No | The effect of a pair of read operations does not depend on the order in which they are executed |
| read, write | Yes | The effect of a read and a write operation depends on the order of their execution |
| write, write | Yes | The effect of a pair of write operations depends on the order of their execution |
Serializability
Two transactions are serialized if and only if all pairs of conflicting operations of the two transactions are executed in the same order at all objects they both access.
A non-serialized interleaving of operations of transactions T and U
| Transaction T | Transaction U |
|---|---|
| x = read(i) | write(i, 10) |
| y = read(j) | write(j, 30) |
| write(j, 20) | z = read(i) |
Recoverability of aborts
Aborted transactions must be prevented from affecting other concurrent transactions:
- Dirty reads
- Cascading aborts
A dirty read when transaction T aborts
Transaction T:
balance = a.getBalance()
a.setBalance(balance + 10)

Transaction U:
balance = a.getBalance()
a.setBalance(balance + 20)

Interleaving:
T: balance = a.getBalance()      $100
T: a.setBalance(balance + 10)    $110
U: balance = a.getBalance()      $110
U: a.setBalance(balance + 20)    $130
U: commit transaction
T: abort transaction
Cascading aborts
Suppose:
- Transaction U has seen the effects of transaction T
- Transaction V has seen the effects of transaction U
- T decides to abort
V and U must abort
Transactions T and U with locks

Transaction T:
balance = b.getBalance()
b.setBalance(bal*1.1)
a.withdraw(bal/10)

Transaction U:
balance = b.getBalance()
b.setBalance(bal*1.1)
c.withdraw(bal/10)

| T: operations | T: locks | U: operations | U: locks |
|---|---|---|---|
| openTransaction | | | |
| bal = b.getBalance() | lock B | | |
| b.setBalance(bal*1.1) | | openTransaction | |
| a.withdraw(bal/10) | lock A | bal = b.getBalance() | waits for T’s lock on B |
| closeTransaction | unlock A, B | | |
| | | | lock B |
| | | b.setBalance(bal*1.1) | |
| | | c.withdraw(bal/10) | lock C |
| | | closeTransaction | unlock B, C |
Two-phase locking (2)
Idea: the scheduler grants locks in a way that creates only serializable schedules.
In two-phase locking, the transaction acquires all the locks it needs in the first phase, and then releases them in the second. This ensures a serializable schedule.
Dirty reads and cascading aborts are still possible
Under strict 2-phase locking, a transaction that needs to read or write an object must be delayed until other transactions that wrote the same object have committed or aborted
Locks are held until transaction commits or aborts
Example: CORBA Concurrency Control Service
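The two phases can be sketched in a toy Python lock manager (an illustration only: a real scheduler would block the requesting transaction rather than raise, and would support read/write lock modes):

```python
class TwoPhaseLockingTxn:
    # Strict 2PL sketch: locks are acquired as objects are touched
    # (growing phase) and all released at commit (shrinking phase).
    def __init__(self, lock_table):
        self.lock_table = lock_table     # shared: object name -> holder
        self.held = []

    def lock(self, obj):
        holder = self.lock_table.get(obj)
        if holder is not None and holder is not self:
            raise RuntimeError("would wait for lock on " + obj)
        if obj not in self.held:
            self.lock_table[obj] = self
            self.held.append(obj)

    def commit(self):
        for obj in self.held:            # release everything at once
            del self.lock_table[obj]
        self.held = []
```

Because nothing is released before commit, a concurrent transaction can never read an uncommitted value, which is exactly what rules out dirty reads under the strict variant.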
Two-phase locking in a distributed system
The data is assumed to be distributed across multiple machines
Centralized 2PL: central scheduler grants locks
Primary 2PL: local scheduler is coordinator for local data
Distributed 2PL (data may be replicated):
- The local schedulers use a distributed mutual exclusion algorithm to obtain a lock
- The local scheduler forwards Read/Write operations to data managers holding the replicas
Two-phase locking issues
Exclusive locks reduce concurrency more than necessary. It is sometimes preferable to allow concurrent transactions to read an object; two types of locks may be needed (read locks and write locks)
Deadlocks are possible.
- Solution 1: acquire all locks in the same order.
- Solution 2: use a graph to detect potential deadlocks.
Deadlock with write locks
| T: operations | T: locks | U: operations | U: locks |
|---|---|---|---|
| a.deposit(100); | write lock A | | |
| | | b.deposit(200) | write lock B |
| b.withdraw(100) | waits for U’s lock on B | a.withdraw(200); | waits for T’s lock on A |
The wait-for graph
(Diagram: T holds the lock on A and waits for B; U holds the lock on B and waits for A. The wait-for graph contains the cycle T → U → T.)
A cycle in a wait-for graph
(Diagram: transactions T, U, and V form a cycle in the wait-for graph.)
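Detecting such a cycle is straightforward when each blocked transaction waits on exactly one lock holder (a simplifying assumption for this sketch; with multiple outstanding waits a general graph cycle search is needed):

```python
def has_deadlock(waits_for):
    # waits_for maps each blocked transaction to the transaction it is
    # waiting on. Revisiting a node along its own chain means the
    # wait-for graph contains a cycle, i.e. a deadlock.
    for start in waits_for:
        seen, node = set(), start
        while node in waits_for:
            if node in seen:
                return True
            seen.add(node)
            node = waits_for[node]
    return False
```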
Deadlock prevention with timeouts
| T: operations | T: locks | U: operations | U: locks |
|---|---|---|---|
| a.deposit(100); | write lock A | | |
| | | b.deposit(200) | write lock B |
| b.withdraw(100) | waits for U’s lock on B | a.withdraw(200); | waits for T’s lock on A |
| (timeout elapses) | T’s lock on A becomes vulnerable: unlock A, abort T | | |
| | | a.withdraw(200); | write locks A; unlock A, B |
Disadvantages of locking
High overhead
Deadlocks
Locks cannot be released until the end of the transaction, which reduces concurrency
In most applications, the likelihood of two clients accessing the same object is low
Pessimistic timestamp concurrency control
A transaction’s request to write an object is valid only if that object was last read and written by an earlier transaction
A transaction’s request to read an object is valid only if that object was last written by an earlier transaction
Advantage: Non-blocking and deadlock-free
Disadvantage: Transactions may need to abort and restart
Operation conflicts for timestamp ordering
| Rule | Tc | Ti | Description |
|---|---|---|---|
| 1. | write | read | Tc must not write an object that has been read by any Ti where Ti > Tc; this requires that Tc ≥ the maximum read timestamp of the object. |
| 2. | write | write | Tc must not write an object that has been written by any Ti where Ti > Tc; this requires that Tc > write timestamp of the committed object. |
| 3. | read | write | Tc must not read an object that has been written by any Ti where Ti > Tc; this requires that Tc > write timestamp of the committed object. |
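The three rules reduce to two small checks; a sketch, with each object carrying the maximum read timestamp and the committed write timestamp as plain dict fields:

```python
def can_write(tc, obj):
    # Rules 1 and 2: tc may write only if no later transaction has
    # read the object (tc >= max read ts) or written it (tc > write ts).
    return tc >= obj['max_read_ts'] and tc > obj['write_ts']

def can_read(tc, obj):
    # Rule 3: tc may read only if no later transaction has written it.
    return tc > obj['write_ts']
```

A transaction whose check fails is aborted and restarted with a fresh timestamp, which is why this scheme is deadlock-free but may redo work.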
Pessimistic Timestamp Ordering
Concurrency control using timestamps.
Optimistic timestamp ordering
Idea: just go ahead and do the operations without paying attention to what concurrent transactions are doing:
Keep track of when each data item has been read and written.
Before committing, check whether any item has been changed since the transaction started. If so, abort; if not, commit.
Advantage: deadlock free and fast.
Disadvantage: it can fail, and transactions must be run again.
Example: Scala Software Transactional Memory (next week)
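The scheme can be sketched with per-item version numbers (an illustrative design choice: versions stand in for "when each data item has been read and written"; real STMs differ in detail):

```python
class OptimisticTxn:
    # Read freely, remembering the version of every item read; buffer
    # writes; at commit, abort if any item read has changed meanwhile.
    def __init__(self, store):
        self.store = store          # shared: item -> (version, value)
        self.read_versions = {}
        self.writes = {}

    def read(self, item):
        version, value = self.store[item]
        self.read_versions[item] = version
        return value

    def write(self, item, value):
        self.writes[item] = value   # not visible until commit

    def commit(self):
        for item, seen in self.read_versions.items():
            if self.store[item][0] != seen:
                return False        # conflict: abort, caller retries
        for item, value in self.writes.items():
            version = self.store.get(item, (0, None))[0]
            self.store[item] = (version + 1, value)
        return True
```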