1 Formal Models for Distributed Negotiations Commit Protocols Roberto Bruni Dipartimento di...
-
date post
22-Dec-2015 -
Category
Documents
-
view
212 -
download
0
Transcript of 1 Formal Models for Distributed Negotiations Commit Protocols Roberto Bruni Dipartimento di...
1
Formal Models forDistributed NegotiationsCommit Protocols
Roberto BruniDipartimento di Informatica Università di Pisa
XVII Escuela de Ciencias Informaticas (ECI 2003), Buenos Aires, July 21-26 2003
Formal Models for Distributed
Negotiations 2
Distributed DataBases Data can be inherently distributed
e.g. customers accounts in different branches of the same bank
Data are distributed to achieve failure independence e.g. replicated file systems
Partial failures can lead to inconsistent results Commits have to be coordinated among
participants to preserve data consistency
Formal Models for Distributed
Negotiations 3
Distributed DataBases
DBuser
user
useruser
user
Centralized Distributed
Formal Models for Distributed
Negotiations 4
Atomic Commitment Problem
Reach a globally consistent state despite failures Each participant has two possible decision values
commit All participants will make the transaction’s updates permanent
abort All will roll-back
Individual decisions are irreversible A commit decision requires unanimity of YES votes
Formal Models for Distributed
Negotiations 5
Atomic Commitment Properties
Consensus All participants that decide reach the same
decision If any participant decides commit, then
all participants must have voted YES If all participants have voted YES and no
failures occur, then commit is decided Irreversibility
Each participant decides at most once
Formal Models for Distributed
Negotiations 6
Commitment Protocols Atomic commitment protocol
satisfies all atomic commitment properties ensures that transactions terminate consistently
at all participating sites of a distributed database, even in presence of failures
Non-blocking if it permits transaction termination to proceed at
correct participants despite failures of others is the activity of ensuring that Sw and Hw failures do
not corrupt persistent data can limit time intervals of resource locking
Formal Models for Distributed
Negotiations 7
Some Assumptions One of the participants acts as unique coordinator
(centralized version) At most one (if no failures, then there is one
coordinator) A participant assumes the role of coordinator within a
fixed time interval from the beginning of the transaction
The transaction begins at a single participant called the invoker sends start messages to other participants
Only undeliverable messages are dropped All participants can communicate (useful later)
Formal Models for Distributed
Negotiations 8
Generic ACP: Coordinator send VOTE-REQ[Tid] to all participants set-timeout wait-for vote[Tid] from all participants
if (all votes are YES) then broadcast(commit[Tid], participants)
else // at least one vote is NO broadcast(abort[Tid], participants)
on-timeout: // escape blocking wait-for broadcast(abort[Tid], participants)
Phase 1
Phase 2
Formal Models for Distributed
Negotiations 9
Generic ACP: Participants set-timeout wait-for VOTE-REQ[Tid] from coordinator // 1
send vote[Tid] to coordinator if (vote==NO) then // unilateral abort
decide abort else
set-timeout wait-for decision from coordinator // 2
if (decision==abort) then decide abort else decide commit
on-timeout: termination-protocol // escape 2 on-timeout: decide abort //escape 1
Formal Models for Distributed
Negotiations 10
Simple Broadcast broadcast(m,S) // Broadcaster
send m to all processes in S deliver m
// other processes in S upon-receipt m // non-blocking
deliver m
This corresponds to the 2PC Protocol
Formal Models for Distributed
Negotiations 11
Timeout Actions Participants must wait
VOTE_REQ from coordinator If this takes too long can just decide abort
Coordinator collects votes No global decision is yet made Coordinator can decide abort
commit / abort from coordinator The participants already took a decision (YES) It is now uncertain It must consult other participants according to the
termination protocol
Formal Models for Distributed
Negotiations 12
Termination Protocol (TP) What if a participant that voted YES times out
waiting for the response from coordinator? It invokes a termination protocol to contact:
the coordinator other participants (cooperative TP)
can have already voted or not yet voted There are failure scenarios for which no termination
protocol can lead to a decision Blocking scenario: correct participants cannot decide e.g. coordinator crashes during broadcast
all faulty participants deliver and crash all correct participants do not deliver the decision if faulty participants do not recover any decision could
contradict the decision of a participant that crashed
Formal Models for Distributed
Negotiations 13
Non-Blocking ACP I set-timeout wait-for VOTE-REQ[Tid] from coordinator // 1
send vote[Tid] to coordinator if (vote==NO) then // unilateral abort
decide abort else
set-timeout wait-for decision from coordinator // 2
if (decision==abort) then decide abort else decide commit
on-timeout: decide abort // escape 2 on-timeout: decide abort //escape 1
Formal Models for Distributed
Negotiations 14
Non-Blocking ACP II broadcast(m,S) // Broadcaster as before // other processes in S
upon-first-receipt m send m to all processes in S // S can be sent along
VOTE_REQ deliver m
any process receiving m relays m to all others (if any correct process receives m, all correct process receive m, even if broadcaster crashes)
m is delivered only after relaying
Formal Models for Distributed
Negotiations 15
Recovery Participant p is recovering from a failure
Must reach a consistent decision Suppose p remembers its state at the time it
failed Before voting
it can unilaterally abort After deciding abort
it can unilaterally abort After receiving commit / abort from coordinator
it had already decided and must behave accordingly During the uncertainty period (voted YES)
Independent recovery is not possible! Termination protocol is needed
Formal Models for Distributed
Negotiations 16
Distributed Transaction Log
DTL is kept in stable storage at each site Its content must survive failures
Coordinators and participants at that site can record information about transactions Before/after sending VOTE_REQ, the coordinator C writes
start2PC(S,Tid) Before voting YES, a participant writes yes(C,S,Tid) Before/after voting NO, a participant writes abort(Tid) Before C sends commit, it writes commit(Tid) Before/after C sends abort, it writes abort(Tid) After receiving the decision, participant writes
commit/abort
Formal Models for Distributed
Negotiations 17
Recovery From DTL If DTL contains start2PC (the site hosted the
coordinator) If it also contains commit/abort
The coordinator decided before failure Otherwise
The coordinator can decide abort (and record it in DTL) Otherwise
It contains commit/abort The participant has reached decision before the failure
Does not contain yes Either failed before voting or voted no The participant can unilaterally abort
Otherwise (it contains yes but not commit/abort) The participant failed in its uncertainty period Must use the termination protocol
Formal Models for Distributed
Negotiations 18
Cooperative TP: Initiator send DECISION_REQ[Tid] to all processes in S wait-for decision[Tid] from any process
if (decision==commit) then write commit in DTL
else // decision==abort write abort in DTL
Formal Models for Distributed
Negotiations 19
Cooperative TP: Responder
wait-for decision[Tid] from any process p if (abort(Tid) in DTL) then
send abort to p else if (commit(Tid) in DTL) then
send commit to p
Formal Models for Distributed
Negotiations 20
Evaluation of 2PC Criteria: Reliability vs Efficiency
Resiliency What failures can be tolerated?
Blocking Can processes be blocked? Under which conditions?
Time Complexity How long does it take to reach a decision?
Message Complexity How many messages are exchanged to reach a
decision? What are their dimensions?
Formal Models for Distributed
Negotiations 21
Balancing Reliability and Efficiency are conflicting goals
each can be achieved at the expenses of the other The choice of protocol depends on which goal
is more important for a specific application Whatever protocol is chosen, we should
optimize for the case with no failures Hopefully the normal operating state of the system
Formal Models for Distributed
Negotiations 22
Measuring Time Complexity
A round is the max time for a message to reach its destination
Timeouts are based on the assumption that such a delay is known Note that many messages can be sent in a single round
Two messages must belong to different rounds iff one cannot be sent before the other is received
Rounds are taken as time units We count the number of rounds needed for unblocked sites to reach
a decision, in the worst case This neglects the time needed to process messages
Reasonable: messages delays usually exceed processing delays Other two factors can be relevant:
DTL management (on stable storage) Broadcasting preparation (to a large number of processes)
Formal Models for Distributed
Negotiations 23
Measuring Message Complexity
Number of messages sent during the whole protocol Reasonable measure if individual
messages are not very large Otherwise we should measure the length
of messages, not merely their number Here messages are short, so we abstract
away from their lengths
Formal Models for Distributed
Negotiations 24
Reliability of 2PC Resiliency
2PC is resilient to site failures communication failures
In fact, the cause of timeouts is not important
Blocking 2PC is subject to blocking
Probabilistic analysis can be performed depending on the probabilistic distribution of failures
Formal Models for Distributed
Negotiations 25
Time Complexity of 2PC In absence of failure, 2PC requires 3 rounds
Broadcast VOTE-REQ Collect votes Broadcast global decision
If failures happen, The TP may need 2 additional rounds Broadcast DECISION_REQ Reply from a process outside its uncertainty period
Note that several TPs can be initiated separately in the same round
Up to 5 rounds, independently from the number of failures! But processes may remain blocked for an unbounded period of time
Formal Models for Distributed
Negotiations 26
Message Complexity of 2PC
Let N+1 be the number of participants, including the coordinator
In each round of 2PC, there are N messages sent Hence, in absence of failures 2PC uses 3N messages Cooperative TP is invoked by all participants that
voted YES but did not receive commit / abort Let there be M such participants M initiators, each sending N DECISION_REQ (MN messages) At most N-M+1 processes will respond to the first request
In the worst case only one process abandons its uncertainty and will respond to another initiator: (N-M+1)+(N-M+2)+…+N
Formal Models for Distributed
Negotiations 27
Calculating the Message Complexity of 2PC
In the worst case the total number of TP messages will be:
NM + i=1 (N-M+i) = NM + NM – M2 + M(M+1)/2 = 2NM – M2/2 + M/2 messages This quantity is maximum when M=N
N(3N+1)/2 messages The 2PC together with worst-case TP amount to
3N + N(3N+1)/2 = N(3N+7)/2 messages
M
Formal Models for Distributed
Negotiations 28
Communication Topology The communication topology of a protocol
is the specification of who sends messages to whom e.g. in 2PC without TP, the coordinator sends
messages to participants and vice versa Participants do not send messages directly to each
other The topology is described as a tree of height 1
Coordinator
Participant Participant Participant Participant…
Formal Models for Distributed
Negotiations 29
Alternative 2PCs To reduce time and message complexity of
centralized 2PC, two variations have been proposed, based on different communication topologies Decentralized 2PC
Communication topology is a complete graph Improve time complexity
Linear 2PC (aka Nested 2PC) Linearly ordered processes Reduce the number of messages
Formal Models for Distributed
Negotiations 30
Decentralized 2PC Depending on its own vote, the coordinator sends YES
or NO to all participants Informs that it is time to vote Tells the coordinator’s vote
If the message is NO Each participant decides abort and stops
Otherwise, each participant sends back its vote to ALL OTHER PARTICIPANTS
After receiving all votes each process can decide autonomously
If all are YES and its own vote is YES, decide commit Otherwise it decides abort
Timeouts can be employed as in the centralized 2PC
Formal Models for Distributed
Negotiations 31
Evaluation of Decentralized 2PC
In the absence of failures, only 2 rounds are necessary Coordinator voting YES / NO Each participant voting YES / NO
More messages are needed: N2+N messages N messages in the first round N2 messages in the second round (and this is just in absence of failures)
Formal Models for Distributed
Negotiations 32
Linear 2PC Each participant can communicate only with its left
/ right neighbors The coordinator is the leftmost process
It sends its vote YES / NO to its right neighbor This message has a dual meaning as in decentralized 2PC
Each participant p waits for the vote from its left neighbor If it is YES, and p votes YES, then p tells YES to its right
neighbor Otherwise, p tells NO to its right neighbor
When the rightmost participant receives the vote, it makes the final decision commit / abort
The decision is propagated from right to left When the coordinator receives it, the protocol ends
Timeout periods are influenced by positions
Formal Models for Distributed
Negotiations 33
Evaluation of Linear 2PC
Only 2N messages needed N votes from left to right N decisions from right to left (and this is just in absence of failures)
Unfortunately the same amount of rounds is needed: 2N rounds No two messages are sent concurrently
Formal Models for Distributed
Negotiations 34
Comparison of 2PC Variants
Rounds
Messages
Centralized 2PC 3 3N
Decentralized 2PC
2 N2+N
Linear 2PC 2N 2N Hybrid communication topologies are also possible e.g. Linear for voting, complete for conveying decision
2N messages, N+1 rounds
The choice of the protocol might be influenced by the available communication topology
Formal Models for Distributed
Negotiations 35
From 2PC to 3PC In 2PC, if all operational participants are
uncertain, they are blocked They cannot decide abort even if aware that
processes they cannot communicate with have failed, because some of them could have decided commit before failure
The 3CP is an ACP designed to rule out this situation It guarantees that if any operational process is uncertain, then
no (operational / failed) process can have decided commit Thus, if p realizes that any operational site is uncertain, then p
can decide abort Why does 2PC violate this property?
A participant p can receive commit while q is still uncertain
Formal Models for Distributed
Negotiations 36
Sketch of 3PC: The Idea After the coordinator has found that all votes were
YES, it sends pre-commit messages to all participants When a participant p receives pre-commit, it knows
that all participants voted YES p is no longer uncertain, but does not decide commit yet p knows that it will decide commit unless it fails p acknowledges the receipt of pre-commit
When the coordinator collects all acks it knows that no participant is uncertain
The coordinator sends commit to all participants When a participant receives commit, it decides commit If a participant voted NO, then 3PC behaves as 2PC
Formal Models for Distributed
Negotiations 37
Sketch of 3PC: Some Notes
In absence of failures, 3PC involves 5 rounds and up to 5N messages
Participants have four possible states Aborted, Uncertain, Committable, Committed For p and q any two participants, only certain combinations of
their states are possible Timeouts can occur in five situations
3 are trivially handled 2 require a complex termination protocol
Election protocol (for a new coordinator) based on a linear ordering of participants
The new coordinator checks the states of all operational participants
Timeouts are again necessary
Formal Models for Distributed
Negotiations 38
Recap We have seen
Atomic Commitment Problem Several ACP protocols
Generic ACP Centralized 2PC (Good middle ground) Non-Blocking ACP Decentralized 2PC (OK if end-to-end delays must be
minimized) Linear 2PC (OK if messages are expensive) 3PC (sketched)
Learned some criteria to evaluate and compare protocols
Usually also dependent on the communication topology