Commitment & Consensus Kai Engelhardtcs3151/17s2/lec/PDF/lecture12b.pdfDistributed Programs...
Transcript of Commitment & Consensus Kai Engelhardtcs3151/17s2/lec/PDF/lecture12b.pdfDistributed Programs...
Distributed Programs Commitment Consensus
COMP3151/9151 Foundations ofConcurrency Lecture 12a’
Commitment & Consensus
Kai Engelhardt
CSE, UNSW (and data61)
Revision: 1.2 of Date: 2017/10/11 02:18:25 UTC
1
Distributed Programs Commitment Consensus
Quote of the day
A distributed system is one in which the failure of a computer youdidn’t even know existed can render your own computerunusable. L. Lamport, 1987
2
Distributed Programs Commitment Consensus
Sending and Receiving Messages
node 5
integer k ← 20send(request, 3, k, 30)
node 3
integer m, nreceive(request, m, n)
-
Senders are anonymous be default. Messages can be chosen based on tags(eg request) just as in MPI.
3
Distributed Programs Commitment Consensus
Sending a Message and Expecting a Reply
node 5
send(request, 3, myID)
node 3
integer sourcereceive(request, source)
-
Senders can reveal their identity.
4
Distributed Programs Commitment Consensus
Assumptions
Atomicity of local code (if we have more than one process at a node)
Network topology = full graph
Ben-Ari’s basic model is Rel: reliable asynchronous message passing withpossible reordering of messages.
5
Distributed Programs Commitment Consensus
Architecture for a Fault-Tolerant System
Pressure
Temperature
������������������������
CPU
CPU
CPU
�
����3
-
JJJJJJJ
QQQQs
-HHH
Hj
����*
-�� ��Comparator -
Throttle
����
6
Distributed Programs Commitment Consensus
Failure Models
In the literature there are many more, but we focus on two extremes:
Crash failures failed nodes stop sending messages
Byzantine failures failed nodes can send arbitrary messages, evenaccording to a malevolent plan
7
Distributed Programs Commitment Consensus
Commitment Problem
Suppose n agents collaborate on a database transaction.
After each of the agents has done its share of the transaction, they wantto come to an agreement on whether to commit the transaction for lateruse of the result by other transactions.
Each agent can be assumed to have formed an initial vote but not yetmade a final decision.
All that remains to be done is to ensure that no two agents make differentdecisions.
8
Distributed Programs Commitment Consensus
Commitment Spec
1 All agents that reach a decision reach the same one.
2 If an agent decides to commit, then all agents voted to commit.
3 If there are no failures and all agents voted to commit, then thedecision reached is to commit.
Failure model: Only agents can fail, and if they do then they crash.
9
Distributed Programs Commitment Consensus
2-Phase Commit Algorithm
One distinguished agent, say, agent 1, collects the other agents’ votes. Ifall votes (including agent 1’s) are “commit” then agent 1 tells all otheragents to commit, otherwise, to abort. Note that “otherwise” could alsomean that some agent crashed before being able to communicate its voteto agent 1.
Theorem
2-Phase Commit solves the commitment problem but may fail to terminateif processes fail.
For stronger solutions, e.g. 3-Phase Commit, see [Lyn96].
10
Distributed Programs Commitment Consensus
1st Consensus Problem
A group of Byzantine armies is surrounding an enemy city. Theballance of force is such that if they all attack together, they cancapture the city; otherwise they must retreat in order to avoiddefeat.
So far this resembles Exercise 9.2 scaled to n generals instead of 2.
The generals of the armies have reliable messengers whosuccessfully deliver any mesage sent from one general to another.However, some of the generals may be traitors endeavouring tobring about defeat of the Byzantine armies.
11
Distributed Programs Commitment Consensus
Bad NewsQuoting and adapting from the PODC webpage for the influential paperaward 2001 (AKA Dijkstra Prize from 2003 onward):
Theorem ([FLP85])
It is impossible for a set of processors in an asynchronous distributedsystem to agree on a binary value, even if only a single processor is subjectto an unannounced crash.
Proof sketch
Initially, either decision, 0 or 1, is possible. Assume a correct algorithm. Atsome point in time the system as a whole must commit to one value or theother. That commitment must result from some action of a single process.Suppose that process fails. Then there is no way for the other processes toknow the commitment value; hence, they will sometimes make the wrongdecision. Contradiction!
In order to get anywhere near consensus in the presence of faults, we needa bit of synchrony. With some synchrony, we can do much better and eventolerate some Byzantine faults.
12
Distributed Programs Commitment Consensus
Consider
crash failures (assumed detectable eg by time-out) and
Byzantine failures (detectable at most indirectly).
Task
Devise an algorithm so that the loyal generals come to a consensus on aplan. The final decision should be almost the same as a majority vote oftheir initial choices; if the vote is tied, the final decision is to retreat.
13
Distributed Programs Commitment Consensus
planType = {A,R} for attack and retreat.
Algorithm 10.1: Consensus—one-round algorithmplanType finalPlanplanType array[generals] plan
p1: plan[myID] ← chooseAttackOrRetreatp2: for all other generals Gp3: send(G, myID, plan[myID])p4: for all other generals Gp5: receive(G, plan[G])p6: finalPlan ← majority(plan)
14
Distributed Programs Commitment Consensus
Messages Sent in a One-Round Algorithm
Suppose Basil crashed after sending to Leo but before sending to Zoe.
Zoe A Leo R
Basil A
-�
����� @
@@@R @@
@@@I
R
A
A A R
��
���
–
15
Distributed Programs Commitment Consensus
Messages Sent in a One-Round Algorithm
Suppose Basil crashed after sending to Leo but before sending to Zoe.
Zoe A Leo R
Basil A
-�
����� @
@@@R @@
@@@I
R
A
A A R
��
���
–
16
Distributed Programs Commitment Consensus
Data Structures in a One-Round Algorithm
Leogeneral planBasil ALeo RZoe Amajority
Zoegeneral plansBasil –Leo RZoe Amajority
Thus, the One-Round Algorithm cannot even tolerate a single crash failureamong 3 generals.
Idea
Relay received messages in a further round.
17
Distributed Programs Commitment Consensus
Data Structures in a One-Round Algorithm
Leogeneral planBasil ALeo RZoe Amajority A
Zoegeneral plansBasil –Leo RZoe Amajority R
Thus, the One-Round Algorithm cannot even tolerate a single crash failureamong 3 generals.
Idea
Relay received messages in a further round.
18
Distributed Programs Commitment Consensus
Algorithm 10.2: Consensus—Byzantine Generals algorithmplanType finalPlanplanType array[generals] plan, majorityPlanplanType array[generals, generals] reportedPlan
p1: plan[myID] ← chooseAttackOrRetreatp2: for all other generals G // First roundp3: send(G, myID, plan[myID])p4: for all other generals Gp5: receive(G, plan[G])p6: for all other generals G // Second roundp7: for all other generals G’ except Gp8: send(G’, myID, G, plan[G])p9: for all other generals Gp10: for all other generals G’ except Gp11: receive(G, G’, reportedPlan[G, G’])p12: for all other generals G // First votep13: majorityPlan[G] ← majority(plan[G] ∪ reportedPlan[*, G])p14: majorityPlan[myID] ← plan[myID] // Second votep15: finalPlan ← majority(majorityPlan)
19
Distributed Programs Commitment Consensus
Data Structure for Crash Failure - First Scenario(Leo)
Leogeneral plan reported by majority
Basil ZoeBasil A – ALeo R RZoe A – Amajority A
20
Distributed Programs Commitment Consensus
Data Structure for Crash Failure - First Scenario(Zoe)
Zoegeneral plan reported by majority
Basil LeoBasil – A ALeo R – RZoe A Amajority A
21
Distributed Programs Commitment Consensus
Data Structure for Crash Failure - Second Scenario(Leo)
Leogeneral plan reported by majority
Basil ZoeBasil A A ALeo R RZoe A A Amajority A
22
Distributed Programs Commitment Consensus
Data Structure for Crash Failure - Second Scenario(Zoe)
Zoegeneral plan reported by majority
Basil LeoBasil A A ALeo R – RZoe A Amajority A
23
Distributed Programs Commitment Consensus
Knowledge Tree about Basil for Crash Failure - FirstScenario
Suppose Basil has chosen A and sends all messages.
Leo A Zoe A
Zoe A Leo A
Basil A
����
HHHH
A concise representation of what is known about the root node of the tree;in traditional epistemic logic with a knowledge modality Ki and attackpredicate Ai for each general i we’d write:
ABasil ∧ KZoeABasil ∧ KLeoABasil ∧ KLeoKZoeABasil ∧ KZoeKLeoABasil
24
Distributed Programs Commitment Consensus
Knowledge Tree about Basil for Crash Failure -Second Scenario
Suppose Basil has chosen X ∈ {A,R} and crashes right before sending the1st round message to Zoe but after sending it to Leo.
Zoe X
Leo X
Basil X
HHHH
XBasil ∧ KLeoXBasil ∧ KZoeKLeoXBasil
25
Distributed Programs Commitment Consensus
Knowledge Tree about Leo for Crash Failure
Suppose Leo has chosen X and Basil crashes anytime before sending the2nd round message to Zoe.
Basil X
Zoe X Basil X
Leo X
����
HHHH
XLeo ∧ KZoeXLeo ∧ KBasilXLeo ∧ KBasilKZoeXLeo
26
Distributed Programs Commitment Consensus
Messages Sent for Byzantine Failure with ThreeGenerals
Zoe A Leo R
Basil
-������
��
���
@@@@R @@
@@@I
R
A
R A A R
27
Distributed Programs Commitment Consensus
Data Stuctures for Leo and Zoe After First Round
Leogeneral plansBasil ALeo RZoe Amajority A
Zoegeneral plansBasil RLeo RZoe Amajority R
28
Distributed Programs Commitment Consensus
Data Stuctures for Leo After Second Round
Leogeneral plans reported by majority
Basil ZoeBasil A A ALeo R RZoe A R Rmajority R
29
Distributed Programs Commitment Consensus
Data Stuctures for Zoe After Second Round
Zoegeneral plans reported by majority
Basil LeoBasil A A ALeo R R RZoe A Amajority A
Thus the Byzantine Generals Algorithm is incorrect for 3 generals of which1 is a traitor.
30
Distributed Programs Commitment Consensus
Knowledge Tree About Zoe
Basil A
Basil A
Zoe A
����
HHHH
Leo R
Leo A
31
Distributed Programs Commitment Consensus
Four Generals: Data Structure of Basil (1)
Basilgeneral plan reported by majority
John Leo ZoeBasil A AJohn A A ? ALeo R R ? RZoe ? ? ? ?majority ?
32
Distributed Programs Commitment Consensus
Four Generals: Data Structure of Basil (2)
Basilgeneral plans reported by majority
John Leo ZoeBasil A AJohn A A ? ALeo R R ? RZoe R A R R
R
33
Distributed Programs Commitment Consensus
Knowledge Tree About Loyal General Leo
Leo X
Basil X John X Zoe X
John X Zoe X Basil X Zoe X Basil Y John Z
��
��
��
AA
AA
AA
�����
���
HHHHH
HHH
Note that, with Byzantine failures, trees may represent possibly untrueknowledge (AKA belief in the literature).
34
Distributed Programs Commitment Consensus
Knowledge Tree About Traitor Zoe
Zoe
Basil X John Y Leo Z
John X Leo X Basil Y Leo Y Basil Z John Z
��
��
��
AA
AA
AA
������
��
HHHHHHHH
35
Distributed Programs Commitment Consensus
Complexity of the Byzantine Generals Algorithm
Generalising to any number of generals works as long a round of messagesis added for every extra traitor and as long as there are at least n = 3t + 1generals, where t is the maximum number of treacherous generals.
The resulting numbers of messages are O(n4).
traitors generals messages1 4 362 7 3923 10 17904 13 5408
36
Distributed Programs Commitment Consensus
Algorithm 10.3: Consensus - flooding algorithmplanType finalPlanset of planType plan ← { chooseAttackOrRetreat }set of planType receivedPlan
p1: do t + 1 timesp2: for all other generals Gp3: send(G, plan)p4: for all other generals Gp5: receive(G, receivedPlan)p6: plan ← plan ∪ receivedPlanp7: finalPlan ← majority(plan)
37
Distributed Programs Commitment Consensus
Flooding Algorithm with No Crash:What Zoe Knows about Leo
Leo X
? ?
Zoe X
Zoe X
Zoe X �
�
�
John X
Basil X
Basil X
John X
Zoe X Zoe X
Zoe X
���
@@@R
??
Zoe X Zoe X
Zoe X
���
@@@R
??
38
Distributed Programs Commitment Consensus
Flooding Algorithm with Crash: Knowledge TreeAbout Leo (1)
Leo X
?
John X
Basil X
Zoe X Zoe X
Zoe X
���
@@@R
??
39
Distributed Programs Commitment Consensus
Flooding Algorithm with Crash: Knowledge TreeAbout Leo (2)
Leo X
?
John X
Basil X
Zoe X
@@@R
?
40
Distributed Programs Commitment Consensus
Algorithm 10.4: Consensus - King algorithmplanType finalPlan, myMajority, kingPlanplanType array[generals] planinteger votesMajority
p1: plan[myID] ← chooseAttackOrRetreat
p2: do two timesp3: for all other generals G // First and third roundsp4: send(G, myID, plan[myID])p5: for all other generals Gp6: receive(G, plan[G])p7: myMajority ← majority(plan)p8: votesMajority ← number of votes for myMajority
41
Distributed Programs Commitment Consensus
Algorithm 10.4: Consensus - King algorithm (continued)
p9: if my turn to be king // Second and fourth roundsp10: for all other generals Gp11: send(G, myID, myMajority)p12: plan[myID] ← myMajority
elsep13: receive(kingID, kingPlan)p14: if votesMajority > 3p15: plan[myID] ← myMajority
elsep16: plan[myID] ← kingPlan
p17: finalPlan ← plan[myID] // Final decision
42
Distributed Programs Commitment Consensus
Scenario for King Algorithm - First King LoyalGeneral Zoe (1)
BasilBasil John Leo Mike Zoe myMajority votesMajority kingPlan
A A R R R R 3
JohnBasil John Leo Mike Zoe myMajority votesMajority kingPlan
A A R A R A 3
LeoBasil John Leo Mike Zoe myMajority votesMajority kingPlan
A A R A R A 3
ZoeBasil John Leo Mike Zoe myMajority votesMajority kingPlan
A A R R R R 3
43
Distributed Programs Commitment Consensus
Scenario for King Algorithm - First King LoyalGeneral Zoe (2)
BasilBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R R
JohnBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R R
LeoBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R R
ZoeBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R
44
Distributed Programs Commitment Consensus
Scenario for King Algorithm - First King LoyalGeneral Zoe (3)
BasilBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R R R ? R R 4–5
JohnBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R R R ? R R 4–5
LeoBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R R R ? R R 4–5
ZoeBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R R R ? R R 4–5
45
Distributed Programs Commitment Consensus
Scenario for King Algorithm - First King TraitorMike (1)
BasilBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R R
JohnBasil John Leo Mike Zoe myMajority votesMajority kingPlan
A A
LeoBasil John Leo Mike Zoe myMajority votesMajority kingPlan
A A
ZoeBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R R
46
Distributed Programs Commitment Consensus
Scenario for King Algorithm - First King TraitorMike (2)
BasilBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R A A ? R ? 3
JohnBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R A A ? R ? 3
LeoBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R A A ? R ? 3
ZoeBasil John Leo Mike Zoe myMajority votesMajority kingPlan
R A A ? R ? 3
47
Distributed Programs Commitment Consensus
Scenario for King Algorithm - First King TraitorMike (3)
BasilBasil John Leo Mike Zoe myMajority votesMajority kingPlan
A A
JohnBasil John Leo Mike Zoe myMajority votesMajority kingPlan
A A
LeoBasil John Leo Mike Zoe myMajority votesMajority kingPlan
A A
ZoeBasil John Leo Mike Zoe myMajority votesMajority kingPlan
A
48
Distributed Programs Commitment Consensus
Complexity of Byzantine Generals and KingAlgorithms
traitors BG generals BG messages King generals King messages1 4 36 5 482 7 392 9 2403 10 1790 13 6724 13 5408 17 1440
49
Distributed Programs Commitment Consensus
Impossibility Results
The bounds known for (simultaneous) Byzantine agreement are tight.
Theorem
Simultaneous Byzantine agreement for n generals of which up to t aretraitors requires n > 3t and at least t + 1 rounds of communication.
50
Distributed Programs Commitment Consensus
Impossibility with Three Generals (1)
We check the condition n > 3t in the special case of t = 1. John is thetraitor. Zoe has plan X and Leo has plan Y. Their knowledge trees are:
Zoe X
JohnLeox1, . . . , xn
Leou1, . . . , um
���
@@@
Leo Y
JohnZoey1, . . . , yn
Zoev1, . . . , vm
���
@@@
where the ui are the messages John sent to Leo about his own and Zoe’splans, etc.
51
Distributed Programs Commitment Consensus
Impossibility with Three Generals (2)
Since the loyal generals are required to correctly infer the plans of otherloyal generals (by applying some function f to their state), we have thatLeo reasons about Zoe’s plan
f (x1, . . . , xn, u1, . . . , um) = X (1)
and that Zoe reasons about Leo’s plan
f (y1, . . . , yn, v1, . . . , vn) = Y (2)
If X 6= Y , then Zoe and Leo must also come to a consensus on John’splan.
52
Distributed Programs Commitment Consensus
Impossibility with Three Generals (3)
The traitor John can use yi for ui and xi for vi , which leads to the loyalgenerals thinking
John
Zoex1, . . . , xn
Zoey1, . . . , yn
Leox1, . . . , xn
Leoy1, . . . , yn
���
@@@
about John’s plan. Using those messages in (1) and (2) gives X = Y ,contradiction!
53
Distributed Programs Commitment Consensus
One round is not enough
Theorem
Let n ≥ 3. Then there is no n-process simultaneous crash-failureagreement algorithm that tolerates one fault in which all loyal generalsalways decide by the end of round 1.
Corollary
Let n ≥ 3. Then there is no n-process simultaneous Byzantine agreementalgorithm that tolerates one traitor in which all loyal generals alwaysdecide by the end of round 1.
54
Distributed Programs Commitment Consensus
References I
Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson.Impossibility of distributed consensus with one faulty process.Journal of the ACM, 32(2):374–382, 1985.
Nancy A. Lynch.Distributed Algorithms.Morgan Kaufmann, 1996.
55