S-Paxos: Eliminating the Leader Bottleneck
Martin Biely, Zarko Milosevic, Nuno Santos, André Schiper
Ecole Polytechnique Fédérale de Lausanne (EPFL)
Switzerland
October 9, 2012
Context: State Machine Replication
Nuno Santos Context and Motivation 2
[Figure: clients, a replicated service (three Service replicas), and the ordering protocol (Paxos)]
The Paxos Protocol
• Paxos is a leader-based protocol
  • A distinguished process (leader) coordinates the others (followers)
• Observation: leader receives and sends more messages than the followers
  • Potential system bottleneck
Paxos Performance
Experimental settings
• JPaxos: implementation of Paxos in Java (protocol shown previously)
• n=3, request size=20 bytes, CPU 2x2 cores @ 2.2 GHz
The bottleneck in Paxos is typically the leader
Paxos is Leader-centric
• Leader-centric protocol
  • The leader does considerably more work than the followers
  • Therefore, the leader is prone to being the system bottleneck
• Paxos and most leader-based protocols are also leader-centric
Leader-based vs Leader-centric
• Note that leader-based ≠ leader-centric
• Leader-based – algorithmic concept, leader is a distinguished process
• Leader-centric – resource usage, leader is a bottleneck
Question: must leader-based protocols like Paxos also be leader-centric?
S-PAXOS OVERVIEW
Leader-based but not leader-centric
Why Paxos is Leader-centric
• Leader does the following
  • Receives requests from clients
  • Coordinates protocol to order requests
  • Replies to clients
• Followers do much less
  • Receive client requests from leader
  • Acknowledge order proposed by leader
• Underlying problem: unbalanced resource utilization
  • Leader runs out of resources (CPU, network bandwidth)
  • While followers are lightly loaded
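The imbalance can be made concrete with a rough per-request message count. The sketch below is a simplified model and is not taken from the slides: it assumes that clients contact only the leader, that followers send their acknowledgements (Phase 2b) only to the leader, and that there is no batching. The function name `messages_per_request` is hypothetical.

```python
from __future__ import annotations


def messages_per_request(n: int) -> dict[str, int]:
    """Count messages sent plus received per ordered request, for n replicas.

    Assumptions (not from the slides): clients contact the leader only,
    followers send Phase 2b acks only to the leader, and nothing is batched.
    """
    if n < 2:
        raise ValueError("need at least two replicas")
    followers = n - 1
    leader = (
        1            # receive the client request
        + followers  # send Phase 2a (request + order) to each follower
        + followers  # receive a Phase 2b ack from each follower
        + 1          # send the reply to the client
    )
    follower = (
        1    # receive Phase 2a from the leader
        + 1  # send Phase 2b back to the leader
    )
    return {"leader": leader, "follower": follower}


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, messages_per_request(n))
```

Under these assumptions the leader's count grows with the number of replicas while each follower's count stays constant, which is one way to read "unbalanced resource utilization".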
S-Paxos: A Balanced Paxos Variant
• S-Paxos balances workload across replicas
  • Leader and followers have similar resource usage
  • The full resources of all replicas become available to the ordering protocol
• S-Paxos is leader-based but not leader-centric
• Combines several well-known ideas in a novel way
  • All replicas handle client communication
  • All replicas disseminate requests
  • Ordering done on IDs
S-Paxos key ideas: Distribute client communication
• Commonly used in practice
  • For instance, ZooKeeper
• But by itself, still leader-centric
  • Leader runs the ordering protocol on requests (Phase 2a messages of Paxos)
  • Followers have to forward requests to the leader
  • The leader, in turn, sends the requests to the other followers
All replicas handle client communication
S-Paxos key ideas: Distribute request dissemination
• Note that Phase 2a messages have a dual purpose
  • Dissemination of requests
  • Establishing order
• All replicas disseminate requests
• Ordering performed on IDs
S-Paxos separates dissemination from ordering
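A minimal sketch of what separating dissemination from ordering could look like, assuming an in-memory model with no networking or failures. The `Replica` class, its method names, and the `"replica:sequence"` ID format are hypothetical and are not taken from S-Paxos or JPaxos.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica holding disseminated requests keyed by ID."""

    name: str
    requests: dict[str, bytes] = field(default_factory=dict)
    executed: list[bytes] = field(default_factory=list)

    def on_disseminate(self, req_id: str, payload: bytes) -> None:
        # Dissemination: the payload may come from any replica, not only the leader.
        self.requests[req_id] = payload

    def on_decide(self, ordered_ids: list[str]) -> None:
        # Ordering: the decided value carries only IDs; payloads are looked up locally.
        for req_id in ordered_ids:
            self.executed.append(self.requests[req_id])


def run_ordering_demo() -> list[list[bytes]]:
    replicas = [Replica(f"r{i}") for i in range(3)]
    # Two different replicas received client requests and disseminated them.
    disseminated = {"r0:1": b"put x=1", "r2:1": b"put y=2"}
    for req_id, payload in disseminated.items():
        for replica in replicas:
            replica.on_disseminate(req_id, payload)
    # The leader orders the small IDs; every replica applies the same order.
    decided = ["r2:1", "r0:1"]
    for replica in replicas:
        replica.on_decide(decided)
    return [replica.executed for replica in replicas]
```

In this model the ordering step never touches request payloads, so the cost of ordering depends on the size of the IDs rather than on the size of the requests.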
S-Paxos Architecture and Data Flow
S-Paxos balances work among replicas
• Client communication and request dissemination usually the bulk of the load
  • In S-Paxos this task is performed by all replicas
• Leader still has to coordinate ordering protocol
  • But IDs are small messages
  • So leader has minimal additional overhead
• Two levels of batching to further reduce load on leader
  • Dissemination layer: batch client requests and use ordering layer to order IDs of batches
  • Ordering layer: usual Paxos batching, in this case batches of batch IDs
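The two batching levels can be sketched as follows. This is an illustrative model only: the greedy size-based packing, the helper names, and the `b0`, `b1`, ... batch-ID encoding are assumptions, and message headers are ignored. The byte limits in the usage note (1450 and 50) and the 20-byte request size are the values listed among the experimental parameters in this deck.

```python
from __future__ import annotations


def batch_by_bytes(items: list[bytes], max_bytes: int) -> list[list[bytes]]:
    """Greedily pack items, in order, into batches of at most max_bytes payload."""
    batches: list[list[bytes]] = []
    current: list[bytes] = []
    size = 0
    for item in items:
        # Close the current batch if adding this item would exceed the limit.
        if current and size + len(item) > max_bytes:
            batches.append(current)
            current, size = [], 0
        current.append(item)
        size += len(item)
    if current:
        batches.append(current)
    return batches


def two_level_batches(
    requests: list[bytes], dissemination_max: int, ordering_max: int
) -> tuple[list[list[bytes]], list[list[bytes]]]:
    # Level 1 (dissemination layer): batch client requests.
    request_batches = batch_by_bytes(requests, dissemination_max)
    # Each request batch gets a short ID (hypothetical encoding).
    batch_ids = [f"b{i}".encode() for i in range(len(request_batches))]
    # Level 2 (ordering layer): batch the batch IDs themselves.
    id_batches = batch_by_bytes(batch_ids, ordering_max)
    return request_batches, id_batches
```

For example, `two_level_batches([b"x" * 20] * 200, 1450, 50)` packs the 200 requests into three dissemination batches of 72, 72 and 56 requests, and the three batch IDs fit into a single ordering-layer batch.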
Benefits in the presence of faults
• Faster view change
  • Since IDs are small, Phase 1 of Paxos completes quickly
• Failures affecting the leader have less impact on throughput
  • Ordering protocol is interrupted, but dissemination protocol continues among working replicas
  • When a correct leader emerges, it can quickly order the IDs of the requests that were disseminated while there was no leader
DISSEMINATION LAYER PROTOCOL
Dissemination Layer Overview
• Dissemination layer tasks
  1) Receive requests from clients
  2) Disseminate requests and IDs to all replicas
  3) Initiate ordering of IDs
  4) Execute requests in the order established for IDs
• Challenges
  • Once an ID is decided, the corresponding request must remain available in the system
  • Coordinate view change between ordering and dissemination layers to ensure that IDs are ordered once and only once
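The four tasks can be sketched as a toy, single-process model of one replica. The class and method names are hypothetical, and the stability rule used here (hand an ID to the ordering layer once f + 1 replicas have acknowledged the request, with f = (n - 1) // 2) is an assumption made for illustration and is not stated on this slide. Broadcasting, view changes, and the ordering protocol itself are left out.

```python
from __future__ import annotations

from collections import defaultdict


class DisseminationLayer:
    """Toy single-process model of one replica's dissemination layer."""

    def __init__(self, replica_id: int, n: int) -> None:
        self.replica_id = replica_id
        # Assumption: a request is stable once f + 1 replicas hold it,
        # where f = (n - 1) // 2 is the number of tolerated crashes.
        self.stable_threshold = (n - 1) // 2 + 1
        self.requests: dict[str, bytes] = {}
        self.acks: dict[str, set[int]] = defaultdict(set)
        self.to_order: list[str] = []  # IDs handed to the ordering layer
        self.executed: list[bytes] = []
        self._next_seq = 0

    def on_client_request(self, payload: bytes) -> str:
        # Task 1: receive a client request and assign it a unique ID.
        req_id = f"{self.replica_id}:{self._next_seq}"
        self._next_seq += 1
        # Task 2: a real replica would broadcast <request, ID> here;
        # this model only records the local copy and the local ack.
        self.on_request(req_id, payload)
        return req_id

    def on_request(self, req_id: str, payload: bytes) -> None:
        self.requests[req_id] = payload
        self.on_ack(req_id, self.replica_id)

    def on_ack(self, req_id: str, sender: int) -> None:
        self.acks[req_id].add(sender)
        # Task 3: hand the ID to the ordering layer once, when it becomes stable.
        stable = len(self.acks[req_id]) >= self.stable_threshold
        if stable and req_id not in self.to_order:
            self.to_order.append(req_id)

    def on_decide(self, req_id: str) -> None:
        # Task 4: execute in decision order; the payload must be available locally.
        self.executed.append(self.requests[req_id])


def dissemination_demo() -> tuple[list[str], list[bytes]]:
    layer = DisseminationLayer(replica_id=0, n=3)
    req_id = layer.on_client_request(b"put x=1")  # one ack so far: not stable
    layer.on_ack(req_id, sender=1)  # second ack: stable, ID handed to ordering
    layer.on_ack(req_id, sender=2)  # further acks do not hand the ID over again
    layer.on_decide(req_id)
    return layer.to_order, layer.executed
```

The `req_id not in self.to_order` check is a local stand-in for the "once and only once" challenge; coordinating that guarantee across a view change is not modeled.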
Overview of the Protocol
PERFORMANCE EVALUATION
Performance Evaluation
• S-Paxos implemented on top of JPaxos, a Java implementation of Paxos
• Experiments compare
  • JPaxos (leader-centric)
  • S-Paxos (non leader-centric)
• Testbed: Grid 5000 (helios cluster)
  • CPU: 2x2 cores @ 2.2 GHz
  • Network: 1 Gbit Ethernet
• Experimental parameters
  • Request size: 20 bytes
  • Batch size
    • S-Paxos: dissemination layer: 1450 bytes, ordering layer: 50 bytes
    • JPaxos: 1450 bytes
  • Null service
Load Distribution: Average CPU utilization
[Figure panels: JPaxos, S-Paxos]
Performance with Increasing Number of Clients (n=3)
[Figure panels: Throughput, Response time]
Scalability
[Figure: Throughput]
Throughput with crashes
[Figure: Crash of the leader]
False suspicions
• Leader is (wrongly) suspected every 10 seconds
Conclusion
A leader-based protocol does not need to be leader-centric
S-Paxos: balances the workload across replicas
Benefits
• Better performance for the same number of replicas
• Better scalability with the number of replicas
• Better performance in the presence of faults
ADDITIONAL SLIDES
Discussion
• Broadcast of <request,ID>: best effort, no retransmission
  • Avoids cost of reliable broadcast on requests
  • Recovering from partial delivery (message loss/crashes):
    • Request does not become stable: client times out and retransmits
    • Request becomes stable: after ID is decided, replicas poll other replicas for request
• Broadcast of <Ack,ID>: retransmission
  • Ensures that once a request is stable, it will be proposed
  • Almost free in practice: acks are small and can be piggybacked on other messages.
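The recovery path for a stable request whose payload never reached a replica can be sketched as below. This is an illustrative in-memory model: the `Peer` class, its `fetch` method, and `recover_request` are hypothetical names, and polling is modeled as a direct method call instead of a network round trip.

```python
from __future__ import annotations

from typing import Optional


class Peer:
    """Hypothetical replica that can be asked for a request by ID."""

    def __init__(self, name: str, requests: Optional[dict[str, bytes]] = None) -> None:
        self.name = name
        self.requests: dict[str, bytes] = dict(requests or {})

    def fetch(self, req_id: str) -> Optional[bytes]:
        # Returns the payload if this peer received the best-effort broadcast.
        return self.requests.get(req_id)


def recover_request(local: Peer, peers: list[Peer], req_id: str) -> Optional[bytes]:
    """After req_id is decided, make sure its payload is available locally."""
    if req_id in local.requests:
        return local.requests[req_id]
    # The broadcast was lost for this replica: poll the others in turn.
    for peer in peers:
        payload = peer.fetch(req_id)
        if payload is not None:
            local.requests[req_id] = payload
            return payload
    # No polled peer has it; a real replica would retry later.
    return None
```

For instance, if replica `a` missed the broadcast of `"c:0"` but replica `c` holds it, `recover_request(a, [b, c], "c:0")` returns the payload and stores it in `a.requests`.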