CQRD: A Switch-based Approach to Flow Interference in Data Center Networks
Guo Chen, Dan Pei, Youjian Zhao
Tsinghua University, Beijing, China

The Problem
Flow interference dramatically increases the flow completion time (FCT) of short delay-sensitive flows in data center networks (DCNs).

Flow Interference
• Short delay-sensitive flows (the majority in DCNs) have to wait a long time at switches for buffer and bandwidth resources occupied by a few long bandwidth-greedy flows (e.g., backup, replication)
• Caused by the coarse-grained queue management scheme of Output Queue (OQ) switches

Prior solutions
• Transport Layer Rate Control: modification to end host and/or switch hardware
  • DCTCP [SIGCOMM’10]
  • HULL [NSDI’12]
  • D2TCP [SIGCOMM’12]
  • D3 [SIGCOMM’11]
• Preemptive Flow Scheduling: new protocol stack and switch hardware
  • PDQ [SIGCOMM’12]
  • pFabric [SIGCOMM’13]

Intuition of CQRD
Tackling the root cause of flow interference: need a more fine-grained queue management scheme

The Goal
• Goal:
  • Alleviate flow interference
  • Reduce FCT of short delay-sensitive flows
  • Maintain high goodput of long bandwidth-greedy flows
• Objectives:
  • Transparent to end host
  • No modification to protocol stack
  • Based on underlying techniques available in commodity products

Our Solution
CQRD: a fine-grained switch queue management scheme to alleviate flow interference

AGENDA
• Background & Motivation
• Flow Interference
• CQRD Design
• Evaluation
• Conclusion

Toy Example: Flow Interference in OQ Switch
• ns2 simulation parameters:
  • Link capacity=10Gbps, link delay=4us, total buffer size=288KB, TCP initial window size=4, TCP initial RTO=200us
• Topology and traffic:
  • 8x8 switch connected to hosts 1-8
  • Hosts 1-5 each send a 10KB TCP flow (short delay-sensitive) to host 8
  • Hosts 6-7 each send a 100MB TCP flow (long bandwidth-greedy) to host 8
[Figure: 8x8 OQ switch with inputs I1 ... I8 and outputs O1 ... O8; each output has a 36KB output buffer; 10KB short delay-sensitive flows and 100MB long bandwidth-greedy flows are marked]
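The parameters above imply a few useful reference numbers. The following is a minimal sketch of that arithmetic, assuming 1KB = 1000 bytes and 1MB = 10^6 bytes (the slides do not say which convention is used). The transmission times are pure serialization delays at line rate, so they are lower bounds that ignore round trips, TCP slow start, and queueing.

```python
# Back-of-the-envelope numbers for the toy example.
# Assumption: 1KB = 1000 bytes, 1MB = 10**6 bytes (not stated in the slides).
LINK_BPS = 10e9            # link capacity: 10Gbps
TOTAL_BUFFER_B = 288_000   # total switch buffer: 288KB
PORTS = 8                  # 8x8 switch
SHORT_FLOW_B = 10_000      # short flow size: 10KB
LONG_FLOW_B = 100_000_000  # long flow size: 100MB

# Buffer available to one output queue if the total is split evenly.
per_output_buffer_b = TOTAL_BUFFER_B / PORTS

# Time to drain a completely full output buffer at line rate.
drain_time_s = per_output_buffer_b * 8 / LINK_BPS

# Serialization time of one whole flow at line rate (no RTTs, no queueing).
short_flow_tx_s = SHORT_FLOW_B * 8 / LINK_BPS
long_flow_tx_s = LONG_FLOW_B * 8 / LINK_BPS

print(f"per-output buffer: {per_output_buffer_b / 1000:.0f} KB")
print(f"full-buffer drain time: {drain_time_s * 1e6:.1f} us")
print(f"10KB flow serialization: {short_flow_tx_s * 1e6:.1f} us")
print(f"100MB flow serialization: {long_flow_tx_s * 1e3:.1f} ms")
```

An even split gives 36KB per output, which matches the output buffer size in the figure. A short flow needs only microseconds of link time, while each long flow needs tens of milliseconds, which is why sharing one output queue with long flows is so costly for short flows.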

Toy Example: Flow Interference in OQ Switch
[Figure: FCT and goodput plots for the flows in the OQ switch]
• Short flows completed in ~100ms
• Goodput of short flows collapses
• Short flows are interfered by the 2 long flows
• Flows are unfairly served

CQRD Design
• Crosspoint-Queue
  • Eliminates interference between flows on different switch paths (Output-Contending but not Path-Contending, OC-PC)
  • Separate buffer and fair scheduling
[Figure: switch with inputs I1 ... I8 and outputs O1 ... O8; a crosspoint-queue per input-output pair, served by round-robin scheduling; short delay-sensitive flows and long bandwidth-greedy flows are marked]
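The crosspoint-queue idea can be sketched as one FIFO per input at each output, served in round-robin order. This is an illustrative model only: the class name, the capacity counted in packets, and the tail-drop used when a queue is full are assumptions made for the sketch, not details from the slides.

```python
from collections import deque


class CrosspointOutput:
    """One output port with a separate queue per input (hypothetical model)."""

    def __init__(self, num_inputs, queue_capacity):
        self.queues = [deque() for _ in range(num_inputs)]
        self.capacity = queue_capacity  # packets per crosspoint queue (assumed unit)
        self.next_input = 0             # round-robin pointer

    def enqueue(self, input_port, packet):
        """Buffer a packet in its own crosspoint queue.

        A full queue only rejects packets from the same input, so flows on
        other switch paths keep their buffer space (tail-drop is a placeholder).
        """
        queue = self.queues[input_port]
        if len(queue) >= self.capacity:
            return False
        queue.append(packet)
        return True

    def dequeue(self):
        """Serve the next non-empty crosspoint queue in round-robin order."""
        n = len(self.queues)
        for offset in range(n):
            i = (self.next_input + offset) % n
            if self.queues[i]:
                self.next_input = (i + 1) % n
                return self.queues[i].popleft()
        return None


# A long flow on input 5 overfills its queue; a short flow arrives on input 0.
out = CrosspointOutput(num_inputs=8, queue_capacity=4)
accepted_long = sum(out.enqueue(5, f"long-{k}") for k in range(10))
out.enqueue(0, "short-0")
out.enqueue(0, "short-1")
order = [out.dequeue() for _ in range(4)]
print(accepted_long, order)
```

In this sketch the long flow can fill only its own queue, and the short flow's packets are served alternately with the long flow's packets instead of waiting behind them.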

CQRD Design
• Crosspoint-Queue
  • Eliminates interference between flows on different switch paths (Output-Contending but not Path-Contending, OC-PC)
• Random-Drop
  • Alleviates flow interference within the same switch path (Path-Contending, PC)
  • Flows that occupy more buffer are more likely to have their packets dropped
[Figure: switch with inputs I1 ... I8 and outputs O1 ... O8; random-drop applied within a queue shared by short delay-sensitive flows and long bandwidth-greedy flows]
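One plausible reading of Random-Drop is sketched below: when a queue is full, a uniformly random buffered packet is evicted to make room for the arrival. The slides do not spell out the exact drop rule, so the eviction policy, the function name, and the packet labels are assumptions. Under this rule, the chance that a flow loses a packet is proportional to its share of the buffer.

```python
import random
from collections import Counter


def enqueue_random_drop(queue, packet, capacity, rng=random):
    """Append a packet; if the queue is full, evict a random buffered packet first.

    Returns the evicted packet, or None if nothing was dropped.
    (Assumed policy: the victim is chosen uniformly among buffered packets.)
    """
    evicted = None
    if len(queue) >= capacity:
        victim = rng.randrange(len(queue))
        evicted = queue.pop(victim)
    queue.append(packet)
    return evicted


# A full queue holding 9 packets of a long flow and 1 packet of a short flow.
rng = random.Random(0)  # fixed seed so the experiment is repeatable
drops = Counter()
for _ in range(10_000):
    queue = ["long"] * 9 + ["short"]
    drops[enqueue_random_drop(queue, "new", capacity=10, rng=rng)] += 1

print(drops)
```

With nine of ten buffered packets belonging to the long flow, roughly nine out of ten evictions hit the long flow, which is the behavior the slide summarizes.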

Toy Example: Flow Interference
[Figure: FCT and goodput plots for the flows]
• FCT shorter by 3 orders of magnitude
• Goodput higher by 3 orders of magnitude
• Flows are fairly served
• Almost no cost in goodput

Evaluation
• 1. By how much does CQRD reduce the FCT of short delay-sensitive flows?
• 2. How much goodput of long bandwidth-greedy flows does CQRD sacrifice?

Experiment 1
• Single aggregation/core switch (ns2 simulations)
[Figure: switch with Port 1 ... Port 12 and Port 13 ... Port 24, with multiple flows on the ports]
• Simulation parameters:
  • Link capacity=10Gbps, link delay=4us, total buffer size=5MB, TCP initial window size=4, TCP initial RTO=200us
• Traffic:
  • 1200 TCP flows
  • Flow size and inter-arrival time drawn from realistic distributions
  • Random source and destination ports

Single aggregation/core switch
FCT of all short flows (< 100KB) and goodput of all large flows (> 100KB) when interfered by giant flows (> 1MB, a subset of the large flows) at moderate load (0.1).
[Figure: result plots; annotations: ~36% lower, ~7% lower, ~28% lower, ~4% lower]
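The flow classes and metrics in the caption can be made concrete with a small sketch. The slides do not define FCT or goodput formally, so this assumes the usual definitions: FCT is finish time minus start time, and goodput is flow size divided by FCT. The class names, the handling of a flow of exactly 100KB, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass

SHORT_MAX_B = 100_000    # short flows: < 100KB
GIANT_MIN_B = 1_000_000  # giant flows: > 1MB (a subset of the large flows)


@dataclass
class Flow:
    size_bytes: int
    start_s: float
    end_s: float

    @property
    def fct_s(self):
        """Flow completion time (assumed definition: finish minus start)."""
        return self.end_s - self.start_s

    @property
    def goodput_bps(self):
        """Goodput (assumed definition: delivered bits divided by FCT)."""
        return self.size_bytes * 8 / self.fct_s


def classify(flow):
    """Return the size class; a flow of exactly 100KB is treated as large (assumption)."""
    if flow.size_bytes < SHORT_MAX_B:
        return "short"
    if flow.size_bytes > GIANT_MIN_B:
        return "giant"
    return "large"


def relative_reduction(baseline, new):
    """Fraction by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline


# Hypothetical flows, not data from the experiments.
flows = [Flow(10_000, 0.0, 0.001), Flow(500_000, 0.0, 0.01), Flow(5_000_000, 0.0, 0.1)]
classes = [classify(f) for f in flows]
print(classes)
print(f"{relative_reduction(1.0, 0.64):.0%} lower")
```

A figure such as "~36% lower" then corresponds to `relative_reduction` applied to a metric measured without and with CQRD.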

Experiment 2
• Multi-stage DCN switching fabric (ns2 simulations)
[Figure: topology of aggregation switches and ToR switches connected by 10Gbps and 1Gbps links; 24 racks with 20 hosts each]
• Simulation parameters:
  • Link delay=2us, Agg switch buffer size=5MB, ToR switch buffer size=4MB, TCP initial window size=4, TCP initial RTO=200us
• Traffic:
  • 2000 TCP flows with realistic distributions
  • ECMP load balancing
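ECMP generally picks a path per flow by hashing packet header fields, so every packet of a flow follows the same path. The sketch below illustrates that idea only: the hash function, the uplink names, and the number of uplinks are hypothetical, since the slides do not give the number of aggregation switches or the hash used in the simulations.

```python
import hashlib

RACKS = 24
HOSTS_PER_RACK = 20
TOTAL_HOSTS = RACKS * HOSTS_PER_RACK  # hosts in the simulated fabric


def ecmp_next_hop(five_tuple, uplinks):
    """Pick an uplink by hashing the flow's 5-tuple (illustrative hash choice)."""
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(uplinks)
    return uplinks[index]


# Hypothetical uplinks from one ToR switch to four aggregation switches.
uplinks = ["agg-0", "agg-1", "agg-2", "agg-3"]
flow = ("10.0.1.5", "10.0.7.9", 40123, 80, "tcp")  # src, dst, sport, dport, proto

first = ecmp_next_hop(flow, uplinks)
second = ecmp_next_hop(flow, uplinks)
print(TOTAL_HOSTS, first == second)
```

Because the choice depends only on the header fields, a short flow and a long flow whose hashes collide share the same switch path for their whole lifetime.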

Multi-stage DCN switching fabric
FCT of all short flows (< 100KB) and goodput of all large flows (> 100KB) when interfered by giant flows (> 1MB, a subset of the large flows) at moderate load (0.1).
[Figure: result plots; annotations: ~14% lower, ~30% lower, ~2.5% lower, ~same]

Conclusion
• Tackling the root cause of flow interference:
  • Need a more fine-grained queue management scheme
• Simple solution: CQRD, a switch queue management scheme
  • Transparent to end host
  • No modification to protocol stack
  • Based on underlying techniques available in commodity products
• Reduces the FCT of short flows by 20-44% in a single switch and 8-30% in a multi-stage data center switch network
  • At the cost of a minor goodput decrease for large flows
THANK YOU