Towards a Redundancy-Aware Network Stack for Data Centers
Slides: musa/uploads/hotnets_2016_talk.pdf
Ali Musa Iftikhar (Tufts)
Ihsan Ayyub Qazi (LUMS)
Fahad R. Dogar (Tufts)

The Problem of Tail Latency in Data Centers!
High fan-out + stragglers → long tail latency
Sources of stragglers:
• Load imbalance
• Background tasks
• Failures, etc.
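
The fan-out effect can be illustrated with a standard back-of-the-envelope calculation that is not taken from the slides: if each of n parallel sub-requests straggles independently with probability p, the request as a whole waits for a straggler with probability 1 - (1 - p)^n. The function name and the sample values below are illustrative assumptions.

```python
def prob_request_hits_straggler(fan_out: int, p_straggler: float) -> float:
    """Probability that at least one of `fan_out` independent sub-requests straggles.

    Assumes sub-requests straggle independently with the same probability,
    which is a simplification, not a claim made in the talk.
    """
    return 1.0 - (1.0 - p_straggler) ** fan_out


if __name__ == "__main__":
    # Hypothetical 1% per-server straggler rate at increasing fan-out.
    for fan_out in (1, 10, 100):
        print(fan_out, round(prob_request_hits_straggler(fan_out, 0.01), 3))
```

Even a small per-server straggler rate becomes a common event once a request fans out to many servers.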

How to avoid stragglers?

Reactively
• Examples: Hopper (SIGCOMM’15), C3 (NSDI’15), Sinbad (SIGCOMM’13)
• PRO: low overhead
• CON: requires straggler detection (slow and inaccurate)

Proactively
• Examples: Dolly (NSDI’13), Low latency via redundancy (CoNEXT’13)
• PRO: fast and accurate
• CON: requires determining threshold load (non-trivial)

Can we achieve the benefits of both without their limitations?

Overview
• Duplicate-Aware Scheduling Framework (generic framework)
• Redundancy-Aware Network Stack (new network stack for DC)
• Preliminary Results

Duplicate-aware scheduling
Setup: a Client sends each request to Replica 1 and Replica 2; each replica has a high and a low priority queue.
1. Priority Queues: the primary copy (P) of a request is queued at high priority at one replica, and the backup copy (B) at low priority at the other.
2. Purging: once one copy has been served, a purge removes the other copy (in the animation, the backup B still waiting in the low priority queue).
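
The two mechanisms above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Replica`, `send_with_duplicate`, and `purge_elsewhere` are hypothetical, and requests are represented by plain integer IDs.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    high: deque = field(default_factory=deque)  # primary copies
    low: deque = field(default_factory=deque)   # backup (duplicate) copies

    def enqueue(self, request_id: int, primary: bool) -> None:
        (self.high if primary else self.low).append(request_id)

    def next_request(self):
        """Strict priority: the low queue is served only when the high queue is empty."""
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None

    def purge(self, request_id: int) -> bool:
        """Drop a queued copy of `request_id`; returns True if one was removed."""
        for queue in (self.high, self.low):
            if request_id in queue:
                queue.remove(request_id)
                return True
        return False


def send_with_duplicate(request_id: int, primary: Replica, backup: Replica) -> None:
    """Client side: primary copy at high priority, backup copy at low priority."""
    primary.enqueue(request_id, primary=True)
    backup.enqueue(request_id, primary=False)


def purge_elsewhere(request_id: int, others) -> None:
    """After one copy completes, purge the outstanding copies at the other replicas."""
    for replica in others:
        replica.purge(request_id)


if __name__ == "__main__":
    r1, r2 = Replica("replica1"), Replica("replica2")
    send_with_duplicate(1, primary=r1, backup=r2)
    send_with_duplicate(2, primary=r2, backup=r1)
    served = r1.next_request()      # r1 serves its primary copy first
    purge_elsewhere(served, [r2])   # the backup copy at r2 is now stale
    print(served, list(r2.high), list(r2.low))
```

With purging, Replica 2 never spends service time on the stale backup of a request that Replica 1 has already answered.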

Need for Priority Queuing
➢ Duplication has an overhead!
Queues: primary requests at high priority, backup requests at low priority.
Properties required:
✓ Strict priorities
✓ Work conservation
✓ Preemption
PQ makes the overhead of duplication low (essential).
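
The three required properties can be seen together in a toy single-server simulation. This is a sketch under simplifying assumptions (unit time steps, zero preemption cost, a hypothetical function name), not the scheduler used in the talk.

```python
def run_preemptive_priority(jobs, horizon):
    """Simulate one server in unit time steps.

    jobs: list of (arrival_time, service_time, is_primary) tuples.
    Returns a dict mapping job index -> completion time.
    """
    remaining = {i: service for i, (_, service, _) in enumerate(jobs)}
    done = {}
    for now in range(horizon):
        ready = [i for i, (arrival, _, _) in enumerate(jobs)
                 if arrival <= now and i not in done]
        if not ready:
            continue  # nothing queued, so the server idles
        # Strict priorities: any primary beats any backup; ties are served FIFO.
        # Work conservation: a backup is served whenever no primary is waiting.
        # Preemption: the choice is re-made every step, so an arriving primary
        # displaces a backup that is already in service.
        current = min(ready, key=lambda i: (not jobs[i][2], jobs[i][0], i))
        remaining[current] -= 1
        if remaining[current] == 0:
            done[current] = now + 1
    return done


if __name__ == "__main__":
    # A backup needing 4 steps arrives at t=0; a primary needing 2 steps at t=1.
    print(run_preemptive_priority([(0, 4, False), (1, 2, True)], horizon=10))
```

In this example the primary finishes at t=3, exactly as if the backup were absent, while the backup still makes progress in the otherwise idle steps.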

Importance of Purging
➢ Stale requests block new requests.
[Figure: high and low priority queues holding req1 and req2, with a stale request marked]
Purging makes the system more efficient (an optimization).

Realizing Duplicate-Aware Scheduling at every potential bottleneck resource in a DC
Resources:
• Network
• Compute
• Memory
• Storage
• Filesystem/Database (GFS, HDFS, BigTable)
Challenges:
• In-network purging
• Prioritization
• Purging + preemption

Redundancy Aware Network Stack (RANS)
Layer: New Role
• Application: Duplicate-Awareness
• Expressive Interface (between Application and Transport)
• Transport: Point to multipoint
• Network: Priority Queues
• Link: Same as before
• Physical: Same as before
+purging (marked on two of the roles above)
Challenges and opportunities:
• Challenge: Applications need to be modified.
• Opportunity: Expressive interface allows rich communication b/w App and Transport, e.g. DAG.
• Challenge: Hard to implement per-packet purging.
• Opportunity: Adds support for existing PQs in DC switches.

RANS Transport: Point to Multi-point
[Figure: Sender 1 (replica 1) and Sender 2 (replica 2) connected to the Receiver (client)]
➢ Enables: Rich transport
✓ Multipath
✓ Multi-destination
e.g. Improved fault tolerance

RANS Transport: Byte Aggregation
[Figure: Sender 1 (replica 1) and Sender 2 (replica 2) each send a Response to the Receiver (client)]
➢ Opportunity: Receiver driven transport
✓ Two or more response streams
✓ Aggregate bytes at receiver side
e.g. More efficient congestion control (2x or more)
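
Receiver-side aggregation can be sketched as a reassembly buffer that accepts byte ranges of the same response from any replica. The slides do not specify how bytes are split across senders, so the `(offset, data)` segment format and the class name `ByteAggregator` are assumptions made for illustration.

```python
class ByteAggregator:
    """Receiver-side reassembly of one response from several replica streams."""

    def __init__(self, response_size: int) -> None:
        self.buffer = bytearray(response_size)
        self.received = [False] * response_size

    def on_segment(self, offset: int, data: bytes) -> None:
        """Accept a segment from any sender; overlapping bytes are simply rewritten."""
        end = offset + len(data)
        if offset < 0 or end > len(self.buffer):
            raise ValueError("segment outside response bounds")
        self.buffer[offset:end] = data
        self.received[offset:end] = [True] * len(data)

    def complete(self) -> bool:
        """True once every byte has arrived from at least one sender."""
        return all(self.received)


if __name__ == "__main__":
    payload = b"hello world!"
    agg = ByteAggregator(len(payload))
    agg.on_segment(0, payload[:6])   # hypothetically delivered by replica 1
    agg.on_segment(6, payload[6:])   # hypothetically delivered by replica 2
    print(agg.complete(), bytes(agg.buffer))
```

The response completes as soon as the union of both streams covers it, so neither replica has to deliver every byte.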

RANS Transport: Priority Assignment
[Figure: Sender 1 (replica 1) and Sender 2 (replica 2) send Response + Feedback to the Receiver (client)]
➢ Dynamic replica assignment
✓ Fine grained monitoring of congestion window
✓ Dynamically reprioritize flows
✓ Feedback to Application
e.g. Improved replica assignment
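
One way to picture dynamic reprioritization is a policy that promotes the flow whose congestion window is currently largest. The slides name the signal (congestion window) but not the policy, so the rule below and the function name `reprioritize` are purely hypothetical.

```python
def reprioritize(cwnd_by_replica: dict) -> dict:
    """Label the replica with the largest congestion window 'high', the rest 'low'.

    Hypothetical policy for illustration; ties are broken by replica name.
    """
    if not cwnd_by_replica:
        return {}
    best = max(cwnd_by_replica, key=lambda name: (cwnd_by_replica[name], name))
    return {name: ("high" if name == best else "low") for name in cwnd_by_replica}


if __name__ == "__main__":
    # Congestion windows in segments, as a transport might observe them.
    print(reprioritize({"replica1": 4, "replica2": 10}))
```

The same labels could be passed up as feedback so the application picks a better primary for later requests.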

Preliminary Evaluation: ns-2 setup details
➢ Storage scenario
[Figure: a Client, 10 servers, Replica 1 and Replica 2, with the bottlenecks marked]
Traffic Details
• Total requests: 20K
• Arrival process: Poisson
• Server & replica selection: Uniformly random
The only source of stragglers is load imbalance.
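
A workload with these traffic details can be generated in a few lines. This is a sketch, not the authors' ns-2 script: the arrival rate is not given on the slide, and treating replica selection as a choice of which of two replicas acts as primary is an assumption.

```python
import random


def generate_requests(total, rate, num_servers, replicas_per_server=2, seed=0):
    """Yield (arrival_time, server, primary_replica) for a Poisson arrival process.

    `rate` is requests per second and is a free parameter here.
    """
    rng = random.Random(seed)
    now = 0.0
    for _ in range(total):
        now += rng.expovariate(rate)                  # exponential inter-arrival gap
        server = rng.randrange(num_servers)           # uniformly random server
        replica = rng.randrange(replicas_per_server)  # uniformly random replica
        yield now, server, replica


if __name__ == "__main__":
    requests = list(generate_requests(total=20_000, rate=100.0, num_servers=10))
    print(len(requests), requests[0])
```

Because servers are chosen uniformly at random, some are transiently busier than others, which is the load imbalance the setup relies on as its only straggler source.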

Average request completion time of:
• No duplicates (baseline)
• 2-copies (proactive w/o PQ)
• + PQs
• + Purging
• + Byte Aggregation (RANS)
[Plot: request completion time (s) vs. load (%), annotated "~2X"]
50-80% improvement over the baseline across all loads.
Expecting more gains even at lower loads with additional straggler sources.

Summary & Future work
• The Issue of Stragglers
• Duplicate-Aware Scheduling Framework: simple yet challenging solution
• RANS: a first step towards a duplicate-aware network
• Implementing in HDFS and Cassandra

RANS: Feedback and Discussion
• Ali Musa Iftikhar ([email protected])
• Fahad R. Dogar ([email protected])
• Ihsan A. Qazi ([email protected])
Layer: New Role
• Application: Duplicate-Awareness
• Expressive Interface (between Application and Transport)
• Transport: Point to multipoint, byte aggregation, priority assignment
• Network: Priority Queues
• Link: Same as before
• Physical: Same as before
+purging (marked on two of the roles above)

Possible questions (backup slide)
• Preemption overhead
  • Not really an issue in the network because packets are small.
• Packet purging
  • PFC (backpressure, build queues at the end hosts and purge them)
  • Drop the entire duplicate queue (easier than per-packet drops)
  • Recent trend towards programmable switches
• Gains with PQ
  • More gains with failures as stragglers (primary undergoes a failure)
  • Also more benefits with different resources
• Duplication overhead at client
  • Client is usually not the bottleneck
• Non-idempotent requests
  • We are targeting the class of apps which have flexible endpoints and require at-least-once semantics
• Replicating only small packets and prioritizing them
  • Only beneficial with bursty small flows
  • HDFS has a typical chunk size b/w 64 MB and 128 MB
• Quorum systems
  • RANS complements such systems; they can use this technique and send K out of N requests at high prio while sending the remaining N-K as backups
• Can’t just implement at the app and get the same benefits?
  • Network could be a bottleneck
  • Fine grained control, much more control
• Root causes of performance improvement
  • PQ avoids overheads
  • Now we can easily get the benefits of duplications like aggregation etc.
  • Purging will also at times purge primary, making the system more efficient.
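
The quorum answer above (K of N requests at high priority, the remaining N-K as backups) can be written down directly. Which K replicas receive the high-priority copies is not stated on the slide; the sketch below simply takes the first K in the order given, and the function name is hypothetical.

```python
def assign_quorum_priorities(replicas, k):
    """Send K of the N requests at high priority and the other N-K as backups."""
    if not 0 <= k <= len(replicas):
        raise ValueError("k must be between 0 and the number of replicas")
    return {replica: ("high" if index < k else "low")
            for index, replica in enumerate(replicas)}


if __name__ == "__main__":
    # A quorum of 2 out of 3 replicas.
    print(assign_quorum_priorities(["r1", "r2", "r3"], k=2))
```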

Food for thought
Inter-DC Duplicate-Aware Scheduling (DC Primary, DC Failover)
e.g. Google’s Geo-Distributed Database “Spanner” (OSDI’12)
Prefetch (search suggestions, spell check)
• Search engines drop spell check, suggestions, etc. at high loads.
• Can benefit from duplicate-aware scheduling.

When does RANS work best?
• Application fan-out is high and stragglers are frequent.
• End-points are flexible and “at least once” semantics are sufficient.
• Client is not the bottleneck.
• Request sizes are small (or preemption overhead is minimal).