EyeQ: (An engineer's approach to)
Taming network performance unpredictability in the Cloud
Vimal, Mohammad Alizadeh,
Balaji Prabhakar, David Mazières,
Changhoon Kim, Albert Greenberg
What are we depending on?
http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html
5 Lessons We’ve Learned Using AWS
… in the Netflix data centers, we have a high capacity, super fast, highly reliable network. This has afforded us the luxury of designing around chatty APIs to remote systems. AWS networking has more variable latency.
Overhaul apps to deal with variability.
Many customers don't even realise network issues:
Just "spin up more VMs!" makes the app more network dependent.
Cloud: Warehouse Scale Computer
Multi-tenancy: to increase cluster utilisation
6/11/12
http://research.google.com/people/jeff/latency.html
Provisioning the Warehouse
CPU, memory, disk
Network
Sharing the Network
• Policy
  – Sharing model
• Mechanism
  – Computing rates
  – Enforcing rates on entities…
    • Per-VM (multi-tenant)
    • Per-service (search, map-reduce, etc.)
Can we achieve this?
[Figure: per-VM guarantee of 2 GHz VCPU, 15 GB memory, 1 Gbps network.
Tenant X's and Tenant Y's VMs (VM1, VM2, VM3, … VMn) each attach to their tenant's virtual switch.]
Customer X specifies the thickness of each pipe. No traffic matrix. (Hose Model)
Why is it hard? (1)
• Bandwidth demands can be…
  – Random, bursty
  – Short: few-millisecond requests
• Timescales matter!
  – Need guarantees on the order of a few RTTs (ms)
• Default policy insufficient: 1 vs. many TCP flows, UDP, etc.
• Poor scalability of traditional QoS mechanisms
(Flow sizes range from 10–100 KB requests to 10–100 MB transfers.)
Seconds: Eternity
[Figure: a single long-lived TCP flow and a bursty UDP session (ON: 5 ms, OFF: 15 ms) share a 10G pipe through one switch.]
Under the hood
Why is it hard? (2)
• Switch sees contention, but lacks VM state
• Receiver-host has VM state, but does not see contention
(1) Drops in network: servers don’t see true demand
(2) Elusive TCP (back-off) makes true demand detection harder
Key Idea: Bandwidth Headroom
• Bandwidth guarantees: managing congestion
• Congestion: link utilisation reaches 100%
  – At millisecond timescales
• Don't allow 100% utilisation
  – 10% headroom: early detection at the receiver
[Figure: single switch with N x 10G links; TCP and UDP share a pipe rate-limited to 9G. Headroom at a single switch.]
What about a network?
Network design: the old
http://bradhedlund.com/2012/04/30/network-that-doesnt-suck-for-cloud-and-big-data-interop-2012-session-teaser/
Over-subscription
Network design: the new
(1) Uniform capacity across racks
(2) Over-subscription only at the Top-of-Rack
Mitigating Congestion in a Network
Load balancing + Admissibility = Hotspot-free network core
[VL2, FatTree, Hedera, MicroTE]
[Figure: each server has a 10 Gbps pipe into the fabric. When the aggregate rate into a server exceeds 10 Gbps, the fabric gets congested; when the aggregate rate stays under 10 Gbps, the fabric is congestion free.]
Load balancing: ECMP, etc.
Admissibility: end-to-end congestion control (EyeQ)
EyeQ Platform
[Figure: EyeQ datapath between two servers across the data centre fabric.]
• TX side: a software vswitch with adaptive rate limiters shapes packets from untrusted VMs before they enter the fabric.
• RX side: a software vswitch with congestion detectors watches arriving traffic (e.g. 3 Gbps and 6 Gbps streams) on behalf of untrusted VMs.
• The RX component detects congestion; the TX component reacts.
• End-to-end flow control (vswitch to vswitch) via congestion feedback.
Does it work?
[Figure: without EyeQ vs. with EyeQ. EyeQ improves utilisation and provides protection: TCP receives its 6 Gbps share and UDP its 3 Gbps share.]
State: only at the edge
EyeQ makes the network look like one big switch.
EyeQ: Load balancing
+ Bandwidth headroom
+ Admissibility at millisecond timescales
= Network as one big switch
= Bandwidth sharing at the edge
Linux and Windows implementation for 10 Gbps
~1700 lines of C code
http://github.com/jvimal/perfiso_10g (Linux kmod)
No documentation, yet.
Top Related