XPANDER: TOWARDS OPTIMAL-PERFORMANCE … · WHAT IS THE “RIGHT” DATACENTER ARCHITECTURE?...

XPANDER: TOWARDS OPTIMAL-PERFORMANCE DATACENTERS Asaf Valadarsky (Hebrew University) Gal Shahaf (Hebrew University) Michael Dinitz (Johns Hopkins University) Michael Schapira (Hebrew University)


Page 1:

XPANDER: TOWARDS OPTIMAL-PERFORMANCE DATACENTERS

Asaf Valadarsky (Hebrew University)

Gal Shahaf (Hebrew University)

Michael Dinitz (Johns Hopkins University)

Michael Schapira (Hebrew University)

Page 2:

DESIGNING A DATACENTER ARCHITECTURE

Network Topology? Routing? Congestion Control?

Page 3:

DESIGNING A DATACENTER ARCHITECTURE

Performance

➡ Throughput
➡ Resiliency to failures
➡ Path diversity
➡ …

Deployability

➡ Cabling complexity
➡ Operations cost
➡ Equipment costs
➡ …

Page 4:

WHAT IS THE “RIGHT” DATACENTER ARCHITECTURE?

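[Figure: existing architectures placed on a chart with axes DEPLOYABILITY and PERFORMANCE. Plotted designs: Jellyfish; Slim-Fly; Fat Tree; SWDC, DCell, BCube, c-Through, Helios, … One position is marked “????”.]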

Page 5:

AGENDA

• Reaching that upper-right corner entails designing “expander datacenters”

• Xpander: a tangible and near-optimal datacenter design

Page 6:

EXPANDER DATACENTERS

• An expander datacenter architecture:

➡ Utilizes an expander graph as its network topology (see next slide)

➡ Employs (multi-path) routing and congestion control to exploit path diversity

Page 7:

EXPANDER GRAPHS: INTUITION

[Figure: a cut separating a vertex set S from the remaining vertices V\S]

• A graph is called an “expander graph” if it has “good” edge expansion

• Intuition: In an expander graph, the capacity traversing each cut is “large”

➡ Traffic is never bottlenecked at a small set of links
➡ High path diversity
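To make “good edge expansion” concrete, here is a minimal sketch that computes the standard edge expansion of a tiny graph by brute force: the minimum, over all non-empty vertex sets S with at most half the vertices, of the number of links crossing the cut divided by |S|. The function name `edge_expansion` and the two example graphs are illustrative assumptions, not taken from the talk, and the exhaustive search is only feasible for very small graphs.

```python
from itertools import combinations

def edge_expansion(n, edges):
    """Brute-force edge expansion of an undirected graph on vertices 0..n-1.

    Returns the minimum of |cut(S)| / |S| over all non-empty S with |S| <= n/2,
    where cut(S) is the set of edges with exactly one endpoint in S.
    """
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            s = set(subset)
            cut = sum(1 for u, v in edges if (u in s) != (v in s))
            best = min(best, cut / size)
    return best

# A 6-node ring: splitting it in half severs only 2 links.
ring = [(i, (i + 1) % 6) for i in range(6)]
# The complete graph on 6 nodes: every cut is crossed by many links.
complete = list(combinations(range(6), 2))

print(edge_expansion(6, ring))      # 2 links / 3 nodes
print(edge_expansion(6, complete))  # 9 links / 3 nodes
```

The ring has a cut that only two links cross, so traffic between its halves is bottlenecked there; the complete graph has no such sparse cut.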

Page 8:

CONSTRUCTING EXPANDERS

• Constructing expanders is a prominent research area in mathematics and computer science

• Applications in networking, computational complexity, coding, and beyond

Page 9:

EXPANDER DATACENTERS ACHIEVE NEAR-OPTIMAL PERFORMANCE

➡ Support higher traffic loads
➡ More resilient to failures
➡ Support more servers with fewer network devices
➡ Multiple short paths between hosts
➡ Incrementally expandable

Page 10:

OUR EVALUATION

➡ Theoretical analyses

➡ Flow- and packet-level simulations

➡ Experiments on network emulator

➡ Experiments on an SDN-capable network

Page 11:

EXPANDER DATACENTERS ARE THE STATE-OF-THE-ART

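[Figure: the DEPLOYABILITY vs. PERFORMANCE chart again, now annotated with “Low-Diameter Graph” and “Random Graph”. Plotted designs: Jellyfish; Slim-Fly; Fat Tree; SWDC, DCell, BCube, c-Through, Helios, … One position is marked “????”.]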

Page 12:

CAN WE HAVE IT ALL?

A well-structured design

Near-optimal performance

YES! :)

Page 13:

XPANDER DATACENTER ARCHITECTURE

Near-Optimal Performance

➡ Throughput
➡ Resiliency to failures
➡ Path diversity
➡ …

Deployable

➡ Cabling complexity
➡ Operations cost
➡ Equipment costs
➡ …

Expander Datacenter

Deployment-Oriented Construction

Page 14:

XPANDER DATACENTER ARCHITECTURE

Leverages a deterministic graph-theoretic construction of expanders [BL ’06]

[Figure: ToR switches grouped into meta-nodes, with links running between meta-nodes]

➡ Same number of ToRs within any meta-node
➡ Same number of links between every two meta-nodes
➡ No links within the same meta-node
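The three structural properties on this slide can be illustrated with a small sketch. It builds a k-lift of the complete graph on d+1 vertices: every vertex becomes a meta-node of k ToRs, and every edge becomes a perfect matching between two meta-nodes. Note the swap: the slide cites a deterministic construction [BL ’06], whereas this sketch picks each matching at random purely for illustration, so it reproduces the meta-node structure but does not by itself guarantee good expansion. The names `lift_complete_graph`, `d`, `k` and `seed` are assumptions made for this example.

```python
import random
from collections import Counter
from itertools import combinations

def lift_complete_graph(d, k, seed=0):
    """Return the links of a random k-lift of the complete graph K_{d+1}.

    A ToR is identified as (meta_node, index). Each pair of meta-nodes is
    joined by a random perfect matching (an assumption of this sketch; the
    talk cites a deterministic construction instead).
    """
    rng = random.Random(seed)
    links = []
    for a, b in combinations(range(d + 1), 2):
        perm = list(range(k))
        rng.shuffle(perm)  # random perfect matching between meta-nodes a and b
        links.extend(((a, i), (b, perm[i])) for i in range(k))
    return links

links = lift_complete_graph(d=4, k=3)

degree = Counter()           # network ports used per ToR
between = Counter()          # links per pair of meta-nodes
for (a, i), (b, j) in links:
    degree[(a, i)] += 1
    degree[(b, j)] += 1
    between[(a, b)] += 1

print(len(degree), set(degree.values()))    # ToR count, ports used per ToR
print(len(between), set(between.values()))  # meta-node pairs, links per pair
```

Every ToR ends up with exactly d links (one into each other meta-node), every pair of meta-nodes shares exactly k links, and no link stays inside a meta-node.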

Page 15:

WHERE ARE MY PODS?

An Xpander can be divided into smaller “Xpander pods”

[Figure: ToR switches grouped into pods]

Page 16:

XPANDER DATACENTER ARCHITECTURE

Topology: [Figure: interconnected ToR switches]

Routing: Multipath Routing (K-Shortest Paths)

Congestion Control: Multipath Congestion Control (Multipath-TCP)
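As a rough illustration of K-shortest-paths routing, the sketch below enumerates the k shortest loop-free paths (by hop count) between two switches. It is a simple best-first search over partial paths, not Yen's algorithm or whatever the authors' implementation uses, and it can be exponentially slow on large graphs; the function name `k_shortest_paths` and the four-switch ring are assumptions made for the example.

```python
import heapq

def k_shortest_paths(adj, src, dst, k):
    """Return up to k loop-free paths from src to dst, fewest hops first.

    adj maps each node to a list of its neighbours. Partial paths are
    expanded in order of hop count, so complete paths are emitted in
    non-decreasing length.
    """
    paths = []
    heap = [(0, [src])]  # (hop count, partial path)
    while heap and len(paths) < k:
        hops, path = heapq.heappop(heap)
        node = path[-1]
        if node == dst:
            paths.append(path)
            continue
        for nxt in adj[node]:
            if nxt not in path:  # never revisit a node: paths stay simple
                heapq.heappush(heap, (hops + 1, path + [nxt]))
    return paths

# Four switches in a ring: 0-1-2-3-0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(k_shortest_paths(ring, 0, 2, k=2))  # [[0, 1, 2], [0, 3, 2]]
```

A multipath transport such as Multipath-TCP can then spread subflows over the returned paths.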

Page 17:

EXPANDER DATACENTERS ACHIEVE NEAR-OPTIMAL PERFORMANCE

➡ Support higher traffic loads
➡ More resilient to failures
➡ Support more servers with fewer network devices
➡ Multiple short paths between hosts
➡ Incrementally expandable

Page 18:

NEAR OPTIMAL ALL-TO-ALL THROUGHPUT

Theorem: In the all-to-all setting, the throughput of any d-regular expander G on n vertices is within a factor of O(log d) of that of the throughput-optimal d-regular graph on n vertices

[Figure: “All-to-All Throughput”. Normalized throughput (axis range 0.5 to 1) against number of servers (axis range 0 to 2000) for Xpander, Jellyfish, LPS_54 and LPS_62.]

* 18-port switches

Page 19:

RESILIENCE TO FAILURES

Theorem: In any d-regular expander, any two vertices are connected by exactly d edge-disjoint paths.
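The number of edge-disjoint paths between two switches can be checked on a small example with a unit-capacity max-flow computation. This is a sketch under stated assumptions: the function name `edge_disjoint_paths` is illustrative, and the Petersen graph (3-regular, 10 vertices) is used only as a convenient small stand-in, not as a topology from the talk.

```python
from collections import deque
from itertools import combinations

def edge_disjoint_paths(n, edges, s, t):
    """Count edge-disjoint s-t paths in an undirected graph on 0..n-1.

    Uses BFS augmenting paths (Edmonds-Karp) with capacity 1 per link
    in each direction.
    """
    cap = {}
    adj = [[] for _ in range(n)]
    for u, v in edges:
        cap[(u, v)] = 1
        cap[(v, u)] = 1
        adj[u].append(v)
        adj[v].append(u)
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path left
        v = t
        while parent[v] is not None:  # push one unit back along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Petersen graph: outer 5-cycle, inner pentagram, and five spokes.
petersen = ([(i, (i + 1) % 5) for i in range(5)]
            + [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
            + [(i, i + 5) for i in range(5)])

counts = {edge_disjoint_paths(10, petersen, s, t)
          for s, t in combinations(range(10), 2)}
print(counts)  # one value: the degree, 3
```

With d edge-disjoint paths between every pair, any d-1 link failures leave every pair of switches connected.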

Page 20:

NEAR-OPTIMAL THROUGHPUT UNDER SKEWED TRAFFIC MATRICES

• Expander datacenters empirically attain near-optimal throughput under skewed traffic matrices (mice and elephants)

• We prove that expander datacenters are optimal with respect to adversarial traffic conditions

Page 21:

COST EFFICIENCY: XPANDER VS. FAT-TREE

Switch Degree    #Switches    All-to-All Throughput
8*               80%          121%
10               100%         157%
24               80%          111%

* Validated using Mininet experiments

Page 22:

SEE PAPER FOR

• Analysis of shortest-paths and diameter
• Physical layout and costs
• Incremental expansion of expander datacenters
• Results for skewed traffic matrices
• Results for Xpander vs. Jellyfish
• Results for Xpander vs. Slim-Fly
• Additional results for Xpander vs. Fat Tree
• Experiments with the Mininet network emulator
• Experiments on the OCEAN SDN-capable network testbed
• …

Page 23:

DEPLOYING XPANDER

➡ Place ToRs of each meta-node in close proximity
➡ Bundle cables between two meta-nodes
➡ Use color-coding to distinguish between different meta-nodes and bundles of cables

[Figure: ToR switches grouped into meta-nodes, annotated “No links within the same meta-node” and “Same number of links between every two meta-nodes”]

Page 24:

DEPLOYING XPANDER

• Analysed physical layout, cabling complexity, #cables and cable length for both large-scale and “container” datacenters

Switch Ports    #Switches             #Servers                  #Cables                  Cable Length                  Throughput
32              42 vs. 48 (87.5%)     504 vs. 512 (98.44%)      420 vs. 512 (82%)        4.2 km vs. 5.12 km (82%)      109%
48              66 vs. 72 (92%)       1056 vs. 1152 (92%)       1056 vs. 1152 (92%)      10.5 km vs. 11.5 km (92%)     142%
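The parenthesised percentages in the table appear to be the first figure of each pair divided by the second. A short sketch of that arithmetic for the switch, server and cable counts follows; the helper name `ratio_percent` and the dictionary layout are assumptions made for this example, and the counts are copied from the table.

```python
def ratio_percent(first, second):
    """First figure as a percentage of the second."""
    return 100.0 * first / second

# (first, second) pairs copied from the table rows.
counts = {
    "32 ports, #switches": (42, 48),
    "32 ports, #servers": (504, 512),
    "32 ports, #cables": (420, 512),
    "48 ports, #switches": (66, 72),
    "48 ports, #servers": (1056, 1152),
}

for label, (first, second) in counts.items():
    print(f"{label}: {ratio_percent(first, second):.2f}%")
```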

Page 25:

CONCLUSION

• We show that expander datacenters outperform traditional datacenters

✓ Sheds light on past results about datacenters based on random and low-diameter graphs

• We present Xpander, a novel datacenter architecture

✓ Suggests a tangible alternative to today’s datacenter architectures

✓ Achieves near-optimal performance

Page 26:

QUESTIONS?

THANK YOU!

See project webpage at: https://husant.github.io/Xpander/