Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of...

36
Data Center Switch Architecture in the Age of Merchant Silicon Nathan Farrington Erik Rubow Amin Vahdat

Transcript of Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of...

Page 1: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Data Center Switch Architecturein the Age of Merchant Silicon

Nathan FarringtonErik Rubow

Amin Vahdat

Page 2: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

The Network is a Bottleneck

• HTTP request amplification– Web search (e.g. Google)

– Small object retrieval (e.g. Facebook)

– Web services (e.g. Amazon.com)

• MapReduce-style parallel computation– Inverted search index

– Data analytics

• Need high-performance interconnects

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

2

Page 3: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

The Network is Expensive

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

3

Rack 1 Rack 2 Rack 3 Rack N

8xGbE

. . . 48xGbE TOR Switch . . .

. . . 40x1U Servers . . .

10GbE

Page 4: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

What we really need: One Big Switch

• Commodity

• Plug-and-play

• Potentially no oversubscription

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

4

Rack 1 Rack 2 Rack 3 Rack N

Page 5: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Why not just use a fat tree of commodity TOR switches?

M. Al-Fares, A. Loukissas, A. Vahdat. A Scalable, Commodity Data Center Network Architecture. In SIGCOMM ’08.

Hot Interconnects August 27, 2009

5Nathan Farrington

[email protected]

k=4,n=3

Page 6: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

10 Tons of Cable

• 55,296 Cat-6 cables

• 1,128 separate cable bundles

The “Yellow Wall”

Hot Interconnects August 27, 2009

6Nathan Farrington

[email protected]

Page 7: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Merchant Silicon gives usCommodity Switches

Maker Broadcom Fulcrum Fujitsu

Model BCM56820 FM4224 MB86C69RBC

Ports 24 24 26

Cost NDA NDA $410

Power NDA 20 W 22 W

Latency < 1 μs 300 ns 300 ns

Area NDA 40 x 40 mm 35 x 35 mm

SRAM NDA 2 MB 2.9 MB

Process 65 nm 130 nm 90 nm

Hot Interconnects August 27, 2009

7Nathan Farrington

[email protected]

Page 8: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Eliminate Redundancy

• Networks of packet switches contain many redundant components– chassis, power

conditioning circuits, cooling

– CPUs, DRAM

• Repackage these discrete switches to lower the cost and power consumption

CPUASIC

PHY

SFP+ SFP+ SFP+

FAN

FAN

FAN

FAN

PSU

8 Ports

Hot Interconnects August 27, 2009

8Nathan Farrington

[email protected]

Page 9: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Our Architecture, in a Nutshell

• Fat tree of merchant silicon switch ASICs• Hiding cabling complexity with PCB traces and

optics• Partition into multiple pod switches + single

core switch array• Custom EEP ASIC to further reduce cost and

power• Scales to 65,536 ports when 64-port ASICs

become available, late 2009

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

9

Page 10: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

3 Different Designs

• 24-ary 3-tree

• 720 switch ASICs

• 3,456 ports of 10GbE

• No oversubscription

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

10

1 2 3

Page 11: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Network 1: No Engineering Required

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

11

Cost of Parts $4.88M

Power 52.7 kW

Cabling Complexity 3,456

Footprint 720 RU

NRE $0

• 720 discrete packet switches, connected with optical fiber

Cabling complexity (noun): the number of long cables in a data center network.

Page 12: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Network 2: Custom Boards and Chassis

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

12

Cost of Parts $3.07M

Power 41.0 kW

Cabling Complexity 96

Footprint 192 RU

NRE $3M est

• 24 “pod” switches, one core switch array, 96 cables

This design is shown in more detail later.

Page 13: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Switch at 10G,but Transmit at 40G

SFP SFP+ QSFP

Rate 1 Gb/s 10 Gb/s 40 Gb/s

Cost/Gb/s $35* $25* $15*

Power/Gb/s 500mW 150mW 60mW

* 2008-2009 Prices

Hot Interconnects August 27, 2009

13Nathan Farrington

[email protected]

Page 14: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Network 3: Network 2 + Custom ASIC

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

14

Cost of Parts $2.33M

Power 36.4 kW

Cabling Complexity 96

Footprint 114 RU

NRE $8M est

• Uses 40GbE between pod switches and core switch array; everything else is same as Network 2.

EEP

This simple ASIC provides tremendous cost and power savings.

Page 15: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Cost of Parts

4.88

3.072.33

0

1

2

3

4

5

6

Cost of Parts (in millions)

Network 1

Network 2

Network 3

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

15

Page 16: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Power Consumption

52.7

4136.4

0

10

20

30

40

50

60

Power Consumption (kW)

Network 1

Network 2

Network 3

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

16

Page 17: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Cabling Complexity

3,456

96 960

500

1,000

1,500

2,000

2,500

3,000

3,500

4,000

Cabling Complexity

Network 1

Network 2

Network 3

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

17

Page 18: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Footprint

720

192114

0

100

200

300

400

500

600

700

800

Footprint (in rack units)

Network 1

Network 2

Network 3

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

18

Page 19: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Partially Deployed Switch

Hot Interconnects August 27, 2009

19Nathan Farrington

[email protected]

Page 20: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Fully Deployed Switch

Hot Interconnects August 27, 2009

20Nathan Farrington

[email protected]

Page 21: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Pod Switch

Hot Interconnects August 27, 2009

21Nathan Farrington

[email protected]

Page 22: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Logical Topology

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

22

Page 23: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Pod Switch Line Card

Hot Interconnects August 27, 2009

23Nathan Farrington

[email protected]

Page 24: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Pod Switch Uplink Card

Hot Interconnects August 27, 2009

24Nathan Farrington

[email protected]

Page 25: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Core Switch Array Card

Hot Interconnects August 27, 2009

25Nathan Farrington

[email protected]

Page 26: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Why an Ethernet Extension Protocol?

• Optical transceivers are 80% of the cost

• EEP allows the use of fewer and faster optical transceivers

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

26

EEP EEP40GbE

10GbE

10GbE

10GbE

10GbE

10GbE

10GbE

10GbE

10GbE

Page 27: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

How does EEP work?

• Ethernet frames are split up into EEP frames• Most EEP frames are 65 bytes

– Header is 1 byte; payload is 64 bytes

• Header encodes ingress/egress port

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

27

EEP EEP

Page 28: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

How does EEP work?

• Round-robin arbiter• EEP frames are transmitted as one large

Ethernet frame• 40GbE overclocked by 1.6%

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

28

EEP EEP

Page 29: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

29

EEP EEP

Ethernet Frames

Page 30: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

30

EEP EEP

EEP Frames

123

1

12

1

3

2

Page 31: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

31

EEP EEP

123

1

12

1

3

2

123

1

12

1

3

2

nfarring
Sticky Note
This slide shows an animation of EEP frames being selected round-robin by the EEP chip on the left, transmitted to the EEP chip on the right, and then being reassembled.
Page 32: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

EEP Frame Format

SOF: Start of Ethernet Frame

EOF: End of Ethernet Frame

LEN: Set if EEP Frame contains less than 64B of payload

Virtual Link ID: Corresponds to port number (0-15)

Payload Length: (0-63B)

Hot Interconnects August 27, 2009

32Nathan Farrington

[email protected]

Page 33: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Why not use VLANs?

• Because it adds latency and requires more SRAM

• FPGA Implementation– VLAN tagging

– EEP

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

33

Page 34: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Latency Measurements

Hot Interconnects August 27, 2009

34Nathan Farrington

[email protected]

Page 35: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Related Work

• M. Al-Fares, A. Loukissas, A. Vahdat. A Scalable, Commodity Data Center Network Architecture. In SIGCOMM ’08.• Fat trees of commodity switches, Layer 3 routing, flow scheduling

• R. N. Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and A. Vahdat. PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric. In SIGCOMM ’09.

– Layer 2 routing, plug-and-play configuration, fault tolerance, switch software modifications

• A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. VL2: A Scalable and Flexible Data Center Network. In SIGCOMM ’09.

– Layer 2 routing, end-host modifications

Hot Interconnects August 27, 2009

35Nathan Farrington

[email protected]

Page 36: Data Center Switch Architecture in the Age of … · Data Center Switch Architecture in the Age of ... A Scalable, Commodity Data Center Network ... and S. Sengupta. VL2: A Scalable

Conclusion

• General architecture– Fat tree of merchant silicon switch ASICs

– Hiding cabling complexity

– Pods + Core

– Custom EEP ASIC

– Scales to 65,536 ports with 64-port ASICs

• Design of a 3,456-port 10GbE switch

• Design of the EEP ASIC

Hot Interconnects August 27, 2009

Nathan Farrington [email protected]

36