Hardware-Accelerated Control Planes Hotnets 18

Post on 08-May-2022

5 views 0 download

Transcript of Hardware-Accelerated Control Planes Hotnets 18

Hardware-Accelerated Network Control Planes

Edgar Costa Molero(1),

Stefano Vissicchio(2), Laurent Vanbever(1)

(2)(1)

Modern networks architectures are

split in (at least) two planes

2

3

Modern networks architectures are

split in (at least) two planes

data plane

control plane

4

Network planes can be implemented in bothsoftware or hardware

data plane

control plane

Software Hardware

Software Hardware

Existing data plane implementations cover the entire

software/hardware spectrum

5

data plane OVS ASIC

VPP

DPDK

FPGA

control plane

Click

Software Hardware

Software Hardware

6

What about the control plane ?

data plane

control plane

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

Control plane implementations make

seldom use of the hardware resources

7

data plane

control plane BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

8

data plane

control plane BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

9

data plane

control plane BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

Not explored

10

data plane

control plane BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

Do we care?

Even state-of-the-art software control planeshave room for improvement

11

Even state-of-the-art software control planeshave room for improvement

12

React1 It can take up to a minute to detect normal failures

Even state-of-the-art software control planeshave room for improvement

13

React1

Compute2 ~1.5 minutes to converge the control plane of an IXP route server

Even state-of-the-art software control planeshave room for improvement

14

React1

Compute2

Update3 O(100us) to update a forwardingentry

15

data plane

control plane BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

Do we care?

16

data plane

control plane BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

Yes!

17

data plane

control plane BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

Can we do somethingabout it?

Modern programmable devices can perform computations on billions of packets per second

18

Modern programmable devices can perform computations on billions of packets per second

19

Read & modify packet headerse.g. to update network state

Basic operationse.g. min & max

Add or remove custom headerse.g. to carry routing information

Keep statee.g. to save best paths

20

data plane

control plane BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

Can we do somethingabout it?

21

data plane

control plane BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

Yes!

22

data plane

control plane

hardwarebased CP

Step 1

BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

23

data plane

control plane HW/SW Codesign

Step 2

BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

24

Sensing1 monitors network to detect changes

Main tasks to compute forwarding state are…

25

Sensing1

Notification2

Main tasks to compute forwarding state are…

exchanges with network devices all the information learnt

26

Sensing1

Notification2

Computation 3

Main tasks to compute forwarding state are…

Computes forwarding paths when network changes are detected

Updates the data plane accordingly

27

Hardware-based network sensing

Goal

Challenges

28

Detect both hard and gray failures

Hardware-based network sensing

Goal

Challenges

29

Detect both hard and gray failures

Hardware-based network sensing

Goal

e.g. random drop, TCAM bit flips

Challenges

30

Hardware-based network sensing

Goal

Challenges Basic hello-based mechanisms are not enough

Detect both hard and gray failures Goal

e.g. random drop, TCAM bit flips

31

A B

Switches synchronously exchange packet counts

32

A B

destination

# counter

detection state

stored in registers

Switches synchronously exchange packet counts

0

0

0

0

0

0

# counter

destination

33

A B

start counting

stop counting

traffic

destination

# received & forwarded packets

detection state

stored in registers

Upstream switch starts probing campaigns

0

0

0

0

0

0

# sent packets

destination

34

A B

start counting

stop counting

traffic

destination

# received & forwarded packets

detection state

stored in registers

Traffic for some prefixes gets dropped

0

0

0

0

0

0

# sent packets

destination

red packets get dropped

35

A B

start counting

stop counting

traffic

destination

# received & forwarded packets

detection state

stored in registers

send counters & compare

Downstream switch sends countersto upstream

0

2

2

3

2

2

# sent packets

destination

36

A B

start counting

stop counting

traffic

destination

# received & forwarded packets

detection state

stored in registers

send counters & compare

Upstream switch detects the failure by comparing counters

0

2

2

3

2

2

# sent packets

destination

Hardware-based notifications

37

Goal

Challenges

Hardware-based notifications

38

Implement a broadcast notificationmechanism in hardware

Goal

Challenges

Hardware-based notifications

39

Implement a broadcast notificationmechanism in hardware

Goal

Challenges

Require reliable communication

Avoid broadcast storms

40

▸ Use per switch broadcast sequence numbers

Hardware-based notifications

Avoid broadcast storms

41

▸ Use per switch broadcast sequence numbers

▸ Send notification duplicates

▸ Use maximum priority queues

Hardware-based notifications

Avoid broadcast storms

Require reliable communication

42

Goal

Challenges

Hardware-based computation

43

e.g. path vector

Goal

Challenges

Hardware-based computation

Run distributed routing algorithms in hardware

44

e.g. path vector

Goal

Challenges

Hardware-based computation

Run distributed routing algorithms in hardware

Computation logic is limited

Resources are heavily limited

45

B C DA1

0output port

prefix-to- index

link cost

A

port cost path

…50

50 1 3 [A B C D]

forwarding state

stored in registers

11

10

C101

1

46

B C DA1

0output port

prefix-to- index

link cost

A statically configured

port cost path

…50

50 1 3 [A B C D]

forwarding state

stored in registers

11

10

Statically configured tables map prefixes to registers in memory

C101

1

maps prefixes to registers

destinationnetwork

47

B C DA1

0output port

prefix-to- index

link cost

A statically configured

port cost path

…50

50 1 3 [A B C D]

forwarding state

stored in registers

11

10

Registers store best paths and its attributes

C101

1

maps prefixes to registers

only store the best pathand its attributes

destinationnetwork

48

B C DA1

0output port

prefix-to- index

link cost

A…

50

11

10

Switches periodically advertise vectorsto neighbors

C101

1

destination path

0 [A]

cost

0 [A]

periodicallyadvertise vectors

port cost path

50 -1 ∞ Ø

forwarding state

stored in registers

dynamicallycomputed

If (10 + 0) < ∞

50 0 10 [A D]

49

B C DA1

0output port

prefix-to- index

link cost

A

port cost path

…50

50 0 10 [A D]

forwarding state

stored in registers

11

10

Switches periodically advertise vectorsto neighbors

C101

1

destination path

1 [A]

cost

50

B C DA1

0output port

11

10

Switches periodically advertise vectorsto neighbors

1

destination path

2 [A]

cost

prefix-to- index

link cost

A statically configured

…50

dynamicallycomputed

If (2 + 1) < 10

50 1 3

C101

port cost path

50 0 10 [A D]

forwarding state

stored in registers[A B C D]

51

B C DA1

0output port

prefix-to- index

link cost

A statically configured

port cost path

…50

50 1 3 [A B C D]

forwarding state

stored in registers

11

10

Computing new forwarding state after a a link failure

C101link failure

52

B C DA

destination path

1

0output port

prefix-to- index

link cost

A statically configured

port cost path

…50

50 1 3 [A B C D]

forwarding state

stored in registers

11

10

∞ Ø

dynamicallycomputed

Computing new forwarding state after a a link failure

50 -1 ∞

C101

Ø

costlink failure

53

B C DA1

0output port

data-plane-generated path-vector

prefix-to- index

link cost

A statically configured

port cost path

…50

50 -1 ∞ Ø

forwarding state

stored in registers

11

10

0 [A]

dynamicallycomputed

If (10 + 0) < ∞

50 0 10 [A D]

C101link failure

Computing new forwarding state after a a link failure

Does it actually work?

54

Does it actually work? Yes!

55

56

Hardware-Accelerated P4 prototype

Compiled it to bmv2

2000 lines of P4 code

Implementation

Capabilities

Implemented in P416

path-vector routing▸ Intra-domain destinations

▸ Inter-domain destinationsBGP-like route selection

57

A

S2

S3

S4 S5

B

C

D

F

S1

AS1

AS2

AS3

AS4

AS5

G

H

x

p2

p1

AS7

3

1

1

1

3

2

2

peer

cust

cust

peer

peer

E

AS6

We tested our implementation in a real case study

58

A

S2

S3

S4 S5

B

C

D

F

S1

AS1

AS2

AS3

AS4

AS5

G

H

x

p2

p1

AS7

3

1

1

1

3

2

2

peer

cust

cust

peer

peer

E

AS6

Only the internal switches run the hardware-based control plane

59

A

S2

S3

S4 S5

B

C

D

F

S1

AS1

AS2

AS3

AS4

AS5

G

H

x

p2

p1

AS7

3

1

1

1

3

2

2

peer

cust

cust

peer

peer

E

AS6

Each switch is connected to an external peer or customer

60

A

S2

S3

S4 S5

B

C

D

F

S1

AS1

AS2

AS3

AS4

AS5

G

H

x

p2

p1

AS7

3

1

1

1

3

2

2

peer

cust

cust

peer

peer

E

AS6

We generate two TCP flowsfrom AS1 and AS2

61

A

S2

S3

S4 S5

B

C

D

F

S1

AS1

AS2

AS3

AS4

AS5

G

H

x

p2

p1

AS7

3

1

1

1

3

2

2

peer

cust

cust

peer

peer

E

AS6

LLnk (S1-AS3)

)low from AS1 )low from AS2

LLnk (S5-AS5)

time [s]

Bandwidth [Mbps]

0 4.8 15 25

0

10

4

6

Monitor traffic before the failure

Traffic S1- AS3

62

A

S2

S3

S4 S5

B

C

D

F

S1

AS1

AS2

AS3

AS4

AS5

G

H

x

p2

p1

AS7

3

1

1

1

3

2

2

peer

cust

cust

peer

peer

E

AS6

(1) internal Link failure

LLnk (S1-AS3)

)low from AS1 )low from AS2

LLnk (S5-AS5)

time [s]

Bandwidth [Mbps]

0 4.8 15 25

S2 to S3 link failure

0

10

4

6

Internal link fails and triggersthe path-vector algorithm

Traffic S1- AS3

63

A

S2

S3

S4 S5

B

C

D

F

S1

AS1

AS2

AS3

AS4

AS5

G

H

x

p2

p1

AS7

3

1

1

1

3

2

2

peer

cust

cust

peer

peer

E

AS6

(1) internal Link failure

LLnk (S1-AS3)

)low from AS1 )low from AS2

LLnk (S5-AS5)

time [s]

Bandwidth [Mbps]

0 4.8 15 25

S2 to S3 link failure

0

10

4

6

Traffic S1- AS3

Internal link fails and triggersthe path-vector algorithm

64

A

S2

S3

S4 S5

B

C

D

F

S1

AS1

AS2

AS3

AS4

AS5

G

H

x

p2

p1

AS7

3

1

1

1

3

2

2

peer

cust

cust

peer

peer

E

AS6

(1) internal Link failure

(2) external Link failure

(3) prefix x withdrawal

LLnk (S1-AS3)

)low from AS1 )low from AS2

LLnk (S5-AS5)

time [s]

Bandwidth [Mbps]

0 4.8 15 25

S2 to S3 link failure

withdrawal

0

10

4

6

Traffic S1- AS3

External link failure triggers a prefix withdrawal

65

A

S2

S3

S4 S5

B

C

D

F

S1

AS1

AS2

AS3

AS4

AS5

G

H

x

p2

p1

AS7

3

1

1

1

3

2

2

peer

cust

cust

peer

peer

E

AS6

(1) internal Link failure

(2) external Link failure

(3) prefix x withdrawal

(4) BGP exportpolicy violation

LLnk (S1-AS3)

)low from AS1 )low from AS2

LLnk (S5-AS5)

time [s]

Bandwidth [Mbps]

0 4.8 15 25

withdrawal

0

10

4

6

Network computes new egressand applies new policies

Traffic S5- AS5

66

data plane

control plane

hardwarebased CP

Step 1

BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

67

data plane

control plane

hardwarebased CPBIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

is it a good idea?

Programmable hardware is great but… not limitless

68

69

Some tasks cannot be offloaded

Others might not be even desirable !

Programmable hardware is great but… not limitless

70

Some tasks cannot be offloaded

Others might not be even desirable !

Reliable protocolse.g. TCP would require too many resources !

Poor scalability of control plane taskshardware memory is scare and expensive

Programmable hardware is great but… not limitless

Can we have the best of both worlds?

71

72

HW/SWcodesign

Can we have the best of both worlds?

73

data plane

control plane HW/SW Codesign

Step 2

BIRD

Quagga

FRRouting

OVS ASIC

VPP

DPDK

FPGAClick

Software Hardware

Software Hardware

Hardware-software codesign

74

Specification Optimization Synthesis

functions

Costi( . )

Performancei( . )

constraints

Software

Hardware

problemgraph

mapping set

architecture graph

∀i: pred(i)<100

Hardware-software codesign

75

Specification Optimization

functions minn

∑i=1

Costi( . )

maxn

∑i=1

Performancei( . )Costi( . )

constraints

Software

Hardware

problemgraph

mapping set

architecture graph

∀i: pred(i)<100

Software

Hardware

Software

Hardware

cost(x):120

perf(x):200

cost(y):80

perf(y):200

Synthesis

Performancei( . )

Hardware-software codesign

76

Specification Optimization Synthesis

Software

Hardware

functions

runtime API

configurations P4 code

minn

∑i=1

Costi( . )

maxn

∑i=1

Performancei( . )Costi( . )

constraints

Software

Hardware

problemgraph

mapping set

architecture graph

configurations C/C++

∀i: pred(i)<100

Software

Hardware

Software

Hardware

cost(x):120

perf(x):200

cost(y):80

perf(y):200

Performancei( . )

Summary

77

We identified an unexploited opportunity

Software Hardware

Opportunity!

Summary

78

We showed that programmable data planes can run control plane tasks

Software Hardware

Opportunity!

We identified an unexploited opportunity

Summary

79

We showed that programmable data planes can run control plane tasks

We plan on leveraging HW/SW codesignto explore design tradeoffs

Software Hardware

Opportunity!

minn

∑i=1

Costi( . )

maxn

∑i=1

Per for m ancei( . )Costi( . )

Per for m ancei( . )

We identified an unexploited opportunity

80