Nerd Lunch
8/3/2019 Nerd Lunch
1/51
1
RouteBricks
Scaling Software Routers with Modern Servers
Kevin Fall
Intel Labs, Berkeley
Feb 24, 2010
Ericsson, San Jose, CA
Project Participants
Intel Labs
Gianluca Iannaccone (co-PI, researcher)
Sylvia Ratnasamy (co-PI, researcher)
Kevin Fall (principal engineer)
Allan Knies (principal engineer)
Maziar Manesh (research engineer)
Eddie Kohler (Click expert)
Dan Dahle (tech strategy)
Badarinath Kommandur (tech strategy)
Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland
Katerina Argyraki (faculty)
Mihai Dobrescu (student)
Diaqing Chu (student)
Outline
Introduction
Approach: cluster-based router
RouteBricks implementation
Performance results
Next steps
RouteBricks: in a nutshell
A high-speed router using IA server components
fully programmable: control and data plane
extensible: evolve networks via software upgrade
incrementally scalable: flat cost per bit
Motivation
Network infrastructure is doing more than ever before
Packet-pushing (routing) is no longer the whole story
security, data loss protection, application optimization, etc.
has led to a proliferation of special appliances
and notions that perhaps routers could do more
Cisco, Juniper supporting open APIs
OpenFlow consortium: Stanford, HP, Broadcom, Cisco
But these platforms weren't born programmable
Motivation
If flexibility ultimately implies programmability...
Hard to beat IA platforms and their ecosystem
Or price
However, must deal with persistent folklore:
IA can't do high-speed packet processing
But today's IA isn't the IA you know from your youth
multicore, multiple integrated memory controllers, PCIe, multi-queue NICs, ...
Motivation
Combine a desire for more programmability...
with new router-friendly server trends
a new opportunity for IA servers?
RouteBricks: How might we build a big (~1Tbps) IA-based software router?
Challenge
traditional software routers
research prototypes (2007): 1 - 2 Gbps
Vyatta* datasheet (2009): 2 - 4 Gbps
current carrier-grade routers
line speeds: 10/40Gbps
aggregate switching speeds: 40Gbps to 92Tbps!
* Other names and brands may be claimed as properties of others
Strategy
1. A cluster-based router architecture
each server need only scale to line speeds (10-40Gbps), rather than aggregate speeds (40Gbps to 92Tbps)
2. Understand whether modern server architectures can scale to line speeds (10-40Gbps)
if not, why?
3. Leverage open-source control plane implementations
xorp, quagga, etc. [but we focus on data plane here]
Broader Benefits
1. infrastructure that is well-known and cheaper to evolve
familiar programming environment
separately-evolvable network software and hardware
reduced cost -> more frequent upgrade opportunity
2. networks with the benefits of the PC ecosystem
high-volume manufacturing
widespread supply/support
state-of-the-art process technologies (ride Moore's Law)
evolving PC platform features (power mgmt, crypto, etc.)
Outline
Introduction
Approach: cluster-based router
RouteBricks implementation
Performance results
Next steps
Traditional router architecture
[Figure: a router with N ports, numbered 1 to N; per-port speed R bps in each direction]
Traditional router architecture
[Figure: N linecards attached to a switch fabric with a switch scheduler. Each linecard runs at R bps and performs IP address lookup, queue management, shaping, etc., using address tables, FIB, and ACLs. The switch fabric runs at NR. A control processor runs IOS/quagga/xorp, etc.]
Moving to a cluster-router
[Figure: the traditional router with N ports at R bps: linecards (IP address lookup, queue management, shaping, etc.; address tables, FIB, ACLs), switch fabric and switch scheduler, and a control processor running IOS/quagga/xorp, etc.]
step 1: single server implements one port; N ports -> N servers
Moving to a cluster-router
[Figure: N servers, one per port at R bps, attached to the switch fabric and switch scheduler; control processor runs IOS/quagga/xorp, etc.]
step 1: single server implements one port; N ports -> N servers
linecard functions implemented in software: IP address lookup, queue management, shaping, etc., with address tables, FIB, and ACLs
Each server must process at least 2R traffic (in+out)
Moving to a cluster-router
[Figure: N servers, one per port at R bps; switch fabric and switch scheduler; control processor runs IOS/quagga/xorp, etc.]
step 2: replace switch fabric and scheduler with a distributed, software-based solution
Moving to a cluster-router
[Figure: N servers, one per port at R bps, joined by a server-to-server interconnect topology; control processor runs IOS/quagga/xorp, etc.]
step 2: replace switch fabric and scheduler with a distributed, software-based solution
distributed scheduling algorithms, based on Valiant Load Balancing (VLB)
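The slide names VLB without spelling out the forwarding rule. A minimal sketch of classic two-phase VLB, assuming servers are numbered 0 to N-1 and that the intermediate server is drawn uniformly at random; the function name `vlb_path` is hypothetical and this is not the RouteBricks code:

```python
import random


def vlb_path(src, dst, n_servers, rng=random):
    """Return the servers a packet visits under two-phase VLB (illustrative only)."""
    if not (0 <= src < n_servers and 0 <= dst < n_servers):
        raise ValueError("server index out of range")
    # Phase 1: forward to an intermediate server chosen uniformly at random,
    # independently of the packet's destination.
    mid = rng.randrange(n_servers)
    path = [src]
    if mid != src:
        path.append(mid)
    # Phase 2: the intermediate server forwards to the real output server.
    if dst != path[-1]:
        path.append(dst)
    return path
```

Because the intermediate hop ignores the destination, load spreads evenly over the internal links regardless of the traffic matrix; the cost is the extra "through" traffic each server must handle.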
Example: VLB over a mesh*
* other topologies offer different tradeoffs
[Figure: N servers, numbered 1 to N, connected in a mesh; N ports, port rate R bps in each direction]
# servers: N
internal fanout: N-1
internal link capacity (RN/[N(N-1)/2]): 2R/(N-1)
processing/server [out+in+through]: 3R (2R)**
N servers can achieve switching speeds of NR bps, provided each server can process packets at 3R (**2R for Direct-VLB avg case)
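The mesh figures can be checked with a few lines of arithmetic. This sketch simply evaluates the slide's own formulas; the function name and the dictionary keys are made up for illustration:

```python
from fractions import Fraction


def vlb_mesh_requirements(n_servers, port_rate_gbps, direct_vlb=False):
    """Evaluate the slide's VLB-over-a-mesh formulas for N servers at R Gbps per port."""
    if n_servers < 2:
        raise ValueError("a mesh needs at least two servers")
    r = Fraction(port_rate_gbps)
    n_links = n_servers * (n_servers - 1) // 2   # links in a full mesh
    return {
        "internal_fanout": n_servers - 1,
        # RN / [N(N-1)/2], which simplifies to 2R/(N-1)
        "internal_link_capacity": r * n_servers / n_links,
        # out + in + through: 3R, or 2R in the Direct-VLB average case
        "per_server_processing": (2 if direct_vlb else 3) * r,
        "aggregate_switching": n_servers * r,
    }
```

For example, four servers with 10 Gbps ports give a fanout of 3, internal links of 20/3 Gbps (about 6.7 Gbps), 30 Gbps of processing per server (20 Gbps with Direct VLB), and 40 Gbps aggregate.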
Outline
Introduction
Approach: cluster-based router
RouteBricks implementation
RB4 prototype
Click overview
Performance results
Next steps
RB4: hardware architecture
[Figure: RB4 server diagram, links labeled 10Gbps]
4 dual-socket NHM-EPs
8x 2.8GHz cores (no SMT)
8MB L3 cache
6x 1GB DDR3
2 PCIe 2.0 slots (8 lanes)
default BIOS setting
2x 10Gbps Oplin cards per server
dual port
PCIe 1.1
(now using Niantic / PCIe 2.0)
RB4: software architecture
[Figure: per-server software stack, ports at 10Gbps. Labels: Linux 2.6.24 (unmodified); user space: place for value-added services (e.g., monitoring, energy proxy, management, etc.), with hooks for new services; kernel: Click runtime with the RB data plane, i.e. RB VLB and packet processing (linecard), implemented in Click; RB device driver; four NICs]
Click Overview
Modular, extensible software router
built on Linux as a kernel module
combines versatility and high performance
Architecture consists of
elements that implement packet processing functions
a configuration language that connects elements into a packet data flow
an internal scheduler that decides which element to run
Large open source library (200+ elements) means new routing applications can often be written with just a configuration script
slide material courtesy E. Kohler, UCLA
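To make the element idea concrete, here is a toy model of a push-style element chain. It is a sketch only: real Click elements are C++ classes wired together by Click's own configuration language, and the class names below (`Element`, `PacketCounter`, `DecrementTTL`, `Sink`) are hypothetical stand-ins rather than Click's API. The internal scheduler is not modeled.

```python
class Element:
    """A packet processing step that hands its result to the next element."""

    def __init__(self, name):
        self.name = name
        self.next = None

    def connect(self, other):
        # Return the downstream element so connections can be chained.
        self.next = other
        return other

    def process(self, packet):
        return packet  # default: pass the packet through unchanged

    def push(self, packet):
        packet = self.process(packet)
        if packet is not None and self.next is not None:
            self.next.push(packet)


class PacketCounter(Element):
    def __init__(self, name):
        super().__init__(name)
        self.count = 0

    def process(self, packet):
        self.count += 1
        return packet


class DecrementTTL(Element):
    def process(self, packet):
        if packet["ttl"] <= 1:
            return None  # drop packets whose TTL has expired
        return dict(packet, ttl=packet["ttl"] - 1)


class Sink(Element):
    def __init__(self, name):
        super().__init__(name)
        self.received = []

    def process(self, packet):
        self.received.append(packet)
        return packet


def build_pipeline():
    """Wire counter -> TTL decrement -> sink, the role a configuration script plays."""
    counter, sink = PacketCounter("rx_count"), Sink("tx")
    counter.connect(DecrementTTL("dec_ttl")).connect(sink)
    return counter, sink
```

Pushing a packet with `ttl` 5 into the counter delivers it to the sink with `ttl` 4, while a packet with `ttl` 1 is counted and then dropped. New behavior comes from rewiring existing elements, not from writing new ones.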
RB4: software architecture
[Figure: per-server software stack. Labels: Linux 2.6.24 (unmodified); user space: value-added services (e.g., monitoring, energy proxy, management, etc.), with hooks for new services; kernel: Click runtime with RB VLB and packet processing (linecard), implemented in Click; RB device driver; four NICs]
Intel 10G driver
polling-only operation (no interrupts)
transfers packets to memory in batches of k (we use k=16)
RSS w/ up to 32/64 rx/tx NIC queues
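The two driver techniques on this slide, batched polling and spreading flows across NIC queues, can be sketched in software. This is a model under stated assumptions, not the driver: CRC32 stands in for the NIC's RSS hash (real NICs typically use a Toeplitz hash), a `deque` stands in for a receive ring, and the function names are hypothetical.

```python
import zlib
from collections import deque

BATCH_SIZE = 16  # the slide's k=16


def rss_queue(flow, n_queues):
    """Map a flow tuple to a receive queue so one flow always lands on one queue."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % n_queues  # CRC32 is a stand-in for the NIC hash


def poll_queue(queue, batch_size=BATCH_SIZE):
    """Take up to batch_size packets from a queue without waiting for an interrupt."""
    batch = []
    while queue and len(batch) < batch_size:
        batch.append(queue.popleft())
    return batch
```

Polling a queue that holds 40 packets returns batches of 16, 16, and 8, then an empty batch. Batching amortizes per-packet bookkeeping over k packets, and per-flow queue assignment lets separate cores poll separate queues without reordering packets within a flow.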
Outline
Introduction
Approach: cluster-based router
RouteBricks implementation
RB4 prototype Click