SDN and NFV Portfolio - Open...

28
Copyright 2016 FUJITSU LABORATORIES LIMITED Kazuki Hyoudou Fujitsu Laboratories Ltd. NUMA-aware DPDK-OVS for High Performance NFV Platform

Transcript of SDN and NFV Portfolio - Open...

Page 1: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Kazuki Hyoudou Fujitsu Laboratories Ltd.

NUMA-aware DPDK-OVS for High Performance NFV Platform

Page 2: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

What is NFV?

Network Appliances

Proprietary Hardware

Vendor specific API/CLI

High CAPEX & OPEX

Need long term to update

Network Functions

Virtualization (NFV)

NW functions by software

Standard Hardware

1

Refer to http://portal.etsi.org/NFV/NFV_White_Paper2.pdf

Page 3: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Important aspects for NFV Platform

Use of commodity (non-proprietary) hardware

Standard I/F for VNF such as virtio

I/O throughputs

The number of I/O ports (Eth I/F)

The number of VNFs in a BOX

etc.

2

Page 4: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

To increase VNFs and I/O per BOX …

3

CPU1

I/O Controller

NIC

NIC

NIC

Core Core Core Core

Core Core Core Core

VM VM VM VM

Me

mo

ry C

on

tro

ller

Me

mo

ry

for VMs

for DPDK-OVS

• Assumptions

• a CPU has 8 cores and 3 I/O slots

• use 2 cores for physical port and

more 2 cores for vhost-user

• a VM occupies one core

• A BOX supports

• 4 VMs (VNFs)

• 3 NICs

IA server (BOX)

Page 5: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

To increase VNFs and I/O per BOX …

One of the easiest way is to use multi-socket server, But …

4

CPU1

I/O Controller

NIC

NIC

NIC

Core Core Core Core

Core Core Core Core

VM VM VM VM

Me

mo

ry C

on

tro

ller

Me

mo

ry

CPU2

I/O Controller

NIC

NIC

NIC

Core Core Core Core

Core Core Core Core

VM VM VM VM

Me

mo

ry C

on

tro

ller

Me

mo

ry

for VMs for VMs

for DPDK-OVS for DPDK-OVS

IA server (BOX)

• Assumptions

• a CPU has 8 cores and 3 I/O slots

• use 2 cores for physical port and

more 2 cores for vhost-user

• a VM occupies one core

• A BOX supports

• 4 VMs 8 VMs

• 3 NICs 6 NICs

Page 6: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Performance Issue of OVS on NUMA system

If the VM is running on a NUMA node different from the node to which NIC

is connected, its throughput decreases significantly.

5

CPU1 CPU2 M

em

ory

Co

ntr

olle

r

Me

mo

ry C

on

tro

ller

Me

mo

ry

Me

mo

ry

I/O Controller I/O Controller

NIC

Eth Port Eth Port

OVS w/ DPDK

pmd thread

for PHY

pmd thread

for vhost-user

VM

DMA

by NIC

VRING (virtio)

Remote memory access

by PMD thread (CPU)

QPI

Page 7: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Pre-test to Study the Issue

Measure the total throughput of 3 VNFs (L3 forwarder w/o DPDK)

with 2 cases as follows;

6

CPU1 CPU2

Me

mo

ry C

on

tro

ller

Me

mo

ry C

on

tro

ller

Me

mo

ry

Me

mo

ry

I/O Controller I/O Controller

NIC Eth Port Eth Port

OVS w/ DPDK

pmd (PHY)

Core#4

pmd (PHY)

Core#5

pmd (VIF)

Core#6 pmd (VIF)

Core#7

CPU1 CPU2

Me

mo

ry C

on

tro

ller

Me

mo

ry

Me

mo

ry

I/O Controller I/O Controller

NIC Eth Port Eth Port

OVS w/ DPDK

pmd (PHY)

Core#4

pmd (PHY)

Core#5

pmd (VIF)

Core#6 pmd (VIF)

Core#7

Case1: All components are in the same NUMA node

(no flows go through QPI)

Case2: VMs are running on different node

(flows go through QPI by CPU)

QPI QPI

VM#1

Core#3

VM#1

Core#2

VM#1

Core#1 VM#1

Core#11

VM#1

Core#10

VM#1

Core#9

Me

mo

ry C

on

tro

ller

Page 8: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Result of Pre-test

Throughput goes down 30 to 40%!

(Max. decreasing is 43.5% in this test)

7

% of Case1

pkt size Case1 Case2

64 100.0 58.3

72 100.0 57.1

128 100.0 56.5

256 100.0 60.6

512 100.0 64.4

768 100.0 66.7

1024 100.0 76.0

1280 100.0 84.7

1500 100.0 88.0

02468

10121416182022

64 72 128 256 512 768 1024 1280 1500

Case1

Case2

Th

rou

gh

put (G

bits/s

ec)

Packet Size (Bytes)

Max. Throughput (no drop condition)

• CPU: Xeon® E5-2667 v3 @3.20GHz, 8 Cores x2

• Memory: 128 GB (DDR4-2133 16G x4 x2)

• NIC: Intel® X710-DA2 (PCIe Gen3 x8)

• OS: CentOS 7.1

• OVS: v.2.4.1 w/ DPDK v.2.0.0

43.5%

DOWN

Page 9: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Analysis of the Issue

Consider the cause of the issue

QPI link has enough bandwidth (307 Gbps bi-directional bandwidth)

Remote memory access latency for copying data and lock operations are High!

• L3 cache: 10ns, Local memory: 70ns, Remote memory: 120ns

Measure performance counters concerning L3 cache miss event

8

Event Name Case1 Case2

MEM_LOAD_UOPS_RETIRED.L3_MISS 552,945 6,708,912

MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM 554,079 644,671

MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975

MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM 69 5,585,607

Removing remote memory access by CPU,

Is it expected to increase throughput?

Remote memory

access after L3 miss

Page 10: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Additional Test for Assumption

Removing remote memory access by CPU using DMA

9

CPU1 CPU2

Me

mo

ry C

on

tro

ller

Me

mo

ry C

on

tro

ller

Me

mo

ry

Me

mo

ry

I/O Controller I/O Controller

NIC

Eth Port Eth Port

OVS w/ DPDK

pmd (PHY)

Core#12

pmd (PHY)

Core#13

pmd (VIF)

Core#14 pmd (VIF)

Core#15

Case3: Only NIC is connected to the NUMA node

which is different from all software are running

(flows go through QPI by DMA) • Modify OVS 2.4.1 for this test as follows

• User can specify NUMA-node per vport

• mbuf pool is allocated on the specified node

• DMA buffer is mapped to the specified node

if the vport is a DPDK port (physical port)

• Receiving process of the vport is handled by

a thread running on a core of the specified node

VM#1

Core#11

VM#1

Core#10

VM#1

Core#9

QPI

Remote memory access

by DMA

traffic

walkthrough of data to VM

Page 11: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Result of Additional Test

Case3 (flows go through QPI by DMA) achieves high-throughput

as same as case1 (no flows go through QPI) !

10

02468

10121416182022

64 72 128 256 512 768 1024 1280 1500

Case1Case2Case3

Th

rou

gh

put (G

bits/s

ec)

Packet Size (Bytes)

Max. Throughput (no drop condition) % of Case1

pkt size Case1 Case2 Case3

64 100.0 58.3 108.3

72 100.0 57.1 107.1

128 100.0 56.5 100.0

256 100.0 60.6 103.0

512 100.0 64.4 108.5

768 100.0 66.7 102.5

1024 100.0 76.0 104.2

1280 100.0 84.7 102.0

1500 100.0 88.0 100.0

Page 12: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Summary on Studying the Issue

"Flows go through QPI by CPU" (Case2) mainly influences

throughput degradation on NUMA system.

"Flows go through QPI by DMA" (Case3) achieves comparative

performance with "no flows go through QPI" (Case1).

11

Considering the case VMs are running on all NUMA-nodes,

DMA should write flows to the memory of the NUMA node

on which their destination VM is running

Page 13: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Trial Implementation for

NUMA-aware DPDK-OVS

12

Page 14: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Concept and Basic Design

Use multiple RXQs & MAC filter features of standard NIC

Each NUMA node has a RXQ (associated with a HwQ) on its own memory

Flows are forwarded to each node according to the destination MAC address by MAC filter

13

CPU1 CPU2

OVS w/ DPDK

I/O Controller

Eth Port

Me

mo

ry C

on

tro

ller

Me

mo

ry C

on

tro

ller

MAC Filter

Hw

Q0

Hw

Q1

I/O Controller

Port#1

RX

Q

#0

pmd (PHY) pmd (PHY)

NIC

VM1 VM2

pmd (VIF) pmd (VIF)

vhost vhost

Memory Memory

Port#1

RX

Q

#1

Packets for VM1 to RXQ0

Packets for VM2 to RXQ1

Page 15: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Trial Implementation w/ Intel® X710

Use VMDq and MAC-VLAN filter

14

X710 PF

main

VSI#0

VMDq#1

VSI#1

MAC-VLAN filter

Eth Port

OVS w/ DPDK

Port#1

RX

Q

#0

RX

Q

#1

map to

NUMA#0 map to

NUMA#1

• Dest. VSI can be specified corresponding

to dest. MAC addr w/ MAC-VLAN filter

• Packets w/ unregistered MAC addr go to

main VSI of PF

• It only needs to register dest. MAC to go to NUMA#1

• Broadcast frames can be handled w/o configuration

• Needs some modifications to DPDK

• Default DPDK can not configure queues of specific

VMDq, so we changed i40e PMD to specify the

number of queues per VSI including main VSI.

VSI: Virtual Station Interface (a set of queues)

Page 16: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Memory Allocation for NUMA

Allocation of mbuf pool (dpdk_mp)

Setup Rx queues

15

dpdk_mp = dpdk_mp_get(dev->socket_id, mtu);

dpdk_mp = dpdk_rte_mzalloc(sizeof(struct dpdk_mp) * n_numas);

for (numa_id = 0; numa_id < n_numas; numa_id++) {

dpdk_mp[numa_id] = dpdk_mp_get(numa_id, mtu);

}

n_rxqs = MIN(max_rx_queue, n_dpdk_rxqs);

for (i = 0; i < n_rxqs; i++) {

rte_eth_rx_queue_setup(port_id, i, RX_Q_SIZE,

dev->socket_id, NULL,

dev->dpdk_mp->mp );

}

n_rxq_per_vsi = MIN(max_rx_queue_per_pool, n_dpdk_rxqs);

n_rxqs = n_rxq_per_vsi * n_numas;

for (qid = 0; qid < n_rxqs; qid++) {

numa_id = dpdk_get_numa_id_by_qindex(dev, qid);

rte_eth_rx_queue_setup(port_id, qid, RX_Q_SIZE,

numa_id, NULL,

dev->dpdk_mp[numa_id]->mp );

}

Current OVS allocates single mbuf pool for a dpdk

port with socket_id (numa id) of the device

We change it to allocate and hold multiple mbuf pools for

each NUMA node for a dpdk port

Current OVS sets up all rx queues for a port on the

same NUMA node specified by dev->socket_id

We change it to setup each Rx queue on the NUMA node

specified by numa_id associated with the queue index

Page 17: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

MAC Address Registration

Develop new utility: ovs-numactl

issue command to add / del MAC-NUMA info to bridge

Add NUMA control features to bridge

manage the MAC-NUMA table by receiving command

from ovs-numactl

set MAC-NUMA info in the table to all dpdk-ports

Add API to netdev-dpdk

set MAC-NUMA info to specified dpdk-port

use rte_eth_dev_mac_addr_add() API to register

MAC address to MAC-VLAN filter of PF

16

vswitchd

ovs-numactl

bridge MAC-NUMA

table

vport vport

Netdev-

dpdk

Netdev-

dpdk

dpdk

ethdev

dpdk

ethdev

PMD PMD

PF PF NIC

Page 18: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Evaluation

Measure the total throughput of 6 VNFs (L3 forwarder w/o DPDK)

Traffics are inputted from a port of each NIC (maximum 20 Gbps)

17

CPU1 CPU2

Me

mo

ry C

on

tro

ller

Me

mo

ry C

on

tro

ller

Me

mo

ry

Me

mo

ry

I/O Controller I/O Controller

NIC1

Eth Port Eth Port

OVS w/ DPDK

pmd (PHY)

Core#12

pmd (PHY)

Core#13

pmd (VIF)

Core#14 pmd (VIF)

Core#15

VM#1

Core#11

VM#1

Core#10

VM#4

Core#9

NIC2

Eth Port Eth Port

pmd (PHY)

Core#4

pmd (PHY)

Core#5

pmd (VIF)

Core#6 pmd (VIF)

Core#7

VM#1

Core#3

VM#1

Core#2

VM#1

Core#1

CPU1 CPU2

Me

mo

ry C

on

tro

ller

Me

mo

ry C

on

tro

ller

Me

mo

ry

Me

mo

ry

I/O Controller I/O Controller

NIC1

Eth Port Eth Port

OVS w/ DPDK

pmd (VIF)

Core#14 pmd (VIF)

Core#15

VM#1

Core#11

VM#1

Core#10

VM#4

Core#9

NIC2

Eth Port Eth Port

pmd (PHY)

Core#4

pmd (VIF)

Core#6 pmd (VIF)

Core#7

VM#1

Core#3

VM#1

Core#2

VM#1

Core#1

Standard OVS: A dpdk port is mapped to only the

NUMA node to which its NIC is connected

Our Approach: Each dpdk port has multple queues

mapped to each NUMA node respectively

pmd (PHY)

Core#12

Page 19: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Result

Try 10 times of max rate search at no drop condition for each size

Our approach achieved over 2.5 times throughput in this test

18

0

2

4

6

8

10

12

14

16

18

20

22

64 72 128 256 512 768 1024 1280 1500

Std. OVS

Numa-aware OVS

0

2

4

6

8

10

12

14

16

18

20

22

64 72 128 256 512 768 1024 1280 1500

Std. OVS

Numa-aware OVS

Th

rou

gh

pu

t (G

bits/s

ec)

Packet Size (Bytes) T

hro

ug

hpu

t (G

bits/s

ec)

Packet Size (Bytes)

Maximum Throughputs in 10 results Median Throughputs of 10 results

x2.8

x2

.5

Page 20: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Conclusions

We introduced a concept and trial implementation of NUMA-aware

DPDK-OVS

Improvement throughput by reducing remote memory access by CPU

Use multiple RXQs and MAC filter to forward received packets directly to the

memory of the NUMA node on which its destination VM is running

• Each NUMA node has a RXQ associated with HwQ on its own memory, and

• NICs can write received packets directly to there by DMA according dest. MAC address

• Implement on trial with Intel® X710 (use VMDq and MAC-VLAN filter)

We observed significant performance improvement in our approach

19

Page 21: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Conclusions

Limitations

NIC is limited to be enabled in our approach

a few MAC address can be registered (depends on NIC specification)

only destination MAC address is used to specify destination NUMA node

It is difficult to support a BOX using tunnel protocol in current standard NICs

etc.

Need investigations to extend the applicable cases

20

Page 22: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED 21

Thank You! [email protected]

Page 23: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2010 FUJITSU LIMITED 22

Page 24: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

BACKUP

23

Page 25: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Evaluation (Bi-direction, input from 4 ports)

Measure the total bi-directional throughput of 4 VNFs (dpdk l2fwd)

Each CPU has a dual 10GbE NIC (X710-DA2), four 10GbE ports are used for this test

24

CPU1 CPU2

Me

mo

ry C

on

tro

ller

Me

mo

ry C

on

tro

ller

Me

mo

ry

Me

mo

ry

I/O Controller I/O Controller

NIC1

Eth Port Eth Port

OVS w/ DPDK

pmd (PHY)

Core#12

pmd (PHY)

Core#13

pmd (VIF)

Core#14 pmd (VIF)

Core#15

VM#4

Core#10 VM#3

Core#9

NIC2

Eth Port Eth Port

pmd (PHY)

Core#4

pmd (PHY)

Core#5

pmd (VIF)

Core#6 pmd (VIF)

Core#7

VM#2

Core#2

VM#1

Core#1

CPU1 CPU2

Me

mo

ry C

on

tro

ller

Me

mo

ry C

on

tro

ller

Me

mo

ry

Me

mo

ry

I/O Controller I/O Controller

NIC1

Eth Port Eth Port

OVS w/ DPDK

pmd (PHY)

Core#12

pmd (PHY)

Core#13

pmd (VIF)

Core#14 pmd (VIF)

Core#15

VM#4

Core#10

VM#3

Core#9

NIC2

Eth Port Eth Port

pmd (PHY)

Core#4

pmd (PHY)

Core#5

pmd (VIF)

Core#6 pmd (VIF)

Core#7

VM#2

Core#2

VM#1

Core#1

Standard OVS: A dpdk port is mapped to only the

NUMA node to which its NIC is connected

Our Approach: Each dpdk port has multple queues

mapped to each NUMA node respectively

Page 26: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Result (Bi-direction)

Our approach achieved up to 2 times throughput compared with Std. OVS.

25

0

4

8

12

16

20

24

28

32

36

40

64 72 128 256 512 768 1024 1280 1500

Std. OVS

Numa-aware OVS

Th

rou

gh

pu

t (G

bits/s

ec)

Packet Size (Bytes)

Median Throughputs of 10 results

0

4

8

12

16

20

24

28

32

36

40

64 72 128 256 512 768 1024 1280 1500

Std. OVS

Numa-aware OVS

Th

rou

gh

pu

t (G

bits/s

ec)

Packet Size (Bytes)

Maximum Throughputs in 10 results

Page 27: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

vhost-user NUMA Awareness in OVS 2.6

Memory for structure & mbuf pool for vhost can be allocated on NUMA

node on which connected VM is running

26

Refer to

https://software.intel.com/en-us/articles/vhost-user-numa-awareness-in-open-vswitch-with-dpdk?language=ru

Page 28: SDN and NFV Portfolio - Open vSwitchopenvswitch.org/support/ovscon2016/8/1335-hyoudou.pdfMEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM 11 171,975 MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM

Copyright 2016 FUJITSU LABORATORIES LIMITED

Example of Connection Model for Our Products

All VNFs need to transmit / receive to / from all physical ports

27

BOX#1

10GbE

10GbE

10GbE

10GbE

BOX#2

10GbE

10GbE

10GbE

10GbE

10G

bE

10G

bE

10G

bE

10G

bE

Intranet Internet

VNFs

VNFs

LAG Redundancy

Path