Mellanox Messaging Accelerator (VMA)
Installation Guide
Rev 8.4.10
Doc #: Doc-10055 Mellanox Technologies
Mellanox Technologies
350 Oakmead Parkway Suite 100
Sunnyvale, CA 94085
U.S.A.
www.mellanox.com
Tel: (408) 970-3400
Fax: (408) 970-3403
© Copyright 2017. Mellanox Technologies Ltd. All Rights Reserved.
Mellanox®, Mellanox logo, Accelio®, BridgeX®, CloudX logo, CompustorX®, Connect-IB®, ConnectX®,
CoolBox®, CORE-Direct®, EZchip®, EZchip logo, EZappliance®, EZdesign®, EZdriver®, EZsystem®,
GPUDirect®, InfiniHost®, InfiniBridge®, InfiniScale®, Kotura®, Kotura logo, Mellanox CloudRack®, Mellanox
CloudXMellanox®, Mellanox Federal Systems®, Mellanox HostDirect®, Mellanox Multi-Host®, Mellanox Open
Ethernet®, Mellanox OpenCloud®, Mellanox OpenCloud Logo®, Mellanox PeerDirect®, Mellanox ScalableHPC®,
Mellanox StorageX®, Mellanox TuneX®, Mellanox Connect Accelerate Outperform logo, Mellanox Virtual Modular
Switch®, MetroDX®, MetroX®, MLNX-OS®, NP-1c®, NP-2®, NP-3®, Open Ethernet logo, PhyX®, PlatformX®,
PSIPHY®, SiPhy®, StoreX®, SwitchX®, Tilera®, Tilera logo, TestX®, TuneX®, The Generation of Open Ethernet
logo, UFM®, Unbreakable Link®, Virtual Protocol Interconnect®, Voltaire® and Voltaire logo are registered
trademarks of Mellanox Technologies, Ltd.
All other trademarks are property of their respective owners.
For the most updated list of Mellanox trademarks, visit http://www.mellanox.com/page/trademarks
NOTE:
THIS HARDWARE, SOFTWARE OR TEST SUITE PRODUCT (“PRODUCT(S)”) AND ITS RELATED
DOCUMENTATION ARE PROVIDED BY MELLANOX TECHNOLOGIES “AS-IS” WITH ALL FAULTS OF ANY
KIND AND SOLELY FOR THE PURPOSE OF AIDING THE CUSTOMER IN TESTING APPLICATIONS THAT
USE THE PRODUCTS IN DESIGNATED SOLUTIONS. THE CUSTOMER'S MANUFACTURING TEST
ENVIRONMENT HAS NOT MET THE STANDARDS SET BY MELLANOX TECHNOLOGIES TO FULLY
QUALIFY THE PRODUCT(S) AND/OR THE SYSTEM USING IT. THEREFORE, MELLANOX TECHNOLOGIES
CANNOT AND DOES NOT GUARANTEE OR WARRANT THAT THE PRODUCTS WILL OPERATE WITH THE
HIGHEST QUALITY. ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT ARE DISCLAIMED. IN NO EVENT SHALL MELLANOX BE LIABLE TO CUSTOMER OR
ANY THIRD PARTIES FOR ANY DIRECT, INDIRECT, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES OF ANY KIND (INCLUDING, BUT NOT LIMITED TO, PAYMENT FOR PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY FROM THE USE OF THE
PRODUCT(S) AND RELATED DOCUMENTATION EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
DAMAGE.
Table of Contents
Document Revision History .................................................................................................................. 5
1 Introduction ..................................................................................................................................... 6
1.1 VMA System Requirements ................................................................................................... 6
2 Installing VMA ................................................................................................................................. 7
2.1 Installing VMA Binary as Part of Mellanox Drivers ................................................................. 7
2.1.1 Mellanox OFED Driver for Linux ............................................................................... 7
2.1.2 Mellanox EN Driver for Linux .................................................................................... 8
2.1.3 Starting the Mellanox Drivers.................................................................................... 8
3 Running VMA ................................................................................................................................... 9
3.1 Benchmarking Example .......................................................................................................... 9
3.1.1 Prerequisites ............................................................................................................. 9
3.1.2 Kernel Performance ................................................................................................ 10
3.1.3 VMA Performance .................................................................................................. 10
3.1.4 Comparing Results ................................................................................................. 12
3.1.5 Libvma-debug.so .................................................................................................... 12
4 VMA Installation Options ............................................................................................................. 14
4.1 Installing VMA Binary Manually ............................................................................................ 14
4.1.1 Installing the VMA Packages .................................................................................. 14
4.1.2 Verifying VMA Installation ....................................................................................... 16
4.2 Building VMA from Sources .................................................................................................. 16
4.2.1 Verifying VMA Compilation ..................................................................................... 16
4.3 Installing VMA in RHEL 7.x Inbox ......................................................................................... 17
5 Uninstalling VMA ........................................................................................................................... 18
5.1 Automatic VMA Uninstallation .............................................................................................. 18
5.2 Manual VMA Uninstallation .................................................................................................. 18
5.3 VMA in RHEL 7.x Inbox Uninstallation ................................................................................. 18
6 Upgrading libvma.conf ................................................................................................................. 19
Appendix A: Port Type Management ........................................................................................ 20
A.1 ConnectX-3/ConnectX-3 Pro Port Type Management ......................................................... 20
A.2 ConnectX-4/ConnectX-5 Port Type Management/VPI Cards Configuration ........................ 21
Appendix B: Basic Performance Tuning .................................................................................. 22
B.1 VMA Tuning Parameters ...................................................................................................... 22
B.2 Binding VMA to the Closest NUMA ...................................................................................... 22
B.3 Configuring the BIOS ............................................................................................................ 23
List of Tables
Table 1: Document Revision History ....................................................................................................... 5
Table 2: System Requirements ................................................................................................................ 6
Table 3: Supported ConnectX®-3 Port Configurations .......................................................................... 20
Document Revision History
Table 1: Document Revision History
Revision Date Description
Rev 8.4.10 December 5, 2017 No changes to the document except for the software
version.
Rev 8.4.8 October 31, 2017 Updated the following sections:
Running VMA
Kernel Performance Server Side
Kernel Performance Client Side
VMA Performance Server Side
VMA Performance Client Side
Comparing Results
Rev 8.3.7 June 30, 2017 Updated the following sections:
Running VMA
Kernel Performance Server Side
Kernel Performance Client Side
VMA Performance Server Side
VMA Performance Client Side
Libvma-debug.so
Comparing Results
Rev 8.3.5 May 31, 2017 Updated the following sections:
Building VMA from Sources
Installing VMA in RHEL 7.x Inbox
Rev 8.2.10 March 28, 2017 Updated section Mellanox EN Driver for Linux
Rev 8.2.8 January 31st, 2017 Major restructure of the document
Rev 8.1.7 November 01, 2016 No changes
1 Introduction
The Mellanox Messaging Accelerator (VMA) Installation Guide provides an introduction to:
Installing VMA (see section Installing VMA)
Running VMA (see section Running VMA) for UDP/TCP latency and throughput performance
1.1 VMA System Requirements
The following table presents the currently certified combinations of stacks, platforms, and
supported CPU architectures for VMA 8.4.10.
Table 2: System Requirements
Specification Value
Network Adapter Cards ConnectX®-3 / ConnectX®-3 Pro
ConnectX®-4 / ConnectX®-4 Lx
ConnectX®-5 / ConnectX®-5 Ex
Firmware ConnectX®-3/ConnectX®-3 Pro v2.40.7000
ConnectX®-4 v12.20.1000
ConnectX®-4 Lx v14.20.1000
ConnectX®-5 v16.20.1000
Driver Stack MLNX-OFED:
v4.1-1.2.0.0 [Ethernet and InfiniBand]
v4.0-2.0.0.1 [Ethernet and InfiniBand]
Supported Operating Systems
and Kernels
All Linux 64 bit distributions supported by:
MLNX_OFED 4.1-x.x.x.x/ v4.0-x.x.x.x
MLNX_EN v4.0-x.x.x.x
CPU Architecture x86_64 (Intel Xeon), arm64, ppc64
Minimum memory requirements 1 GB of free memory for installation; 800 MB per process running with VMA
Minimum disk space
requirements
1 GB
Transport Ethernet / InfiniBand / VPI
2 Installing VMA
Before you begin, verify you are using a supported operating system and a supported CPU
architecture for your operating system. See supported combinations in section VMA System
Requirements.
VMA v8.4.10 can run on top of either the MLNX_OFED driver stack, which supports both
Ethernet and InfiniBand, or the lighter MLNX_EN driver stack, which supports Ethernet only.
The VMA library is delivered as a user-space library, and is called libvma.so.X.Y.Z.
VMA can be installed using one of the following methods:
Installing VMA binary as part of Mellanox drivers (see section 2.1)
Installing VMA binary manually (see section 4.1)
Building VMA from sources (see section 4.2)
Installing VMA in RHEL 7.x Inbox (see section 4.3)
2.1 Installing VMA Binary as Part of Mellanox Drivers
VMA is integrated into the Mellanox drivers (MLNX_OFED/MLNX_EN). It is therefore
recommended to install VMA automatically when installing the Mellanox drivers, since VMA
depends on the drivers' latest firmware, libraries and kernel modules; this installation
ensures VMA's correct functionality. Because the Mellanox drivers ship in many
distribution-specific packages for RHEL, Ubuntu and others, make sure to select the
drivers' version that matches your distribution.
This option suits users who want to install a new VMA version or upgrade to the latest VMA
version by overriding the previous one.
2.1.1 Mellanox OFED Driver for Linux
1. Download the latest Mellanox OFED driver:
http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers
2. Install the VMA packages.
./mlnxofedinstall --vma
3. Install the packages required by VMA to support Ethernet.
./mlnxofedinstall --vma-eth
4. Verify the installation completed successfully.
/etc/infiniband/info
5. [ConnectX®-3/ConnectX®-3 Pro ] Verify the following is configured in the
/etc/modprobe.d/mlnx.conf file.
options ib_uverbs disable_raw_qp_enforcement=1
options mlx4_core fast_drop=1
options mlx4_core log_num_mgm_entry_size=-1
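The three module options in step 5 can be verified with a short shell sketch (a hedged helper, not part of the driver package; it only greps the file named in this step):

```shell
# Check that each ConnectX-3 option from this section appears in mlnx.conf.
conf=/etc/modprobe.d/mlnx.conf
for opt in disable_raw_qp_enforcement=1 fast_drop=1 log_num_mgm_entry_size=-1; do
    if grep -qs "$opt" "$conf"; then
        echo "found: $opt"
    else
        echo "missing: $opt"
    fi
done
```

Any "missing" line means the corresponding option should be added to the file before restarting the driver.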
2.1.2 Mellanox EN Driver for Linux
1. Download the latest Mellanox EN driver:
http://www.mellanox.com/page/products_dyn?product_family=27
2. Install the VMA packages:
./install --vma
3. Install packages required by VMA to support Ethernet:
./install --vma-eth
4. Verify the installation completed successfully:
$ cat /etc/infiniband/info
#!/bin/bash
echo prefix=/usr
echo Kernel=3.10.0-514.el7.x86_64
echo
echo "Configure options: --with-core-mod --with-user_mad-mod --with-user_access-mod --with-addr_trans-mod --with-mlx4-mod --with-mlx4_en-mod --with-mlx5-mod --with-ipoib-mod --with-srp-mod --with-iser-mod --with-isert-mod"
echo
5. [ConnectX®-3/ConnectX®-3 Pro ] Verify the following is configured in the
/etc/modprobe.d/mlnx.conf file:
options ib_uverbs disable_raw_qp_enforcement=1
options mlx4_core fast_drop=1
options mlx4_core log_num_mgm_entry_size=-1
2.1.3 Starting the Mellanox Drivers
1. Start the relevant Mellanox driver (MLNX_OFED/MLNX_EN):
/etc/init.d/openibd restart
2. Verify that the supported version of firmware is installed.
ibv_devinfo
NOTE: Configure ConnectX® adapter card ports to work with the desired transport
using the connectx_port_config script. Make sure that the ports are not
configured with auto mode.
For further information, please refer to ConnectX-3/ConnectX-3 Pro Port Type
Management and ConnectX-4/ConnectX-5 Port Type Management/VPI Cards
Configuration sections.
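Step 2 above can be scripted defensively (a hedged sketch; it assumes only that ibv_devinfo, when present, reports the firmware as fw_ver lines):

```shell
# Print just the firmware version lines, or a hint if the verbs tools are absent.
if command -v ibv_devinfo >/dev/null 2>&1; then
    fw_info="$(ibv_devinfo 2>/dev/null | grep fw_ver || echo 'no device reported')"
else
    fw_info="ibv_devinfo not available - install the driver stack first"
fi
echo "$fw_info"
```

Compare the printed version against the firmware versions listed in Table 2.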
3 Running VMA
This section shows how to run a simple network benchmarking test and compare the results
of running it over the kernel's network stack and over VMA.
Before running a user application, you must set the library libvma.so into the environment
variable LD_PRELOAD. For further information, please refer to the VMA User Manual.
Example:
$ LD_PRELOAD=libvma.so sockperf server -i 11.4.3.1
NOTE: If LD_PRELOAD is assigned libvma.so without a path (as in the example above),
libvma.so is read from a known library path of your distribution's OS; otherwise, it is
read from the specified path.
As a result, a VMA header message should precede your running application.
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: VMA_VERSION: 8.4.10-0 Release built on Oct 15 2017 10:07:12
VMA INFO: Cmd Line: sockperf server -i 11.4.3.1
VMA INFO: OFED Version: MLNX_OFED_LINUX-4.2-0.1.6.0:
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: Log Level INFO [VMA_TRACELEVEL]
VMA INFO: ---------------------------------------------------------------------------
The application output follows. Please note that:
The VMA version is 8.4.10
The command line indicates your application's name (in the above example: sockperf)
The appearance of the VMA header indicates that the VMA library is loaded with your
application.
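Independently of the VMA header, you can check where the dynamic loader would resolve libvma.so from (a hedged sketch; it only inspects the ldconfig cache and works whether or not VMA is installed):

```shell
# Ask the loader cache where libvma.so would be found.
resolved="$(ldconfig -p 2>/dev/null | grep -m1 'libvma\.so' || true)"
if [ -n "$resolved" ]; then
    msg="libvma resolves from:$resolved"
else
    msg="libvma.so not found in the loader cache"
fi
echo "$msg"
```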
3.1 Benchmarking Example
3.1.1 Prerequisites
Install sockperf - a tool for network performance measurement. This can be done by either:
Download and build from source from: https://github.com/Mellanox/sockperf
Use yum install: yum install sockperf
Two machines, one serving as the server and the other as the client
Management interfaces configured with IP addresses so that the machines can ping each other
A Mellanox NIC physically installed in each machine
Your system must recognize the Mellanox NIC. To verify this, run:
lspci | grep Mellanox
Example output:
$ lspci |grep Mellanox
03:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
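The prerequisites above can be combined into a small preflight sketch (hedged: it assumes sockperf is on PATH once installed, and uses the same lspci check shown above):

```shell
# Preflight for the benchmarking example: is sockperf present, is a Mellanox NIC visible?
missing=0
command -v sockperf >/dev/null 2>&1 || { echo "sockperf not found - install it first"; missing=1; }
if lspci 2>/dev/null | grep -q Mellanox; then
    echo "Mellanox NIC detected"
else
    echo "no Mellanox NIC detected by lspci"
    missing=1
fi
if [ "$missing" -eq 0 ]; then status=ok; else status=incomplete; fi
echo "preflight: $status"
```

Run it on both machines; "preflight: ok" means the local prerequisites are satisfied.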
3.1.2 Kernel Performance
3.1.2.1 Kernel Performance Server Side
On the first machine run:
$ sockperf server -i 11.4.3.1
Server side example output:
sockperf: == version #3.1-14.gita9f6056282ef ==
sockperf: [SERVER] listen on:
[ 0] IP = 11.4.3.1 PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
3.1.2.2 Kernel Performance Client Side
On the second machine run:
$ sockperf ping-pong -t 4 -i 11.4.3.1
Client side example output:
sockperf: == version #3.1-14.gita9f6056282ef ==
sockperf: [CLIENT] send on:
sockperf: using recvfrom() to block on socket(s)
[ 0] IP = 11.4.3.1 PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=4.450 sec; SentMessages=373354;
ReceivedMessages=373353
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=3.993 sec; SentMessages=335791;
ReceivedMessages=335791
sockperf: ====> avg-lat= 5.917 (std-dev=0.668)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order
messages = 0
sockperf: Summary: Latency is 5.917 usec
sockperf: Total 335791 observations; each percentile contains 3357.91
observations
sockperf: ---> <MAX> observation = 19.148
sockperf: ---> percentile 99.999 = 13.727
sockperf: ---> percentile 99.990 = 11.025
sockperf: ---> percentile 99.900 = 8.222
sockperf: ---> percentile 99.000 = 7.530
sockperf: ---> percentile 90.000 = 7.026
sockperf: ---> percentile 75.000 = 6.296
sockperf: ---> percentile 50.000 = 5.680
sockperf: ---> percentile 25.000 = 5.312
sockperf: ---> <MIN> observation = 5.013
3.1.3 VMA Performance
VMA performance can be checked by running sockperf and using the
VMA_SPEC=latency environment variable.
3.1.3.1 VMA Performance Server Side
On the first machine run:
$ LD_PRELOAD=libvma.so VMA_SPEC=latency sockperf server -i 11.4.3.1
Server side example output:
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: VMA_VERSION: 8.4.10-0 Release built on Oct 15 2017 10:07:12
VMA INFO: Cmd Line: sockperf server -i 11.4.3.1
VMA INFO: OFED Version: MLNX_OFED_LINUX-4.2-0.1.6.0:
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: VMA Spec Latency [VMA_SPEC]
VMA INFO: Log Level INFO [VMA_TRACELEVEL]
VMA INFO: Ring On Device Memory TX 16384 [VMA_RING_DEV_MEM_TX]
VMA INFO: Tx QP WRE 256 [VMA_TX_WRE]
VMA INFO: Tx QP WRE Batching 4 [VMA_TX_WRE_BATCHING]
VMA INFO: Rx QP WRE 256 [VMA_RX_WRE]
VMA INFO: Rx QP WRE Batching 4 [VMA_RX_WRE_BATCHING]
VMA INFO: Rx Poll Loops -1 [VMA_RX_POLL]
VMA INFO: Rx Prefetch Bytes Before Poll 256 [VMA_RX_PREFETCH_BYTES_BEFORE_POLL]
VMA INFO: GRO max streams 0 [VMA_GRO_STREAMS_MAX]
VMA INFO: Select Poll (usec) -1 [VMA_SELECT_POLL]
VMA INFO: Select Poll OS Force Enabled [VMA_SELECT_POLL_OS_FORCE]
VMA INFO: Select Poll OS Ratio 1 [VMA_SELECT_POLL_OS_RATIO]
VMA INFO: Select Skip OS 1 [VMA_SELECT_SKIP_OS]
VMA INFO: CQ Drain Interval (msec) 100 [VMA_PROGRESS_ENGINE_INTERVAL]
VMA INFO: CQ Interrupts Moderation Disabled [VMA_CQ_MODERATION_ENABLE]
VMA INFO: CQ AIM Max Count 128 [VMA_CQ_AIM_MAX_COUNT]
VMA INFO: CQ Adaptive Moderation Disabled [VMA_CQ_AIM_INTERVAL_MSEC]
VMA INFO: CQ Keeps QP Full Disabled [VMA_CQ_KEEP_QP_FULL]
VMA INFO: TCP nodelay 1 [VMA_TCP_NODELAY]
VMA INFO: Avoid sys-calls on tcp fd Enabled [VMA_AVOID_SYS_CALLS_ON_TCP_FD]
VMA INFO: Internal Thread Affinity 0 [VMA_INTERNAL_THREAD_AFFINITY]
VMA INFO: Thread mode Single [VMA_THREAD_MODE]
VMA INFO: Mem Allocate type 2 (Huge Pages) [VMA_MEM_ALLOC_TYPE]
VMA INFO: ---------------------------------------------------------------------------
sockperf: == version #3.1-14.gita9f6056282ef ==
sockperf: [SERVER] listen on:
[ 0] IP = 11.4.3.1 PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: [tid 8188] using recvfrom() to block on socket(s)
3.1.3.2 VMA Performance Client Side
On the second machine run:
$ LD_PRELOAD=libvma.so VMA_SPEC=latency sockperf ping-pong -t 4 -i 11.4.3.1
Client side example output:
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: VMA_VERSION: 8.4.10-0 Release built on Oct 15 2017 10:07:12
VMA INFO: Cmd Line: sockperf ping-pong -t 4 -i 11.4.3.1
VMA INFO: OFED Version: MLNX_OFED_LINUX-4.2-0.1.6.0:
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: VMA Spec Latency [VMA_SPEC]
VMA INFO: Log Level INFO [VMA_TRACELEVEL]
VMA INFO: Ring On Device Memory TX 16384 [VMA_RING_DEV_MEM_TX]
VMA INFO: Tx QP WRE 256 [VMA_TX_WRE]
VMA INFO: Tx QP WRE Batching 4 [VMA_TX_WRE_BATCHING]
VMA INFO: Rx QP WRE 256 [VMA_RX_WRE]
VMA INFO: Rx QP WRE Batching 4 [VMA_RX_WRE_BATCHING]
VMA INFO: Rx Poll Loops -1 [VMA_RX_POLL]
VMA INFO: Rx Prefetch Bytes Before Poll 256 [VMA_RX_PREFETCH_BYTES_BEFORE_POLL]
VMA INFO: GRO max streams 0 [VMA_GRO_STREAMS_MAX]
VMA INFO: Select Poll (usec) -1 [VMA_SELECT_POLL]
VMA INFO: Select Poll OS Force Enabled [VMA_SELECT_POLL_OS_FORCE]
VMA INFO: Select Poll OS Ratio 1 [VMA_SELECT_POLL_OS_RATIO]
VMA INFO: Select Skip OS 1 [VMA_SELECT_SKIP_OS]
VMA INFO: CQ Drain Interval (msec) 100 [VMA_PROGRESS_ENGINE_INTERVAL]
VMA INFO: CQ Interrupts Moderation Disabled [VMA_CQ_MODERATION_ENABLE]
VMA INFO: CQ AIM Max Count 128 [VMA_CQ_AIM_MAX_COUNT]
VMA INFO: CQ Adaptive Moderation Disabled [VMA_CQ_AIM_INTERVAL_MSEC]
VMA INFO: CQ Keeps QP Full Disabled [VMA_CQ_KEEP_QP_FULL]
VMA INFO: TCP nodelay 1 [VMA_TCP_NODELAY]
VMA INFO: Avoid sys-calls on tcp fd Enabled [VMA_AVOID_SYS_CALLS_ON_TCP_FD]
VMA INFO: Internal Thread Affinity 0 [VMA_INTERNAL_THREAD_AFFINITY]
VMA INFO: Thread mode Single [VMA_THREAD_MODE]
VMA INFO: Mem Allocate type 2 (Huge Pages) [VMA_MEM_ALLOC_TYPE]
VMA INFO: ---------------------------------------------------------------------------
sockperf: == version #3.1-14.gita9f6056282ef ==
sockperf: [CLIENT] send on:
sockperf: using recvfrom() to block on socket(s)
[ 0] IP = 11.4.3.1 PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=4.450 sec; SentMessages=1802853; ReceivedMessages=1802852
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=4.000 sec; SentMessages=1628031;
ReceivedMessages=1628031
sockperf: ====> avg-lat= 1.211 (std-dev=0.078)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 1.211 usec
sockperf: Total 1628031 observations; each percentile contains 16280.31 observations
sockperf: ---> <MAX> observation = 7.248
sockperf: ---> percentile 99.999 = 3.563
sockperf: ---> percentile 99.990 = 2.911
sockperf: ---> percentile 99.900 = 2.133
sockperf: ---> percentile 99.000 = 1.402
sockperf: ---> percentile 90.000 = 1.291
sockperf: ---> percentile 75.000 = 1.233
sockperf: ---> percentile 50.000 = 1.196
sockperf: ---> percentile 25.000 = 1.169
sockperf: ---> <MIN> observation = 1.090
3.1.4 Comparing Results
VMA shows over 388% performance improvement compared to the kernel network stack.
Average latency:
Using Kernel 5.917 usec
Using VMA 1.211 usec
Percentile latencies (usec):
Percentile   Kernel   VMA
MAX          19.148   7.248
99.999       13.727   3.563
99.990       11.025   2.911
99.900        8.222   2.133
99.000        7.530   1.402
90.000        7.026   1.291
75.000        6.296   1.233
50.000        5.680   1.196
25.000        5.312   1.169
MIN           5.013   1.090
To tune your system for best performance, see section Basic Performance Tuning.
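The improvement figure above can be reproduced from the two sockperf summaries (the latencies below are the example values from this section):

```shell
# Relative improvement: (kernel latency - VMA latency) / VMA latency.
kernel_lat=5.917
vma_lat=1.211
improvement="$(awk -v k="$kernel_lat" -v v="$vma_lat" \
    'BEGIN { printf "%.1f", (k - v) / v * 100 }')"
echo "latency improvement: ${improvement}%"
# prints: latency improvement: 388.6%
```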
3.1.5 Libvma-debug.so
libvma.so is limited to the DEBUG log level. If you need to run VMA with detailed
logging above the DEBUG level, use the libvma-debug.so library that comes with the
OFED installation.
Before running your application, set the library libvma-debug.so into the environment
variable LD_PRELOAD (instead of libvma.so).
Example:
$ LD_PRELOAD=libvma-debug.so sockperf server -i 11.4.3.1
NOTE: libvma-debug.so is located in the same library path as libvma.so under your
distribution's OS.
For example, in RHEL 7.2 x86_64, libvma-debug.so is located at /usr/lib64/libvma-debug.so.
NOTE: If you need to compile VMA with a log level higher than DEBUG, run
"configure" with the following parameter:
./configure --enable-opt-log=none
See section Building VMA from Sources.
4 VMA Installation Options
4.1 Installing VMA Binary Manually
VMA can be installed from a dedicated VMA RPM or a Debian package. In this case, please
make sure that the MLNX driver is already installed and that the MLNX driver and VMA
versions match, so that VMA functions correctly.
This option is suitable for users who receive an OEM VMA version, or a Fast Update
Release VMA version which is newer than the installed VMA version.
4.1.1 Installing the VMA Packages
This section addresses both RPM and DEB (Ubuntu OS) installations.
VMA includes the following packages which should be saved on your local drive:
The libvma package which contains the binary library shared object file (.so),
configuration and documentation files
The libvma-utils package which contains utilities such as vma_stats to monitor vma
traffic and statistics
The libvma-devel package which contains VMA’s extra API header files, for extra
functionality not provided by the socket API
Before you begin, please verify the following:
Check whether VMA is installed using:
For RPM packages:
#rpm -qil libvma
For DEB packages:
#dpkg -s libvma
If the VMA packages are not installed, an appropriate message is displayed.
If the VMA packages are installed, the RPM or the DEB logs the VMA package
information and the installed file list.
Uninstall the current VMA using:
For RPM packages:
#rpm -e libvma
For DEB packages:
#apt-get remove libvma
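The install-state check above can be wrapped in one distribution-agnostic sketch (the probing order is an assumption; the package name libvma is as used throughout this guide):

```shell
# Report whether the libvma package is currently installed, via RPM or DEB.
if command -v rpm >/dev/null 2>&1 && rpm -q libvma >/dev/null 2>&1; then
    state="installed (rpm)"
elif command -v dpkg >/dev/null 2>&1 && dpkg -s libvma >/dev/null 2>&1; then
    state="installed (deb)"
else
    state="not installed"
fi
echo "libvma: $state"
```

If the report says installed, uninstall the current VMA as described above before proceeding.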
4.1.1.1 Installing the VMA Binary Package
1. Go to the location where the libvma package was saved
2. Run the command below to start installation:
For RPM packages:
#rpm -i libvma-X.Y.Z-R.<arch>.rpm
An example for libvma-X.Y.Z-R.<arch>.rpm: libvma-8.3.7-1.x86_64.rpm
For DEB packages:
#dpkg -i libvma_X.Y.Z-R_<arch>.deb
Example:
libvma_8.4.10-0_amd64.deb
During the installation process:
The VMA library is copied to the standard library location (e.g. /usr/lib64/libvma.so).
In addition, the VMA debug library is copied to the same location (e.g.
/usr/lib64/libvma-debug.so)
README.txt and version information (VMA_VERSION) are installed at
/usr/share/doc/libvma-X.Y.Z/
VMA installs its configuration file, libvma.conf, to the following location:
/etc/libvma.conf
The vmad service utility is copied into /sbin.
The vma script is installed under /etc/init.d/. This script can be used to load and unload
the VMA service utility.
3. [ConnectX®-3/ConnectX®-3 Pro] Verify the following is configured in the
/etc/modprobe.d/mlnx.conf file.
options ib_uverbs disable_raw_qp_enforcement=1
options mlx4_core fast_drop=1
options mlx4_core log_num_mgm_entry_size=-1
When running over InfiniBand, configure the IPoIB mode to be datagram:
i. Modify "SET_IPOIB_CM=no" in the file "/etc/infiniband/openib.conf"
ii. Verify that it is configured to work in UD (datagram) mode:
$cat /sys/class/net/<IB port>/mode
datagram
4.1.1.2 Installing the VMA utils Package
1. Go to the location where the utils package was saved
2. Run the command below to start installation:
For RPM packages:
#rpm -i libvma-utils-X.Y.Z-R.<arch>.rpm
For DEB packages:
#dpkg -i libvma-utils_X.Y.Z-R_<arch>.deb
During the installation process, the VMA monitoring utility is installed at
/usr/bin/vma_stats.
4.1.1.3 Installing the VMA devel Package
1. Go to the location where the devel package was saved
2. Run the command below to start installation:
For RPM packages:
#rpm -i libvma-devel-X.Y.Z-R.<arch>.rpm
For DEB packages:
#dpkg -i libvma-dev_X.Y.Z-R_<arch>.deb
During the installation process the VMA extra header file is installed at
/usr/include/mellanox/vma_extra.h.
4.1.2 Verifying VMA Installation
For further information, please refer to section Running VMA.
4.2 Building VMA from Sources
VMA is open source, which enables users to try, inspect and modify the code. This option
assumes that VMA (the binary package) is already installed on top of the Mellanox driver
and functions correctly (as explained in Installing the VMA Binary Package). Users can
download the VMA sources from the libvma GitHub repository
(https://github.com/Mellanox/libvma/) and build a new VMA version.
This option is suitable for users who wish to implement custom VMA modifications.
Please visit the libvma wiki page at https://github.com/Mellanox/libvma/wiki for various
"how-to" topics.
For building VMA from sources, please visit:
https://github.com/Mellanox/libvma/wiki/Build-Instruction
To add a higher debug log level to VMA, compile it with:
./configure --enable-opt-log=none <other configure parameters>
For libvma configuration file please visit:
https://github.com/Mellanox/libvma/wiki/VMA_Configuration_File
4.2.1 Verifying VMA Compilation
The compiled VMA library is located under:
<path-to-vma-sources-root-tree>/src/vma/.libs/libvma.so
When running a user application, you must set the compiled library into the environment
variable LD_PRELOAD.
Example:
#LD_PRELOAD=<path-to-vma-sources-root-tree>/src/vma/.libs/libvma.so sockperf
ping-pong -t 5 -i 224.22.22.22
As indicated in section Running VMA, the appearance of the VMA header verifies that the
compiled VMA library is loaded with your application.
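Before preloading a freshly built library, it helps to confirm the .so actually exists at the build path (a hedged sketch; VMA_SRC is a hypothetical variable pointing at your source tree):

```shell
# Check for the compiled library under the source tree before preloading it.
VMA_SRC="${VMA_SRC:-$HOME/libvma}"
lib="$VMA_SRC/src/vma/.libs/libvma.so"
if [ -f "$lib" ]; then
    echo "compiled library present: $lib"
else
    echo "compiled library not found: $lib (build it first)"
fi
```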
Remark: it is recommended to keep the original libvma.so under the distribution's original
location (e.g. /usr/lib64/libvma.so) and not to override it with the newly compiled libvma.so.
4.3 Installing VMA in RHEL 7.x Inbox
NOTE: VMA can be installed with the RHEL 7.2 and later Inbox drivers.
The RHEL 7.x distribution contains InfiniBand and RoCE driver versions which can be
retrieved by running "yum install". The Mellanox drivers contain code that was tested and
integrated into the Linux upstream; however, this code may be either more or less advanced
than MLNX_OFED version 4.0-x.x.x or MLNX-EN v3.4-x.x.x.
This option is suitable for users who do not wish to change the existing Mellanox driver
that comes with the RHEL 7.x installation.
For running VMA over RHEL 7.2 with Inbox drivers, please visit:
https://github.com/Mellanox/libvma/wiki/VMA-over-RHEL-7.x-with-inbox-driver
For running VMA over RHEL 7.3 or later with Inbox drivers, execute
yum install libvma
For libvma configuration file please visit:
https://github.com/Mellanox/libvma/wiki/VMA_Configuration_File
For information on how to uninstall VMA in RHEL 7.x Inbox, please refer to section VMA
in RHEL 7.x Inbox Uninstallation.
NOTE: By default, "yum install libvma" installs an older VMA version. In order to run
with the latest VMA version, you can compile libvma directly from GitHub. For further
information, refer to Verifying VMA Compilation.
5 Uninstalling VMA
5.1 Automatic VMA Uninstallation
If you are about to install a new Mellanox driver version, the old VMA version will be
automatically uninstalled as part of this process (followed by a new VMA version
installation). Please refer to Installing VMA Binary as Part of Mellanox Drivers for installing
Mellanox drivers command details.
If you are about to uninstall the Mellanox Driver, VMA will be automatically uninstalled as
part of this process.
5.2 Manual VMA Uninstallation
If you are about to manually uninstall VMA packages, please follow these steps:
To uninstall VMA:
For RPM packages:
#rpm -e libvma-utils
#rpm -e libvma-devel
#rpm -e libvma
For DEB packages:
#dpkg -r libvma-utils
#dpkg -r libvma-dev
#dpkg -r libvma
When you uninstall VMA, the libvma.conf configuration file is saved with the existing
configuration. The path of the saved file is displayed immediately after the uninstallation
is complete.
5.3 VMA in RHEL 7.x Inbox Uninstallation
VMA built from sources under RHEL 7.x is not installed as a package. Therefore, if VMA is
not needed, delete the compiled libvma.so (or the entire libvma sources) and the
configuration file (e.g. /etc/libvma.conf).
Note: Uninstalling the InfiniBand & RoCE drivers' package using "yum remove" will neither
delete the compiled VMA library nor the configuration file.
If VMA was installed as a package (via "yum install libvma"), uninstall it by running:
rpm -e libvma
6 Upgrading libvma.conf
When you upgrade VMA (through automatic or manual installation), the libvma.conf
configuration file is handled as follows:
If the existing configuration file has been modified since it was installed and is different
from the upgraded RPM or DEB, the modified version will be left in place, and the
version from the new RPM or DEB will be installed with a new suffix.
If the existing configuration file has not been modified since it was installed, it will
automatically be replaced by the version from the upgraded RPM or DEB.
If the existing configuration file has been edited on disk, but is not actually different from
the upgraded RPM or DEB, the edited version will be left in place; the version from the
new RPM or DEB will not be installed.
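On RPM-based systems, the "new suffix" mentioned above is typically ".rpmnew" (an assumption; the package manager prints the exact name during the upgrade). You can then compare your modified file with the packaged version, for example:

```shell
# Locate the configuration file and any saved packaged copies
ls /etc/libvma.conf*

# Compare the local file with the packaged version
# (the .rpmnew path is an assumption; use the name printed by the upgrade)
diff -u /etc/libvma.conf /etc/libvma.conf.rpmnew
```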
Appendix A: Port Type Management
A.1 ConnectX-3/ConnectX-3 Pro Port Type Management
ConnectX®-3 ports can be individually configured to work as InfiniBand or Ethernet ports.
By default, both ConnectX-3 ports are initialized as InfiniBand ports. If you wish to change
the port type, use the connectx_port_config script after the driver is loaded.
NOTE: When changing port type using the "connectx_port_config" utility, all the
HCA's ports and interfaces are brought down, and then brought back up with a new port
configuration.
Running “/sbin/connectx_port_config -s” will show current port configuration
for all ConnectX devices.
Port configuration is saved in the file: /etc/infiniband/connectx.conf. This saved
configuration is restored at driver restart only if restart is done via “/etc/init.d/openibd
restart”.
Possible port types are:
eth – Ethernet
ib – InfiniBand
auto – Link sensing mode – detects the port type based on the attached network type. If no
link is detected, the driver retries link sensing every few seconds.
The table below lists the ConnectX port configurations supported by VPI.
Table 3: Supported ConnectX®-3 Port Configurations
Port 1 Configuration    Port 2 Configuration
ib                      ib
ib                      eth
eth                     eth
The port link type can be configured for each device in the system at run time using the
“/sbin/connectx_port_config” script. This utility will prompt for the PCI device
to be modified (if there is only one it will be selected automatically).
In the next stage the user will be prompted for the desired mode for each port. The desired
port configuration will then be set for the selected device.
This utility also has a non-interactive mode:
/sbin/connectx_port_config [[-d|--device <PCI device ID>] -c|--conf <port1,port2>]
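For example, a hypothetical non-interactive invocation (the PCI device ID is an assumption; use the one reported for your adapter, e.g. by "lspci | grep Mellanox"):

```shell
# Set port 1 to InfiniBand and port 2 to Ethernet on a specific device
/sbin/connectx_port_config -d 0000:04:00.0 -c ib,eth

# Show the resulting configuration for all ConnectX devices
/sbin/connectx_port_config -s
```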
A.2 ConnectX-4/ConnectX-5 Port Type Management/VPI Cards Configuration
ConnectX®-4/ConnectX®-5 ports can be individually configured to work as InfiniBand or
Ethernet ports. By default both ConnectX®-4/ConnectX®-5 VPI ports are initialized as
InfiniBand ports. If you wish to change the port type, use the mlxconfig script after the
driver is loaded.
For further information on how to set the port type in ConnectX®-4, please refer to the MFT
User Manual (www.mellanox.com > Products > Software > InfiniBand/VPI Software >
MFT - Firmware Tools).
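As a hedged sketch, the change can be made with the MFT mlxconfig tool roughly as follows. The MST device path is an assumption; run "mst status" to find yours. LINK_TYPE values: 1 = InfiniBand, 2 = Ethernet.

```shell
# Start the Mellanox Software Tools service and list available devices
mst start
mst status

# Set both ports of a ConnectX-4 VPI adapter to Ethernet
# (device path /dev/mst/mt4115_pciconf0 is an assumption)
mlxconfig -d /dev/mst/mt4115_pciconf0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2

# The new port type takes effect after a firmware reset or reboot
```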
Appendix B: Basic Performance Tuning
Please see the Mellanox Tuning Guide and the VMA Performance Tuning Guide for detailed
instructions on how to optimally tune your machines for VMA performance.
B.1 VMA Tuning Parameters
VMA_SPEC
Description: Optimized performance can easily be achieved with the VMA predefined
specification profile for latency:
Latency profile spec – optimizes latency in all use cases. The system is tuned to keep a
balance between the kernel and VMA.
Note: It may limit the maximum bandwidth.
Example:
LD_PRELOAD=libvma.so VMA_SPEC=latency sockperf ping-pong -t 10 -i 11.4.3.1

VMA_RX_POLL
Description: For blocking sockets only. Controls the number of times ready packets can be
polled on the RX path before the socket goes to sleep (waits for an interrupt in blocked
mode).
For best latency, use -1 for infinite polling.
For low CPU usage, use 1 for a single poll.
The default value is 100000.
Example:
Server:
VMA_RX_POLL=-1 LD_PRELOAD=libvma.so taskset -c 15 sockperf sr -i 17.209.13.142
Client:
VMA_RX_POLL=-1 LD_PRELOAD=libvma.so taskset -c 15 sockperf pp -i 17.209.13.142 -t 5

VMA_INTERNAL_THREAD_AFFINITY
Description: Controls which CPU core(s) the VMA internal thread is serviced on. The
recommended configuration is to run the VMA internal thread on a different core than the
application, but on the same NUMA node.
Example:
Server:
VMA_INTERNAL_THREAD_AFFINITY=14 VMA_RX_POLL=-1 LD_PRELOAD=libvma.so
taskset -c 15 sockperf sr -i 17.209.13.142
Client:
VMA_INTERNAL_THREAD_AFFINITY=14 VMA_RX_POLL=-1 LD_PRELOAD=libvma.so
taskset -c 15 sockperf pp -i 17.209.13.142 -t 5
B.2 Binding VMA to the Closest NUMA
1. Check which NUMA is related to your interface.
cat /sys/class/net/<interface_name>/device/numa_node
Example:
[root@r-host142 ~]# cat /sys/class/net/ens5/device/numa_node
1
The output above shows that your device is installed next to NUMA 1.
2. Check which CPU is related to the specific NUMA.
[root@r-host144 ~]# lscpu
NUMA node0 CPU(s): 0-13,28-41
NUMA node1 CPU(s): 14-27,42-55
The output above shows that:
CPUs 0-13 & 28-41 are related to NUMA 0
CPUs 14-27 & 42-55 are related to NUMA 1
Since we want to use NUMA 1, one of the following CPUs should be used: 14-27 &
42-55
3. Use the taskset command to run the VMA process on a specific CPU.
Server side:
LD_PRELOAD=libvma.so taskset -c 15 sockperf sr -i <MLX IP interface>
Client side:
LD_PRELOAD=libvma.so taskset -c 15 sockperf pp -i <IP of FIRST machine MLX interface>
In this example, we use CPU 15, which belongs to NUMA node 1. You can also use:
numactl --hardware
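The steps above can be combined into a small sketch that picks the first CPU on the NIC's NUMA node automatically. The interface name and server IP are assumptions; replace them with your own values.

```shell
#!/bin/bash
# Pick a CPU on the same NUMA node as the network interface.
IFACE=ens5                # assumption: replace with your interface name
SERVER_IP=17.209.13.142   # assumption: IP of the Mellanox interface

# NUMA node of the device (-1 means unknown; fall back to node 0)
NODE=$(cat /sys/class/net/"$IFACE"/device/numa_node 2>/dev/null)
[ -z "$NODE" ] || [ "$NODE" -lt 0 ] && NODE=0

# First CPU that lscpu reports on that node
CPU=$(lscpu -p=CPU,NODE | awk -F, -v n="$NODE" '!/^#/ && $2 == n {print $1; exit}')

echo "Binding to CPU $CPU on NUMA node $NODE"
LD_PRELOAD=libvma.so taskset -c "$CPU" sockperf sr -i "$SERVER_IP"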
B.3 Configuring the BIOS
Each machine has its own BIOS parameters. It is important to implement any server
manufacturer and Linux distribution tuning recommendations for lowest latency.
When configuring the BIOS, please pay attention to the following:
1. Enable Max performance mode.
2. Enable Turbo mode.
3. Power modes – disable C-states and P-states, do not let the CPU sleep on idle.
4. Hyperthreading – there is no single right answer as to whether it should be ON or OFF.
ON means more CPUs to handle kernel tasks, so the amortized cost is smaller for each
CPU.
OFF means the cache is not shared with other CPUs, so cache utilization is better.
If all of your system jitter is under control, it is recommended to turn it OFF; if not, keep
it ON.
5. Disable SMI interrupts.
Look for "Processor Power and Utilization Monitoring" and "Memory Pre-Failure
Notification" SMIs.
The OS is not aware of these interrupts, so the only way you might be able to notice them
is by reading the CPU MSR registers.
Please make sure to carefully read your vendor BIOS tuning guide as the configuration
options differ per vendor.
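Some of these settings can be sanity-checked from Linux after boot. The cpupower tool ships with the kernel-tools package on most distributions; its availability here is an assumption.

```shell
# List the C-states the kernel is allowed to enter (ideally only POLL/C1)
cpupower idle-info

# Show the active frequency driver and governor (look for "performance")
cpupower frequency-info

# Raw sysfs view of the available idle states
cat /sys/devices/system/cpu/cpu0/cpuidle/state*/name
```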