High Throughput File Servers with SMB Direct, Using the Three Flavors of RDMA Network Adapters
Jose Barreto, Principal Program Manager
Microsoft Corporation
Abstract
In Windows Server 2012, we introduce the “SMB Direct” protocol, which allows file servers to use high-throughput, low-latency RDMA network interfaces.
However, there are three distinct flavors of RDMA, each with its own specific requirements, advantages, and trade-offs.
In this session, we'll look into iWARP, InfiniBand and RoCE and outline the differences between them. We'll also list the specific vendors that offer each technology and provide step-by-step instructions for anyone planning to deploy them.
The talk will also include an update on RDMA performance and a customer case study.
Summary
• Overview of SMB Direct (SMB over RDMA)
• Three flavors of RDMA
• Setting up SMB Direct
• SMB Direct Performance
• SMB Direct Case Study
SMB Direct (SMB over RDMA)
• New class of SMB file storage for the Enterprise
– Minimal CPU utilization for file storage processing
– Low latency and ability to leverage high-speed NICs
– Fibre Channel-equivalent solution at a lower cost
• Traditional advantages of SMB file storage
– Easy to provision, manage and migrate
– Leverages converged network
– No application change or administrator configuration
• Required hardware
– RDMA-capable network interface (R-NIC)
– Support for iWARP, InfiniBand and RoCE
• Uses SMB Multichannel for Load Balancing/Failover
[Diagram: File Client and File Server, each with user/kernel layers (Application, SMB Client/SMB Server, NTFS, SCSI, Disk), connected through R-NICs over a network with RDMA support]
What is RDMA?
• Remote Direct Memory Access Protocol
– Accelerated I/O delivery model that works by allowing application software to bypass most layers of software and communicate directly with the hardware
• RDMA benefits
– Low latency
– High throughput
– Zero-copy capability
– OS / stack bypass
• RDMA Hardware Technologies
– InfiniBand
– iWARP: RDMA over TCP/IP
– RoCE: RDMA over Converged Ethernet
SMB Direct
[Diagram: SMB Client and SMB Server memory connected via NDKPI and RDMA NICs over Ethernet or InfiniBand]
1. Application (Hyper-V, SQL Server) does not need to change.
2. SMB client makes the decision to use SMB Direct at run time.
3. NDKPI provides a much thinner layer than TCP/IP.
4. Remote Direct Memory Access is performed by the network interfaces.
SMB over TCP and RDMA
[Diagram: the application uses an unchanged API; the SMB Client and SMB Server run over TCP/IP with a regular NIC and, in parallel, over NDKPI with an RDMA NIC, across Ethernet and/or InfiniBand; the same four steps from the previous slide apply on both paths]
Comparing RDMA Technologies
All three RDMA flavors (iWARP, RoCE, InfiniBand) share two pros: low CPU utilization under load and low latency.

Non-RDMA Ethernet (wide variety of NICs)
• Pros
– TCP/IP-based protocol
– Works with any Ethernet switch
– Wide variety of vendors and models
– Support for in-box NIC teaming (LBFO)
• Cons
– Currently limited to 10Gbps per NIC port
– High CPU utilization under load
– High latency

iWARP (Intel NE020*, Chelsio T4)
• Pros
– TCP/IP-based protocol
– Works with any 10GbE switch
– RDMA traffic routable
• Cons
– Currently limited to 10Gbps per NIC port*

RoCE (Mellanox ConnectX-2, Mellanox ConnectX-3*)
• Pros
– Ethernet-based protocol
– Works with high-end 10GbE/40GbE switches
– Offers up to 40Gbps per NIC port today*
• Cons
– RDMA traffic not routable via existing IP infrastructure
– Requires DCB switch with Priority Flow Control (PFC)

InfiniBand (Mellanox ConnectX-2, Mellanox ConnectX-3*)
• Pros
– Offers up to 54Gbps per NIC port today*
– Switches typically less expensive per port than 10GbE switches*
– Switches offer 10GbE or 40GbE uplinks
– Commonly used in HPC environments
• Cons
– Not an Ethernet-based protocol
– RDMA traffic not routable via existing IP infrastructure
– Requires InfiniBand switches
– Requires a subnet manager (on the switch or the host)

* This is current as of the release of Windows Server 2012 RC. Information on this slide is subject to change as technologies evolve and new cards become available.
Mellanox ConnectX®-3 Dual-Port Adapter with VPI (InfiniBand and Ethernet)
• Mellanox provides end-to-end InfiniBand and Ethernet connectivity solutions (adapters, switches, cables)
– Connecting data center servers and storage
• Up to 56Gb/s InfiniBand and 40Gb/s Ethernet per port
– Low latency, Low CPU overhead, RDMA
– InfiniBand to Ethernet Gateways for seamless operation
• Windows Server 2012 exposes the great value of InfiniBand for storage traffic, virtualization and low latency
– InfiniBand and Ethernet (with RoCE) integration
– Highest Efficiency, Performance and return on investment
• For more information:
– http://www.mellanox.com/content/pages.php?pg=file_server
– Gilad Shainer, [email protected], [email protected]
Intel 10GbE iWARP Adapter - NE020
• In production today
– Supports Microsoft's MPI via ND in Windows Server 2008 R2 and beyond
– See Intel's Download site (http://downloadcenter.intel.com) for drivers (search “NE020”)
• Drivers inbox since Beta for Windows Server 2012
– Supports Microsoft's SMB Direct via NDK
– Uses the IETF's iWARP RDMA technology, which is built on top of IP
– The only WAN-routable, “cloud-ready” RDMA technology
– Uses standard Ethernet switches
– Beta drivers available from Intel's Download site (http://downloadcenter.intel.com) (search “NE020”)
• For more information:
– [email protected]
Chelsio T4 line of 10GbE adapters (iWARP)
http://www.chelsio.com/wp-content/uploads/2011/07/ProductSelector-0312.pdf
• Contact: [email protected]
Setting up SMB Direct
• Install hardware and drivers
– Get-NetAdapter
– Get-NetAdapterRdma
• Configure IP addresses
– Get-SmbServerNetworkInterface
– Get-SmbClientNetworkInterface
• Establish an SMB connection
– Get-SmbConnection
– Get-SmbMultichannelConnection
• Similar to configuring SMB for regular network interfaces
• Verify client Performance Counters
– RDMA Activity – 1/interface
– SMB Direct Connection – 1/connection
– SMB Client Shares – 1/share
• Verify server Performance Counters
– RDMA Activity – 1/interface
– SMB Direct Connection – 1/connection
– SMB Server Shares – 1/share
– SMB Server Session – 1/session
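The checks above can be strung together on the client side; a minimal sketch (the exact property names on the multichannel output are as exposed by Windows Server 2012, and the counter set is listed rather than sampled to keep the sketch hardware-independent):

```powershell
# List network adapters and see which ones report RDMA capability
Get-NetAdapter
Get-NetAdapterRdma | Where-Object { $_.Enabled }

# After accessing a share, confirm SMB Multichannel selected an
# RDMA-capable path on both ends of the connection
Get-SmbMultichannelConnection |
    Select-Object ServerName, ClientRdmaCapable, ServerRdmaCapable

# Enumerate the SMB Direct performance counters (one instance per
# connection) to verify traffic is flowing over RDMA rather than TCP
Get-Counter -ListSet "SMB Direct Connection"
```

If ClientRdmaCapable or ServerRdmaCapable shows False, fall back to the driver and IP-configuration checks above before looking at the fabric.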
InfiniBand details
• Cards
– Mellanox ConnectX-2
– Mellanox ConnectX-3
• Configure a subnet manager on the switch
– Use a managed switch with a built-in subnet manager
• Or use OpenSM on Windows Server 2012
– Included as part of the Mellanox package
– New-Service –Name "OpenSM" –BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -L 128" -DisplayName "OpenSM" –Description "OpenSM" -StartupType Automatic
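Once registered, the service still needs to be started for the fabric to come up before the next reboot; a sketch annotating the slide's command (the opensm.exe path follows the Mellanox MLNX_VPI package layout shown above):

```powershell
# Register OpenSM as an auto-starting Windows service; the escaped inner
# quotes keep the space in "Program Files" intact, and -L 128 sets the
# OpenSM log size as in the slide's example
New-Service -Name "OpenSM" `
    -BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -L 128" `
    -DisplayName "OpenSM" -Description "OpenSM" -StartupType Automatic

# Start the subnet manager now instead of waiting for a reboot
Start-Service -Name "OpenSM"
```

Run a subnet manager on only one host (or on the switch), not on every node.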
iWARP details
• Cards
– Intel NE020
– Chelsio T4
• Configure the firewall
– SMB Direct with iWARP uses TCP port 5445
– Enable-NetFirewallRule FPSSMBD-iWARP-In-TCP
• Allow cross-subnet access (optional)
– iWARP RDMA technology can be routed across IP subnets
– Set-NetOffloadGlobalSetting -NetworkDirectAcrossIPSubnets Allow
RoCE details
• Cards
– Mellanox ConnectX-3
– Make sure to configure the NIC for Ethernet
• Configuring Priority Flow Control (PFC) on Windows
– Install-WindowsFeature Data-Center-Bridging
– New-NetQosPolicy "RoCE" –NetDirectPortMatchCondition 445 -PriorityValue8021Action 4
– Enable-NetQosFlowControl –Priority 4
– Enable-NetAdapterQos –InterfaceAlias RDMA1
– Set-NetQosDcbxSetting –Willing 0
– New-NetQoSTrafficClass "RoCE" -Priority 4 -Bandwidth 60 -Algorithm ETS
• Configuring PFC on the switch
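The Windows-side PFC steps above can be collected into one commented sequence; a sketch using the slide's values (RDMA1 is an example interface alias, and priority 4 with a 60% bandwidth reservation are the slide's choices, not requirements):

```powershell
# Install the Data Center Bridging feature, which provides the QoS cmdlets
Install-WindowsFeature Data-Center-Bridging

# Tag SMB Direct traffic (NetworkDirect port 445) with 802.1p priority 4
New-NetQosPolicy "RoCE" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 4

# Enable Priority Flow Control for priority 4 only
Enable-NetQosFlowControl -Priority 4

# Apply the QoS settings to the RDMA-capable adapter
Enable-NetAdapterQos -InterfaceAlias RDMA1

# Turn off the DCBX "willing" bit so the host does not accept
# DCB configuration pushed from the switch
Set-NetQosDcbxSetting -Willing 0

# Reserve 60% of bandwidth for the RoCE traffic class using ETS
New-NetQosTrafficClass "RoCE" -Priority 4 -Bandwidth 60 -Algorithm ETS
```

The switch-side PFC configuration is vendor-specific; whatever priority is flow-controlled on the host must match the priority configured on the switch.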
SMB Direct Performance – 1 x 54GbIB
[Diagram: four test configurations, each running an IO micro-benchmark against four Fusion-io cards — a single local server, an SMB client/server pair over 10GbE, a pair over InfiniBand QDR, and a pair over InfiniBand FDR]
SMB Direct Performance – 1 x 54GbIB
*** Preliminary *** results from two Intel Romley machines with 2 sockets each, 8 cores/socket. Both client and server use a single port of a Mellanox network interface in a PCIe Gen3 x8 slot. Data goes all the way to persistent storage, using 4 Fusion-io ioDrive 2 cards. Preliminary results based on Windows Server 2012 Beta.

Workload: 512KB IOs, 8 threads, 8 outstanding
Configuration                  | BW (MB/sec) | IOPS (512KB IOs/sec) | %CPU Privileged
Non-RDMA (Ethernet, 10Gbps)    | 1,129       | 2,259                | ~9.8
RDMA (InfiniBand QDR, 32Gbps)  | 3,754       | 7,508                | ~3.5
RDMA (InfiniBand FDR, 54Gbps)  | 5,792       | 11,565               | ~4.8
Local                          | 5,808       | 11,616               | ~6.6

Workload: 8KB IOs, 16 threads, 16 outstanding
Configuration                  | BW (MB/sec) | IOPS (8KB IOs/sec)   | %CPU Privileged
Non-RDMA (Ethernet, 10Gbps)    | 571         | 73,160               | ~21.0
RDMA (InfiniBand QDR, 32Gbps)  | 2,620       | 335,446              | ~85.9
RDMA (InfiniBand FDR, 54Gbps)  | 2,683       | 343,388              | ~84.7
Local                          | 4,103       | 525,225              | ~90.4
SMB Direct Performance – 2 x 54GbIB
[Diagram: three test configurations — (1) a single server running SQLIO locally, (2) a file client (SMB 3.0) running SQLIO against a file server over two RDMA NICs per side, (3) a Hyper-V host (SMB 3.0) running SQLIO inside a VM against a file server over two RDMA NICs per side; each server's storage is four SAS RAID controllers, each attached to a JBOD of 8 SSDs]
SMB Direct Performance – 2 x 54GbIB
Preliminary results based on Windows Server 2012 RC

Configuration  | BW (MB/sec) | IOPS (512KB IOs/sec) | %CPU Privileged | Latency
1 – Local      | 10,090      | 38,492               | ~2.5%           | ~3 ms
2 – Remote     | 9,852       | 37,584               | ~5.1%           | ~3 ms
3 – Remote VM  | 10,367      | 39,548               | ~4.6%           | ~3 ms
SMB Direct Performance – 3 x 54GbIB
[Diagram: a file client (SMB 3.0) running SQLIO against a file server (SMB 3.0) over three RDMA NICs per side; the file server uses Storage Spaces over one SAS RAID controller and five SAS HBAs, each attached to a JBOD of 8 SSDs]
Workload                      | BW (MB/sec) | IOPS (IOs/sec) | %CPU Privileged | Latency
512KB IOs, 100% read, 2t, 8o  | 16,778      | 32,002         | ~11%            | ~2 ms
8KB IOs, 100% read, 16t, 2o   | 4,027       | 491,665        | ~65%            | < 1 ms
Preliminary results based on Windows Server 2012 RC
Case Study
Summary
• Overview of SMB Direct (SMB over RDMA)
• Three flavors of RDMA
• Setting up SMB Direct
• SMB Direct Performance
• SMB Direct Case Study
Related Content
• Blog Posts: http://smb3.info
• TechEd Talks
– WSV328 The Path to Continuous Availability with Windows Server 2012
– VIR306 Hyper-V over SMB: Remote File Storage Support in Windows Server 2012 Hyper-V
– WSV314 Windows Server 2012 NIC Teaming and SMB Multichannel Solutions
– WSV334 Windows Server 2012 File and Storage Services Management
– WSV303 Windows Server 2012 High-Performance, Highly-Available Storage Using SMB
– WSV330 How to Increase SQL Availability and Performance Using WS 2012 SMB 3.0 Solutions
– WSV410 Continuously Available File Server: Under the Hood
– WSV310 Windows Server 2012: Cluster-in-a-Box, RDMA, and More