
Page 1: A New Path to Better Data Movement within System Memory ...

A New Path to Better Data Movement within System Memory, Computational Memory with SDXI

Shyam Iyer, Chair, SNIA SDXI Technical Work Group; Distinguished Engineer, Dell

(With inputs from SDXI TWG Members)

Page 2: A New Path to Better Data Movement within System Memory ...

SDXI: Smart Data Accelerator Interface

SDXI (Smart Data Accelerator Interface) is a proposed SNIA standard for a memory-to-memory data mover interface.

Page 3: A New Path to Better Data Movement within System Memory ...

Agenda

• The problem and the need for a solution

• Introducing SDXI (Smart Data Accelerator Interface)

Page 4: A New Path to Better Data Movement within System Memory ...

System Design Challenges

• Core counts increasing to enable Compute scaling

• Compute density is on the rise

• Converged and Hyperconverged Storage appliances are enabling new workloads on server class systems

• Data locality is important

• Single-threaded performance is under pressure.

• I/O-intensive workloads can take away available CPU compute cycles.

• Network and storage workloads can take compute cycles

• Data movement, encryption, decryption, compression

Unlocking trapped resources

© 2021 SNIA Persistent Memory + Computational Storage Summit. All Rights Reserved.

Page 5: A New Path to Better Data Movement within System Memory ...

Host Stack

• Each intra-host exchange can comprise multiple memory buffer copies (or transformations)

• Generally implemented with layers of software stacks:

• Kernel-to-I/O can leverage I/O-specific hardware memory copy

• But SW-to-SW usually relies on per-core synchronous software (CPU-only) memory copies

[Figure: Intra-Host Workload Congestion. Labels: Application; Compute Stack; Network Stack; Storage Stack (e.g. storage VMs); Network/Storage Stack; vSwitch; Host/Hypervisor; Host Network Uplink; Storage Cluster Network; Remote Storage. Drivers: reducing storage network latency, increasing BW demands, application workload demands, new memory technologies. Technologies: 10GbE, 25GbE, 40GbE, 100GbE, RoCE, iWARP, TCP/IP, NVMe, NVDIMM, PCIe, Gen-Z, CXL.]

Page 6: A New Path to Better Data Movement within System Memory ...

Disaggregated Host Stacks

[Figure: Disaggregated host stacks. Labels: Application; Compute Stack; Network Stack; Host/Hypervisor (shown more than once); Host Network Uplink; Storage Stack (e.g. storage VMs); Network/Storage Stack; Storage Cluster Network; Remote Storage. Drivers: reducing storage network latency, increasing BW demands, application workload demands, new memory technologies. Technologies: 10GbE, 25GbE, 40GbE, 100GbE, RoCE, iWARP, TCP/IP, NVMe, NVDIMM, PCIe, Gen-Z, CXL.]

The same system design challenges apply!

Page 7: A New Path to Better Data Movement within System Memory ...

[Figure labels: System Physical Address space; DRAM (Context A); DRAM (Context B).]

Current memory-to-memory data movement standard:

• Stable CPU ISA for SW-based memory copies

• Takes away from application performance

• Software overhead to provide context isolation

• Synchronous SW copies stall applications

• Less portable to different ISAs (Instruction Set Architectures)

• Finely tuned CPU data movement algorithms can break with new microarchitectures
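
The stall caused by synchronous software copies can be illustrated with a minimal sketch. It is not SDXI code: a worker thread merely stands in for a data mover, and the names `copy_buffer`, `synchronous_path` and `offloaded_path` are hypothetical. It only contrasts "block until copied" with "submit, keep working, check completion later".

```python
from concurrent.futures import ThreadPoolExecutor


def copy_buffer(dst: bytearray, src: bytes) -> int:
    """Software copy: the executing CPU moves every byte."""
    dst[:len(src)] = src
    return len(src)


def synchronous_path(dst: bytearray, src: bytes) -> int:
    # The application is stalled here until the whole copy has finished.
    return copy_buffer(dst, src)


def offloaded_path(pool: ThreadPoolExecutor, dst: bytearray, src: bytes):
    # Hypothetical stand-in for handing the copy to a data mover:
    # submission returns immediately with a completion handle.
    return pool.submit(copy_buffer, dst, src)


src = bytes(range(256)) * 4096          # 1 MiB source buffer
dst_sync = bytearray(len(src))
dst_async = bytearray(len(src))

synchronous_path(dst_sync, src)

with ThreadPoolExecutor(max_workers=1) as pool:
    handle = offloaded_path(pool, dst_async, src)
    other_work = sum(range(1000))       # the application keeps running
    copied = handle.result()            # wait for completion only when needed
```

A Python thread does not actually free CPU cycles the way a hardware engine would; the sketch shows only the shape of the submission/completion model.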

[Figure labels: Application (Context A); Application (Context B); CPU; SW context isolation layers.]

Page 8: A New Path to Better Data Movement within System Memory ...

Offload DMA engines: A new concept?

• Fast DMA offload engines are:

• Vendor-specific HW

• Vendor-specific drivers, APIs

• Vendor-specific work submission/completion models

• Direct access by user-level software is difficult

• Limited usage models

• Vendor-specific DMA states make it harder to abstract/virtualize and migrate the work to other hosts

Page 9: A New Path to Better Data Movement within System Memory ...

Solution Requirements

1. Need to offload I/O from Compute CPU cycles

2. Need Architectural Stability

3. Enable Application/VM acceleration but,

• Help migration from existing SW Stacks

4. Create abstractions in Control Path for scale and management

5. Enable performance in data path with offloads

Page 10: A New Path to Better Data Movement within System Memory ...

The need for an industry standard

1. Leverage a standard specification

2. Innovate around the spec

3. Add incremental data acceleration features

We are entering a tiered memory world!

[Figure labels: Application (Context A); Application (Context B); Direct User mode; Data mover Acceleration (CPU offloaded); Security; Architectural Stability; SW context isolation layers; CPU Family A; CPU Family B; GPU; FPGA; Smart I/O; Accelerator (five instances); System Physical Address space; DRAM (Context A); DRAM (Context B); MMIO (Memory Mapped I/O); SCM (Storage Class Memory); CXL/Fabric Attached Memory/Gen-Z.]

Page 11: A New Path to Better Data Movement within System Memory ...

Why focus on standardizing a memory-to-memory interface?

• Heterogeneous compute resource usage is on the rise

• Blurring boundaries between a server host and a storage host

• Disaggregation unlocks trapped resources

[Figure labels: Host; Data in Use; DRAM; Memory Expansion with CXL; MMIO (Memory Mapped I/O); Persistent Memory.]

Memory to Memory Interface

Computational Memory Interface

Interface to accelerate Storage & Memory Convergence

Interface to accelerate resource disaggregation with memory links/fabrics

Page 12: A New Path to Better Data Movement within System Memory ...

Agenda

• The problem and the need for a solution

• Introducing SDXI (Smart Data Accelerator Interface)

Page 13: A New Path to Better Data Movement within System Memory ...

Accelerated, Virtualized, Standardized HW Data Movement

[Figure labels: PF; multiple VFs, each shown with an App; VMs; IOMMU; DRAM.]

Scalable, Concurrent, Non-mediated, Standardized, HW Data Movement

1. A standardized data mover interface independent of actual implementations and underlying I/O.

2. Data movement between different address spaces, both within and across VMs.

3. Data movement without mediation by privileged software (hypervisor).

4. An interface and architecture that can be abstracted or virtualized by privileged software.

5. Concurrent DMA model.

6. Allow "live" workload or virtual machine migration between servers.

7. Forwards and backwards compatibility.

• Allow hardware and software interoperability

8. Incorporate additional offloads in the future.

Page 14: A New Path to Better Data Movement within System Memory ...

SDXI Function Architecture

• Function setup and control are accelerator-independent.

• One standard descriptor format.

• All SDXI context state resides in memory.

• No special mechanisms to serialize state.

• Very easy to virtualize.

[Figure labels: Function MMIO; Context Tables; Context Ctrl and State; Context Doorbell; AKey Table; Descriptor Ring; Read Index; Write Index.]

Page 15: A New Path to Better Data Movement within System Memory ...

SDXI Function Architecture

• Ring state directly managed by user mode.

• Multiple contexts per function.

• Allows easier multi-producer support within one VM.

• One way to set up, control, and quiesce contexts.

• One flexible way to log errors.
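
The interplay of a descriptor ring, its read and write indices, and a doorbell can be sketched as a toy model. This is a simplification under stated assumptions, not the layout or protocol defined by the specification: the class `DescriptorRing`, its methods, and the monotonically increasing indices are all hypothetical choices made for illustration.

```python
class DescriptorRing:
    """Toy ring: the producer advances write_index, the function advances read_index."""

    def __init__(self, slots: int):
        self.slots = [None] * slots
        self.read_index = 0    # next descriptor the function will consume
        self.write_index = 0   # next free position for the producer
        self.doorbell = 0      # last write_index announced by the producer

    def free_slots(self) -> int:
        return len(self.slots) - (self.write_index - self.read_index)

    def submit(self, descriptor) -> bool:
        if self.free_slots() == 0:
            return False                       # ring full, producer must retry
        self.slots[self.write_index % len(self.slots)] = descriptor
        self.write_index += 1
        return True

    def ring_doorbell(self) -> None:
        self.doorbell = self.write_index       # signal that new work exists

    def consume(self) -> list:
        """Stand-in for the function: drain descriptors up to the doorbell value."""
        done = []
        while self.read_index < self.doorbell:
            done.append(self.slots[self.read_index % len(self.slots)])
            self.read_index += 1
        return done


ring = DescriptorRing(slots=4)
accepted = [ring.submit(f"copy-{i}") for i in range(5)]   # fifth submit finds the ring full
ring.ring_doorbell()
processed = ring.consume()
```

Because all state in this model lives in ordinary memory (the slots and three integers), saving or moving a context amounts to copying that memory, which is the property the slides highlight for virtualization.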

[Figure labels: Function MMIO; Context Tables; Context Ctrl and State; Context Doorbell; AKey Table; Descriptor Ring; Read Index; Write Index; Error Log. Legend: User Mode Access; Privileged Access.]

Page 16: A New Path to Better Data Movement within System Memory ...

A Standard Descriptor Format (1)

[Figure: 64-byte descriptor. Fields: V; Operation; CTL; Rsvd; Op Group; Operation-Specific Descriptor Body; Completion_Ptr.]

Architecturally Registered Operation Groups (room for lots of future operations): Administrative, Connect, DMA Base, Atomic, Vendor-Defined, Others …

• DMA: Copy, RepCopy, WriteImm

• Atomic: Bitwise Ops, Add, Sub, Swap, Min, Max, CmpSwap, etc.

• Admin: Start/Stop/Update/Sync, Interrupt Function & Contexts (easily virtualizable)

• Connect: Connect VMs through SDXI functions (easily virtualizable)

Completion Status Block (fields: Completion Signal; Misc Status; Rsvd)

• Initialized by SW, decremented by the function on success

• Can be shared across groupings of descriptors

• Errors flagged in Completion Signal and Misc Status
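
A fixed 64-byte descriptor and a shared completion counter can be sketched as follows. The field widths, offsets, and opcode values here are assumptions chosen so the pieces add up to 64 bytes; they are not taken from the SDXI specification, and `pack_descriptor` and `CompletionStatusBlock` are hypothetical names.

```python
import struct

# Assumed layout (NOT the specification's):
#   4-byte control word | 52-byte operation-specific body | 8-byte Completion_Ptr
LAYOUT = struct.Struct("<I52sQ")
DESCRIPTOR_SIZE = LAYOUT.size          # 64 bytes

OP_GROUP_DMA = 0x1                     # hypothetical encoding
OP_COPY = 0x0                          # hypothetical encoding


def pack_descriptor(op_group: int, operation: int, body: bytes, completion_ptr: int) -> bytes:
    # Bit 0 plays the role of the V field; the other bit positions are assumptions.
    control = 0x1 | (operation << 8) | (op_group << 16)
    return LAYOUT.pack(control, body, completion_ptr)   # body is zero-padded to 52 bytes


class CompletionStatusBlock:
    """Counter initialized by software and decremented on each success."""

    def __init__(self, pending: int):
        self.signal = pending
        self.error = False

    def complete(self, ok: bool = True) -> None:
        if ok:
            self.signal -= 1           # stand-in for the function's decrement
        else:
            self.error = True          # errors are flagged instead of counted


# Three descriptors share one completion status block.
csb = CompletionStatusBlock(pending=3)
descriptors = [pack_descriptor(OP_GROUP_DMA, OP_COPY, b"", 0x1000) for _ in range(3)]
for _ in descriptors:
    csb.complete()
```

Software polling such a block would treat a signal of zero with no error flag as "the whole group finished".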

Page 17: A New Path to Better Data Movement within System Memory ...

RepCopy Example

Initialize a large buffer with zeroes for VM init.

[Figure labels: Kernel Application; Driver; SDXI Device; VMA; Source; Dest.]
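
The effect of a repeated copy can be modelled in software. This sketch assumes, from the operation's name and the zero-fill example on the slide, that RepCopy replicates a small source buffer across a larger destination; the function `rep_copy` is hypothetical and runs on the CPU rather than on an SDXI device.

```python
def rep_copy(dest: bytearray, source: bytes) -> None:
    """Fill dest by repeating source; the final repetition may be partial."""
    if not source:
        raise ValueError("source pattern must not be empty")
    reps, tail = divmod(len(dest), len(source))
    dest[:] = source * reps + source[:tail]


guest_memory = bytearray(b"\xff" * (1 << 20))   # 1 MiB of stale data
rep_copy(guest_memory, bytes(64))               # 64-byte block of zeroes as the source

pattern_demo = bytearray(5)
rep_copy(pattern_demo, b"ab")
```

The appeal of offloading this pattern is that a single descriptor describes the whole fill, instead of a CPU loop touching every byte.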

Page 18: A New Path to Better Data Movement within System Memory ...

Baremetal Stack View

[Figure labels: User Mode Application; User Mode Driver (Library); Kernel Mode Application; Kernel Mode Driver; SDXI HW.]

1. Initialize

2. Discover capabilities

• Producer context's descriptor ring in kernel address space

• Producer context's descriptor ring in user address space

• Direct, secure access to the hardware

• OS-specific interface to enable a user mode driver

• Framework-specific interface to enable a user mode app with a descriptor ring and context-specific structures

Page 19: A New Path to Better Data Movement within System Memory ...

Direct HW access, Tier across Memory Tiers

DRAM PMEM MMIO CXL

Source and Destination Memory Targets for Data transfer in System Physical Address Space

Page 20: A New Path to Better Data Movement within System Memory ...

Scale Baremetal Apps: Multi-Address Space

[Figure labels: SDXI HW with PF and three VFs; Kernel Mode Driver; Kernel Mode Application; User Mode Driver (Library); User Mode Application (Address Space A); User Mode Application (Address Space B).]

Page 21: A New Path to Better Data Movement within System Memory ...

Scale with Compute Virtualization: Multi-VM Address Space

[Figure labels: SDXI Device; Hypervisor Kernel Mode Driver; Connection Manager (shown twice); and for each of VM A and VM B: SDXI Virtual Device, Guest Kernel Mode Driver, Guest Kernel Mode Application, User Mode Driver (Library), User Mode App.]

Page 22: A New Path to Better Data Movement within System Memory ...

SDXI TWG 2020 Accomplishments

• SDXI (Smart Data Accelerator Interface) is a proposed SNIA standard for a memory-to-memory data mover interface.

• Charter and Program of Work

• https://members.snia.org/document/dl/31306

• Recently completed work: summary

• TWG approved by SNIA Technical Council and Board in July 2020

• TWG had its first meeting on July 30, 2020

• AMD, Dell, and VMware contributed a v0.7 draft specification as a starting point for this TWG (September)

• 22 companies and over 50 individual members, and growing

Page 23: A New Path to Better Data Movement within System Memory ...

SDXI TWG 2021 Work Items

• Initial RFC review process: completed

• Work on building an OS-independent software reference: underway

• Initiate public review of a pre-1.0 draft specification

• Release a v1.0 SNIA architecture document

• Plan and begin work on post-v1.0 features

• Work on designing compliance testing tools

• SNIA group collaboration

• Leverage expertise in other SNIA groups around persistent memory

• Coordinate synergies with the Computational Storage Work Group

• External group collaboration / alliance work items

• PCI-SIG, CXL, Gen-Z

Page 24: A New Path to Better Data Movement within System Memory ...

SDXI TWG’s Program of Work

• Post v1.0 Focus

• New data mover operations for smart acceleration

• Data mover operations involving persistent memory targets

• Cache coherency models for data movers

• Security Features involving data movers

• Connection Management architecture for data movers

• Encourage adopting companies to work towards compliant software implementations and driver models.

• Educate and encourage adoption by OS, Hypervisors, OEMs, Applications and Data Acceleration vendors

Page 25: A New Path to Better Data Movement within System Memory ...

SDXI TWG Membership as of 4-14-2021

• Advanced Micro Devices (AMD)

• ARM

• Broadcom

• Dell Inc.

• Fujitsu America Inc.

• Futurewei Technologies, Inc

• Hewlett Packard Enterprise

• Huawei Technologies Co. Ltd

• IBM (includes Red Hat, Inc)

• Inspur Electronic Information Industry Co Ltd.

• MemVerge

• Micron

• Microsemi a Microchip Company

• Microsoft Corporation

• NetApp

• NGD Systems, Inc

• Samsung Electronics Co., LTD

• Scaleflux

• SK Hynix

• VMware

• Western Digital

• Xilinx, Inc

Come join the SDXI TWG!

Page 26: A New Path to Better Data Movement within System Memory ...

Links

1. How to get more involved?

• https://www.snia.org/sdxi

• Participate in public review of the draft specification (coming soon)

2. Need more details?

• SDC 2020 Conference presentation on SDXI

• https://www.youtube.com/watch?v=iv2GUfnxG-A

3. Questions?

• [email protected]

4. Acknowledgement

• SDXI TWG members: Rich Brunner, Philip NG, Glen Sescila, Don Dutile, Beau Beachamp, Jason Wohglemuth, Murali Ravirala, Frederick Knight, Alexandre Romana, Dwight Riley, Paul Hartke, Paul Von Stamwitz, James Leighton, Srinivas Gowda, Bill Martin & others

Page 27: A New Path to Better Data Movement within System Memory ...

Thank you

Please visit www.snia.org/pm-summit for presentations