Evolutionary Technology, Revolutionary Implications

Evolutionary Technology, Revolutionary Implications Persistent Memory Pankaj Mehra VP and Senior Fellow Mar 31, 2017 ©2016 Western Digital Corporation or affiliates. All rights reserved. Confidential.


Page 1

Evolutionary Technology, Revolutionary Implications

Persistent Memory
Pankaj Mehra
VP and Senior Fellow

Mar 31, 2017

©2016 Western Digital Corporation or affiliates. All rights reserved. Confidential.

Page 2

SanDisk Confidential - Office of CTO

We always overestimate the change that will occur in the next two years and underestimate the change that will occur in the next ten.

Don't let yourself be lulled into inaction.

Bill Gates

During Persistent Memory's first go round, I made the first mistake and my company made the second.

Page 3

Data Growth in Transactions & Analytics

Page 4

ESS Technology

Other Major Sources of Data

Machine Learning and Video Analytics turn video data into a data warehouse (license plates, brands, cats too)

Page 5

Logging, and not just transactions

TLOG ALOG ELOG


The root of all data collection

TRANSACTION LOGGING

Business Critical Tx in Operational Data Stores

Paid transactions ($0.10/tx) → Free transactions** ($0)

**Blockchain (FSI, pharma, …) for Distributed Ledger

APPLICATION LOGGING

SIEM (ArcSight), Kissmetrics (SaaS), and Google Analytics spur a wave of app logging

5 EB in MSFT Cosmos!

LOG EVERYTHING

The user is the product

Every read becomes a write

PBs/day pour in from phones, fixed cameras, cars (GM), travelers, …

Page 6

What is Persistent Memory?

Durable memory that is synchronously accessed but whose metadata are managed like file storage

• First public description: PMehra @ Ohio State University (Oct 10, ‘02)
• Mehra-Fineberg (‘04) showed that RDMA-attached persistent memory improves the performance, availability and scalability of OLTP
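A minimal sketch of the definition above, using an ordinary memory-mapped file as a stand-in for persistent memory. This is an assumption made for illustration: real persistent memory would typically be mapped through a DAX-capable filesystem and made durable with CPU cache flushes rather than a page write-back. The file name `pmem_demo.bin` and the helper names are hypothetical.

```python
import mmap
import os
import tempfile

REGION_SIZE = 4096  # one page; an arbitrary size for the demo


def create_region(path, size=REGION_SIZE):
    """Storage-like management: the region is a named, sized file."""
    with open(path, "wb") as f:
        f.truncate(size)


def store(path, offset, payload):
    """Memory-like access: write bytes in place, then flush before returning."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as region:
            region[offset:offset + len(payload)] = payload  # a plain store
            region.flush()  # ask the OS to write the dirty pages back


def load(path, offset, length):
    """Read the bytes back through a fresh mapping, as after a restart."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as region:
            return bytes(region[offset:offset + length])


demo_path = os.path.join(tempfile.mkdtemp(), "pmem_demo.bin")
create_region(demo_path)
store(demo_path, 128, b"commit:42")
recovered = load(demo_path, 128, 9)
```

The name and size of the region are handled the way a file would be, while the data path is a store into mapped memory followed by a synchronous flush.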

Page 7

2004: Persistent Memory-based Write Aside Buffer replicated byte-grain log writes in under 10 µsec

Page 8

2005 @ HP TechCon: we questioned decades-old conventional wisdom around write-ahead logging

a.k.a. write-behind logging, recently rediscovered in PelotonDB by Arulraj and Pavlo [accepted to appear @ VLDB’17]
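The contrast between the two logging orders can be sketched as follows. This is a toy model, not the PelotonDB implementation: the `ToyStore` class, its in-memory stand-ins for durable media, and the record layouts are all assumptions made for illustration. With write-ahead logging the after-images are appended to the log before the table is updated; with write-behind logging the change is persisted in place first and only a small commit marker is logged afterward.

```python
class ToyStore:
    """Stands in for a durable table plus a durable log (hypothetical)."""

    def __init__(self):
        self.table = {}      # durable, byte-addressable data
        self.log = []        # durable, append-only log
        self.log_bytes = 0   # payload bytes written to the log

    def append(self, record, payload_bytes):
        self.log.append(record)
        self.log_bytes += payload_bytes


def commit_write_ahead(store, txn_id, changes):
    """WAL: log the after-images first, then update the table."""
    for key, value in changes.items():
        store.append(("update", txn_id, key, value), len(key) + len(value))
    store.append(("commit", txn_id), 0)
    for key, value in changes.items():   # may happen lazily in a real engine
        store.table[key] = value


def commit_write_behind(store, txn_id, changes):
    """WBL: persist the changes in place, then log only a commit marker."""
    for key, value in changes.items():
        store.table[key] = value         # byte-grain persistent write
    store.append(("commit", txn_id), 0)


changes = {"acct:1": "balance=90", "acct:2": "balance=110"}
wal, wbl = ToyStore(), ToyStore()
commit_write_ahead(wal, 1, changes)
commit_write_behind(wbl, 1, changes)
```

Both stores end with the same table contents, but the write-behind log carries no data payload. A real engine also needs a way to ignore in-place changes from transactions that never logged a commit; the sketch omits that.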

Page 9

and because we knew that persistence and replication go hand-in-hand ...

Network | Read latency | Read bandwidth | Write latency | Write bandwidth
InfiniBand (4x, 12-port switch, HP rx5670, HP-UX) | 14.7 µs | 337 MB/sec | 9.9 µs | 337 MB/sec
ServerNet (NSK S86000 host, “Sequoia” PMU) | 14.5 µs | 26.5 MB/sec | 14.2 µs | 32.8 MB/sec

Communication-link-attached persistent memory device
US 20040148360 A1

ABSTRACT
A system is described that includes a network attached persistent memory unit. The system includes a processor node for initiating persistent memory operations (e.g., read/write). The processor unit references its address operations relative to a persistent memory virtual address space that corresponds to a persistent memory physical address space. A network interface is used to communicate with the persistent memory unit wherein the persistent memory unit has its own network interface. The processor node and the persistent memory unit communicate over a communication link such as a network (e.g., SAN). The persistent memory unit is configured to translate between the persistent memory virtual address space known to the processor nodes and a persistent memory physical address space known only to the persistent memory unit. In other embodiments, multiple address spaces are provided wherein the persistent memory unit provides translation from these spaces to a persistent memory physical address space.
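The translation step described in the abstract can be sketched as below. The class name, the page size, and the method names are hypothetical and are not taken from the patent; the sketch only illustrates a unit that keeps the virtual-to-physical mapping private while processor nodes address it by virtual address.

```python
PAGE_SIZE = 4096  # assumed translation granularity


class PersistentMemoryUnit:
    """Toy network-attached unit that owns the virtual-to-physical map."""

    def __init__(self, physical_pages):
        self._media = bytearray(physical_pages * PAGE_SIZE)
        self._page_map = {}                      # virtual page -> physical page
        self._free = list(range(physical_pages))

    def map_region(self, virtual_base, pages):
        """Back a page-aligned virtual range with free physical pages."""
        first = virtual_base // PAGE_SIZE
        for vpage in range(first, first + pages):
            self._page_map[vpage] = self._free.pop()

    def _translate(self, virtual_addr):
        vpage, offset = divmod(virtual_addr, PAGE_SIZE)
        return self._page_map[vpage] * PAGE_SIZE + offset  # KeyError if unmapped

    def write(self, virtual_addr, data):
        for i, byte in enumerate(data):          # byte-wise so writes may span pages
            self._media[self._translate(virtual_addr + i)] = byte

    def read(self, virtual_addr, length):
        return bytes(self._media[self._translate(virtual_addr + i)]
                     for i in range(length))


pmu = PersistentMemoryUnit(physical_pages=4)
pmu.map_region(virtual_base=0x10000, pages=2)
pmu.write(0x10000 + PAGE_SIZE - 2, b"span")      # crosses a page boundary
echo = pmu.read(0x10000 + PAGE_SIZE - 2, 4)
```

The caller never sees a physical address: two virtually contiguous pages may land on non-adjacent physical pages, and the unit can remap them without the processor node noticing.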

Page 10

Early Persistent Memory prototypes on Tandem NSK

• Lab prototype based on NonStop Sequoia I/O board
• Device firmware based on Fibre Channel HBA firmware
• Device attached to S-Series servers via MSEB or other ServerNet port (precursor to InfiniBand and RDMA)

Page 11

Persistent Memory was 95% software!

• 3 Software Components
  – Access
    • Library supported privileged NSK processes (such as DP2)
    • Fast, synchronous writes and reads, direct to device
    • Implemented pointer chasing, mirroring, …
  – Manager
    • Presented a named volume but hid device location
    • Implemented using standard process pairs
  – Device
    • Relatively simple hardware
      – sometimes entirely special-purpose software

… even so, the access path was 95% hardware!

Page 12

Network Persistent Memory Unit Operation

Page 13

Things to remember, Things to ponder

• A persistent memory filesystem should support
  – Memory-like access
  – Storage-like management
• Offer an integrated solution to replication and persistence
  – That places persistent bits outside the fault zone of the last CPU that wrote them
• Exploit byte-grain persistence deeply
  – Write-behind logging
  – Image-based service replication


• What’s new and truly different?
• What are the other significant developments since 2006?
• What will persistent memory do to the memory hierarchy this time around?

Page 14

Emerging memories

[Figure: memory technologies placed by access time (sec) and by volatile vs. non-volatile, across Storage, Storage-Class Memory, and Memory. Labeled technologies: HDD, NAND, 3D XPoint, (SNDK) ReRAM, CBRAM, PCM, Embedded NVM, STT-MRAM, DRAM, SRAM. The label "(Micron)" also appears in the figure.]

STT-MRAM will beat SRAM on cost while providing competitive performance and non-volatility, making it the memory of choice for IoT

3D XPoint will bifurcate into performance-optimized and cost-optimized flavors

NAND will continue dropping in cost/bit, widening the gulf for SCM to fill

Source: Chris Petti, SanDisk

Page 15

Memory-Storage Hierarchy

• Memory = Precious Resource
• But Huge Penalty leaving Memory
• Storage = Continuum of Memory Hierarchy
• Storage = Permanence & Capacity
• SCM (ReRAM) Changes Hierarchy
  • Permanent, Fast, Vast = 10s Tbytes per node
• Will be used as Memory and Storage

Order of latency, by tier:
• Processor
• Cache: 10^1 ns
• DRAM: 10^2 ns
• SCM (ReRAM): 10^3 ns (appears twice in the figure)
• Flash: 10^5 ns
• HDD: 10^7 ns

“Memory-like” L/S access, often 64-byte cache line
“Storage-like” block-based access

+10^4 ns, a 25,000-instruction gap: latency penalty for leaving the memory hierarchy
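As a rough worked example of the gap above: the slide pairs a 10^4 ns penalty with 25,000 instructions, which implies about 2.5 instructions per nanosecond. Assuming that rate holds across tiers (an assumption for illustration, not a figure from the slide), and reading the tier latencies from the slide's order-of-magnitude labels, the instruction-equivalent cost of one access at each tier is:

```python
GAP_NS = 10**4              # latency penalty from the slide
GAP_INSTRUCTIONS = 25_000   # instruction gap from the slide

# Implied execution rate; treated as constant for this estimate.
INSTRUCTIONS_PER_NS = GAP_INSTRUCTIONS / GAP_NS

# Order-of-magnitude latencies as read from the slide.
TIER_LATENCY_NS = {
    "Cache": 10**1,
    "DRAM": 10**2,
    "SCM (ReRAM)": 10**3,
    "Flash": 10**5,
    "HDD": 10**7,
}


def instructions_forgone(latency_ns):
    """Instructions the core could have retired while waiting."""
    return round(latency_ns * INSTRUCTIONS_PER_NS)


cost = {tier: instructions_forgone(ns) for tier, ns in TIER_LATENCY_NS.items()}
for tier, count in cost.items():
    print(f"{tier:12s} {count:>12,d} instructions per access")
```

Under these assumptions an SCM access costs on the order of a few thousand instructions, against hundreds of thousands for flash.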

Page 16

SW Stack: Promise of NV Main Memory
Freeing up CPU cycles for real work

I/O Stack: Touch → OS → File → Block → NVMe / VSL → Hardware → NAND (TBytes)
• 25k instructions + context swaps + cache pollution
• Transaction overhead: ~8 µs
• 100 µs media
• I/O latency & IOPs

NV Main Memory: Touch → mmap → ReRAM (10s of TBytes)
• 1 instruction, <<1 µs
• Clean cache
• <<1 µs media
• Massive datasets

Industry Simulations = ~50x Work / Server

Page 17

Persistent Memory: Assumptions & Projections

• Reduction in context switches
  – NVMe and other network learnings
  – Hardware-only access path to persistent memory
• Cost, Amount of persistent memory
  – PCIe CBDRAM bufs → NVDIMM-F → NVDIMM-N → SCM and NVDIMM-P
• Remote Persistent Memory
  – RDMA PMUs (2002) → GenZ (2016)
• Other key developments
  – Atomic writes to flash (2014), In-Place Update Engines (2015), Explicit placement of data (2015+)

Year columns: 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021

Rows with a value for every year:
• In-memory data (log10 B): 10, 10, 11, 11, 11, 12, 13, 13, 14, 15
• atomic-persistent-write / log-append ?: WAL, WAL, WAL, WAL, WBL, WBL, WBL, WBL, WBL, WBL

Rows with values for only some years (values in left-to-right order; year alignment not recoverable):
• Context switches (scale: 2 msec): 7, 3, 2, 1, 0
• Amount of persistent memory (log10 B): 3, 9, 9, 10, 12, 15
• Append to persistent log (ns): 100,000; 15,000; 5,000; 300
• Persist transactional data mutation: 5,000,000; 1,000,000; 60
• Append to persistent log [remote] (ns): 1,000,000; 19,000; 2,500; 900; 750; 400
• Cost of persistent memory: D+N, N, D+N, 0.5D, 0.35D

Page 18

Standards Based (Remote) LD-ST Persistent Memory

© 2016 Gen-Z

• Common protocol for many PHYs, topologies
• Splits memory controller for pipelining
• Light headers (better than RDMA)
• Achieves 90% peak BW at 64B message size!
• Rich set of x-ISA memory ops (atomic, flush)

NVDIMM-P – New device class shares DDR w/ DRAM

• New command overlaid on DDR4/5 pins
• Supports non-deterministic reads, OOO completion, etc.
• Data transferred per synchronous DRAM constraints so that DRAM and NVM can share the bus
• Intel DDR-T (to our best knowledge) is also an add-on protocol, like NVDIMM-P, but proprietary and integrated w/ Intel CPU/MC.

[Figure: NVDIMM-P Protocol Overview. Labels: Protocol; Transport; Memory Controller; DDR; DRAM DIMM; SCM DIMM; NVDIMM-P; DDR4, DDR5; "Protocol is separate command layer/spec"; "Non-determinism enabled w/ new commands on same bus".]

Source: Dave Landsman and WD Standards Team

Page 19

What Databases are Doing

Content credit: Arulraj VLDB paper

• Existing DB, NVM Aware Engine
• Transactions, NVM-Aware DB, WBL
• Analytics, flatter memory hierarchy

Page 20

What Applications are Doing

Content credit: McGuffy SPAA paper

Reduced-write, low-depth, work-efficient parallel algorithms for NVM

[Figure: CPU attached to a Small Memory (cost labels 1, 1) and a Large Memory (cost labels 1, ω).]
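A small sketch of why write counts matter under asymmetric costs. It assumes the cost model suggested by the figure labels, where a read from large memory costs 1 and a write costs ω. The two sorts are textbook sequential algorithms chosen for illustration, not the parallel algorithms from the cited paper, and the `CountingArray` wrapper is hypothetical.

```python
class CountingArray:
    """List wrapper that counts element reads and writes."""

    def __init__(self, values):
        self._data = list(values)
        self.reads = 0
        self.writes = 0

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        self.reads += 1
        return self._data[i]

    def __setitem__(self, i, value):
        self.writes += 1
        self._data[i] = value

    def cost(self, omega):
        """Total cost when a read costs 1 and a write costs omega."""
        return self.reads + omega * self.writes


def insertion_sort(a):
    for i in range(1, len(a)):
        item, j = a[i], i - 1
        while j >= 0 and a[j] > item:
            a[j + 1] = a[j]          # one write per shifted element
            j -= 1
        a[j + 1] = item


def selection_sort(a):
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):
            if a[j] < a[smallest]:
                smallest = j
        if smallest != i:            # at most two writes per position
            a[i], a[smallest] = a[smallest], a[i]


data = list(range(64, 0, -1))        # reversed input, worst case for shifting
ins, sel = CountingArray(data), CountingArray(data)
insertion_sort(ins)
selection_sort(sel)
```

Both sorts perform a quadratic number of reads on this input; what differs is the number of writes, which is the term ω multiplies in `cost`.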

Page 21

What compilers still do


• Lifting of loads is still extremely helpful

• Also helps vectorization / SIMD

• But pushing out stores?

• What about other compiler optimization strategies in the face of
  – Asymmetric write costs
  – Long latency of reads

• These things make a bigger difference to code (1000x, for instance) than many things we do here
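To make the question about stores concrete, here is a hand-applied version of the transformation in a toy setting. The `CountingCell` class is a hypothetical stand-in for a location in write-expensive memory. The second function keeps the running total in a local variable and stores once, which is the effect a compiler would achieve by sinking the store out of the loop when it can prove that is safe.

```python
class CountingCell:
    """One memory location that counts loads and stores."""

    def __init__(self, value=0):
        self._value = value
        self.loads = 0
        self.stores = 0

    def load(self):
        self.loads += 1
        return self._value

    def store(self, value):
        self.stores += 1
        self._value = value


def sum_store_every_iteration(values, cell):
    """Naive form: read-modify-write of the cell on every iteration."""
    for v in values:
        cell.store(cell.load() + v)


def sum_store_sunk(values, cell):
    """Transformed form: accumulate in a local, store once after the loop."""
    total = cell.load()
    for v in values:
        total += v
    cell.store(total)


values = list(range(1000))
naive, sunk = CountingCell(), CountingCell()
sum_store_every_iteration(values, naive)
sum_store_sunk(values, sunk)
```

The two forms leave the same final value but differ by a factor of 1000 in stores. They are equivalent only if nothing needs to observe, or recover, the intermediate values, which is exactly where persistence complicates the optimization.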


Page 22

Persistent Memory: Assumptions and Projections

Columns: Category | Where We Are | Assumptions (“How we got here”) | Projections (“Where are we going?”)

File Systems
• Where we are: Pmem.io; named extents
• How we got here: NVMFS/ACM support
• Where we are going: Scale-out filesystems with persistent named objects that can be discovered and memory mapped very efficiently

Memory Managers
• Where we are: Persistent pointers; pipelined fences, writebacks
• How we got here: Mostly mapping and checkpointing bolted on to conventional volatile memory management and traditional file systems
• Where we are going: In-place update semantic displaces I/O semantic in applications, not just in databases; judicious placement of data structures and fields between DRAM and SCM gives way to better optimizers

End-to-end low latency and high throughput paths (local)
• Where we are: DDR3/4 NVDIMM-N
• How we got here: NVDIMM-F (memory channel storage); PCIe BBDRAM, CBDRAM (I/O channel memory)
• Where we are going: DDR4 NVDIMM-P; DDR5 NVDIMM-P

End-to-end low latency and high throughput paths (remote)
• Where we are: RDMA to NVDIMM or PCIe BARs
• How we got here: NPMU/PMM
• Where we are going: GenZ, PMOF; Memory Nodes

Page 23

SCM disrupts high-perf NAND and big-memory DRAM across enterprise segments

Columns: Hyperscale Server | Enterprise Server | Enterprise Storage

DRAM
• Containers
• In-memory database
• Active data (GB tier)

• VMs
• In-memory database
• Active data (GB tier)

• I/O buffers
• Recovery buffers

Storage Class Memory
• In-memory pub-sub
• Indexes

• Distributed logging
• Real-time analytics (TB tier)

• Mmap’ed files (TB tier)
• Swap/backing store

• Tmp data (VDI, analytics)
• Transaction logging

• Fabric buffers
• Metadata logs

• Hot data

NAND
• Data Warehouse
• Data Lakes (PB tier)
• Batch analytics

• Indexes
• Filesystems
• Object store

• Staging/Buffering

• Metadata & Index

HDDs
• Object storage

• Cold Tier (EB tier)

• PB (Copy data) tier
• PB (Archival) tier

Page 24

Converged Memory-Storage Markets (2021)

Columns: HPC | Hyperscale Server | Enterprise Server | Enterprise Storage, Converged

Compute Tier
• HPC: High DWPD SSDs for Burst Buffer (~5 PB/20K)
• Hyperscale Server: In-memory computing & in-memory caching (n TB SCM)
• Enterprise Server: IMDB for analytics, SDM (TBs of SCM)
• Enterprise Storage, Converged: SDS$ (TBs high-DWPD SSD); RAID$, WB in SCM (SW optimization) (TB)

Archive Tier
• HPC: HDD → Capacity Flash (EB)
• Hyperscale Server: HDD →? Capacity Flash (m EB)
• Enterprise Server: Active Backup, CDM (n PB of 90-10 HDD-SSD)
• Enterprise Storage, Converged: High Capacity SSDs (PB AFAs and HCI)

Page 25

Memory Centric Computing

©2017 Western Digital Corporation or its affiliates. All rights reserved.

Shipping computation to the data

Works best when simple expressions computed against large number of data records

Power: Reduction in data movement count and distance

Performance: Parallelism, Bandwidth, and Latency

Cost: Low gate count embedded cores with future open ISA and tools
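A back-of-the-envelope sketch of when shipping computation to the data pays off, under stated assumptions: fixed-size records, a simple threshold predicate, and a hypothetical `near_data_filter` that runs where the records live so that only matches cross the link. None of these names or the record layout come from the slides.

```python
import struct

RECORD = struct.Struct("<IH")   # assumed layout: 4-byte id, 2-byte reading


def make_records(count):
    """Build `count` packed records with readings cycling through 0..999."""
    return [RECORD.pack(i, i % 1000) for i in range(count)]


def host_side_filter(records, threshold):
    """Ship every record to the host, then evaluate the predicate there."""
    bytes_moved = sum(len(r) for r in records)
    hits = [r for r in records if RECORD.unpack(r)[1] >= threshold]
    return hits, bytes_moved


def near_data_filter(records, threshold):
    """Evaluate the predicate next to the data; ship only the matches."""
    hits = [r for r in records if RECORD.unpack(r)[1] >= threshold]
    bytes_moved = sum(len(r) for r in hits)
    return hits, bytes_moved


records = make_records(100_000)
host_hits, host_bytes = host_side_filter(records, threshold=990)
near_hits, near_bytes = near_data_filter(records, threshold=990)
```

With a predicate that keeps 1% of the records, the near-data version moves 1% of the bytes while returning the same result. The benefit shrinks as the predicate becomes less selective or the expression becomes too complex for a small embedded core.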

[Figure labels: CPU, Near Memory, Far Memory, Far Compute, Data, Near Compute]

Page 26

Will memory disaggregate?


Need efficient memory-semantic fabrics, and major software shift