Mainframe Day 2017 Next Generation Memory for BS2000 as well · Title: BS2000 and PMEM Author:...
0 FUJITSU INTERNAL Copyright 2017 FUJITSU
Mainframe Day 2017 Next Generation Memory – for BS2000 as well
Fujitsu Distinguished Engineer
CTO Enterprise Platform Services
2017-01-25 v4
PMEM – Next Generation Memory
= Optane PCIe/NVMe
Checking the Marketing numbers

Module      Intel latency   Latency     Intel size   Capacity    Atomic
            factor ~                    factor ~                 granularity
SRAM        1               2-3 ns      1            ~ 60 MB     64 B
DRAM        10              20-35 ns    100          ~ 64 GB     64 B
3D-XPoint   100             ~ 250 ns    1,000        ~ 512 GB    64 B
NAND        100,000         ~ 80 µs     1,000        ~ 16 TB     4096 B
HDD         10,000,000      ~ 5 ms      10,000       ~ 6 TB      512 / 4096 B
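
The quoted latency factors can be checked against the latency column with a few divisions. The sketch below is only an illustration: it assumes the midpoint of each range on the slide (2.5 ns for SRAM, 27.5 ns for DRAM), and the names `tier`, `tiers` and `measured_factor` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the table: latency in nanoseconds (midpoint where the slide
 * gives a range, which is an assumption) and the factor quoted from Intel. */
struct tier {
    const char *name;
    double latency_ns;
    double intel_factor;
};

static const struct tier tiers[] = {
    { "SRAM",      2.5,       1.0        },
    { "DRAM",      27.5,      10.0       },
    { "3D-XPoint", 250.0,     100.0      },
    { "NAND",      80000.0,   100000.0   },  /* ~ 80 us */
    { "HDD",       5000000.0, 10000000.0 },  /* ~ 5 ms  */
};

enum { TIER_COUNT = sizeof tiers / sizeof tiers[0] };

/* Latency of tier i expressed as a multiple of the SRAM latency. */
double measured_factor(size_t i)
{
    assert(i < TIER_COUNT);
    return tiers[i].latency_ns / tiers[0].latency_ns;
}
```

With these inputs the ratios come out as 1, 11, 100, 32,000 and 2,000,000. The first three agree with the quoted factors; for NAND and HDD the quoted factors are roughly 3 and 5 times larger than the ratio derived from the latency column.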
Latency translated into distance
[Figure: the memory tiers drawn at distances proportional to their latency. Distance labels: 1 cm, 10 cm, 100 cm, 1000 cm, 1 km, 100 km]
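
The distance labels appear to follow from the latency factors of the previous slide if a factor of 1 is drawn as 1 cm. That scale is an assumption read off the labels, not something the slide states; the helper names below are hypothetical.

```c
#include <assert.h>

/* Assumption: a latency factor of 1 (SRAM) is drawn as 1 cm. */
#define CM_PER_FACTOR 1.0

/* Distance in centimetres for a given latency factor. */
double factor_to_cm(double latency_factor)
{
    assert(latency_factor > 0.0);
    return latency_factor * CM_PER_FACTOR;
}

/* The same distance in kilometres (1 km = 100,000 cm). */
double factor_to_km(double latency_factor)
{
    return factor_to_cm(latency_factor) / 100000.0;
}
```

Under this scale the factors 1, 10 and 100 give 1 cm, 10 cm and 100 cm, the NAND factor of 100,000 gives 1 km, and the HDD factor of 10,000,000 gives 100 km.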
What does this mean?
NVMe Block Interface NVM-Libraries & Drivers
3D-XPoint DIMM Software Architecture
[Figure labels: I/O with OS buffer cache; I/O with CPU Lx cache; 3D-XPoint DIMMs]
The Data Path
[Figure: four cores, each with two L1 caches and an L2 cache, share one L3 cache; two memory controllers, each with two NV-DIMM / PMEM modules. Instruction shown: MOV]
The Data Path
[Figure: four cores, each with two L1 caches and an L2 cache, share one L3 cache; two memory controllers, each with two NV-DIMM / PMEM modules. Instructions shown: MOV, CLFLUSH, CLFLUSHOPT, CLWB, PCOMMIT]
The Data Path
[Figure: four cores, each with two L1 caches and an L2 cache, share one L3 cache; two memory controllers, each with two NV-DIMM / PMEM modules. Instructions shown: MOV, CLFLUSH, CLFLUSHOPT, CLWB]
ADR = flush the WPQ (write pending queue) automatically on power-fail or shutdown
Example Code
MOV X1, 10
MOV X2, 20            ; X1 and X2 are in PMEM
…
MOV R1, X1            ; stores to X1 and X2 are globally visible,
                      ; but may not be persistent
…
CLFLUSHOPT X1
CLFLUSHOPT X2         ; X1 and X2 are moved from the caches to memory
…
SFENCE
PCOMMIT               ; a following SFENCE ensures PCOMMIT has completed;
                      ; with ADR, PCOMMIT is not needed
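
The store, flush and fence sequence can be sketched in C with compiler intrinsics. This is a minimal illustration under several assumptions: ordinary memory stands in for PMEM, `CLFLUSH` (`_mm_clflush`) is used in place of `CLFLUSHOPT` or `CLWB` because it is available on every x86-64 CPU, and no `PCOMMIT` is issued since the deck relies on ADR. The names `persist_range` and `store_and_persist` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <emmintrin.h>              /* _mm_clflush, _mm_sfence */
#define FLUSH_LINE(p) _mm_clflush(p)
#define STORE_FENCE() _mm_sfence()
#else
/* Fallback so the sketch compiles elsewhere; it flushes nothing. */
#define FLUSH_LINE(p) ((void)(p))
#define STORE_FENCE() __sync_synchronize()
#endif

#define CACHE_LINE 64u              /* atomic granularity quoted in the deck */

/* Flush every cache line overlapping [addr, addr + len), then fence.
 * Returns the number of cache lines flushed. */
size_t persist_range(const void *addr, size_t len)
{
    assert(addr != NULL);
    if (len == 0)
        return 0;
    uintptr_t start = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines = 0;
    for (uintptr_t p = start; p < end; p += CACHE_LINE) {
        FLUSH_LINE((const void *)p);
        lines++;
    }
    /* The SFENCE step. It is required after CLFLUSHOPT or CLWB and is
     * harmless after CLFLUSH. */
    STORE_FENCE();
    return lines;
}

/* X1 and X2 from the slide, each on its own cache line. */
static _Alignas(64) int X1;
static _Alignas(64) int X2;

void store_and_persist(void)
{
    X1 = 10;                        /* MOV X1, 10 */
    X2 = 20;                        /* MOV X2, 20 */
    persist_range(&X1, sizeof X1);  /* flush X1, then fence */
    persist_range(&X2, sizeof X2);  /* flush X2, then fence */
}
```

On real persistent memory a library routine such as `pmem_persist` from libpmem would normally take the place of a hand-written loop like this.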
What does this mean?
NVM Libraries (optional)
original libart tree init routine
int art_tree_init(art_tree *t) {
    t->root = NULL;
    t->size = 0;
    return 0;
}
libart tree init routine … ported to PMEM
int art_tree_init(PMEMobjpool *pop, int *newpool)
{
    int errors = 0;
    TOID(struct art_tree_root) root;

    if (pop == NULL) { errors++; }
    if (!errors) {
        TX_BEGIN(pop) {
            /* fetch the root object of the pool */
            root = POBJ_ROOT(pop, struct art_tree_root);
            if (*newpool) {
                TX_ADD(root);   /* snapshot root before modifying it */
                D_RW(root)->root.oid = OID_NULL;
                D_RW(root)->size = 0;
                *newpool = 0;
            }
        } TX_END
    }
    return errors;
}
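
What `TX_ADD` contributes can be pictured with a toy undo log: the old bytes are saved before the object is changed, so an abort (or recovery after a crash) can put them back. The code below is a hypothetical model written for this note, not libpmemobj code; the real library keeps its log in the pool and flushes it, which this sketch leaves out. All `toy_` names and `struct undo_log` are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UNDO_MAX_BYTES 64

/* Toy undo log for a single object (hypothetical, not libpmemobj). */
struct undo_log {
    void *target;                         /* object that was snapshotted */
    size_t len;                           /* number of bytes saved */
    unsigned char saved[UNDO_MAX_BYTES];  /* the old contents */
    int active;                           /* 1 while a transaction is open */
};

/* Like TX_ADD: save the current bytes before the caller modifies them. */
void toy_tx_add(struct undo_log *log, void *obj, size_t len)
{
    assert(len <= UNDO_MAX_BYTES);
    memcpy(log->saved, obj, len);
    log->target = obj;
    log->len = len;
    log->active = 1;
}

/* Commit: the new contents stay and the snapshot is discarded. */
void toy_tx_commit(struct undo_log *log)
{
    log->active = 0;
}

/* Abort: copy the snapshot back over the object. */
void toy_tx_abort(struct undo_log *log)
{
    if (log->active) {
        memcpy(log->target, log->saved, log->len);
        log->active = 0;
    }
}

/* Stand-in for struct art_tree_root. */
struct toy_root {
    void *root;
    unsigned long size;
};
```

Calling `toy_tx_add` on a `struct toy_root`, changing `size`, and then calling `toy_tx_abort` restores the old value; after `toy_tx_commit` a later abort changes nothing.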
original libart art_insert routine
void*
art_insert(art_tree *t, const unsigned char *key, int key_len, void *value)
{
    int old_val = 0;
    void *old = recursive_insert(t->root, &t->root, key, key_len, value, 0, &old_val);
    if (!old_val) t->size++;
    return old;
}
libart art_insert routine … ported to PMEM
TOID(var_string)
art_insert(PMEMobjpool *pop, const unsigned char *key, int key_len, void *value, int val_len)
{
    int old_val = 0;
    TOID(var_string) old;
    TOID(struct art_tree_root) root;

    TX_BEGIN(pop) {
        root = POBJ_ROOT(pop, struct art_tree_root);
        TX_ADD(root);   /* snapshot root: its root pointer and size may change */
        old = recursive_insert(pop, D_RO(root)->root, &(D_RW(root)->root),
                               (const unsigned char *)key, key_len,
                               value, val_len, 0, &old_val);
        if (!old_val) D_RW(root)->size++;
    } TX_ONABORT {
        abort();        /* no recovery path: terminate if the transaction fails */
    } TX_END
    return old;
}
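
The port replaces raw pointers with `TOID` handles because a pool may be mapped at a different address on every run, so a stored virtual address would not survive a restart. A libpmemobj object ID carries an offset into the pool, and `D_RO` / `D_RW` turn it into a direct pointer. The sketch below only models that idea: `toy_oid`, `toy_direct` and `toy_oid_of` are hypothetical names, and the pool UUID that the real ID also carries is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a persistent object ID: an offset from the
 * start of the pool instead of a virtual address. */
typedef struct {
    uint64_t off;
} toy_oid;

/* Offset 0 plays the role of OID_NULL. */
static const toy_oid TOY_OID_NULL = { 0 };

/* Resolve an ID against the address where the pool is mapped right now. */
void *toy_direct(void *pool_base, toy_oid oid)
{
    if (oid.off == 0)
        return NULL;
    return (unsigned char *)pool_base + oid.off;
}

/* Turn a direct pointer inside the pool back into an ID. */
toy_oid toy_oid_of(void *pool_base, void *ptr)
{
    toy_oid oid;
    assert((unsigned char *)ptr > (unsigned char *)pool_base);
    oid.off = (uint64_t)((unsigned char *)ptr - (unsigned char *)pool_base);
    return oid;
}
```

The same `toy_oid` resolves correctly whether the pool is mapped at one base address or another, which is the property a raw `void *` stored in the tree would lack.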
My own experience - Summary
To get the maximum value out of it, explicit changes in the
application are necessary, but they pay back (factor ~1000)
Debug support is missing
Architecture check before adapting an app
  Does it have the right structure?
  Where are adaptations necessary?
Need optimized platform interconnects to create HA storage
Still room for optimization in the end-to-end software stack
Next Steps
Switch from SEP to a real prototype platform with Skylake-SP
(RX2540-M4) and AEP in early 2017
Use Intel Parallel Studio to analyze and debug PMEM
(1) App-Direct
Optimize NVM-Libs: propose measures to reduce the overhead in the
transactional logic to achieve optimized software storage access methods.
(2) Memory
Use Memkind-Lib: look for algorithms for a new HMM (Hierarchical Memory
Management)
(3) Remote access to PMEM in App-Direct & Memory mode
BS2000 – a possible future outlook
Please contact us via email on 1st page if you are interested in more details
ESA/390
BS2000 functional layer /390
[Figure: layered block diagram with the layers TU, TPR, SIH and HW. Block labels:]
User Applications
Commands, Job control, SYSFILE Mgmt, SPOOL & RSO
Data transmission access (TIAM, DCAM, UTM)
Programs and Macros
Catalog Mgmt, Device Mgmt, Media Mgmt, Access Methods
Dynamic & Static loader, Accounting, Test helps, Sub-System Mgmt, Logging
Address Mgmt, Memory Mgmt, Paging
Task control, Process Mgmt, Reconfiguration
Transport services for remote data transmission (BCAM)
Physical I/O
Start I/O Interrupt handler
Task initiator
Assign CPU to task
CPU Interrupts Function, Control Register
SVC = System Calls; Sub-Systems
BS2000 functional layer x86
[Figure: layered block diagram with the layers TU, TPR, SIH and HW, plus a block labelled "/390 CPU Emulation & JIT". Block labels:]
User Applications
Commands, Job control, SYSFILE Mgmt, SPOOL & RSO
Data transmission access (TIAM, DCAM, UTM)
Programs and Macros
Catalog Mgmt, Device Mgmt, Media Mgmt, Access Methods
Dynamic & Static loader, Accounting, Test helps, Sub-System Mgmt, Logging
Address Mgmt, Memory Mgmt, Paging
Task control, Process Mgmt, Reconfiguration
Transport services for remote data transmission (BCAM)
Physical I/O
Start I/O Interrupt handler
Task initiator
Assign CPU to task
CPU Interrupts Function, Control Register
/390 CPU Emulation & JIT
SVC = System Calls; Sub-Systems
BS2000 and VME kickin’ & alive OS development in Europe
Use of latest HW and SW technologies
Fascinating tasks and exciting missions
Support young researchers with traineeships and master / PhD thesis