REVISITING THE MEMORY HIERARCHY FOR TOMORROW COMPUTING SYSTEMS PPt/… · Caches Data moves along...
Transcript of REVISITING THE MEMORY HIERARCHY FOR TOMORROW COMPUTING SYSTEMS PPt/… · Caches Data moves along...
Leti Devices Workshop | Elisa Vianello | December 4, 2016
REVISITING THE MEMORY HIERARCHY FOR TOMORROW COMPUTING SYSTEMS
| 2Leti Devices Workshop | ElisaVianello | December 4, 2016
OUTLINE
• Ever Increasing Need for More Memory
• Rethinking the Memory Hierarchy with New NV Memory Technologies
• Rethinking the System Architecture?
| 3
0
M. Hilbert et al., Vol. 332, SCIENCE, 2011
In billions of GB
Analog
Digital
2002 turn-point between analog vs digital format
The amount of data being created is growing 40% a year into the next decade
IoT
Mobile
Server
EVER INCREASING NEED FOR MORE MEMORY
Leti Devices Workshop | ElisaVianello | December 4, 2016
By 2020 the digital universe is expected to contain nearly as many digital bits as there are Stars in the Universe…
| 4
Core count doubling ~ every 2 yearsDRAM capacity doubling ~ every 3 years
New memory technologies and rethinking of system ar chitecture focused on data storage and management are needed!
Compute costs dropping faster than memory costs
Source: Liam et al ISCA 2009Source: IBM Deep Computing
EVER INCREASING NEED FOR MORE MEMORY
logic
memory
Leti Devices Workshop | ElisaVianello | December 4, 2016
The growth in data produced is outpacing the improv ements in the density and cost of storage technologies
| 5
CONVENTIONAL TYPES OF MEMORIES AND TECHNOLOGIES
MAGNETIC
NAND Flash
HD
D
SD
D
DR
AM
CP
UMIM CAP
1T 1R
Register filesL1-L4 SRAM
Storage
Cos
t/bit
& S
peed
Memory volume
Latency gap
Main Memory
RegistersCaches
Dat
a m
oves
alo
ng a
ll le
vels
of s
tora
ge
hier
arch
y be
fore
and
afte
r be
ing
proc
esse
d at
the
proc
esso
r
Can new NVMs (RRAM, PCM, MRAM…) help to reduce the l atency gap and limit data movements across the Memory Hier archy?
Processor
Leti Devices Workshop | ElisaVianello | December 4, 2016
| 6
Cycle Number Nc [#]
MRAM
101 103 105 107 109 1011100
102
104
106
108
CBRAM OxRAM PCRAM
GST-Sb
Ta2O5/TaOxTa2O5/TaOx
GST-N
GST-O
Cu-GST/CuO/SiO2/BE
WOx
CNTTaON
TaN/GeTe/Cu
Ti/TaSiO/RuCu/PSE/Ru
Pro
gram
min
g W
indo
w R
off/R
on [#
]
Endurance Nc [#]
Cu-GST/CuO/SiO2/BE
Cu/Ta2O5/Pt
Ag/GeS2/HfO
2
Golden
GaSbGe
RRAM MEMORIES: PROGRAMMING WINDOW VS. ENDURANCE
• CBRAM largest Roff/Ron chalco-based and/or bilayers; best endurance oxide-based
• OxRAM largest Roff/Ron non-polar; best endurance bipolar
• PCRAM best endurance GST-based
• MRAM outlier..
Universal Memory does Not Exist!4.5 paper on Monday afternoon on RRAM Endurance,
Retention and Window Margin Trade-offLeti Devices Workshop | ElisaVianello | December 4, 2016
E. Vianello IEDM 2014L. Perniola IMW 2016
| 7
STORAGE CLASS MEMORIES
MAGNETIC
NAND Flash
HD
D
SD
D
DR
AM
CP
U
MIM CAP1T 1R
Register filesL1-L4 SRAM
PCM/RRAM
SC
M
Latency of access
<0.001µs registers<0.01µs cache
<0.03µs
>100µs
>103µs
0.05µs-100µs?processing @ SCM: from data compression to query
processing
com
putin
g
Leti Devices Workshop | ElisaVianello | December 4, 2016
Requirements:Fast access speed approaching DRAM
Nonvolatile retain data at power offHigh endurance (program/erase cycles)
Solid state (no moving parts)Low cost/bit (approaching HDD)
| 8
« TRANSFORMATIONAL » ABILITY OF PCM
0 4 8 12 16 2010-5
10-4
10-3
10-2
10-1
SETRESET
DE
FE
CT
DE
NS
ITY
CURRENT [µµµµA]
12 Mb 90 nm test chip
H. Y. Cheng et al., IEDM 2012G. Navarro et al., IEDM 2013H.Y. Cheng et al., JAP 2014P. Zuliani et al., SSE 2015V. Sousa et al., VLSI 2015
101 103 105 107 109104
105
106
107
SET
RESET
RE
SIT
AN
CE
[ΩΩ ΩΩ
]
CYCLES
2h bake at 230°C
RESET 30ns - SET 800ns
Leti Devices Workshop | ElisaVianello | December 4, 2016
| 9
HYBRID MAIN MEMORY
MAGNETIC
NAND Flash
HD
D
SD
D
DR
AM
CP
U
Hybrid RRAM/PCM
Register filesL1-L4 SRAM
PCM/RRAM
SC
MHigh capacity working memory
Hybrid memory: DRAM as a cache to RRAM or PCM tech
to achieve the best of multiple technologies
Leti Devices Workshop | ElisaVianello | December 4, 2016
• non-volatile• high-density• low-cost
• fast• durable
| 10
LRS and HRS on 4kbits arrayMAD Leti OxRAM
4.7 paper on Monday afternoon on Filament-based RR AM
Ti/HfO 2 BASED-OxRAM
<100ns programming time at 1V with up to 100M cycle s
E. Vianello IEDM 2014A. Benoist IRPS 2014 ST/Leti
Leti Devices Workshop | ElisaVianello | December 4, 2016
28nm CMOS
| 11
L2-L3 CACHE
MAGNETIC
NAND Flash
HD
D
SD
D
DR
AM
CP
U
Hybrid PCM/MRAM
Cache L2 L3
PCM/RRAM
SC
MReplacing SRAM (L2-L3)
best candidate is STT-MRAM (fast and high endurance)
with STT-MRAM
Leti Devices Workshop | ElisaVianello | December 4, 2016
Reduction of stand-by power consumption
4Mbit STT MRAM
Read 3.3ns @ 1.25V
65nm CMOS
[Noguchi, Toshiba, ISSCC 2016]
| 12
REPLACING SRAM WITH STT-MRAM
MAD Leti pSTT-MRAM
27.3 paper on Wednesday morning on Data Retention Extraction Methodology pSTT-MRAM
Leti Devices Workshop | ElisaVianello | December 4, 2016
STT-MRAM demonstrated fast writing speed and high e ndurance, however retention still to be fully characterized
different retention extraction methods are compared
a trade off exists between thermal stability and programming speed
4kbit array
| 13
RETHINKING THE SYSTEM ARCHITECTURE?
Neuromorphic systems• Detect and predict patterns in complex data• visual or auditory data analysis
Leti Devices Workshop | ElisaVianello | December 4, 2016
PCMGST , GeTe, GST/HfO2
CBRAMGeS2/Ag
OxRAMHfO2/Ti (planar + VRAM)
RRAM has been promoted by Leti (and many others!) to emulate synaptic plasticity: the ability of synapse s to
strengthen or weaken over time
| 14
RRAM TO IMPLEMENT ON-LINE LEARNINGIn nature synapses have a volatile component, is
it useful for the learning process?
RRAM data
Time [s]
Con
duct
ance
[S]
biological data
decoding of neural signals
Highly noisy neurological data
ANN with RRAM based synapses
synaptic change has a volatile component
The volatile component allows to improve detection in highly-noisy input data
16.6 paper on Tuesday morning on Short and Long-te rm Synaptic Plasticity Using OxRAM
Leti Devices Workshop | ElisaVianello | December 4, 2016
| 15
The data explosion is leading to a corresponding gr owth in data centric applications (capture, classify, archi ve…). The
adoption of new NVMs enable a rethinking of system architecture-based on data storage and management.
However universal memory does not exist, different NVM technologies have to be introduced in the storage h ierarchy
Challenges:
Design matched to the specific memory technology fe atures
Non volatility does non come for free
• static power vs. active power• memory window vs. endurance• memory window vs. data retention.
CONCLUSION
Leti Devices Workshop | ElisaVianello | December 4, 2016
| 16
Thanks to the new NVMs technologies (PCM, RRAM, MRA M) the persistent memory is getting closer to the compute center
avoiding wasting energy in the movement
CONCLUSION
Leti Devices Workshop | ElisaVianello | December 4, 2016
Leti, technology research instituteCommissariat à l’énergie atomique et aux énergies alternativesMinatec Campus | 17 rue des Martyrs | 38054 Grenoble Cedex | Francewww.leti.fr
| 18
EMBEDDED SYSTEMS: TOWARD DISTRIBUTED MEMORY
MCU (Stand-by)
Sensor
Radio
MCU (active)
Source: Renesas ASP-DAC 2014
MCU power challenge: need for Ultra-Low Stand-by Power design solution
NV Flip-Flop[N. Jovanovi ć ISCAS 2016]
NVM merges with SRAM and logicshutdown SRAM & registers thanks to
distributed NVM
28nm FDSOI + HfO 2 based OxRAM
eFla
sh(N
VM
) logic
nvSRAM
NV logic
SRAM
Leti Devices Workshop | ElisaVianello | December 4, 2016
• thermal stability for smart card & automotive
• easy integration with advance CMOS