Future Directions in Neuromorphic Computing
Steve Furber, ICL Professor of Computer Engineering
The University of Manchester
1
200 years ago…
• Ada Lovelace, b. 10 Dec. 1815
“I have my hopes, and very distinct ones too, of one day getting cerebral phenomena such that I can put them into mathematical equations--in short, a law or laws for the mutual actions of the molecules of brain. … I hope to bequeath to the generations a calculus of the nervous system.”
2
65 years ago…
3
ConvNets - structure
• Dense convolution kernels
• Abstract neurons
• Only feed-forward connections
• Trained through backpropagation
4
The cortex - structure
• Spiking neurons
• Two-dimensional structure
• Sparse connectivity
Feed-forward input
Feedback input
Feed-forward output
Feedback output
5
ConvNets - GPUs
• Dense matrix multiplications
• 3.2 kW
• Low precision
6
Cortical models - Supercomputers
• Sparse matrix operations
• Efficient communication of spikes
• 2.3 MW
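The contrast between the dense matrix multiplications of the ConvNet slides and the sparse, spike-driven operations listed here can be illustrated with a toy sketch. It is plain Python, not code from any simulator or machine mentioned in the talk; the tiny network, its weights, and the function names `dense_step` and `event_driven_step` are assumptions made for illustration only.

```python
# Toy comparison: dense matrix-vector update vs. event-driven spike delivery.
# All names and numbers here are illustrative assumptions.

def dense_step(weights, activity):
    """Dense update: every weight is visited, whether or not its source is active."""
    n_post = len(weights[0])
    out = [0.0] * n_post
    ops = 0
    for pre, row in enumerate(weights):
        for post, w in enumerate(row):
            out[post] += w * activity[pre]
            ops += 1
    return out, ops


def event_driven_step(synapses, spikes):
    """Event-driven update: only synapses of neurons that spiked are visited."""
    out = [0.0] * synapses["n_post"]
    ops = 0
    for pre in spikes:
        for post, w in synapses["targets"].get(pre, []):
            out[post] += w
            ops += 1
    return out, ops


# Four source neurons, three targets, sparse connectivity.
targets = {0: [(1, 0.5)], 1: [(0, 0.2), (2, 0.3)], 3: [(2, 0.7)]}
synapses = {"n_post": 3, "targets": targets}

# The same connectivity written as a dense 4x3 matrix (zeros for absent synapses).
weights = [[0.0] * 3 for _ in range(4)]
for pre, conns in targets.items():
    for post, w in conns:
        weights[pre][post] = w

spikes = [1]  # only neuron 1 fires in this time step
activity = [1.0 if i in spikes else 0.0 for i in range(4)]

dense_out, dense_ops = dense_step(weights, activity)
event_out, event_ops = event_driven_step(synapses, spikes)
```

Both functions produce the same input to the target neurons, but the dense version touches all 12 matrix entries while the event-driven version touches only the 2 synapses of the neuron that fired. The gap widens as connectivity and activity become sparser.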
7
Cortical models - Neuromorphic hardware
• Memory local to computation
• Low-power
• Real time
• 62 mW
8
http://www.technologyreview.com/featuredstory/526506/neuromorphic-chips/
9
https://agenda.weforum.org/2015/03/top-10-emerging-technologies-of-2015-2/
10
REPORT TO THE PRESIDENT
Ensuring Long-Term U.S.
Leadership in Semiconductors
Executive Office of the President
President’s Council of Advisors on
Science and Technology
January 2017
11
Table A1. Selected component technology vectors that have a high probability of deployment in ten years
(* denotes more speculative deployment within this timeframe)
Component technology vector | Time-frame to first commercial products | Approach to achieving and retaining competitive advantage
Neuromorphic Computing | Available now | Continued R&D into new architectures coupled with 3D technologies and new materials, Deep Learning accelerators (for mobile and data center applications), and applications for true brain-inspired computing
Photonics | Available now | Foundries for tools and materials R&D; integrate photonics with CMOS and other materials
Sensors | Available now | Foundries for tools and materials R&D; integrate new types/classes of sensors with CMOS and other materials
CMOS (sub 7nm node size or new 3D structures)* | Advances in thermal management available with new process nodes | Deep understanding of transistor physics and chipset architecture and related design know-how; foundries and labs for transistor and materials R&D
Magnetics | 1-2 years (MRAM as eFlash), 3 years (as DRAM), 5-7 years (as SRAM) | Foundries for tools and materials R&D; integrate magnetics with CMOS and other materials
3D | 2-3 years (wafer-to-wafer stacking), 4-5 years (die-to-wafer stacking), 5-7 (Monolithic 3D) | Deep understanding of applications space and benefits associated use of 3D technologies and design know-how; foundries for tools and materials R&D; design automation tool R&D
Data-flow based architectures | 3-4 years | Continued architecture R&D, coupled with materials, integration, and manufacturing; build an ecosystem for solutions using data-flow based architectures
Ultra-high performance wireless systems | 3 years (5G), 10-12 years (6G) | Continued R&D in new materials and processes, antenna design advances, chipset manufacturing, and integration
Advanced non-volatile memory as SRAM | 5+ years | Deep understanding of applications space and chipset architectures
Carbon nanotubes and phase change materials* | 5-7 years | Foundries/labs for materials R&D for hardware architectures; chipset designs to leverage these technologies
Biotech/human health | 5-10 years | R&D towards low power, highly integrated, high performance processing, high-data rate communications, wireless charging; couple R&D with clinical research to create, build, and evaluate on new materials and interfaces
Quantum Computing | < 10 years | Pre-competitive R&D labs for new materials; foundries for new materials and hardware architectures; tools for quantum algorithms and software programming with various architectural paradigms
Point-of-Use Nanoscale 3D printing | Available now | Desktop fab capabilities for rapid prototyping, additive manufacturing, moving beyond silicon and interfacing with soft matter, and small batch production
DNA for compute and storage* | 10+ years | Multi-disciplinary basic research in efficiently and reliably reading and writing and retrieving DNA strands
11
Start-ups and industry interest
SpiNNaker project
• A million mobile phone processors in one computer
• Able to model about 1% of the human brain…
• …or 10 mice!
13
SpiNNaker system
14
SpiNNaker chip
Multi-chip packaging by UNISEM Europe
15
Chip resources
16
SpiNNaker machines
17
SpiNNaker chip (18 ARM cores)
SpiNNaker board (864 ARM cores)
SpiNNaker racks (1M ARM cores)
• HBP platform
– 1M cores
– 11 cabinets (including server)
• Launch 30 March 2016 – then 500k cores
– 82 remote users
– 4,939 SpiNNaker jobs run
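The core counts on this slide fit together with simple arithmetic, sketched below. The per-chip and per-board figures come from the slide; treating "1M" and "500k" as the round numbers 1,000,000 and 500,000 is an assumption, and the resulting board counts are lower bounds for those nominal sizes, not the machine's actual inventory, which the slides do not state.

```python
import math

CORES_PER_CHIP = 18     # from the slide: SpiNNaker chip (18 ARM cores)
CORES_PER_BOARD = 864   # from the slide: SpiNNaker board (864 ARM cores)

# Chips on one board follow directly from the two figures above.
chips_per_board = CORES_PER_BOARD // CORES_PER_CHIP


def boards_needed(target_cores):
    """Smallest number of full boards providing at least target_cores."""
    return math.ceil(target_cores / CORES_PER_BOARD)


# Assumption: "1M" and "500k" are read as round numbers.
boards_full = boards_needed(1_000_000)
boards_launch = boards_needed(500_000)
```

This gives 48 chips per board, which matches the "48-node boards" named elsewhere in the talk, and at least 1,158 boards for a nominal million cores (579 for the 500k-core launch configuration).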
SpiNNaker machines
18
• 100 SpiNNaker systems in use
– global coverage
• 4-node boards
– training & small-scale robotics
• 48-node boards
– insect-scale networks
• multi-board systems
• 1M-core HBP platform
Figure legend: sales (40 48-node boards); loans
19
Cortical microcolumn
1st full-scale simulation of 1 mm² cortex on neuromorphic & HPC systems
• 77,169 neurons, 285M synapses, 2,901 cores
S.J. van Albada, A.G. Rowley, A. Stokes, J. Senk, M. Hopkins, M. Schmidt, D.R. Lester, M. Diesmann, S.B. Furber, “Performance comparison of the digital neuromorphic hardware SpiNNaker and the neural network simulation software NEST for a full-scale cortical microcircuit model”, Frontiers 2018.
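To give the headline figures some scale, the averages they imply can be worked out directly. This is a back-of-envelope sketch: "285M" is treated as exactly 285,000,000 (an assumption), and the per-core values are plain averages over all 2,901 cores, which may not reflect how neurons were actually distributed across cores.

```python
NEURONS = 77_169
SYNAPSES = 285_000_000   # "285M" on the slide, treated as a round number
CORES = 2_901

# Average fan-in and average load per core implied by the slide's figures.
synapses_per_neuron = SYNAPSES / NEURONS
neurons_per_core = NEURONS / CORES
synapses_per_core = SYNAPSES / CORES
```

The result is roughly 3,700 synapses per neuron and about 27 neurons per core, which suggests that the load on each core is dominated by synapses, not neurons.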
20
Constraint satisfaction problems
work by: Gabriel Fonseca Guerra (PhD student)
G. A. Fonseca Guerra and S. B. Furber, “Using Stochastic Spiking Neural Networks on SpiNNaker to Solve Constraint Satisfaction Problems”, Frontiers 2018.
S. Habenschuss, Z. Jonke, and W. Maass, “Stochastic computations in cortical microcircuit models”, PLOS Computational Biology, 9(11):e1003311, 2013.
Stochastic spiking neural network:
• Solves CSPs, e.g. Sudoku
• 37k neurons
• 86M synapses
• also
– map colouring
– Ising spin systems
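The general idea of solving a constraint satisfaction problem by stochastic search can be sketched on a small map-colouring instance. The sketch below does not use spiking neurons and is not the method of the papers cited on this slide; it is a plain noisy local search, with the map, the `temperature` parameter, and all function names chosen for illustration. It only shows how randomness lets a search settle into a state that violates no constraint.

```python
import math
import random

# Adjacency of the Australian states/territories, a standard textbook example.
NEIGHBOURS = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V": ["SA", "NSW"],
    "T": [],
}
COLOURS = ["red", "green", "blue"]


def count_conflicts(colouring):
    """Number of neighbouring pairs that share a colour (each pair counted once)."""
    return sum(
        1
        for region, others in NEIGHBOURS.items()
        for other in others
        if region < other and colouring[region] == colouring[other]
    )


def stochastic_colouring(seed=0, temperature=0.5, max_steps=10_000):
    """Noisy local search: resample one region at a time, favouring colours
    that clash with fewer neighbours, until no constraint is violated."""
    rng = random.Random(seed)
    regions = list(NEIGHBOURS)
    colouring = {r: rng.choice(COLOURS) for r in regions}
    for _ in range(max_steps):
        if count_conflicts(colouring) == 0:
            return colouring
        region = rng.choice(regions)
        # Weight each candidate colour by exp(-clashes / temperature).
        weights = []
        for colour in COLOURS:
            clashes = sum(colouring[n] == colour for n in NEIGHBOURS[region])
            weights.append(math.exp(-clashes / temperature))
        colouring[region] = rng.choices(COLOURS, weights=weights)[0]
    # Final check after the last resampling step.
    return colouring if count_conflicts(colouring) == 0 else None


solution = stochastic_colouring()
```

A lower `temperature` makes the search greedier, while a higher one adds noise that helps it escape colourings where a greedy choice would get stuck.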
Deep Rewiring
• Synaptic sampling as dynamic rewiring for rate-based neurons (deep networks)
• Ultra-low memory footprint even during learning
• Uses PRNG/TRNG, FPU, exp
– speed-up 1.5x
• Example: LeNet 300-100
– 1080 KB → 36 KB
– training on local SRAM possible
– ≈ 100x energy reduction for training on SpiNNaker2 prototype (28nm) compared to x86 CPU
• 96.2% MNIST accuracy for 0.6% connectivity
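A rewiring step of this kind can be sketched as follows. This is a simplified illustration in the spirit of the deep rewiring paper cited on this slide, not the SpiNNaker2 implementation: the function name, the hyperparameter values and the random stand-in gradients are assumptions, the fixed sign attached to each connection is omitted, and `theta` is stored densely here for readability where a memory-efficient version would keep only the active entries.

```python
import math
import random


def rewiring_step(theta, active, grads, k_active, rng,
                  lr=0.05, l1=1e-4, temperature=1e-3):
    """One rewiring update over a flat list of potential connections.

    theta[i]  : non-negative strength parameter of connection i
    active    : set of indices of currently active connections
    grads[i]  : error gradient for connection i (supplied by the caller)
    k_active  : fixed number of connections that must stay active
    """
    noise_scale = math.sqrt(2.0 * lr * temperature)
    for i in list(active):
        # Gradient step, L1 shrinkage and a random-walk term.
        theta[i] += -lr * grads[i] - lr * l1 + noise_scale * rng.gauss(0.0, 1.0)
        if theta[i] < 0.0:
            # The connection has died out: make it dormant.
            theta[i] = 0.0
            active.discard(i)
    # Re-activate random dormant connections to keep the budget constant.
    dormant = [i for i in range(len(theta)) if i not in active]
    while len(active) < k_active:
        j = dormant.pop(rng.randrange(len(dormant)))
        theta[j] = 0.0
        active.add(j)
    return theta, active


rng = random.Random(1)
n_potential, k_active = 1000, 6   # 0.6% connectivity, as on the slide
active = set(rng.sample(range(n_potential), k_active))
theta = [0.1 if i in active else 0.0 for i in range(n_potential)]

for _ in range(50):
    # Stand-in gradients; a real network would supply backpropagated errors.
    grads = [rng.uniform(-1.0, 1.0) for _ in range(n_potential)]
    theta, active = rewiring_step(theta, active, grads, k_active, rng)
```

Because the number of active connections never changes, the memory needed for weights stays fixed throughout training. The noise term and the random choice of connections to re-activate are where a random number generator is needed.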
Figure: LeNet 300-100 network: In (MNIST) 784 → Hidden FC 300 → Hidden FC 100 → Out (Softmax) 10
G. Bellec et al., “Deep rewiring: Training very sparse deep networks”, arXiv, 2018
Chen Liu et al., “Memory-efficient Deep Learning on a SpiNNaker 2 prototype”, submitted
21
SpiNNaker2 Chip Overview
22
• 160 ARM-based processing elements (PEs)
• 8 GByte LPDDR4 DRAM
• 7 energy efficient chip-to-chip links
New Hardware Feature Innovations
23
Feature | Output/Benefit | Patents | Publications
Dynamic Neuromorphic Power Management | Reduce power consumption by up to x5. Examples: reward-based learning, synfire chain, event-based vision sensor processing | DE102017128711.6 | [Höppner2017]
Exponential Function Accelerator | Enhance performance by x2. Examples: reward-based learning, STDP, BCPNN | - | [Partzsch2017] [Bauer2017] [Mikaitis2018]
Pseudo and true random number generation | Enhance performance by >x2 for pseudo random numbers; enable true randomness. Examples: synaptic sampling, deep rewiring | EP3147775A1, US20170083289 | [Neumärker2017]
Figure labels: ARM, exp unit, FIFO, AHB
SpiNNaker2 Testchip JIB
24
• Taped out Testchip “JIB”
• 8 processing elements (PE)
• SpiNNaker Router, Chip2Chip Links
• 9 mm² in GLOBALFOUNDRIES 22FDX technology
• Low voltage operation down to 0.40 V
• Power analyses for neuromorphic and machine learning
• Analysis results (scaled to SpiNNaker2)
– CPU Efficiency: 47 GOPS @ <20 µW/MHz
– Machine Learning: 4.6 TOPS @ 6.4 TOPS/W
• Scaling compared to SpiNNaker1
– x15 CPU performance
– x2 performance by hardware accelerators
– >x30 better efficiency
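The machine learning figure pairs a throughput with an efficiency, so the power it implies can be worked out with one division. The sketch assumes that both numbers refer to the same operating point and that the efficiency is in tera-operations per second per watt; neither assumption is stated on the slide, and the function name is illustrative.

```python
def implied_power_watts(tera_ops_per_second, tera_ops_per_watt):
    """Power implied by a throughput and an efficiency quoted at the same operating point."""
    return tera_ops_per_second / tera_ops_per_watt


# Figures from the slide: 4.6 TOPS at 6.4 TOPS/W (scaled to SpiNNaker2).
ml_power_w = implied_power_watts(4.6, 6.4)
```

Under those assumptions the machine learning workload would draw a little over 0.7 W.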
Figure: JIB block diagram with labels PE, QPE (x2), SpiNNaker Router, Shared SRAM, Host IF, Chip2Chip IO (x2)
Conclusions
• SpiNNaker:
– has been 20 years in conception…
– …and 10 years in construction,
– and is now ready for action!
– ~90 boards with groups around the world
– 1M core machine built
• HBP is supporting s/w development
• Industrial AI uses 2nd generation (non-spiking) neural nets
– there is an expectation that neuromorphics will contribute
– especially energy-efficiency?
• SpiNNaker is the ideal research platform to explore this space
• SpiNNaker2 will be 10x better!
25