
Future Directions in Neuromorphic Computing

Steve Furber
ICL Professor of Computer Engineering
The University of Manchester


200 years ago…

• Ada Lovelace, b. 10 Dec. 1815

“I have my hopes, and very distinct ones too, of one day getting cerebral phenomena such that I can put them into mathematical equations--in short, a law or laws for the mutual actions of the molecules of brain. … I hope to bequeath to the generations a calculus of the nervous system.”


65 years ago…


ConvNets - structure

• Dense convolution kernels
• Abstract neurons
• Only feed-forward connections
• Trained through backpropagation


The cortex - structure

• Spiking neurons
• Two-dimensional structure
• Sparse connectivity

Figure labels: feed-forward input, feedback input, feed-forward output, feedback output
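The contrast between the abstract neurons of a ConvNet and the spiking neurons of the cortex can be made concrete with a small sketch. The leaky integrate-and-fire (LIF) model and all parameter values below are illustrative assumptions; the slides do not specify a neuron model.

```python
def simulate_lif(input_current, dt=1.0, tau=20.0, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron (illustrative parameters).

    input_current: one input value per time step (arbitrary units).
    Returns the time-step indices at which the neuron emitted a spike.
    """
    v = v_rest
    spike_times = []
    for step, current in enumerate(input_current):
        # Forward-Euler update of dv/dt = (-(v - v_rest) + current) / tau
        v += dt * (-(v - v_rest) + current) / tau
        if v >= v_thresh:
            spike_times.append(step)  # the output is an event, not a real value
            v = v_reset
    return spike_times


# A constant input above threshold gives a regular spike train;
# an input that settles below threshold gives no output at all.
strong = simulate_lif([1.5] * 200)
weak = simulate_lif([0.5] * 200)
```

Unlike a ConvNet unit, which emits a real-valued activation on every forward pass, this neuron communicates only when its membrane potential crosses threshold. That property is what makes sparse, event-driven communication attractive.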


ConvNets - GPUs

• Dense matrix multiplications
• 3.2kW
• Low precision


Cortical models - Supercomputers

• Sparse matrix operations
• Efficient communication of spikes
• 2.3MW


Cortical models - Neuromorphic hardware

• Memory local to computation
• Low-power
• Real time
• 62mW
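To put the three quoted power figures on one scale, the ratios can be worked out directly. This is only an order-of-magnitude sketch: the slides do not say that the three systems run comparable workloads, so the ratios indicate scale rather than a like-for-like efficiency comparison. The dictionary labels are my own.

```python
# Power figures quoted on the GPU, supercomputer and neuromorphic slides, in watts.
power_w = {
    "GPU (ConvNets)": 3.2e3,                  # 3.2 kW
    "Supercomputer (cortical models)": 2.3e6,  # 2.3 MW
    "Neuromorphic hardware": 62e-3,           # 62 mW
}

baseline = power_w["Neuromorphic hardware"]
ratios = {name: watts / baseline for name, watts in power_w.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:,.0f}x the neuromorphic figure")
```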


http://www.technologyreview.com/featuredstory/526506/neuromorphic-chips/


https://agenda.weforum.org/2015/03/top-10-emerging-technologies-of-2015-2/

REPORT TO THE PRESIDENT
Ensuring Long-Term U.S. Leadership in Semiconductors
Executive Office of the President
President’s Council of Advisors on Science and Technology
January 2017



Table A1. Selected component technology vectors that have a high probability of deployment in ten years (* denotes more speculative deployment within this timeframe)

| Component technology vector | Time-frame to first commercial products | Approach to achieving and retaining competitive advantage |
| --- | --- | --- |
| Neuromorphic Computing | Available now | Continued R&D into new architectures coupled with 3D technologies and new materials, Deep Learning accelerators (for mobile and data center applications), and applications for true brain-inspired computing |
| Photonics | Available now | Foundries for tools and materials R&D; integrate photonics with CMOS and other materials |
| Sensors | Available now | Foundries for tools and materials R&D; integrate new types/classes of sensors with CMOS and other materials |
| CMOS (sub 7nm node size or new 3D structures)* | Advances in thermal management available with new process nodes | Deep understanding of transistor physics and chipset architecture and related design know-how; foundries and labs for transistor and materials R&D |
| Magnetics | 1-2 years (MRAM as eFlash), 3 years (as DRAM), 5-7 years (as SRAM) | Foundries for tools and materials R&D; integrate magnetics with CMOS and other materials |
| 3D | 2-3 years (wafer-to-wafer stacking), 4-5 years (die-to-wafer stacking), 5-7 (Monolithic 3D) | Deep understanding of applications space and benefits associated use of 3D technologies and design know-how; foundries for tools and materials R&D; design automation tool R&D |
| Data-flow based architectures | 3-4 years | Continued architecture R&D, coupled with materials, integration, and manufacturing; build an ecosystem for solutions using data-flow based architectures |
| Ultra-high performance wireless systems | 3 years (5G), 10-12 years (6G) | Continued R&D in new materials and processes, antenna design advances, chipset manufacturing, and integration |
| Advanced non-volatile memory as SRAM | 5+ years | Deep understanding of applications space and chipset architectures |
| Carbon nanotubes and phase change materials* | 5-7 years | Foundries/labs for materials R&D for hardware architectures; chipset designs to leverage these technologies |
| Biotech/human health | 5-10 years | R&D towards low power, highly integrated, high performance processing, high-data rate communications, wireless charging; couple R&D with clinical research to create, build, and evaluate on new materials and interfaces |
| Quantum Computing | < 10 years | Pre-competitive R&D labs for new materials; foundries for new materials and hardware architectures; tools for quantum algorithms and software programming with various architectural paradigms |
| Point-of-Use Nanoscale 3D printing | Available now | Desktop fab capabilities for rapid prototyping, additive manufacturing, moving beyond silicon and interfacing with soft matter, and small batch production |
| DNA for compute and storage* | 10+ years | Multi-disciplinary basic research in efficiently and reliably reading and writing and retrieving DNA strands |


Start-ups and industry interest

SpiNNaker project

• A million mobile phone processors in one computer

• Able to model about 1% of the human brain…

• …or 10 mice!


SpiNNaker system


SpiNNaker chip

Multi-chip packaging by UNISEM Europe


Chip resources


SpiNNaker machines


SpiNNaker chip (18 ARM cores)
SpiNNaker board (864 ARM cores)
SpiNNaker racks (1M ARM cores)

• HBP platform
  – 1M cores
  – 11 cabinets (including server)
• Launch 30 March 2016 – then 500k cores
  – 82 remote users
  – 4,939 SpiNNaker jobs run
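The chip, board and rack figures are consistent with one another, as a short calculation shows. This sketch treats "1M" as exactly 1,000,000 cores, which is an assumption: the slides give only the rounded figure, and the installed machine's actual board count is not stated.

```python
import math

CORES_PER_CHIP = 18        # from the slide
CORES_PER_BOARD = 864      # from the slide
TARGET_CORES = 1_000_000   # "1M cores", taken here as a round number

chips_per_board = CORES_PER_BOARD // CORES_PER_CHIP
min_boards = math.ceil(TARGET_CORES / CORES_PER_BOARD)
min_chips = min_boards * chips_per_board

print(f"{chips_per_board} chips per board")
print(f"at least {min_boards} boards ({min_chips} chips) for {TARGET_CORES:,} cores")
```

The 48 chips per board that falls out of the division matches the "48-node boards" named elsewhere in the deck.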

SpiNNaker machines


• 100 SpiNNaker systems in use
  – global coverage
• 4-node boards
  – training & small-scale robotics
• 48-node boards
  – insect-scale networks
• multi-board systems
• 1M-core HBP platform

Legend: sales (40 48-node boards); loans


Cortical microcolumn

1st full-scale simulation of 1 mm² cortex on neuromorphic & HPC systems

• 77,169 neurons, 285M synapses, 2,901 cores

S.J. van Albada, A.G. Rowley, A. Stokes, J. Senk, M. Hopkins, M. Schmidt, D.R. Lester, M. Diesmann, S.B. Furber, “Performance comparison of the digital neuromorphic hardware SpiNNaker and the Neural network simulation software NEST for a full-scale cortical microcircuit model”, Frontiers 2018.
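The headline numbers for the microcircuit simulation imply some useful averages. The sketch below divides them out. It assumes every one of the 2,901 cores hosts neurons, which the slide does not state, so the per-core figure is an average over all cores rather than the load on any particular core.

```python
NEURONS = 77_169           # from the slide
SYNAPSES = 285e6           # "285M", a rounded figure
CORES = 2_901              # from the slide
MACHINE_CORES = 1_000_000  # "1M cores", taken as a round number

synapses_per_neuron = SYNAPSES / NEURONS
neurons_per_core = NEURONS / CORES
machine_fraction = CORES / MACHINE_CORES

print(f"~{synapses_per_neuron:,.0f} synapses per neuron")
print(f"~{neurons_per_core:.1f} neurons per core on average")
print(f"~{machine_fraction:.2%} of a 1M-core machine")
```

The several thousand synapses per neuron illustrate why spike communication and synaptic memory access, rather than neuron arithmetic, tend to dominate the cost of cortical models.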


Constraint satisfaction problems

work by: Gabriel Fonseca Guerra (PhD student)

G. A. Fonseca Guerra and S. B. Furber, “Using Stochastic Spiking Neural Networks on SpiNNaker to Solve Constraint Satisfaction Problems”, Frontiers 2018.

S. Habenschuss, Z. Jonke, and W. Maass, “Stochastic computations in cortical microcircuit models”, PLOS Computational Biology, 9(11):e1003311, 2013.

Stochastic spiking neural network:
• Solves CSPs, e.g. Sudoku
• 37k neurons
• 86M synapses
• also
  – map colouring
  – Ising spin systems
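The underlying idea, that noise lets a system escape states that violate constraints, can be illustrated without spiking neurons. The sketch below is a plain stochastic local search for map colouring and is not the network described in the cited papers. The function names, the noise level and the example map (the standard Australia colouring problem) are all assumptions chosen for illustration.

```python
import random

# Adjacency between Australian states and territories (symmetric).
AUSTRALIA = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V": ["SA", "NSW"],
    "T": [],
}


def count_conflicts(colouring, neighbours):
    """Number of adjacent region pairs that share a colour."""
    return sum(
        1
        for region, adjacent in neighbours.items()
        for other in adjacent
        if region < other and colouring[region] == colouring[other]
    )


def stochastic_colouring(neighbours, n_colours, max_steps=10_000,
                         noise=0.1, seed=0):
    """Search for a valid colouring; returns None if none is found in time."""
    rng = random.Random(seed)
    regions = sorted(neighbours)
    colouring = {r: rng.randrange(n_colours) for r in regions}
    for _ in range(max_steps):
        conflicted = [r for r in regions
                      if any(colouring[r] == colouring[o] for o in neighbours[r])]
        if not conflicted:
            return colouring
        region = rng.choice(conflicted)
        if rng.random() < noise:
            # Random move: the noise that lets the search leave local minima.
            colouring[region] = rng.randrange(n_colours)
        else:
            # Greedy move: choose a colour clashing with the fewest neighbours.
            clashes = [sum(colouring[o] == c for o in neighbours[region])
                       for c in range(n_colours)]
            best = min(clashes)
            colouring[region] = rng.choice(
                [c for c, k in enumerate(clashes) if k == best])
    return colouring if count_conflicts(colouring, neighbours) == 0 else None


solution = stochastic_colouring(AUSTRALIA, n_colours=3)
```

In the spiking formulation, the random moves come from stochastic neuron firing instead of an explicit random number call, and the constraints are encoded in inhibitory connections between neurons that represent incompatible assignments.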

Deep Rewiring

• Synaptic sampling as dynamic rewiring for rate-based neurons (deep networks)

• Ultra-low memory footprint even during learning

• Uses PRNG/TRNG, FPU, exp
  – speed-up 1.5
• Example: LeNet 300-100
  – 1080 KB → 36 KB
  – training on local SRAM possible
  – ≈ 100x energy reduction for training on SpiNNaker2 prototype (28nm) compared to x86 CPU
• 96.2% MNIST accuracy for 0.6% connectivity

LeNet 300-100: In (MNIST, 784) → Hidden FC (300) → Hidden FC (100) → Out (Softmax, 10)

G. Bellec et al., “Deep rewiring: Training very sparse deep networks”, arXiv, 2018

Chen Liu et al., “Memory-efficient Deep Learning on a SpiNNaker 2 prototype”, submitted
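A rough sketch of one rewiring step may help show where the random numbers are used. It follows my reading of the fixed-connectivity scheme in Bellec et al. (gradient step, L1 shrinkage and a noise term on active connections, followed by random re-activation of dormant ones). It is not the SpiNNaker2 implementation. The function name, the hyperparameters and the toy quadratic loss are assumptions. For clarity it stores a parameter for every potential connection, whereas a memory-efficient version would store only the active ones.

```python
import random


def deep_r_step(theta, active, sign, grad_w, lr=0.05, l1=1e-4,
                noise_sd=1e-3, rng=random):
    """Apply one rewiring update in place, keeping the active count fixed.

    theta  : non-negative magnitude per potential connection
    active : bool per connection; dormant connections have weight 0
    sign   : fixed sign (+1 or -1) per connection
    grad_w : loss gradient w.r.t. each effective weight sign[i] * theta[i]
    """
    k = sum(active)  # number of active connections to preserve
    for i in range(len(theta)):
        if not active[i]:
            continue
        # Gradient step, L1 shrinkage and a small random-walk term.
        theta[i] += -lr * sign[i] * grad_w[i] - lr * l1 + rng.gauss(0.0, noise_sd)
        if theta[i] < 0.0:
            active[i] = False  # the connection becomes dormant
            theta[i] = 0.0
    dormant = [i for i, a in enumerate(active) if not a]
    for i in rng.sample(dormant, k - sum(active)):
        active[i] = True   # rewire: a random dormant connection wakes up
        theta[i] = 0.0     # and starts with zero weight


# Toy problem: recover a sparse target vector under a quadratic loss.
rng = random.Random(0)
n, k = 1000, 6  # 0.6% connectivity, echoing the figure on the slide
target = [0.0] * n
for i in rng.sample(range(n), k):
    target[i] = rng.choice((-1.0, 1.0))
sign = [rng.choice((-1, 1)) for _ in range(n)]
active = [i < k for i in range(n)]
rng.shuffle(active)
theta = [0.1 if a else 0.0 for a in active]

for _ in range(200):
    # Gradient of 0.5 * sum((w - target)^2); only active entries are used.
    grad_w = [sign[i] * theta[i] - target[i] if active[i] else 0.0
              for i in range(n)]
    deep_r_step(theta, active, sign, grad_w, rng=rng)
```

Because the number of active connections never changes, the memory needed during training stays bounded by the chosen connectivity. That matches the slide's point about training within local SRAM.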

SpiNNaker2 Chip Overview


• 160 ARM-based processing elements (PEs)
• 8 GByte LPDDR4 DRAM
• 7 energy efficient chip-to-chip links

New Hardware Feature Innovations


| Feature | Output/Benefit | Patents | Publications |
| --- | --- | --- | --- |
| Dynamic Neuromorphic Power Management | Reduce power consumption by up to x5. Examples: reward-based learning; synfire chain; event-based vision sensor processing | DE102017128711.6 | [Höppner2017] |
| Exponential Function Accelerator | Enhance performance by x2. Examples: reward-based learning; STDP; BCPNN | | [Partzsch2017] [Bauer2017] [Mikaitis2018] |
| Pseudo and true random number generation | Enhance performance by >x2 for pseudo random numbers; enable true randomness. Examples: synaptic sampling; deep rewiring | EP3147775A1, US20170083289 | [Neumärker2017] |

[Figure labels: ARM, exp unit, FIFO, AHB]
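To see why an exponential accelerator matters for plasticity rules such as STDP, consider the textbook pair-based rule, in which every pre/post spike pairing costs one exponential evaluation. The sketch below is that generic rule with illustrative parameter values. It is not taken from the slides or from the SpiNNaker2 software.

```python
import math


def stdp_delta_w(dt_ms, a_plus=0.01, a_minus=0.012,
                 tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP weight change for dt_ms = t_post - t_pre (in ms)."""
    if dt_ms > 0:
        # Pre-synaptic spike precedes post-synaptic spike: potentiation.
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:
        # Post-synaptic spike precedes pre-synaptic spike: depression.
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0


# One exp() call per spike pairing, for a handful of timing differences.
changes = {dt: stdp_delta_w(dt) for dt in (-40, -10, 10, 40)}
```

With thousands of synapses per neuron, these evaluations add up quickly, so moving exp() from software into a dedicated unit is a plausible source of the quoted x2 gain.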

SpiNNaker2 Testchip JIB


• Taped out Testchip “JIB”
  – 8 processing elements (PE)
  – SpiNNaker Router, Chip2Chip Links
  – 9 mm² in GLOBALFOUNDRIES 22FDX technology
  – Low voltage operation down to 0.40V
• Power analyses for neuromorphic and machine learning
• Analysis results (scaled to SpiNNaker2)
  – CPU Efficiency: 47GOPS @<20µW/MHz
  – Machine Learning: 4.6TOPS @ 6.4 TOPS/W
• Scaling compared to SpiNNaker1
  – x15 CPU performance
  – x2 performance by hardware accelerators
  – >x30 better efficiency
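The machine-learning figures imply a power draw and an energy per operation, which can be derived as below. This assumes the throughput and efficiency numbers refer to the same operating point. The slide does not confirm that, so treat the results as indicative only.

```python
ml_throughput_tops = 4.6        # "4.6TOPS", from the slide
ml_efficiency_tops_per_w = 6.4  # "6.4 Tops/W", from the slide

# Power = throughput / efficiency (tera-ops/s divided by tera-ops/s per watt).
implied_power_w = ml_throughput_tops / ml_efficiency_tops_per_w

# 1 TOPS/W equals 1e12 operations per joule, i.e. 1 pJ per operation.
energy_per_op_pj = 1.0 / ml_efficiency_tops_per_w

print(f"implied power: {implied_power_w:.2f} W")
print(f"energy per operation: {energy_per_op_pj:.3f} pJ")
```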

[Figure labels: PE, QPE, SpiNNaker Router, Shared SRAM, Host IF, Chip2Chip IO]

Conclusions

• SpiNNaker:
  – has been 20 years in conception…
  – …and 10 years in construction,
  – and is now ready for action!
  – ~90 boards with groups around the world
  – 1M core machine built
• HBP is supporting s/w development
• Industrial AI uses 2nd generation (non-spiking) neural nets
  – there is an expectation that neuromorphics will contribute
  – especially energy-efficiency?
• SpiNNaker is the ideal research platform to explore this space
• SpiNNaker2 will be 10x better!

