Flexon: A Flexible Digital Neuron for Efficient Spiking ...iscaconf.org/isca2018/slides/4A1.pdf ·...
Transcript of Flexon: A Flexible Digital Neuron for Efficient Spiking ...iscaconf.org/isca2018/slides/4A1.pdf ·...
Flexon: A Flexible Digital Neuron for Efficient Spiking Neural Network Simulations
Dayeol Lee†, Gwangmu Lee*, Dongup Kwon*, Sunghwa Lee*, Youngsok Kim*, and Jangwoo Kim*
*Dept. of Electrical and Computer Engineering, Seoul National University†Dept. of Electrical Engineering and Computer Sciences, University of California, Berkeley
2/32
NeuroscienceThe Study of Neurons and Brains
Recognition
Object detection,Classification
Morality,Social Value
Parkinson's Disease,CJD, Dementia
Degeneration
Consciousness Emotion
Sympathy,Happiness, Apathy
3/32
NeuroscienceThe Study of Neurons and Brains
Recognition
Object detection,Classification
Morality,Social Value
Parkinson's Disease,CJD, Dementia
Degeneration
Consciousness Emotion
Sympathy,Happiness, Apathy
DNN did this
Unexplored yet
4/32
SpikeGen.
SpikeGen.
for each time-step:
NeuronSpikeGenerator
Time-step
1
Neuron
Modeling a Brain: Spiking Neural NetworkHow to compute spiking neural network?
Synapse
5/32
SpikeGen.
SpikeGen.
for each time-step:1) stimulus generation
NeuronSpikeGenerator
Time-step
1
Neuron
Modeling a Brain: Spiking Neural NetworkHow to compute spiking neural network?
Synapse
6/32
SpikeGen.
SpikeGen.
for each time-step:1) stimulus generation2) neuron computation
NeuronSpikeGenerator
Time-step
1
Neuron
Modeling a Brain: Spiking Neural NetworkHow to compute spiking neural network?
Synapse
7/32
SpikeGen.
SpikeGen.
NeuronSpikeGenerator
Time-step
1
Neuron
Modeling a Brain: Spiking Neural NetworkHow to compute spiking neural network?
Synapse
for each time-step:1) stimulus generation2) neuron computation3) synapse calculation
…
8/32
10 Representative Benchmarks on CPU/GPUCPU: Intel Xeon E5-2630 v4 CPU (12-core, 2.2 GHz) / GPU: NVIDIA Titan X (Pascal) GPU
Where Does Time Go?
~50% of overheads coming from
Neuron Computation
9/32
MembraneVoltage
(millivolt scale)
Threshold
RestingPotential
Time(millisecond scale)
Neuron
Biological Neuron
How Does a Neuron Behave?
10/32
MembraneVoltage
(millivolt scale)
Threshold
RestingPotential
Time(millisecond scale)
Neuron
Biological Neuron
How Does a Neuron Behave?
11/32
MembraneVoltage
(millivolt scale)
Threshold
RestingPotential
Time(millisecond scale)
Neuron
Biological Neuron
How Does a Neuron Behave?
12/32
MembraneVoltage
(millivolt scale)
Threshold
RestingPotential
Time(millisecond scale)
Neuron
Biological Neuron
How Does a Neuron Behave?
13/32
AxonDendrite Soma
Biological Neuron
Time(millisecond scale)
MembraneVoltage
(millivolt scale)
Spike initiation
Tons of variants exist, depending on their feature set.
Various Neuron Behaviors
Input spikeaccumulation Membrane decay Spike inhibition
We need to support various features for accurate brain simulations.
14/32
CustomHardware
Framework
SoftwareSimulation
Solutions and Limitations
Flexibility
Accuracy
High Performance
Low Energy
15/32
Design Goals & Key Ideas
High PerformanceGoal 1
High FlexibilityGoal 2
Low costGoal 3
Hardware-based
Feature-driven
Spatially-folded
16/32
Neuron Feature #1: Input Spike Accumulation
Current-based Conductance-based(Exponential-shaped)
Conductance-based(Alpha function-shaped)
17/32
Neuron Feature #1: Input Spike Accumulation
Current-based Conductance-based(Exponential-shaped)
Conductance-based(Alpha function-shaped)
18/32
Neuron Feature #2: Spike Initiation
Quadratic(with feature)(without feature)
Exponential
19/32
Neuron Feature #3: Spike-triggered Current
Adaptation Sub-threshold Oscillation(without) (with)
(without) (with)
20/32
Neuron Feature #4: Refractory Period
Absolute Relative
(without) (with) (without) (with)
21/32
(without) (with) (without) (with)
Neuron Feature: Flexible Feature Support
Absolute Relative
“Feature-driven” Flexonsupports 11 major neuron models
(LLIF, SLIF, DSRM0, DLIF, QIF, EIF, Izhikevich, AdEx, …)
22/32
1 10 100 1000
GeomeanNowotny et al.
IzhikevichGeomean
Vogels et al.Vogels-Abbott
Potjans-DiesmannMuller et al.
Destexhe-UpDownDestexhe-LTS
BrunelBrette et al.
GPU
CPU
Speed-up(Normalized to the baseline)
8 CPU + 2 GPU Representative BenchmarksFlexon: TSMC 45nm, Synopsys Design Compiler (neuron), CACTI 6.5 (SRAM)
87.4xOver CPU
8.19xOver GPU
Evaluation (12x Feature-driven Design)
CPU
Intel Xeon E5-2630 v4(12-core, 2.2 GHz)
GPUNVIDIA Titan X (Pascal)
23/32
1 10 100 1000 10000 100000
GeomeanNowotny et al.
IzhikevichGeomean
Vogels et al.Vogels-Abbott
Potjans-DiesmannMuller et al.
Destexhe-UpDownDestexhe-LTS
BrunelBrette et al.
GPU
CPU
Energy Efficiency(Normalized to the baseline)
8 CPU + 2 GPU Representative BenchmarksFlexon: TSMC 45nm, Synopsys Design Compiler (neuron), CACTI 6.5 (SRAM)
6,186xOver CPU
422xOver GPU
Evaluation (12x Feature-driven Design)
CPU
Intel Xeon E5-2630 v4(12-core, 2.2 GHz)
GPUNVIDIA Titan X (Pascal)
24/32
Intrinsic Space-inefficiency
(without) (with) (without) (with)
Absolute Relative
“Feature-driven” Flexonsupports 11 major neuron models
(LLIF, SLIF, DSRM0, DLIF, QIF, EIF, Izhikevich, AdEx, …)
Lots of redundant units(multiplier, adder, …)
25/32
Constructing Spatially-folded Flexon
Conductance-based(Exponential)
Quadratic
Adaptation
Spatially-folded design è reduce area- Remove redundant MAC operators
26/32
Constructing Spatially-folded Flexon
Exponential Relative(ADT: Adaptation)
- Remove redundant MAC operators
Modifications from the baseline- 2-stage pipeline, multi-cycle implementation
Spatially-folded design è reduce area
27/32
Constructing Spatially-folded Flexon
“Spatially-folded” Flexonsupports various major neuron models
6x area saving
- Remove redundant MAC operators
What we should change- 2-stage pipeline, multi-cycle implementation
Spatially-folded design è reduce area
28/32
1 10 100 1000
GeomeanNowotny et al.
IzhikevichGeomean
Vogels et al.Vogels-Abbott
Potjans-DiesmannMuller et al.
Destexhe-UpDownDestexhe-LTS
BrunelBrette et al.
GPU
CPU
Speed-up(Normalized to the baseline)
8 CPU + 2 GPU Representative BenchmarksFlexon: TSMC 45nm, Synopsys Design Compiler (neuron), CACTI 6.5 (SRAM)
122xOver CPU
9.83xOver GPU
Evaluation (72x Spatially-folded Design)
CPU
Intel Xeon E5-2630 v4(12-core, 2.2 GHz)
GPUNVIDIA Titan X (Pascal)
Feature-drivenSpatially-folded
29/32
1 10 100 1000 10000 100000
GeomeanNowotny et al.
IzhikevichGeomean
Vogels et al.Vogels-Abbott
Potjans-DiesmannMuller et al.
Destexhe-UpDownDestexhe-LTS
BrunelBrette et al.
GPU
CPU
Energy Efficiency(Normalized to the baseline)
8 CPU + 2 GPU Representative BenchmarksFlexon: TSMC 45nm, Synopsys Design Compiler (neuron), CACTI 6.5 (SRAM)
Evaluation (72x Spatially-folded Design)
CPU
Intel Xeon E5-2630 v4(12-core, 2.2 GHz)
GPUNVIDIA Titan X (Pascal)
5,413xOver CPU
135xOver GPU
Feature-drivenSpatially-folded
30/32
Baseline Flexon vs. Spatially-folded Flexon
• Baseline “Feature-driven” Flexon−Fast: 87.4x over CPUs, 8.19x over GPUs
−Energy-efficient: 6,186x over CPUs, 422x over GPUs
• “Spatially-folded” Flexon−Fast: 122x over CPUs, 9.83x over GPUs
−Energy-efficient: 5,413x over CPUs, 135x over GPUs
31/32
Conclusion
• Flexon is a flexible feature-driven digital neuron design,capable of realizing various major neuron models.−Flexible & power-efficient (6,186x over CPU)
• Spatially-folded Flexon makes features share units,reducing 6x circuit area.−Flexible & fast when integrated (122x over CPU)
32/32
Thank you for listening
FlexonA Flexible Digital Neuron for EfficientSpiking Neural Network Simulations