
Spiking Neural Network with RRAM:Can We Use It for Real-World Application?

Tianqi Tang1, Lixue Xia1, Boxun Li1, Rong Luo1, Yiran Chen2, Yu Wang1, Huazhong Yang1

1Dept. of E.E., Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing, China

2Dept. of E.C.E., University of Pittsburgh, Pittsburgh, USA

1Email: [email protected]

Abstract—The spiking neural network (SNN) provides a promising solution to drastically improve the performance and efficiency of computing systems. Previous work on SNNs mainly focuses on increasing the scalability and level of realism in neural simulations, while few support practical cognitive applications with acceptable performance. At the same time, the energy efficiency of SNN systems built on traditional CMOS technology is also unsatisfactory. In this work, we explore different training algorithms of SNN for real-world applications, and demonstrate that the Neural Sampling method is much more effective than Spike Timing Dependent Plasticity (STDP) and the Remote Supervision Method (ReSuMe). We also propose an energy efficient implementation of SNN with the emerging metal-oxide resistive random access memory (RRAM) devices, which includes an RRAM crossbar array working as network synapses, an analog design of the spike neuron, and an input encoding scheme. A parameter mapping algorithm is also introduced to configure the RRAM-based SNN. Simulation results illustrate that the system achieves 91.2% accuracy on the MNIST dataset with an ultra-low power consumption of 3.5 mW. Moreover, the RRAM-based SNN system demonstrates great robustness to 20% process variation with less than 1% accuracy decrease, and can tolerate 20% signal fluctuation with about 2% accuracy loss. These results reveal that the RRAM-based SNN is readily realizable in physical hardware.

I. INTRODUCTION

The explosion of data generates great demand for more powerful platforms with higher processing speed, lower energy consumption, and even intelligent mining algorithms. However, the scaling of conventional CMOS technology is approaching its limit, making it difficult for CMOS-based computing systems to achieve considerable improvements from device scaling [1]. Moreover, the well-known "memory wall" problem is increasingly limiting the performance of the traditional Von Neumann architecture [2]. Emerging nano-devices and novel computer architectures have therefore attracted substantial research interest for their great potential to boost performance and energy efficiency.

The spiking neural network (SNN) provides a promising solution to drastically improve the performance and efficiency of computing systems. Abstracted from the actual neural system, an SNN processes sparse time-encoded neural signals in parallel [3]. This special architecture not only makes SNN a promising tool for cognitive tasks, such as object detection and speech recognition, but also inspires new computational paradigms beyond the von Neumann architecture [4], [5].

Many recent works on SNN focus on increasing the scalability and level of realism of neural simulations [6], [7]. These techniques are able to model thousands to billions of neurons in biological real time and provide promising tools to study the brain. However, few of them support practical cognitive applications, such as handwritten digit recognition. In other words, there remains a serious lack of study of effective training algorithms for SNN, especially ones that achieve acceptable cognitive performance on real-world applications.

On the other hand, the energy efficiency of brain simulators based on traditional CMOS technology is also unsatisfactory. For example, IBM consumed 1.4 MW of power to simulate the cat cortex with 10^9 neurons and 10^13 synapses on the Blue Gene supercomputer cluster, about five orders of magnitude more than the brain (~20 W) [8]. An energy efficient implementation of SNN is still highly demanded.

The innovations of device technology offer great opportunities for the efficient implementation of SNN and radically different forms of architecture [9]. The metal-oxide resistive random access memory (RRAM) device (or the memristor) is one of these promising devices [10]. The RRAM device enjoys ultra-high integration density and enables a large number of signal connections within a small circuit size. More importantly, by naturally transferring the weighted combination of input signals to output voltages in parallel, the RRAM crossbar array is able to merge computation and memory together as the brain does, and provides an ultra-efficient implementation of the matrix-vector multiplication, which is one of the most important operations in neural network models [11]. Many studies have explored the potential of RRAM-based neural computing architectures beyond Von Neumann. For example, a low power on-chip neural approximate computing system has been demonstrated with power efficiency of more than 400 GFLOPS/W [12]. These works demonstrate the great potential of realizing SNN with RRAM devices, but the concrete implementation still requires further study.

In this work, we propose an RRAM-based energy efficient implementation of SNN for practical cognitive tasks. The contributions of this paper include:


Fig. 1. (a) Physical model of the HfOx based RRAM. The resistance of the RRAM device is determined by the tunneling gap, which evolves due to the field and thermally driven oxygen ion migration. (b) Structure of the RRAM crossbar array.

1) We compare different models of spiking neural networks for practical cognitive tasks, including the Spike Timing Dependent Plasticity (STDP), the Remote Supervised Method (ReSuMe), and the latest Neural Sampling learning scheme. We demonstrate that the STDP and ReSuMe schemes can NOT provide acceptable cognitive performance, while the Neural Sampling method is promising for real-world applications.

2) We propose an RRAM-based implementation of the spiking neural network. The RRAM implementation consists of an RRAM crossbar array working as network synapses, an RRAM-based design of the spike neuron, an input encoding scheme, and an algorithm to configure the RRAM-based spiking neural network.

3) Key design parameters and physical constraints, such as the process variation and signal fluctuation, are extracted and studied in this paper. Simulation results illustrate that the system achieves 91.2% accuracy on the MNIST dataset [13] with power consumption of only 3.5 mW. Moreover, the RRAM-based SNN is robust to a large process variation of 20% with less than 1% accuracy decrease, and to 20% signal fluctuation with about 2% accuracy decrease. These results demonstrate that the RRAM-based SNN is readily realizable in physical hardware.

The rest of this paper is organized as follows: Section II provides the related background knowledge and Section III studies the training algorithms of the spiking neural network. The RRAM-based SNN implementation is introduced in Section IV. Section V presents a case study on the handwritten digit recognition task, and Section VI concludes this work.

II. PRELIMINARIES

A. RRAM Device Basics

The RRAM device is a passive two-terminal element with variable resistance states. Fig. 1(a) demonstrates a 3D filament model of the HfOx based RRAM [14]. The conductance of the device is exponentially dependent on the tunneling gap. When a large voltage is applied to the device, the tunneling gap will change and so will the resistance of the RRAM device.

As shown in Fig. 1(b), RRAM devices can be used to build a cross-point structure, also known as the RRAM crossbar array. The relationship between the input voltage vector $\vec{V_i}$ and the output voltage vector $\vec{V_o}$ can be expressed as follows [11]:

$$V_{o,j} = \sum_{k} c_{k,j} \cdot V_{i,k} \quad (1)$$

where $k$ ($k = 1, 2, .., N$) and $j$ ($j = 1, 2, .., M$) are the index numbers of the input and output ports, and the matrix parameter $c_{k,j}$ can be represented by the conductance of the RRAM device ($g_{k,j}$) and the load resistors ($g_s$) as:

$$c_{k,j} = \frac{g_{k,j}}{g_s + \sum_{l=1}^{N} g_{k,l}} \quad (2)$$

Therefore, the RRAM crossbar array is able to store the matrix through the conductance states of the RRAM devices, and perform the analog matrix-vector multiplication efficiently by merging computation and memory together.
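As a rough illustration of this behavior, the following NumPy sketch evaluates an ideal crossbar following Eqs. (1)-(2) exactly as printed (i.e. with the denominator summing $g_{k,l}$ over $l$); the function name and the array layout are our own choices, not from the paper:

```python
import numpy as np

def crossbar_output(v_in, g, g_s):
    """Ideal output voltages of an RRAM crossbar, per Eqs. (1)-(2).

    v_in : (N,) input voltage vector
    g    : (N, M) conductance matrix; g[k, j] connects input k to output j
    g_s  : load conductance
    """
    # Eq. (2): c[k, j] = g[k, j] / (g_s + sum_l g[k, l])
    c = g / (g_s + g.sum(axis=1, keepdims=True))
    # Eq. (1): V_o,j = sum_k c[k, j] * V_i,k
    return v_in @ c
```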

B. Neurons and Spiking Neural Network

The spiking neural network system is made up of layers of spiking neurons and the synaptic weight matrices between them. The structure of the whole system is shown in Fig. 2. An image recognition task running on this system is taken as a demonstration. Each input neuron represents a pixel value of the image while each output neuron represents one classification the image might be labeled with. First, for each input channel, there is an encoding module which transforms the numeral value x (e.g. the pixel value of an image) to a 0/1 spike train X(t). Then the spike train propagates through the synaptic weight matrix W via a matrix-vector multiplication operation. The result WX(t) serves as the input Vin(t) of the spike neuron and is accumulated into the state variable V(t). Once V(t) exceeds the threshold Vth, the neuron sends a spike to the next synaptic weight matrix and V(t) resets to Vreset. The behaviour of the neuron follows the Leaky Integrate-and-Fire (LIF) model [15] and can be described as:

$$V(t) = \begin{cases} \beta \cdot V(t-1) + V_{in}(t) & \text{when } V < V_{th} \\ V_{reset} \text{ and emit a spike} & \text{when } V \geq V_{th} \end{cases} \quad (3)$$

where $V(t)$ is the state variable and $\beta$ is the leaky parameter; $V_{th}$ is the threshold voltage and $V_{reset}$ is the reset state.
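To make the neuron dynamics concrete, here is a minimal discrete-time simulation of Eq. (3); the default parameter values are illustrative assumptions, not values from the paper:

```python
import numpy as np

def lif_neuron(v_in, beta=0.9, v_th=1.0, v_reset=0.0):
    """Discrete-time LIF neuron of Eq. (3); returns the output spike train.

    v_in : sequence of weighted inputs W X(t), one value per time step
    beta : leaky parameter (illustrative default)
    """
    v = 0.0
    spikes = np.zeros(len(v_in), dtype=int)
    for t, x in enumerate(v_in):
        v = beta * v + x        # leaky integration
        if v >= v_th:           # threshold crossing: fire and reset
            spikes[t] = 1
            v = v_reset
    return spikes
```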

After the spikes pass through all the synaptic weight crossbars and the spike neurons, a counter calculates the spike number of each neuron in the output layer. A comparator labels the image with the classification whose output neuron has the largest spike count. In Section IV, we will present the hardware mapping used to build the RRAM-based spiking neural network system.

III. TRAINING SCHEME OF SNN

The spiking neural network faces a major problem: it is difficult to train the synaptic weights for real-world applications. In this section, we compare different SNN training algorithms.


Fig. 2. RRAM-based Spiking Neural Network System Structure

These include the Spike Timing Dependent Plasticity (STDP), the Remote Supervision Method (ReSuMe), and the latest Neural Sampling (Siegert) learning scheme. We demonstrate that the STDP and ReSuMe schemes can NOT provide acceptable cognitive performance, while the Neural Sampling method is promising for real-world applications.

A. Spike Timing Dependent Plasticity (STDP)

The spike timing dependent plasticity (STDP) [16] is an unsupervised learning rule which updates the synaptic weights according to the relative spiking times of pre- and post-synaptic neurons. The learning rate is decided by the time interval: the closer the pre- and post-synaptic spikes, the larger the learning rate. The weight updating direction is decided by which neuron spikes first: for an excitatory neuron, if the post-synaptic neuron spikes later, the synapse is strengthened; otherwise, it is weakened. For an inhibitory neuron, vice versa. When no synaptic weight changes any longer, or the weights saturate at 0/1, the learning process is finished.
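For illustration, a minimal pair-based STDP update is sketched below, assuming the common exponential timing window of [16]; the amplitudes a_plus, a_minus and time constant tau are hypothetical values, since the paper does not specify them:

```python
import numpy as np

def stdp_dw(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for one pre/post spike pair on an excitatory synapse.

    Pre before post (dt > 0) potentiates, post before pre depresses,
    and the magnitude decays with the spike-time distance.
    """
    dt = t_post - t_pre                      # spike-time difference, e.g. in ms
    if dt > 0:
        return a_plus * np.exp(-dt / tau)    # potentiation
    return -a_minus * np.exp(dt / tau)       # depression
```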

As an unsupervised method, STDP is mainly used for feature extraction; a complete machine learning system cannot be built on STDP alone, and a classifier is usually required for practical recognition tasks. However, in our experiment, the STDP method does not demonstrate sufficient feature extraction quality. For example, we use the classic MNIST handwritten digit dataset [13] to test the performance with a support vector machine (SVM) [17] without a kernel, where two 50-dimension feature sets are extracted with STDP and principal component analysis (PCA) respectively. The PCA-SVM method achieves a recognition accuracy of 94% while the STDP-based method only reaches 91%. As PCA is usually the baseline for evaluating the performance of feature extraction, STDP does NOT appear to be an efficient method for real-world cognitive applications or many other machine learning tasks.

B. Remote Supervision Method (ReSuMe)

The Remote Supervision Method (ReSuMe) is a supervised learning method proposed in [18]. The algorithm introduces a supervised spike train for each synapse during training. The training process comes to an end when the post-synaptic spike train matches the supervised spike train. However, ReSuMe faces difficulty in the pattern design of the supervised spike trains, and little guidance is offered on how to define the differences between different spike trains. Although some papers [19] have attempted to build learning systems on the ReSuMe learning algorithm, to the best of our knowledge, we have NOT seen any efficient way to solve a real-world task with it.

C. Neural Sampling Learning Scheme

The Neural Sampling learning scheme transforms the leaky Integrate-and-Fire (LIF) neuron into a nonlinear function (named the Siegert function) [20] which represents the relationship between the input and output firing rates of a neuron. Moreover, Neftci demonstrates in [3] that this nonlinear function, which is equivalent to the LIF neuron, satisfies the neural sampling conditions and can be approximated by a sigmoid function under certain conditions. Therefore, the network can be trained with contrastive divergence (CD), the classic algorithm used for restricted Boltzmann machines (RBM). Moreover, spiking RBMs can be stacked into multiple layers to form a spiking deep belief network (DBN), which has demonstrated satisfying performance. In [20], O'Connor shows that a 784×500×500×10 spiking DBN achieves a recognition accuracy of 95.2% on the MNIST dataset [13].
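A minimal sketch of this idea is shown below: one CD-1 weight update for an RBM whose unit nonlinearity is a sigmoid standing in for the Siegert firing-rate function. Biases are omitted for brevity, and all names and hyperparameters are our own illustrative choices rather than the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # stands in for the Siegert firing-rate function of the LIF neuron
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, v0, lr=0.05):
    """One contrastive divergence (CD-1) update of the RBM weights W.

    W  : (n_visible, n_hidden) weight matrix
    v0 : (batch, n_visible) input firing rates in [0, 1]
    """
    h0 = sigmoid(v0 @ W)                               # positive phase
    h0_s = (rng.random(h0.shape) < h0).astype(float)   # sample hidden units
    v1 = sigmoid(h0_s @ W.T)                           # reconstruction
    h1 = sigmoid(v1 @ W)                               # negative phase
    W += lr * (v0.T @ h0 - v1.T @ h1) / len(v0)        # CD-1 gradient step
    return W
```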

In Section IV, we will make a hardware mapping of the spiking neural network trained under the neural sampling learning scheme. The parameter quantization is discussed in Sections V-C and V-D. The training process, performed on the CPU platform, will not be discussed further in this work.

IV. RRAM-BASED SPIKING NEURAL NETWORK SYSTEM STRUCTURE

In this section, we make the hardware mapping to implement the RRAM-based spiking neural network according to the system structure described in Section II-B. First, we introduce the input encoding module which converts the numeral values to spike trains. Then, we describe the RRAM-based implementations of the synaptic weight crossbar and the LIF spiking neuron separately.

A. Input Transformation Module

Since spike trains propagate in the spiking neural network, the original input x = [x1, · · · , xN] should be mapped to spike trains X(t) = [X1(t), · · · , XN(t)] before running the test samples. Here, we define Xi(t) as a binary train with only two states, 0/1. For the ith input channel, the spike train is made of Nt pulse intervals, each of width T0, which implies that the spike train lasts for a length of time Nt · T0. Suppose the total spike number of all input channels during the given time Nt · T0 is Ns; then the spike count Ni of the ith channel is allocated as:

$$N_i = \sum_{k=0}^{N_t-1} X_i(kT_0) = \mathrm{round}\left(N_s \cdot \frac{v_i}{\sum_{k=1}^{N} v_k}\right) \quad (4)$$

which implies

$$\frac{N_i}{N_s} = \frac{v_i}{\sum_{i=1}^{N} v_i} \quad (5)$$

Then the Ni spikes of the ith channel are randomly placed in the Nt time intervals. For an ideal mapping, we would like to have Ni << Nt to keep the spike sparsity along the time dimension. However, for speed efficiency, we would like the running time Nt · T0 to be short. Here, T0 is defined by the physical clock, i.e. the clock of the pulse generator, which implies that we can only optimize Nt directly. We define the bit level of the input as

$$\log\left(\frac{N_t}{\mathrm{mean}(N_i)}\right) \quad (6)$$

which evaluates the tradeoff between time efficiency and accuracy. Some discussion on the choice of the input signal bit level is given in Section V-C.
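A minimal sketch of this encoding, following Eqs. (4)-(5) with the spikes placed at random slots, is given below. It assumes non-negative inputs and clips Ni at Nt, since a binary train holds at most one spike per interval (the paper does not state how this corner case is handled):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(v, n_s=2000, n_t=128):
    """Map numeral inputs v to 0/1 spike trains following Eqs. (4)-(5).

    v   : (N,) non-negative input values (e.g. pixel intensities)
    n_s : total spike budget N_s over all channels
    n_t : number of pulse intervals N_t per train
    Returns an (N, n_t) binary array, one spike train per channel.
    """
    n_i = np.round(n_s * v / v.sum()).astype(int)   # Eq. (4)
    trains = np.zeros((len(v), n_t), dtype=int)
    for i, n in enumerate(n_i):
        # place the N_i spikes at random slots; clip at n_t since a
        # binary train cannot hold more than one spike per interval
        slots = rng.choice(n_t, size=min(n, n_t), replace=False)
        trains[i, slots] = 1
    return trains
```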

B. RRAM-based Crossbar

According to Eq. (2), there does not exist a direct one-to-one mapping from the original weight matrix C to the crossbar conductance matrix G. Moreover, some physical limitations on G should be considered:

• The items cjk of the original weight matrix C can be either positive or negative, while every conductance item of the RRAM crossbar G must be positive. Thus, the original weight matrix C should be decomposed into two parts: one positive, C+, and the other negative, C−;

• According to Eq. (2), the parameter cjk must lie in the following range so that all the solved gjk fall between gOFF and gON:

$$\chi_{min} \leq c_{jk} \leq \chi_{max} \quad (7)$$

$$\chi_{min} = \frac{g_{OFF}}{g_s + g_{OFF} + (N-1)\,g_{ON}} \quad (8)$$

$$\chi_{max} = \frac{g_{ON}}{g_s + g_{ON} + (N-1)\,g_{OFF}} \quad (9)$$

where χmax and χmin are the maximum and minimum matrix parameters that can be represented by a physical RRAM crossbar array.

The resistance or conductance states of the RRAM devices in the crossbar array must be configured properly to realize the multiplied matrix C, which requires a mapping algorithm [21]. In order to satisfy the condition described in Eq. (7), the crossbar matrix can be represented as:

$$C = C^+ - C^- = \alpha\left[(C^+ + \Delta) - (C^- + \Delta)\right] \quad (10)$$

And Eq. (2) can be expressed as:

$$g_{k,j} - c_{k,j} \cdot \sum_{l=1}^{N} g_{k,l} = g_s \cdot c_{k,j} \quad (11)$$

Based on the above equations, all the RRAM devices in the same column form a linear equation system, and we can obtain a feasible solution that satisfies the physical limitations of the RRAM devices by exhaustive search over α and Δ.

According to Eqs. (7)-(10), the constraints on α and Δ can be expressed as:

$$\frac{\chi_{min}}{\alpha} \leq \Delta \leq \frac{\chi_{max}}{\alpha} - c_{max} \quad (12)$$

$$\alpha \leq \frac{\chi_{max} - \chi_{min}}{c_{max}} \quad (13)$$

The above constraints can be used to reduce the search space.
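As an illustration, the following sketch scans the feasible region defined by Eqs. (12)-(13) and returns a candidate (α, Δ) pair. The grid resolution and the choice of keeping the largest feasible α (to use more of the conductance range) are our own heuristics, not the paper's:

```python
import numpy as np

def search_alpha_delta(c, chi_min, chi_max, n_grid=200):
    """Scan the feasible region of Eqs. (12)-(13) for a scaling alpha
    and offset delta such that every entry of the shifted matrices
    alpha * (C± + delta) is realizable by the crossbar per Eq. (7)."""
    c_max = np.abs(c).max()
    best = None
    # Eq. (13): alpha is bounded above by (chi_max - chi_min) / c_max
    for alpha in np.linspace(1e-6, (chi_max - chi_min) / c_max, n_grid):
        lo = chi_min / alpha            # Eq. (12), lower bound on delta
        hi = chi_max / alpha - c_max    # Eq. (12), upper bound on delta
        if lo <= hi:
            best = (alpha, 0.5 * (lo + hi))  # keep the largest feasible alpha
    return best
```

Given the chosen (α, Δ), the conductances of each column then follow from the linear system of Eq. (11).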

C. Hardware-Implemented Spiking Neuron

Here we introduce the hardware implementation of the spiking neuron following the LIF model introduced in Section II-B. As shown in Fig. 3, the capacitor Cmem works as an integrator to accumulate the crossbar output current Ioj, which passes through a current mirror (T1, T2). The RRAM device Rmem functions as the leaky path. Therefore, the leaky time constant τ = RmemCmem is tunable by setting the gap distance of the RRAM device to a specific length. T3 works as a digital switch which controls the reset path: once the voltage on Vmem exceeds the threshold Vth, the output of the comparator is set to a high voltage level, the flip-flop sends a pulse to the next crossbar and, at the same time, the reset path conducts, which means the voltage on Cmem is reset to Vreset.

V. EXPERIMENTS

In this section, a case study on the MNIST digit recognition dataset is used to evaluate the performance of the proposed RRAM-based SNN system on real-world applications.


Fig. 4. Recognition accuracy under (a) different bit levels of RRAM devices, (b) different bit levels of the input module, (c) different degrees of input signal fluctuation, and (d) different degrees of process variation of RRAM devices.

TABLE I
IMPORTANT PARAMETERS OF THE SNN SYSTEM

Network Size                     784 × 500 × 500 × 10
Number of Input Spikes (Ns)      2000
Number of Pulse Intervals (Nt)   128
Input Pulse Voltage (V)          1 V
Pulse Width (T0)                 1 ns

A. Experiments Setup

The MNIST dataset [13] is used to test the performance of the RRAM-based neural network system proposed in Section IV. MNIST is a widely used dataset for optical character recognition with 60,000 handwritten digits in the training set and 10,000 in the testing set. In our experiment, we use all the training examples of handwritten digits '0'∼'9' to train the neural network system and randomly select 1,000 samples for testing. The parameters are shown in Table I.

For the spiking neural network simulation, we first train the network on CPU and then use SPICE to simulate the circuit performance. In the training process, the synaptic weights are trained on the CPU platform by the CD algorithm with firing rates propagating through the network according to Section III-C.

Then, for the testing process, a Verilog-A RRAM device model [14] is used to build the SPICE-level crossbar array. The circuit simulation is performed with the weight matrix mapped to the RRAM-based crossbar and the voltage pulse trains generated according to the firing rates. Meanwhile, the analog spike neuron is implemented with its parameters mapped under the Siegert approximation. The maximum amplitude of the input voltage of each crossbar (the output of each flip-flop) is set to 1 V to achieve better linearity of the RRAM devices. Most of the input voltages applied on the RRAM devices are around tens to hundreds of millivolts. A counter is cascaded at each output port to calculate the spike number of each spike train from the output neuron layer, and a comparator is used to select the port with the highest spike count and provide the recognition result.

Fig. 3. Circuit of the spike neuron

The simulation results are provided in Fig. 4. Comparisons are made between different input and device bit levels, and the effects of signal fluctuation and device variation are also analyzed.

B. System Performance and Efficiency

We train the SNN with the size of 784 × 500 × 500 × 10. The experimental results show that the recognition accuracy is 95.4% on the CPU platform and 91.2% on the ideal RRAM-based hardware implementation. The recognition performance decreases by about 4% because it is impossible to satisfy Ni << Nt on the RRAM platform, as mentioned in Section IV-A. Moreover, due to technology limits and device variation, the effect of device bit level quantization on the performance should be taken into consideration. Meanwhile, as discussed in Section IV-A, the bit level quantization of the input signal should also be taken into consideration; both are discussed in Sections V-C and V-D.

The power consumption of the system is mainly contributed by three parts: the crossbar, the comparator, and the RmemCmem leaky path. The simulation results show that the power consumption is about 3.5 mW on average. However, each recognition takes Nt = 128 cycles with the physical clock T0 = 1 ns. Though the input conversion from numeral values to spike trains leads to a roughly 100X decrease in effective clock rate, the system is still able to complete the recognition task in real time (∼1 μs/sample) thanks to the short latency of the RRAM devices.

C. Impact of Input Signal Bit Level Quantization and Signal Fluctuation

As discussed in Section IV-A, a larger input signal bit level leads to better recognition accuracy, while a smaller bit level leads to more power efficiency and makes each test complete in a shorter time. The simulation results summarized in Fig. 4(b) show that an input signal above the 6-bit level achieves satisfying recognition accuracy (>85%). Based on the 8-bit RRAM result, different levels of signal fluctuation are added to the 8-bit input signal. The result shown in Fig. 4(c) demonstrates that the accuracy decreases by only about 3% given 20% fluctuation. The sparsity of the spike train leads to the system's robustness, making it insensitive to input fluctuation.

D. Impact of RRAM Device Bit Level Quantization and Process Variation

Since it is impossible to tune an RRAM device accurately to a specific gap distance, it is necessary to analyse the effect of the bit level quantization on the recognition accuracy, which offers a guide for choosing an appropriate bit level. The simulation results in Fig. 4(a) show that an 8-bit RRAM device is able to realize a recognition accuracy of nearly 90%. Moreover, there exists the possibility that the gap distance of an RRAM device changes when the current applied to the device exceeds the requirement or some random events happen. Therefore, the performance under different degrees of process variation is studied, and the simulation results in Fig. 4(d) show that when the RRAM devices are configured at the 8-bit level with 6-bit level input, the performance does not decrease under 20% variation.
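To illustrate how such a non-ideality study can be set up, the sketch below quantizes a conductance matrix to a given bit level and injects multiplicative device variation. Modeling the variation as lognormal is our assumption, since the paper does not specify the distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_and_perturb(g, g_off, g_on, bits=8, variation=0.2):
    """Quantize conductances to 2**bits uniform levels in [g_off, g_on],
    then inject multiplicative lognormal device variation (assumed)."""
    step = (g_on - g_off) / (2 ** bits - 1)
    g_q = g_off + np.round((g - g_off) / step) * step   # bit-level quantization
    sigma = np.log(1.0 + variation)                     # ~20% relative spread
    return g_q * rng.lognormal(0.0, sigma, size=g.shape)
```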

VI. CONCLUSION

In this work, we propose an energy efficient implementation of SNN based on RRAM devices. As for the training algorithm, the neural sampling learning scheme is chosen over the STDP and ReSuMe methods for practical cognitive tasks, since it makes it easy to implement a multi-layer network structure and achieves good recognition accuracy. As for the hardware architecture, the implementation includes analog LIF neurons with a tunable leaky time constant, an RRAM-based crossbar functioning as the synaptic weight matrix, and an input encoding scheme converting the numeral values to spike trains. A mapping algorithm is also introduced to configure the RRAM-based SNN efficiently. The experiments on the MNIST database demonstrate that the proposed RRAM-based SNN achieves 91.2% accuracy with a power consumption of 3.5 mW per test sample. In addition, the RRAM implementation of SNN is robust to 20% process variation with less than 1% accuracy decrease, and to 20% signal fluctuation with about 2% accuracy decrease. These results demonstrate that the RRAM-based SNN is readily realizable in physical hardware.

ACKNOWLEDGMENT

This work was supported by 973 project 2013CB329000, National Science and Technology Major Project (2013ZX03003013-003), National Natural Science Foundation of China (No. 61373026, 61261160501), the Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions, the Tsinghua University Initiative Scientific Research Program, and CNS-1253424. We also gratefully acknowledge Prof. Shimeng Yu for his help with the RRAM model, and the open-source code published by Danny Neil on GitHub.

REFERENCES

[1] H. Esmaeilzadeh, E. Blem, R. St. Amant, K. Sankaralingam, and D. Burger, "Dark silicon and the end of multicore scaling," in Computer Architecture (ISCA), 2011 38th Annual International Symposium on. IEEE, 2011, pp. 365–376.

[2] Y. Xie, "Future memory and interconnect technologies," in Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013. IEEE, 2013, pp. 964–969.

[3] E. Neftci, S. Das, B. Pedroni, K. Kreutz-Delgado, and G. Cauwenberghs, "Event-driven contrastive divergence for spiking neuromorphic systems," Frontiers in Neuroscience, vol. 7, 2013.

[4] T. Masquelier and S. J. Thorpe, "Unsupervised learning of visual features through spike timing dependent plasticity," PLoS Computational Biology, vol. 3, no. 2, p. e31, 2007.

[5] S. K. Esser, A. Andreopoulos, R. Appuswamy, P. Datta, D. Barch, A. Amir, J. Arthur, A. Cassidy, M. Flickner, P. Merolla et al., "Cognitive computing systems: Algorithms and applications for networks of neurosynaptic cores," in Neural Networks (IJCNN), The 2013 International Joint Conference on. IEEE, 2013, pp. 1–10.

[6] E. Painkras, L. A. Plana, J. Garside, S. Temple, F. Galluppi, C. Patterson, D. R. Lester, A. D. Brown, and S. B. Furber, "SpiNNaker: A 1-W 18-core system-on-chip for massively-parallel neural network simulation," Solid-State Circuits, IEEE Journal of, vol. 48, no. 8, pp. 1943–1953, 2013.

[7] R. Wang, T. J. Hamilton, J. Tapson, and A. van Schaik, "An FPGA design framework for large-scale spiking neural networks," in Circuits and Systems (ISCAS), 2014 IEEE International Symposium on. IEEE, 2014, pp. 457–460.

[8] R. Ananthanarayanan, S. K. Esser, H. D. Simon, and D. S. Modha, "The cat is out of the bag: cortical simulations with 10^9 neurons, 10^13 synapses," in High Performance Computing Networking, Storage and Analysis, Proceedings of the Conference on. IEEE, 2009, pp. 1–12.

[9] V. Narayanan, S. Datta, G. Cauwenberghs, D. Chiarulli, S. Levitan, and P. Wong, "Video analytics using beyond CMOS devices," in Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014. IEEE, 2014, pp. 1–5.

[10] B. Liu, M. Hu, H. Li, Z.-H. Mao, Y. Chen, T. Huang, and W. Zhang, "Digital-assisted noise-eliminating training for memristor crossbar-based analog neuromorphic computing engine," in Design Automation Conference (DAC), 2013 50th ACM/EDAC/IEEE. IEEE, 2013, pp. 1–6.

[11] M. Hu, H. Li, Q. Wu, and G. S. Rose, "Hardware realization of BSB recall function using memristor crossbar arrays," in Proceedings of the 49th Annual Design Automation Conference. ACM, 2012, pp. 498–503.

[12] B. Li, Y. Shan, M. Hu, Y. Wang, Y. Chen, and H. Yang, "Memristor-based approximated computation," in Low Power Electronics and Design (ISLPED), 2013 IEEE International Symposium on. IEEE, 2013, pp. 242–247.

[13] Y. LeCun and C. Cortes, "The MNIST database of handwritten digits," 1998.

[14] S. Yu, B. Gao, Z. Fang, H. Yu, J. Kang, and H.-S. P. Wong, "A low energy oxide-based electronic synaptic device for neuromorphic visual systems with tolerance to device variation," Advanced Materials, vol. 25, no. 12, pp. 1774–1779, 2013.

[15] G. Indiveri, "A low-power adaptive integrate-and-fire neuron circuit," in ISCAS (4), 2003, pp. 820–823.

[16] S. Song, K. D. Miller, and L. F. Abbott, "Competitive Hebbian learning through spike-timing-dependent synaptic plasticity," Nature Neuroscience, vol. 3, no. 9, pp. 919–926, 2000.

[17] C.-C. Chang and C.-J. Lin, "LIBSVM: a library for support vector machines," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, no. 3, p. 27, 2011.

[18] F. Ponulak, "ReSuMe - new supervised learning method for spiking neural networks," Institute of Control and Information Engineering, Poznan University of Technology. (Available online at: http://d1.cie.put.poznan.pl/~fp/research.html), 2005.

[19] J. Hu, H. Tang, K. C. Tan, H. Li, and L. Shi, "A spike-timing-based integrated model for pattern recognition," Neural Computation, vol. 25, no. 2, pp. 450–472, 2013.

[20] P. O'Connor, D. Neil, S.-C. Liu, T. Delbruck, and M. Pfeiffer, "Real-time classification and sensor fusion with a spiking deep belief network," Frontiers in Neuroscience, vol. 7, 2013.

[21] P. Gu et al., "Technological exploration of RRAM crossbar array for matrix-vector multiplication," in ASPDAC, 2015.
