Final Presentation: Neural Network Implementation On FPGA
Supervisor: Chen Koren
Maria Nemets 309326767, Maxim Zavodchik 310623772

Posted: 21-Dec-2015

Page 1: Final Presentation Neural Network Implementation On FPGA Supervisor: Chen Koren Maria Nemets 309326767 Maxim Zavodchik 310623772.

Final Presentation

Neural Network Implementation On FPGA

Supervisor: Chen Koren

Maria Nemets 309326767
Maxim Zavodchik 310623772

Page 2

Project Objectives

Implementing a neural network on FPGA
Creating a modular design
Implementing in software (Matlab)
Creating a PC interface
Performance analysis:
  Area on chip
  Interconnections
  Speed vs. software implementation
  Frequency
  Cost

Page 3

Project’s Part A Objectives

Implementing a single neuron in VHDL.
Researching and integrating into the EDK environment and running the design on FPGA.
Implementing the feed-forward calculation.
Implementing the learning in Matlab.
Building a graphical user interface for friendly communication with the system.

Page 4

Testing Application

A single neuron can separate two regions by a linear boundary.
A multi-layered network is needed to recognize an image.
Implementing the AND/OR functions:

[Figure: the four input points (0,0), (0,1), (1,0), (1,1) plotted twice, each with a linear separating line - AND Function, OR Function]
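The AND/OR separation above can be sketched with a hard-threshold neuron. The weights and biases below are illustrative hand-picked values, not the ones the project learns in Matlab:

```python
def neuron(x, w, bias):
    """Single hard-threshold neuron: fires when the weighted sum crosses zero."""
    v = sum(xi * wi for xi, wi in zip(x, w)) + bias
    return 1 if v > 0 else 0

# Truth tables over the four points (0,0), (0,1), (1,0), (1,1):
and_out = [neuron((a, b), [1, 1], -1.5) for a in (0, 1) for b in (0, 1)]
or_out = [neuron((a, b), [1, 1], -0.5) for a in (0, 1) for b in (0, 1)]
```

Both functions need only a single linear boundary, which is why one neuron suffices for this test application.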

Page 5

Learning in Matlab

Implementing a NN using the logsig() activation function and the 'traingdx' training algorithm.
Providing a truth table for the binary functions AND/OR as a training set.

  % Build the NN
  temp = size(inputs_vec);
  in_range = zeros(temp(1), 2);
  in_range(:, 2) = 1;
  net = newff(in_range, [1], {'logsig'}, 'traingdx');

  % Train the NN
  net.trainParam.epochs = epochs;
  net.trainParam.goal = error;
  net = train(net, inputs_vec, target_vec);

Sigmoid function: φ(v) = 1 / (1 + e^(-v))
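As a quick sketch, the activation can be computed directly; `logsig` here mirrors Matlab's function of the same name:

```python
import math

def logsig(v):
    """The logsig activation: phi(v) = 1 / (1 + e^(-v))."""
    return 1.0 / (1.0 + math.exp(-v))

# phi(0) = 0.5; note that phi(-v) = 1 - phi(v), which is why the
# decision thresholds used later (0.3789 and 0.6211) sum to 1.
```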

Page 6

Hardware Description

Xilinx ML310 Development Board
  RS232 standard - FPGA UART
    Transmission rate is 115,200 bits/sec optimally
  Virtex-II Pro XC2VP30 FPGA
    2 PowerPC 405 cores - 300+ MHz
    2,448 Kbits of BRAM
    136 18x18-bit multipliers
    30,816 logic cells
    Up to 111,232 internal registers
    Up to 111,232 LUTs
  256 MB DDR DIMM

Page 7

System Interface

Inputs:
  Binary number (up to 1024 bits)
  Weights - 13 bits wide
    Fixed-point representation:
      1 sign bit
      4 integer bits
      8 fraction bits
  Sigmoid function values - 8 bits wide

Outputs:
  Two bits - the neuron’s binary result on the input number, or failure detection.
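A sketch of the weight format: with 1 sign, 4 integer and 8 fraction bits, the step is 2^-8 and the range is [-16, 16). The rounding and saturation policy below is an assumption, since the slides do not specify it:

```python
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS               # 2^8 = 256 quantization steps per unit

def to_fixed(x):
    """Encode a real weight into the 13-bit signed fixed-point word."""
    q = round(x * SCALE)
    lo, hi = -(1 << 12), (1 << 12) - 1   # 13-bit two's-complement range
    return max(lo, min(hi, q))           # saturate out-of-range values

def from_fixed(q):
    """Decode the 13-bit word back to a real value."""
    return q / SCALE
```

For example, the bias weight -3.5 used later in the simulation is exactly representable as -896 in this format.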

Page 8

System Description

[Block diagram: a PowerPC core on the PLB controls the Single Neuron and its Weights, Input and Sigmoid memories; a PLB2OPB bridge connects the PLB to the OPB, where the UART sits]

Page 9

EDK Integration

The PPC writes the BRAMs and controls the Single Neuron through the PLB.
The Single Neuron is connected to the PLB as a User Core (IPIF).
Memories:
  PORT1: connected to the PLB as IPIF
  PORT2: connected to the Single Neuron directly
The UART (serial port) is connected to the OPB.

Page 10

Control Flow

FSM states: IDLE, Get Weights, Get Sigmoid, Load decision values, Get Inputs, Load input number, Load bias, Wait for loading bias, Calculate v = Σ_{j=1..m} w_j·x_j + w_0, Calculate φ(·), Calculate output bits, Send the result to the user.

Page 11

Architecture – Single Neuron

Multiplier, 1x13 bits
Accumulator, 13 bits wide
FSM controller
Bias/Min/Max/Inputs_num registers
Comparator:

  COMP(z) = "00"  if z < min_decision_val
            "01"  if z > max_decision_val
            "10"  else

[Block diagram: X[i] and W[i] (with the bias weight multiplexed in) feed the MULT; the accumulator produces v; the sigmoid memory returns logsig(v); the comparator checks it against the min/max decision values and registered bias/max/min/inputs_num inputs to drive the output Y]
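The comparator's mapping can be written out directly. The strictness of the inequalities is an assumption, since the slide's formula is only partially legible:

```python
def comp(z, min_decision_val, max_decision_val):
    """Map the sigmoid output z to the neuron's two result bits."""
    if z < min_decision_val:
        return "00"   # confidently a logical 0
    if z > max_decision_val:
        return "01"   # confidently a logical 1
    return "10"       # between the thresholds: failure detection
```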

Page 12

Architecture – Memories (1)

2-port BRAMs with separate clocks.
Special-sized BRAMs generated by the Xilinx Core Generator.
VHDL SRAM controller wrapping.

Inputs memory: up to 1024 binary bits (1 Kbyte)

Page 13

Architecture – Memories (2)

Weights memory (~1.6 Kbyte):
  1024 x 13 bits = 13,312 bits = 1,664 bytes
Bias weight:
  1 register for the output layer (13 bits wide)
Sigmoid memory (2 Kbyte):
  Values out of the range [-4, 4] are mapped to 0 or 1
  Memory block quantizing sigmoid values:
    11-bit input representing values in [-4, 4]
    8-bit output representing values in [0, 1]
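A sketch of how such a sigmoid table could be generated: 2^11 = 2048 addresses spanning [-4, 4), each holding an 8-bit sample, for 2 Kbyte in total. The exact quantization, round(φ·256) saturated at 255, is an assumption chosen because it reproduces the 9F = 0.6211 sample on the simulation slides:

```python
import math

def build_sigmoid_lut():
    """2^11-entry, 8-bit-wide sigmoid table, as described on the slide."""
    lut = []
    step = 8.0 / (1 << 11)                        # [-4, 4) over 2048 addresses
    for addr in range(1 << 11):
        v = -4.0 + addr * step
        phi = 1.0 / (1.0 + math.exp(-v))
        lut.append(min(255, round(phi * 256)))    # 8-bit output word
    return lut
```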

Page 14

Simulation (1)

Single Neuron VHDL simulation.
Application: AND function with 4 inputs
  Minimum decision value: 0.3789
  Maximum decision value: 0.6211
3 pipeline stages: Memories → Mult → Accumulator

Page 15

Simulation (2)

Result:
  Sigmoid answer: 9F = 10011111 = 0.6211
  The “ready” signal is asserted when done.
  Latency: 14 + |Inputs| - 1 [clocks]

  v = Σ_{i=1..4} x_i·w_i + w_0 = 4·1 + (-3.5) = 0.5
  φ(v) = φ(0.5) = 0.6225 ≈ 0.6211
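The slide's arithmetic can be checked end to end. This recomputes the example in plain Python (4 inputs of 1, unit weights, bias weight -3.5), assuming round(φ·256) as the 8-bit quantization:

```python
import math

# Recompute the simulation example: AND over 4 inputs, all x_i = 1.
x = [1, 1, 1, 1]
w = [1.0, 1.0, 1.0, 1.0]
w0 = -3.5
v = sum(xi * wi for xi, wi in zip(x, w)) + w0   # = 4*1 - 3.5 = 0.5
phi = 1.0 / (1.0 + math.exp(-v))                # logsig(0.5), about 0.6225
q = min(255, round(phi * 256))                  # 8-bit word: 159 = 0x9F
readback = q / 256.0                            # about 0.6211, the slide's value
```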

Page 16

Software

The PPC’s program controls the whole flow.
The PPC writes control words and reads result words on the PLB as 64 bits of data.

Control/Result word structure:

Memories (W/X):     [0: USER_wr_a][1÷10: USER_addr_a][11÷24: USER_dout_a][25÷63: "0"]
Memories (Sigmoid): [0: USER_wr_a][1÷11: USER_addr_a][12÷19: USER_dout_a][20÷63: "0"]
Single Neuron, from CPU: [0: load_w0][1: rst][2: start][3: w0_ready][4: load_min_val][5: load_max_val][6: load_inputs_num][7÷19: w0/min_val/max_val/inputs_number][20÷63: "0"]
Single Neuron, to CPU:   [0÷1: y][2: ready][3: w0_rd][4÷63: "0"]
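Packing the Single Neuron control word can be sketched as bit arithmetic. The field positions follow the slide, but this assumes bit 0 is the least-significant bit of the 64-bit word; PowerPC/PLB conventions often number bits from the MSB, so the real layout may be mirrored. The function name is illustrative:

```python
def make_ctrl_word(load_w0=0, rst=0, start=0, w0_ready=0,
                   load_min_val=0, load_max_val=0,
                   load_inputs_num=0, data=0):
    """Pack the 64-bit Single Neuron control word: flag bits 0..6,
    then a 13-bit data field (w0/min_val/max_val/inputs_number) at bits 7..19."""
    return (load_w0 | (rst << 1) | (start << 2) | (w0_ready << 3) |
            (load_min_val << 4) | (load_max_val << 5) |
            (load_inputs_num << 6) | ((data & 0x1FFF) << 7))
```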

Page 17

Graphical User Interface

Building a graphical user interface for friendly communication between the user and the system.
Implemented in Matlab 6.1.
The GUI enables:
  Choosing a function to be implemented
  Defining the maximum error, number of epochs and decision values
  Choosing the length of the binary input vector
  Simulating the neuron for an input vector

Page 18

Project’s Part B Objectives

Creating a multi-layered network to classify a digit.
Implementing a modular system:
  The number of neurons in the hidden layer varies from 2 to 10.
  The number of sub-networks varies.

Page 19

Project’s Part B Objectives (Cont.)

Implementing a parallel system:
  Dividing the complex fully-connected network into sub-networks.
  10 sub-networks running concurrently.
  Up to 10 neurons run concurrently in each sub-network.
  Up to 5 inputs are calculated together, depending on the number of neurons in the hidden layer.
  Parallel calculation of the output layer.

Page 20