Gaj1MAPLD 2005/1016 Development and Maintenance of User Libraries for SRC Reconfigurable Computers...

Gaj 1 MAPLD 2005/1016 Development and Maintenance of User Libraries for SRC Reconfigurable Computers Kris Gaj 1 , Tarek El-Ghazawi 2 , Paul Gage 3 , Dan Poznanovic 3 , Chang Shu 1 , Deapesh Misra 1 , Miaoqing Huang 2 , Esam El- Araby 2 , Mohamed Taher 2 1 George Mason University 2 The George Washington University 3 SRC Computers, Inc.


Gaj 1 MAPLD 2005/1016

Development and Maintenance of User Libraries for SRC Reconfigurable Computers

Kris Gaj (1), Tarek El-Ghazawi (2), Paul Gage (3), Dan Poznanovic (3), Chang Shu (1), Deapesh Misra (1), Miaoqing Huang (2), Esam El-Araby (2), Mohamed Taher (2)

(1) George Mason University
(2) The George Washington University
(3) SRC Computers, Inc.

Reconfigurable Computers

What is a reconfigurable computer?

[Figure: a microprocessor system (processors μP, each with its own μP memory, plus an interface and I/O) coupled to an FPGA system (FPGAs, each with its own FPGA memory, plus an interface and I/O).]


Examples of High-End Reconfigurable Computers

• SRC-6E and SRC Hi-Bar Based Systems from SRC Computers, Inc.

• Cray XD1 (formerly OctigaBay 12K) from Cray Inc.

• SGI Altix 3000 from Silicon Graphics

• Star Bridge Hypercomputer from Star Bridge Systems


SRC MAP™ Reconfigurable Processor

Source: [SRC, MAPLD04]

SRC-6E Hardware Architecture

[Figure: block diagram of the SRC-6E. μP board: two P4 (2.8 GHz) processors with L2 caches, MIOC, computer memory (8 GB), PCI-X, DDR interface, and SNAP. ½ MAP board: control FPGA (XC2V6000), on-board memory (24 MB), FPGA 1 and FPGA 2 (both XC2V6000), and chain ports. Link labels in the diagram: 22400MB/s, 4256 MB/s, 2128 MB/s, 1064 MB/s, 4800 MB/s (6 x 64 bits), 2400 MB/s (192 bits), and chain ports at 2400 MB/s (108 bits).]

SRC Hi-Bar Based Systems

• Hi-Bar sustains 1.4 GB/s per port with 180 ns latency per tier
• Up to 256 input and 256 output ports
• Common Memory (CM) has controller with DMA capability
• Up to 8 GB DDR SDRAM supported per CM node

[Figure: SRC-6 system built around the SRC Hi-Bar switch. Attached to the switch are MAP® processors (with chaining and GPIO ports), microprocessor nodes (μP with memory, connected through SNAP™), and Common Memory nodes. The microprocessor nodes reach the customers' existing networks (storage area network, local area network, wide area network, disk) over PCI-X via Gig Ethernet, etc.]

Source: [SRC, MAPLD04]

SRC Programming

[Figure: the application programmer writes in an HLL (C) for both the μP system and the FPGA system; the library developer writes in an HDL (VHDL) for the FPGA system.]

SRC Program Partitioning

μP system:
• C function for μP (HLL)

FPGA system:
• C function for FPGAs (HLL)
• VHDL macro for FPGAs (HDL)

Run Time Reconfiguration in SRC

Program in C or Fortran:

Main program
    Function_1(a, d, e)
    Function_2(d, e, f)

Function_1
    Macro_1(a, b, c)
    Macro_2(b, d)
    Macro_2(c, e)

Function_2
    Macro_3(s, t)
    Macro_1(n, b)
    Macro_4(t, k)

FPGA contents after the Function_1 call: one instance of Macro_1, which takes a and produces b and c, feeding two instances of Macro_2, which produce d and e.

SRC Development Environment

[Figure: compilation flow. Application sources (.c or .f files) go to the μP compiler and the MAP compiler, which produce object files (.o files). HDL sources and user macro sources (.vhd or .v files), together with the .v files coming from the MAP compiler, pass through logic synthesis (.edf files) and place & route (.bin files) to yield the configuration bitstreams. The linker combines the object files and the .bin files into the application executable.]


Advantages of reconfigurable computers

• can be programmed by mathematicians themselves using traditional programming languages or GUI environments

• encourage innovation and experimentation

• general-purpose: cost distributed among multiple users with different needs

• behave like hardware:
  - parallel processing
  - distributed memory
  - specialized functional units, etc.


Conditions necessary for the success of reconfigurable computers

• ease of use of library macros and functions

• existence of comprehensive libraries of user macros and functions capable of running on FPGAs

• significant speed-ups ( 100 x) of basic functions running on FPGAs compared to state-of-the-art microprocessors

Development and Maintenance of SRC Libraries

Structure of the macro repository

< top of repository >
    < macros >
        <lib # 1 >    <lib # 2 >    <lib # 3 >
            common    rev_d    rev_e    rev_f
                macro1    macro2    macro3
                    hdlfile    InfoFile    BlkBoxFile    DebugCodeFile    DataSheet

Macro Types

common:
• These macros have no connections to external pins and do not depend on any feature specific to an FPGA type. This type of macro can be used on any MAP.

rev_d:
• These macros have a specific dependency on the dual MAP.

rev_e:
• These macros have a specific dependency on the single MAP.

rev_f:
• These macros have a specific dependency on the compact MAP.

Files describing the macro

Platform independent HDL file: macro.v or macro.vhd
• Verilog or VHDL code defining the macro

Debug Code File: macro.c
• provides the equivalent C functionality for the macro

Platform dependent Blk Box File: blackbox.v
• Interface (black box) definition for the macro in Verilog

Data sheet file: datasheet
• contains the documentation for the macro

Info File: info
• Info file entry for the given macro, containing macro type, latency, names of input/output/control signals, etc.

CVS repository

To properly manage a distribution of macros, a CVS repository must be set up. This allows source code changes to be controlled and permits multiple developers to work on the code.

The Installed Macro Library Structure

<xxx lib>
    map 3 (built for the Xilinx Virtex2)    map 4 (built for the Xilinx Virtex2Pro)
        common    rev_d    rev_e
            ngo    blkbox.v (single blackbox file)    macros.info (single info file)
                macro1    macro2    macro3    ......

Obtained by running a special script developed by SRC

Library Script

Usage:

build_libs [OPTION]
    [-b, --branch br]              Specify CVS branch
    [-c, --checkout]               Checkout only
    [-d, --CVSROOT cvsroot]        Specify CVSROOT
    [-M, --MAP maptype]            Build for MAP maptype
    [-m, --module mod]             Build mod only
    [-r, --restart mmddyy-hhmm]    Restart previous build
    [-s, --step target]            Run build step target
    [-v, --version N.n]            Package as version N.n
    [-V, --vendor vend]            Specify distribution vendor
    [-w, --workspace path]         Create workspace in path

Building libraries

• build_libs will check out the library and perform a build in /var/tmp/builds, in a folder named with a time stamp (e.g. 080405-1705)

• If there is an error, check the file called 'output' in /var/tmp/builds. Fix the error and restart the build with:
    build_libs --restart 080405-1705

• You can also do a partial build, say only build the library and not the CD:
    build_libs --step lib

• To build only a particular subset of a library, use a command such as:
    build_libs --module crypto

Structure for the repository of MAP C functions

< top of repository >
    < userlib >
        <lib # 1 >    <lib # 2 >    <lib # 3 >
            common    rev_d    rev_e    rev_f
                routine1    routine2    routine3

Files describing the MAP C routine

Source file:
• This is the .mc or .mf file defining the MAP routine.

proto.h:
• This file provides a prototype of the MAP routine.

Makefile:
• This is a standard Carte Makefile, with the exception that no BIN environment variable is provided.

Docfile:
• This file provides man-page-format documentation of the MAP routine.

The Installed MAP Routine Library Structure

<userlib >
    map 3    map 4
        common    rev_d    rev_e
            lib1.a    lib1.so    lib2.a    lib2.so    ......

Known problems: No support for variable size of operands

Problem statement

We would like to be able to create and maintain a library of generic components that work for various operand sizes.

Example:

Basic arithmetic operations (addition, subtraction, multiplication, division) of multiprecision (n-bit) integers.
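To make the operand-size problem concrete, here is a minimal host-side sketch of multiprecision multiplication in plain C. It is not code from the SRC libraries: the name mp_mul and the 32-bit limb size are assumptions chosen so that every partial product fits in a standard uint64_t. The point it illustrates is that the operand length n is a run-time parameter, which is exactly what a fixed-width hardware macro interface does not offer directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of any SRC library).
 * Multiplies two n-limb unsigned integers a and b, stored least
 * significant limb first with 32 bits per limb, into the 2n-limb
 * product c using the schoolbook method. */
void mp_mul(const uint32_t *a, const uint32_t *b, uint32_t *c, size_t n)
{
    assert(n > 0);

    /* Clear the whole result before accumulating partial products. */
    for (size_t k = 0; k < 2 * n; k++)
        c[k] = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t carry = 0;
        for (size_t j = 0; j < n; j++) {
            /* a[i]*b[j] + c[i+j] + carry is at most 2^64 - 1,
             * so the sum cannot overflow a uint64_t. */
            uint64_t t = (uint64_t)a[i] * b[j] + c[i + j] + carry;
            c[i + j] = (uint32_t)t;   /* low 32 bits stay in place */
            carry = t >> 32;          /* high 32 bits move up      */
        }
        /* c[i+n] has not been written by earlier rows. */
        c[i + n] = (uint32_t)carry;
    }
}
```

Calling mp_mul(a, b, c, n) with different values of n needs no change to the routine. A hardware macro with a fixed port list does not adapt this way, which motivates the solutions on the following slides.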

Possible solutions

1. Fixed-size interface to a macro

• using streams
• without using streams

2. Variable-size interface to a macro cell

[Figure: Process blocks with a 64-bit input and a 64-bit output.]

Passing variable-size operands without streams

    for (i = 0; i < 3*N+1; i++) {
        if (i < N) {
            A_in = c[i];
            B_in = d[i];
        } else {
            A_in = 0;
            B_in = 0;
        }

        mul(i, A_in, B_in, &C_out);

        if (i > N)
            e[i-N] = C_out;
    }

Passing variable-size operands using streams

    #pragma src section
    {
        for (i = 0; i < N; i++) {
            put_stream(&S0, A[i], 1);   // put A[i] to S0
            put_stream(&S1, B[i], 1);   // put B[i] to S1
        }
    }
    #pragma src section
    {
        mul(&S0, &S1, &S2);             // read from S0 and S1, write to S2
    }
    #pragma src section
    {
        for (i = 0; i < 2*N; i++)
            get_stream(&S2, &C[i]);     // take from S2 and write to C[i]
    }
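The stream-based scheme can be modelled on an ordinary host in plain C. The sketch below is a software stand-in, not Carte code. The names fifo_t, fifo_put, fifo_get, mul_stage and stream_mul are hypothetical, as are the FIFO depth and the 32-bit word size. The three stages run one after another here, whereas on the MAP the sections would run concurrently. It shows how a producer, a multiplier stage, and a consumer exchange operands of run-time length n through fixed-width channels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 64   /* assumed capacity; must hold a full 2n-word result */

/* Minimal software FIFO standing in for a hardware stream. */
typedef struct {
    uint32_t data[FIFO_DEPTH];
    size_t head, tail, count;
} fifo_t;

static void fifo_init(fifo_t *f) { f->head = f->tail = f->count = 0; }

static void fifo_put(fifo_t *f, uint32_t w)
{
    assert(f->count < FIFO_DEPTH);
    f->data[f->tail] = w;
    f->tail = (f->tail + 1) % FIFO_DEPTH;
    f->count++;
}

static uint32_t fifo_get(fifo_t *f)
{
    assert(f->count > 0);
    uint32_t w = f->data[f->head];
    f->head = (f->head + 1) % FIFO_DEPTH;
    f->count--;
    return w;
}

/* Middle stage: reads n words from each input stream and writes the
 * 2n-word product (schoolbook method) to the output stream. */
static void mul_stage(fifo_t *s0, fifo_t *s1, fifo_t *s2, size_t n)
{
    uint32_t a[FIFO_DEPTH / 2], b[FIFO_DEPTH / 2], c[FIFO_DEPTH] = {0};

    assert(n > 0 && 2 * n <= FIFO_DEPTH);
    for (size_t i = 0; i < n; i++) {
        a[i] = fifo_get(s0);
        b[i] = fifo_get(s1);
    }
    for (size_t i = 0; i < n; i++) {
        uint64_t carry = 0;
        for (size_t j = 0; j < n; j++) {
            uint64_t t = (uint64_t)a[i] * b[j] + c[i + j] + carry;
            c[i + j] = (uint32_t)t;
            carry = t >> 32;
        }
        c[i + n] = (uint32_t)carry;
    }
    for (size_t k = 0; k < 2 * n; k++)
        fifo_put(s2, c[k]);
}

/* Runs producer, multiplier and consumer in sequence. */
void stream_mul(const uint32_t *A, const uint32_t *B, uint32_t *C, size_t n)
{
    fifo_t s0, s1, s2;

    fifo_init(&s0);
    fifo_init(&s1);
    fifo_init(&s2);
    for (size_t i = 0; i < n; i++) {      /* producer section */
        fifo_put(&s0, A[i]);
        fifo_put(&s1, B[i]);
    }
    mul_stage(&s0, &s1, &s2, n);          /* multiplier section */
    for (size_t i = 0; i < 2 * n; i++)    /* consumer section */
        C[i] = fifo_get(&s2);
}
```

The interface of mul_stage does not change with n: only the number of words pushed through the channels does. That is the property the fixed-size interface relies on, at the price of the input/output overhead listed among its cons.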


Multiprecision Integer Library Generator

[Figure: the Multiprecision Integer Library Generator (C engine) takes the size of operands, N, as input and produces a C/VHDL wrapper, a black box file, an info file, and an in-line MAP C function.]

Inline MAP C function for N=2

    int mul(int64_t *A, int64_t *B, int64_t *C, int N)
    {
        int64_t A0, A1;
        int64_t B0, B1;
        int64_t C0, C1, C2, C3;

        /* unpack the two 64-bit words of each operand */
        A0 = A[0];
        A1 = A[1];
        B0 = B[0];
        B1 = B[1];

        /* fixed-size 128-bit multiplier macro */
        Mul_128(A0, A1, B0, B1, &C0, &C1, &C2, &C3);

        /* pack the four 64-bit words of the product */
        C[0] = C0;
        C[1] = C1;
        C[2] = C2;
        C[3] = C3;
        return 0;
    }
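A generator of this kind could be sketched as an ordinary C program that prints the wrapper text for a given N. This is an illustration only and does not reproduce the authors' generator: the function name gen_mul_wrapper, the out_t helper, and the rule that the macro for N words is called Mul_<64*N> (extrapolated from Mul_128 for N=2) are all assumptions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Output buffer with a sticky failure flag, set once text no longer fits. */
typedef struct {
    char  *buf;
    size_t cap;
    size_t len;
    int    failed;
} out_t;

/* Append formatted text; does nothing after the first failure. */
static void emit(out_t *o, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (o->failed)
        return;
    va_start(ap, fmt);
    n = vsnprintf(o->buf + o->len, o->cap - o->len, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= o->cap - o->len)
        o->failed = 1;
    else
        o->len += (size_t)n;
}

/* Writes the text of a wrapper for n-word operands into buf.
 * Returns 0 on success, -1 if buf is too small.
 * The macro name Mul_<64*n> is an assumed naming rule. */
int gen_mul_wrapper(char *buf, size_t cap, int n)
{
    out_t o = { buf, cap, 0, 0 };
    int i;

    assert(buf != NULL && cap > 0 && n >= 1);
    emit(&o, "int mul(int64_t *A, int64_t *B, int64_t *C, int N)\n{\n");
    for (i = 0; i < n; i++)               /* unpack the operands */
        emit(&o, "    int64_t A%d = A[%d], B%d = B[%d];\n", i, i, i, i);
    for (i = 0; i < 2 * n; i++)           /* result words */
        emit(&o, "    int64_t C%d;\n", i);
    emit(&o, "    Mul_%d(", 64 * n);      /* call to the fixed-size macro */
    for (i = 0; i < n; i++)
        emit(&o, "A%d, ", i);
    for (i = 0; i < n; i++)
        emit(&o, "B%d, ", i);
    for (i = 0; i < 2 * n; i++)
        emit(&o, "&C%d%s", i, (i + 1 < 2 * n) ? ", " : ");\n");
    for (i = 0; i < 2 * n; i++)           /* pack the product */
        emit(&o, "    C[%d] = C%d;\n", i, i);
    emit(&o, "    return 0;\n}\n");
    return o.failed ? -1 : 0;
}
```

For n = 2 the generated call line has the same argument list as the Mul_128 call in the slide. A real generator would also have to emit the matching black box and info file entries, which are not modelled here.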


Pros and cons of both methods

1. Fixed-size interface to a macro

Pros: Interface independent of the operand size

Cons: input/output overhead

2. Variable-size interface to a macro cell

Pros: minimum overhead

Cons: need to generate automatically several macro files, need for changes in the compiler


GMU/GWU Libraries

Cryptographic Libraries

Secret Key Ciphers
• Secret key ciphers encryption and breaking - SecCiph

Public Key Ciphers
• Elliptic Curve Cryptosystems arithmetic - ECC
• Binary Galois Field GF(2^m) arithmetic in Polynomial Basis - GF2n_PB
• Binary Galois Field GF(2^m) arithmetic in Normal Basis - GF2n_NB
• Multiprecision integer arithmetic (in collaboration with University of South Carolina) - Long_Int
• Operations supporting factorization of large integers using Number Field Sieve - NFS

Digital Image Processing Libraries

Image Enhancement / Restoration
    Single-Resolution
        Noise Reduction (Convolution Filtering)
            Smoothing (Lowpass)
            Gaussian (Lowpass)
            Blurring (Lowpass)
            Sharpening (Highpass)
        Edge Detection (Derivative Filters)
            Prewitt
            Sobel
    Multi-Resolution
        Discrete Wavelet Transform (DWT)
        Inverse Discrete Wavelet Transform (IDWT)

Similarity Measures
    Correlation


Miscellaneous Libraries

Sorting

Stream-searching

BMM - Bit Matrix Multiply

DARPA benchmarks

Performance of selected applications based on GMU/GWU libraries

Classes of applications

1. input/output intensive applications
• bulk data encryption (DES, IDEA, and RC5 encryption)
• image processing (Sobel Edge Detection, Median Filter, Wavelet Hyperspectral Dimension Reduction)

2. computationally intensive applications
• secret-key cipher breaking based on the exhaustive key search (DES, IDEA, RC5 breakers)
• public-key cipher breaking based on factoring

3. latency-critical applications
• cipher key agreement and signature (ECC schemes, RSA)

Platform used in experiments

SRC-6E from SRC Computers, Inc.

Reference Platform

PC based on Pentium IV, 2.4 GHz clock, 512 MB of RAM, 512 KB of cache

Treated as a basic building block of a cluster of microprocessor boards.

Timing Measurements

[Figure: timeline of a MAP function call made from the .c file into the .mc file. End-to-End time (SW) covers MAP allocation time, FPGA configuration time, the MAP function itself, and MAP release time. End-to-End time (HW) covers the MAP function only: DMA data in, FPGA computation, and DMA data out.]

MAP: SRC Reconfigurable Processor based on two User FPGAs

Input/Output Intensive Applications: P3 version of SRC-6E

All throughputs in Mbits/s.

Application                                 Computational  Data Transfer In  Data Transfer Out  End-to-End  End-to-End  Speed-up
                                            Throughput     Throughput        Throughput         Throughput  Throughput
                                            SRC 6E         SRC 6E            SRC 6E             SRC 6E      Pentium IV
DES Encryption                              6,398          2,488             1,705              863         58          14.9
IDEA Encryption                             12,788         2,487             1,799              938         165         5.7
RC5 Encryption                              6,398          2,505             1,590              836         366         2.3
Sobel Edge Detection                        5,680          2,493             1,701              849         76          11.0
Median Filter                               5,681          2,484             1,710              850         5           170
Wavelet Hyperspectral Dimension Reduction   6,395          2,573             1,477              818         67 – 159*   5 – 12**

*  5 levels – 1 level
** 1 level – 5 levels
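The speed-up column appears to be the SRC end-to-end throughput divided by the Pentium IV throughput (for DES, 863 / 58 is about 14.9). The sketch below reproduces that arithmetic and adds a simple serial model that is an assumption, not something stated in the slides: if data transfer in, computation, and data transfer out run strictly one after another, their times add, so the reciprocals of their throughputs add. The function names are hypothetical.

```c
#include <assert.h>

/* Speed-up as the ratio of two end-to-end throughputs (same units). */
double speedup(double src_throughput, double ref_throughput)
{
    assert(ref_throughput > 0.0);
    return src_throughput / ref_throughput;
}

/* Assumed serial model: transfer in, computation and transfer out do
 * not overlap, so per-bit times add. Fixed overheads are ignored, which
 * makes the result an optimistic estimate of the measured value. */
double serial_throughput(double in_tp, double comp_tp, double out_tp)
{
    assert(in_tp > 0.0 && comp_tp > 0.0 && out_tp > 0.0);
    return 1.0 / (1.0 / in_tp + 1.0 / comp_tp + 1.0 / out_tp);
}
```

For the DES row, serial_throughput(2488, 6398, 1705) gives roughly 874 Mbits/s against the measured 863 Mbits/s. Under this model the end-to-end rate is therefore limited mainly by the two data transfers, not by the FPGA computation.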

Wavelet Hyperspectral Dimension Reduction: Time contributions

P3 version of SRC-6E vs. Pentium IV PC

Input/Output Intensive Applications: P4 version of SRC-6E

All throughputs in Mbits/s.

Application                                 Computational  Data Transfer In  Data Transfer Out  End-to-End  End-to-End  Speed-up
                                            Throughput     Throughput        Throughput         Throughput  Throughput
                                            SRC 6E         SRC 6E            SRC 6E             SRC 6E      Pentium IV
IDEA Encryption                             12,790         10,627            10,583             3,479       165         21
RC5 Encryption                              6,398          6,371             6,373              2,098       366         5.7
Sobel Edge Detection                        5,683          6,384             6,380              2,044       76          27
Median Filter                               5,684          6,384             6,383              2,044       5           409
Wavelet Hyperspectral Dimension Reduction   6,394          6,349             3,185              1,626       67 – 159*   10 – 24**

*  5 levels – 1 level
** 1 level – 5 levels

Wavelet Hyperspectral Dimension Reduction: Time contributions

P4 version of SRC-6E vs. Pentium IV PC

Input/Output Intensive Applications: P4 version of SRC-6E, without and with overlapping computations and data transfers

All throughputs in Mbits/s.

Application                          Computational  Data Transfer In  Data Transfer Out  End-to-End  End-to-End  Speed-up
                                     Throughput     Throughput        Throughput         Throughput  Throughput
                                     SRC 6E         SRC 6E            SRC 6E             SRC 6E      Pentium IV
IDEA Encryption (no overlapping)     12,790         10,627            10,583             3,479       165         21
IDEA Encryption (with overlapping)   10,857         9,792             10,564             4,887       165         30
RC5 Encryption (no overlapping)      6,398          6,371             6,373              2,098       366         5.7
RC5 Encryption (with overlapping)    6,398          6,372             6,349              3,110       366         8.5

Input/Output Intensive Applications: SRC Hi-Bar Based System

All throughputs in Mbits/s.

Application                        Computational  Data Transfer In  Data Transfer Out  End-to-End  End-to-End  Speed-up
                                   Throughput     Throughput        Throughput         Throughput  Throughput
                                   SRC 6          SRC 6             SRC 6              SRC 6       Pentium IV
DES Encryption (no overlapping)    19,200         11,350            10,760             4,240       58          73
IDEA Encryption (no overlapping)   19,200         11,350            10,760             4,240       165         26
RC5 Encryption (no overlapping)    19,200         11,350            10,760             4,240       366         12

Computationally Intensive Applications: P3 version of SRC-6E

All throughputs in mln keys/s.

Application    Computational  Data Transfer In  Data Transfer Out  End-to-End  End-to-End  Speedup
               Throughput     Throughput        Throughput         Throughput  Throughput
               SRC 6E         SRC 6E            SRC 6E             SRC 6E      Pentium IV
DES Breaker    800            N/A               N/A                800         0.469       1706
IDEA Breaker   1000           N/A               N/A                500         1.701       294
RC5 Breaker    100            N/A               N/A                100         0.516       194
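To put the key rates in perspective, the sketch below converts a search rate in million keys per second into the worst-case time of an exhaustive key search. It relies on one fact that is not in the slides, namely that DES uses 56-bit keys; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case time in seconds to try every key of the given size at the
 * given rate, expressed in million keys per second as in the table. */
double search_seconds(unsigned key_bits, double mln_keys_per_s)
{
    assert(key_bits < 64 && mln_keys_per_s > 0.0);
    double keys = (double)((uint64_t)1 << key_bits);  /* 2^key_bits */
    return keys / (mln_keys_per_s * 1e6);
}

/* Convenience conversion for readability of the results. */
double seconds_to_days(double seconds)
{
    return seconds / 86400.0;
}
```

At the 800 mln keys/s reported for the DES breaker, sweeping all 2^56 keys takes about 1,040 days on one SRC-6E. The ratio of search_seconds(56, 0.469) to search_seconds(56, 800) is the same factor of about 1706 shown in the speed-up column, since the key count cancels out.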

Latency-Critical Applications

All latencies in μs.

Application                                                 Computational  Data Transfer In  Data Transfer Out  End-to-End  End-to-End  Speedup
                                                            Latency        Latency           Latency            Latency     Latency
                                                            SRC 6E         SRC 6E            SRC 6E             SRC 6E      Pentium IV
ECC DH Key Agreement over GF(2^233), Optimal Normal Basis   201            39                17                 592         364,000     615
ECC DH Key Agreement over GF(2^233), Polynomial Basis       560            66                7                  943         31,050      33

RSA: SRC vs. OpenSSL Software Comparison

Data Size   SW Function Time (ms)   SW Speedup vs. MAP
1024        47.248                  4.821 x
1536        138.466                 3.642 x
2048        269.948                 3.321 x
3072        853.050                 3.468 x
4096        1755.266                3.624 x
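Here the table reports how much faster the software is than the MAP, so the MAP speed-up is the reciprocal. The sketch below does that conversion and, under the assumption that the speed-up column is the ratio of MAP time to software time, derives the implied MAP time. The function names are hypothetical, and the derived MAP times are not figures reported in the slides.

```c
#include <assert.h>

/* Speed-up of the MAP over software, given the software's speed-up
 * over the MAP as listed in the table. */
double map_speedup(double sw_speedup_vs_map)
{
    assert(sw_speedup_vs_map > 0.0);
    return 1.0 / sw_speedup_vs_map;
}

/* Implied MAP time in ms, assuming speed-up = MAP time / SW time. */
double implied_map_time_ms(double sw_time_ms, double sw_speedup_vs_map)
{
    assert(sw_time_ms > 0.0 && sw_speedup_vs_map > 0.0);
    return sw_time_ms * sw_speedup_vs_map;
}
```

The reciprocals of the extreme table values, 4.821 and 3.321, are about 0.21 and 0.30.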

Sparse matrix by vector multiplication

Matrix Size             K    One Multiplication Time in SW (ns)   One Multiplication Time in HW (ns)   Speedup
144x144 (Mesh 12x12)    70   3440                                 12                                   282

Reference Optimized SW Implementation: PC, Pentium IV, 2.768 GHz, 1 GB RAM

Summary & Conclusions

Summary

Type of application                                          End-to-end speed-up of SRC vs. P4
Computationally intensive (cipher breaking)                  200-1700
Latency critical
    RSA                                                      0.2-0.3
    ECC polynomial bases, general fields                     33
    ECC polynomial bases, special fields                     12-27
    ECC optimal normal bases                                 600
Input/output intensive (secret key encryption/decryption)    3-30

Summary & conclusions (1)

General methodology for the design and maintenance of SRC user libraries developed and tested

Existing libraries evaluated in terms of
- performance
- ease of use
- flexibility
for three wide classes of applications

Initial results very encouraging

Summary & conclusions (2)

Selected files from the SRC libraries can be used for development of comparable libraries for other reconfigurable computers

Full compatibility with other reconfigurable computers difficult to achieve because of the technical differences and intellectual property constraints