HPPS - Final - 06/14/2007
![Page 1: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/1.jpg)
POLITECNICO DI MILANO
High Performance Processors and Systems
PdM – UIC joint master 2007
Instructor: Prof. Donatella Sciuto
HPPS @ PdM – June 2007
![Page 2: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/2.jpg)
General Outline
DRESD
DReAMS: Alessandro Panella, Matteo Murgida
Operating System: Ivan Beretta
Design Flow: Antonio Piazzi
Polaris: Massimo Morandi, Marco Novati
HLR: Marco Maggioni
![Page 3: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/3.jpg)
POLITECNICO DI MILANO
DRESD in a Nutshell
Dynamic Reconfigurability in Embedded System Design
DRESD @ PdM – June 2007
![Page 4: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/4.jpg)
Outline
Reconfiguration
• Motivations
• Basic Definition
• SoC
![Page 5: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/5.jpg)
Motivations
Increasing need for behavioral flexibility in embedded systems design
• Support of new standards, e.g. in media processing
• Addition of new features
Applications too large to fit on the device all at once
Speed up the overall computation of the final system
![Page 6: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/6.jpg)
Reconfiguration
The process of physically altering the location or functionality of network or system elements. Automatic configuration describes the way sophisticated networks can readjust themselves in the event of a link or device failing, enabling the network to continue operation.
Gerald Estrin, 1960
![Page 7: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/7.jpg)
SoC Reconfiguration
[Figure labels: fix, Partial, Total, Embedded]
![Page 8: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/8.jpg)
Different Scenarios...
Single Device
Distributed System
![Page 9: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/9.jpg)
What’s next
DRESD
DReAMS: Alessandro Panella, Matteo Murgida
Operating System: Ivan Beretta
Design Flow: Antonio Piazzi
Polaris: Massimo Morandi, Marco Novati
HLR: Marco Maggioni
![Page 10: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/10.jpg)
POLITECNICO DI MILANO
Dynamic Reconfigurability Applied to Multi-FPGA Systems
DReAMS
![Page 11: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/11.jpg)
DReAMS
Dynamic Reconfigurability Applied to Multi-FPGA Systems
• Branch of DRESD project
• Inherits architectures and tools
Automatic workflow from VHDL system description to FPGA implementation
• VHDL parsing and system simulation
• System creation over a specific architecture
• Bitstream creation and download onto FPGAs
![Page 13: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/13.jpg)
Outline
Problem description
Project goals and contributions
Project phases
What is partitioning?
Existing approaches
Going deeper into the problem
SPartA
• The framework
• The idea
• The algorithm
Experimental results
Future work
![Page 14: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/14.jpg)
Problem description
Multi-FPGA rationale
• Large designs do not fit into a single chip
• High performance parallelized applications
• Our case: apply dynamic reconfigurability
Need to break the initial design into several blocks
• One block corresponds to a single FPGA chip
• Which inputs/outputs?
• Which objectives?
• Which techniques?
![Page 15: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/15.jpg)
Project goals and contributions
Analyze existing approaches
• Obtain a deep knowledge of this well-explored field
• Extract basic ideas for a new approach
• Obtain some terms of comparison
Define precisely which problem(s) we cope with
• Contextualize the problem
• Focus on our needs
Develop a new solution
• Theoretical background
• Implementation and evaluation
![Page 16: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/16.jpg)
Project phases
First Phase [15th March – 12th April]
• Documentation: presentation (12/4), report
• Goals: analysis of the state of the art; produce some hints on a new approach
Second Phase [13th April – 17th May]
• Documentation: presentation (17/5), report
• Goals: precise definition of the problem; propose a new solution
Third Phase [18th May – 14th June]
• Documentation: presentation (14/6), final report
• Goal: implementation and evaluation of the proposed solution
![Page 17: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/17.jpg)
What is partitioning?
Goal
• Divide a set of interrelated objects into a set of subsets
• Optimize one or more specific objectives
K-way partitioning
• Given a graph G=(V,E), partition it into k subsets V1, ..., Vk that are pairwise disjoint and whose union is V.
• Balance constraint: |Vi| ≈ |V|/k
Aims at minimizing (or maximizing) an objective function
• Edge-cut
• Other objectives
In general: NP-complete
• Several heuristics that provide good results have been developed
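The two quantities named on this slide, the edge-cut objective and the balance constraint, can be made concrete with a small sketch. The graph, the node names and the `tolerance` parameter are hypothetical and only serve the illustration.

```python
from collections import Counter

def edge_cut(edges, part):
    """Count edges whose endpoints lie in different subsets."""
    return sum(1 for u, v in edges if part[u] != part[v])

def is_balanced(part, k, tolerance=1):
    """Check |Vi| ≈ |V|/k: every subset size is within `tolerance` of the ideal.

    `tolerance` is an assumption of this sketch; the slide only states "≈".
    """
    sizes = Counter(part.values())
    ideal = len(part) / k
    return len(sizes) == k and all(abs(s - ideal) <= tolerance for s in sizes.values())

# Hypothetical 6-node graph: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
good = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
bad = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

print(edge_cut(edges, good), is_balanced(good, k=2))  # 1 True
print(edge_cut(edges, bad), is_balanced(bad, k=2))    # 5 True
```

Both partitions satisfy the balance constraint, so the edge-cut is what separates a good 2-way partition from a poor one here.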
![Page 18: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/18.jpg)
Existing approaches - a glance
Traditional methods
• Kernighan–Lin and Fiduccia–Mattheyses heuristics
• Iterative-improvement algorithms: begin with an initial partition and iteratively improve it
• O(n^3) complexity
Iterative algorithms
• Genetic
• Simulated annealing
Multilevel algorithms
• Clustering -> Initial partitioning -> Refining
• METIS/hMETIS suite: best current results for partitioning large flattened graphs
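The iterative-improvement idea (start from an initial partition, then improve it step by step) can be sketched as follows. This is a simplified swap-based refinement in the spirit of Kernighan–Lin, not the full heuristic: it has no node locking and no best-prefix selection, and the example graph is hypothetical.

```python
from itertools import product

def edge_cut(edges, part):
    """Count edges whose endpoints lie in different subsets."""
    return sum(1 for u, v in edges if part[u] != part[v])

def improve_by_swaps(edges, part):
    """Apply the best cut-reducing swap between the two sides until none helps.

    Swapping one node from each side keeps the subset sizes unchanged,
    so the balance of the initial partition is preserved.
    """
    part = dict(part)
    while True:
        best_pair, best_cut = None, edge_cut(edges, part)
        side0 = [n for n, p in part.items() if p == 0]
        side1 = [n for n, p in part.items() if p == 1]
        for u, v in product(side0, side1):
            part[u], part[v] = 1, 0            # try the swap
            cut = edge_cut(edges, part)
            part[u], part[v] = 0, 1            # undo it
            if cut < best_cut:
                best_pair, best_cut = (u, v), cut
        if best_pair is None:                  # local optimum reached
            return part
        u, v = best_pair
        part[u], part[v] = 1, 0

# Hypothetical graph: two triangles joined by one edge, poor initial partition.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
initial = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}
refined = improve_by_swaps(edges, initial)
print(edge_cut(edges, initial), "->", edge_cut(edges, refined))  # 5 -> 1
```

Each accepted swap strictly lowers the cut, so the loop terminates; like any local search it can stop in a local optimum on less friendly inputs.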
![Page 19: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/19.jpg)
Going deeper into the problem
Two kinds of multi-FPGA partitioning
Topology-aware
• Architecture topology is an input
• No optimization of the number of FPGAs needed
• Main task: association between the (larger) system graph and the (smaller) architectural graph
Topology-free
• Architecture topology is not provided
• Input: dimension and communication features of FPGAs
• Minimization of the number of FPGAs
• Place and route after partitioning
At the moment, we deal with the Topology-free problem
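To illustrate what "minimization of the number of FPGAs" means in the topology-free setting, here is a minimal sketch that treats it as a bin-packing problem and solves it with the classic first-fit-decreasing heuristic. This is not the SPartA algorithm: it ignores communication entirely, and the block names, sizes and capacity are hypothetical.

```python
def first_fit_decreasing(block_sizes, fpga_capacity):
    """Assign blocks to FPGAs, opening a new FPGA only when no open one has room."""
    fpgas = []  # each entry: {"free": remaining capacity, "blocks": [names]}
    # Largest blocks first: they are the hardest to place.
    for name, size in sorted(block_sizes.items(), key=lambda kv: -kv[1]):
        if size > fpga_capacity:
            raise ValueError(f"block {name!r} does not fit on a single FPGA")
        for fpga in fpgas:
            if fpga["free"] >= size:
                fpga["free"] -= size
                fpga["blocks"].append(name)
                break
        else:
            fpgas.append({"free": fpga_capacity - size, "blocks": [name]})
    return [f["blocks"] for f in fpgas]

# Hypothetical block areas, in the same (arbitrary) unit as the FPGA capacity.
blocks = {"cpu": 60, "dsp": 50, "uart": 10, "fifo": 40, "dma": 30}
print(first_fit_decreasing(blocks, fpga_capacity=100))
# [['cpu', 'fifo'], ['dsp', 'dma', 'uart']]
```

A real partitioner has to trade this FPGA count against the edge-cut between chips, which is why size-only packing is just a lower-level building block.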
![Page 20: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/20.jpg)
SPartA: the framework
Input: VHDL system description
Output: several VHDL files, one for each block (FPGA)
Three main phases:
• Extract design from VHDL description
• “Real” partitioning phase (core)
• Build VHDL files
![Page 21: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/21.jpg)
SPartA: the idea
Structural approach
• Fully exploits the design hierarchy
• Modules can be treated as single blocks
• Basis for expansion toward dynamic reconfigurability
Objectives
• Minimize the cutsize
• Minimize the number of FPGAs used
• Preserve module integrity
![Page 22: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/22.jpg)
SPartA: the algorithm 1/2
Recursive algorithm (deals with trees)
• Starts from TOP node
Precondition
• No leaves with dimension > FPGA size
At every moment, a node can be:
• COVERED, UNCOVERED or PARTIALLY COVERED
Stop condition
• Node TOP is COVERED
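The slide names three node states but does not define them. The sketch below assumes one plausible reading: a node is COVERED when every leaf of its subtree has been assigned to a partition, UNCOVERED when none has, and PARTIALLY COVERED otherwise. The `Node` class, its fields and the example tree are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    name: str
    size: int = 0                                # area of a leaf module (arbitrary units)
    children: List["Node"] = field(default_factory=list)
    partition: Optional[int] = None              # set on a leaf once it is assigned to an FPGA

def leaves(node):
    """Return all leaf modules in the subtree rooted at `node`."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

def state(node):
    """Assumed definition of the three states (not given on the slide)."""
    assigned = [leaf.partition is not None for leaf in leaves(node)]
    if all(assigned):
        return "COVERED"
    if not any(assigned):
        return "UNCOVERED"
    return "PARTIALLY COVERED"

def precondition_holds(top, fpga_size):
    """Precondition from the slide: no leaf is larger than one FPGA."""
    return all(leaf.size <= fpga_size for leaf in leaves(top))

a1, a2, io = Node("a1", size=30), Node("a2", size=40), Node("io", size=20)
alu = Node("alu", children=[a1, a2])
top = Node("TOP", children=[alu, io])

print(precondition_holds(top, fpga_size=50))   # True
a1.partition = 0
print(state(top), "|", state(alu), "|", state(io))
# PARTIALLY COVERED | PARTIALLY COVERED | UNCOVERED
```

Under this reading, the stop condition "Node TOP is COVERED" simply means that every leaf module has been placed on some FPGA.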
![Page 23: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/23.jpg)
SPartA: the algorithm 2/2
OPEN ISSUE: Selecting the first node to be inserted into an empty partition
• Random node
• Node with overall max communication
• Node with max communication with its siblings
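The three candidate policies can be compared on a toy example. The sketch assumes communication is given as weighted edges between node names; the weights, the names and the tie-breaking by sorted name are all hypothetical choices, not part of SPartA as presented.

```python
import random

def total_comm(node, comm):
    """Sum of the weights of all edges touching `node`."""
    return sum(w for (u, v), w in comm.items() if node in (u, v))

def sibling_comm(node, siblings, comm):
    """Sum of the weights of edges between `node` and its siblings only."""
    return sum(w for (u, v), w in comm.items()
               if node in (u, v) and u in siblings and v in siblings)

def pick_random(candidates, rng=random):
    return rng.choice(sorted(candidates))

def pick_max_overall(candidates, comm):
    return max(sorted(candidates), key=lambda n: total_comm(n, comm))

def pick_max_sibling(candidates, comm):
    sibs = set(candidates)
    return max(sorted(candidates), key=lambda n: sibling_comm(n, sibs, comm))

# Siblings a, b, c; x and y are nodes elsewhere in the tree.
siblings = {"a", "b", "c"}
comm = {("a", "b"): 2, ("b", "c"): 3, ("a", "x"): 10, ("c", "y"): 1}

print(pick_max_overall(siblings, comm))   # a  (2 + 10 = 12)
print(pick_max_sibling(siblings, comm))   # b  (2 + 3 = 5)
print(pick_random(siblings, random.Random(0)) in siblings)  # True
```

The example is built so that the two deterministic policies disagree: `a` talks mostly to a node outside the sibling group, while `b` is the best-connected node inside it.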
![Page 24: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/24.jpg)
Results 2/2
Complexity: exponential, due to the recursive nature of the algorithm
Execution time is nevertheless low (tens of seconds for a reasonably large design)
EXAMPLE
[Figure: ORIGINAL TREE and PARTITIONED TREE]
![Page 25: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/25.jpg)
Results 3/3
Evaluation metrics
• EDGECUT, FILLING and SPLITS
Evaluation of the three policies for node selection
• 18 different trees of varying size
![Page 26: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/26.jpg)
Results 3/3
![Page 27: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/27.jpg)
Future work
Algorithm improvement
• Balancing of last partition
• First node selection policies
• More refined “score” function for selecting nodes
• Use closeness metrics
Comparisons with existing algorithms
Expansion
• SPartA framework development
• Topology-aware partitioning
![Page 28: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/28.jpg)
The end
ANY QUESTIONS?
![Page 29: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/29.jpg)
What’s next
DRESD
DReAMS: Alessandro Panella, Matteo Murgida
Operating System: Ivan Beretta
Design Flow: Antonio Piazzi
Polaris: Massimo Morandi, Marco Novati
HLR: Marco Maggioni
![Page 30: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/30.jpg)
POLITECNICO DI MILANO
Chimera
Multi-FPGAs Architecture Definition
Matteo Murgida
![Page 31: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/31.jpg)
Outline
Introduction
• Problem description
• Project Goals
• State of the Art
Project in detail
• Contributions
• Phases
• Results
What’s next
![Page 32: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/32.jpg)
Problem Description
Architectural description of a distributed FPGA environment
3-layer architecture
![Page 33: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/33.jpg)
Project Goals
Design the architecture of the most generic distributed system
• Node definition
• Interface definition
• Communication channel definition
Design a communication protocol
• Essential protocol
• Interrupt-based protocol
• Timeout improvement
![Page 34: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/34.jpg)
State of the Art
CONFigurable ElecTronic TIssue (CONFETTI) by EPFL
• Cellular-based architecture
• PROs: high degree of parallelism, high computational power
• CONs: no flexibility, oversized for small problems, small architectural customizations imply big cost/effort
Splash 2 by IDA Supercomputing Center
• Architecture composed of a Sun Sparcstation host, an interface board and “Splash Array” boards
• PROs: again high parallelism and power
• CONs: a central host coordinates the computational units, no fault tolerance, no flexibility
![Page 35: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/35.jpg)
Contributions
The proposed architecture:
• Allows several Spartan-3 Starter Boards to communicate and exchange data
• Is portable to different FPGAs with minimum effort
• Is the basic infrastructure that will allow external partial dynamic reconfiguration
![Page 36: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/36.jpg)
Project Phases
First Phase, time window: 15th March – 12th April
• Documentation: project presentation (12/4), project report
• Goals: Digilent Spartan-3 Starter Board study; boards connection
Second Phase, time window: 13th April – 17th May
• Documentation: project presentation (17/5), project report
• Goals: communication between two MicroBlaze soft-processors; GPIO integration in the architecture
Third Phase, time window: 18th May – 14th June
• Documentation: project presentation (14/6), project report
• Goals: interrupt handling, timeout handling; simple application as example
![Page 37: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/37.jpg)
Board Study
How to use resources like switches, LEDs and connectors on the board
How to map an IP-Core port to a physical pin of the board
Choice of the A2 Expansion Connector to connect two boards
![Page 38: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/38.jpg)
MicroBlaze Communication
Communication between two MicroBlaze soft-processors
Development of a display controller to visualize the data flow
![Page 39: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/39.jpg)
GPIO Insertion
Higher architecture portability through the use of the GPIO IP-Core
![Page 40: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/40.jpg)
Interrupt Controller Insertion
Communication protocol improved through interrupt handling, which prevents the processor from busy waiting
Interrupt Controller included in the architecture to permit detection and handling of multiple interrupts
![Page 41: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/41.jpg)
Timeout
Malfunctions due to interference on the communication channel lead to deadlocks
• The communication protocol is not reliable at all
Counter implementation, including the driver used by the processor to clear raised interrupts
Development of a simple application to verify the correctness of the proposed approach
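Why a timeout removes the deadlock can be shown with a small software analogy. This is not the MicroBlaze driver or the hardware counter from the project: the channel is modelled with Python's `queue.Queue`, whose `get(timeout=...)` plays the role of the counter, and `request_resend`, `timeout_s` and `max_tries` are hypothetical names introduced for the sketch.

```python
import queue

def receive(channel, timeout_s):
    """Wait for one message; give up after `timeout_s` seconds instead of blocking forever."""
    try:
        return channel.get(timeout=timeout_s)
    except queue.Empty:
        return None

def receive_with_retry(channel, request_resend, timeout_s=0.05, max_tries=3):
    """Ask the peer to resend whenever the timeout fires; fail after `max_tries`."""
    for attempt in range(1, max_tries + 1):
        msg = receive(channel, timeout_s)
        if msg is not None:
            return msg, attempt
        request_resend()          # the first transmission is assumed lost
    raise TimeoutError("peer did not answer")

# Simulate a message lost to interference: the channel starts empty,
# and the peer only delivers when asked to resend.
channel = queue.Queue()

def resend():
    channel.put("DATA")

print(receive_with_retry(channel, resend))   # ('DATA', 2)
```

Without the timeout, the first `get` would block forever on the lost message, which is the deadlock the slide describes.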
![Page 42: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/42.jpg)
Results
A short demo ...
![Page 43: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/43.jpg)
Future Work
Apply the proposed approach to external partial dynamic reconfiguration
Develop a co-simulation framework based on the VHDL/SystemC descriptions of distributed systems
• Receive as input the VHDL description of the system
• Build the VHDL description for every node
• Create the SystemC stub to allow inter-node communication
• Describe the communication in SystemC
• Co-simulate the VHDL/SystemC description
![Page 44: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/44.jpg)
Questions
![Page 45: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/45.jpg)
What’s next
DRESD
DReAMS: Alessandro Panella, Matteo Murgida
Operating System: Ivan Beretta
Design Flow: Antonio Piazzi
Polaris: Massimo Morandi, Marco Novati
HLR: Marco Maggioni
![Page 46: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/46.jpg)
POLITECNICO DI MILANO
Operating System support for Reconfigurable SoC
![Page 47: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/47.jpg)
POLITECNICO DI MILANO
Development of an OS architecture-independent layer for dynamic reconfiguration
Ivan Beretta
![Page 48: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/48.jpg)
Outline
Introduction
• Problem description
• Project Goals
• State of the Art
Project in detail
• Contributions
• Phases
• Results
What’s next
![Page 49: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/49.jpg)
Problem description
Need for operating system support on Reconfigurable SoCs
• Simplified software development process
• Improved code portability
Lack of support for dynamically reconfigurable architectures
• Specific solutions for specific architectures
Need for an architecture-independent abstraction layer
![Page 50: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/50.jpg)
Project Goal
Primary goals:
• Analysis of the State of the Art
• Definition of the new intermediate layer
• Physical implementation
Specific goals:
• Study of the solutions developed inside the DRESD group
• Comparison between existing solutions
• Recovery of one of the two implementations
• Hardware architectures generation using up-to-date tools on Xilinx Virtex-II Pro VP7
![Page 51: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/51.jpg)
State of the Art
Caronte implementation (Alberto Donato, 2005)
Two kernel modules
• ICAP device driver
• IP-Core manager (IPCM)
![Page 52: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/52.jpg)
State of the Art (cont’d)
YaRA implementation (Vincenzo Rana, 2006)
Multi-layered structure
• Four modules: Reconfiguration controller driver, MAC, LOL, Reconfiguration Library
• ROTFL architecture
![Page 53: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/53.jpg)
Contributions
Limits of existing implementations
• Lack of portability, e.g. YaRA solution implemented on RAPTOR2000
• Reconfiguration process details visible from userspace
Definition of an architecture-independent middleware
• Improved portability
• It works on different hardware architectures
• It works with different Linux distributions
Opportunity to optimize latencies
![Page 54: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/54.jpg)
Phases
First phase: Layer definition
• Goal: Factorization of common features
  • Boundaries of the new middleware
  • Mapping of existing solutions on the functionalities
• Motivation: Provide guidelines for actual implementation
Second phase: Implementation recovery
• Goal: Recovery of bootstrap process and kernel images
• Motivation: Full recovery of Caronte solution
Third phase: Architectures generation
• Goal: Synthesis of hardware architectures using up-to-date Xilinx tools and cores
• Motivation: Synthesis of hardware architectures using up-to-date Xilinx tools
![Page 55: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/55.jpg)
First Phase: Layer definition
Definition of new layer boundaries
• Factorization of existing features
• Mapping of the required functionalities on existing implementations

| Feature | Caronte Solution | YaRA Solution |
| --- | --- | --- |
| Reconfiguration controller support | ICAP device driver | Reconfiguration Controller Driver |
| Dynamic address space assignment | IPCM Module | MAC module |
| Dynamic device registration and driver loading | IPCM Module | LOL module |
| API | Direct interaction with modules | Reconfiguration library |
| Module management (caching, placement...) | Not implemented | ROTFL architecture |

Legend: ● = Both hardware and software ● = Hardware independent
![Page 56: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/56.jpg)
Second Phase: Implementation Recovery
Bootstrap process from flash memory
[Figure: bootstrap sequence in six numbered steps. Blocks: FPGA, PowerPC, BRAM; 16 MB Flash, 0xe4000000 to 0xe4FFFFFF, with marked addresses 0xe42FFFFF, 0xe4F00000 and 0xe4F80000; 64 MB DDR SDRAM, 0x00000000 to 0x03FFFFFF, with marked address 0x00800000. Labels: Bootloader, Bootmanager, Kernel and RAMDisk Image.]
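The address ranges in the boot diagram can be checked against the stated memory sizes with a few lines of arithmetic. The region names `flash` and `ddr_sdram` are labels chosen for this sketch; which boot component sits at which marked address is not stated in the extracted text, so the code only checks sizes and membership.

```python
MIB = 1024 * 1024

# Address ranges as they appear in the boot diagram (inclusive bounds).
regions = {
    "flash": (0xE4000000, 0xE4FFFFFF),
    "ddr_sdram": (0x00000000, 0x03FFFFFF),
}

def size_mib(region):
    """Size of a region in MiB, from its inclusive start and end addresses."""
    start, end = regions[region]
    return (end - start + 1) // MIB

def contains(region, address):
    """True if `address` falls inside the given region."""
    start, end = regions[region]
    return start <= address <= end

print(size_mib("flash"), size_mib("ddr_sdram"))   # 16 64
print(contains("flash", 0xE4F00000))              # True
print(contains("ddr_sdram", 0x00800000))          # True
print((0xE4F00000 - 0xE4000000) // MIB)           # 15 (offset into flash, in MiB)
```

The ranges match the "16 MB Flash" and "64 MB DDR SDRAM" labels, and every marked address lies inside the region it is drawn next to.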
![Page 57: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/57.jpg)
Second Phase: Implementation Recovery (cont’d)
Several issues
• Neither bootmanager nor Linux kernel on flash memory at the beginning
• Flash memory seen as read-only memory at runtime
• Need for an ad-hoc solution
Avmon command line interface
• Executed from DDR SDRAM memory
• FTP transfer of bootmanager and flash programming
• Also useful for kernel download
Kernel executable image
• Kernel image built using a cross-compiler
• ICAP and IPCM modules loaded at runtime
![Page 58: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/58.jpg)
Third Phase: Architecture generation
Hardware architecture used in Second Phase no longer useful
• Synthesized with Xilinx ISE and EDK 6.1
Same hardware structure realized with updated cores and recent tool versions
• Synthesis with Xilinx ISE and EDK 7.1
• Synthesis with Xilinx ISE and EDK 9.1
Lack of device driver support and documentation to configure newest cores
![Page 59: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/59.jpg)
Results: Implementation Recovery
Linux bootstrap from flash memory
![Page 60: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/60.jpg)
Results: Implementation Recovery
Design summary for hardware architectures on Xilinx Virtex-II Pro VP7
Two main limitations
• Ethernet controller
• Necessity of a top-level design
Design too large for module-based reconfiguration

| Resource | ISE/EDK 7.1 Used | ISE/EDK 7.1 Available | ISE/EDK 7.1 % | ISE/EDK 9.1 Used | ISE/EDK 9.1 Available | ISE/EDK 9.1 % |
| --- | --- | --- | --- | --- | --- | --- |
| Slices | 4926 | 4928 | 99% | 5318 | 4928 | 107% |
| Flip-Flops | 5217 | 9856 | 52% | 5724 | 9856 | 58% |
| 4-in LUTs | 6974 | 9856 | 70% | 6993 | 9856 | 70% |
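The percentages in the design summary can be reproduced from the used and available counts, reading the fused figures as pairs (for example 5217 used of 9856 available). The sketch assumes the tool report truncates rather than rounds, which is the only reading that matches all six values.

```python
def utilization(used, available):
    """Utilization in percent, truncated toward zero as the reported values appear to be."""
    return 100 * used // available

AVAILABLE = {"Slices": 4928, "Flip-Flops": 9856, "4-in LUTs": 9856}
USED = {
    "ISE/EDK 7.1": {"Slices": 4926, "Flip-Flops": 5217, "4-in LUTs": 6974},
    "ISE/EDK 9.1": {"Slices": 5318, "Flip-Flops": 5724, "4-in LUTs": 6993},
}

for tools, used in USED.items():
    for resource, count in used.items():
        pct = utilization(count, AVAILABLE[resource])
        flag = "  <-- does not fit" if count > AVAILABLE[resource] else ""
        print(f"{tools:12} {resource:10} {count:5}/{AVAILABLE[resource]} = {pct}%{flag}")
```

The 9.1 flow needs 5318 slices against 4928 available, which is the 107% figure: the same design no longer fits the device, and even the 7.1 result leaves only two free slices.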
![Page 61: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/61.jpg)
What’s next
Device driver updates to support newest architectures
Intermediate layer implementation
• Opportunity to add some additional features
  • Reconfiguration scheduler
• Opportunity to define a common device driver interface to simplify the creation of a new driver by the user
Integration of the middleware and the operating system support in a complete design flow
![Page 62: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/62.jpg)
Questions
![Page 63: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/63.jpg)
What’s next
DRESD
DReAMS: Alessandro Panella, Matteo Murgida
Operating System: Ivan Beretta
Design Flow: Antonio Piazzi
Polaris: Massimo Morandi, Marco Novati
HLR: Marco Maggioni
![Page 65: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/65.jpg)
Outline
Introduction
• Problem description
• Project Goals
• State of the Art
Project in detail
• Contributions
• Phases
• Results
What’s next
![Page 66: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/66.jpg)
Problem description
• The user has to divide their attention among many problems, some of them related to the implementation of the design.
• Users often know nothing about reconfigurable architecture generation, and they haven’t …
![Page 67: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/67.jpg)
Project Goals
• New design methodology tailored to support partially dynamically reconfigurable architectures
• Definition and implementation of a design framework able to
  • Support different design paradigms, e.g. Xilinx Module Based, Xilinx EAPR
  • Hide the dirty work (due to the reconfiguration) from the application designer
  • Support different architectural solutions, e.g. different communication infrastructures such as IBM CoreConnect or Wishbone
![Page 68: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/68.jpg)
Contributions
• With our framework, all users (novice or not) can develop and debug their functionality on a reconfigurable architecture without having to analyze all the problems related to that development methodology
![Page 69: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/69.jpg)
Phases
• 1st phase (15 March – 15 April): Budgeting
  • Study of the state of the art
• 2nd phase (15 April – 15 May): Realization phase
  • Construction of the entire framework based on previously separate tools
  • Implementation of an innovative workflow
• 3rd phase (15 May – 15 June): Project validation
  • Definition of a new communication infrastructure and transfer protocol for the reconfigurable part
  • Verify the integration of the new infrastructure in the project
![Page 70: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/70.jpg)
First Phase
• Study of the state of the art
  • Standard reconfigurable design flow
  • Xilinx Module Based and EAPR
  • Caronte Design Flow
  • EDK-based architecture
![Page 71: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/71.jpg)
Self Reconfigurable Architecture
![Page 72: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/72.jpg)
Second Phase 1/4
Construction of the entire framework based on previously separate tools
The user only has to focus on developing the IBM CoreConnect architecture and on writing the modules that implement the desired functionality
SYSTEM.VHD contains all information about the IBM CoreConnect architecture
![Page 73: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/73.jpg)
Second Phase 2/4
ArchGen takes the system.vhd file, processes the architecture it contains, and translates that static architecture into a dynamic one
FIX.VHD contains the instantiations of the processors (one or more) and all the components present in the IBM CoreConnect architecture
TOP.VHD contains the instantiation of the fix component and the information about the communication infrastructure
![Page 74: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/74.jpg)
Second Phase 3/4
COMiC generates an NCD file containing the information about the communication infrastructure and an XDL file containing the same information in text form
![Page 75: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/75.jpg)
Second Phase 4/4
At this point we only have to collect all the information we need and, through a parser, insert it into a new top.vhd, which will be the fixed part of the architecture. What remains is to manage the reconfigurable modules written by the user.
![Page 76: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/76.jpg)
Third Phase 1/3
An OPB bus based on 3-state buffers, used to link one or more modules to the fixed part (created with ISE)
Definition of a new communication infrastructure and transfer protocol for the reconfigurable part
![Page 77: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/77.jpg)
Third Phase 2/3
Use the ncd2xdl converter to obtain an XDL file which contains all the parameters of our bus
![Page 78: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/78.jpg)
Third Phase 3/3
Perfect integration into our process: any bus type can be used to connect the fixed and reconfigurable parts
Verify the integration of the new infrastructure in the project
![Page 79: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/79.jpg)
Results
• The framework answers the novice user's need for automation and, more generally, helps all users who need a short time to market.
![Page 80: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/80.jpg)
What’s next
• Our plan for future work is to schedule one or two working days to fix some bugs present in the project and to adjust the output of COMiC, which has to create an OPB replay bus.
![Page 81: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/81.jpg)
Questions?
![Page 82: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/82.jpg)
What’s next
• DRESD
• DReAMS: Matteo Murgida, Alessandro Panella
• Operating System: Ivan Beretta
• Design Flow: Antonio Piazzi
• Polaris: Massimo Morandi, Marco Novati
• HLR: Marco Maggioni
![Page 83: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/83.jpg)
POLITECNICO DI MILANO
Polaris
![Page 84: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/84.jpg)
Polaris
• Create an integrated HW/SW system to manage 2D reconfiguration
• SW side:
  • Maintain information on FPGA status
  • Decide how to efficiently allocate tasks
• HW side:
  • Provide support for effective task allocation
  • Perform 2D bitstream relocation
![Page 85: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/85.jpg)
Management of 2D Reconfiguration in a Reconfigurable System
Massimo Morandi
![Page 86: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/86.jpg)
Outline
• Introduction
  • Problem description
  • Project Goals and Contributions
• Project in detail
  • Phases
  • Results
• Future Work
![Page 87: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/87.jpg)
Problem Description
• New generation of FPGAs
  • Virtex-4 and Virtex-5
  • Allow bi-dimensional reconfiguration
• This makes it possible to:
  • Better exploit the reconfigurable area
  • Optimize module performance
• More complex management:
  • Handle one more degree of freedom
  • Avoid increased fragmentation
  • Make good placement choices to keep the Task Rejection Rate (TRR) low
  • Keep intra-module routing paths acceptable
![Page 88: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/88.jpg)
Project Goals and Contributions
• Analyze the effects of 2D reconfiguration
  • New advantages
  • New problems
• Examine possible solutions to the new problems
  • Explore the literature to find promising ideas
  • Evaluate those solutions in various scenarios
• Propose a new solution
  • Combining ideas from the literature with new ones
  • Obtaining a good cost-quality tradeoff
![Page 89: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/89.jpg)
Project Phases
• First Phase, time window: 15th March – 12th April
  • Documentation: project presentation (12/4), project report
  • Goals:
    • General analysis of 2D reconfiguration
    • Detailed description of the new problems
• Second Phase, time window: 13th April – 17th May
  • Documentation: project presentation (17/5), project report
  • Goals:
    • Definition of desired features for a solution
    • Analysis and evaluation of existing solutions
• Third Phase, time window: 18th May – 14th June
  • Documentation: project presentation (14/6), project report
  • Goal: propose a new combined solution to effectively handle the problems of 2D reconfiguration
![Page 90: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/90.jpg)
Setting and Advantages Definition
• Definition of the setting:
  • 2D self partial dynamic run-time reconfiguration
• Analysis of the advantages of 2D reconfiguration
  • In area usage and performance
![Page 91: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/91.jpg)
2D Fragmentation Problem
• Analysis of the 2D fragmentation problem
  • Area is generally more fragmented
  • Fragmentation can nullify the area optimizations obtained
![Page 92: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/92.jpg)
Placement Decisions
• Analysis of the effects of 2D placement choices:
  • Again, bad choices can lead to performance loss
![Page 93: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/93.jpg)
Allocation manager
• Definition of allocation manager desired features:
  • Low TRR
  • Low management overhead
  • High routing efficiency
  • Low fragmentation
• Definition of allocation manager structure:
  • Empty space manager
    • Complete space
    • Heuristic selection
  • Fitter
    • General (FF, BL, BF, WF…)
    • Focused (FA, RA…)
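As a rough illustration of the "general" fitters, the sketch below assumes that FF, BF and WF stand for first fit, best fit and worst fit (the usual bin-packing names; the slides do not expand the abbreviations). The `Rect` type and the function names are hypothetical and only show how a fitter picks one empty rectangle for a task of a given width and height.

```python
from typing import NamedTuple, Optional, Sequence


class Rect(NamedTuple):
    """An empty rectangle on the device (hypothetical representation)."""
    x: int
    y: int
    w: int
    h: int


def fits(r: Rect, w: int, h: int) -> bool:
    """True if a w x h task fits inside rectangle r."""
    return r.w >= w and r.h >= h


def first_fit(empty: Sequence[Rect], w: int, h: int) -> Optional[Rect]:
    """Return the first empty rectangle that can host the task."""
    for r in empty:
        if fits(r, w, h):
            return r
    return None


def best_fit(empty: Sequence[Rect], w: int, h: int) -> Optional[Rect]:
    """Return the smallest-area empty rectangle that can host the task."""
    candidates = [r for r in empty if fits(r, w, h)]
    return min(candidates, key=lambda r: r.w * r.h, default=None)


def worst_fit(empty: Sequence[Rect], w: int, h: int) -> Optional[Rect]:
    """Return the largest-area empty rectangle that can host the task."""
    candidates = [r for r in empty if fits(r, w, h)]
    return max(candidates, key=lambda r: r.w * r.h, default=None)
```

All three fitters look only at the geometry of the empty space, which is why they are cheap but blind to routing; a focused fitter would add a task-specific cost to the selection.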
![Page 94: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/94.jpg)
Most relevant works
• Maintain complete information on empty space:
  • KAMER:
    • Keep All Maximally Empty Rectangles
    • Apply a general fitting strategy
  • CUR:
    • Maintain the Contour of a Union of Rectangles
    • Apply a focused fitting strategy
• Heuristically prune part of the information:
  • KNER:
    • Keep Non-overlapping Empty Rectangles
    • Apply a general fitting strategy
  • 2D-HASHING:
    • Keep Non-overlapping Empty Rectangles in an optimized data structure
    • Apply (exclusively) a general fitting strategy
![Page 95: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/95.jpg)
Evaluation and Proposed Approach
• High placement quality => high complexity
• Lowest complexity => no focused fitting (bad especially for routing)
• Proposed Approach
  • Heuristic (KNER-like) empty space manager, to keep complexity low for use in a self-reconfigurable system
  • Fitting strategy focused on minimizing routing paths, to maintain high performance of the reconfigurable system (chosen metric to minimize: Manhattan distance)
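A focused fitter of this kind might look like the sketch below. It is only an illustration: it assumes the distance is measured between rectangle centers and that a task is always placed at the origin corner of an empty rectangle, neither of which is stated on the slide. All names are hypothetical.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, w, h), hypothetical layout


def center(r: Rect) -> Tuple[float, float]:
    """Geometric center of a rectangle."""
    x, y, w, h = r
    return (x + w / 2.0, y + h / 2.0)


def manhattan(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Manhattan (L1) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def communication_cost(candidate: Rect, partners: List[Rect]) -> float:
    """Sum of center-to-center distances to the communicating tasks."""
    c = center(candidate)
    return sum(manhattan(c, center(p)) for p in partners)


def focused_fit(empty: List[Rect], w: int, h: int,
                partners: List[Rect]) -> Optional[Rect]:
    """Pick the placement with the lowest communication cost, or None."""
    best, best_cost = None, None
    for (x, y, ew, eh) in empty:
        if ew < w or eh < h:
            continue  # the task does not fit in this empty rectangle
        candidate = (x, y, w, h)  # assumed: place at the origin corner
        cost = communication_cost(candidate, partners)
        if best is None or cost < best_cost:
            best, best_cost = candidate, cost
    return best
```

With `partners` empty every candidate costs zero, so the function degenerates to first fit, which matches the intuition that the focused metric only matters for communicating tasks.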
![Page 96: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/96.jpg)
Structure of the allocation manager
• Task, defined by:
  • Arrival time, ASAP, (ALAP), H, W, Latency, Communicating Tasks
  • Hosted in a queue which also adds a pointer to the rectangle where it is placed
• Reconfigurable Device, represented as:
  • Binary tree structure: each node is a Rectangle, each leaf is an empty Rectangle
  • Navigation through pointers to left child, right child and next leaf, plus a function to find the previous leaf (for bookkeeping after split or merge)
• Rectangle, defined by:
  • X, Y, H, W
  • Initially one, with (X,Y)=(0,0), H=FPGA Rows, W=FPGA Cols
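The device representation can be sketched as a small binary tree whose empty leaves are the free rectangles. The split policy (task in the origin corner, leftover cut into a right part and a bottom part), the `used` flag and the recursive leaf traversal are assumptions made for this illustration; the slides describe next-leaf pointers instead, and do not specify how a rectangle is split.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    """A rectangle of the device; x is the column and y the row of its origin."""
    x: int
    y: int
    h: int
    w: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    used: bool = False  # assumed flag: True once a task sits in this node

    def is_empty_leaf(self) -> bool:
        return not self.used and self.left is None and self.right is None


def empty_leaves(node: Optional[Node]) -> Iterator[Node]:
    """Yield the empty rectangles (leaves) from left to right."""
    if node is None:
        return
    if node.is_empty_leaf():
        yield node
    else:
        yield from empty_leaves(node.left)
        yield from empty_leaves(node.right)


def place(leaf: Node, h: int, w: int) -> bool:
    """Place an h x w task in the origin corner of an empty leaf and split
    the leftover area into two non-overlapping empty rectangles."""
    if not leaf.is_empty_leaf() or h > leaf.h or w > leaf.w:
        return False
    leaf.used = True
    if leaf.w > w:  # strip to the right of the task, same height as the task
        leaf.left = Node(leaf.x + w, leaf.y, h, leaf.w - w)
    if leaf.h > h:  # strip below the task, full width of the original leaf
        leaf.right = Node(leaf.x, leaf.y + h, leaf.h - h, leaf.w)
    return True


# Initially a single rectangle covering the whole device (sizes are made up).
root = Node(0, 0, 10, 20)
place(root, 4, 5)
```

Because the two children never overlap, the free area is always the sum of the leaf areas, which keeps bookkeeping simple at the price of sometimes missing a placement that spans two leaves.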
![Page 97: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/97.jpg)
The Placement Algorithm
![Page 98: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/98.jpg)
Experimental Results
• Benchmark of 100 randomly generated tasks:
  • Size from 5% to 25% of the FPGA, randomly interconnected
• Execution time: 3x less than CUR, close to KNER
• Communication cost: 3x less than KNER, close to CUR
• Task Rejection Rate: all solutions quite close
![Page 99: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/99.jpg)
Future Work
• Apply the proposed solution to self-reconfiguration:
  • Adapt the algorithm to run on the internal processor
  • Create a reconfigurable architecture for validation
  • Integrate the architecture with relocation
• Tune the algorithm to improve results:
  • Experiment with techniques to reduce TRR
  • Optimize the code to lower the algorithm's running time
![Page 100: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/100.jpg)
Questions?
![Page 101: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/101.jpg)
What’s next
• DRESD
• DReAMS: Alessandro Panella, Matteo Murgida
• Operating System: Ivan Beretta
• Design Flow: Antonio Piazzi
• Polaris: Massimo Morandi, Marco Novati
• HLR: Marco Maggioni
![Page 102: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/102.jpg)
POLITECNICO DI MILANO
Relocation for 2D Reconfigurable Systems
Marco Novati
![Page 103: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/103.jpg)
Outline
• Introduction
  • Problem description
  • Project Goals
• Project in detail
  • Phases
  • Results
• What’s next
![Page 104: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/104.jpg)
Problem Description
• Self dynamic run-time 2D reconfiguration
  • Xilinx Virtex-4 and Virtex-5
• Relocation, different solutions
  • Software (BAnMat, PARBIT)
  • Hardware (REPLICA, BiRF)
• We chose a hardware solution
  • BiRF Square
![Page 105: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/105.jpg)
Project Goals
• Study of the new FPGA families
  • Examination of Xilinx documentation on V4 and V5
• Analysis of the new bitstream structure
  • Generation of V4 and V5 bitstreams
• Development of the new version of BiRF
  • Implementation
  • Validation
![Page 106: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/106.jpg)
Phases
• First Phase: 15th March – 12th April
  • Documentation: project presentation (12/4), project report
  • Goals:
    • Xilinx documentation examination
    • V4 & V5 bitstream structure analysis
• Second Phase: 13th April – 17th May
  • Documentation: project presentation (17/5), project report
  • Goals:
    • Implementation of BiRF Square
    • Synthesis
• Third Phase: 18th May – 14th June
  • Documentation: project presentation (14/6), project report
  • Goals:
    • Verification & Validation
![Page 107: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/107.jpg)
Frame Addressing
• New frame addressing:
  • Possibility of addressing rows and columns
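To make row and column addressing concrete, the sketch below packs and unpacks a frame address word and shifts it by a row and column offset, which is the core of 2D relocation. The field layout (top/bottom bit, block type, row, column, minor address with widths 1/3/5/8/6 bits) is an assumption about the Virtex-4 frame address register and should be checked against the Xilinx configuration guide; the function names are hypothetical and do not come from BiRF Square.

```python
from typing import Dict

# ASSUMED field layout, most significant field first: (name, width in bits).
FAR_FIELDS = [
    ("top_bottom", 1),
    ("block_type", 3),
    ("row", 5),
    ("column", 8),
    ("minor", 6),
]


def pack_far(**values: int) -> int:
    """Build a frame address word from named fields (missing fields are 0)."""
    word = 0
    for name, width in FAR_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_far(word: int) -> Dict[str, int]:
    """Split a frame address word back into its named fields."""
    fields = {}
    for name, width in reversed(FAR_FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


def relocate(word: int, d_row: int, d_col: int) -> int:
    """Move a frame address by a row and column offset."""
    fields = unpack_far(word)
    fields["row"] += d_row
    fields["column"] += d_col
    return pack_far(**fields)
```

Keeping the layout in one table means that targeting a different family only requires changing `FAR_FIELDS`, under the assumption that the same kinds of fields exist there.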
![Page 108: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/108.jpg)
New Parser
![Page 109: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/109.jpg)
CRC Calculation
• Particular CRC value, used by Xilinx tools
• Two versions of BiRF Square:
  • Using the “predefined” value
  • With actual CRC calculation
• An optimized algorithm has been used
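The slide does not say which optimized algorithm was used. One common optimization is the table-driven, byte-at-a-time CRC, sketched below next to the plain bit-by-bit version. The polynomial is the standard reflected CRC-32 one (the same as `zlib.crc32`), chosen here only because the result can be checked against the standard library; it is not claimed to be the CRC used in Xilinx bitstreams.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial (assumption, see lead-in)


def make_table(poly: int = POLY) -> list:
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = make_table()


def crc32_bitwise(data: bytes) -> int:
    """Reference version: eight shift/xor steps per input byte."""
    crc = 0xFFFFFFFF
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def crc32_table(data: bytes) -> int:
    """Optimized version: one table lookup per input byte."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF


# Both versions agree with the standard library on the usual check string.
assert crc32_table(b"123456789") == crc32_bitwise(b"123456789") == zlib.crc32(b"123456789")
```

In hardware the same idea usually appears as a combinational network that consumes a whole word per clock cycle instead of a lookup table.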
![Page 110: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/110.jpg)
Synthesis results
• On a Virtex-4 with speed grade -12
  • General purpose version: max frequency of 160 MHz
  • Specific version: max frequency of 290 MHz
![Page 111: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/111.jpg)
Target Device
![Page 112: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/112.jpg)
Validation Architecture
![Page 113: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/113.jpg)
Results 1/2
• BiRF Square
  • Makes it possible to apply relocation in a self, partially and dynamically 2D-reconfigurable system
  • The occupation ratio is relatively small
  • The frequency is more than acceptable
  • Internal memory requirements are reduced
![Page 114: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/114.jpg)
Results 2/2
• Throughput of 7.3 MB/s
  • The total configuration file size is about 1 MB
  • Considering an architecture with:
    • 1/3 of the area as the fixed part
    • 2/3 as the reconfigurable part, with 6 slots
• Under these hypotheses:
  • The size of a partial bitstream will be about 110 KB
  • The relocation time is about 15 ms
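The figures above follow from simple arithmetic, reproduced in the sketch below. It assumes decimal units (1 MB = 1000 KB), which the slide does not specify; with binary units the results shift by a few percent. Function names are hypothetical.

```python
def partial_bitstream_kb(total_kb: float, reconfigurable_fraction: float,
                         slots: int) -> float:
    """Size of one slot's partial bitstream, assuming size scales with area."""
    return total_kb * reconfigurable_fraction / slots


def relocation_time_ms(size_kb: float, throughput_mb_s: float,
                       kb_per_mb: float = 1000.0) -> float:
    """Time needed to stream size_kb through the relocation filter."""
    return size_kb / (throughput_mb_s * kb_per_mb) * 1000.0


# About 1 MB total, 2/3 of the area reconfigurable, 6 slots.
size = partial_bitstream_kb(1000.0, 2.0 / 3.0, 6)

# Rounded partial bitstream size from the slide, at 7.3 MB/s.
time_ms = relocation_time_ms(110.0, 7.3)

print(f"partial bitstream: {size:.0f} KB, relocation time: {time_ms:.1f} ms")
```

The computed size of roughly 111 KB and time of roughly 15 ms are consistent with the rounded values reported on the slide.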
![Page 115: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/115.jpg)
What’s Next
• Future improvements:
  • Direct access to the memory (DMA)
    • Direct manipulation of the bitstream
    • Portability
  • Integration with ICAP
    • Elimination of the relocation overhead
    • Relocation time << reconfiguration time
• The final goal:
  • Creation of a real architecture that exploits self partial and dynamic 2D reconfiguration, with relocation
![Page 116: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/116.jpg)
Questions
![Page 117: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/117.jpg)
What’s next
• DRESD
• DReAMS: Alessandro Panella, Matteo Murgida
• Operating System: Ivan Beretta
• Design Flow: Antonio Piazzi
• Polaris: Massimo Morandi, Marco Novati
• HLR: Marco Maggioni
![Page 119: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/119.jpg)
Outline
• Introduction
  • Problem description
  • Project Goals
  • State of the Art
• Project in detail
  • Contributions
  • HLR workflow
    • GraphGen
    • IsomorphClustering
    • SimpleLatency
    • Salomone
  • Results
• What's next
![Page 120: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/120.jpg)
Problem Description
• What is High Level Reconfiguration...?
  • Theoretical approach to dynamic reconfiguration...
• Vision...
  • Reconfigurability has many advantages...
• Mission...
  • Exploit these advantages to obtain the best performance...
• How...?
  • Adapting a system to this execution model, managing complexity and drawbacks...
![Page 121: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/121.jpg)
Project Goal
• Create a complete HLR workflow...
  • From a real system specification to its reconfigurable execution model...
• Define precise interfaces for each phase...
  • To promote flexibility and future HLR research...
  • To develop a complete toolchain...
• Apply some algorithms regarding reconfigurability...
  • To reuse past work...
![Page 122: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/122.jpg)
State of the Art
• Present state of HLR...
  • Some ideas/concepts regarding clustering and scheduling...
  • ... but no complete and well-defined workflow.
  • ... and a lot of work still to do.
• System specifications analysis...
  • PandA HW/SW framework to promote new ideas...
  • Dynamic reconfigurability can be considered a branch of this research...
![Page 123: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/123.jpg)
Contribution
• Dynamic library loading system...
  • Embedded into the GNU compilation toolchain
• Porting of PandA libraries into Earendil...
  • Suitable for future analysis...
• HLR tools deployed onto Earendil...
  • Cover each step of the workflow...
![Page 124: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/124.jpg)
HLR workflow
• Clustering (with Analysis)... 1st Month
• Coloring... 2nd Month
• Scheduling... 3rd Month
[Figure: HLR workflow diagram. Block labels: GCC Frontend, PandA, Partitioning Algorithm, Clustered Graph, Metric Evaluation, Reconfigurable Clustered Graph, Scheduling Algorithm, Target Architecture, Database. Metric labels: Area, Latency, Rec. Time, Power]
![Page 125: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/125.jpg)
GraphGen
• GraphGen is the first step of the HLR toolchain...
  • Takes as input a system specification or an algorithm...
  • Produces a graph (CFG/BB/DFG/SDG)
• Performs the high-level analysis step...
  • Transforms the system description (C/C++/SystemC) into a representation suitable for further elaboration...
  • Based on GCC and compiler theory...
  • Uses PandA 0.4 functionalities to produce a statement-level graph...
![Page 126: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/126.jpg)
IsomorphClustering
• IsomorphClustering follows GraphGen in the HLR toolchain...
  • Takes as input a statement-level graph...
  • Produces a clustered graph...
• Clustering phase...
  • Aggregates nodes into configurations (the basic unit of reconfigurable execution)...
  • Based on isomorphism: tries to find different instances of isomorphic templates...
  • Different algorithms can also be applied...
![Page 127: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/127.jpg)
SimpleLatency
• SimpleLatency follows IsomorphClustering in the HLR toolchain...
  • Takes as input a clustered graph...
  • Adds latency information to each configuration...
  • Produces a reconfigurable clustered graph with latency evaluations...
• Coloring...
  • “Colors” each cluster with evaluations useful for reconfigurability...
  • Based on each cluster's internal critical path...
  • Different metrics for different architectures...
  • Connects HLR with real architectural parameters...
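A latency estimate based on a cluster's internal critical path can be sketched as a longest-path computation over the cluster's dependency graph. The per-node latencies, the graph encoding and the function name are hypothetical, and the sketch assumes the cluster is acyclic; it is not taken from the SimpleLatency sources.

```python
from typing import Dict, List


def critical_path_latency(latency: Dict[str, int],
                          succs: Dict[str, List[str]]) -> int:
    """Length of the longest latency-weighted path in an acyclic cluster."""
    memo: Dict[str, int] = {}

    def longest_from(node: str) -> int:
        # Latency of this node plus the slowest chain that depends on it.
        if node not in memo:
            tail = max((longest_from(s) for s in succs.get(node, [])), default=0)
            memo[node] = latency[node] + tail
        return memo[node]

    return max((longest_from(n) for n in latency), default=0)


# Hypothetical cluster: a feeds b and c, both feed d.
latency = {"a": 1, "b": 3, "c": 2, "d": 1}
succs = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
cluster_latency = critical_path_latency(latency, succs)
```

Swapping the `latency` table for values measured on another architecture changes the estimate without touching the graph, which is one way to read "different metrics for different architectures".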
![Page 128: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/128.jpg)
Salomone
• Salomone is the last step in the HLR toolchain...
  • Takes as input a reconfigurable clustered graph...
  • Produces a schedule on an abstract reconfigurable architecture...
• Scheduling...
  • It is considered the core task of HLR...
  • Maps each configuration onto an area portion...
  • Adapts the system execution to the reconfigurable model...
  • Based on a graph coloring algorithm...
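The link between scheduling and graph coloring can be illustrated with a greedy coloring of a conflict graph. This is a simple heuristic swapped in for brevity and is not claimed to be the algorithm Salomone implements. It assumes that two configurations conflict when they must be on the device at the same time, and that each color stands for one area portion; all names are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def greedy_coloring(conflicts: Dict[str, Set[str]],
                    order: Optional[Iterable[str]] = None) -> Dict[str, int]:
    """Assign each configuration the lowest area index unused by its conflicts."""
    colors: Dict[str, int] = {}
    for node in (order if order is not None else sorted(conflicts)):
        taken = {colors[n] for n in conflicts[node] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[node] = color
    return colors


# Hypothetical conflict graph: c2 overlaps in time with both c1 and c3.
conflicts = {
    "c1": {"c2"},
    "c2": {"c1", "c3"},
    "c3": {"c2"},
}
areas = greedy_coloring(conflicts)
```

Here `c1` and `c3` can share one area portion because they never conflict, so two portions are enough. Greedy coloring does not guarantee the minimum number of portions in general, and the result depends on the visiting order.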
![Page 129: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/129.jpg)
Results 1/3
• Based on AES encryption...
• Templates found with IsomorphClustering...
  • Execution time... 123.94 s
![Page 130: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/130.jpg)
Results 2/3
• Salomone adapting and coloring...
  • Execution time... 113.55 s
![Page 131: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/131.jpg)
Results 3/3
• Final Scheduling...
![Page 132: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/132.jpg)
What's next
• Heuristic implementation for Salomone...
  • To improve result quality in terms of the number of area portions...
• A new metric for area/latency...
  • Based on RTL logic synthesis evaluations...
• Introduce feedback into the HLR workflow...
  • Based on schedule evaluation...
• New clustering and scheduling algorithms...
  • Such as Napoleon...
![Page 133: HPPS - Final - 06/14/2007](https://reader038.fdocuments.net/reader038/viewer/2022103000/556c0ae5d8b42a852a8b4770/html5/thumbnails/133.jpg)
Questions