Async2002 Tutorial: - 1
Balsa – Description to Layout
A Hands-on Tutorial Session
Doug Edwards & Andrew Bardsley
Async2002 Tutorial: - 2
Other Balsa Tutorials
Async 2000 (Eilat)
• concentrated on language aspects
• small design examples
• Xilinx implementation of calculator
ACiD summer school, July 2002
• probably in-depth use of language for design examples
• buy the book - ISBN 0-792-37613-7
Async2002 Tutorial: - 3
Async2002 Tutorial: - 4
Aims of Async 2002 Tutorial
To explore parts of the system not demonstrated previously
• GUI front-end
  – balsa-mgr
• back-end alternatives
  – “how do I produce silicon?”
  – “what are the trade-offs?”
To gain some exposure to the language
• but, language learning by osmosis!
Async2002 Tutorial: - 5
Session Schedule
Brief overview of the Balsa system
Language introduction
• hands-on examples
To synthesise the original Manchester Small-Scale Experimental Machine (SSEM)
• experiment with different back-ends
A replica of the original SSEM is on display at the conference dinner
Async2002 Tutorial: - 6
Session Schedule
Coffee Break
Advanced Design
• Spamulet0 – write your own ARM-like processor description
Async2002 Tutorial: - 7
The Balsa Team
Doug Edwards: Team Leader
Luis Plana: Dual Rail Back-end
Andrew Bardsley: Chief Architect/Implementer
Will Toms: 1-of-4 Back-end
Lilian Janin: balsa-mgr/LARD
Async2002 Tutorial: - 8
Balsa Requirements
Freely available
• ftp://ftp.cs.man.ac.uk/pub/amulet/balsa/
• not all back-ends available
OS requirements:
• Linux
• Sun Solaris 7-8
• MacOS X (+ X11R6 …)
Async2002 Tutorial: - 9
Async2002 Tutorial: - 10
Front End
Async2002 Tutorial: - 11
Async2002 Tutorial: - 12
Compass DA Route
Async2002 Tutorial: - 13
Async2002 Tutorial: - 14
Xilinx route
Async2002 Tutorial: - 15
top level relies on Powerview
Async2002 Tutorial: - 16
Cadence route
Async2002 Tutorial: - 17
Async2002 Tutorial: - 18
Other Balsa Work
Burst-mode resynthesis
• Tibi Chelcea & Steve Nowick
Faster LARD simulation
• Lilian Janin (x50 speed up)
Datapath compilation optimisation
• Andrew Bardsley
Complete Amulet implementation
• Peter Riocreux et al.
Async2002 Tutorial: - 19
Proven Balsa Synthesis - DMA Controller for DRACO
[Figure callout: Balsa Synthesised DMA Controller]
Async2002 Tutorial: - 20
DMA Controller Layout
Async2002 Tutorial: - 21
What is Balsa?
Language for synthesising large async circuits & systems
CSP/OCCAM background
Tangram-like
• based on Tangram compilation function
• compiles to a small, parameterisable set of handshake components
• origins: ESPRIT 6143 EXACT project
Async2002 Tutorial: - 22
Handshake circuits – 1
Components communicate along handshake channels
Channels connect to ports on components
Ports have:
• Type
• Direction
• Sense
Async2002 Tutorial: - 23
Handshake Circuits – 2
Port type determines the number of data wires
• no data wires == control-only port!
Port direction is input, output or control only (called sync)
Port sense
• Active: initiate transfers (source the req)
• Passive: respond to requests (… the ack)
Async2002 Tutorial: - 24
Balsa Language Features
Data types based on sequence of bits• Arrays and records are bit-based• Element extraction is by array slicing• Strict data typing
Structural iteration Arrayed channels Parameterised, recursively expanded
procedures
Async2002 Tutorial: - 25
Balsa Language Features
Enclosed selection semantics• Allows passive ported circuits• Allows push (micropipeline-style) circuits• Allows unbuffered (latch-free) circuits
Async2002 Tutorial: - 26
Example: Single Place Buffer
import [balsa.types.basic]
type word is 16 bits
procedure buffer (input i : word; output o : word) is
  variable x : word
begin
  loop
    i -> x ;  -- Input communication
    o <- x    -- Output communication
  end
end
Async2002 Tutorial: - 27
Example: Single Place Buffer
import [balsa.types.basic]                             -- library mechanism
type word is 16 bits                                   -- type declaration
procedure buffer (input i : word; output o : word) is  -- procedure definition with channel declarations
  variable x : word                                    -- implies latch
begin
  loop                                                 -- repeat forever
    i -> x ;  -- read input channel into local variable x; ";" is sequential operation
    o <- x    -- output local variable x to output channel
  end
end
Async2002 Tutorial: - 28
Buffer Handshake Circuit
Single-place buffer
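[Figure: handshake circuit of the single-place buffer. The activation channel drives a repeater (#), which drives a sequencer (;). The sequencer drives two transferrers (T): one moves data from input i into variable x, the other from variable x to output o.]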
Async2002 Tutorial: - 29
Buffer Handshake Circuit
Single-place buffer: handshake sequence (slides 29 to 41, one step per slide)
1. Repeater is activated
2. Sequencer handshakes to left transferrer
3. Transferrer requests data from environment
4. Data transferred to variable x
5. Variable handshake completes
6. Transferrer handshake completes to environment
7. Transferrer handshake completes
8. Sequencer handshakes to right transferrer
9. Transferrer reads variable
10. Transferrer outputs to environment
11. Sequencer initiated handshakes complete
12. Sequencer completes its activation handshake
13. Repeater initiates another transfer, repeat
Async2002 Tutorial: - 42
Example Handshake Component
Handshake definition of repeater (Loop)
Loop(a,b) = (a: #[b])
          = (a: #[b;b])
          = (ar: #[br ; ba ; br ; ba])
[Figure callouts: signals ar and aa on the activation port a; signals br and ba on port b]
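The expansion above can be read as an event trace: one request on the activation port, then an endless alternation of request and acknowledge on port b, with no acknowledge on a. A minimal Python sketch of that reading follows; the function name `loop_trace` and the use of strings for signal events are illustrative assumptions, not part of the Balsa tools.

```python
from itertools import islice

def loop_trace():
    """Yield the signal events of Loop(a, b) = (ar: #[br ; ba])."""
    yield "ar"        # the environment activates the repeater once
    while True:       # '#' means repeat forever
        yield "br"    # request on channel b
        yield "ba"    # acknowledge on channel b completes one handshake

# Take the first seven events of the (infinite) trace.
events = list(islice(loop_trace(), 7))
print(events)
```

Because the loop never terminates, the acknowledge `aa` never appears in the trace.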
Async2002 Tutorial: - 43
Example Handshake Component
Case component (single-rail)
data “n” bits wide
true/complement lines: dual-rail expansion
1 hot encoding
Async2002 Tutorial: - 44
Compilation Tools
balsa-c
• compiles Balsa programs to Breeze
• includes other Breeze definition files
  – Breeze is a handshake-circuit netlist format
  – acts as a library format within Balsa
balsa-netlist
• produces an appropriate netlist from a compiled Balsa program
  – technology-specific options
Async2002 Tutorial: - 45
Simulation Tools
breeze2lard
• produces a LARD simulation file
various LARD utilities
• mainly hidden within the Makefile by balsa-md
Async2002 Tutorial: - 46
Utility Tools
breeze2ps
• creates a PostScript HC graph
breeze-cost
• enumerates the handshake circuits used and gives an approximate area cost
balsa-md
• automatic Makefile maker
balsa-mgr
• GUI interface to balsa-md
Async2002 Tutorial: - 47
Exercise: Single Stage Shift Register
Objective: introduction to balsa-mgr
cd ~/Balsa/shift-reg
balsa-mgr &
• create new project: Project -> New
• add SRA1.balsa to project
Async2002 Tutorial: - 48
Creating a Project
[Screenshot callout: create new project]
Async2002 Tutorial: - 49
Set Project Name
[Screenshot callout: set name]
Async2002 Tutorial: - 50
Adding Files
[Screenshot callout: Add Files]
Async2002 Tutorial: - 51
Choosing Files
[Screenshot callout: pick file(s)]
Async2002 Tutorial: - 52
Project Window
[Screenshot callouts: file list pane; edit pane; usual icons]
Async2002 Tutorial: - 53
Project Manager
tool-tip help pop-ups for icons
editor icon opens the editor defined in the Project -> Environment dialogue
• syntax modes for xemacs, elvis, nedit
right-mouse clicking on panes brings up context-sensitive menus
Browse the various menus (& pop-ups)
Async2002 Tutorial: - 54
Single Stage Shift-Register
-- Single Stage Shift Register SRA1.balsa
import [balsa.types.basic]
procedure SRA1 (input i : byte ; output o : byte) is
  variable x : byte
begin
  loop
    o <- x ;  -- read before write: x is read (sent to o) before it is written (from i)
    i -> x
  end
end
Async2002 Tutorial: - 55
Examining the Handshake Circuits
Switch to Makefile pane in balsa-mgr
list handshake circuits & their area cost
• click on cost run button
view handshake circuit graph
• click on SRA1.ps view button
Async2002 Tutorial: - 56
Viewing Cost
[Screenshot callout: click on tab]
Async2002 Tutorial: - 57
Execution Window
[Screenshot callouts: make commands; identifies output pane; list of HCs; total cost; standard error pane; standard out pane]
Async2002 Tutorial: - 58
Making Handshake Circuit Graph
Async2002 Tutorial: - 59
[Figure callouts on the handshake circuit graph: repeater; sequencer; transferrers; register; internal channel names; I/O ports]
Async2002 Tutorial: - 60
Exercise: n-place Shift Register
Objective: illustration of composition, structural iteration and simulation.
specify an 8-place shift register
• add SRA8.balsa to project
• ensure SRA8.balsa is selected
• click on breeze compile button in Makefile pane
• select add test fixture from right-click pop-up
See KvB: “Handshake Circuits”
Async2002 Tutorial: - 61
Adding SRA8
Async2002 Tutorial: - 62
SRA8 Code
-- Multistage Shift Register SRA8.balsa
import [balsa.types.basic]
import [SRA1]
procedure SRA8 (input i : byte; output o : byte) is
  constant n = 8
  array 1..n-1 of channel c : byte
begin
  SRA1 (i, c[1]) ||
  SRA1 (c[n-1], o) ||
  for || j in 1 .. n-2 then
    SRA1 (c[j], c[j+1])
  end
end
Async2002 Tutorial: - 63
SRA8 Code
-- Multistage Shift Register SRA8.balsa
import [balsa.types.basic]
import [SRA1]
procedure SRA8 (input i : byte; output o : byte) is
  constant n = 8                     -- define a constant
  array 1..n-1 of channel c : byte   -- internal channel array
begin
  SRA1 (i, c[1]) ||                  -- parallel composition
  SRA1 (c[n-1], o) ||
  for || j in 1 .. n-2 then          -- structural iteration
    SRA1 (c[j], c[j+1])
  end
end
Async2002 Tutorial: - 64
Structure of Circuit
[Figure: SRA8 drawn as a chain of stages between channel i and channel o, linked by the internal channels c[1], c[2], …, c[n-2], c[n-1]]
Async2002 Tutorial: - 66
Exercise: Hierarchical vs Flattened views
Check the cost of SRA8 and view the handshake circuit
Change to flattened compilation
• Project -> Project Options -> Flattened Compilation
Recheck the cost of SRA8 and view the handshake circuit again
• Flattened compilation gives “true” cost
Async2002 Tutorial: - 67
Adding Test Fixture
[Screenshot callout: right click pop-up]
Async2002 Tutorial: - 68
Test Options Pane
[Screenshot callout: set input filename to: data]
Async2002 Tutorial: - 69
Running LARD Simulations
[Screenshot callouts: text-only simulation; channel-viewer simulation]
Async2002 Tutorial: - 70
Simulation Results (Text)
[Screenshot callout: empty values read, then input data]
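The text simulation shows the register's initial (empty) contents emerging before any input data. A lock-step Python sketch of that behaviour is given here; it is a stream-level abstraction that ignores handshake timing, and the name `shift_register` and the use of `None` for an empty register are assumptions made for illustration.

```python
def shift_register(inputs, n=8, initial=None):
    """Lock-step model of an n-place shift register built from SRA1-like stages."""
    stages = [initial] * n   # stages[0] is nearest the input, stages[-1] nearest the output
    outputs = []
    for value in inputs:
        outputs.append(stages[-1])        # the last stage writes its register to o
        stages = [value] + stages[:-1]    # every stage then reads from its predecessor
    return outputs

# Ten input values pushed through an 8-place register.
result = shift_register([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(result)
```

With ten inputs and eight stages, the first eight outputs are the empty initial values and only the first two data values have reached the output.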
Async2002 Tutorial: - 71
LARD Channel Viewer - 1
Async2002 Tutorial: - 72
LARD Channel Viewer - 2
[Screenshot callouts: input & output channels; internal channels; incomplete handshakes; red = request; green = ack; data values on channels; zoom buttons]
Async2002 Tutorial: - 73
Improved Shift-Register Stage
After the 1st output, the last stage is ready for an input: it is vacant
• The vacancy propagates backwards towards the input stage
Cannot input a new value until the vacancy reaches the input stage
• poor throughput
Modify SRA1 to include an input and output register
Async2002 Tutorial: - 74
Improved Shift-Register Stage
Input channel i to reg x in parallel with outputting reg y to channel o
Then copy x into y
• the register assignment is: y := x
[Figure: stage SRC1, with input channel i feeding register x and register y feeding output channel o]
Async2002 Tutorial: - 75
Exercise: Language Level Trade-offs
Write your own SRC1 and SRC8
• copy SRA1.balsa to SRC1.balsa and edit
Compare the cost of SRC8 with SRA8 (must use flattened compilation)
Compare the behaviours of SRC8 and SRA8
Async2002 Tutorial: - 76
Wagging Shift Register: SRW8
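SRD1: demux i to o1, o2 alternately
SRE1: mux i1, i2 into o alternately
Middle SR can be either type A or C
[Figure: SRD1 (registers x, y) takes input i and drives o1 and o2; each output feeds a three-stage chain (SRA/C3); the two chains drive i1 and i2 of SRE1 (registers x, y), which produces output o]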
Async2002 Tutorial: - 77
Exercise: Build a Wagging Shift Register
SRD1
• read channel i into register x while writing register y to channel o1
• read channel i into register y while writing register x to channel o2
• repeat
Middle Registers
• compose 3 type A or type C register stages in each half
Async2002 Tutorial: - 78
Answers: SRD1
-- Single Stage Shift Register: SRD1.balsa
-- DeMuxes data stream for Wagging Shift Register
import [balsa.types.basic]
procedure SRD1 (input i : byte; output o1, o2 : byte) is
  variable x, y : byte
begin
  loop
    o1 <- x || i -> y;
    o2 <- y || i -> x
  end
end
Async2002 Tutorial: - 79
Answers: SRE1
-- Single Stage Shift Register: SRE1.balsa
-- Muxes data streams for Wagging Shift Register
import [balsa.types.basic]
procedure SRE1 (input i1, i2 : byte; output o : byte) is
  variable x, y : byte
begin
  loop
    o <- x || i1 -> y;
    o <- y || i2 -> x
  end
end
Async2002 Tutorial: - 80
Answers: SRW8
-- Multi-Stage Wagging Shift Register SRW8.balsa
import [balsa.types.basic]
import [SRD1]
import [SRE1]
import [SRC1]
procedure SRW8 (input i : byte; output o : byte) is
  constant n = 8  -- n must be even
  array 1 .. n/2 of channel c1, c2 : byte
begin
  SRD1 (i, c1[1], c2[1]) ||
  SRE1 (c1[n/2], c2[n/2], o) ||
  for || j in 1 .. n/2 - 1 then
    SRC1 (c1[j], c1[j+1]) ||
    SRC1 (c2[j], c2[j+1])
  end
end
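It may help to check that the wagging structure still behaves as an 8-place shift register that preserves data order. The Python sketch below models each component at the stream level only: SRD1 and SRE1 are each treated as a one-place delay combined with an alternating split or merge, and each SRC1 stage is assumed to behave as a one-place delay. The helper names `delay` and `wagging_shift_register` are illustrative, and handshake timing is ignored.

```python
def delay(stream, places, initial=None):
    """A shift register of the given length: its initial values come out first."""
    return [initial] * places + list(stream)

def wagging_shift_register(inputs, n=8, initial=None):
    """Stream-level model of SRW8: SRD1, two chains of n/2 - 1 stages, SRE1."""
    middle = n // 2 - 1
    split = delay(inputs, 1, initial)              # SRD1 holds one value before passing it on
    upper = delay(split[0::2], middle, initial)    # values sent to o1, then through one chain
    lower = delay(split[1::2], middle, initial)    # values sent to o2, then through the other
    # SRE1 takes alternately from i1 and i2; zip stops at the shorter chain.
    merged = [value for pair in zip(upper, lower) for value in pair]
    return delay(merged, 1, initial)               # SRE1 also holds one value

result = wagging_shift_register(range(1, 11))
print(result)
```

Under these assumptions, the output starts with eight initial values, followed by the input data in its original order.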
Async2002 Tutorial: - 81
The SSEM
Async2002 Tutorial: - 82
SSEM (The Baby)
World’s 1st stored program machine
• ran 21st June 1948 (GCD program)
• 32 bit processor, 2’s complement
• 7 instruction types
• 32 word memory - (x 256 banks)
• Single register accumulator (ACC)
• program counter (PC)
http://www.computer50.org/
Async2002 Tutorial: - 83
SSEM Instruction Set
JMP ; PC := M[Addr] indirect jump
JRP ; PC := PC + M[Addr] relative jump
LDN ; ACC:= - M[Addr] load negative
STO ; M[Addr] := ACC store result
SUB ; ACC := ACC - M[Addr] subtract
TEST ; if ACC<0 then PC := PC + 1 skip
STOP ; halt
Async2002 Tutorial: - 84
SSEM Operation
Instruction execution sequence:
• PC := PC + 1
• IR := M[PC]
• Decode and execute instruction
  – memory operand fetch if required
Repeat until STOP instruction
1st instruction from address 1
Async2002 Tutorial: - 85
SSEM Description: Types – 1
-- SSEM model in Balsa
type word is 32 bits
type LineAddress is 5 bits
type CRTAddress is 8 bits
-- SSEM function types
type SSEMFunc is enumeration
JMP, JRP, -- Abs. and rel. jumps
LDN, STO, -- Load negative and store
SUB, SUB_alt, -- Two encodings for subtract
TEST, STOP -- Skip and stop
end
[Callouts: obvious type definitions; enumeration keyword]
Async2002 Tutorial: - 86
SSEM Description: Types – 2
-- Complete instruction encoding
type SSEMInst is record
LineNo : LineAddress;
CRTNo : CRTAddress;
Func : SSEMFunc
over word -- pad to 32 bits
[Callouts: record keyword; over keyword: record is padded to the width of word]
Async2002 Tutorial: - 87
SSEM Channel & Variable Declarations
-- SSEM: Top level
procedure SSEM (                      -- main procedure
  -- Memory interface, MemA,MemRNW,MemR,MemW (channel declarations)
  output MemA : LineAddress;
  output MemRNW : bit;
  input MemR : word;
  output MemW : word;
  -- Signal halt state
  sync halted                         -- data-less handshake
) is
  variable ACC, ACC_slave : word
  variable IR : word                  -- instruction register
  variable PC, PC_step : LineAddress
  variable MDR : word                 -- memory data reg
  variable Stopped : bit
Async2002 Tutorial: - 88
SSEM: Function Use
-- Extract an address from a word
function ExtractAddress (wordVal : word) =
(wordVal as SSEMInst).LineNo
shared WriteExtractedAddress is begin
MemA <- ExtractAddress (IR) end
Async2002 Tutorial: - 89
SSEM: Function Use
-- Extract an address from a word
function ExtractAddress (wordVal : word) =  -- wordVal is the parameter
  (wordVal as SSEMInst).LineNo              -- cast 32 bit word into 32 bit record, then the field selector extracts the bottom 5 bits
shared WriteExtractedAddress is begin       -- shared procedures reuse hardware
  MemA <- ExtractAddress (IR)               -- use function
end
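The cast-then-select step can be mimicked with shifts and masks. The sketch below assumes the record fields are packed in declaration order starting at bit 0, which is consistent with LineNo occupying the bottom 5 bits; the positions given for CRTNo (bits 5 to 12) and Func (bits 13 to 15) follow from that assumption. The function name `decode_ssem_inst` is illustrative.

```python
def decode_ssem_inst(word):
    """Split a 32-bit word into the fields of the SSEMInst record."""
    return {
        "LineNo": word & 0x1F,         # 5 bits, bits 0-4
        "CRTNo": (word >> 5) & 0xFF,   # 8 bits, bits 5-12
        "Func": (word >> 13) & 0x7,    # 3 bits, bits 13-15; upper bits are padding
    }

# A word with Func = 6, CRTNo = 3 and LineNo = 17.
fields = decode_ssem_inst((6 << 13) | (3 << 5) | 17)
print(fields)
```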
Async2002 Tutorial: - 90
SSEM: Auxiliary Procedures – 1
-- Memory operations, shared procedures
shared MemoryWrite is
begin MemRNW <- 0 || WriteExtractedAddress ()
  || MemW <- ACC_slave end  -- ACC_slave is a copy of ACC
shared MemoryRead is
begin MemRNW <- 1 || WriteExtractedAddress ()
  || MemR -> MDR end
-- Fetch an instruction IR := M[PC]
procedure InstructionFetch is
begin MemRNW<-1 || MemA<-PC || MemR->IR end
Async2002 Tutorial: - 91
SSEM: Auxiliary Procedures – 2
shared ZeroACC is begin ACC := 0 end
shared ZeroPC is begin PC := 0 end
shared SUB is begin MemoryRead (); ACC_slave := (ACC - MDR as word) end
-- Modify the programme counter PC
shared IncrementPC is begin
PC := (PC + PC_step as LineAddress) end
shared AddMDRToPC is begin
PC_step:=ExtractAddress(MDR); IncrementPC() end
Async2002 Tutorial: - 92
SSEM: Decode & Execute
procedure DecodeAndExecuteInstruction is
begin
  case (IR as SSEMInst).Func of
    -- add JMP JRP LDN STO instructions
    SUB .. SUB_alt then SUB ()
  | TEST then
      if (ACC as array 32 of bit)[31] then  -- -ve?
        -- PC_step should already be 1
        IncrementPC ()
      end
  | STOP then Stopped := 1
  end ;
  ACC := ACC_slave
end
Async2002 Tutorial: - 93
SSEM: Main Body
begin
ZeroACC () || ZeroPC () ||
Stopped := 0; -- reset initialisation
while not Stopped then
PC_step := 1;
IncrementPC ();
InstructionFetch ();
DecodeAndExecuteInstruction ()
end;
sync halted; halt -- STOP instruction effect
end
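The fetch-decode-execute loop above can be cross-checked against a small software model. The Python sketch below follows the instruction set and execution sequence given earlier; it assumes enumeration values are numbered from 0 in declaration order and that the function field sits in bits 13 to 15. The names `encode` and `run_ssem` are illustrative and are not part of the tutorial's tools.

```python
MASK32 = 0xFFFFFFFF
JMP, JRP, LDN, STO, SUB, SUB_ALT, TEST, STOP = range(8)

def encode(func, addr=0):
    """Pack a function code (bits 13-15) and a line address (bits 0-4) into a word."""
    return (func << 13) | (addr & 0x1F)

def run_ssem(memory, max_steps=1000):
    """Run a 32-word SSEM memory image and return the memory after STOP."""
    mem = list(memory)
    acc, pc = 0, 0                                 # reset: ACC and PC are zeroed
    for _ in range(max_steps):
        pc = (pc + 1) & 0x1F                       # PC := PC + 1
        ir = mem[pc]                               # IR := M[PC]
        addr, func = ir & 0x1F, (ir >> 13) & 0x7
        if func == JMP:
            pc = mem[addr] & 0x1F
        elif func == JRP:
            pc = (pc + mem[addr]) & 0x1F
        elif func == LDN:
            acc = (-mem[addr]) & MASK32            # two's complement negate
        elif func == STO:
            mem[addr] = acc
        elif func in (SUB, SUB_ALT):
            acc = (acc - mem[addr]) & MASK32
        elif func == TEST:
            if acc & 0x80000000:                   # bit 31 set: ACC is negative
                pc = (pc + 1) & 0x1F
        else:                                      # STOP
            return mem
    raise RuntimeError("no STOP instruction reached within max_steps")

# Add M[20] and M[21] using only load-negative and subtract.
program = [0] * 32
program[1:7] = [encode(LDN, 20), encode(SUB, 21), encode(STO, 22),
                encode(LDN, 22), encode(STO, 22), encode(STOP)]
program[20], program[21] = 12, 8
final = run_ssem(program)
print(final[22])
```

The program starts at address 1 because the PC is incremented before the first fetch.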
Async2002 Tutorial: - 94
Exercise: Complete Instruction Decode
cd ~/Balsa/ssem
create new project
• add ssem.balsa to project
• complete DecodeAndExecute procedure or copy ssem.solution to ssem.balsa
• add LARD test file
  – options: LARD file: test-ssem.l
  – Sim Arguments: gcd.raw
• GCD source in gcd.s
  – assembler: ssem-asm
Async2002 Tutorial: - 95
Adding LARD Test File
[Screenshot callout: right click pop-up]
Async2002 Tutorial: - 96
Adding LARD Test File Options
Async2002 Tutorial: - 97
File Pane with LARD Test File
Async2002 Tutorial: - 98
Running the Simulation
Click on Makefile tab
Click on Run sim-<name> button
• channel viewer not too useful here
Data:
• addresses 0x11 and 0x12 (0xC and 0x8)
Result
• address 0x11 (4)
Beware: odd behaviour if the balsa file is ill-formed
Async2002 Tutorial: - 99
Balsa Backend
A Balsa circuit may be implemented in
• Different technologies
  – armlr7, ams035, (xilinx, st018, amust)
  – verilog - Example technology
• Each technology has different styles
  – Single rail
  – Dual Rail
  – One of Four
• Each style has various style options
Async2002 Tutorial: - 100
Style Options
Datapath Logic
• Standard DIMS
• Balanced DIMS
  – for “secure” applications
• NCL (Theseus style implementations)
Storage Elements
• SR latches
• Spacer latches: SR with enforced RTZ
  – for “secure” applications
• NCL: pipelined NCL variables
Async2002 Tutorial: - 101
Style Option Circuit Examples
[Figure: Dual Rail Spacer Latch]
Async2002 Tutorial: - 102
Style Option Circuit Examples
[Figure: 1-of-4 RS Latch]
Async2002 Tutorial: - 103
Style Option Circuit Examples
[Figure: NCL Pipeline Latch]
Async2002 Tutorial: - 104
Adding an Implementation
[Screenshot callout: right click pop-up]
Async2002 Tutorial: - 105
Choosing an Implementation
Async2002 Tutorial: - 106
Generating a Netlist
Use balsa-mgr to generate an implementation
• choose ams035
• choose a different implementation / style option from your neighbour
• make an implementation netlist
Async2002 Tutorial: - 107
Generating a Netlist
Async2002 Tutorial: - 108
Generating a Layout
Ensure you are in the correct directory
• run_demo
A job is spawned on a remote machine to place/route the netlist and display a chip-plot
extract core area from report
• core area: Total area (db units!)
• report your result to the presenter
• dismiss your chip-plot
Async2002 Tutorial: - 109
Async2002 Tutorial: - 110
Running a Back-Annotated Verilog Simulation
Ensure you are in the correct directory
• run_demo -v gcd.raw
A job is spawned on a remote machine to run a Verilog simulation
• report simulation time to presenter
• a waveform viewer (GTKWave) is spawned
  – look at interesting signals!
• dismiss the viewer
Async2002 Tutorial: - 111
LARD vs Verilog Simulation
LARD
• no implementations required
• test harnesses easy to write
• integrated into Balsa system
• source-level debugging
• new improved system “real soon”
Verilog
• standard language
• was much faster (no longer)
• initialisation issues
Async2002 Tutorial: - 112
Compilation Process
balsa-c -> .breeze & .sbreeze files
• .breeze reused by balsa-c
• .sbreeze: lisp formatted version of breeze
balsa-netlist -> CAD system netlist
• uses parameterised descriptions of handshake components
HCs described in abs language
Async2002 Tutorial: - 113
Async2002 Tutorial: - 114
Breeze Format
part add (
  passive sync activate;
  active input i : 4 bits;
  active input j : 4 bits;
  active output o : 4 bits) is
  attributes ( isprocedure,isPermanent,noOfChannel=17,line=6,column=1)
  local
    sync "@9:4" #1
    pull channel "i" #2 : 4 bits
    pull channel "j" #3 : 4 bits
    push channel "o" #4 : 4 bits
    pull channel "x" #5 : 4 bits
    sync "@12:11" #6
    pull channel "b" #7 : 4 bits
[Callouts: position in file; internal channel number]
Async2002 Tutorial: - 115
Breeze Format Ctd
begin
  $BrzVariable ( 4,1,"a[0..3]" : #14,{#8} )
  $BrzLoop ( #1,#17 )
  $BrzSequence ( 3:#17,{#16,#11,#6} )
  $BrzConcur ( 2:#16,{#15,#13} )
  $BrzFetch ( 4 : #15,#2,#14 )
  $BrzBinaryFunc (4,4,4,Add,false,false,false:#9,#8,#7)
  $BrzFetch (4:#6,#5,#4 )
end
[Callouts on $BrzVariable: width; no of read ports; write port; read port list]
Async2002 Tutorial: - 116
Abs Language
gate operators
• expand descriptions to trees of gates
• contain AND, OR-gates…, C-elements
• place helper-cells
partitioning operators
• manipulate vectors
small expression language
Async2002 Tutorial: - 117
Creating New Libraries
Configuration File:
• specifies netlist format to use
• specifies cell description files
• name mapping (long to short names)
• maximum fan-in of gates
Gate Description File
• list of gates in library with pin-mappings
Async2002 Tutorial: - 118
Creating New Libraries
Gate Mappings File:
• mappings from abs gates to cell library gates and helper cells
Helper Cells Descriptions
• descriptions of helper cells used in the abs descriptions
Component Descriptions File
• technology specific abs components & links to generic descriptions
Async2002 Tutorial: - 119
Minimum Component Requirement
Technology must support Structural Verilog, EDIF 2 0 0, or Compass ntl netlists
• or the originator could add a netlist format to Balsa
Inverter
2-input AND, NAND, OR, NOR
Latch or Flip-flop
Async2002 Tutorial: - 120
Recursive Definitions: An n-way multiplexer
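Decompose multiplexer
[Figure: Before Decomposition, a single multiplexer with inputs inp 0, inp 1, …, inp n-2, inp n-1 and output out. After Decomposition, two half-size multiplexers (inputs inp 0 … inp n/2-1 giving out 0, and inputs inp n/2 … inp n-1 giving out 1) feed a 2-input multiplexer that drives out.]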
Async2002 Tutorial: - 121
An n-way multiplexer -1
-- Pmux1.balsa: A recursive parameterised MUX definition
import [balsa.types.basic]
public
procedure PMux (
  parameter X : type;         -- width of input
  parameter n : cardinal;     -- number of inputs
  array n of input inp : X;   -- each input is a channel
  output out : X              -- output channel
) is
begin
  -- procedure body
Async2002 Tutorial: - 122
An n-way multiplexer -2
  if n = 0 then print error, "Parameter n should not be zero"
  | n = 1 then                -- base cases
    loop
      select inp[0] then      -- select keyword encloses block
        out <- inp[0]
      end
    end
  | n = 2 then
    loop
      select inp[0] then      -- when data arrives on either input, pass it to output
        out <- inp[0]
      | inp[1] then
        out <- inp[1]
      end
    end
Async2002 Tutorial: - 123
An n-way multiplexer -3
  else
    local                     -- local block with local definitions
      channel out0, out1 : X  -- 2 internal local channels
      constant mid = n/2
    begin
      -- two half-size muxes & one 2:1 mux
      PMux (X, mid, inp[0..mid-1], out0) ||
      PMux (X, n-mid, inp[mid..n-1], out1) ||
      PMux (X, 2, {out0, out1}, out)
    end
  end
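To see how the recursion unfolds for a given n, the expansion can be mirrored in Python. The sketch below only counts how many times each base case is instantiated; it does not model channels or data. The function name `pmux_cost` and the tuple it returns are illustrative assumptions.

```python
def pmux_cost(n):
    """Return (one_input_stages, two_input_muxes) produced by expanding PMux with n inputs."""
    if n == 0:
        raise ValueError("Parameter n should not be zero")
    if n == 1:
        return (1, 0)                   # base case: single input passed straight to out
    if n == 2:
        return (0, 1)                   # base case: 2-input select
    mid = n // 2                        # same split as: constant mid = n/2
    low_ones, low_twos = pmux_cost(mid)
    high_ones, high_twos = pmux_cost(n - mid)
    # two smaller muxes plus one 2:1 mux that merges out0 and out1
    return (low_ones + high_ones, low_twos + high_twos + 1)

print(pmux_cost(8))
print(pmux_cost(5))
```

For a power-of-two n the expansion uses only 2-input muxes; for other values some branches bottom out in the n = 1 case.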
Async2002 Tutorial: - 124
Testing the MUX
cd ~/Balsa/pmux
Open the project using balsa-mgr
• test_mux exercises the mux
• Simulate test_mux
Async2002 Tutorial: - 125
Spamulet0 - Prototype for SPA1
Subset of ARM instruction set
ALU ops, LDR/STR, Branch, Branch with link (procedure call) implemented
Sequential, register based design like the SSEM
Your task is to add to this description
Spamulet0 lacks LDM/STM, MUL, SWI, Coprocessor I/F, Pipelining, Exceptions, Operating modes
Async2002 Tutorial: - 126
The Project File
cd ~/Balsa/spamulet0
4 Balsa files, LARD test harness
• types.balsa - type and instruction format records
• alu.balsa - ALU with CC handling
• shift.balsa - parameterised shifter
• spamulet.balsa - top-level, fetch-decode-execute loop
• test-spamulet.l - LARD test harness
Async2002 Tutorial: - 127
Simulation Framework
LARD test harness provides simulated memory loaded from raw memory dump files
Small number of provided examples: hello.s, multiply.s, helloc.c
Memory dumps generated from:
• Assembler: .s -> spamulet-asm -> .raw
• C: .c -> spamulet-cc -> .raw
Async2002 Tutorial: - 128
LDM/STM - Load/Store Multiple
Load and store any of the registers as a block - including the PC!
Commonly used for function arguments, entry register saves and return
ldmdir Rbase!opt, {Ri, Rj, …}
stmdir Rbase!opt, {Ri, Rj …}
Registers always appear in memory in the same order with R0 at lowest address, R15 at highest
Async2002 Tutorial: - 129
LDM/STM Directions
dir is one of:
• ib - increment before
• db - decrement before
• ia - increment after
• da - decrement after
Problems include loading PC last and avoiding overwriting the base register
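The four directions differ only in where the block of words sits relative to the base register. The Python sketch below computes the address used for each selected register and the written-back base value, following the usual ARM rule that the lowest-numbered register goes to the lowest address and that each register occupies 4 bytes. The function name `ldm_stm_addresses` and its return shape are assumptions for illustration; the special cases for the PC and the base register are not handled.

```python
def ldm_stm_addresses(base, reg_mask, direction):
    """Return ({register: address}, new_base) for an LDM/STM with the given direction."""
    regs = [r for r in range(16) if (reg_mask >> r) & 1]   # selected registers, ascending
    size = 4 * len(regs)                                   # bytes moved: 4 per register
    start = {
        "ia": base,             # increment after: first word at base
        "ib": base + 4,         # increment before: first word just above base
        "da": base - size + 4,  # decrement after: last word at base
        "db": base - size,      # decrement before: last word just below base
    }[direction]
    new_base = base + size if direction in ("ia", "ib") else base - size
    addresses = {r: start + 4 * i for i, r in enumerate(regs)}
    return addresses, new_base

# Registers R0, R1 and R4 with a base of 0x1000.
mask = 0b10011
print(ldm_stm_addresses(0x1000, mask, "ia"))
print(ldm_stm_addresses(0x1000, mask, "db"))
```

Note that the number of selected registers is needed up front for the decrementing directions, which is where a bit population count becomes useful.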
Async2002 Tutorial: - 130
LDM/STM Directions 2
LDM/STM can also be used with “stack addressing”
• fa - full ascending (ldmda, stmib)
• fd - full descending (ldmia, stmdb)
• ea - empty ascending (ldmdb, stmia)
• ed - empty descending (ldmib, stmda)
The C compiler generates LDM/STMs with the stack addressing names
Async2002 Tutorial: - 131
Instruction Encoding
Async2002 Tutorial: - 132
Instruction Encoding – 2
Look at instLdmStm in types.balsa
Use: (ir as instLdmStm) to decode instructions
• e.g. (ir as instLdmStm).options.L is the load(1)/store(0) select bit
Option bits very similar to LDR and STR instructions, read spamulet.balsa
Async2002 Tutorial: - 133
Implementation Strategies
Edit spamulet.balsa
Use a while loop and a variable to iterate through the register select bits ((inst as instLdmStm).regs)
Perform memory access using MemoryRead and MemoryWrite
Don’t worry about PC or overwriting the base register yet
Async2002 Tutorial: - 134
Some Important Information
Instruction    L bit  P bit  U bit
ldmda/ldmfa    1      0      0
ldmia/ldmfd    1      0      1
ldmdb/ldmea    1      1      0
ldmib/ldmed    1      1      1
stmda/stmed    0      0      0
stmia/stmea    0      0      1
stmdb/stmfd    0      1      0
stmib/stmfa    0      1      1
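The table has a regular structure: L picks load or store, U picks increment or decrement, and P picks before or after. A short Python sketch of that mapping follows, including the stack-addressing aliases listed in the table; the function name `ldm_stm_mnemonic` is an illustrative assumption.

```python
# Stack-addressing aliases, taken from the table rows.
STACK_ALIAS = {
    "ldmda": "ldmfa", "ldmia": "ldmfd", "ldmdb": "ldmea", "ldmib": "ldmed",
    "stmda": "stmed", "stmia": "stmea", "stmdb": "stmfd", "stmib": "stmfa",
}

def ldm_stm_mnemonic(l_bit, p_bit, u_bit):
    """Return (mnemonic, stack alias) for the given L, P and U option bits."""
    op = "ldm" if l_bit else "stm"          # L: load (1) or store (0)
    updown = "i" if u_bit else "d"          # U: increment (1) or decrement (0)
    when = "b" if p_bit else "a"            # P: before (1) or after (0)
    name = op + updown + when
    return name, STACK_ALIAS[name]

print(ldm_stm_mnemonic(1, 0, 1))
print(ldm_stm_mnemonic(0, 1, 0))
```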
Async2002 Tutorial: - 135
Some Important Information 2
W bit specifies whether the base register is to be written back (W=1) or keep its pre-LDM/STM value (W=0)
The “!” in the mnemonic selects writeback
Ignore the S bit - it’s used for processor mode changes (e.g. ISR returns)
Async2002 Tutorial: - 136
Some Important Information 3
To help with debugging, Balsa has the print command
Prints simulation values in LARD
Example:
• b <- v1; print "Hello"; b <- v2
• print "v1=", v1, "v2=", v2
Enjoy
Async2002 Tutorial: - 137
Additional Exercises
A choice of advanced design exercises:
• A general shifter
• A bit population counter
Language Summary in handout + code listings
Async2002 Tutorial: - 138
A Balsa Shifter
General shifters required for processors
Write a description for a rotate right function
• solution in ror/solution
Alternatively extend the standard solution to other shift functions
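As a reference for checking a rotate-right description, here is a Python sketch of one common structure: a logarithmic shifter in which each stage rotates by a power of two when the matching bit of the shift amount is set. This structure is an assumption chosen for illustration and is not necessarily the one used in ror/solution; the function name `ror` and the default width of 32 bits are also assumptions.

```python
def ror(value, amount, width=32):
    """Rotate value right by amount, built from conditional power-of-two stages."""
    mask = (1 << width) - 1
    value &= mask
    stage = 1
    while stage < width:                 # stages rotate by 1, 2, 4, 8, 16 for width 32
        if amount & stage:               # this bit of the amount enables the stage
            value = ((value >> stage) | (value << (width - stage))) & mask
        stage <<= 1
    return value

print(hex(ror(0x12345678, 8)))
print(hex(ror(0x00000001, 1)))
```

Only the low-order bits of the amount are examined, so rotating by the full width or more wraps around.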
Async2002 Tutorial: - 139
Structure of a ROR shifter
[Figure callout: a local procedure]
Async2002 Tutorial: - 140
Bit Population Counter
Counting the number of bits that are set to ‘1’ is necessary for ARM’s LDM/STM instructions
Write a description for such a unit
• solution in popcount/solution
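One way to picture such a unit is as a tree of adders that sums neighbouring counts until a single total remains. The Python sketch below follows that idea for a 16-bit register list; the tree structure, the function name `popcount` and the default width are assumptions for illustration and may differ from popcount/solution.

```python
def popcount(bits, width=16):
    """Count the bits set to 1 by summing pairs of partial counts in a tree."""
    counts = [(bits >> i) & 1 for i in range(width)]   # one 1-bit count per input bit
    while len(counts) > 1:
        if len(counts) % 2:                            # pad odd levels with a zero
            counts.append(0)
        counts = [counts[i] + counts[i + 1] for i in range(0, len(counts), 2)]
    return counts[0]

print(popcount(0b1011000000000001))
```

For a 16-bit input the tree has four levels of adders.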
Async2002 Tutorial: - 141
Bit Population Counter
Async2002 Tutorial: - 142
Acknowledgements
Thanks to:
• Jeff Pepper, our resident CAD Tools expert
• Dave Bowden, our AV technician, for bringing AV to these laboratories
• The rest of the Amulet group, most of whom are now using (debugging) Balsa
• System support staff for setting up the demo accounts etc.