ADSD Fall2011 04 Design Partitioning Micro Architecture 2011Oct21
8/3/2019 ADSD Fall2011 04 Design Partitioning Micro Architecture 2011Oct21
1/103
Dr. Rehan Hafiz Lecture # 04
Course Website for ADSD Fall 2011
http://lms.nust.edu.pk/
Lectures: Tuesday @ 5:30-6:20 pm, Friday @ 6:30-7:20 pm
Contact: By appointment/Email
Office: VISpro Lab above SEECS Library
Acknowledgement: Material from the following sources has been consulted/used in these slides:
1. [CIL] Advanced Digital Design with the Verilog HDL, M. D. Ciletti
2. [SHO] Digital Design of Signal Processing Systems, Dr. Shoab A. Khan
3. [STV] Advanced FPGA Design, Steve Kilts
Material from these slides CAN be used with the following citing reference:
Dr. Rehan Hafiz: Advanced Digital System Design 2010
Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
http://creativecommons.org/licenses/by-nc-sa/3.0/
This Lecture
ASM: Algorithmic State Machine
Understanding Design Partition
Controllers
FSM: Finite State Machines
Mealy & Moore
Micro Programmed
Algorithmic State Machine
ASMs: Usually the 1st step towards algorithm-to-hardware mapping
FSMs: More controller oriented
ASM: Algorithmic State Machine
Example: Up/Down Counter
[CIL]
Implicit Coding
Up/Down Counter
Understanding Design Partitioning
Systematically Porting an Algorithm to H/W
Greatest Common Divisor
Steps:
Swap
Check
Process
Slides from MIT Course 6.375 Complex Digital Systems http://csg.csail.mit.edu/6.375/
GCD Algorithm
Steps:
Swap if required (s)
Check if B != 0 (c)
Process: A = A - B (p)
A = 100, B = 60 (s)
B != 0 (c)
A = 40, B = 60 (p)
A = 60, B = 40 (s)
B != 0 (c)
A = 20, B = 40 (p)
A = 40, B = 20 (s)
B != 0 (c)
A = 20, B = 20 (p)
A = 20, B = 20 (s)
B != 0 (c)
A = 0, B = 20 (p)
A = 20, B = 0 (s)
B = 0 (c): stop, GCD = A = 20
GCD Behavioral Example
module gcd_behavioral #( parameter width = 16 )
(
    input      [width-1:0] A_in, B_in,
    output reg [width-1:0] Y          // assigned in the always block, so declared reg here
);
    reg [width-1:0] A, B, swap;
    integer done;
    always @( A_in or B_in )
    begin
        done = 0;
        A = A_in;
        B = B_in;
        while ( !done )
        begin
            if ( A < B )
            begin
                // Swap so that A holds the larger value
                swap = A;
                A = B;
                B = swap;
            end
            else if ( B != 0 )
                A = A - B;
            else
                done = 1;
        end
        Y = A;
    end
endmodule
We start by identifying DATA processing elements & controlling signals!
Reference Slides
Slides from MIT Course 6.375 Complex Digital Systems http://csg.csail.mit.edu/6.375/
Slides 11-46
Summary
Define higher level block diagram
Define its interface
Decompose into smaller blocks if required
Decompose into Datapath & Controller
Use different modules to implement Data path &
Controller
Define their interface
Connect them in higher level block
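The steps in this summary can be sketched for the GCD example as a datapath module, a controller module, and a top-level block that connects them. This is only an illustrative sketch, not the MIT reference design: the module names, port names (load, swap, sub, a_lt_b, b_zero) and the three-state controller are assumptions made for this example.

```verilog
// Datapath: holds A and B, performs swap/subtract, reports status flags.
module gcd_datapath #(parameter W = 16) (
  input          clk,
  input  [W-1:0] a_in, b_in,
  input          load,     // control: latch the operands
  input          swap,     // control: exchange A and B
  input          sub,      // control: A <= A - B
  output         a_lt_b,   // status: A < B
  output         b_zero,   // status: B == 0
  output [W-1:0] y
);
  reg [W-1:0] a, b;
  assign a_lt_b = (a < b);
  assign b_zero = (b == 0);
  assign y      = a;
  always @(posedge clk) begin
    if (load)      begin a <= a_in; b <= b_in; end
    else if (swap) begin a <= b;    b <= a;    end
    else if (sub)  a <= a - b;
  end
endmodule

// Controller: sequences the datapath using only the status flags.
module gcd_controller (
  input  clk, reset, start,
  input  a_lt_b, b_zero,
  output load, swap, sub, done
);
  localparam IDLE = 2'd0, CALC = 2'd1, DONE = 2'd2;
  reg [1:0] state;
  always @(posedge clk) begin
    if (reset) state <= IDLE;
    else case (state)
      IDLE:    if (start) state <= CALC;
      CALC:    if (!a_lt_b && b_zero) state <= DONE;
      DONE:    state <= IDLE;
      default: state <= IDLE;
    endcase
  end
  assign load = (state == IDLE) && start;
  assign swap = (state == CALC) && a_lt_b;
  assign sub  = (state == CALC) && !a_lt_b && !b_zero;
  assign done = (state == DONE);
endmodule

// Higher-level block: only wiring between controller and datapath.
module gcd_top #(parameter W = 16) (
  input          clk, reset, start,
  input  [W-1:0] a_in, b_in,
  output [W-1:0] y,
  output         done
);
  wire load, swap, sub, a_lt_b, b_zero;
  gcd_datapath #(.W(W)) dp (
    .clk(clk), .a_in(a_in), .b_in(b_in),
    .load(load), .swap(swap), .sub(sub),
    .a_lt_b(a_lt_b), .b_zero(b_zero), .y(y));
  gcd_controller ctl (
    .clk(clk), .reset(reset), .start(start),
    .a_lt_b(a_lt_b), .b_zero(b_zero),
    .load(load), .swap(swap), .sub(sub), .done(done));
endmodule
```

With this split, the controller can be changed (for example, to add a busy flag) without touching the datapath module.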
Controller Vs. Data-path Partitioning
Design Partitioning
Data path: The pipe that carries the data from the input of the design to the
output and performs the necessary operations on the data.
ALUs, Storage Registers & logic for moving data
Controller: Determines the sequence
Configures the data path for various operations
Data path and control blocks should be partitioned into different modules.
Allows module re-use
Allows controller updates without requiring updates to the datapath
Datapath: timing critical. Partitioning allows dedicated floor planning for datapath logic
2002 Dr. James P. Davis
Logic systems consist of two basic elements:
Control logic consists of state machines (FSM)
Datapath logic consists of functions like counters, arithmetic,
multiplexers, decoders and memory (Wired Connected Datapaths)
Finite State Machines: Moore Vs. Mealy Machine
Moore Machine
Output is a function only of the present state
May have more states
Synchronous outputs
No glitching
One cycle delay
Full cycle of stable output
Mealy Machine
Output is a function of both present state & input
May have fewer states
Asynchronous outputs
If input glitches, so does output
Output immediately available
Output may not be stable long enough to be useful
ASMs
Moore Machine: No Oval, No Conditional Output List
Example: Output a ONE after detecting FOUR 1s in a binary sequence
State Transition Graph
What would its Moore equivalent be?
[SHO]
ASM
ASM: Mealy Vs. Moore
Architectures of Mealy & Moore Machines!
The choice between Mealy and Moore machine implementations is usually the designer's choice.
When some of the inputs are expected to glitch and outputs are required to be stable for one complete cycle, Moore is the best choice.
[SHO]
// This module implements an FSM for the detection of four ones in a serial input stream of data
module fsm_mealy(
    input clk,                 // system clock
    input reset,               // system reset
    input data_in,             // 1-bit input stream
    output reg four_ones_det   // 1-bit output to indicate whether 4 ones are detected
);
    // Internal variables
    reg [1:0] current_state,   // 2-bit current state register
              next_state;      // 2-bit next state register
    // State tags assigned using binary encoding
    parameter STATE_0 = 2'b00,
              STATE_1 = 2'b01,
              STATE_2 = 2'b10,
              STATE_3 = 2'b11;
    // Always block for state assignment
    always @(posedge clk)
    begin
        if (reset)
            current_state <= STATE_0;
        else
            current_state <= next_state;
    end
    // Next-state assignment block
    // This block implements the combinational cloud of next-state assignment logic
    always @(*)
    begin
        case (current_state)
        STATE_0:
        begin
            if (data_in)
            begin
                // transition to next state
                next_state = STATE_1;
                four_ones_det = 1'b0;
            end
            else
            begin
                // retain same state
                next_state = STATE_0;
                four_ones_det = 1'b0;
            end
        end
        STATE_1:
        begin
            if (data_in)
            begin
                // transition to next state
                next_state = STATE_2;
                four_ones_det = 1'b0;
            end
            else
            begin
                // retain same state
                next_state = STATE_1;
                four_ones_det = 1'b0;
            end
        end
        STATE_2:
        begin
            if (data_in)
            begin
                // transition to next state
                next_state = STATE_3;
                four_ones_det = 1'b0;
            end
            else
            begin
                // retain same state
                next_state = STATE_2;
                four_ones_det = 1'b0;
            end
        end
        STATE_3:
        begin
            if (data_in)
            begin
                // fourth one seen: assert the output and start over
                next_state = STATE_0;
                four_ones_det = 1'b1;
            end
            else
            begin
                // retain same state
                next_state = STATE_3;
                four_ones_det = 1'b0;
            end
        end
        endcase
    end
endmodule
To make this machine MOORE, the output should be a function of current_state only, not of the input or next_state
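A possible Moore version of the four-ones detector is sketched below. It assumes one extra state (S4, "four ones seen") so that the output can be decoded from the state register alone; the module name and state names are illustrative and not taken from [SHO].

```verilog
module fsm_moore (
  input  clk, reset, data_in,
  output four_ones_det
);
  // Five states: S0..S3 count the ones seen so far, S4 flags detection.
  localparam S0 = 3'd0, S1 = 3'd1, S2 = 3'd2, S3 = 3'd3, S4 = 3'd4;
  reg [2:0] current_state, next_state;

  // Sequential part: state register
  always @(posedge clk)
    if (reset) current_state <= S0;
    else       current_state <= next_state;

  // Combinational part: next-state logic only
  always @(*) begin
    next_state = current_state;            // hold by default
    case (current_state)
      S0: if (data_in) next_state = S1;
      S1: if (data_in) next_state = S2;
      S2: if (data_in) next_state = S3;
      S3: if (data_in) next_state = S4;
      S4: next_state = data_in ? S1 : S0;  // start counting again
      default: next_state = S0;            // recover from unused codes
    endcase
  end

  // Moore output: depends on the current state only
  assign four_ones_det = (current_state == S4);
endmodule
```

Compared with the Mealy version, the output appears one clock later but stays stable for a full cycle.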
State Encoding Schemes
One Hot: Very light on next-state and decoding logic, at the cost of one flip-flop per state. In fact, a sequence can be defined using a simple shift register.
Binary-coded counter sequences often change multiple bits on one count transition. That can lead to decoding glitches. Gray codes ensure minimum glitches since just one bit changes per transition.
Need to take care of Illegal States with One Hot Encoding
It is important to handle illegal states by checking whether more than
one bit of the state register is 1.
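A minimal sketch of such a check, assuming a hypothetical four-state one-hot sequencer built as a rotating shift register. Besides "more than one bit set", it also treats the all-zero code as illegal, which is an addition to the rule stated on the slide.

```verilog
module onehot_seq (
  input            clk, reset, advance,
  output reg [3:0] state,
  output           illegal
);
  // (x & (x - 1)) is non-zero exactly when more than one bit of x is set.
  assign illegal = (state == 4'b0000) ||
                   ((state & (state - 4'b0001)) != 4'b0000);

  always @(posedge clk)
    if (reset || illegal) state <= 4'b0001;                 // (re)start in first state
    else if (advance)     state <= {state[2:0], state[3]};  // rotate the single 1
endmodule
```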
Guidelines - Summary
Design Partitioning in Datapath and Controller
Datapath and control parts have different design objectives, so keep them in different blocks!
Datapath is usually synthesized for better timing; controller is synthesized to take minimum area.
FSM Coding in Procedural Blocks
Two always blocks are preferred: one implements the sequential part that assigns the next state to the state register, and the second implements the combinational logic that computes the next state.
The designer can include the output computations for Mealy or Moore machines in the same combinational block. Alternatively, if the outputs are easy to compute, they can be computed separately in a continuous assignment outside the combinational procedural block.
State Encoding
Use meaningful tags using `define or parameter statements for all possible states.
Select the best encoding scheme.
Detect a pair of 1's or 0's in a single-bit input. That is, the input will be a series of ones and zeros. If two ones or two zeros come one after another, the output should go high. Otherwise the output should be low.
http://electrosofts.com/verilog/fsm.html
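One way to sketch the pair detector is as a small Mealy machine that remembers the previous bit. This is an independent illustration and not the code from the linked page; the module, state and port names are assumptions.

```verilog
module pair_detect (
  input  clk, reset, din,
  output pair_found
);
  // INIT: no bit seen yet, GOT0: last bit was 0, GOT1: last bit was 1
  localparam INIT = 2'd0, GOT0 = 2'd1, GOT1 = 2'd2;
  reg [1:0] state;

  always @(posedge clk)
    if (reset) state <= INIT;
    else       state <= din ? GOT1 : GOT0;

  // Mealy output: high when the current bit repeats the previous one
  assign pair_found = ((state == GOT0) && !din) ||
                      ((state == GOT1) &&  din);
endmodule
```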
Micro-Programmed State Machines
In hardwired state-machine-based designs, the controller is implemented as a Mealy or Moore finite state machine (FSM)
This makes the design rigid
What can we do if updates to the algorithm or sequencing are expected?
Make the controller programmable
How?
Idea
We DO NOT implement the logic for the next state. We simply store the outputs & next state for the current state in a memory, just like a lookup table.
The combinational logic is replaced by a sequence of control signals that are stored in program memory (PM).
The PM may be read-only memory (ROM) or random-access memory (RAM).
The address of the contents in the memory is determined by the current state and the input to the FSM.
General Architecture
The designer evaluates all possible state
transitions based on inputs and the current
state and tabulates the outputs and next
states as micro coding for PM.
These values are placed in the PM such that
the inputs and the current state provide the
index or address to the PM.
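As a sketch of this architecture, the four-ones Mealy detector can be recast with a small PM addressed by {current_state, data_in}. The word layout {next_state, output} and the use of an initial block to fill the memory are assumptions for illustration; whether such initialization is synthesizable depends on the target tool.

```verilog
module fsm_microcoded_mealy (
  input  clk, reset, data_in,
  output four_ones_det
);
  reg  [1:0] current_state;
  reg  [2:0] pm [0:7];                        // word = {next_state[1:0], output}
  wire [2:0] pm_word = pm[{current_state, data_in}];

  initial begin
    // address = {state, input}      next state, output
    pm[3'b000] = 3'b000;          // S0, 0 -> S0, 0
    pm[3'b001] = 3'b010;          // S0, 1 -> S1, 0
    pm[3'b010] = 3'b010;          // S1, 0 -> S1, 0
    pm[3'b011] = 3'b100;          // S1, 1 -> S2, 0
    pm[3'b100] = 3'b100;          // S2, 0 -> S2, 0
    pm[3'b101] = 3'b110;          // S2, 1 -> S3, 0
    pm[3'b110] = 3'b110;          // S3, 0 -> S3, 0
    pm[3'b111] = 3'b001;          // S3, 1 -> S0, 1
  end

  always @(posedge clk)
    if (reset) current_state <= 2'b00;
    else       current_state <= pm_word[2:1];

  assign four_ones_det = pm_word[0];
endmodule
```

Changing the sequencing now means rewriting the PM contents rather than the logic.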
Example (MEALY)
Verilog Code
Micro Programmed MOORE
The micro program memory is split into two parts
Combinational logic I and logic II are replaced by PM I and PM II.
The input and the current state constitute the address for PM I.
The memory contents of PM I are filled to appropriately generate the next state according to the ASM chart. The width of PM I is equal to the size of the current state register, whereas its depth is 2^n, where n is the number of bits in {input, current state}
Only the current state acts as the address for PM II. The contents of PM II
generate output signals for the datapath
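A sketch of the split-memory arrangement, reusing the four-ones detector with five states as a Moore machine. The names pm1 and pm2, the 3-bit state and the memory initialization style are assumptions for this example.

```verilog
module fsm_microcoded_moore (
  input  clk, reset, data_in,
  output four_ones_det
);
  reg [2:0] current_state;
  reg [2:0] pm1 [0:15];   // PM I: next state, addressed by {current_state, data_in}
  reg       pm2 [0:7];    // PM II: output, addressed by current_state only
  integer i;

  initial begin
    for (i = 0; i < 16; i = i + 1) pm1[i] = 3'd0;  // unused codes fall back to S0
    for (i = 0; i < 8;  i = i + 1) pm2[i] = 1'b0;
    pm1[{3'd0, 1'b1}] = 3'd1;                      // S0 --1--> S1
    pm1[{3'd1, 1'b0}] = 3'd1;  pm1[{3'd1, 1'b1}] = 3'd2;
    pm1[{3'd2, 1'b0}] = 3'd2;  pm1[{3'd2, 1'b1}] = 3'd3;
    pm1[{3'd3, 1'b0}] = 3'd3;  pm1[{3'd3, 1'b1}] = 3'd4;
    pm1[{3'd4, 1'b1}] = 3'd1;                      // S4 --1--> S1, S4 --0--> S0
    pm2[3'd4] = 1'b1;                              // output high only in S4
  end

  always @(posedge clk)
    if (reset) current_state <= 3'd0;
    else       current_state <= pm1[{current_state, data_in}];

  assign four_ones_det = pm2[current_state];
endmodule
```

Here PM I is 3 bits wide and 2^4 = 16 words deep, while PM II has one word per state code.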
Example
Variations: Counter based State Machines
Many controller designs do not depend on external inputs.
They may only require a fixed sequence of control signals.
To read a value, the design only needs to generate addresses to the PM.
Simply use counters!
Remember the difference between a micro-processor and these micro-programmed state machines for the upcoming slides.
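A minimal sketch of a counter-based sequencer: a free-running counter is the only address source for the PM. The parameters AW and CW and the placeholder memory contents are assumptions; a real design would load its own control words.

```verilog
module counter_sequencer #(
  parameter AW = 3,   // address width of the PM
  parameter CW = 4    // width of the control word
) (
  input           clk, reset,
  output [CW-1:0] ctrl
);
  reg [AW-1:0] pc;
  reg [CW-1:0] pm [0:(1<<AW)-1];
  integer i;

  // Placeholder contents only: each word simply holds its own address.
  initial
    for (i = 0; i < (1<<AW); i = i + 1) pm[i] = i;

  always @(posedge clk)
    if (reset) pc <= {AW{1'b0}};
    else       pc <= pc + 1'b1;   // wraps around after the last address

  assign ctrl = pm[pc];
endmodule
```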
Variations: Adding Jumps
Loadable Counter based State Machines
State machines also have jumps, and may also have explicit jumps decided at runtime!
The controller should be capable of jumping to start generating control signals from a new address in the PM.
Make the branching address part of the micro-code!
Unconditional Branching
The load bit provides a programmable way of deciding whether a JUMP should be associated with a particular state
Branch_addr provides the address
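A sketch of the loadable-counter idea with unconditional branching. The micro-word layout {load, branch_addr, ctrl} and the three-word example program are assumptions for illustration.

```verilog
module jump_sequencer (
  input        clk, reset,
  output [3:0] ctrl
);
  reg  [2:0] pc;
  reg  [7:0] pm [0:7];            // word = {load, branch_addr[2:0], ctrl[3:0]}
  wire [7:0] word = pm[pc];
  integer i;

  initial begin
    for (i = 0; i < 8; i = i + 1) pm[i] = 8'h00;
    pm[0] = {1'b0, 3'd0, 4'b0001};
    pm[1] = {1'b0, 3'd0, 4'b0010};
    pm[2] = {1'b1, 3'd0, 4'b0100};  // load bit set: jump back to address 0
  end

  always @(posedge clk)
    if (reset)        pc <= 3'd0;
    else if (word[7]) pc <= word[6:4];  // load branch_addr into the counter
    else              pc <= pc + 3'd1;  // otherwise count up

  assign ctrl = word[3:0];
endmodule
```

With these contents the sequencer cycles through addresses 0, 1, 2 and never reaches the unused words.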
Variations: Loadable Counter based State Machines with Conditional Branch Support
Algorithms may require conditional jump support, for example as a result of some ALU operation
Some sort of Status and Control Register (SCR) may be used
It is a good idea to have a centralized status register in your controller
Not all status signals are always useful
We increase the load bits to have a programmable way to test various options from the available status bits
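A sketch of conditional branching with a widened load field. The 2-bit cond_sel encoding (00 never, 01 always, 10 if status[0], 11 if status[1]), the two status bits and the example program are all assumptions made for this illustration.

```verilog
module cond_sequencer (
  input        clk, reset,
  input  [1:0] status,            // e.g. flags collected from the datapath
  output [3:0] ctrl
);
  reg  [2:0] pc;
  reg  [8:0] pm [0:7];            // word = {cond_sel[1:0], branch_addr[2:0], ctrl[3:0]}
  wire [8:0] word        = pm[pc];
  wire [1:0] cond_sel    = word[8:7];
  wire [2:0] branch_addr = word[6:4];
  reg        take;
  integer i;

  initial begin
    for (i = 0; i < 8; i = i + 1) pm[i] = 9'd0;
    pm[0] = {2'b00, 3'd0, 4'b0001};  // plain step
    pm[1] = {2'b10, 3'd3, 4'b0010};  // if status[0], jump to 3
    pm[2] = {2'b01, 3'd1, 4'b0000};  // otherwise loop back to 1
    pm[3] = {2'b01, 3'd0, 4'b1000};  // done: restart from 0
  end

  // Programmable selection of the branch condition
  always @(*)
    case (cond_sel)
      2'b00:   take = 1'b0;
      2'b01:   take = 1'b1;
      2'b10:   take = status[0];
      default: take = status[1];
    endcase

  always @(posedge clk)
    if (reset)     pc <= 3'd0;
    else if (take) pc <= branch_addr;
    else           pc <= pc + 3'd1;

  assign ctrl = word[3:0];
endmodule
```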
Example Design Scenario
Variation: Register-based Controllers
Similar to the PC (Program Counter) approach
Adding Subroutine Support
A subroutine needs to return to the next micro-code instruction, so we need to store the return address in a register.
The state machine saves the contents of the micro PC in a special register called the subroutine return address (SRA) register.
Parity bits are sometimes added to check false conditions. Again, this helps in keeping the datapath as independent as possible.
This allows us to branch on both true and false states, and it is programmable.
PC ADDr, RET ADDr, JMP ADDr
Load SRA on CALL to subroutine
PC Address (00)
JMP Address & Load SRA on CALL (01)
RET Address (10): Select SRA Address
Automatically updates the next PC
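The next-address selection listed above can be sketched as a micro PC with a single SRA register. The 2-bit select encoding follows the slide labels (00, 01, 10); the module name, the separate call input and the parameter AW are assumptions.

```verilog
module micro_pc #(parameter AW = 4) (
  input               clk, reset,
  input      [1:0]    sel,       // 00: PC + 1, 01: jump/call, 10: return
  input               call,      // with sel == 01: also save the return address
  input      [AW-1:0] jmp_addr,  // branch address from the micro-code
  output reg [AW-1:0] pc
);
  reg  [AW-1:0] sra;                     // subroutine return address
  wire [AW-1:0] pc_next = pc + 1'b1;

  always @(posedge clk) begin
    if (reset) begin
      pc  <= {AW{1'b0}};
      sra <= {AW{1'b0}};
    end else begin
      case (sel)
        2'b00:   pc <= pc_next;
        2'b01:   begin
                   pc <= jmp_addr;
                   if (call) sra <= pc_next;  // return to the instruction after the CALL
                 end
        2'b10:   pc <= sra;
        default: pc <= pc_next;
      endcase
    end
  end
endmodule
```

A single SRA register supports only one level of subroutine call.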
Adding Nested Subroutine Support
Add a STACK!
Level of nesting?
PC ADDr, RET ADDr, JMP ADDr
Logic for Subroutine Address Stack
On CALL, write is enabled to save the RET address, and the correct LIFO address is selected based upon the MUX value (a simple increment is fine for STACK ADDRESSING)
Read_lifo_addr points to the top of the stack
Write_lifo_addr points to top+1 of the stack
No error handling is assumed
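A sketch of this stack logic, using the read_lifo_addr and write_lifo_addr names from the slide. The module name, parameters and port names are assumptions, and, as on the slide, overflow and underflow are not handled.

```verilog
module sra_stack #(
  parameter AW = 4,          // width of a return address
  parameter DEPTH_LOG2 = 2   // stack depth = 2**DEPTH_LOG2 nesting levels
) (
  input           clk, reset,
  input           call, ret,
  input  [AW-1:0] ret_addr_in,    // address to return to (saved on CALL)
  output [AW-1:0] ret_addr_out    // address on top of the stack
);
  reg  [AW-1:0]         lifo [0:(1<<DEPTH_LOG2)-1];
  reg  [DEPTH_LOG2-1:0] write_lifo_addr;                           // top + 1
  wire [DEPTH_LOG2-1:0] read_lifo_addr = write_lifo_addr - 1'b1;   // top

  assign ret_addr_out = lifo[read_lifo_addr];

  always @(posedge clk) begin
    if (reset)
      write_lifo_addr <= {DEPTH_LOG2{1'b0}};
    else if (call) begin
      lifo[write_lifo_addr] <= ret_addr_in;            // push
      write_lifo_addr       <= write_lifo_addr + 1'b1;
    end else if (ret)
      write_lifo_addr <= write_lifo_addr - 1'b1;       // pop
  end
endmodule
```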
Complete System!
LOOPs in State Machines
Example: Filtering
State 1: Reset
State 2: Wait for Data
State 3: Wait for Complete Data Packet
State 4: Start Processing: Repeat States 5-6, 256 times
State 5: Convolve filter with data at location x,y
State 6: x++, y++
State 7: End
What if you want to apply a cascaded filter?
State 1: Reset
State 2: Wait for Data
State 3: Wait for Complete Data Packet
State 3.5: Start Filtering: Repeat State 4, 2 times (for two filters)
State 4: Start Processing: Repeat States 5-6, 256 times
State 5: Convolve filter with data at location x,y
State 6: x++, y++
State 7: End
Need nested LOOP support!
Imagine doing this for a hard-wired state machine!
Adding LOOP Support
Consider a LOOP instruction
A counter is needed now
The loop counter loads its value on the loop command
End address in a loop instance reached: why do we need this?
LOOP ended: why do we need this?
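One way to picture the two signals is the sketch below: "end address reached" decides when to jump back to the start of the loop body, and "loop ended" decides when to fall through instead. The interface (loop_cmd, loop_count, loop_end_addr) is hypothetical, the loop body is assumed to start right after the LOOP instruction, and loop_count is assumed to be at least 1.

```verilog
module loop_pc #(
  parameter AW = 4,   // micro PC width
  parameter CW = 8    // loop counter width
) (
  input               clk, reset,
  input               loop_cmd,       // LOOP instruction decoded this cycle
  input      [CW-1:0] loop_count,     // number of passes (assumed >= 1)
  input      [AW-1:0] loop_end_addr,  // last address of the loop body
  output reg [AW-1:0] pc
);
  reg [CW-1:0] count;
  reg [AW-1:0] start_addr, end_addr;
  reg          active;

  wire at_end    = active && (pc == end_addr);  // end address reached
  wire loop_done = at_end && (count == 1);      // last pass finished

  always @(posedge clk) begin
    if (reset) begin
      pc <= 0; count <= 0; start_addr <= 0; end_addr <= 0; active <= 1'b0;
    end else if (loop_cmd) begin
      count      <= loop_count;     // loop counter loads on the loop command
      start_addr <= pc + 1'b1;      // body starts right after the LOOP word
      end_addr   <= loop_end_addr;
      active     <= 1'b1;
      pc         <= pc + 1'b1;
    end else if (at_end && !loop_done) begin
      count <= count - 1'b1;        // one pass finished, go round again
      pc    <= start_addr;
    end else begin
      if (loop_done) active <= 1'b0;
      pc <= pc + 1'b1;              // normal sequencing or loop exit
    end
  end
endmodule
```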
Adding NESTED LOOP Support
Add STACKs to your architecture !
Good thing :
All stacks need the same global address logic controller !!!
Why ?
Complete System!
LOOP & Subroutine Address Stack
Design Example I: Microcoded Machine FIFO/LIFO
Example Design 1: LIFO/FIFO Architecture
A traditional four-deep FIFO requires 4 states
Working:
WRITE: gets the new value from IN_BUS into the next available space
DEL: updates the read address for the OUT_BUS
ERROR: any error condition, for example DEL on empty
Micro-Code for FIFO
Micro-Code for LIFO
Design Example II: Design for Block Based Estimation
Example 2: Design for Block based Motion Estimation
Image Source: http://www-sipl.technion.ac.il/Info/News&Events_1_e.php?id=373
Block based exhaustive Motion Estimation searches for a block in the whole image & computes some similarity measure, e.g. Sum of Absolute Differences (SAD)
[SHO] Fig 10.22
Raster Scanning
Sample Design
[SHO]
System Design for a Complex System!
From where shall I start?
1. Follow a Top-Down Hierarchical Model with iterative refinement
2. Define the interface with the external world (other components, memory, etc.)
   The memory arrangement can be tricky, but again we identify it incrementally
3. Define major functional blocks & reiterate Steps 1-3 for each of them until you constitute your complete data path
Consider Block based Motion Estimation
[SHO] Fig 10.22
Let's assume we wish to have a micro-coded design.
We wish to have the flexibility to change the raster scan direction!
The FUN part: let's start the design right now. Divide & Conquer
Developing a RASTER Machine!
Consider Block based Motion Estimation
Considerations:
Describe what your block shall do
It shall read an image and a reference block, both from memory, raster scan the target image completely, and report the x,y for the lowest computed SAD.
Define its I/O & draw the block diagram!
Any particular specs?
The customer wants it programmable and may change the raster style & starting position in future!
Start studying your algorithm to go deeper into the design.
Requires four nested loops, so you need a controller with loop support!
Shall require some ALU to do the real data crunching!
Requires a register file to store data read from the memory
RASTER MACHINE: needs a lot of address logic to generate the right addresses depending upon the current state!
Need to store tx, ty
Tx & Ty Register
Reference RAM (Single Ported)
Target RAM (Single Ported)
Target & REF Register File: addressing controlled by Address Generator (above)
Address Generator for: Tx, Ty, ALU, RAMs, Register File. Inputs: Current State, Tx, Ty
Address Generator for ALU for the processing state. To: ALU. From: Reg File
Address Generator for Extra Column/Row (EAG)
RASTER Control: controls the Address Generation Logic
Block Address Generator (BAG): for generating addresses for memory access during initial loading. Needs to take care of Row Major Addressing. Input: From TAG. Output: To Reg File & Memory
tX,tY Address Generator (TAG)
ALU: performing SAD on each cycle & storing the corresponding tx & ty with minimum SAD
Controller (Micro-Coded, Supporting Nested Loops)
Row Major Addressing for Matrices
a b c d
e f g h
i j k l
m n o p
C = Number of Columns
Suppose your loop is over i,j; where i is the loop index for the current row and j is the loop index for the current column
Row (i)   Col (j)   Linear Address (Row * C) + Col   Address (binary)   Data
0         0         0                                [0000]             a
0         1         1                                [0001]             b
0         2         2                                [0010]             c
0         3         3                                [0011]             d
1         0         4                                [0100]             e
1         1         5                                [0101]             f
1         2         6                                [0110]             g
1         3         7                                [0111]             h
2         0         8                                [1000]             i
2         1         9                                [1001]             j
2         2         10                               [1010]             k
2         3         11                               [1011]             l
3         0         12                               [1100]             m
3         1         13                               [1101]             n
3         2         14                               [1110]             o
3         3         15                               [1111]             p
How can you implement, for a square image:
- A Row Major to Linear Address Mapper
- A Linear to Row-Major Mapper
Solution:
Concatenation & De-Concatenation!
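A sketch of both mappers for the 4x4 case in the table. It assumes the number of columns is a power of two, so that (Row * C) + Col reduces to placing the row bits above the column bits; the module and parameter names are illustrative.

```verilog
// Row-major (row, col) to linear address: pure concatenation.
module rowmajor_to_linear #(
  parameter RB = 2,   // bits for the row index
  parameter CB = 2    // bits for the column index (C = 2**CB columns)
) (
  input  [RB-1:0]    row,
  input  [CB-1:0]    col,
  output [RB+CB-1:0] addr
);
  assign addr = {row, col};        // e.g. row 2, col 1 -> 4'b1001 = 9
endmodule

// Linear address back to (row, col): de-concatenation.
module linear_to_rowmajor #(
  parameter RB = 2,
  parameter CB = 2
) (
  input  [RB+CB-1:0] addr,
  output [RB-1:0]    row,
  output [CB-1:0]    col
);
  assign {row, col} = addr;        // e.g. 4'b1110 = 14 -> row 3, col 2
endmodule
```

Neither mapper needs a multiplier or an adder; both are wiring only.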
For i = 0 : 255-(N-1)
  For j = 0 : 255-(N-1)
    For k = 0 : (N-1)
      For l = 0 : (N-1)
        SAD(k,l) = |S(k,l) - R(k,l)|
        SAD(i,j) = SAD(i,j) + SAD(k,l)
      End
    End
    If (SAD(i,j) < Min_SAD); Min_SAD = SAD(i,j)
  End
End
Need to get data from RAM, assuming Row Major Order
Shall need a Row Major to Linear Converter if required
Once the blocks are loaded, the ALU (SAD block) requires only a simple one-to-one mapping (address generation)!
N = elements per row or column, assuming a square block!
i = tx
j = ty
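The two innermost loops amount to an absolute-difference accumulator. A minimal sketch is given below; the module name, the one-pixel-pair-per-clock interface and the widths PW and SW are assumptions, and SW must be chosen wide enough for N*N maximum differences.

```verilog
module sad_accumulate #(
  parameter PW = 8,    // pixel width
  parameter SW = 16    // accumulator width
) (
  input               clk,
  input               clear,     // start a new block position (new tx, ty)
  input               enable,    // a valid pixel pair is present
  input      [PW-1:0] s_pixel,   // S(k,l): pixel from the target block
  input      [PW-1:0] r_pixel,   // R(k,l): pixel from the reference block
  output reg [SW-1:0] sad
);
  // |S - R| computed without signed arithmetic
  wire [PW-1:0] abs_diff = (s_pixel > r_pixel) ? (s_pixel - r_pixel)
                                               : (r_pixel - s_pixel);

  always @(posedge clk)
    if (clear)       sad <= {SW{1'b0}};
    else if (enable) sad <= sad + abs_diff;
endmodule
```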
Raster Algo!
For i = 0:2:(255-(N-1))/2
  For j = 0 : 255-(N-1)
    For k = 0 : (N-1)
      For l = 0 : (N-1)
        SAD(k,l) = |S(k,l) - R(k,l)|
        SAD(i,j) = SAD(i,j) + SAD(k,l)
      End
    End
    If (SAD(i,j) < Min_SAD); Min_SAD = SAD(i,j)
  End
  tx = tx + 1
  For j = 255-(N-1) : -1 : 0
    For k = 0 : (N-1)
      For l = 0 : (N-1)
        SAD(k,l) = |S(k,l) - R(k,l)|
        SAD(i,j) = SAD(i,j) + SAD(k,l)
      End
    End
    If (SAD(i,j) < Min_SAD); Min_SAD = SAD(i,j)
  End
End
Rastering efficiently!
1 5 3 7 1
2 3 7 5 2
3 7 4 3 3
4 3 5 2 4
5 1 6 1 5
ALU-In Depth
Instruction/State | State Value | Loop | Start | End | Comments
Reset | S0 | | | | Reset everything
Set tx 0 | | | | | Initialize tx (starting x co-ordinate)
Set ty 0 | | | | | Initialize ty (starting y co-ordinate)
RASTER RIGHT | | | | | Tell processor you are traversing right initially
Lp InitBlk | S1 | Block size | Lp InitBlk | Lp InitBlk | Load initial blocks for REF & TARGET. Will take clks equal to the number of elements
Lp R | S2 | (256)/2-8 | Lp C | LpR_Dne | Run till state LpR_Dne, equal to half of the number of rows
Lp C | S3 | 256-8 | Lc+Pr | LpC_Dne | Run till state LpC_Dne, equal to the number of columns for each row traversed in RIGHT direction
Lc+Pr | S4 | = c size | Lc+Pr | Lc+Pr | Process & load RIGHT/LEFT column due to RASTER value
Pr | S5 | b c size | Pr | Pr | Process only
Pr_dne | | | | | Store result
Update_ty | | | | | Update ty based upon previous RASTER direction
SHIFT LEFT | | | | | Shift left
LpC_Dne | S7 | | | | Done with one row (over all the columns)
RASTER DOWN | | | | | Block needs to move down!
Update_tx | | | | | As defined by previous RASTER!
Load R/C | | = c size | Lc+Pr | Lc+Pr | Load row due to RASTER value
SHIFT UP | | | | | Update REG files!
RASTER LEFT | | | | | This step can be avoided by XORing a predefined bit of the counter: useful for RASTER!
Lp C | S3 | 256-8 | Lc+Pr | LpC_Dne |
Lc+Pr | S4 | = c size | Lc+Pr | Lc+Pr | Process & load LEFT column due to RASTER value
Pr | S5 | b c size | | | Process only
Pr_dne | | | | | Store result
Update_ty | | | | | Update ty based upon previous RASTER direction
SHIFT RIGHT | | | | | Shift right: take the extra column to the other end!
LpC_Dne | S7 | | | |
RASTER DOWN | | | | |
Update_tx | | | | |
Load R/C | | = c size | Lc+Pr | Lc+Pr | Load row due to RASTER value
SHIFT UP | | | | |
RASTER RIGHT | | | | |
Lp R_Dne | | | | |
Designing your own microprocessor
Micro Architecture Documenting
Case Study: RISC-SPM (A mini RISC Stored Program Machine)
Datapath Vs. Control Logic Partitioning
Design Spec or Micro-Architecture
Partitioning of functions into blocks, clock/reset requirements, pipelining of registers, memory buffers, state machines and interface details.
Micro Architecture Document Template
Part 1: Block name, Owner, Version control
Part 2: Overview
Part 3: Functional/Requirement Specifications
Operation details, Interfacing signals, ...
Part 4: Detailed Functional description of key circuitry with drawings
Part 5: Verification: list of assertions, formal verification rules, etc.
Part 6: Comments
Micro-Architecture Template
Part 1: Block name, Owner, Version control
Block Name
A mini RISC Stored Program Machine, Dual Port RAM
Version Control
Version | Modification | Author/s | Date | Remarks
1.0 | Initial Draft | Ossama | 10th Aug, 09 |
2.0 | Updated FSM for Bulk Transfer, Page No | Saad | 13th Aug, 09 | It was found that ..
Micro-Architecture Template
Part 2: Overview
Describe what your block is supposed to do
A mini RISC Stored Program Machine that performs basic arithmetic ...
Give enough information for people to recognize the functionality at a glance
Should list:
Abbreviations
References
Micro-Architecture Template
Part 3: Functional/Requirement Specification
What are the functional demands / requirements / constraints of your block?
Examples:
The mini RISC SPM should operate at 2.5 GHz
Interface with the external world
Interfacing Signals, any specific interface
Instruction Set
Interface with the external world
RISC SPM: Rst, Clk, Int
Micro-Architecture Template
Part 3: Functional/Requirement Specification
Interface Signal List
Every interface signal should be listed
Don't forget comments: for example, if a system clock is gated low
Remember to fill in information which is helpful to the designers interfacing to you.
Micro-Architecture Template
Part 4: Detailed Functional description
(a) Block Level Diagram (Hierarchical)
(b) Datapath
(c) Controller
Block Level Diagram: Identify Your Major Functional Blocks
Controller, Rst, Clk, Memory, Processor
Micro-Architecture Template
Part 4: Detailed Functional description
a) Block level diagram
If your block is top level, you can go gradually to lower levels
Block diagram / Macro-Architecture
Highlighting the flow of data and control signals
Draw the control path and data path for each ground-level block
For each & every block specify: Overview, Interfacing Signals, ...
Moving further down into design
Further add the functionality
Show how your block is structured
Don't necessarily draw every wire; rather, take a qualitative approach
All interface signals should be present on your drawing.
Show all storage elements/registers/pipeline stages
Block Level Diagram: Identify Your Major Functional Blocks
Controller, Rst, Clk, Memory, Register File, ALU, Instruction Reg., Program Counter, Processor
(b) Datapath
(c) Controller
Control Signals Generation
Finite State Machines
ASM Charts
(d) Timing waveforms of interfacing signals, e.g. interfacing with external RAM
(e) Memory Map
Status registers, defined I/O ports, etc.
Micro-Architecture Template
Part 5: Verification
Describe the rules for the correct behaviour of your block
Take your time and describe rules
Example: 2 cycles after signal A goes down, signal B should also go down.
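Such a rule can also be captured as a small checker module for simulation. The sketch below is one possible reading of the example rule: it assumes "goes down" means a 1-to-0 change seen on consecutive clock edges, and that B must be low exactly two clock edges after that. The module and signal names are hypothetical.

```verilog
module a_b_rule_checker (
  input      clk, reset,
  input      a, b,
  output reg violation    // pulses high when the rule is broken
);
  reg  a_d1;                      // a, delayed by one cycle
  reg  fall_d1, fall_d2;          // falling-edge marker, delayed 1 and 2 cycles
  wire a_fall = a_d1 && !a;       // a was high on the last edge, low now

  always @(posedge clk) begin
    if (reset) begin
      a_d1      <= 1'b0;
      fall_d1   <= 1'b0;
      fall_d2   <= 1'b0;
      violation <= 1'b0;
    end else begin
      a_d1      <= a;
      fall_d1   <= a_fall;
      fall_d2   <= fall_d1;
      violation <= fall_d2 && b;  // b still high two cycles after a fell
    end
  end
endmodule
```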
Micro Architecture Document Summary
Part 1: Block name, Owner, Version control
Part 2: Overview
Part 3: Functional/Requirement Specifications
Operation details/requirements, Interfacing signals
Part 4: Detailed Functional description
State diagrams for Control & Data path & waveforms
Part 5: Verification: list of assertions, formal verification rules, etc.