George Mason University
ECE 448 – FPGA and ASIC Design with VHDL
ECE 448 Lecture 13
Multipliers
Timing Parameters
Required reading
• S. Brown and Z. Vranesic, Fundamentals of Digital Logic with VHDL Design
Chapter 10.2.3, Shift-and-Add Multiplier
Shift-and-Add Multiplier
An algorithm for multiplication
(a) Manual method

Decimal
      13      Multiplicand, A
    × 11      Multiplier, B
    ----
      13
     13
    ----
     143      Product

Binary
            1 1 0 1    Multiplicand, A
          × 1 0 1 1    Multiplier, B
          ---------
            1 1 0 1
          1 1 0 1
        0 0 0 0
      1 1 0 1
    ---------------
    1 0 0 0 1 1 1 1    Product

(b) Pseudo-code

    P = 0 ;
    for i = 0 to n − 1 do
        if b_i = 1 then
            P = P + A ;
        end if ;
        Left-shift A ;
    end for ;
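The pseudo-code above can be tried out in simulation before any hardware is designed. The following is a minimal sketch, not the lecture's circuit: it uses ieee.numeric_std, and the names shift_add_demo and shift_add are hypothetical, chosen only for this illustration. It checks the manual example 13 × 11 = 143.

```vhdl
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;
USE ieee.numeric_std.all ;

-- Hypothetical simulation-only entity (not part of the lecture code).
ENTITY shift_add_demo IS
END shift_add_demo ;

ARCHITECTURE Sim OF shift_add_demo IS
    CONSTANT N : INTEGER := 4 ;   -- operand width used in the manual example

    -- Shift-and-add product of two N-bit unsigned operands (2N-bit result).
    FUNCTION shift_add ( a_in, b_in : UNSIGNED(N-1 DOWNTO 0) )
        RETURN UNSIGNED IS
        VARIABLE A : UNSIGNED(2*N-1 DOWNTO 0) ;
        VARIABLE P : UNSIGNED(2*N-1 DOWNTO 0) := (OTHERS => '0') ;  -- P = 0
    BEGIN
        A := RESIZE(a_in, 2*N) ;          -- zero-extend the multiplicand
        FOR i IN 0 TO N-1 LOOP
            IF b_in(i) = '1' THEN
                P := P + A ;              -- add the shifted multiplicand
            END IF ;
            A := SHIFT_LEFT(A, 1) ;       -- Left-shift A
        END LOOP ;
        RETURN P ;
    END FUNCTION ;
BEGIN
    Check: PROCESS
        VARIABLE P : UNSIGNED(2*N-1 DOWNTO 0) ;
    BEGIN
        P := shift_add(TO_UNSIGNED(13, N), TO_UNSIGNED(11, N)) ;
        ASSERT P = 143 REPORT "shift-and-add mismatch" SEVERITY FAILURE ;
        REPORT "13 x 11 = " & INTEGER'IMAGE(TO_INTEGER(P)) ;
        WAIT ;
    END PROCESS ;
END Sim ;
```

The loop performs all n iterations in zero simulation time; the sequential circuit on the following slides spreads the same steps over clock cycles, one shift per cycle.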
Expected behavior of the multiplier
Datapath for the multiplier
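[Figure: multiplier datapath. DataA, extended with n zeros to 2n bits, loads the 2n-bit shift-left register A (controls LA, EA). DataB loads the n-bit shift-right register B (controls LB, EB); b0 is the least significant bit of B, and z = 1 when B = 0. A 2n-bit adder forms Sum = A + P. A 2n-bit 2-to-1 multiplexer controlled by Psel selects 0 (Psel = 0) or Sum (Psel = 1) as DataP, which is stored in the 2n-bit register P (enable EP). All registers share Clock.]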
ASM chart for the multiplier
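[Figure: ASM chart with three states, entered at S1 on Reset.]
S1: P ← 0. While s = 0, stay in S1 (Load A, Load B); when s = 1, go to S2.
S2: Shift left A, shift right B. If b0 = 1, P ← P + A. If B = 0, go to S3; otherwise stay in S2.
S3: Done. While s = 1, stay in S3; when s = 0, return to S1.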
ASM chart for the multiplier control circuit
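[Figure: ASM chart of the control circuit, entered at S1 on Reset.]
S1: Psel = 0, EP. If s = 0, stay in S1; if s = 1, go to S2.
S2: Psel = 1, EA, EB. If b0 = 1, assert EP. If z = 1, go to S3; otherwise stay in S2.
S3: Done. If s = 1, stay in S3; if s = 0, return to S1.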
VHDL code of multiplier circuit (1)
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;
USE ieee.std_logic_unsigned.all ;
USE work.components.all ;

ENTITY multiply IS
    GENERIC ( N : INTEGER := 8; NN : INTEGER := 16 ) ;
    PORT ( Clock     : IN  STD_LOGIC ;
           Resetn    : IN  STD_LOGIC ;
           LA, LB, s : IN  STD_LOGIC ;
           DataA     : IN  STD_LOGIC_VECTOR(N-1 DOWNTO 0) ;
           DataB     : IN  STD_LOGIC_VECTOR(N-1 DOWNTO 0) ;
           P         : OUT STD_LOGIC_VECTOR(NN-1 DOWNTO 0) ;  -- product is 2N bits wide
           Done      : OUT STD_LOGIC ) ;
END multiply ;

VHDL code of multiplier circuit (2)
ARCHITECTURE Behavior OF multiply IS
    TYPE State_type IS ( S1, S2, S3 ) ;
    SIGNAL y : State_type ;
    SIGNAL Psel, z, EA, EB, EP, Zero : STD_LOGIC ;
    SIGNAL B, N_Zeros : STD_LOGIC_VECTOR(N-1 DOWNTO 0) ;
    -- PF is the output of the 2N-bit product register, so it is NN bits wide
    SIGNAL A, Ain, DataP, Sum, PF : STD_LOGIC_VECTOR(NN-1 DOWNTO 0) ;
BEGIN
    FSM_transitions: PROCESS ( Resetn, Clock )
    BEGIN
        IF Resetn = '0' THEN
            y <= S1 ;
        ELSIF (Clock'EVENT AND Clock = '1') THEN
            CASE y IS
                WHEN S1 =>
                    IF s = '0' THEN y <= S1 ; ELSE y <= S2 ; END IF ;
                WHEN S2 =>
                    IF z = '0' THEN y <= S2 ; ELSE y <= S3 ; END IF ;
                WHEN S3 =>
                    IF s = '1' THEN y <= S3 ; ELSE y <= S1 ; END IF ;
            END CASE ;
        END IF ;
    END PROCESS ;
VHDL code of multiplier circuit (3)
    FSM_outputs: PROCESS ( y, s, B(0) )
    BEGIN
        -- Default values; each state overrides only what it asserts
        EP <= '0' ; EA <= '0' ; EB <= '0' ; Done <= '0' ; Psel <= '0' ;
        CASE y IS
            WHEN S1 =>
                EP <= '1' ;
            WHEN S2 =>
                EA <= '1' ; EB <= '1' ; Psel <= '1' ;
                IF B(0) = '1' THEN
                    EP <= '1' ;
                ELSE
                    EP <= '0' ;
                END IF ;
            WHEN S3 =>
                Done <= '1' ;
        END CASE ;
    END PROCESS ;
VHDL code of multiplier circuit (4)
    -- Define the datapath circuit
    Zero <= '0' ;
    N_Zeros <= (OTHERS => '0') ;
    Ain <= N_Zeros & DataA ;

    ShiftA: shiftlne GENERIC MAP ( N => NN )
        PORT MAP ( Ain, LA, EA, Zero, Clock, A ) ;
    ShiftB: shiftrne GENERIC MAP ( N => N )
        PORT MAP ( DataB, LB, EB, Zero, Clock, B ) ;

    z <= '1' WHEN B = N_Zeros ELSE '0' ;
    Sum <= A + PF ;
    P <= PF ;

    -- Define the 2n 2-to-1 multiplexers for DataP
    GenMUX: FOR i IN 0 TO NN-1 GENERATE
        Muxi: mux2to1 PORT MAP ( Zero, Sum(i), Psel, DataP(i) ) ;
    END GENERATE ;

    RegP: regne GENERIC MAP ( N => NN )
        PORT MAP ( DataP, Resetn, EP, Clock, PF ) ;
END Behavior ;
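One way to exercise the multiplier is a small self-checking testbench. The sketch below is only an illustration under stated assumptions: the names multiply_tb, T and Finished are hypothetical; the port P is assumed to be NN (2N) bits wide; and the shift registers in work.components (not shown in these slides) are assumed to load DataA and DataB when LA and LB are asserted while the FSM waits in S1.

```vhdl
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;
USE ieee.numeric_std.all ;

-- Hypothetical testbench; assumes entity work.multiply with a 2N-bit port P.
ENTITY multiply_tb IS
END multiply_tb ;

ARCHITECTURE Test OF multiply_tb IS
    CONSTANT N  : INTEGER := 8 ;
    CONSTANT NN : INTEGER := 16 ;
    CONSTANT T  : TIME := 10 ns ;          -- assumed clock period
    SIGNAL Clock : STD_LOGIC := '0' ;
    SIGNAL Resetn, LA, LB, s : STD_LOGIC := '0' ;
    SIGNAL DataA, DataB : STD_LOGIC_VECTOR(N-1 DOWNTO 0) := (OTHERS => '0') ;
    SIGNAL P : STD_LOGIC_VECTOR(NN-1 DOWNTO 0) ;
    SIGNAL Done : STD_LOGIC ;
    SIGNAL Finished : BOOLEAN := FALSE ;
BEGIN
    -- Free-running clock that stops once the test has finished
    Clock <= NOT Clock AFTER T/2 WHEN NOT Finished ELSE '0' ;

    UUT: ENTITY work.multiply
        GENERIC MAP ( N => N, NN => NN )
        PORT MAP ( Clock => Clock, Resetn => Resetn, LA => LA, LB => LB,
                   s => s, DataA => DataA, DataB => DataB,
                   P => P, Done => Done ) ;

    Stimulus: PROCESS
    BEGIN
        Resetn <= '0' ;                    -- asynchronous reset into S1
        WAIT FOR 2*T ;
        Resetn <= '1' ;
        DataA <= STD_LOGIC_VECTOR(TO_UNSIGNED(13, N)) ;
        DataB <= STD_LOGIC_VECTOR(TO_UNSIGNED(11, N)) ;
        LA <= '1' ; LB <= '1' ;            -- request operand load while s = 0
        WAIT UNTIL RISING_EDGE(Clock) ;
        WAIT UNTIL RISING_EDGE(Clock) ;
        LA <= '0' ; LB <= '0' ;
        s <= '1' ;                         -- start the multiplication
        WAIT UNTIL Done = '1' FOR 100*T ;  -- timeout guards against a hang
        ASSERT Done = '1' REPORT "Done was never asserted" SEVERITY FAILURE ;
        -- P must be read while Done = '1'; returning to S1 clears it
        ASSERT UNSIGNED(P) = 143 REPORT "13 x 11 /= 143" SEVERITY FAILURE ;
        s <= '0' ;
        Finished <= TRUE ;
        WAIT ;
    END PROCESS ;
END Test ;
```

The product is checked before s is released because state S1 asserts EP with Psel = 0, which clears the product register on the next clock edge.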
Array Multiplier
Notation
a   Multiplicand      a_{k-1} a_{k-2} . . . a_1 a_0
x   Multiplier        x_{k-1} x_{k-2} . . . x_1 x_0
p   Product (a × x)   p_{2k-1} p_{2k-2} . . . p_2 p_1 p_0
Unsigned Multiplication
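                                  a4    a3    a2    a1    a0
×                                 x4    x3    x2    x1    x0
------------------------------------------------------------
                                a4x0  a3x0  a2x0  a1x0  a0x0    a·x0·2^0
                          a4x1  a3x1  a2x1  a1x1  a0x1          a·x1·2^1
                    a4x2  a3x2  a2x2  a1x2  a0x2                a·x2·2^2
              a4x3  a3x3  a2x3  a1x3  a0x3                      a·x3·2^3
+       a4x4  a3x4  a2x4  a1x4  a0x4                            a·x4·2^4
------------------------------------------------------------
    p9    p8    p7    p6    p5    p4    p3    p2    p1    p0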
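The rows of partial products can be summarized in one formula. The block below is a short worked restatement in the slide's notation (k-bit operands, here k = 5); the numeric line reuses the 13 × 11 example from the shift-and-add slides as an illustration.

```latex
\begin{align*}
p &= a \cdot x = \sum_{i=0}^{k-1} a \, x_i \, 2^{i} \\
13 \times 11 &= 13 \cdot (2^{0} + 2^{1} + 2^{3}) = 13 + 26 + 104 = 143
\end{align*}
```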
5 x 5 Array Multiplier
Array Multiplier - Basic Cell
[Figure: full adder (FA) with inputs x, y, cin and outputs cout, s.]
Array Multiplier – Modified Basic Cell
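[Figure: full adder (FA) with inputs s_{i-1} and c_i plus the bits a_m and x_n, which are ANDed to form the partial-product bit; outputs c_{i+1} and s_i.]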
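The modified cell can be written as a few concurrent assignments. This is a minimal sketch rather than the course's code: the entity name mult_cell and its port names are hypothetical, and it assumes the cell is a full adder whose third input is the AND of a_m and x_n.

```vhdl
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;

-- Hypothetical entity for one modified array-multiplier cell.
ENTITY mult_cell IS
    PORT ( am, xn : IN  STD_LOGIC ;     -- multiplicand bit, multiplier bit
           s_in   : IN  STD_LOGIC ;     -- incoming sum bit, s(i-1)
           c_in   : IN  STD_LOGIC ;     -- incoming carry, c(i)
           s_out  : OUT STD_LOGIC ;     -- outgoing sum bit, s(i)
           c_out  : OUT STD_LOGIC ) ;   -- outgoing carry, c(i+1)
END mult_cell ;

ARCHITECTURE Dataflow OF mult_cell IS
    SIGNAL pp : STD_LOGIC ;             -- partial-product bit
BEGIN
    pp    <= am AND xn ;
    -- Full-adder equations applied to s_in, pp and c_in
    s_out <= s_in XOR pp XOR c_in ;
    c_out <= (s_in AND pp) OR (s_in AND c_in) OR (pp AND c_in) ;
END Dataflow ;
```

Folding the AND gate into the cell means the array can be built from one repeated component, for example with a FOR ... GENERATE statement.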
5 x 5 Array Multiplier with modified cells
Pipelined 5 x 5 Multiplier
Array Multiplier – Modified Basic Cell
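[Figure: the modified basic cell (FA with inputs s_{i-1}, c_i, a_m, x_n and outputs c_{i+1}, s_i) with flip-flops added for pipelining.]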
Timing parameters
Parameter         Definition                                       Units
delay             time from point to point                         ns
clock period T    time from rising edge to rising edge of clock    ns
clock frequency   1 / clock period                                 MHz
latency           time from input to output                        ns
throughput        number of output bits per time unit              Mbits/s
Latency
[Figure: top-level entity with an 8-bit input and an 8-bit output, clocked at 100 MHz, built from registers (D flip-flops) and combinational logic blocks. Timing diagram: the input carries input(0), input(1), input(2); the output is unknown at first, then carries output(0), output(1).]
• Latency is the time between input(n) and output(n)
• i.e., the time from first input to first output, second input to second output, etc.
• Latency is usually constant for a system (but not always)
• Also called input-to-output latency
• Count the number of rising edges of the clock!
• In this example, there are 2 rising edges from input to output, so the latency is 2 cycles
• Latency is measured in clock cycles and then translated to units of time (nanoseconds)
• In this example, say the clock period is 10 ns; then the latency is 20 ns
Throughput
[Figure: the same top-level entity (8-bit input, 8-bit output) and timing diagram as on the Latency slide, annotated with 1 cycle between output samples.]
• Throughput = (bits per output sample) / (time between consecutive output samples)
• Bits per output sample:
• In this example, 8 bits per output sample
• Time between consecutive output samples: the number of clock cycles from output(n) to output(n+1)
• Can be measured in clock cycles, then translated to time
• In this example, time between consecutive output samples = 1 clock cycle = 10 ns
• Throughput = (8 bits per output sample) / (10 ns) = 0.8 bits/ns = 800 Mbits/s
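The latency and throughput figures for this example follow from the clock frequency in a few steps. The block below restates that arithmetic; the symbol f_clk is introduced here only for the illustration.

```latex
\begin{align*}
T &= \frac{1}{f_{clk}} = \frac{1}{100\ \text{MHz}} = 10\ \text{ns} \\
\text{latency} &= 2\ \text{cycles} \times 10\ \text{ns/cycle} = 20\ \text{ns} \\
\text{throughput} &= \frac{8\ \text{bits}}{1\ \text{cycle} \times 10\ \text{ns/cycle}}
                   = 0.8\ \text{bits/ns} = 800\ \text{Mbits/s}
\end{align*}
```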
Pipelining—Conceptual
• The purpose of pipelining is to reduce the critical path of the circuit by inserting an additional register (called a pipeline register)
• This splits the combinational logic in half
• Now the critical path delay is 5 ns, so the maximum clock frequency is 200 MHz
• This doubles the clock frequency
• Area is increased due to the additional register
• In general, pipelining increases throughput at the cost of increased area/power and a minor increase in latency
[Figure: a pipeline register splits the combinational logic in half, into Combinational Logic A (tLOGICA = 5 ns) and Combinational Logic B (tLOGICB = 5 ns), each placed between D flip-flop registers.]
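The frequency claim can be checked with a short calculation. This is a sketch under stated assumptions: the unpipelined critical path is taken to be tLOGICA + tLOGICB = 10 ns, register setup and clock-to-output delays are ignored, and the 8-bit output width is carried over from the throughput example. The symbols f_max and t_crit are introduced here only for the illustration.

```latex
\begin{align*}
f_{max} &= \frac{1}{t_{crit}} \\
\text{unpipelined:}\quad f_{max} &= \frac{1}{10\ \text{ns}} = 100\ \text{MHz},
  &\text{throughput} &= 8\ \text{bits} \times 100\ \text{MHz} = 800\ \text{Mbits/s} \\
\text{pipelined:}\quad f_{max} &= \frac{1}{5\ \text{ns}} = 200\ \text{MHz},
  &\text{throughput} &= 8\ \text{bits} \times 200\ \text{MHz} = 1600\ \text{Mbits/s}
\end{align*}
```

In a real design the register overhead adds to each stage's delay, which is why the achievable clock frequency ends up somewhat below this ideal figure and the latency increases slightly.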