Arith Adder
yermakov-vadim-ivanovich -
7/27/2019 Arith Adder
1/40
Design Space Exploration for Power-Efficient Mixed-Radix Ling Adders
Chung-Kuan Cheng
Computer Science and Engineering Department
University of California, San Diego
-
Outline
Prefix Adder Problem
  Background & Previous Work
  Extensions: High-radix, Ling
Our Work
  Area/Timing/Power Models
  Mixed-Radix (2,3,4) Adders
  ILP Formulation
Experimental Results
Future Work
-
Prefix Adder Challenges
Increasing impact of physical design and concern of power.
Logical: levels, fanouts, wire tracks
New design scope:
  Area: physical placement, detail routing
  Timing: gate cap, wire cap, gate sizing, buffer insertion, signal slope, input arrival time, output required time
  Power: static power, dynamic power, power gating, activity probability
-
Binary Addition
Input: two n-bit binary numbers $a_{n-1} \dots a_1 a_0$ and $b_{n-1} \dots b_1 b_0$, one-bit carry-in $c_0$
Output: n-bit sum $s_{n-1} \dots s_1 s_0$ and one-bit carry-out $c_n$
Prefix Addition: carry generation & propagation
  Generate: $g_i = a_i b_i$
  Propagate: $p_i = a_i \oplus b_i$
  $c_{i+1} = g_i + p_i c_i$
  $s_i = a_i \oplus b_i \oplus c_i$
-
Prefix Addition Formulation
Pre-processing:
  $g_i = a_i b_i$, $p_i = a_i \oplus b_i$
Prefix computation:
  $G_{[i:k]} = G_{[i:j]} + P_{[i:j]} G_{[j-1:k]}$
  $P_{[i:k]} = P_{[i:j]} P_{[j-1:k]}$
Post-processing:
  $c_{i+1} = G_{[i:0]} + P_{[i:0]} c_0$
  $s_i = p_i \oplus c_i$
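The three stages of the formulation can be sketched in Python. This is only an illustration: the slide does not fix a schedule for the prefix computation, so a Kogge-Stone style schedule is assumed here, and the names `gp_combine` and `prefix_add` are hypothetical.

```python
def gp_combine(left, right):
    """GP cell: (G, P)[i:k] from (G, P)[i:j] and (G, P)[j-1:k]."""
    g_left, p_left = left
    g_right, p_right = right
    return (g_left | (p_left & g_right), p_left & p_right)


def prefix_add(a, b, n, c0=0):
    """Add two n-bit integers; returns the (n+1)-bit result including carry-out."""
    # Pre-processing: g_i = a_i b_i, p_i = a_i xor b_i
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    gp = [(((a >> i) & (b >> i)) & 1, p[i]) for i in range(n)]

    # Prefix computation (Kogge-Stone schedule, an assumption of this sketch):
    # after the step with distance d, gp[i] covers bits [i : max(i - 2d + 1, 0)].
    d = 1
    while d < n:
        gp = [gp_combine(gp[i], gp[i - d]) if i >= d else gp[i]
              for i in range(n)]
        d *= 2

    # Post-processing: c_{i+1} = G[i:0] + P[i:0] c_0, s_i = p_i xor c_i
    carries = [c0] + [g | (pp & c0) for g, pp in gp]
    total = 0
    for i in range(n):
        total |= (p[i] ^ carries[i]) << i
    return total | (carries[n] << n)
```

For example, `prefix_add(200, 100, 8)` returns 300, with the ninth bit being the carry-out.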
-
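Prefix Adder: Prefix Structure Graph
[Figure: 4-bit prefix structure graph, bit positions 4 down to 1, outputs 4:1, 3:1, 2:1, 1]
Pre-processing: gp generator (inputs $a_i$, $b_i$; outputs $g_i$, $p_i$)
Prefix computation: GP cell (inputs GP[i, j] and GP[j-1, k]; output GP[i, k])
Post-processing: sum generator (inputs $p_i$, G[i:0]; output $s_i$)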
-
Previous Works: Classical Prefix Adders
[Figure: 8-bit prefix graph of each adder, bit positions 8 down to 1, outputs 8:1 down to 1]
Adder         Logical levels    Max fanouts   Wire tracks
Brent-Kung    2 log2(n) - 1     2             1
Kogge-Stone   log2(n)           2             n/2
Sklansky      log2(n)           n/2           1
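The logical-level counts of the three classical structures can be checked with a short Python sketch. It tracks only the bit interval that each column covers, numbers bits from 0 rather than 1, and assumes n is a power of two; the function names are hypothetical. Fanouts and wire tracks are not modeled here.

```python
def combine(left, right):
    """Merge interval [i:j] with the adjacent interval [j-1:k] into [i:k]."""
    assert left[1] == right[0] + 1, "intervals must be adjacent"
    return (left[0], right[1])


def kogge_stone(n):
    span = [(i, i) for i in range(n)]
    levels, d = 0, 1
    while d < n:
        # every column i >= d combines with the column d positions to its right
        span = [combine(span[i], span[i - d]) if i >= d else span[i]
                for i in range(n)]
        levels, d = levels + 1, 2 * d
    return span, levels


def sklansky(n):
    span = [(i, i) for i in range(n)]
    levels, d = 0, 1
    while d < n:
        for i in range(n):
            if i & d:  # column lies in the upper half of a 2d-wide block
                src = (i | (d - 1)) - d  # top column of the lower half
                span[i] = combine(span[i], span[src])
        levels, d = levels + 1, 2 * d
    return span, levels


def brent_kung(n):
    span = [(i, i) for i in range(n)]
    levels, d = 0, 1
    while d < n:  # up-sweep: build a binary tree of intervals
        for i in range(2 * d - 1, n, 2 * d):
            span[i] = combine(span[i], span[i - d])
        levels, d = levels + 1, 2 * d
    d //= 4
    while d >= 1:  # down-sweep: fill in the remaining columns
        for i in range(3 * d - 1, n, 2 * d):
            span[i] = combine(span[i], span[i - d])
        levels, d = levels + 1, d // 2
    return span, levels


for name, build in [("Brent-Kung", brent_kung),
                    ("Kogge-Stone", kogge_stone),
                    ("Sklansky", sklansky)]:
    span, levels = build(8)
    print(name, levels, all(s == (i, 0) for i, s in enumerate(span)))
```

The `combine` helper raises an assertion error if a structure ever tries to merge two intervals that are not adjacent, so a successful run also confirms that each graph is a valid prefix network.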
-
High-Radix Adders
Each cell has more than two fan-ins
Pros: fewer logic levels
  6 levels (radix-2) vs. 3 levels (radix-4) for 64-bit addition
Cons: larger delay and power in each cell
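The level counts quoted for 64-bit addition follow from the smallest L with radix^L >= n. A minimal Python check, with the hypothetical helper name `prefix_levels`:

```python
def prefix_levels(n_bits, radix):
    """Smallest number of levels L such that radix**L >= n_bits."""
    levels, covered = 0, 1
    while covered < n_bits:
        covered *= radix  # one more level multiplies the reachable span by the radix
        levels += 1
    return levels


print(prefix_levels(64, 2))  # 6
print(prefix_levels(64, 4))  # 3
```

This counts only the minimum depth of a uniform-radix prefix tree; it says nothing about the per-cell delay and power penalty listed under the cons.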
-
Radix-3 Sklansky & Kogge-Stone Adder
David Harris, "Logical Effort of Higher Valency Adders"
-
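Ling Adders
Prefix:
  Pre-processing:
    $g_i = a_i b_i$, $p_i = a_i \oplus b_i$
  Prefix computation:
    $G_{[i:k]} = G_{[i:j]} + P_{[i:j]} G_{[j-1:k]}$
    $P_{[i:k]} = P_{[i:j]} P_{[j-1:k]}$
  Post-processing:
    $c_i = G_{[i-1:0]} + P_{[i-1:0]} c_0$
    $s_i = p_i \oplus c_i$
Ling:
  Pre-processing:
    $g_i = a_i b_i$, $p_i = a_i + b_i$, $t_i = a_i \oplus b_i$
    $G^*_{[i:i-1]} = g_i + g_{i-1}$, $P^*_{[i:i-1]} = p_i p_{i-1}$
  Prefix computation:
    $G^*_{[i:k]} = G^*_{[i:j]} + P^*_{[i-1:j-1]} G^*_{[j-1:k]}$
    $P^*_{[i:k]} = P^*_{[i:j]} P^*_{[j-1:k]}$
  Post-processing:
    $c_i = p_{i-1} G^*_{[i-1:0]}$
    $s_i = \overline{G^*_{[i-1:0]}}\, t_i + G^*_{[i-1:0]} (t_i \oplus p_{i-1})$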
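The Ling post-processing equations can be sanity-checked against ordinary integer addition. The sketch below is an illustration under stated assumptions: carry-in is taken as 0, and the pseudo-carries are computed serially through H_i = g_i + p_{i-1} H_{i-1} instead of through a prefix tree. The name `ling_add` and the list `h` (standing in for G*[i:0]) are hypothetical.

```python
def ling_add(a, b, n):
    """Add two n-bit integers using Ling pseudo-carries (carry-in assumed 0)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]    # g_i = a_i b_i
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]  # p_i = a_i + b_i
    t = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # t_i = a_i xor b_i

    # Pseudo-carry h[i] stands in for G*[i:0]; computed serially for clarity.
    h = []
    for i in range(n):
        carry_in = p[i - 1] & h[i - 1] if i > 0 else 0  # c_i = p_{i-1} G*[i-1:0]
        h.append(g[i] | carry_in)

    total = 0
    for i in range(n):
        if i == 0:
            s = t[0]  # no carry into bit 0
        else:
            # s_i = not(G*) t_i + G* (t_i xor p_{i-1}): a 2-to-1 selection on G*
            s = (t[i] ^ p[i - 1]) if h[i - 1] else t[i]
        total |= s << i
    carry_out = p[n - 1] & h[n - 1]
    return total | (carry_out << n)
```

The sum bit is selected by the pseudo-carry instead of being XORed with the true carry, which is what removes one gate from the carry path in a Ling adder.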
-
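An 8-bit Ling Adder
[Figure: 8-bit Ling adder prefix graph, bit positions 8 down to 1, outputs H8 down to H1. Insets: sum cell with a 0/1 multiplexer, labeled $s_i$, $a_i$, $b_i$, $p_{i-1}$, $H_i$; first-level cell producing $G^*_i$ from $g_i$, $g_{i-1}$ and $P^*_{i-1}$ from $p_{i-1}$, $p_{i-2}$]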
-
Area Model
Distinguish physical placement from logical structure, but keep the bit-slice structure.
[Figure: logical view (bit position vs. logical level) and physical view (bit position vs. physical level) of an 8-bit adder, related by compact placement]
-
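Timing Model
Cell delay calculation: $d = f + p$
  $f$: effort delay, $p$: intrinsic delay
Effort delay: $f = g h$
  $g$: logical effort
  $h$: electrical effort $= C_{out}/C_{in} = (\text{fanouts} + \text{wirelength}) / \text{size}$
Logical effort and intrinsic delay are intrinsic properties of the cell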
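As a worked instance of the delay model, the following Python sketch evaluates d = g·h + p for one cell. The function name and the numeric values are hypothetical, and fanouts and wirelength are assumed to be expressed in the same capacitance units as the cell size.

```python
def cell_delay(logical_effort, intrinsic_delay, fanouts, wirelength, size):
    """d = f + p, with f = g * h and h = (fanouts + wirelength) / size."""
    electrical_effort = (fanouts + wirelength) / size
    effort_delay = logical_effort * electrical_effort
    return effort_delay + intrinsic_delay


# Hypothetical cell: g = 1.5, p = 2.0, driving 2 fanouts over 1 unit of wire.
print(cell_delay(1.5, 2.0, fanouts=2, wirelength=1, size=1))  # 6.5
print(cell_delay(1.5, 2.0, fanouts=2, wirelength=1, size=2))  # 4.25
```

Doubling the cell size halves the electrical effort and therefore the effort delay, while the intrinsic delay is unchanged.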
-
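Power Model
Total power consumption: dynamic power + static power
Static power: leakage current of the device
  $P_{sta} = (\text{leakage per cell}) \times \#\text{cells}$
Dynamic power: current switching the capacitance
  $P_{dyn} = (\text{switching probability}) \times C_{load}$
  The switching probability is taken proportional to $j$, the logical level of the cell*
$P_{total} = P_{dyn} + P_{sta}$: a $j \cdot C_{load}$ term plus a $\#\text{cells}$ term
* S. Vanichayobon et al., "Power-speed Trade-off in Parallel Prefix Circuits"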
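A minimal Python sketch of this power model follows. The coefficient symbols did not survive in the transcript, so `leakage_per_cell` and `activity_scale` are hypothetical parameters with made-up default values; only the structure (dynamic term growing with logical level times load, static term growing with cell count) is taken from the slide.

```python
def total_power(cells, leakage_per_cell=0.5, activity_scale=1.0):
    """cells: iterable of (logical level j, load capacitance C_load) pairs."""
    cells = list(cells)
    # Dynamic power: switching probability (assumed activity_scale * j) times load
    dynamic = sum(activity_scale * j * c_load for j, c_load in cells)
    # Static power: a fixed leakage contribution per cell
    static = leakage_per_cell * len(cells)
    return dynamic + static


# Three hypothetical cells at levels 1, 2, 2 with loads 2.0, 1.0, 3.0
print(total_power([(1, 2.0), (2, 1.0), (2, 3.0)]))  # 11.5
```

Under this model, moving a heavily loaded cell to a deeper logical level raises dynamic power even if the cell count is unchanged.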
-
ILP Formulation Overview
Structure variables:
GP cells
Connections (wires)
Physical positions
Capacitance variables:
Gate cap
Vertical wire cap
Horizontal wire cap
Timing variables:
Input arrival time
Output arrival time
Power
Objective
ILP -> ILOG CPLEX -> Optimal Solution
-
Integer Linear Programming (ILP)
ILP: Linear Programming with integer variables.
Difficulties and techniques:
  Constraints are not linear
    Linearize using pseudo-linear constraints
  Search space too large
    Reduce search space
  Search is slow
    Add redundant constraints to speed up
-
ILP: Integer Linear Programming
Linear Programming: linear constraints, linear objective, fractional variables.
Integer Linear Programming: Linear Programming with integer variables.
[Figure: feasible region bounded by the constraints, with the LP optimal and ILP optimal points marked]
-
ILP Pseudo-Linear Constraint
Problem:
  Minimize: $x_3$
  Subject to: $x_1 \ge 300$
              $x_2 \ge 500$
              $x_3 = \min(x_1, x_2)$
ILP formulation:
  Minimize: $x_3$
  Subject to: $x_1 \ge 300$
              $x_2 \ge 500$
              $x_3 \le x_1$
              $x_3 \le x_2$
              $x_3 \ge x_1 - 1000\, b_1$         (1)
              $x_3 \ge x_2 - 1000\,(1 - b_1)$    (2)
              $b_1$ is binary
LP objective: 0
ILP objective: 300
A constraint is called pseudo-linear if it is not effective until some integer variables are fixed.
Pseudo-linear constraints mostly arise from IF/ELSE scenarios:
  binary decision variables are introduced to indicate true or false.
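The gap between the LP and ILP objectives can be reproduced without a solver. The Python sketch below scans a coarse grid of candidate (x1, x2) values for a fixed b1; it assumes x3 >= 0 (the slide does not state this bound, but the LP objective of 0 suggests it) and uses the hypothetical helper name `min_x3`.

```python
BIG_M = 1000


def min_x3(b1, x1_values, x2_values):
    """Smallest feasible x3 for a fixed b1, scanning candidate (x1, x2) pairs."""
    best = None
    for x1 in x1_values:
        for x2 in x2_values:
            # Lower bounds on x3: x3 >= 0 (assumed), constraints (1) and (2)
            lower = max(0, x1 - BIG_M * b1, x2 - BIG_M * (1 - b1))
            # Upper bounds on x3: x3 <= x1 and x3 <= x2
            if lower <= min(x1, x2) and (best is None or lower < best):
                best = lower
    return best


x1_values = range(300, 1001, 100)  # x1 >= 300
x2_values = range(500, 1001, 100)  # x2 >= 500

# ILP: b1 must be 0 or 1, so one of (1), (2) is always active.
ilp_objective = min(min_x3(b1, x1_values, x2_values) for b1 in (0, 1))
# LP relaxation: a fractional b1 such as 0.5 disables both constraints.
lp_objective = min_x3(0.5, x1_values, x2_values)
print(ilp_objective, lp_objective)
```

With b1 = 0, constraint (1) forces x3 = x1 and the best value is 300; with b1 = 1, constraint (2) forces x3 = x2 and the best value is 500. At b1 = 0.5 neither constraint binds, so x3 drops to 0, which is why the constraints only become effective once the integer variable is fixed.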
-
Interval Adjacency Constraint
[Figure: 8-bit prefix graph, bit positions 8 down to 1, outputs H8 down to H1; cells are labeled (column id, logic level)]
(7,3): Interval [7,1]
(7,2): Interval [7,4]
(3,2): Interval [3,1]
Intervals [7,4] and [3,1] must be adjacent, i.e. 4 = 3 + 1
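The adjacency rule can be stated as a small Python check. The helper name `merge_intervals` is hypothetical; intervals are written as (left bound, right bound) pairs as on the slide.

```python
def merge_intervals(left, right):
    """Combine [i, j] with [j-1, k] into [i, k]; reject non-adjacent fan-ins."""
    i, j = left
    right_top, k = right
    if j != right_top + 1:
        raise ValueError(f"{left} and {right} are not adjacent")
    return (i, k)


# Cell (7,3) merges the intervals of cells (7,2) and (3,2).
print(merge_intervals((7, 4), (3, 1)))  # (7, 1)
```

A pair such as [7,5] and [3,1] is rejected, because bit 4 would be missing from the merged interval.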
-
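Linearization for Interval Adjacency Constraint
[Figure: cell $(i, j)$ with left fan-in $(i, h)$ over wire $wl$ and candidate right fan-ins $(k_1, l_1)$, $(k_2, l_2)$ over wires $wr_1$, $wr_2$; the cells carry intervals $[y^L_{(i,j)}, y^R_{(i,j)}]$, $[y^L_{(i,h)}, y^R_{(i,h)}]$, $[y^L_{(k_1,l_1)}, y^R_{(k_1,l_1)}]$, $[y^L_{(k_2,l_2)}, y^R_{(k_2,l_2)}]$]
Pseudo-linear:
  $y^R_{(i,h)} = y^L_{(k,l)} + 1$ if $wl(i,j,h) = 1$ and $wr(i,j,k,l) = 1$
Left interval bound equal to column index:
  $y^L_{(i,j)} = i$
  $y^R_{(i,h)} = k + 1$ if $wl(i,j,h) = 1$ and $wr(i,j,k,l) = 1$
Linearize:
  $y^R_{(i,h)} \ge k + 1 - n\,[(1 - wr(i,j,k,l)) + (1 - wl(i,j,h))]$
  $y^R_{(i,h)} \le k + 1 + n\,[(1 - wr(i,j,k,l)) + (1 - wl(i,j,h))]$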
-
Search Space Reduction
Ling's adder: separate odd and even bits
Double the bit-width we are able to search
[Figure: 8-bit Ling adder prefix graph, bit positions 8 down to 1, outputs H8 down to H1]
-
Redundant Constraints
Cell $(i,j)$ is known to have logic level $j$ before wire connection
  Assume load is MinLoad (fanout = 1 with minimum wire length):
  $P_{(i,j)} \ge (\text{static power per cell}) + j \cdot \mathit{MinLoad}$
Cell $(i,j)$ has a path of length $j-1$
  Assume each cell along the path has MinLoad:
  $T_{(i,j)} \ge j \cdot (PD + LE \cdot \mathit{MinLoad})$
-
Experiments: 16-bit Uniform Timing
-
Experiments: 16-bit Uniform Timing
-
Min-Power Radix-2 Adder (delay = 22, power = 45.5 FO4)
[Figure: 16-bit prefix graph, bit positions 1 to 16]
-
Min-Power Radix-2&4 Adder (delay = 18, power = 29.75 FO4)
[Figure: 16-bit prefix graph, bit positions 1 to 16. Legend: Radix-2 Cell, Radix-4 Cell]
-
Min-Power Mixed-Radix Adder (delay = 20, power = 28.0 FO4)
[Figure: 16-bit prefix graph, bit positions 1 to 16. Legend: Radix-2 Cell, Radix-3 Cell, Radix-4 Cell]
-
Experiments: 16-bit Non-uniform Time (Mixed Radix)
ILP is able to handle non-uniform timings
Ling adders are most superior under increasing arrival time: faster carries
-
Increasing Arrival Time (delay = 35.5, power = 27.0 FO4)
-
Decreasing Arrival Time (delay = 34.5, power = 30.5 FO4)
-
Convex Arrival Time (delay = 35.9, power = 32.4 FO4)
-
Increasing Required Time (delay = 34.5, power = 30.5 FO4)
-
Decreasing Required Time (delay = 36.5, power = 32.5 FO4)
-
Convex Required Time (delay = 36.5, power = 32.5 FO4)
-
Experiments: 64-bit Hierarchical Structure (Mixed-Radix)
Handle high bit-width applications
16x4 and 8x8
[Figure: two-level hierarchy. Level 1: four ILP blocks over inputs a1 b1 ... a16 b16, a17 b17 ... a32 b32, a33 b33 ... a48 b48, a49 b49 ... a64 b64, producing GP*[1:1] and GP*[16:2], GP*[17:17] and GP*[32:18], GP*[33:33] and GP*[48:34], GP*[49:49] and GP*[64:50]. Level 2: one ILP block producing H1 ... H16, H17 ... H32, H33 ... H48, H49 ... H64]
-
Experiments: 64-bit Hierarchical Structure
TSL: a 64-bit high-radix three-stage Ling adder
V. Oklobdzija and B. Zeydel, "Energy-Delay Characteristics of CMOS Adders," in High-Performance Energy-Efficient Microprocessor Design, pp. 147-170, 2006
-
ASIC Implementation - Results
64-bit hierarchical design (mixed-radix) by ILP vs. fast carry look-ahead adder by Synopsys Design Compiler
TSMC 90nm standard cell library was used
-
Future Work
ILP formulation improvement
  Expected to handle 32- or 64-bit applications without a hierarchical scheme
Optimizing other computer arithmetic modules
  Comparator, Multiplier
-
Q & A
Thank You!