Variability-Driven Formulation for Simultaneous Gate Sizing and Post-Silicon Tunability Allocation

Vishal Khandelwal and Ankur Srivastava

Department of Electrical and Computer Engineering

University of Maryland, College Park

http://www.ece.umd.edu/~vishalk


Introduction

Process variations cause a significant spread in design performance in sub-90nm technologies

They impact yield and reliability

It is necessary to explicitly consider the impact of process variations on design parameters

Several statistical analysis and optimization techniques have been proposed to improve timing/power yields

Handling Process Variations

Process variations can be handled at design time or after fabrication:

Design-Time Optimization

  Statistical Gate Sizing [Davoodi, DAC’06] [Sapatnekar, DAC’05] [Zhou, ICCAD’05]

  Statistical Buffer Insertion [He, ISPD’06] [Davoodi, ICCD’05] [Wong, ICCAD’05] [Khandelwal, ICCAD’03]

Post-Fabrication Tunability

  Post-Silicon Tunable Clock-Tree Buffers [Chen, ICCAD’05] [Mahoney, ISSC’05] [Takahashi, 2003] [Tam, JSSC’00]

  Adaptive Body-Biasing [Kim, ISLPED’03] [Orshansky, ICCAD’06]

Traditional Gate Sizing

Gate size: $s_i$

Minimize area or power, subject to:

  meeting a delay constraint at the output

  size constraints

[Fishburn, Dunlop 1985] [Sapatnekar, 1993]

$$
\begin{aligned}
\text{Minimize} \quad & \text{Area, Power, \ldots} \\
\text{subject to} \quad & t_j + d_i(s) \le t_i && \forall\, i,\ \forall\, j \in fanin(i) \\
& t_n \le T_{cons} \\
& t_0 = 0 \\
& s_{min} \le s_i \le s_{max} && \forall\, i
\end{aligned}
$$

(Figure: circuit between nodes $0$ and $n$, showing a gate $i$ with delay $d_i$, arrival times $t_i$ and $t_j$, and the output constraint $T_{cons}$.)

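Traditional Gate Sizing

Posynomial gate delay expression [Fishburn, Dunlop 1985] [Sapatnekar, 1993]:

$$
d_i(s) = a_{i0} + \frac{\sum_{j \in FO(i)} a_{ij}\, s_j}{s_i}
$$

Original formulation:

$$
\begin{aligned}
\text{Minimize} \quad & \text{Area, Power, \ldots} \\
\text{subject to} \quad & t_j + d_i(s) \le t_i && \forall\, i,\ \forall\, j \in fanin(i) \\
& t_n \le T_{cons} \\
& t_0 = 0 \\
& s_{min} \le s_i \le s_{max} && \forall\, i
\end{aligned}
$$

Substituting $s_i = e^{x_i}$ gives the delay

$$
d_i(x) = a_{i0} + \sum_{j \in FO(i)} a_{ij}\, e^{x_j - x_i}
$$

and the formulation

$$
\begin{aligned}
\text{Minimize} \quad & \text{Area, Power, \ldots} \\
\text{subject to} \quad & t_j + d_i(x) \le t_i && \forall\, i,\ \forall\, j \in fanin(i) \\
& t_n \le T_{cons} \\
& t_0 = 0 \\
& s_{min} \le e^{x_i} \le s_{max} && \forall\, i
\end{aligned}
$$

Convex Formulation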
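The change of variables above can be checked numerically. The following is a minimal Python sketch, not the authors' code: the coefficient values, the three-gate fanout, and the function names are hypothetical and chosen only for illustration. It evaluates the posynomial delay in $s$ and the exponential form in $x$, and spot-checks midpoint convexity of the latter.

```python
import math

def delay_posynomial(i, s, a0, a, fanout):
    """d_i(s) = a_i0 + (sum over j in FO(i) of a_ij * s_j) / s_i."""
    load = sum(a[(i, j)] * s[j] for j in fanout[i])
    return a0[i] + load / s[i]

def delay_exponential(i, x, a0, a, fanout):
    """Same delay after substituting s_i = exp(x_i): a sum of exponentials of linear terms."""
    return a0[i] + sum(a[(i, j)] * math.exp(x[j] - x[i]) for j in fanout[i])

# Hypothetical coefficients: gate 0 drives gates 1 and 2.
a0 = {0: 1.0}
a = {(0, 1): 0.5, (0, 2): 0.25}
fanout = {0: [1, 2]}

s = {0: 2.0, 1: 1.0, 2: 4.0}
x = {g: math.log(size) for g, size in s.items()}

d_s = delay_posynomial(0, s, a0, a, fanout)   # 1.0 + (0.5*1 + 0.25*4) / 2
d_x = delay_exponential(0, x, a0, a, fanout)  # same value, expressed in x

def midpoint_is_convex(xa, xb):
    """Check d((xa + xb) / 2) <= (d(xa) + d(xb)) / 2 for gate 0."""
    mid = {g: 0.5 * (xa[g] + xb[g]) for g in xa}
    lhs = delay_exponential(0, mid, a0, a, fanout)
    rhs = 0.5 * (delay_exponential(0, xa, a0, a, fanout)
                 + delay_exponential(0, xb, a0, a, fanout))
    return lhs <= rhs + 1e-12

x_other = {0: -1.0, 1: 2.0, 2: 0.5}
convex_here = midpoint_is_convex(x, x_other)
```

The midpoint check is only a spot test on one pair of points; the convexity itself follows because each term $e^{x_j - x_i}$ is the exponential of a linear function of $x$.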

Effects of Process Variations

$\xi = \{L_{eff}, T_{ox}, \ldots\}$: set of random variables with arbitrary distributions

(Figure: transistor cross-section showing $T_{ox}$ and $L_{eff}$ between the n+ regions.)

Delay of each gate becomes a random variable:

$$
d_i(s, \xi) = a_{i0}(\xi) + \frac{\sum_{j \in Fanout(i)} a_{ij}(\xi)\, s_j}{s_i}
$$

Statistical Gate Sizing [Davoodi, DAC’06] [Sapatnekar, DAC’05] [Zhou, ICCAD’05]

Post-Silicon Tunable (PST) Clock Tree Buffers

(Figure: clock tree with tunable buffers B1 to B7 driving flip-flops FF1 to FF8.)

Tunable clock buffers can introduce extra slack into critical paths after fabrication

Design overhead: area, clock-tree power

[Chen, ICCAD’05] [Mahoney, ISSC’05] [Takahashi, 2003] [Tam, JSSC’00]

Post-Silicon Tunable Clock Tree Buffers

Let $D_{ij}$ be the delay of the longest path between flip-flops $i$ and $j$:

$$
T_i + D_{ij} \le T_{clk} + T_j - T_{set}
$$

Consider flip-flops 2 and 7: tune buffers to change the clock skew

$$
T_2 + (T_1^{Buf} + T_2^{Buf} + T_4^{Buf}) + D_{27} \le T_{clk} + T_7 + (T_1^{Buf} + T_3^{Buf} + T_7^{Buf}) - T_{set}
$$

$$
0 \le T_i^{Buf} \le Max_i^{Buf}
$$

(Figure: clock tree with tunable buffers B1 to B7 driving flip-flops FF1 to FF8.)
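As an illustration of the setup constraint for flip-flops 2 and 7, here is a small Python sketch. The clock paths (B1, B2, B4 to FF2 and B1, B3, B7 to FF7) follow the buffer indices in the constraint; all numeric values and the function name `setup_slack` are hypothetical.

```python
# Buffers on the clock route from the root to each flip-flop (taken from the
# buffer indices appearing in the FF2 -> FF7 constraint).
CLOCK_PATH = {2: ["B1", "B2", "B4"], 7: ["B1", "B3", "B7"]}

def setup_slack(i, j, T, D, buf, T_clk, T_set):
    """Slack of T_i + sum(buf on path i) + D_ij <= T_clk + T_j + sum(buf on path j) - T_set.

    A non-negative result means the setup constraint is met.
    """
    launch = T[i] + sum(buf[b] for b in CLOCK_PATH[i])
    capture = T[j] + sum(buf[b] for b in CLOCK_PATH[j])
    return (T_clk + capture - T_set) - (launch + D[(i, j)])

# Hypothetical numbers: the path is 0.75 time units too slow before tuning.
T = {2: 0.0, 7: 0.0}
D = {(2, 7): 10.5}
untuned = {b: 0.0 for b in ["B1", "B2", "B3", "B4", "B7"]}

slack_before = setup_slack(2, 7, T, D, untuned, T_clk=10.0, T_set=0.25)

# Delaying the capture clock through B7 adds slack to the FF2 -> FF7 path.
tuned = dict(untuned, B7=1.0)
slack_after = setup_slack(2, 7, T, D, tuned, T_clk=10.0, T_set=0.25)

# The shared buffer B1 appears on both sides, so tuning it changes nothing here.
shared = dict(untuned, B1=5.0)
slack_shared = setup_slack(2, 7, T, D, shared, T_clk=10.0, T_set=0.25)
```

Only the buffers that are not shared between the launch and capture clock paths affect the skew of a given flip-flop pair, which is why the choice of tuning ranges interacts across pairs.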

Optimization Objective: Tunability Cost

Metric to capture the overhead due to PST buffers in the design:

  Silicon area

  Clock-tree power

$$
TunabilityCost = \sum_{i \in \text{PST Buffers}} Max_i^{Buf}
$$

Optimization Objective: Binning Yield Loss

$$
\text{Binning Yield Loss} = \int_{T_{cons}}^{\infty} Loss(t)\, f_T(t)\, dt
$$

[V. Zolotov, DAC’04] [D. Blaauw, GLSVLSI’05]

Binning Yield Loss (BYL) with a convex loss function $Q(\cdot)$:

$$
BYL = \int_{T_{cons}}^{\infty} Q(t - T_{cons})\, f_T(t)\, dt
$$

(Figure: delay distribution $f_T(t)$ and loss plotted against delay $t$, with the loss starting at $T_{cons}$.)
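A sample-average estimate of the binning yield loss can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the quadratic loss `q_loss` is one hypothetical choice of a convex $Q$, and the Gaussian delay samples are used only to have some data (the formulation itself does not assume a distribution).

```python
import random

def q_loss(overshoot):
    """Hypothetical convex loss: quadratic in the amount by which delay exceeds T_cons."""
    return overshoot ** 2

def estimate_byl(delay_samples, t_cons, loss=q_loss):
    """Sample average of Q(t - T_cons) for t > T_cons, and zero otherwise."""
    total = sum(loss(t - t_cons) for t in delay_samples if t > t_cons)
    return total / len(delay_samples)

def timing_yield_loss(delay_samples, t_cons):
    """Fraction of samples that miss the timing constraint, for comparison."""
    return sum(1 for t in delay_samples if t > t_cons) / len(delay_samples)

# Tiny hand-checkable example: only 12 and 13 exceed T_cons = 11.
small = [9.0, 10.0, 12.0, 13.0]
byl_small = estimate_byl(small, t_cons=11.0)      # (1 + 4) / 4
tyl_small = timing_yield_loss(small, t_cons=11.0)

# Larger illustrative sample (the Gaussian shape is an assumption of this sketch).
rng = random.Random(0)
samples = [rng.gauss(10.0, 1.0) for _ in range(10000)]
byl_tight = estimate_byl(samples, t_cons=10.0)
byl_loose = estimate_byl(samples, t_cons=11.0)
```

Unlike the plain timing yield loss, this metric weights each failing sample by how far it misses the constraint, so a tighter $T_{cons}$ can only increase it.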

Problem Statement

Given a sequential design with a synthesized PST clock-tree (known buffer locations), perform simultaneous:

  Statistical gate sizing

  PST buffer tuning range determination

such that Binning Yield Loss and Tunability Cost are minimized

(Figure: clock tree with tunable buffers B1 to B7 driving flip-flops FF1 to FF8, next to a circuit with gate $i$, delay $d_i$, nodes $0$ and $n$, and the constraint $T_{cons}$.)

Two-Stage Formulation

Gate size: $x$, tuning buffer range: $r$

1. Deterministic constraints: meeting the timing requirement assuming no variations, with parameters at $\xi_0 = \{l_{eff}^0, t_{ox}^0, \ldots\}$

2. Capturing variability in the objective

First stage:

$$
\begin{aligned}
\text{Minimize} \quad & BYL(x, r) + TunabilityCost(r) + GateSizes \\
\text{subject to} \quad & T_i + D_{ij} \le T_{clk} + T_j - T_{set} && \forall\, \text{FlipFlops } (i, j) \\
& t_p + d_q(x, \xi_0) \le t_q && \forall\, p \in fanin(q),\ \forall\, \text{FlipFlops } (i, j) \\
& t_q \le D_{ij} && \forall\, q \in fanin(\text{FlipFlop } j) \\
& x_{min} \le x \le x_{max} \\
& 0 \le r \le Max^{Buf}
\end{aligned}
$$

Second Stage Formulation

$$
V(x, r, \xi) =
\begin{cases}
Q(D_{ij}(x, r, \xi) - T_{cons}) & D_{ij} \ge T_{cons} \\
0 & \text{otherwise}
\end{cases}
$$

$$
BYL(x, r) = \int_{T_{cons}}^{\infty} Q(D_{ij}(x, r, \xi) - T_{cons})\, f_T(t)\, dt = \int v\, f_V(v)\, dv = E[V(x, r, \xi)]
$$

(Figure: loss $Q$ beyond $T_{cons}$ plotted against the distributions $f_T(t)$ and $f_V(v)$.)

Given a solution to the first stage problem and a variability sample, $(x_0, r_0, \xi)$:

  No statistical timing analysis scheme exists to estimate the timing distribution of a circuit given gate sizes and tuning buffer ranges

  Each sample of variability requires a different amount of tuning for maximum timing yield

Second stage:

$$
\begin{aligned}
v(x_0, r_0, \xi) = \text{Minimize} \quad & \sum_{FF(i,j)} Q(T_{ij}^{viol}) \\
\text{subject to} \quad & T_i + \sum_{k \in C_i} T_k^{Buf} + D_{ij}(x_0, \xi) \le T_{clk} + T_j + \sum_{k \in C_j} T_k^{Buf} - T_{set} + T_{ij}^{viol} && \forall\, \text{FlipFlop } (i, j) \\
& t_p + d_q(x_0, \xi) \le t_q && \forall\, p \in fanin(q),\ \forall\, \text{FlipFlop } (i, j) \\
& t_q \le D_{ij}(x_0, \xi) && \forall\, q \in fanin(\text{FlipFlop } j) \\
& T_{ij}^{viol} \ge 0 && \forall\, \text{FlipFlop } (i, j) \\
& 0 \le T^{Buf} \le r_0 && \forall\, \text{PST Buffer}
\end{aligned}
$$
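To make the per-sample tuning idea concrete, the sketch below solves a deliberately simplified second stage in Python: a single flip-flop pair considered in isolation, where the best tuning has a closed form (launch-only buffers at zero delay, capture-only buffers at their full range). In a real design the pairs share buffers and the problem must be solved jointly, so this is an assumption of the sketch, as are the quadratic loss and all numeric values.

```python
def q_loss(violation):
    """Hypothetical convex loss on the remaining timing violation."""
    return violation ** 2

def min_violation(T_i, T_j, D_ij, T_clk, T_set, path_i, path_j, r):
    """Smallest T_viol >= 0 for ONE flip-flop pair, given tuning ranges r.

    Buffers shared by both clock paths cancel out of the constraint; buffers
    only on the capture path are set to their maximum, buffers only on the
    launch path are set to zero.
    """
    capture_only = [k for k in path_j if k not in path_i]
    deficit = T_i + D_ij + T_set - T_clk - T_j
    return max(0.0, deficit - sum(r[k] for k in capture_only))

def second_stage_value(delay_samples, r, **timing):
    """Sample average of Q(minimum violation) over variability samples of D_ij."""
    losses = [q_loss(min_violation(D_ij=d, r=r, **timing)) for d in delay_samples]
    return sum(losses) / len(losses)

# Hypothetical data for the FF2 -> FF7 pair.
timing = dict(T_i=0.0, T_j=0.0, T_clk=10.0, T_set=0.25,
              path_i=["B1", "B2", "B4"], path_j=["B1", "B3", "B7"])
r = {"B1": 1.0, "B2": 0.5, "B3": 0.25, "B4": 0.5, "B7": 0.25}

viol_slow = min_violation(D_ij=10.5, r=r, **timing)      # deficit 0.75, tuning 0.5
byl_estimate = second_stage_value([9.0, 10.5], r, **timing)

# A wider range on B7 removes the violation for the slow sample.
r_wide = dict(r, B7=1.0)
viol_wide = min_violation(D_ij=10.5, r=r_wide, **timing)
```

Each delay sample gets its own tuning decision, which is why the loss is evaluated sample by sample rather than from a single timing distribution.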

Convex Problem

First stage constraints are convex

First stage objective is convex if $BYL(x, r)$ is convex

$$
BYL(x, r) = E[V(x, r, \xi)] = \int v\, f_V(v)\, dv = \int V(x, r, \xi)\, f(\xi)\, d\xi
$$

Need to show that each sample $V(x, r, \xi)$ is convex

From the second stage formulation one can show that $V(x, r, \xi_0)$ is convex for a fixed sample $\xi_0$

THEOREM: The proposed two-stage stochastic programming formulation is convex

PROOF: Detailed proof omitted for brevity
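The step from per-sample convexity to convexity of $BYL$ is the standard fact that a non-negative weighted integral of convex functions is convex. A short sketch of that step follows, writing the variability sample as $\xi$ with density $f(\xi) \ge 0$ and $u = (x, r)$; this is a generic argument, not the authors' omitted proof.

```latex
% Assume V(u, \xi) is convex in u = (x, r) for every fixed sample \xi.
% For any u_1, u_2 and \theta \in [0, 1]:
\begin{aligned}
BYL(\theta u_1 + (1 - \theta) u_2)
  &= \int V(\theta u_1 + (1 - \theta) u_2,\ \xi)\, f(\xi)\, d\xi \\
  &\le \int \bigl[\theta\, V(u_1, \xi) + (1 - \theta)\, V(u_2, \xi)\bigr]\, f(\xi)\, d\xi \\
  &= \theta\, BYL(u_1) + (1 - \theta)\, BYL(u_2)
\end{aligned}
```

The inequality holds pointwise in $\xi$ and is preserved by integration because $f(\xi) \ge 0$.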

Kelley’s Cutting Plane Algorithm

Iteratively solve the first and second stage formulations

Given a solution to the first stage formulation, we use the method of finite differences to generate a lower bound to BYL from the second stage formulation:

$$
BYL(x, r) \ge \alpha_k + \langle \beta_k, (x, r) \rangle
$$

with offset $\alpha_k$ and slope $\beta_k$ obtained at iteration $k$

Add this constraint to the first stage formulation at each iteration
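The iteration can be sketched on a toy problem. The Python fragment below applies a cutting-plane loop to a one-dimensional convex function, using central finite differences for the slope of each cut. It is a minimal illustration only: the objective `toy`, the interval, and all function names are hypothetical, and the real first stage is a multi-variable convex program handled by a solver.

```python
def finite_difference_slope(f, x, h=1e-6):
    """Central finite-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def minimize_model(cuts, lo, hi):
    """Minimize max_k (a_k + b_k * x) on [lo, hi].

    The model is piecewise linear and convex, so a minimizer lies at an
    interval endpoint or where two cuts intersect.
    """
    candidates = [lo, hi]
    for p in range(len(cuts)):
        for q in range(p + 1, len(cuts)):
            (a1, b1), (a2, b2) = cuts[p], cuts[q]
            if b1 != b2:
                x = (a2 - a1) / (b1 - b2)
                if lo <= x <= hi:
                    candidates.append(x)

    def model(x):
        return max(a + b * x for a, b in cuts)

    best = min(candidates, key=model)
    return best, model(best)

def kelley(f, lo, hi, tol=1e-6, max_iter=100):
    """Cutting-plane loop: evaluate f, add a linear lower bound, re-minimize the model."""
    x = lo
    cuts = []
    best_x, best_f = x, f(x)
    for _ in range(max_iter):
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
        g = finite_difference_slope(f, x)
        cuts.append((fx - g * x, g))        # cut: f(y) >= fx + g * (y - x)
        x, lower = minimize_model(cuts, lo, hi)
        if best_f - lower <= tol:           # model lower bound meets best value
            break
    return best_x, best_f

def toy(x):
    """Hypothetical convex stand-in for the expensive second-stage objective."""
    return (x - 1.0) ** 2

x_star, f_star = kelley(toy, lo=-2.0, hi=4.0)
```

Because the slopes come from finite differences, each cut is only approximately a valid lower bound; the step size `h` trades truncation error against rounding error.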

Shortest-Path Constraints

Inherently non-convex in nature:

$$
T_i + D_{ij}^{short} \ge T_j + T_{hold} \quad \forall\, \text{FlipFlop } (i, j)
$$

Approximate the gate delay using a linear approximation (lower bound):

$$
D_{ij}^{short} \le \sum_{m \in p} d_m^{lin} \quad \forall\, \text{gates } m \text{ on path } p
$$

$$
d_m^{lin} = a_m^0 + a_m^1 x_m + \sum_{n \in fanout(m)} b_n x_n
$$

The two-stage stochastic programming formulation can be modified to consider shortest-path constraints

Experimental Results

Implemented the framework in SIS using MOSEK to solve the convex formulation

Used CAPO to place the netlist to get spatially correlated gate delays

Assumed 15% Vth variation in the 90nm technology node [Predictive Technology Model]

Synthesized the PST clock-tree using the technique proposed in [Chen et al., ICCAD’05]

(Figure: two placed gates $i$ and $j$ at coordinates $(x_i, y_i)$ and $(x_j, y_j)$.)

Experimental Results

Experimental comparison on ISCAS benchmarks:

[Chen]:

  Nominal gate sizing

  PST clock-tree generation using [Chen et al., ICCAD’05]

Sensitivity:

  Retain PST clock-tree location and range

  Sensitivity-driven statistical gate sizing algorithm

    – Size the gate with maximum yield gain greedily (iterative)

    – Similar in spirit to [Zhou ICCAD’05, Zolotov DAC’05]

Stochastic:

  Retain PST clock-tree buffer locations

  Proposed simultaneous gate sizing and post-silicon tunability allocation algorithm

BYL, Area and Tuning Range Comparison

(Figure: three bar charts comparing [Chen], Sensitivity, and Stochastic on benchmarks s344, s382, s400, s526, and s635: Binning Yield Loss; Area (Logic Gates) Comparison; Tuning Range Comparison.)

Timing Yield Loss Comparison

(Figure: bar chart of Timing Yield Loss for [Chen], Sensitivity, and Stochastic on benchmarks s344, s382, s400, s526, and s635.)

Average Timing Yield Loss

[Chen]    Sensitivity    Stochastic

0.22      0.19           0.03

Runtime Comparison

(Figure: bar chart of Runtime for Sensitivity and Stochastic on benchmarks s344, s382, s400, s526, and s635.)

Number of Iterations

Technique      s344   s382   s400   s526   s635

Sensitivity      24     40     18     15    109

Stochastic        7     19     13     14      7

Summary and Future Work

Variability-driven framework for simultaneous gate sizing and post-silicon tunability allocation to minimize binning-yield loss and tunability cost

Efficient stochastic programming based scheme to solve the formulation

No assumptions about parameter distribution or their correlations

Need to develop a statistical timing analysis scheme that can consider the effect of post-silicon tunability

Thank You!