Analytical Minimization of Signal Delay in VLSI Placement

Andrew B. Kahng and Igor L. Markov
UCSD, Univ. of Michigan
http://www.eecs.umich.edu/~imarkov
IBM technical contact: Paul Villarrubia


Page 1: Analytical Minimization of Signal Delay in VLSI Placement

Analytical Minimization of Signal Delay in VLSI Placement

Andrew B. Kahng and Igor L. Markov

UCSD, Univ. of Michigan
http://www.eecs.umich.edu/~imarkov

IBM technical contact: Paul Villarrubia

Page 2: Analytical Minimization of Signal Delay in VLSI Placement

Outline

• Background: Global Placement for VLSI
  – wirelength minimization
  – delay minimization
• Contribution
  – minimization objective
  – “generic” minimization algorithm: outer loop and inner loop
  – empirical results
• Futures

Page 3: Analytical Minimization of Signal Delay in VLSI Placement

VLSI Global Placement

• Find locations for standard cells

• Standard cells placed in rows, without overlap

• Minimize wirelength, “routing congestion”

• Minimize clock cycle

• Key abstractions:
  – standard cells → rectangular outlines
  – netlist → weighted hypergraph (signal nets → hyperedges)
  – signal delay → function of cell locations (interconnect dominates)

Page 4: Analytical Minimization of Signal Delay in VLSI Placement

A VLSI Global Placement Example

[Figure: side-by-side example of a bad placement and a good placement]

Page 5: Analytical Minimization of Signal Delay in VLSI Placement

Netlist Hypergraph and Timing Graph

• Two signal nets: 3 pins (light blue) and 4 pins (light green)

• Ovals: hyperedges

• Red edges: timing graph edges

Page 6: Analytical Minimization of Signal Delay in VLSI Placement

Top-Down Global Placement

• Placement blocks represent cells and layout area
  – single block at the start, driven by recursive (min-cut) bipartitioning
  – each pass: number of blocks doubles, size of blocks halves
  – end case: several cells in a tiny region
• Intuition: many cells can operate in parallel. Partitioning finds “independent” groups of cells

Page 7: Analytical Minimization of Signal Delay in VLSI Placement

Analytical Global Placement

• Find a continuous placement (locations == reals)
• Efficient optimizations when nonconvex constraints are relaxed (e.g., cells are allowed to overlap)
• Represent multi-pin hyperedges by sets of edges
  – minimize total weighted “wirelength” of all edges

Popular objectives:
• Linear (Manhattan) WL $= w_{12} (|x_1 - x_2| + |y_1 - y_2|)$
• Quadratic (“squared”) WL $= w_{12} ((x_1 - x_2)^2 + (y_1 - y_2)^2)$

Constraints: fixed vertices and/or “region constraints”

[Figure: two points labeled P1 and P2]
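A minimal sketch of the quadratic objective, assuming a one-dimensional toy netlist with two fixed pads. Each movable cell is repeatedly moved to the weighted mean of its neighbors, which is a Gauss-Seidel iteration on the optimality conditions. The function name, netlist and weights are made up for illustration; production placers solve the sparse linear system directly.

```python
def quadratic_place_1d(movable, fixed, edges, sweeps=200):
    """Minimize the sum of w * (x_a - x_b)**2 over the movable cells.

    movable: list of cell names; fixed: dict name -> coordinate;
    edges: list of (a, b, w) two-pin connections.
    Assumes every movable cell has at least one connection.
    """
    x = dict(fixed)
    for c in movable:
        x[c] = 0.0  # arbitrary start; the objective is convex
    nbrs = {c: [] for c in movable}
    for a, b, w in edges:
        if a in nbrs:
            nbrs[a].append((b, w))
        if b in nbrs:
            nbrs[b].append((a, w))
    for _ in range(sweeps):
        for c in movable:
            total_w = sum(w for _, w in nbrs[c])
            # Zero derivative <=> cell sits at the weighted mean of its neighbors.
            x[c] = sum(w * x[n] for n, w in nbrs[c]) / total_w
    return x

# Hypothetical example: pads at 0 and 10, two movable cells in a chain.
pads = {"left": 0.0, "right": 10.0}
net = [("left", "a", 1.0), ("a", "b", 1.0), ("b", "right", 2.0)]
loc = quadratic_place_1d(["a", "b"], pads, net)
print(loc["a"], loc["b"])
```

The heavier weight on the edge to the right pad pulls cell b toward it. With many cells and few pads, such solutions cluster and overlap, which is why spreading is needed afterwards.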

Page 8: Analytical Minimization of Signal Delay in VLSI Placement

Analytical Placement Alone is Not Enough

• Many cells overlap
• Must “spread” the placement
• IBM CPlace and XQ
  – Remove overlap (computational geometry)
  – CPlace combines min-cut with analytical techniques

Page 9: Analytical Minimization of Signal Delay in VLSI Placement

Timing-Driven Placement

• Cycle time is determined by the maximum path delay, not the total path delay (!)
  – max(x,y,...) is not differentiable
  – framework: pin-based timing graph
• Analytical approaches allow cell overlaps
  – cell overlaps are resolved later
• Main difficulty: cannot enumerate signal paths
• Signal paths are implicitly defined by device types
  – signal path sources, sinks == I/O pins and storage elements
• Timing constraints are also implicitly defined
  – “actual arrival times” (AATs) at sources
  – “required arrival times” (RATs) at sinks
  – source-sink path constraint: path delay ≤ RAT@sink - AAT@source

Page 10: Analytical Minimization of Signal Delay in VLSI Placement

Implicit Analysis of Path Constraints

• Static Timing Analysis (STA) methodology
  – forward topological traversal in timing graph → AAT@every_pin
  – similar backward traversal → RAT@every_pin
  – slack@pin is given by RAT@pin - AAT@pin
  – negative slacks ⇔ violated timing constraints
• STA-based and STA-inspired placement methods
  – slacks → net weights for HPWL minimization
    • top-down placement to maximize negative slack (Marek-Sadowska/Lin 86)
  – note: STA requires edge delays (e.g., from placement)
  – delay budgets
    • zero-slack (Hauge, Nair and Yoffa 86)
    • iterative min-max (Shragowitz et al. 90/92)
    • limit-bumping (Frankle 92)
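The two STA traversals can be sketched in a few lines of Python. This is an illustrative toy, not any tool's implementation: the function name, the pin names and the delays are assumptions, and the timing graph is taken to be a DAG with AATs given at all sources and RATs given at all sinks.

```python
from collections import defaultdict

def static_timing(edges, aat_sources, rat_sinks):
    """edges: dict (u, v) -> delay on a DAG; returns (aat, rat, slack) per pin."""
    succ, pred = defaultdict(list), defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for (u, v), d in edges.items():
        succ[u].append((v, d))
        pred[v].append((u, d))
        indeg[v] += 1
        nodes.update((u, v))
    # Topological order (Kahn's algorithm).
    order, ready = [], [n for n in nodes if indeg[n] == 0]
    while ready:
        u = ready.pop()
        order.append(u)
        for v, _ in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    # Forward pass: latest arrival time at every pin.
    aat = dict(aat_sources)
    for v in order:
        if pred[v]:
            aat[v] = max(aat[u] + d for u, d in pred[v])
    # Backward pass: earliest required time at every pin.
    rat = dict(rat_sinks)
    for u in reversed(order):
        if succ[u]:
            rat[u] = min(rat[v] - d for v, d in succ[u])
    slack = {n: rat[n] - aat[n] for n in nodes}
    return aat, rat, slack

# Hypothetical timing graph: two inputs feed a gate pin g, which drives out.
edges = {("in1", "g"): 2, ("in2", "g"): 3, ("g", "out"): 4}
aat, rat, slack = static_timing(edges, {"in1": 0, "in2": 0}, {"out": 6})
print(slack)
```

In this toy, the path through in2 takes 7 time units against a budget of 6, so every pin on it gets slack -1, while in1 has slack 0.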

Page 11: Analytical Minimization of Signal Delay in VLSI Placement

Motivations For Novelty

• Many promising techniques available
  – net reweighting
  – delay budgeting
  – others
• Existing frameworks have weaknesses
  – speed/scalability
  – loss or ignorance of input information
    • delay budgeting algorithms tend to ignore fixed locations, obstacles
  – optimization of “wrong” global objectives (e.g., average wirelength)

Page 12: Analytical Minimization of Signal Delay in VLSI Placement

The Dimensionless Path-Timing Objective

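• For a path $\pi$, consider each edge $e \in \pi$
• Dimensionless Path-Timing Objective (DPO)

$\mathrm{DPO} = \max_{\pi} \{ t_\pi / c_\pi \} = \max_{\pi} \{ (\sum_{e \in \pi} d_e) / c_\pi \}$

• Where
  – $c_\pi$ is the path constraint
  – $t_\pi$ is the path delay
  – $d_e = d_{ij}(x_i, y_i, x_j, y_j)$ is the edge delay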
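To make the definition concrete, here is a small sketch that evaluates the DPO by brute-force path enumeration on a toy timing graph. Real designs have too many paths to enumerate, so this only illustrates the formula. The helper names and numbers are hypothetical, and each path constraint is taken as RAT at the sink minus AAT at the source, assumed positive.

```python
def enumerate_paths(succ, sources, sinks):
    """Yield every source-to-sink path of a small DAG as a list of vertices."""
    def walk(path):
        v = path[-1]
        if v in sinks:
            yield list(path)
        for nxt in succ.get(v, []):
            yield from walk(path + [nxt])
    for s in sources:
        yield from walk([s])

def dpo(succ, delay, aat, rat):
    """Maximum over all paths of (path delay) / (path constraint)."""
    worst = 0.0
    for p in enumerate_paths(succ, set(aat), set(rat)):
        t = sum(delay[(u, v)] for u, v in zip(p, p[1:]))
        c = rat[p[-1]] - aat[p[0]]  # assumed positive
        worst = max(worst, t / c)
    return worst

# Hypothetical timing graph: two inputs feed g, which drives out.
succ = {"in1": ["g"], "in2": ["g"], "g": ["out"]}
delay = {("in1", "g"): 2.0, ("in2", "g"): 3.0, ("g", "out"): 4.0}
value = dpo(succ, delay, aat={"in1": 0.0, "in2": 0.0}, rat={"out": 6.0})
print(value)
```

The two paths have ratios 6/6 and 7/6, so the DPO of this toy exceeds 1 and the slower path violates its constraint.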

Page 13: Analytical Minimization of Signal Delay in VLSI Placement

DPO: Properties

$\mathrm{DPO} = \max_{\pi} \{ t_\pi / c_\pi \} = \max_{\pi} \{ (\sum_{e \in \pi} d_e) / c_\pi \}$

• $\mathrm{DPO} \le 1$ ⇔ all timing constraints are satisfied
• Convex when edge delay models are convex
• Min DPO ⇔ max slack when all $c_\pi$ are equal
• Max slack can be reduced to min DPO
  – add two new vertices: the source and the sink
  – connect the source to former sources
  – connect the sink to former sinks
  – use constant edge delay models
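The equal-constraint case can be checked in one line. The following is a short worked equation, assuming every path has the same constraint $c > 0$ and writing the slack of path $\pi$ as $s_\pi = c - t_\pi$ (this notation is introduced here for illustration); the outer min and max range over placements.

```latex
% Assumption: all paths share the constraint c > 0, and s_\pi = c - t_\pi.
\min \max_{\pi} \frac{t_\pi}{c} = \frac{1}{c} \, \min \max_{\pi} t_\pi ,
\qquad
\max \min_{\pi} s_\pi = c - \min \max_{\pi} t_\pi .
```

Both problems are therefore solved by exactly the placements that minimize the largest path delay.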

Page 14: Analytical Minimization of Signal Delay in VLSI Placement

Criticalities: “Multiplicative Slacks”

• By analogy with slack, define criticalities

$\kappa_i = \max_{\pi \ni v} \{ t_\pi / c_\pi \}$ for vertex $v = v_i$

$\kappa_{ij} = \max_{\pi \ni e} \{ t_\pi / c_\pi \}$ for edge $e = e_{ij}$

• Criticalities are multiplicative versions of slack
• DPO and criticalities are quickly computable
  – STA + postprocessing
• Vertex criticalities → cells on critical paths
  – can be used by the proposed top-down timing-driven placement flow
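A small sketch of the criticality definitions, again by explicit path enumeration on a toy example rather than by the STA-plus-postprocessing route mentioned above. The function name and data are hypothetical; each path is given together with its constraint.

```python
def criticalities(paths, delay):
    """paths: list of (vertex_list, constraint). Returns (vertex_crit, edge_crit).

    The criticality of a vertex or edge is the largest delay/constraint
    ratio among the paths that pass through it.
    """
    vcrit, ecrit = {}, {}
    for verts, c in paths:
        edges = list(zip(verts, verts[1:]))
        ratio = sum(delay[e] for e in edges) / c
        for v in verts:
            vcrit[v] = max(vcrit.get(v, 0.0), ratio)
        for e in edges:
            ecrit[e] = max(ecrit.get(e, 0.0), ratio)
    return vcrit, ecrit

# Hypothetical data: two paths sharing the edge (g, out), both with constraint 6.
delay = {("in1", "g"): 2.0, ("in2", "g"): 3.0, ("g", "out"): 4.0}
paths = [(["in1", "g", "out"], 6.0), (["in2", "g", "out"], 6.0)]
vcrit, ecrit = criticalities(paths, delay)
print(vcrit)
print(ecrit)
```

Here only in1 and its edge into g have criticality 1, and everything on the slower path has criticality 7/6. A criticality above 1 plays the role of a negative slack.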

Page 15: Analytical Minimization of Signal Delay in VLSI Placement

Generic Minimization of DPO

• Reduce DPO to a simpler objective: $\max_{ij} w_{ij} d_{ij}$
  – maximal weighted edge delay
  – use “reweighting iterations”
• One reweighting iteration
  – assume a placement
  – compute edge criticalities
  – compute new edge weights $w_{ij}$
  – minimize $\max_{ij} w_{ij} d_{ij}$
• (New weights: $w_{ij}' = \kappa_{ij} / d_{ij}$, where $\mu = \max_{ij} w_{ij} d_{ij}$)
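The weight update in one reweighting iteration can be sketched as follows. This assumes the new weight of an edge is its criticality divided by its current delay, which is one reading of the formula on this slide; the function names and numbers are hypothetical, and the min-max placement step that follows the update is not shown.

```python
def reweight(ecrit, delay):
    """New weight per edge: criticality divided by the current edge delay."""
    return {e: ecrit[e] / delay[e] for e in delay}

def max_weighted_delay(weights, delay):
    """The simpler objective: the largest weighted edge delay."""
    return max(weights[e] * delay[e] for e in delay)

# Toy data (hypothetical): current edge delays and edge criticalities.
delay = {("in1", "g"): 2.0, ("in2", "g"): 3.0, ("g", "out"): 4.0}
ecrit = {("in1", "g"): 1.0, ("in2", "g"): 7 / 6, ("g", "out"): 7 / 6}

w_new = reweight(ecrit, delay)
# At the current placement, weight * delay equals each edge's criticality,
# so the maximum equals the largest criticality, i.e. the current DPO.
mu_now = max_weighted_delay(w_new, delay)
print(w_new)
print(mu_now)
```

Under this reading, the placement that produced the delays is itself a candidate for the new min-max problem with value equal to the current DPO, so the minimizer can only do as well or better.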

Page 16: Analytical Minimization of Signal Delay in VLSI Placement

Properties of Reweighting

• Theorem 1. If $\mu = \max_{ij} w_{ij} d_{ij}$ does not increase at a particular iteration, all timing constraints must be satisfied.
• Theorem 2. A reweighting iteration either decreases DPO or leaves it unchanged.
• Reweighting upper-bounds $d_{ij}$ because $w_{ij} d_{ij} \le \mu$ ⇒ can interpret reweighting as delay rebudgeting
• Youssef and Shragowitz used $w_{ij} = \kappa_{ij}$ in 1990/92
  – [interpretation of their iterative MiniMax]
  – no iterations with placement: ignore fixed pad locations
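One way to see why a reweighting iteration cannot increase the DPO is the short derivation below. It is a sketch under stated assumptions, not the authors' proof: the weights are taken to be $w'_e = \kappa_e / d_e$, with $d_e$ and $\kappa_e$ computed at the current placement, $d'_e$ the edge delays after minimizing $\max_e w'_e d'_e$, and $\mu'$ the value reached. All of this notation is introduced here for illustration.

```latex
% Assumed notation: w'_e = \kappa_e / d_e, \mu' = \max_e w'_e d'_e, with d_e > 0 and t_\pi > 0.
d'_e \le \mu' \, \frac{d_e}{\kappa_e}
\quad\text{and}\quad
\kappa_e \ge \frac{t_\pi}{c_\pi} \ \text{for every path } \pi \ni e
\;\Longrightarrow\;
t'_\pi = \sum_{e \in \pi} d'_e
\le \mu' \, \frac{c_\pi}{t_\pi} \sum_{e \in \pi} d_e
= \mu' c_\pi .
```

So the new DPO is at most $\mu'$. The current placement already achieves $\max_e w'_e d_e = \max_e \kappa_e$, which is the current DPO, so an exact minimizer gives $\mu' \le \mathrm{DPO}$. This is consistent with Theorem 2, and the per-edge bound $d'_e \le \mu' d_e / \kappa_e$ is the sense in which reweighting acts like delay rebudgeting.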

Page 17: Analytical Minimization of Signal Delay in VLSI Placement

Optimization of Maximal Edge Delay

• Must consider particular edge delay models
  – popular choices: linear and quadratic
• Theorem 3. The 2-dim max edge delay problem can be reduced to the 1-dim case with double the number of vertices
• [“Inlined” implementation: no new graph]

Linear: $\max_{km} a_{km} |t_k - t_m|$

Quadratic: $\max_{km} b_{km} (t_k - t_m)^2$

• Theorem 4. Let $b_{km} = a_{km}^2$ ⇒ the minimizers coincide

Under the max objective, linear and quadratic WL are numerically equivalent!
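Theorem 4 follows from the identity $\max_{km} a_{km}^2 (t_k - t_m)^2 = (\max_{km} a_{km} |t_k - t_m|)^2$, since squaring is monotone on nonnegative values. The sketch below checks this numerically on a hypothetical 1-D instance with a single movable vertex, using ternary search as a stand-in solver; the names and data are made up for illustration.

```python
def argmin_convex(f, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex 1-D function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

# One movable vertex t connected to three fixed vertices: (position, weight a).
fixed = [(0.0, 1.0), (4.0, 2.0), (10.0, 0.5)]

def linear_max(t):
    return max(a * abs(t - p) for p, a in fixed)

def quadratic_max(t):
    # Weights b = a**2, as in the statement of Theorem 4.
    return max(a * a * (t - p) ** 2 for p, a in fixed)

t_lin = argmin_convex(linear_max, 0.0, 10.0)
t_quad = argmin_convex(quadratic_max, 0.0, 10.0)
print(t_lin, t_quad)
```

Both searches settle at the point where the weighted distances to the two outer vertices balance, so either delay model can be used to find the same min-max location.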

Page 18: Analytical Minimization of Signal Delay in VLSI Placement

Top-Down Placement Framework

• Top-down placement is done in passes
• In one pass
  – split every previously existing block
• Cell-to-block assignments
  – viewed as region constraints
  – gradually refine, converge to cell locations
• Assume we analytically minimized signal delay
  → have cell locations
  → can compute edge delays
  → can perform Static Timing Analysis
  → know which cells lie on critical paths
• Use delay-minimizing cell locations when splitting blocks

Page 19: Analytical Minimization of Signal Delay in VLSI Placement

Empirical Validation

• We combined min-max placement with recursive min-cut bisection (Capo → CapoT)
• Implemented minimization of edge delay objectives:
  – Length as delay
  – Squared length as delay
  – Quadratic RC delay
  – MST-based Elmore delay (using
• Evaluated
  – Internal evaluators (after placement): sanity check
  – Industry timing analyzer
• Compared to an industry placer on 4 test-cases
  – Won on three test-cases (by slack computed with industry STA)

Page 20: Analytical Minimization of Signal Delay in VLSI Placement

Results of Quadratic, Linear and Min-Max Placement

Page 21: Analytical Minimization of Signal Delay in VLSI Placement

Results of Quadratic, Linear and Min-Max Placement

Page 22: Analytical Minimization of Signal Delay in VLSI Placement

Conclusions and Ongoing Work

• New timing-driven placement framework
  – can potentially be combined with budgeting or reweighting
  – expected to be successful enough on its own
  – leverages min-cut placement
  – relies on a novel analytical delay minimization
• Dimensionless Path-timing Objective (DPO)
  – novel global timing objective; generalizes slack optimization
• New minimization algorithms
  – reweighting iteration: reduction to simpler MAX-based objective
  – MAX-based objective can be minimized very quickly
• Ongoing work in the context of timing-driven flows

Page 23: Analytical Minimization of Signal Delay in VLSI Placement

Future Work

• Observation (how the proposed method works)
  – a classic placement approach is split into stages
  – a new timing optimization is performed between those stages
  – most critical wires/gates are found first (traditionally: placement is found first)
• Try other types of optimizations during placement
  – routing of timing-critical nets
    • better delay estimation
    • early cross-talk detection?
  – sizing of timing-critical drivers
  – buffer insertion for timing-critical nets
  – early detection of dangerous cross-talk

Faster and cheaper ICs