Analytical Minimization of Signal Delay in VLSI Placement

Andrew B. Kahng and Igor L. Markov

UCSD, Univ. of Michigan

http://www.eecs.umich.edu/~imarkov

IBM technical contact: Paul Villarrubia

Outline

• Background: Global Placement for VLSI

– wirelength minimization

– delay minimization

• Contribution

– minimization objective

– “generic” minimization algorithm: outer loop and inner loop

– empirical results

• Futures

VLSI Global Placement

• Find locations for standard cells

• Standard cells placed in rows, without overlap

• Minimize wirelength, “routing congestion”

• Minimize clock cycle

• Key abstractions:

– standard cells → rectangular outlines

– netlist → weighted hypergraph (signal nets → hyperedges)

– signal delay → function of cell locations (interconnect dominates)

A VLSI Global Placement Example

[Figure: a bad placement next to a good placement]

Netlist Hypergraph and Timing Graph

• Two signal nets: 3 pins (light blue), and 4 pins (light green)

• Ovals: hyperedges

• Red edges: timing graph edges

Top-Down Global Placement

• Placement blocks represent cells and layout area

– single block at the start, driven by recursive (min-cut) bipartitioning

– each pass: number of blocks doubles, size of blocks halves

– end case: several cells in a tiny region

• Intuition: many cells can operate in parallel. Partitioning finds “independent” groups of cells

Analytical Global Placement

• Find a continuous placement (locations == reals)

• Efficient optimizations when nonconvex constraints are relaxed (e.g., cells are allowed to overlap)

• Represent multi-pin hyperedges by sets of edges

– minimize total weighted “wirelength” of all edges

Popular objectives:

• Linear (Manhattan) WL = $w_{12} (|x_1 - x_2| + |y_1 - y_2|)$

• Quadratic (“squared”) WL = $w_{12} ((x_1 - x_2)^2 + (y_1 - y_2)^2)$

Constraints: fixed vertices and/or “region constraints”
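The two objectives above can be compared on a toy instance. The following is a minimal sketch, not the authors' code: the names `linear_wl`, `quadratic_wl`, `total_wl` and the three-cell example are assumptions made for illustration, and each hyperedge is taken to be already decomposed into two-pin edges.

```python
def linear_wl(p1, p2, w=1.0):
    """Weighted Manhattan length of one two-pin edge."""
    (x1, y1), (x2, y2) = p1, p2
    return w * (abs(x1 - x2) + abs(y1 - y2))


def quadratic_wl(p1, p2, w=1.0):
    """Weighted squared Euclidean length of one two-pin edge."""
    (x1, y1), (x2, y2) = p1, p2
    return w * ((x1 - x2) ** 2 + (y1 - y2) ** 2)


def total_wl(edges, locs, edge_wl):
    """Sum an edge-length model over all (i, j, weight) edges."""
    return sum(edge_wl(locs[i], locs[j], w) for i, j, w in edges)


# Hypothetical placement: three cells, two weighted edges.
locs = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (3.0, 0.0)}
edges = [("a", "b", 1.0), ("b", "c", 2.0)]

linear_total = total_wl(edges, locs, linear_wl)        # 7 + 2*4
quadratic_total = total_wl(edges, locs, quadratic_wl)  # 25 + 2*16
```

Both totals are sums over edges, which is what makes them "average" objectives; the timing objective discussed later is a maximum instead.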

Analytical Placement Alone is Not Enough

• Many cells overlap

• Must “spread” the placement

• IBM CPlace and XQ

– Remove overlap (comp. geometry)

– CPlace combines min-cut with analytical techniques

Timing-Driven Placement

• Cycle time is determined by the maximum path delay, not total path delay (!)

– max(x,y,...) is not differentiable

– framework: pin-based timing graph

• Analytical approaches allow cell overlaps

– Cell overlaps are resolved later

• Main difficulty: cannot enumerate signal paths

• Signal paths implicitly defined by device types

– signal path sources, sinks == I/O pins and storage elements

• Timing constraints also implicitly defined

– “actual arrival times” (AATs) at sources

– “required arrival times” (RATs) at sinks

– source-sink path constraint: path delay ≤ RAT@sink - AAT@source

Implicit Analysis of Path Constraints

• Static Timing Analysis (STA) methodology

– forward topological traversal in timing graph → AAT@every_pin

– similar backward traversal → RAT@every_pin

– slack@pin is given by RAT@pin - AAT@pin

– negative slacks → violated timing constraints

• STA-based and STA-inspired placement methods

– slacks → net weights for HPWL minimization

• top-down placement to maximize negative slack (Marek-Sadowska/Lin 86)

– note: STA requires edge delays (e.g., from placement)

– delay budgets

• zero-slack (Hauge, Nair and Yoffa 86)

• iterative min-max (Shragowitz et al. 90/92)

• limit-bumping (Frankle 92)
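The STA traversals described in this slide can be sketched in a few lines. This is an illustrative sketch under simplifying assumptions, not a production timing analyzer: the function name `static_timing`, the dictionary-based graph encoding, and the four-pin example are all hypothetical, and edge delays are assumed to be given constants.

```python
from collections import defaultdict


def static_timing(edge_delay, aat_at_sources, rat_at_sinks):
    """Return (aat, rat, slack) for every pin of an acyclic timing graph.

    edge_delay maps (u, v) -> delay; every source needs an AAT and
    every sink needs a RAT.
    """
    succ, pred = defaultdict(list), defaultdict(list)
    pins = set()
    for u, v in edge_delay:
        succ[u].append(v)
        pred[v].append(u)
        pins.update((u, v))

    # Topological order (Kahn's algorithm).
    indegree = {p: len(pred[p]) for p in pins}
    ready = sorted(p for p in pins if indegree[p] == 0)
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for v in succ[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    # Forward traversal: latest arrival over all fan-in edges.
    aat = dict(aat_at_sources)
    for v in order:
        if pred[v]:
            aat[v] = max(aat[u] + edge_delay[(u, v)] for u in pred[v])

    # Backward traversal: earliest requirement over all fan-out edges.
    rat = dict(rat_at_sinks)
    for u in reversed(order):
        if succ[u]:
            rat[u] = min(rat[v] - edge_delay[(u, v)] for v in succ[u])

    slack = {p: rat[p] - aat[p] for p in pins}
    return aat, rat, slack


# Hypothetical graph: two paths from source "s" to sink "t".
edge_delay = {("s", "a"): 2, ("a", "t"): 3, ("s", "b"): 1, ("b", "t"): 1}
aat, rat, slack = static_timing(edge_delay, {"s": 0}, {"t": 4})
```

In this example the path through `a` takes 5 time units against a budget of 4, so every pin on it gets slack -1, while pin `b` keeps a positive slack of 2.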

Motivations For Novelty

• Many promising techniques available

– net reweighting

– delay budgeting

– others

• Existing frameworks have weaknesses

– speed/scalability

– loss or ignorance of input information

• delay budgeting algorithms tend to ignore fixed locations, obstacles

– optimization of “wrong” global objectives (e.g., average wirelength)

The Dimensionless Path-Timing Objective

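• For path $\pi$ consider edge $e \in \pi$

• Dimensionless Path-Timing Objective (DPO)

$\mathrm{DPO} = \max_\pi \{ t_\pi / c_\pi \} = \max_\pi \{ (\sum_{e \in \pi} d_e) / c_\pi \}$

• Where

– $c_\pi$ is path constraint

– $t_\pi$ is path delay

– $d_e = d_{ij}(x_i, y_i, x_j, y_j)$ is edge delay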

DPO: Properties

$\mathrm{DPO} = \max_\pi \{ t_\pi / c_\pi \} = \max_\pi \{ (\sum_{e \in \pi} d_e) / c_\pi \}$

• DPO ≤ 1 ⇔ all timing constraints are satisfied

• Convex when edge delay models are convex

• Min DPO ⇔ max slack when all $c_\pi$ are equal

• Max slack can be reduced to min DPO

– add two new vertices: the source and the sink

– connect the source to former sources

– connect the sink to former sinks

– use constant edge delay models
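The equal-constraint property above follows from a one-line calculation. The sketch below assumes notation not on the slide: $s_\pi = c_\pi - t_\pi$ for the slack of path $\pi$, and a common constraint $c_\pi = c > 0$ for all paths.

```latex
\[
  \min_\pi s_\pi
  \;=\; \min_\pi \,(c - t_\pi)
  \;=\; c - \max_\pi t_\pi
  \;=\; c \left( 1 - \max_\pi \frac{t_\pi}{c} \right)
  \;=\; c \,(1 - \mathrm{DPO}).
\]
```

Since $c$ is a positive constant, lowering DPO raises the worst slack by the same placement, and the worst slack is nonnegative exactly when DPO is at most 1.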

Criticalities: “Multiplicative Slacks”

• By analogy with slack, define criticalities

$\kappa_i = \max_{\pi \ni v} \{ t_\pi / c_\pi \}$ for vertex $v = v_i$

$\kappa_{ij} = \max_{\pi \ni e} \{ t_\pi / c_\pi \}$ for edge $e = e_{ij}$

• Criticalities are multiplicative versions of slack

• DPO and criticalities quickly computable

– STA + postprocessing

• Vertex criticalities → cells on critical paths

– can be used by the proposed top-down timing-driven placement flow
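To make the definition of criticalities concrete, the sketch below computes them by brute-force path enumeration on a tiny graph. This is only an illustration: the slides compute criticalities with STA plus postprocessing rather than by enumerating paths, and the function name `criticalities`, the single shared constraint `c`, and the example graph are assumptions.

```python
def criticalities(edge_delay, sources, sinks, c):
    """Vertex and edge criticalities: max of t/c over paths through each.

    Enumerates every source-to-sink path, so it only suits tiny acyclic
    graphs. Assumes one constraint c shared by all paths and sinks
    without outgoing edges.
    """
    succ = {}
    for u, v in edge_delay:
        succ.setdefault(u, []).append(v)

    vertex_crit, edge_crit = {}, {}
    stack = [[s] for s in sources]
    while stack:
        path = stack.pop()
        last = path[-1]
        if last in sinks:
            hops = list(zip(path, path[1:]))
            ratio = sum(edge_delay[e] for e in hops) / c
            for v in path:
                vertex_crit[v] = max(vertex_crit.get(v, 0.0), ratio)
            for e in hops:
                edge_crit[e] = max(edge_crit.get(e, 0.0), ratio)
        for v in succ.get(last, ()):
            stack.append(path + [v])
    return vertex_crit, edge_crit


# Hypothetical graph: two paths from "s" to "t", constraint c = 4.
edge_delay = {("s", "a"): 2, ("a", "t"): 3, ("s", "b"): 1, ("b", "t"): 1}
vertex_crit, edge_crit = criticalities(edge_delay, {"s"}, {"t"}, c=4)

# The DPO is the largest criticality over all vertices (or edges).
dpo = max(vertex_crit.values())
```

Here the path through `a` has ratio 5/4, so `s`, `a`, and `t` are flagged as lying on a violating path, while `b` has criticality 0.5.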

Generic Minimization of DPO

• Reduce DPO to a simpler objective: $\max_{ij} w_{ij} d_{ij}$

– maximal weighted edge delay

– use “reweighting iterations”

• One reweighting iteration

– assume a placement

– compute edge criticalities

– compute new edge weights $w_{ij}$

– minimize $\max_{ij} w_{ij} d_{ij}$

• (New weights: $w'_{ij} = \kappa_{ij} / d_{ij}$, where $\kappa_{ij}$ is the edge criticality and $\mu = \max_{ij} w_{ij} d_{ij}$)
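The weight-update step of one reweighting iteration can be sketched as follows. This assumes the reading that the new weight of an edge is its criticality divided by its current delay; the function names, the example delays, and the criticality values are hypothetical, and the inner minimization of the weighted maximum (the actual re-placement) is deliberately left out.

```python
def reweight(edge_delay, edge_crit):
    """New weight per edge: criticality divided by the current edge delay."""
    return {e: edge_crit[e] / edge_delay[e] for e in edge_delay}


def max_weighted_delay(weights, edge_delay):
    """The simpler objective: maximal weighted edge delay."""
    return max(weights[e] * edge_delay[e] for e in edge_delay)


# Hypothetical current placement: edge delays and edge criticalities
# (two source-to-sink paths with t/c = 1.25 and t/c = 0.5).
edge_delay = {("s", "a"): 2.0, ("a", "t"): 3.0, ("s", "b"): 1.0, ("b", "t"): 1.0}
edge_crit = {("s", "a"): 1.25, ("a", "t"): 1.25, ("s", "b"): 0.5, ("b", "t"): 0.5}

weights = reweight(edge_delay, edge_crit)
objective = max_weighted_delay(weights, edge_delay)
```

With these weights, each product of weight and current delay equals the edge's criticality, so at the current placement the weighted maximum equals the largest criticality, which is the DPO (1.25 in this example). A subsequent minimization of the weighted maximum then acts as an upper bound on each edge delay.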

Properties of Reweighting

• Theorem 1. If $\mu = \max_{ij} w_{ij} d_{ij}$ does not increase at a particular iteration, all timing constraints must be satisfied.

• Theorem 2. A re-weighting iteration either decreases DPO, or leaves it unchanged.

• Reweighting upper-bounds $d_{ij}$ because $w_{ij} d_{ij} \le \max_{ij} w_{ij} d_{ij}$ → can interpret reweighting as delay rebudgeting

• Youssef and Shragowitz used $w_{ij} = \kappa_{ij}$ (the edge criticality) in 1990/92

– [interpretation of their iterative MiniMax]

– no iterations with placement: ignore fixed pad locations

Optimization of Maximal Edge Delay

• Must consider particular edge delay models

– popular choices: linear and quadratic

• Theorem 3. 2-dim max edge delay can be reduced to 1-dim case with double #vertices

• [“Inlined” implementation: no new graph]

$\max_{k,m} a_{km} |t_k - t_m|$

$\max_{k,m} b_{km} (t_k - t_m)^2$

• Theorem 4. Let $b_{km} = a_{km}^2$ ⇒ minimizers coincide

Linear and quadratic WL are numerically equivalent!
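The reason the two max-based objectives share minimizers can be written out in one step. The sketch below assumes nonnegative coefficients $a_{km} \ge 0$ and uses $f$ and $g$ as names introduced here for the linear and quadratic objectives.

```latex
\[
  g(t) \;=\; \max_{k,m} a_{km}^2 \,(t_k - t_m)^2
       \;=\; \max_{k,m} \bigl( a_{km} |t_k - t_m| \bigr)^2
       \;=\; \Bigl( \max_{k,m} a_{km} |t_k - t_m| \Bigr)^2
       \;=\; f(t)^2 .
\]
```

The third equality holds because $x \mapsto x^2$ is increasing for $x \ge 0$, so it commutes with the maximum. For the same reason $f(t) \le f(t')$ exactly when $g(t) \le g(t')$, and the two objectives have the same set of minimizers. This argument relies on the max; it does not carry over to sum-based wirelength.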

Top-Down Placement Framework

• Top-down placement done in passes

• In one pass

– split every previously existing block

• Cell-to-block assignments

– viewed as region constraints

– gradually refine, converge to cell locs

• Assume we analytically minimized signal delay

→ have cell locations

→ can compute edge delays

→ can perform Static Timing Analysis

→ know which cells lie on critical paths

• Use delay-minimizing cell locs when splitting blocks

Empirical Validation

• We combined min-max placement with recursive min-cut bisection (Capo → CapoT)

• Implemented minimization of edge delay objectives:

– Length as delay

– Squared length as delay

– Quadratic RC delay

– MST-based Elmore delay (using

• Evaluated

– Internal evaluators (after placement): sanity check

– Industry timing analyzer

• Compared to an industry placer on 4 test-cases

– Won on three test-cases (by slack computed with industry STA)

Results of Quadratic, Linear and Min-Max Placement

Conclusions and Ongoing Work

• New timing-driven placement framework

– can potentially be combined with budgeting or reweighting

– expected to be successful enough on its own

– leverages mincut placement

– relies on a novel analytical delay minimization

• Dimensionless Path-timing Objective (DPO)

– novel global timing objective; generalizes slack optimization

• New minimization algorithms

– reweighting iteration: reduction to simpler MAX-based objective

– MAX-based objective can be minimized very quickly

• Ongoing work in the context of timing-driven flows

Future Work

• Observation (how the proposed method works)

– a classic placement approach is split into stages

– a new timing optimization is performed between those stages

– most critical wires/gates are found first

(traditionally: placement is found first)

• Try other types of optimizations during placement

– routing of timing-critical nets

• better delay estimation

• early cross-talk detection?

– sizing of timing-critical drivers

– buffer insertion for timing-critical nets

– early detection of dangerous cross-talk

Faster and cheaper ICs
