Mixed-Size Placement with Fixed Macrocells using Grid-Warping

Zhong Xiu*, Rob Rutenbar

*Advanced Micro Devices Inc.,
Department of Electrical and Computer Engineering, Carnegie Mellon University

Slide 2

Placement by Grid-Warping

In [Zhong et al, DAC04], we showed the first grid-warping placer
In [Zhong et al, DAC05], we showed our timing-driven placer
Fundamentally new idea for placement improvement
  Imagine we place the gates on the surface of a flexible elastic sheet
  We stretch the sheet to improve the placement

[Figure: quadratic initial placement -> warp placement surface -> improved warped result -> recurse and descend to continue]

Slide 3

Grid Warping: Attractive Features

Novel paradigm for placement: optimize the grid, not the gates
  Think of “gravity”: we reshape the curvature of space to move the mass

Flexibly nonlinear
  Free to warp any way we like; not driven primarily by linear solves

Low-dimensional optimization problem
  We only need to control the sheet; we don’t move gates individually

Early prototypes (WARP1, WARP2) perform well
  Competitive on wirelength with other published placers
  As fast as, or faster than, many other analytical placers

Slide 4

Organization of this Talk

What’s missing? Handling mixed-size placement
  Wirelength/timing optimization is necessary but not sufficient
  We must be able to handle mixed-size placement with fixed macrocells

First, we review the basic mechanics of grid-warping
Second, we show how to extend grid-warping for mixed-size placement with fixed macrocells
Finally, we show our results and future directions

Slide 5

Review: Mechanics of Grid-Warping

It’s conceptually useful to think of warping as distorting a regular mesh placed on the elastic placement surface...

...but this is not actually how we implement warping

[Figure: quadratic initial placement -> warp placement surface -> improved warped result -> recurse and descend to continue]

Slide 6

We Formulate Warping in an “Inverse” Way

We warp to “acquire” a new set of gates in each unit grid area...
...then “pull” gates back to the undistorted grid, to move them

[Figure: initial placement mass and grids -> warp grids and acquire gates -> restore grids and pull gates back]
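To make the "inverse" idea concrete, here is a minimal 1-D sketch. It assumes a single movable cut and piecewise-linear stretching on each side of it; the function name `pull_back` and the linear mapping are illustrative assumptions, not the WARP implementation.

```python
def pull_back(x, warped_cut, regular_cut=0.5):
    """Map coordinate x in [0, 1] from the warped sheet back to the regular grid.

    The cut at `warped_cut` is restored to `regular_cut`; points on each
    side are stretched linearly (an assumption made for illustration).
    """
    if not 0.0 < warped_cut < 1.0:
        raise ValueError("warped_cut must lie strictly inside (0, 1)")
    if x <= warped_cut:
        return x * regular_cut / warped_cut
    return regular_cut + (x - warped_cut) * (1.0 - regular_cut) / (1.0 - warped_cut)


gates = [0.05, 0.10, 0.15, 0.20, 0.60]
# Warp the cut left to x = 0.2 so the left grid cell "acquires" only the
# gates at or below 0.2, then restore the cut to 0.5 and pull gates along.
moved = [pull_back(x, warped_cut=0.2) for x in gates]
```

In this toy case the four clustered gates are spread across the left half of the sheet, which is the effect the slide describes: the optimizer moves the cut, and the gates follow.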

Slide 7

And We Do Not Use a Regular Warping Grid

Instead, we use a grid defined by a set of slicing cuts
  It turns out this allows a greater range of motion for the gates

Yes, a lot like quadrisection or partitioning, but more general
  The cuts need not be axis parallel
  Because gates are fully placed in each region, we get real wirelength

[Figure: 2x2 warping grid; 4x4 warping grid]
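As a rough illustration of a cut that is not axis parallel, the sketch below assigns gates to the two sides of a slanted cut using the sign of a 2-D cross product. The representation of a cut as two endpoints and the names `side_of_cut` and `split_by_cut` are assumptions for this example only.

```python
def side_of_cut(gate, p0, p1):
    """Return 0 if `gate` lies to the left of the directed line p0 -> p1
    (or on it), else 1. Uses the sign of the 2-D cross product."""
    (x, y), (x0, y0), (x1, y1) = gate, p0, p1
    cross = (x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)
    return 0 if cross >= 0 else 1


def split_by_cut(gates, p0, p1):
    """Partition gate coordinates into the two regions defined by one cut."""
    regions = ([], [])
    for g in gates:
        regions[side_of_cut(g, p0, p1)].append(g)
    return regions


gates = [(0.2, 0.8), (0.3, 0.3), (0.7, 0.9), (0.8, 0.1)]
# A slanted, bottom-to-top cut from (0.4, 0.0) to (0.6, 1.0).
left, right = split_by_cut(gates, (0.4, 0.0), (0.6, 1.0))
```

Moving either endpoint changes which gates each region acquires, so a handful of endpoint coordinates is enough to control the whole partition.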

Slide 8

Complete Grid Warping Flow

Complete flow has several steps
We review them briefly here

[Flow diagram: Quadratic Placement -> Pre-Warping -> Nonlinear Grid-Warping Loop -> Improvement -> Decompose/Recurse -> Legalize (Domino)]

Slide 9

Complete Grid Warping Flow

Quadratic place onto elastic sheet
  Note: pure quadratic wirelength
  No reweighting steps
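For readers unfamiliar with quadratic placement, the following is a small 1-D sketch of minimizing pure quadratic wirelength with fixed pads. It uses plain Gauss-Seidel sweeps for brevity; production placers typically solve the equivalent sparse linear system with a dedicated solver, and nothing here is taken from the WARP code. The function name and data layout are assumptions.

```python
def quadratic_place_1d(n_movable, edges, fixed, sweeps=200):
    """Minimize the sum of w * (x_a - x_b)**2 over `edges` in one dimension.

    Nodes 0..n_movable-1 are movable; `fixed` maps other node ids to pad
    coordinates. Each Gauss-Seidel sweep sets every movable node to the
    weighted average of its neighbours (the stationarity condition).
    """
    x = {i: 0.0 for i in range(n_movable)}
    x.update(fixed)
    nbrs = {i: [] for i in range(n_movable)}
    for a, b, w in edges:
        if a in nbrs:
            nbrs[a].append((b, w))
        if b in nbrs:
            nbrs[b].append((a, w))
    for _ in range(sweeps):
        for i in range(n_movable):
            total_w = sum(w for _, w in nbrs[i])
            if total_w > 0:
                x[i] = sum(w * x[j] for j, w in nbrs[i]) / total_w
    return [x[i] for i in range(n_movable)]


# Chain: pad "L" at 0.0 -- gate 0 -- gate 1 -- pad "R" at 3.0
xs = quadratic_place_1d(2, [("L", 0, 1.0), (0, 1, 1.0), (1, "R", 1.0)],
                        {"L": 0.0, "R": 3.0})
```

With unit weights the two gates settle at one third and two thirds of the way between the pads, which is the evenly spaced solution a pure quadratic objective prefers.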

Slide 10

Complete Grid Warping Flow

Geometric pre-conditioning step
  Spreads gates out quickly and uniformly, to improve final wirelength

Slide 11

Complete Grid Warping Flow

Nonlinear optimizer iteratively perturbs warping grid on sheet

Slide 12

Complete Grid Warping Flow

Nonlinear optimizer iteratively perturbs warping grid on sheet

...each new warping is quickly “stretched” back to a full placement

Use this to evaluate the cost function, which tracks “rectilinear wirelength + capacity”
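The wirelength half of such a cost function can be sketched as below. The slides say only "rectilinear wirelen"; the half-perimeter bounding-box estimate used here is a common choice and is an assumption, as are the function name and data layout. The capacity term is not shown.

```python
def hpwl(nets, pos):
    """Half-perimeter (rectilinear bounding-box) wirelength over all nets.

    `nets` is a list of lists of cell names; `pos` maps a name to (x, y).
    """
    total = 0.0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


pos = {"a": (0.0, 0.0), "b": (2.0, 1.0), "c": (1.0, 3.0)}
nets = [["a", "b"], ["a", "b", "c"]]
wirelen = hpwl(nets, pos)
```

Because every warping is stretched back to a full placement, a measure like this can be evaluated on actual gate positions rather than on a partition-level estimate.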

Slide 13

Complete Grid Warping Flow

Nonlinear optimizer delivers a final warped placement

Standard improvement step runs hMetis to optimize location of gates placed near partition cuts

Slide 14

Complete Grid Warping Flow

Recurse: in this case, 4 new placements inside 4 regions

Continue until only a few gates remain per region

Slide 15

Complete Grid Warping Flow

Warping flow delivers a final, but still slightly illegal, placement

Use Domino (T.U. Munich) to legalize to final detailed placement

Slide 16

Problem: Warped Placement with Macrocells

Assumptions
  We focus on the fixed-macro case

The core problem
  Warping is intrinsically weak at separating large macros and small gates
  All instances modeled as points; elastic “stretching” keeps nearby points close

[Figure panels: IBM03, IBM04, IBM17, IBM18]

Slide 17

Handling Fixed Macrocells

[Flow diagram boxes: Quadratic Placement, Pre-Warping, Nonlinear Grid-Warping Loop, Improvement, Re-Warping, Decompose/Recurse, Legalize (FastPlace)]

4 new geometric solutions
  Inside warping, a geometric “hash” function that greedily relocates gates that warp on top of macrocells
  ...and a new net model (QP)
  During partition improvement, closer attention to size imbalances
  New backend (FastPlace)

Slide 18

(1) Geometric Hashing During Warping

Problem: nonlinear warping drops gates on top of fixed macros
Solution: “hash” them off, inside warping loop
  Inside warping, inside each evaluation of the global cost function, check whether each cell overlaps a macro
  If so, we push it to the nearest boundary that has enough space for the cell
  We chop the chip up into small grids and store “nearest boundary” info in a hash table

Note
  No attempt to manage density or wirelength in this solution, just legality

[Figure: four fixed macros, each labeled M]
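The lookup-table idea can be sketched as follows. This is a simplified illustration under stated assumptions: macros are axis-aligned rectangles, the table stores the nearest macro edge point for each covered bin centre, and the "enough space for the cell" check from the slide is omitted. The names `build_boundary_table` and `hash_off` are hypothetical.

```python
def build_boundary_table(macros, chip_w, chip_h, bin_size):
    """Precompute, for every bin centre covered by a macro, the nearest
    point on that macro's boundary. `macros` holds (xlo, ylo, xhi, yhi)."""
    table = {}
    nx, ny = int(chip_w / bin_size), int(chip_h / bin_size)
    for ix in range(nx):
        for iy in range(ny):
            cx, cy = (ix + 0.5) * bin_size, (iy + 0.5) * bin_size
            for xlo, ylo, xhi, yhi in macros:
                if xlo < cx < xhi and ylo < cy < yhi:
                    # Candidate exits: project onto each of the four edges.
                    exits = [(cx - xlo, (xlo, cy)), (xhi - cx, (xhi, cy)),
                             (cy - ylo, (cx, ylo)), (yhi - cy, (cx, yhi))]
                    table[(ix, iy)] = min(exits)[1]
                    break
    return table


def hash_off(x, y, table, bin_size):
    """Return the stored boundary point if (x, y) hashes to a covered bin,
    otherwise return the position unchanged. The result is snapped to the
    bin centre's exit point, so it is approximate by design."""
    key = (int(x / bin_size), int(y / bin_size))
    return table.get(key, (x, y))


table = build_boundary_table([(4.0, 4.0, 8.0, 8.0)], 10.0, 10.0, 1.0)
inside = hash_off(4.7, 6.2, table, 1.0)    # bin (4, 6), centre (4.5, 6.5)
outside = hash_off(1.2, 1.2, table, 1.0)   # not covered by any macro
```

The table is built once because the macros are fixed, so each cost-function evaluation pays only a constant-time lookup per cell.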

Slide 19

(2) New Net Models

1st QP & QPs in re-warping
  For 2-pin nets, use the clique model, and set weight to 1.
  For nets with 3 or more pins, use the star model and introduce a new variable; each net has a weight: #Pins/(#Pins - 1). (FastPlace, ISPD’04)

2nd QP and beyond
  At the 2nd layer of QP and beyond, use Jens Vygen’s net-split technique
  Hybrid models make QP faster and conserve quality
  If a column (row) contains only one cell, the star model is used and no additional variable is introduced.
  If a column (row) contains two or more cells, a new cell is introduced and is connected to all the internal cells and all the propagated pins.
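The first-level hybrid model can be sketched as a function that expands nets into weighted 2-pin connections. Applying the #Pins/(#Pins - 1) weight to every spoke of the star is this sketch's reading of the slide; the function name and the `("star", idx)` identifier for the added variable are hypothetical. The net-split technique for later levels is not shown.

```python
def hybrid_net_edges(nets):
    """Expand nets into weighted 2-pin connections for a quadratic solve.

    2-pin nets: one clique edge of weight 1.
    k-pin nets (k >= 3): a star with one extra node per net; each spoke
    gets weight k / (k - 1).
    """
    edges = []
    for idx, pins in enumerate(nets):
        k = len(pins)
        if k < 2:
            continue  # a single-pin net contributes no wirelength
        if k == 2:
            edges.append((pins[0], pins[1], 1.0))
        else:
            star = ("star", idx)  # hypothetical id for the added variable
            edges.extend((p, star, k / (k - 1)) for p in pins)
    return edges


edges = hybrid_net_edges([["a", "b"], ["a", "c", "d"]])
```

A k-pin star adds k connections and one variable, whereas a clique on the same net would add k(k-1)/2 connections, which is why the hybrid keeps the QP matrix sparse for large nets.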

Slide 20

(3) Better Consideration of Capacity

Get rid of “Pre-Warping”

Cost function: use a single-sided cost function, so under-filled regions receive no capacity penalty

hMetis: constrained so that the total area of all cells in each region does not exceed the capacity of that region

Re-Warping: all of the above-mentioned modifications are applied here
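A single-sided capacity term might look like the sketch below. The slides state only that under-filled regions receive no penalty; the quadratic form of the overflow and the function name `capacity_penalty` are assumptions for illustration.

```python
def capacity_penalty(region_area, region_capacity):
    """Single-sided penalty: only over-filled regions are penalized.

    The squared overflow is an assumed form; the one-sidedness (no penalty
    when area <= capacity) is the property described on the slide.
    """
    penalty = 0.0
    for area, cap in zip(region_area, region_capacity):
        overflow = max(0.0, area - cap)
        penalty += overflow ** 2
    return penalty


# Region 0 is under-filled, region 1 is exactly full, region 2 overflows by 2.
p = capacity_penalty([3.0, 5.0, 7.0], [5.0, 5.0, 5.0])
```

With fixed macros consuming part of some regions, a two-sided penalty would push gates toward regions that merely look empty; a one-sided term avoids rewarding that.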

Slide 21

(4) New Backend

Use ideas from FastPlace [ICCAD’05]
  Swap-based detailed placement algorithm (a greedy algorithm)
  Find the “optimal region” for a cell; if the cell is not in this region, try to swap it with a cell or space in that optimal region

Detailed backend flow
  First pass legalize
    FengShui 5.1 from SUNY Binghamton
  Local repair for small macros/overlaps
    Local greedy swaps
  Wirelength minimization
    Chu’s FastPlace legalizer ideas

Our new overall flow
  Run WARP as global placement
  Sort all the cells and approximately legalize them
  Run the new detailed backend

[Figure: cell i and the optimal region of cell i. Try to swap cell i with a cell or a space in its optimal region, so that the wirelength can be improved.]
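The slides do not define the "optimal region"; a common definition uses the medians of the bounding-box ends of the cell's nets, computed with the cell itself removed. The 1-D sketch below follows that definition as an assumption; the function name and data layout are hypothetical, and the swap step itself is not shown.

```python
def optimal_region_1d(cell, nets, coord):
    """Return the (lo, hi) interval minimizing the cell's bounding-box
    wirelength in one dimension, using the classic median construction.

    For each net on the cell, take the extent of the net's *other* pins;
    the optimal interval lies between the two medians of all extent ends.
    """
    ends = []
    for net in nets:
        if cell not in net:
            continue
        others = [coord[c] for c in net if c != cell]
        if others:
            ends.extend((min(others), max(others)))
    if not ends:
        return None  # unconnected cell: every position is equally good
    ends.sort()
    mid = len(ends) // 2
    return ends[mid - 1], ends[mid]


x = {"i": 9.0, "a": 1.0, "b": 3.0, "c": 4.0, "d": 6.0}
nets = [["i", "a", "b"], ["i", "c"], ["i", "d"]]
region = optimal_region_1d("i", nets, x)
```

Here cell `i` sits at 9.0, outside its optimal interval, so a greedy backend of the kind described would look for a cell or free space near that interval to swap with.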

Slide 22

Macrocell Results – ISPD’02

Wirelength
  3-4% better than Feng Shui
  Competitive with BonnPlace

Run time
  About 2X that of Feng Shui
  ~Competitive with BonnPlace
  Exact comparison difficult: BonnPlace is running on a 1.45GHz IBM 4-processor server and is explicitly parallelized software
  We’re on a 2.0GHz Xeon.

Feng Shui 5.1       BonnPlace           Warp 3
Wire     CPU        Wire     CPU        Wire     CPU
1.037    0.45       0.993    --         1.000    1.00

Slide 23

ISPD’02 Layouts

ibm01 ibm04

Slide 24

More Macrocell Results – ISPD’05

Design      WARP 3                                APlace
            Global    Legalization    Backend     Wirelen
adaptec1    1.0273    1.0447          0.9720      0.8731
adaptec4    2.1258    2.1892          2.0455      1.8731
bigblue1    1.0639    1.0659          1.0181      0.9464
bigblue2    1.7764    1.8526          1.6950      1.4382
bigblue3    3.6878    3.6959          3.5783      3.5789
bigblue4    8.9466    9.1896          8.5981      8.3321
Ratio       1.043     1.063           1.000       0.927

Versus APlace (Kahng@UCSD)
  ~7% more wirelength
  Total 41.75 hours for all 6 designs for Warp on 2.8GHz Xeon

Aside: ISPD’05 contest results
  APlace 1.00
  mFAR 1.06
  Dragon 1.08
  mPL 1.09
  FastPlace 1.16
  Capo 1.17
  NTUP 1.21
  Fengshui 1.50
  Kraftwerk+Domino 1.84

Slide 25

ISPD’05 Layouts

adaptec2 adaptec4

Slide 26

ISPD’05 Layouts

bigblue2 bigblue3

Slide 27

Conclusion and Future Work

Placement with macrocells (WARP3) is competitive
  New techniques such as geometric hashing, improved hybrid net model
  Can produce very good quality mixed-size placements reasonably quickly

Future Work
  Improve both quality and runtime
  Better handle macrocells and routing congestion
  “Hybrid” layout strategies (warping, but a flatter, analytical style)