VLSI-Physical Design- Tool Terminalogy

Physical Design Flow Physical Design Flow Mohammad reza Kakoee micrellab m.kakoee@unibo.it

Transcript of VLSI-Physical Design- Tool Terminalogy

Page 1: VLSI-Physical Design- Tool Terminalogy

Physical Design Flow

Mohammad reza Kakoee, micrellab, m.kakoee@unibo.it

Page 2: VLSI-Physical Design- Tool Terminalogy

Agenda

Introduction to design flow and Backend
Introduction to design planning
Floorplanning / Hierarchical design
Power planning
Summary

Page 3: VLSI-Physical Design- Tool Terminalogy
Page 4: VLSI-Physical Design- Tool Terminalogy
Page 5: VLSI-Physical Design- Tool Terminalogy
Page 6: VLSI-Physical Design- Tool Terminalogy
Page 7: VLSI-Physical Design- Tool Terminalogy
Page 8: VLSI-Physical Design- Tool Terminalogy
Page 9: VLSI-Physical Design- Tool Terminalogy
Page 10: VLSI-Physical Design- Tool Terminalogy
Page 11: VLSI-Physical Design- Tool Terminalogy

Agenda

Introduction to design flow and Backend
Introduction to design planning
Floorplanning / Hierarchical design
Power planning
Summary

Page 12: VLSI-Physical Design- Tool Terminalogy

The Physical Design Task

[Figure: the front end delivers a Verilog netlist and SDC constraints; the back-end physical design flow turns them into GDSII.]

Page 13: VLSI-Physical Design- Tool Terminalogy

Example Physical Design Flow
Design/Constraints Import
Floorplanning
Placement
Clock Tree Synthesis
Routing
Post Route Optimization
Layout Verification / Finishing

Page 14: VLSI-Physical Design- Tool Terminalogy

Fullchip Design Overview
The location of the core, I/O areas, P/G pads and the P/G grid
[Figure labels: core placement area, periphery (I/O) area, P/G rings, P/G grid, straps, IP, ROM, RAM]

Page 15: VLSI-Physical Design- Tool Terminalogy

Where Do We Start? - Design Planning
[Figure: Verilog netlist and SDC constraints feeding the physical design flow]
How do we handle?
  Die size
  IO / Hard-IP placement
  Global clock distribution
  Power planning
  Flat versus hierarchical design

Page 16: VLSI-Physical Design- Tool Terminalogy

Design Planning
Floorplanning
  Determine die size
  Shape and arrange hierarchical blocks
  Integrate hard-IP efficiently
  Predict and prevent congestion hotspots and critical timing paths
Power planning
  Create power distribution grid
  Consider IR drop and Electromigration
  Implement power saving techniques
    Power gating
    Multi-Voltage design / Voltage islands

Page 17: VLSI-Physical Design- Tool Terminalogy

Agenda

Introduction to design planning
Floorplanning
  Setup/configuration
  Die size, utilization, metallization scheme
  IO-ring and macro placement
  Flat versus hierarchical design
  Hierarchical design planning issues
Power planning
Summary

Page 18: VLSI-Physical Design- Tool Terminalogy

Setup/configuration
Read netlist
  Check netlist
    High fanout
    Unique
    Unconnected inputs
    Standard cell area
Read SDC
  Check timing without wire load
Read .lib files
Read footprint for P&R
  LEF : SOC Encounter
  FRAM : Synopsys tools
Read technology file
  Metal width … (DRC rules)

Page 19: VLSI-Physical Design- Tool Terminalogy

Floorplanning – Die Size, Utilization & Metal Stack-up
Choosing the die size, initial standard cell utilization and metallization scheme involves several design tradeoffs (Schedule, Cost, Performance)
Larger die
  Easier to route, less congestion, lower cap (decrease signal/power integrity related problems), faster design cycle
  Higher cost, higher power
More dense power grid
  Reduce risk of power related failures
  Increase number of metal layer masks, reduce signal route tracks

Page 20: VLSI-Physical Design- Tool Terminalogy

Floorplanning – Utilization
[Figure: low standard-cell utilization versus high standard-cell utilization]

Page 21: VLSI-Physical Design- Tool Terminalogy

Floorplanning – Utilization

Utilization refers to the percentage of core area that is taken up by standard cells.

  A typical starting utilization might be 70%
  This can vary a lot depending on the design
High utilization can make it difficult to close a design
  Routing congestion
  Negative impact during optimization and legalization stages
Utilization changes should be examined after each stage of the flow
  Avoid having large increases after placement optimization
  Feedback should be given to front-end designers
  Topographical synthesis is now possible
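The utilization definition and the advice to watch it after each stage can be sketched in a few lines of Python. This is only an illustration: the function names, the stage names and the 5% jump threshold are assumptions, not values from the slides or from any tool.

```python
def utilization(std_cell_area, core_area):
    """Fraction of the core area taken up by standard cells."""
    if core_area <= 0:
        raise ValueError("core area must be positive")
    return std_cell_area / core_area


def flag_jumps(util_by_stage, max_jump=0.05):
    """Return (stage, increase) pairs where utilization grew by more
    than max_jump compared with the previous stage.

    util_by_stage is a dict in flow order; max_jump is an assumed,
    illustrative threshold.
    """
    stages = list(util_by_stage.items())
    jumps = []
    for (_, prev), (name, cur) in zip(stages, stages[1:]):
        if cur - prev > max_jump:
            jumps.append((name, cur - prev))
    return jumps


# Hypothetical numbers for one block
history = {"floorplan": 0.70, "placement": 0.72, "place_opt": 0.81, "cts": 0.83}
for stage, delta in flag_jumps(history):
    print(f"{stage}: utilization rose by {delta:.0%}")
```

A large jump after placement optimization, as in the made-up history above, is the kind of change that should be fed back to the front-end designers.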

Page 22: VLSI-Physical Design- Tool Terminalogy

Initialize Floorplan
Define globals (VDD1, VDD2, GND1, ….)
Define core area : (cells + utilization factor)
  Shape can be implied by a macro
Place IO (fixed, equidistant, ..)
Take macros and power domains into account already
[Figure labels: IO, core, [Analog] macro]
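As a rough illustration of "cells + utilization factor", the sketch below derives a rectangular core size from the total standard-cell area. The aspect-ratio parameter and the function name are assumptions made for the example; real floorplanners also account for macros, halos and row/site granularity.

```python
import math


def core_dimensions(std_cell_area, target_utilization, aspect_ratio=1.0):
    """Return (width, height) of a rectangular core.

    aspect_ratio is height / width (an assumed convention here).
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if aspect_ratio <= 0:
        raise ValueError("aspect ratio must be positive")
    core_area = std_cell_area / target_utilization
    width = math.sqrt(core_area / aspect_ratio)
    height = core_area / width
    return width, height


# 700,000 um^2 of cells at 70% utilization needs about 1,000,000 um^2 of core
w, h = core_dimensions(700_000.0, 0.70)
print(f"core is roughly {w:.0f} um x {h:.0f} um")
```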

Page 23: VLSI-Physical Design- Tool Terminalogy

IO Ring and Large Macro Placement
IO Ring is often decided by front-end designers, with input from physical design and packaging engineers.
When placing large macros we must consider impacts on routing, timing and power.
For wire-bond place power hungry macros away from the chip center.
[Figure label: possible routing congestion hotspots]

Page 24: VLSI-Physical Design- Tool Terminalogy

Flat Versus Hierarchical Design
What happens if the design is too big to be handled by the EDA tools?
Hierarchical Design
[Figure: the fullchip design (I/O pads, IP macro, Blk 1, Blk 2, Blk 3) is split into blocks / tiles; each block runs its own P&R flow, followed by fullchip timing & verification.]

Page 25: VLSI-Physical Design- Tool Terminalogy

Flat Versus Hierarchical Design
Hierarchical Design
Advantages
  Faster runtime, less memory needed for EDA tools
  Faster ECO turn-around time
  Ability to do design re-use
Disadvantages
  Much more difficult for fullchip timing closure (ILMs)
  More intensive design planning needed: feedthrough generation, repeater insertion, timing constraint budgeting.

Page 26: VLSI-Physical Design- Tool Terminalogy

Hierarchical Design : Specify Partitions / Plan Groups
Netlist must have partitions as top level modules.
Partitions generally sized according to a target initial utilization
  ~70% utilization, ~300k-700k instances
Channels or abutment
Rectilinear block shapes are possible
[Figure labels: abutment, channels, rectilinear blocks]

Page 27: VLSI-Physical Design- Tool Terminalogy

Hierarchical Design : Pin Assignment
Pin constraints include parameters such as
  Layers, spacing, size, overlap
  Net groups, pin guides
Pins can be assigned placement-based (flightlines) or route-based (trial route, boundary crossings).
Pin guides can be used to influence automatic pin placement of particular net groups
Pins at partition corners can make routing difficult
[Figure labels: partition, pin guide 1, pin guide 2]

Page 28: VLSI-Physical Design- Tool Terminalogy

Hierarchical Design : Timing Budgeting
Chip level constraints must be mapped correctly to block level constraints
The design must be placed, trial routed and have pins assigned before running budgeting
Block level constraints will be assigned input or output delays on I/O ports based on the estimated timing slack.
  set_input_delay 1.5 [get_ports IN1]
[Figure: a 1.5 ns arrival at port IN1 on the block boundary]
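One simple way to picture budgeting is to spread the estimated slack of a chip-level path over its segments in proportion to their estimated delays, then turn the time spent outside a block into that block's input or output delay. The sketch below does exactly that; the proportional rule, the segment names and the numbers are assumptions for illustration, not the algorithm of any particular tool.

```python
def budget_path(clock_period, segment_delays):
    """Give each segment its estimated delay plus a proportional
    share of the path slack (which may be negative)."""
    total = sum(segment_delays.values())
    slack = clock_period - total
    return {name: d + slack * d / total for name, d in segment_delays.items()}


def input_delay_for(block, budgets, order):
    """Budgeted time consumed before the signal reaches the block."""
    idx = order.index(block)
    return sum(budgets[name] for name in order[:idx])


def output_delay_for(block, budgets, order):
    """Budgeted time still needed after the signal leaves the block."""
    idx = order.index(block)
    return sum(budgets[name] for name in order[idx + 1:])


# Hypothetical flop-to-flop path: blk_a -> top-level wire -> blk_b, 3 ns clock
order = ["blk_a", "top_wire", "blk_b"]
estimates = {"blk_a": 0.5, "top_wire": 0.5, "blk_b": 1.0}  # ns
budgets = budget_path(3.0, estimates)

print(f"set_input_delay {input_delay_for('blk_b', budgets, order):.2f} [get_ports IN1]")
print(f"set_output_delay {output_delay_for('blk_a', budgets, order):.2f} [get_ports OUT1]")
```

With these made-up estimates, the input delay for blk_b comes out at 1.5 ns, the value used in the constraint on the slide. The port name OUT1 is hypothetical, and real constraints would normally also name a reference clock.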

Page 29: VLSI-Physical Design- Tool Terminalogy

Hierarchical Design : Timing Budgeting & Fullchip Timing Closure
Fullchip timing closure is typically a bottleneck for design cycles.
Block-level P&R flow does not emphasize io-to-flop, flop-to-io, io-to-io timing paths, because budgeted constraints are only estimates
Interface logic models (ILMs) can be used
  To speed up timing analysis runs when fullchip design is too large.
  Required clock and datapaths are preserved, net/cell names are identical
[Figure: original netlist versus interface logic model (ILM), both with inputs A, B, Clk and outputs X, Y]

Page 30: VLSI-Physical Design- Tool Terminalogy

Agenda

Introduction to design planning
Floorplanning
Power planning
  Intro to power issues in IC design
  Basic power grid creation
  Multi-voltage design & power gating
  Automated power grid design flows
Summary

Page 31: VLSI-Physical Design- Tool Terminalogy

Power Consumption and Reliability
[Figure labels: Dynamic Power; Static Power (Leakage Power); Average Power problem; Power density in the floorplan + design of the grid; IR-Drop / Voltage Drop: fail; Electromigration (EM): problem in the long run]

1 out of 5 chips fail due to excessive power consumption

Page 32: VLSI-Physical Design- Tool Terminalogy

Power Consumption and Reliability : IR-Drop
The drop in supply voltage over the length of the supply line
A resistance matrix of the power grid is constructed
The average current of each gate is considered
The matrix is solved for the current at each node, to determine the IR-drop.
[Figure labels: VDD pad, VDD]
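The three steps above (build the resistance matrix, apply average gate currents, solve) can be sketched on the smallest useful case: a single strap fed from one VDD pad, with gates tapping current at nodes along it. The strap topology, the helper names and the numbers are assumptions for illustration; a real grid is a 2-D mesh with far more nodes and a sparse solver.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def ir_drop_on_strap(vdd, seg_res, gate_currents):
    """IR drop at each node of a strap fed from a single pad.

    seg_res[k] is the resistance between node k and node k-1
    (or the pad for k = 0); gate_currents[k] is the average current
    drawn at node k.
    """
    n = len(gate_currents)
    g = [[0.0] * n for _ in range(n)]      # conductance matrix
    rhs = [-i for i in gate_currents]      # current drawn leaves the node
    for k in range(n):
        cond = 1.0 / seg_res[k]
        g[k][k] += cond
        if k == 0:
            rhs[0] += cond * vdd           # pad is held at VDD
        else:
            g[k - 1][k - 1] += cond
            g[k][k - 1] -= cond
            g[k - 1][k] -= cond
    volts = solve_linear(g, rhs)
    return [vdd - v for v in volts]


# Two nodes, 0.1 ohm per segment, 1 A drawn at each node, VDD = 1.0 V
print(ir_drop_on_strap(1.0, [0.1, 0.1], [1.0, 1.0]))
```

In this example the first segment carries both gate currents, so the node farthest from the pad sees the largest drop.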

Page 33: VLSI-Physical Design- Tool Terminalogy

Where does all the power go to?
Total Power = Core + I/O
  I/O
    Separate supply ring
    Often higher voltage
  Core = Standard Cells + Macros
    Standard cells: clock network
    Macros: fixed, no optimization

Page 34: VLSI-Physical Design- Tool Terminalogy

Agenda
Introduction to design planning
Floorplanning
Power planning
  Intro to power issues in IC design
  Basic power grid creation
  Multi-voltage design & power gating
  Automated power grid design flows
Summary

Page 35: VLSI-Physical Design- Tool Terminalogy

Power Grid Creation : Macro Placement
Blocks with the highest performance and highest power consumption
  Close to border power pads (IR drop)
  Away from each other (EM)

Page 36: VLSI-Physical Design- Tool Terminalogy

Agenda
Introduction to design planning
Floorplanning
Power planning
  Intro to power issues in IC design
  Basic power grid creation
  Multi-voltage design & power gating
  Automated power grid design flows
Summary

Page 37: VLSI-Physical Design- Tool Terminalogy

Agenda

Introduction to design planning
Floorplanning
Power planning
  Intro to power issues in IC design
  Basic power grid creation
  Multi-voltage design & power gating
  Automated power grid design flows
Summary

Page 38: VLSI-Physical Design- Tool Terminalogy

Automated Power Grid Design: PNS & PNA
Power grid creation has usually been done by hand using rules of thumb for widths and number of straps
  Analysis often done late in the design flow
  Grid is typically over-designed to prevent time-intensive power grid changes.
When incorporating advanced low-power strategies, there are too many variables to achieve an optimal result manually.
For more complex designs an automated strategy is preferred.
  e.g. Power Network Synthesis (PNS) and Power Network Analysis (PNA) from Synopsys
  Allows designers to anticipate effects of floorplanning

Page 39: VLSI-Physical Design- Tool Terminalogy

Power Network Analysis (PNA)
There are EDA tools that allow early power network analysis for designs in the early floorplanning stage.
  Not signoff quality, but good enough for initial design.
  e.g. Synopsys Power Network Analysis (PNA)
[Figure labels: VDD pad, VDD]

Page 40: VLSI-Physical Design- Tool Terminalogy

Power Network Synthesis: PNS – What?
Goal is to QUICKLY find minimum routing resource required to meet specified IR drop target
More power routing => easier to reach IR-drop target, but harder to route clock and signals with remaining tracks
[Figure labels: power pads, power straps (in red), power trunks, power rings]
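The trade-off above can be pictured as a search for the smallest number of straps that still meets the IR-drop target. The sketch below uses a deliberately crude model in which the total current splits evenly over identical straps, so the drop is total current times strap resistance divided by strap count. The model, the names and the numbers are all assumptions; this is not how PNS itself computes the grid.

```python
def estimated_drop(total_current, strap_res, n_straps):
    """Toy model: current splits evenly over n identical straps."""
    return total_current * strap_res / n_straps


def min_straps_for_target(total_current, strap_res, target_drop, max_straps=1000):
    """Smallest strap count whose estimated drop meets the target,
    or None if even max_straps is not enough."""
    for n in range(1, max_straps + 1):
        if estimated_drop(total_current, strap_res, n) <= target_drop:
            return n
    return None


# Hypothetical block: 2 A total, 0.5 ohm per strap, 120 mV IR-drop target
n = min_straps_for_target(2.0, 0.5, 0.12)
print(f"{n} straps, estimated drop {estimated_drop(2.0, 0.5, n) * 1000:.0f} mV")
```

Stopping at the first count that meets the target leaves as many tracks as possible for clock and signal routing.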

Page 41: VLSI-Physical Design- Tool Terminalogy

PNS : Running PNS Trials

Run PNS

Page 42: VLSI-Physical Design- Tool Terminalogy

PNS : Create Power Routing
After running trials, an optimal power grid can be chosen and the actual rails can be laid out.
Virtual rails => actual rails
  Outside main PNS : memory footprint + cpu time
  Many options : e.g. % via penetration, order of routing …
Check legal cell/pin placement (grid aligned ?)
Depending on the design phase
  What cells, nets and layers
  e.g. first macros and pads, then high voltage areas, …
Secondary PG ports on level shifters, isolation cells, retention registers
  Later, after placement, during routing : same as the follow pins for the normal vdd and gnd of the std cells.

Page 43: VLSI-Physical Design- Tool Terminalogy

PNS : Create Power Routing

Page 44: VLSI-Physical Design- Tool Terminalogy

Summary
The goal of design planning is to arrange the chip so that the “Place and Route” flow can converge quickly and easily.
Design experience is needed
Floorplan is driven by :
  Power
  Timing
  Congestion
  Minimum area
There is no single way to create a floorplan
  Flat – hierarchical
  Regions, position of the macros
  Order of placement: IO versus macros versus core
This phase can take a significant portion of the complete backend design time.
Early analysis of power grid is essential for avoiding major problems near the end of the design cycle.
Automated power grid tools may help reduce necessary safety margins.

Page 45: VLSI-Physical Design- Tool Terminalogy

Placement

Page 46: VLSI-Physical Design- Tool Terminalogy

Placement in the Flow
[Figure: front-end stages (design specification, logic design and verification, logic synthesis) hand a netlist to the back-end physical design stage (floorplanning, placement, routing), which also takes physical libraries and physical design constraints.]

Page 47: VLSI-Physical Design- Tool Terminalogy

Definition of Placement
Placement : Exact placement of the modules (modules can be gates, standard cells, macros…). The general goal is to minimize the total area and interconnect cost.
The quality of the attainable routing is highly determined by the placement.
Circuit placement becomes very critical in 90nm and below technologies.

Page 48: VLSI-Physical Design- Tool Terminalogy

Cost Function for Placement
Cost components : Methods of consideration
  Area, Wire length, Overlap : Traditional methods of Placement
  Timing : Timing-driven Placement
  Congestion : Congestion-driven Placement
  Clock : Clock Gating
  Power : Multivoltage and Multisupply Placement
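For the wire-length component, a widely used estimate is the half-perimeter wirelength (HPWL) of each net's bounding box. The slides do not say which estimate a given tool uses, so treat the sketch below, including its names and sample coordinates, as one illustrative choice.

```python
def hpwl(pins):
    """Half-perimeter of the bounding box around a net's pins."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_wirelength(nets, positions):
    """Sum of HPWL over all nets for a given cell placement.

    nets is a list of lists of cell names; positions maps a cell
    name to its (x, y) location.
    """
    return sum(hpwl([positions[cell] for cell in net]) for net in nets)


# Hypothetical three-cell placement with two nets
positions = {"u1": (0, 0), "u2": (3, 4), "u3": (1, 1)}
nets = [["u1", "u2"], ["u1", "u2", "u3"]]
print(total_wirelength(nets, positions))
```

A placer would evaluate a cost like this, possibly combined with area and overlap penalties, each time it considers moving or swapping cells.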

Page 49: VLSI-Physical Design- Tool Terminalogy

Placement Steps
Input information:
  Netlist
  Mapped and floorplanned design
  Logical and physical libraries
  Design constraints
Reading gate-level netlists from synthesis
Global placement
Detailed placement
Placement optimization
Output information:
  Physical layout information
  Cell placement locations
  Physical layout, timing, and technology information of reference libraries

Page 50: VLSI-Physical Design- Tool Terminalogy

Inputs for the Placement Tool
[Figure: the placement tool takes a gate-level netlist, design constraints, target libraries (logical and physical), reference design libraries (standard cell, macro cell), the floorplanned design and the technology file.]

Page 51: VLSI-Physical Design- Tool Terminalogy

Inside A Physical Library
Example .lef
  MACRO AN2D0
    CLASS CORE ;
    FOREIGN AN2D0 0.000 0.000 ;
    ORIGIN 0.000 0.000 ;
    SIZE 1.400 BY 2.520 ;
    SYMMETRY x y ;
    SITE core ;
    PIN Z
      ANTENNADIFFAREA 0.1680 ;
      DIRECTION OUTPUT ;
      PORT
        LAYER M1 ;
        RECT 1.300 0.640 1.330 1.675 ;
        RECT 1.190 0.640 1.300 1.780 ;
        RECT 1.140 0.640 1.190 0.900 ;
        RECT 1.140 1.520 1.190 1.780 ;
      END
    END Z
    PIN A2
      ANTENNAGATEAREA 0.0704 ;
      DIRECTION INPUT ;
      PORT
        LAYER M1 ;
        RECT 0.610 0.975 0.770 1.545 ;
      END
[Figure: abstract view of cell NAND_1 with pins A, B, Y, VDD and GND, annotated with: dimension (“bounding box”), pins (direction, layer and shape), blockage, symmetry (X, Y, or 90º), reference point (typically 0,0)]

Page 52: VLSI-Physical Design- Tool Terminalogy

Technology Information
For each tool, a specific set of files is required to provide details about the metal layers for the chosen process technology…
  Number and name designations for each layer/via
  Physical and electrical characteristics for each layer
  Dielectric constant
  Design rules for each layer (min spacing, min width, etc…)
  Units and precision for numerical values
Example filetypes
  .lefhdr, .tf -> contain layer and design rule information
Also, there are files that enable improved RC estimation that can be read by the placement engines.
  .captable, .tluplus -> store RC coefficients.

Page 53: VLSI-Physical Design- Tool Terminalogy

Physical Technology Data
The technology files contain design rule information that can be read by the tools.
For example, the spacing table gives the minimum spacing between adjacent wires on the same layer as a function of wire width and parallel run length.
Wire width and pitch are also described, as well as any more complex design rules for routing.
Example .lefhdr
  LAYER M1
    TYPE ROUTING ;
    DIRECTION HORIZONTAL ;
    OFFSET 0 ;
    PITCH 0.280 ;
    WIDTH 0.120 ;
    MAXWIDTH 12.000 ;
    AREA 0.058 ;
    MINENCLOSEDAREA 0.200 ;
    THICKNESS 0.240 ;
    HEIGHT 0.765 ;
    SPACINGTABLE
      PARALLELRUNLENGTH 0.00 0.52 1.50 4.50
      WIDTH 0.00 0.12 0.12 0.12 0.12
      WIDTH 0.30 0.12 0.17 0.17 0.17
      WIDTH 1.50 0.12 0.17 0.50 0.50
      WIDTH 4.50 0.12 0.17 0.50 1.50 ;
    MINIMUMCUT 2 WIDTH 0.42 ;
    MINIMUMCUT 4 WIDTH 0.98 FROMABOVE ;
    MINIMUMCUT 2 WIDTH 0.70 LENGTH 0.70 WITHIN 1.001 ;
    MINIMUMCUT 2 WIDTH 2.00 LENGTH 2.00 WITHIN 2.001 ;
    MINIMUMCUT 2 WIDTH 3.00 LENGTH 10.0 WITHIN 5.001 ;
    MINIMUMDENSITY 15 ;
    MAXIMUMDENSITY 70 ;
    DENSITYCHECKWINDOW 50 50 ;
    DENSITYCHECKSTEP 50 ;
    FILLACTIVESPACING 0.60 ;
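To make the spacing table concrete, the sketch below looks up the required spacing for a given wire width and parallel run length, using the M1 values from the example. The lookup picks the last row and column whose threshold does not exceed the given value. Whether a threshold is inclusive or exclusive at the exact boundary is an assumption here and should be checked against the LEF reference; the function name is also just illustrative.

```python
from bisect import bisect_right

# Values copied from the M1 SPACINGTABLE in the example above (microns)
PRL_THRESHOLDS = [0.00, 0.52, 1.50, 4.50]
WIDTH_THRESHOLDS = [0.00, 0.30, 1.50, 4.50]
SPACING = [
    [0.12, 0.12, 0.12, 0.12],  # WIDTH 0.00
    [0.12, 0.17, 0.17, 0.17],  # WIDTH 0.30
    [0.12, 0.17, 0.50, 0.50],  # WIDTH 1.50
    [0.12, 0.17, 0.50, 1.50],  # WIDTH 4.50
]


def required_spacing(width, parallel_run_length):
    """Minimum spacing for a wire of the given width running in
    parallel with a neighbour for the given length.

    Boundary handling (threshold treated as inclusive) is an assumption.
    """
    row = max(bisect_right(WIDTH_THRESHOLDS, width) - 1, 0)
    col = max(bisect_right(PRL_THRESHOLDS, parallel_run_length) - 1, 0)
    return SPACING[row][col]


print(required_spacing(0.12, 5.0))  # minimum-width wire
print(required_spacing(0.40, 1.0))  # wider wire, moderate run length
print(required_spacing(2.00, 2.0))  # fat wire, long run length
```

The lookup shows why wide wires running side by side for a long distance cost noticeably more routing area than minimum-width wires.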

Page 54: VLSI-Physical Design- Tool Terminalogy

Global and Detail Placement
Reading gate-level netlist from synthesis
Global Placement
Detailed Placement
Placement optimization

Page 55: VLSI-Physical Design- Tool Terminalogy

Global Placement
Standard cells are placed into groups such that the number of connections between groups is minimized.
This is solved through circuit partitioning.
[Figure: bad placement versus good placement]
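The quantity being minimized can be illustrated by counting how many nets span more than one group for a given assignment of cells to groups. The sketch only evaluates a partition; it does not search for one, and the cell and net names are made up.

```python
def cut_size(nets, group_of):
    """Number of nets whose cells land in more than one group."""
    return sum(1 for cells in nets if len({group_of[c] for c in cells}) > 1)


# Hypothetical ring of four cells: a-b, b-c, c-d, a-d
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]

bad = {"a": 0, "b": 1, "c": 0, "d": 1}    # connected cells split apart
good = {"a": 0, "b": 0, "c": 1, "d": 1}   # connected cells kept together

print(cut_size(nets, bad), cut_size(nets, good))
```

A partitioner tries cell moves or swaps and keeps those that lower this count while the groups stay balanced in area.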

Page 56: VLSI-Physical Design- Tool Terminalogy

Detail Placement : Coarse Placement
All the cells are placed in the approximate locations but they are not legally placed
No logic optimization is done

Page 57: VLSI-Physical Design- Tool Terminalogy

Detail Placement : Legalization
Legalization: Ensures that the final placement is legal before saving the design.
Legal placement of cells is not required for analyzing routing congestion at an early stage

Page 58: VLSI-Physical Design- Tool Terminalogy

Hard Macro Placement
Hard macros are placed during the floorplanning stage and then marked as FIXED for placement.
Typically, hard macros are placed near the sides of the core area.

Page 59: VLSI-Physical Design- Tool Terminalogy

Some Guidelines for Placement (2)
Avoid constrictive channels
Avoid many pins in the narrow channel. Rotate for pin accessibility
Use blockage to improve pin accessibility
[Figure labels: RAM 1 to RAM 8]

Page 60: VLSI-Physical Design- Tool Terminalogy

Review of Placement Cost Function
Cost components : Methods of consideration
  Area, Wire length, Overlap : Traditional methods of Placement
  Timing : Timing-driven Placement
  Congestion : Congestion-driven Placement
  Clock : Clock Gating
  Power : Multivoltage and Multisupply Placement

Page 61: VLSI-Physical Design- Tool Terminalogy

Timing Driven Placement
Critical paths are determined using static timing analysis (STA). Tool attempts to minimize wire length of critical paths to meet setup timing.
Net RCs are based on Virtual Routing (VR) estimates

Page 62: VLSI-Physical Design- Tool Terminalogy

Virtual Route / Trial Route
Virtual Route
  Manhattan geometry
  Horizontal – Vertical
  NO diagonal routing
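A virtual route can be pictured as the Manhattan distance between two pins, with net RC derived from that length. The sketch below uses the Elmore delay of an unloaded distributed RC line, 0.5·R·C. The per-micron resistance and capacitance are placeholder numbers, not values for any real process, and real estimators also include driver resistance and pin loads.

```python
def manhattan_length(p, q):
    """Horizontal plus vertical distance between two points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def wire_rc_delay(length_um, r_per_um, c_per_um):
    """Elmore delay of an unloaded distributed RC wire: 0.5 * R * C."""
    resistance = r_per_um * length_um
    capacitance = c_per_um * length_um
    return 0.5 * resistance * capacitance


# Hypothetical pins and per-unit values (1 ohm/um, 0.2 fF/um)
length = manhattan_length((0, 0), (30, 40))
delay = wire_rc_delay(length, r_per_um=1.0, c_per_um=0.2e-15)
print(f"{length} um, about {delay * 1e12:.2f} ps")
```

Because both R and C grow with length, the wire delay grows with the square of the length, which is why timing-driven placement works to shorten critical nets.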

Page 63: VLSI-Physical Design- Tool Terminalogy

Congestion Driven Placement: Detouring Routes
Issues with Congestion
If congestion is not too severe, the actual route can be detoured around the congested area
The detoured nets will have worse RC delay compared to the VR estimates
In highly congested areas, delay estimates during placement will be optimistic.
[Figure: congestion map with a congestion hot spot and a detour; legend ≥2 ≥3 ≥4 ≥5 ≥6 ≥7]

Page 64: VLSI-Physical Design- Tool Terminalogy

Congestion Map
By default, physical synthesis tools perform some congestion optimization which has a reasonable chance of providing acceptable congestion
Congestion driven placement increases the effort of algorithm to fix congestion
For better correlation to post-route, congestion-driven placement is enabled based on GR congestion map
[Figure annotations: causes high local utilization; gives uniform density; no need to use -congestion unnecessarily; on average the –congestion option increases runtime by 20%]

Page 65: VLSI-Physical Design- Tool Terminalogy

Congestion Driven Placement: Options
Some Congestion: using medium effort congestion-driven
  Max routing congestion > 90%
  Large hot spots
Bad Congestion: using high effort congestion-driven
  Max routing congestion >> 90%
  Very large hot spots
Congestion-driven might affect timing negatively but
  Post-route numbers will not create surprises
  Lower congestion will speed up the detailed router

Page 66: VLSI-Physical Design- Tool Terminalogy

Modifying Physical Constraints: Cell Density
Cell density can be up to 95% by default
Density level can also be applied to a specific region
Lower cell density in congested areas using –coordinate option
[Figure: region from (x1, y1) to (x2, y2)]

Page 67: VLSI-Physical Design- Tool Terminalogy

Modifying the Floorplan
Top-level ports
  Changing to a different metal layer
  Spreading them out, re-ordering or moving to other sides
Macro location or orientation
  Alignment of bus signal pins
  Increase of spacing between macros
Core aspect ratio and size
  Making block taller to add more horizontal routing resource
  Increase of the block size to reduce overall congestion
Power grid: Fixing any routed or non-preferred layers

Page 68: VLSI-Physical Design- Tool Terminalogy

Congestion Driven vs. Timing Driven Placement
In general there is a direct trade-off between congestion and timing
Timing-driven placement tries to shorten nets whereas congestion driven placement tries to spread cells, thus lengthening nets.
Iterative placement trials should be performed to find a balance between the different tool options/settings.

Page 69: VLSI-Physical Design- Tool Terminalogy

Timing and Congestion Optimization

Some things that can be done for timing optimization…
  Adding / deleting buffers
  Resizing gates
  Restructuring the netlist
  Swapping pins
  Moving instances
  Area recovery
Congestion optimization tries to reduce local congestion hotspots.
Generally if congestion exists after placement, little more can be done, if area recovery is not significant.

It is essential that sufficient area is available for any optimizations that are required

Page 70: VLSI-Physical Design- Tool Terminalogy

Clock Tree Synthesis

CTS

Page 71: VLSI-Physical Design- Tool Terminalogy

General Concept of Clock tree synthesis
[Figure: unbuffered clock tree versus buffered/balanced clock tree, each driven from CLK]
Skew
Power
Area (#buffers)
Slew rates
+ Minimize total insertion delay (latency)

Page 72: VLSI-Physical Design- Tool Terminalogy

Sources of skew
Not perfectly balanced clock tree
  Different levels of buffering
  Different cells
  Different load due to routing
  Different RC delays
Setting a skew constraint = 0 ps
  Makes no sense
  Insertion delay (latency) will increase
  Power consumption will increase
  Area will increase
  Rule of thumb : skew values : 100 – 150 ps for 90 nm
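Skew and insertion delay are both read off the clock arrival times at the sinks: skew is the spread between the latest and earliest arrival, and insertion delay (latency) is the latest arrival. The sketch below computes both and compares the skew with a target taken from the rule of thumb above. The sink names and arrival times are invented.

```python
def skew_and_latency(arrivals_ps):
    """Return (skew, max insertion delay) for clock arrival times
    at the sinks, all in picoseconds."""
    latest = max(arrivals_ps.values())
    earliest = min(arrivals_ps.values())
    return latest - earliest, latest


def meets_skew_target(arrivals_ps, target_ps=150):
    """True if the spread of arrival times is within the target."""
    skew, _ = skew_and_latency(arrivals_ps)
    return skew <= target_ps


# Hypothetical arrival times at three flip-flop clock pins
arrivals = {"ff_a/CK": 1200, "ff_b/CK": 1320, "ff_c/CK": 1250}
skew, latency = skew_and_latency(arrivals)
print(f"skew {skew} ps, insertion delay {latency} ps, ok: {meets_skew_target(arrivals)}")
```

Tightening the target forces the tool to pad the faster branches with extra buffers, which is where the added latency, power and area come from.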

Page 73: VLSI-Physical Design- Tool Terminalogy

Extra sources of clock skew : variability
Unwanted Skew Variations
Process variations in clock buffers
Power supply noise
Temperature variations
part of the OCV (lecture 15)
[Figure labels: gate length, gate width, L effective, tox; wire cross-section with T, S, W, H above a ground plane]

Page 74: VLSI-Physical Design- Tool Terminalogy

CTS in a design flow
[Figure: VLSI design steps (RTL, logic synthesis, physical synthesis (placement), CTS, routing) next to a simplified CTS design flow (clock tree sequentials at (x,y), clock gating, clock buffering, clock buffer sizing, clock net routing)]

Page 75: VLSI-Physical Design- Tool Terminalogy

Prepare the netlist for CTS
Analyze the clock trees
Check the clocks
Remove unwanted buffering

Page 76: VLSI-Physical Design- Tool Terminalogy

Remove unwanted buffering

Unnecessary pre-existing clock buffers/inverters

remove_clock_tree

Page 77: VLSI-Physical Design- Tool Terminalogy

CTS : Goals
Meeting the clock tree design rule constraints
  Maximum transition delay
  Maximum load capacitance
  Maximum fanout
  [Maximum buffer levels]
Constraints are upper bound goals. If constraints are not met, violations will be reported.
Meeting the clock tree targets
  Maximum skew
  Min/Max insertion delay (latency)
[Slide annotations: defaults; highest priority]

Page 78: VLSI-Physical Design- Tool Terminalogy

Effect of Clock Tree Synthesis on placement
Clock buffers added
Congestion may increase
Non clock cells may have been moved to less ideal locations
Inserting clock trees can introduce new timing and max tran/cap violations
“real” skew taken into account

Page 79: VLSI-Physical Design- Tool Terminalogy

Summary
Clock tree synthesis is one of the most important steps of IC design and can have a significant impact on timing, power, area, etc.
The clocking strategy has to be discussed with the frontend people before CTS is started
  Clocks identification
  Clock dependencies
  Clock balancing

Page 80: VLSI-Physical Design- Tool Terminalogy

Routing

Page 81: VLSI-Physical Design- Tool Terminalogy

Overview

Routing fundamentals / Advanced issues intro
The routing flow
Special topics for 90nm and below
Additional routing considerations
Summary

Page 82: VLSI-Physical Design- Tool Terminalogy

Physical Design Flow
Design/Constraints Import
Floorplanning
Placement
Clock Tree Synthesis
Routing
Post Route Optimization
Finishing

Page 83: VLSI-Physical Design- Tool Terminalogy

Routing Fundamentals

Goal is to realize the metal/copper connections between the pins of standard cells and macros

Input :
  placed design
  fixed number of metal/copper layers
Goal: routed design that is DRC clean and meets setup/hold timing
Consists of two phases
  1. Global route
  2. Detail route
[Figure labels: standard cell pin, horizontal routing tracks, vertical routing tracks]

Page 84: VLSI-Physical Design- Tool Terminalogy

Routing Fundamentals : Advanced Issues
Timing driven routing
  Timing budget for each net
  Minimize critical paths
Signal integrity aware : 90nm and below !!!!
  Minimize crosstalk
DFM / DFY
  DRC clean
  Rule based versus Model based

Page 85: VLSI-Physical Design- Tool Terminalogy

General Flow for Routing
Placement and CTS

Route Clock Nets

Global Route Signal Nets

Detail Route Signal Nets

Design for Manufacturing (DFM)

Geert Vanwijnsberghe - Affiliation

Page 86: VLSI-Physical Design- Tool Terminalogy

Global Route
[Figure: routing grid with X and Y axes; vertical routing capacity = 9 tracks, horizontal routing capacity = 9 tracks]

Page 87: VLSI-Physical Design- Tool Terminalogy

Global Route
Input:
  Cell and macro placement
  Routing channel capacity per layer / per direction
Goal: Perform fast, coarse grid routing through global routing cells (GCells) while considering the following:
  Wire length
  Congestion
  Timing
  Noise / SI
Often used by placement engines to predict congestion in the form of a “trial route” or “virtual route”

Page 88: VLSI-Physical Design- Tool Terminalogy

Global Route
Assigns nets to specific metal layers and global routing cells (Gcells)
Tries to avoid congested Gcells while minimizing detours
  Congestion exists when more tracks are needed than available
  Detours increase wire length (delay)
Also avoids P/G (rings/straps/rails) and routing blockages
[Figure: a global route detouring around a congested area that the virtual route crosses; X and Y axes]
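The congestion definition above, more tracks needed than available, maps directly onto an overflow number per GCell. The sketch below computes it for one routing direction with a uniform capacity of 9 tracks, matching the earlier figure. The GCell coordinates and demand values are made up.

```python
def gcell_overflow(demand, capacity):
    """Tracks needed beyond capacity for each GCell (0 if it fits).

    demand maps a GCell coordinate to the number of tracks the
    global routes crossing it would need in one direction.
    """
    return {cell: max(0, used - capacity) for cell, used in demand.items()}


def congested_cells(demand, capacity):
    """GCells where demand exceeds capacity, worst first."""
    over = gcell_overflow(demand, capacity)
    hot = [(cell, extra) for cell, extra in over.items() if extra > 0]
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Hypothetical horizontal track demand on a few GCells, capacity 9 tracks
demand = {(0, 0): 7, (1, 0): 11, (1, 1): 9, (2, 1): 10}
print(congested_cells(demand, capacity=9))
```

A global router would reroute some of the nets crossing the worst cells through neighbouring GCells, accepting the longer detour in exchange for removing the overflow.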

Page 89: VLSI-Physical Design- Tool Terminalogy

Global Route
[Figure: preroute view versus global route view]

Page 90: VLSI-Physical Design- Tool Terminalogy

Detail Route
Using global route plan, within each global route cell
  Assign nets to tracks
  Lay down wires
  Connect pins to corresponding nets
  Solve DRC violations
  Reduce cross couple cap
  Apply special routing rules

Page 91: VLSI-Physical Design- Tool Terminalogy

Detail Route: Track Assignment

For nets that traverse multiple GCells:
  Assigns each net to a specific track and lays down the actual metal traces
  Makes long, straight traces and reduces the number of vias

[Figure: preroute view; TA metal traces; a jog reduces via count]
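As a rough illustration of assigning nets to tracks, the sketch below uses the classic left-edge algorithm from channel routing. This is a swapped-in textbook technique, not necessarily what any commercial router does, and it ignores layers, vias and design rules. The net names and spans are made up.

```python
def assign_tracks(segments):
    """Greedy left-edge track assignment.

    segments: dict of net name -> (start, end) span along the routing direction.
    Returns a dict of net name -> track index such that spans sharing a track
    never overlap.
    """
    track_end = []   # rightmost occupied coordinate on each track
    assignment = {}
    # Process spans from left to right by their start coordinate.
    for net, (start, end) in sorted(segments.items(), key=lambda kv: kv[1][0]):
        for track, last_end in enumerate(track_end):
            if start > last_end:          # span fits after the last one on this track
                track_end[track] = end
                assignment[net] = track
                break
        else:
            track_end.append(end)         # no existing track is free: open a new one
            assignment[net] = len(track_end) - 1
    return assignment

tracks = assign_tracks({"n1": (0, 4), "n2": (2, 6), "n3": (5, 9), "n4": (7, 10)})
```

Each net stays on a single track over its whole span, which matches the goal of long, straight traces with few vias.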

Page 92: VLSI-Physical Design- Tool Terminalogy

Detail Route: Solve DRC Violations

Detail route boxes
Solve shorts

[Figure: spacing violation types: notch spacing, thin & fat spacing, min spacing]

Page 93: VLSI-Physical Design- Tool Terminalogy

Detail Route: Analysis of Routing DRC Errors

Page 94: VLSI-Physical Design- Tool Terminalogy

Timing Driven Routing

At 90 nm, net delay becomes significant
Quality of route can affect timing

Optimize critical paths
  Route some nets first
    Most routing freedom at start
    Use shortest paths possible
  Net weights
    Order of routing (priorities, e.g. default: clocks 50, others 2)
  Wire widening
    Reduce resistance
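The idea of routing high-priority nets first can be sketched as a simple sort. The weights 50 and 2 are the defaults quoted on the slide; the tuple layout and the tie-break on timing slack are assumptions made for this example only.

```python
# Default priorities quoted on the slide: clocks 50, all other nets 2.
DEFAULT_WEIGHTS = {"clock": 50, "other": 2}

def routing_order(nets, weights=DEFAULT_WEIGHTS):
    """Return nets sorted so that higher-weight nets are routed first.

    nets: list of (name, kind, slack_ns) tuples (hypothetical layout).
    Within one weight class, the net with the smallest slack goes first;
    this tie-break is an assumption, not something the slide specifies.
    """
    def key(net):
        _name, kind, slack_ns = net
        weight = weights.get(kind, weights["other"])
        return (-weight, slack_ns)
    return sorted(nets, key=key)

order = routing_order([
    ("data_a", "signal", 0.4),
    ("clk", "clock", 1.0),
    ("data_b", "signal", -0.1),
])
```

Nets routed early see the most free tracks, so they are the most likely to get the shortest path.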

Page 95: VLSI-Physical Design- Tool Terminalogy

What is Signal Integrity or SI? (1)

Signal delay caused by crosstalk noise
  Possible in 2 directions: push-out, pull-down

[Figure: aggressor (net 1) switching next to victim (net 2); the victim transition is either delayed or sped up]

Page 96: VLSI-Physical Design- Tool Terminalogy

What is SI? (2)

Glitch caused by crosstalk noise
  Extra clock cycle!
  Functional failure

[Figure: aggressor and victim nets at a flip-flop (D, Q, Clk, Vdd)]

Page 97: VLSI-Physical Design- Tool Terminalogy

Crosstalk Prevention: Design Optimization

Noise depends on
  Coupling capacitance
  Total net capacitance
  Strength of the driver (Rd of the victim net)

Design optimization
  Increase drive strength, often easier (only local effect)
  Buffer long nets
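The three dependencies listed above can be made concrete with a first-order approximation of the kind found in CMOS design textbooks: a capacitive divider scaled by the ratio of aggressor to victim time constants. The function and parameter names are hypothetical, and the model is only a rough estimate, not a substitute for an SI analysis tool.

```python
def victim_noise(vdd, c_adj, c_gnd_victim, c_gnd_aggr, r_aggr, r_victim):
    """First-order estimate of the peak noise coupled onto a driven victim net.

    vdd:          aggressor voltage swing
    c_adj:        coupling capacitance between aggressor and victim
    c_gnd_victim: victim capacitance to ground
    c_gnd_aggr:   aggressor capacitance to ground
    r_aggr:       aggressor driver resistance
    r_victim:     victim driver resistance (Rd of the victim net)
    Units only need to be consistent (for example ohms and farads).
    """
    tau_aggr = r_aggr * (c_gnd_aggr + c_adj)
    tau_victim = r_victim * (c_gnd_victim + c_adj)
    k = tau_aggr / tau_victim
    # Capacitive divider, reduced by how well the victim driver holds its net.
    return vdd * c_adj / (c_gnd_victim + c_adj) / (1.0 + k)

weak_driver = victim_noise(1.0, 1.0, 3.0, 3.0, r_aggr=1.0, r_victim=1.0)
strong_driver = victim_noise(1.0, 1.0, 3.0, 3.0, r_aggr=1.0, r_victim=0.5)
```

In this model a stronger victim driver (lower `r_victim`) lowers the estimated noise, which is consistent with increasing drive strength as a fix.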

Page 98: VLSI-Physical Design- Tool Terminalogy

Crosstalk Prevention: Routing

Routing solution
  Limit length of parallel nets (H&V)
  Wire spreading (skip track - clocks)
  Shield special nets
  Coupling free routing
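Limiting the length of parallel nets needs a way to measure how far two wires run side by side. The sketch below does this for two segments on neighbouring tracks in the same direction; the one-dimensional span representation and the function name are assumptions made for the example.

```python
def parallel_run_length(seg_a, seg_b):
    """Length over which two same-direction wire segments overlap.

    seg_a, seg_b: (start, end) coordinates along the routing direction,
    given in either order.
    """
    a0, a1 = sorted(seg_a)
    b0, b1 = sorted(seg_b)
    return max(0, min(a1, b1) - max(a0, b0))

overlap = parallel_run_length((0, 10), (4, 20))
```

A router could compare this value against a maximum allowed parallel length and, when it is exceeded, spread the wires or insert a shield.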

Page 99: VLSI-Physical Design- Tool Terminalogy

Crosstalk Prevention: Reduce Cross Coupling Cap

Spacing
Shielding
Same layer (H)
Adjacent layers (V)
Net ordering

[Figure: critical nets protected by extra space (spacing) and by grounded shields (shielding)]

Page 100: VLSI-Physical Design- Tool Terminalogy

Effect of Floorplanning on Routing Congestion

For hierarchical designs, good pin placement is essential to preventing routing congestion.
Can use pin guides during partitioning

Page 101: VLSI-Physical Design- Tool Terminalogy

Routing around blockages and over macros

By default the routing tool will:
  Route over macros
  Not route where there is a routing blockage
  Not route through a narrow channel in the non-preferred routing direction

M4 has a horizontal routing channel but its preferred routing direction is vertical
  The preferred routing direction needs to be changed

[Figure: macros with M1-M4 and M1-M3 routing blockages]

Page 102: VLSI-Physical Design- Tool Terminalogy

Clock Tree Routing

For SI prevention we generally want to route our clocks with extra spacing.
Global H-trees are often routed manually before placement
  H-tree nets may be routed with wide metal and shielding.

[Figure: wide-metal H-tree net with grounded shields]

Page 103: VLSI-Physical Design- Tool Terminalogy

Post Route Clock Tree Optimization (CTO)

Improve the skew on clock nets

[Flowchart: detail routed design -> Skew OK? -> yes: detail routed design; no: postroute CTO -> ECO route -> Skew OK?]
[Figure: before CTO, a short path; after CTO, increased delay on that path]
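The check-and-fix loop in the flowchart can be sketched as follows. The sketch assumes skew is the spread between the slowest and fastest clock sink and that the only fix available is adding a fixed delay step to the shortest path; the function name, the data layout and the step size are all hypothetical.

```python
def postroute_cto(sink_delays, target_skew, step, max_iters=100):
    """Iteratively pad the fastest clock path until the skew target is met.

    sink_delays: dict of sink name -> clock insertion delay
    target_skew: largest acceptable (max - min) insertion delay
    step:        delay added per fix, e.g. one buffer (assumed constant)
    Returns a new dict of adjusted delays; the input is not modified.
    """
    delays = dict(sink_delays)
    for _ in range(max_iters):
        skew = max(delays.values()) - min(delays.values())
        if skew <= target_skew:              # "Skew OK?" -> yes
            break
        fastest = min(delays, key=delays.get)
        delays[fastest] += step              # postroute CTO: slow down the short path
        # A real flow would run an ECO route here to reconnect the changed nets.
    return delays

balanced = postroute_cto({"a": 100, "b": 80, "c": 95}, target_skew=10, step=5)
```

The iteration cap guards against a step size that is too coarse to ever reach the target.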

Page 104: VLSI-Physical Design- Tool Terminalogy

Options for CPU effort

# processors
  Routing in parallel on # processors
  Superthreading, multithreading
  Some routers are better at threading than others

# iterations for detail route
  # of iteration steps done to get a DRC-free design

Page 105: VLSI-Physical Design- Tool Terminalogy

Summary

Starting from 90 nm technologies:

Timing Driven Route
  Net delay is becoming more of a factor

SI Aware Route
  Small geometries make SI timing closure much more difficult

DFM / DFY
  Now a crucial part of the routing flow

DRC
  Number and complexity of DRC rules has increased dramatically