EE-382M-8 VLSI–II Early Design Planning: Back End

Source: users.ece.utexas.edu/~mcdermot/vlsi-2/Lecture_3.pdf · 2008-10-21

The University of Texas at Austin
EE 382M-8 VLSI-2

EE-382M-8 VLSI–II
Early Design Planning: Back End

Mark McDermott

Backend EDP Flow

• The project activities will include:
  – Determining the standard cell and custom library elements needed to completely do the design with APR tools.
  – Detailed floor-plan of the block level components.
  – A reasonably detailed top-level floorplan using the cluster abstracts.
  – Approximate clock routing at the top level
  – Approximate Power-GND routing at the top level

EDP and Layout in the Design Flow

[Figure: design-flow diagram. Stage labels: Concept, Architecture, uArchitecture, Logic, Circuits, EDP, Layout, Si Debug, Production. Phase labels: Technology Readiness, Front End Development, Backend Design, Execution, Silicon Ramp.]

EDP encompasses planning from architecture to the layout.

Standard Cell Library Effort

• Will be using a very minimal standard cell library for the project: ~80+ cells
  – Basic logic gates and buffers
  – 1 set-reset flip-flop
• “CMOS65_SubVt.lib” file was derived using a scaled 65nm .lib file
  – Need to validate the scaled numbers with HSPICE simulations.
  – Need to validate power spreadsheet numbers using HSPICE:
    • S-D leakage currents
    • Intrinsic power
  – Need to validate area spreadsheet numbers

Block Floorplanning Effort

• Objectives:
  – Minimize area
  – Determine best shape of the block
  – Minimize total wire length
• Each team will do a detailed floorplan of their respective blocks. The output will be a spreadsheet analysis showing the contribution from each of the following:
  – Power grid
  – Clocking
  – Signal Routing
  – Datapath area
  – Random logic area
  – White space
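The spreadsheet analysis described above can be mocked up in a few lines of Python. This is only a sketch: the category names come from the slide, while the area numbers and the function name `area_breakdown` are made up for illustration.

```python
def area_breakdown(areas_um2):
    """Return each category's share of the total block area, in percent."""
    total = sum(areas_um2.values())
    if total <= 0:
        raise ValueError("total area must be positive")
    return {name: 100.0 * area / total for name, area in areas_um2.items()}


# Hypothetical areas in square microns, for illustration only.
example_block = {
    "Power grid": 1200.0,
    "Clocking": 800.0,
    "Signal routing": 2500.0,
    "Datapath area": 4000.0,
    "Random logic area": 1000.0,
    "White space": 500.0,
}

for name, pct in area_breakdown(example_block).items():
    print(f"{name:<18} {pct:5.1f} %")
```

Tracking white space as its own row makes it easy to see how much slack the block shape leaves.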

Integration Effort

• The integration team will be responsible for:
  – Doing a floor plan of the top level of the chip
  – Characterizing the top-level routing delays and determining the assertions and constraints for each cluster. They will be working with each cluster to optimize the constraints.
  – Designing the clock routing structure
  – Determining the clock generation implementation (block diagrams)
  – Determining the clock regeneration circuitry (block diagrams)
  – Determining the reset logic
  – Designing the power grid
  – Determining the power estimation for the global clock and signal routing
  – Generating the power budget for each cluster
  – Generating the area budget for each cluster

Layout Implementation Options

SPARC-T1

Layout Density & Die Size = Performance

• Higher density layout leads to smaller block sizes

• Smaller block sizes lead to shorter wires

• Shorter wires can lead to higher frequency

• Shorter wires can also lead to higher IPC by requiring fewer transmission pipe stages

[Figure: Schematic and Floorplan views for Layout #1 and Layout #2. Block labels: A, B, B’, C.]

The layout of Block B affects the timing of the path from A to C.
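A first-order model helps show why shorter wires can raise frequency: the Elmore delay of an unbuffered distributed RC wire grows with the square of its length. The sketch below uses made-up per-micron resistance and capacitance values (not numbers from the course's 65nm process), and `wire_delay_ps` is a hypothetical helper name.

```python
def wire_delay_ps(length_um, r_ohm_per_um=1.0, c_ff_per_um=0.2):
    """Elmore delay of an unbuffered distributed RC wire, in picoseconds.

    delay = 0.5 * R_total * C_total = 0.5 * r * c * L^2
    The default r and c values are illustrative assumptions only.
    """
    r_total = r_ohm_per_um * length_um            # ohms
    c_total = c_ff_per_um * length_um * 1e-15     # farads
    return 0.5 * r_total * c_total * 1e12         # seconds -> picoseconds


# Hypothetical route lengths for the same path in two different layouts.
long_route = wire_delay_ps(2000.0)
short_route = wire_delay_ps(1000.0)
print(f"long: {long_route:.1f} ps, short: {short_route:.1f} ps")
```

Under this model, halving the route length cuts the wire delay to one quarter, which is why a denser neighboring block can pay off on paths that merely pass by it.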

Layout Implementation Options

• Synthesis – Random Logic Macro (RLM)
  – Cell layout comes from a shared cell library
  – Automated cell selection and placement
  – Automated routing between cells
• Structured Custom (SC/SDP)
  – Cell layout comes from a shared cell library
  – Manual cell selection and placement
  – Automated routing between cells
• Custom Design (CD)
  – Cell layout is unique for each application
  – Manual cell selection and placement
  – Manual routing between cells

[Arrow alongside the list: Increasing Design Effort (And Density), from RLM toward CD]

Layout Implementation Options

                      CD    SC    RLM
RTL Coding            M     M     M
Logic Minimization    M     M     A
Cell Placement        M     M     A
Device Sizing         M     A     A
Layout                M     A     A

A = Automatic, M = Manual

               CD       SC       RLM
Timing         Best     Better   Worst
Density        Best     Better   Worst
Design Time    Worst    Better   Best

• RLM saves time in circuit design and layout

• SC saves time in layout.

• RLM and SDP make revisions easier.

Datapath and Block Floorplanning Procedures

MIPS R10K

Datapath and Block Floorplanning Procedure

• Step 1 - Identify feedthrus for RLM or SC/DP block
• Step 2 - Look for opportunities for track sharing
• Step 3 - Define the bitpitch of the block
• Step 4 - Review the metal plan within the cell
• Step 5 - Review and plan the clock routing and placement
• Step 6 - Plan the critical cell placement locations
• Step 7 - Estimate the area of the cells and the block
• Step 8 - Review the power grid

Feed-through or Over-the-cell (OTC) Routes

• Metal tracks routed over RLM, Datapath or custom block
• The block is neither the driver nor a receiver of the signals
• Feedthrus use up metal tracks, which impacts the internal signals of the block
• Carefully review datapath connectivity to account for them

[Figure: datapath stack with blocks Bypass, ALU 0, ALU 1, and ALU 2, Sources and Results buses, and Driver and Receiver markers; some routes are labeled “Feedthrus for ALU0”.]
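The connectivity review can be partly mechanized. The sketch below assumes a simple linear stack in which a net runs straight between its outermost endpoints; the block names follow the figure labels, while the net names, the connectivity, and the function `count_feedthrus` are hypothetical.

```python
def count_feedthrus(stack, nets):
    """Count, per block, the nets that cross it without connecting to it.

    stack: block names in physical order along the datapath.
    nets:  dict of net name -> set of blocks the net connects to
           (driver and receivers together).
    A net is a feedthru for a block when the block lies strictly between
    the net's outermost endpoints but is not itself an endpoint.
    """
    position = {block: i for i, block in enumerate(stack)}
    feedthrus = {block: 0 for block in stack}
    for endpoints in nets.values():
        indices = [position[b] for b in endpoints]
        lo, hi = min(indices), max(indices)
        for block in stack[lo + 1:hi]:
            if block not in endpoints:
                feedthrus[block] += 1
    return feedthrus


# Hypothetical connectivity, for illustration only.
stack = ["Bypass", "ALU 0", "ALU 1", "ALU 2"]
nets = {
    "src_to_alu0": {"Bypass", "ALU 0"},      # adjacent blocks, no feedthru
    "src_to_alu1": {"Bypass", "ALU 1"},      # crosses ALU 0
    "src_to_alu2": {"Bypass", "ALU 2"},      # crosses ALU 0 and ALU 1
    "res_alu2_to_byp": {"ALU 2", "Bypass"},  # crosses ALU 0 and ALU 1
}
print(count_feedthrus(stack, nets))
```

Each entry here stands for one bus, so the track cost per block is the feedthru count multiplied by the bus width.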

Datapath and Block Floorplanning Procedure

• Step 1 - Identify feedthrus
• Step 2 - Look for opportunities for track sharing
• Step 3 - Define the bitpitch of the block
• Step 4 - Review the metal plan within the cell
• Step 5 - Review and plan the clock routing and placement
• Step 6 - Plan the critical cell placement locations
• Step 7 - Estimate the area of the cells and the block
• Step 8 - Review the power grid

Step 2: Track Sharing

• Minimizes the number of unique tracks in layout by opportunistically sharing tracks where possible

• Often allows for the smallest possible bitpitch

• Allows for metal layers to be more efficiently utilized

• Can help improve performance by shortening distances

• Should always be explored to improve layout efficiency and performance
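Track sharing can be illustrated with a greedy left-edge assignment, a technique borrowed from channel routing; the slides do not name a specific method, so treat this as one possible sketch. Net names, spans, and the function `assign_tracks` are hypothetical.

```python
def assign_tracks(nets):
    """Greedy left-edge track assignment.

    nets: dict of net name -> (start, end) span along the track direction,
          with start < end. Two nets may share a track only if their spans
          do not overlap (touching endpoints count as overlapping here).
    Returns a list of tracks, each a list of net names in placement order.
    """
    tracks = []      # net names on each track
    track_end = []   # rightmost occupied coordinate on each track
    for name, (start, end) in sorted(nets.items(), key=lambda kv: kv[1][0]):
        for i, last_end in enumerate(track_end):
            if start > last_end:       # fits after the last net on track i
                tracks[i].append(name)
                track_end[i] = end
                break
        else:
            tracks.append([name])      # no existing track fits: open a new one
            track_end.append(end)
    return tracks


# Hypothetical spans in microns along the datapath.
spans = {"a": (0, 40), "b": (50, 100), "c": (20, 70), "d": (80, 100)}
print(assign_tracks(spans))
```

In this example four nets fit in two tracks instead of four, which is the kind of saving that can shrink the bitpitch.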

Step 2: Track Sharing

First, check outside your block to see if there are any candidates for track sharing.

[Figure: datapath stack with blocks Bypass$, ALU 0, ALU 1, and ALU 2, Sources and Results buses, and Driver and Receiver markers.]

Step 2: Track Sharing

Next, check inside your block to see if there are any candidates for track sharing.

[Figure: buses LRBL<11:0>, RRBL<11:0>, IE_BYC_DATA<11:0>, and IE_RF_DATA<11:0>, with routes marked Metal 2 and Metal 4.]

Step 2: Track Sharing Example

Step 2: Track Sharing Example

Datapath and Block Floorplanning Procedure

• Step 1 - Identify feedthrus
• Step 2 - Look for opportunities for track sharing
• Step 3 - Define the bitpitch of the block
• Step 4 - Review the metal plan within the cell
• Step 5 - Review and plan the clock routing and placement
• Step 6 - Plan the critical cell placement locations
• Step 7 - Estimate the area of the cells and the block
• Step 8 - Review the power grid

Bit Pitch Defining Width of Chip

AMD K5

Step 3: Define the Bitpitch

• Fixed cell width chosen to allow easy assembly

• Most often determined by metal usage within the datapath

• Integration efficiency would prefer one bitpitch per project

• Architectures lend themselves to more unique bit pitches

[Figure: bit slices A<4> through A<0>, with the Bitpitch dimension marked on one slice. Track labels within a slice: Vdd, Sig0, Sig1, Sig2, Sig3, Sig4, Sig5, Vss, shown for bits <4> and <1>.]
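When metal usage sets the bitpitch, a first estimate is the number of tracks per bit times the track pitch. The sketch below uses the eight tracks named in the figure labels (six signals plus Vdd and Vss); the 0.5 um track pitch, the assumption that rails take one track each, and both function names are illustrative only.

```python
def bitpitch_um(signal_tracks, power_tracks, track_pitch_um):
    """Metal-limited bitpitch: total tracks per bit times the track pitch."""
    return (signal_tracks + power_tracks) * track_pitch_um


def stack_height_um(bits, pitch_um):
    """Height of a datapath built by stacking `bits` identical bit slices."""
    return bits * pitch_um


# Six signal tracks plus Vdd and Vss per bit, as in the figure labels.
# The 0.5 um track pitch is a made-up value, not a process number.
pitch = bitpitch_um(signal_tracks=6, power_tracks=2, track_pitch_um=0.5)
height = stack_height_um(bits=32, pitch_um=pitch)
print(f"bitpitch = {pitch} um, 32-bit stack = {height} um")
```

Because the stack height scales linearly with the bitpitch, each extra track per bit is paid for once per bit of datapath width.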

Step 3: Define the Bitpitch

Ensure all blocks in a datapath stack follow the same bitpitch.

[Figure: datapath stack with blocks Bypass Cache, Integer Register File, ALU 0, ALU 1, Arith Flags, AGEN-LD / STA, Shifter, WB Mux, Bit Ops, and System Uops; dimension labels Bitpitch #1, Bitpitch #2, and X µ.]

Bit Pitch Example: 3:2 Adder Bit Cell

[Figure: layout of a 3:2 adder bit cell. Bitpitch: 7.56u. Layer labels: M1, M4, M3 & M1.]

Bit Pitch Example: 4 Bit Cells stacked

[Figure: four bit cells stacked at a bitpitch of 7.56u, labeled BIT - 0, BIT - 1, BIT - 2, BIT - 4.]

Bit Pitch Example: Tiled Datapath

Bit Pitch Example: Swizzle

To avoid swizzle channels, don’t mix and match bit pitches.

As buses get wider and the number of tracks per bit gets higher, the cost of swizzle channels grows.

[Figure label: Swizzle Channel]

Step 3: Define the Bitpitch

• Wider bit pitches allow more upper level metal usage

• Narrower bit pitches allow shorter routes for orthogonal signals

• Balancing these conflicting objectives can be difficult

• Understand your local constraints and be aware of the tradeoffs

Datapath and Block Floorplanning Procedure

• Step 1 - Identify feedthrus
• Step 2 - Look for opportunities for track sharing
• Step 3 - Define the bitpitch of the block
• Step 4 - Review the metal plan within the cell
• Step 5 - Review and plan the clock routing and placement
• Step 6 - Plan the critical cell placement locations
• Step 7 - Estimate the area of the cells and the block
• Step 8 - Review the power grid

Metal Planning

• Metal layer, width, spacing and shielding are negotiable
  – “Negotiable” means you have to plead your case to the integration leaders
• All of these impose a physical constraint for layout
• For your first attempt at convergence
  – M1, M2: Local routing
  – M3, M4, M5, M6: Data and control
  – M7, M8: Power, Ground, Clock, Reset, etc.
  – Assume all nets are routed in M1 & M2 within your block
  – Assume your only shielding is on clocks and reset
  – Assume the routes are minimum
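The first-pass layer assignment above is easy to capture as a lookup table that planning scripts can share. This is a minimal sketch; the dictionary name `METAL_PLAN` and the helper `layers_for` are hypothetical, and only the layer-to-role mapping comes from the slide.

```python
# First-pass metal plan, one entry per layer, in stack order.
METAL_PLAN = {
    "M1": "local routing",
    "M2": "local routing",
    "M3": "data and control",
    "M4": "data and control",
    "M5": "data and control",
    "M6": "data and control",
    "M7": "power, ground, clock, reset",
    "M8": "power, ground, clock, reset",
}


def layers_for(use):
    """Return the layers whose planned role mentions `use` (case-insensitive)."""
    wanted = use.lower()
    return [layer for layer, role in METAL_PLAN.items() if wanted in role]


print(layers_for("clock"))
print(layers_for("data"))
```

Keeping the plan in one table makes a negotiated exception visible as an explicit edit rather than a silent change in someone's layout.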

Metal Flow Planning

Avoid bi-directional dataflow

[Figure: BAD and GOOD examples of Data and Cntl flow direction.]

Shielding

• Intentionally routing signals to control the effective line-to-line capacitance seen during switching.

• Requires designers to constrain the physical assembly done by routing tools or physical design specialists (PDSs).

• Falls into one of three categories:
  – Physical shielding - signals are routed next to a power rail
  – Logical shielding - signals are routed by logically related signals
  – Temporal shielding - signals are routed by temporally distinct signals

Miller Coupling Factor

[Figure: five switching scenarios for three adjacent wires A, B, and C.]

Neighbor activity          MCF
Both against               2.0
One against, one quiet     1.5
Both quiet                 1.0
One with, one quiet        0.5
Both with                  0.0
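One way to reproduce the five MCF values above is to give each neighbor a factor of 2 (switching against), 1 (quiet), or 0 (switching with) and average the two. This per-neighbor model is an assumption that happens to match the slide's numbers; the function names and the capacitance split in `effective_cap_ff` are hypothetical.

```python
AGAINST, QUIET, WITH = "against", "quiet", "with"

# Per-neighbor multiplier on coupling capacitance, relative to a quiet neighbor.
_NEIGHBOR_FACTOR = {AGAINST: 2.0, QUIET: 1.0, WITH: 0.0}


def miller_coupling_factor(left, right):
    """Average Miller factor for a wire with one neighbor on each side."""
    return (_NEIGHBOR_FACTOR[left] + _NEIGHBOR_FACTOR[right]) / 2.0


def effective_cap_ff(c_ground_ff, c_couple_total_ff, left, right):
    """Effective switched capacitance: ground cap plus MCF-scaled coupling cap.

    c_couple_total_ff is the total coupling capacitance to both neighbors.
    """
    return c_ground_ff + miller_coupling_factor(left, right) * c_couple_total_ff


for pair in [(AGAINST, AGAINST), (AGAINST, QUIET), (QUIET, QUIET),
             (WITH, QUIET), (WITH, WITH)]:
    print(pair, miller_coupling_factor(*pair))
```

A power rail behaves like a permanently quiet neighbor in this model, which is consistent with a fully shielded wire having an MCF of 1.0 in every case.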

No Shielding

• Signals are routed next to any neighboring signals
• Neighbors can slow down (max delay) or speed up (min delay) signal transitions through line-to-line coupling
• Variation can create design problems
• Most signals will not be shielded

[Figure: adjacent wires Sig A, Sig B, Sig C with waveforms for A, B, and C.]

No Shield: Max MCF 2.0, Min MCF 0.0

Physical Shielding

• Signals are routed next to at least one power rail
• Helps both min delay and max delay
• Can be expensive in terms of metal usage
• Typically limited to most critical nets and clocks

[Figure: wire cross-sections for Half Shield and Full Shield; track labels Vss, Sig A, Sig B, Vss, Sig A, Vss.]

              Max MCF    Min MCF
Half Shield   1.5        0.5
Full Shield   1.0        1.0

Logical Shielding

• Signals are routed next to mutually exclusive neighbors
• Also helps min delay and max delay
• Results comparable to physical shielding, at lower cost
• Encouraged in mux structures and arrays

[Figure: adjacent select wires Sel A, Sel B, Sel C with waveforms for A, B, and C.]

Max MCF 1.5, Min MCF 1.0

Temporal Shielding

• Signals are routed next to signals that limit aggressors
• Can help max delay or min delay or both
• Lesser cost than physical shielding, but more design effort
• Encouraged wherever possible but tricky

[Figure: adjacent wires Sig A, Sig B, Sig C with Ck-labeled elements and waveforms for A, B, and C.]

Max MCF 1.0, Min MCF 0.0

Shielding Gotcha

• Tools may rely on the designer to override the default coupling assumptions

[Figure: elements labeled L and Ck with wires A and B.]

Max MCF 2.0, Min MCF 0.0

If you need temporal shielding to make your circuit meet timing, your circuit doesn’t meet timing. Do not rely on it.

Datapath and Block Floorplanning Procedure

• Step 1 - Identify feedthrus
• Step 2 - Look for opportunities for track sharing
• Step 3 - Define the bitpitch of the block
• Step 4 - Review the metal plan within the cell
• Step 5 - Review and plan the clock routing and placement
• Step 6 - Plan the critical cell placement locations
• Step 7 - Estimate the area of the cells and the block
• Step 8 - Review the power grid

Variations of Clock Tree distribution networks

Tapered H-Tree

Target: Metallization and Gate topology uniformity

Clock Routing

• Watch out for the clock; it’s your most critical net
• Make sure the physical design treats it accordingly
• Help reduce clock power by eliminating unnecessary load
• Make sure the clock has enough via coverage
• Leave room for decoupling capacitors and upsizing
• Don’t forget to account for clock routing overhead (full shield) in your metal planning

Clock Routing

[Figure: BAD and GOOD clock routes; one region is marked UNNECESSARY LOAD.]

Avoid unnecessary clock load to save active power.
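The saving can be estimated with the standard dynamic power relation P = activity × C × Vdd² × f. The sketch below is illustrative only: the capacitance, voltage, and frequency values are made up, and `dynamic_power_mw` is a hypothetical helper.

```python
def dynamic_power_mw(c_load_pf, vdd_v, freq_mhz, activity=1.0):
    """Dynamic switching power P = activity * C * Vdd^2 * f, in milliwatts.

    activity=1.0 models a free-running clock net, which charges and
    discharges its load once every cycle.
    """
    c_farads = c_load_pf * 1e-12
    f_hz = freq_mhz * 1e6
    return activity * c_farads * vdd_v ** 2 * f_hz * 1e3   # watts -> milliwatts


# Hypothetical clock net: 10 pF in total, of which 2 pF is an unused stub.
with_stub = dynamic_power_mw(10.0, 1.0, 1000.0)
without_stub = dynamic_power_mw(8.0, 1.0, 1000.0)
print(f"saved {with_stub - without_stub:.2f} mW by trimming the stub")
```

Because clock power scales linearly with load capacitance, every stub that drives nothing is paid for on every cycle.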

Power/Clock Grid

• Clock grid is interleaved between VDD and VSS on metal6

[Figure: floorplan of a two-port array. Block labels: Bitcell Array, Port0 and Port1 Input Data Latch, Port0 and Port1 Read/Write Ckt, Port0 and Port1 Output Latch, Port0 Decoder, Port1 Decoder; LCB markers are distributed among the blocks.]

Clock Routing

Make sure there are enough vias to get power through the clock network.

[Figure: clock routes labeled INSUFFICIENT VIA COVERAGE and SUFFICIENT VIA COVERAGE.]

Clock Routing

Remember to count clocks as ~5-7 tracks in your wire planning!

[Figure: shielded clock route with wires Vdd, Clock, Vss; dimension labels 1x, 2x, 1x and 1.5x, 1.5x.]

Be careful with gated clocks. Fine grain clock gating tends to drastically increase the number of unique clocks, significantly increasing the metal usage.

No tools catch this before layout.
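The track count can be sanity-checked with simple arithmetic. The sketch below assumes one reading of the figure labels: shield wires 1x wide, a clock wire 2x wide, and 1.5x spacing on each side of the clock, all in units of a minimum track. That reading, and both function names, are assumptions rather than stated rules.

```python
def shielded_clock_tracks(shield_width=1.0, clock_width=2.0, spacing=1.5):
    """Track-equivalents used by one Vdd / Clock / Vss shielded route.

    Widths and spacings are in units of a minimum track ("x"). The default
    values are one reading of the figure labels, not a stated design rule.
    """
    return shield_width + spacing + clock_width + spacing + shield_width


def clock_routing_tracks(unique_clocks, tracks_per_clock=7.0):
    """Total track-equivalents when every unique clock is routed fully shielded."""
    return unique_clocks * tracks_per_clock


print(shielded_clock_tracks())
print(clock_routing_tracks(unique_clocks=4))
```

Under these assumptions each additional gated clock costs roughly as much metal as a seven-bit bus, which is why fine-grain gating needs to show up in the wire plan early.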

Datapath and Block Floorplanning Procedure

• Step 1 - Identify feedthrus
• Step 2 - Look for opportunities for track sharing
• Step 3 - Define the bitpitch of the block
• Step 4 - Review the metal plan within the cell
• Step 5 - Review and plan the clock routing and placement
• Step 6 - Plan the critical cell placement locations
• Step 7 - Estimate the area of the cells and the block
• Step 8 - Review the power grid

Cell Placement

• Start with the critical path!
  – Place cells to limit the wire load on the critical path
  – Move less critical blocks out of the way
• Place clock generators to limit clock wire load
  – Again, place most critical clock LCBs first if area is tight
  – Ideally there should be minimal side loads
• Consider track sharing opportunities when placing cells
  – Cell placement can enable or disable track sharing
  – Optimum placement generally follows data flow

Cell Placement

[Figure: cell placement around an LCB, annotated “Short critical path” and “No side load”.]

Cell Placement and Routing

Datapath and Block Floorplanning Procedure

• Step 1 - Identify feedthrus
• Step 2 - Look for opportunities for track sharing
• Step 3 - Define the bitpitch of the block
• Step 4 - Review the metal plan within the cell
• Step 5 - Review and plan the clock routing and placement
• Step 6 - Plan the critical cell placement locations
• Step 7 - Estimate the area of the cells and the block
• Step 8 - Review the power grid

Area Estimation

• All modules have an area budget in the floorplan

• That budget is only an educated guess

• Some guesses are high, and some are low

• You will need to improve these budgets by estimating the area of your modules more accurately

• Doing so reduces the number of late surprises in the design and also reduces post-layout effort, because you converge with accurate parasitics

Area Estimation

• Custom cell area can be set in one of three ways
  – Device limited layout means the device sizes set the cell area
  – Metal limited layout means the wires set the cell area
  – Pitch-matching means the cell area is set to match another cell
• Your first job is to figure out which of these applies to your cell
  – Datapaths are metal limited in one direction (bitpitch)
  – Arrays often are metal limited in both directions
  – Control blocks often match a datapath or array
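The three cases can be compared one dimension at a time: whichever requirement is largest sets that dimension. The sketch below is a simplification with made-up numbers, and `limiting_dimension` is a hypothetical helper rather than part of any course tool.

```python
def limiting_dimension(device_um, tracks, track_pitch_um, pitch_match_um=None):
    """Return (size_um, limiter) for one dimension of a cell.

    device_um:       extent needed to fit the transistors
    tracks:          routing tracks that must fit across this dimension
    track_pitch_um:  metal pitch for those tracks
    pitch_match_um:  size imposed by a neighboring cell, if any
    The largest requirement wins and names the limiter.
    """
    candidates = {
        "device limited": device_um,
        "metal limited": tracks * track_pitch_um,
    }
    if pitch_match_um is not None:
        candidates["pitch matched"] = pitch_match_um
    limiter = max(candidates, key=candidates.get)
    return candidates[limiter], limiter


# Hypothetical datapath cell: height follows the bitpitch, width follows devices.
height, height_limiter = limiting_dimension(2.5, tracks=8, track_pitch_um=0.5)
width, width_limiter = limiting_dimension(6.0, tracks=4, track_pitch_um=0.5)
print(f"{width} x {height} um: width {width_limiter}, height {height_limiter}")
```

Knowing the limiter tells you where effort pays off: resizing devices does nothing for a dimension that the wires already set.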

Die Size Estimation

Datapath and Block Floorplanning Procedure

• Step 1 - Identify feedthrus
• Step 2 - Look for opportunities for track sharing
• Step 3 - Define the bitpitch of the block
• Step 4 - Review the metal plan within the cell
• Step 5 - Review and plan the clock routing and placement
• Step 6 - Plan the critical cell placement locations
• Step 7 - Estimate the area of the cells and the block
• Step 8 - Review the power grid

Power Grid

• Delivers current from the C4 bumps to the transistors
• Designed to deliver typical current density to the devices
• Increasing current density by arraying large devices can cause you to exceed the power grid’s nominal design
• Doing this can cause performance and noise problems

Power Grid

Think of the grid as a straw between the C4 and the devices. Too many devices sucking through the same straw or too narrow a straw can cause devices to starve and the supply to dip or crater!
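The straw analogy maps onto Ohm's law: the supply dip is the total current drawn through a shared grid segment times that segment's resistance. The sketch below is a lumped, static approximation with made-up values; `supply_droop_mv` is a hypothetical helper and ignores inductance and decoupling capacitance.

```python
def supply_droop_mv(n_devices, i_per_device_ma, r_grid_ohm):
    """Static IR drop on a shared grid segment, in millivolts.

    V = (N * I) * R, with current in mA and resistance in ohms, which gives
    millivolts directly. Assumes all devices draw current at the same time.
    """
    return n_devices * i_per_device_ma * r_grid_ohm


# Hypothetical cases: more devices on the straw, or a narrower straw.
baseline = supply_droop_mv(n_devices=8, i_per_device_ma=1.0, r_grid_ohm=0.5)
arrayed = supply_droop_mv(n_devices=32, i_per_device_ma=1.0, r_grid_ohm=0.5)
narrow = supply_droop_mv(n_devices=32, i_per_device_ma=1.0, r_grid_ohm=2.0)
print(baseline, arrayed, narrow)
```

Both effects multiply, so a large arrayed driver on a thin grid segment is the worst combination.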


SAMPLE Power/Ground GRID

Shielding takes up significant routing resources. Global M6 routes over the array should have minimal coupling noise to array bitlines.

[Figure: sample power/ground grid cross-sections. One shows signal (Sig) tracks between VSS, VDD, and VSS rails; the other shows Sig tracks alternating with Vss shields (Full Shielding, MCF = 1.0). Dimension labels: 48λ, 2λ, λ.]

* Where λ is the minimum critical dimension for width/space


Power Grid

[Figure: schematic view, cell layout view, and relative cell placement of a 32-bit arrayed cell with input A<31:0> and output OUT<31:0>, placed from Bit 31 to Bit 0.]


Power Grid

When large, arrayed drivers pull on the same rail, supply bounce can occur, degrading performance and causing supply offset noise.

[Figure: schematic view, cell layout view, and relative cell placement of the 32-bit arrayed cell (A<31:0> to OUT<31:0>, Bit 31 to Bit 0), with waveforms labeled Out, Current, Vdd, and Vss.]


Power Grid

• Be very careful arraying large drivers
• Follow the % power guidelines for the power grid
• Try to keep temporal relationships between arrayed drivers
• Consider the physical impact on the grid by your design
• Be prepared to make the grid more robust to compensate for marginal grids
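
One reading of the guideline on temporal relationships is that arrayed drivers load a shared rail less when they do not all switch at the same instant. The sketch below illustrates that idea under strong assumptions that are not from the slides: each driver draws a rectangular current pulse of fixed width, time is divided into discrete steps, and successive drivers start a fixed number of steps apart. All names and values are hypothetical.

```python
def peak_rail_current_a(num_drivers, pulse_width_steps, stagger_steps,
                        current_per_driver_a):
    """Peak summed current on a shared rail for staggered rectangular pulses.

    Driver k draws current_per_driver_a during time steps
    [k * stagger_steps, k * stagger_steps + pulse_width_steps).
    """
    if num_drivers < 1 or pulse_width_steps < 1 or stagger_steps < 0:
        raise ValueError("need num_drivers >= 1, pulse_width_steps >= 1, "
                         "stagger_steps >= 0")
    total_steps = stagger_steps * (num_drivers - 1) + pulse_width_steps
    active = [0] * total_steps  # number of drivers drawing current per step
    for k in range(num_drivers):
        start = k * stagger_steps
        for t in range(start, start + pulse_width_steps):
            active[t] += 1
    return max(active) * current_per_driver_a


# Hypothetical 32-bit driver array, 2 mA per driver, pulse 4 steps wide.
for stagger in (0, 1, 4):
    peak = peak_rail_current_a(32, 4, stagger, 0.002)
    print("stagger", stagger, "steps -> peak", round(peak, 3), "A")
```

In this model the peak falls from all 32 drivers overlapping (no stagger) to a single driver at a time once the stagger equals the pulse width. Real driver currents are not rectangular, so treat the result as a trend rather than a prediction.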


Summary

• Early design planning and layout can have a significant impact on processor design
– Die size, profit & power are impacted by layout density
– Schedule is impacted by implementation choices

• Floorplanning also significantly impacts circuit performance
– Shielding can help timing and noise sensitive circuits
– Carefully floorplanning critical paths can help reduce wire loads
– Reducing clock routing can reduce clock skew and clock power


Backup


Wire and Resistance Calculator


ALPHA 21364


PPC 603