EDA Challenges for Low Power Design

Anand Iyer, Cadence Design Systems

Source: IEEE, sites.ieee.org/scv-cas/files/2013/02/2006Iyer.pdf

Agenda

• Introduction

• LP techniques in detail

• Challenges to low power techniques

• Guidelines for choosing various techniques

Why is Power an Issue?

[Figure: leakage power and active power across process technology nodes (180, 130, 90, 65 nm), with performance in mW/MHz. Source: Intel, 2004]

• Complex system

• Power hungry process

Sluggish Battery Life Improvement

[Figure: chart covering the years 2000 to 2004. Source: EETimes, 2004]

Approaches To Power Management

[Slide legend: Leakage, Active]

Design and System Level Optimization
• System Architecture (multi-core)
• Software/Hardware power management system
– ARM IEM
• Voltage scaling / frequency scaling
• Multiple voltage islands

Implementation (we will discuss this in detail)
• Clock gating, logic structuring
• Multi-Vth cell selection to reduce leakage
• Support for multi voltage islands (aka “multi-vdd” aka “MSV”) implementation
• Signoff accurate analysis

Process Level Optimization
• SOI
• High-K, Gate Stack, power gating, etc.
• LLD

Controlling Power in Implementation

Dynamic power: $P_{dyn} \approx k \cdot C_L \cdot V_{DD}^2 \cdot f_{CLK}$
• Clock gating (including de-clone)
• Area optimization
• Static voltage scaling (MSV)*
• Dynamic voltage frequency scaling (DVFS)*
• Adaptive voltage scaling (AVS)*

Leakage power: $P_{leak} \approx V_{DD} \cdot I_{leakage}$
• Multi-Vt cell optimization
• Substrate biasing (VT CMOS)
• Power shut-off (PSO), aka MTCMOS, including state retention
– Fine grain control
– Coarse grain control

* Techniques that affect both dynamic and leakage power
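The two power expressions on this slide can be evaluated directly. The following is a minimal sketch, not taken from the talk: the function names and all numeric values (activity factor, load capacitance, leakage current, clock rate) are illustrative assumptions. It shows why lowering the supply voltage gives a quadratic reduction in dynamic power but only a linear reduction in the leakage term.

```python
def dynamic_power(k, c_load, vdd, f_clk):
    """P_dyn ~ k * C_L * V_DD^2 * f_CLK (k is the switching activity factor)."""
    return k * c_load * vdd ** 2 * f_clk


def leakage_power(vdd, i_leakage):
    """P_leak ~ V_DD * I_leakage."""
    return vdd * i_leakage


# Illustrative (assumed) numbers: 1 nF switched capacitance, 500 MHz clock.
K, C_LOAD, F_CLK = 0.2, 1e-9, 500e6
I_LEAK = 1e-3  # assumed constant here; in real silicon it also varies with V_DD

p_dyn_1v2 = dynamic_power(K, C_LOAD, 1.2, F_CLK)
p_dyn_1v0 = dynamic_power(K, C_LOAD, 1.0, F_CLK)

# Scaling V_DD from 1.2 V to 1.0 V at the same frequency:
dyn_ratio = p_dyn_1v0 / p_dyn_1v2                                  # (1.0 / 1.2) ** 2
leak_ratio = leakage_power(1.0, I_LEAK) / leakage_power(1.2, I_LEAK)  # 1.0 / 1.2

print(f"dynamic power ratio: {dyn_ratio:.3f}")
print(f"leakage power ratio: {leak_ratio:.3f}")
```

With these assumed values the dynamic term drops to roughly 69% of its original value while the leakage term drops to roughly 83%, which is consistent with voltage scaling being starred as a technique that affects both.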

Techniques and Trade-offs

Power reduction technique    | Leakage power | Dynamic power | Timing penalty | Area penalty | Methodology impact
Dynamic power optimization   | 10%           | 10%           | 0%             | -10%         | None
Multi-Vt optimization        | 6X            | 0%            | 0%             | 0%           | Low
Clock gating                 | 0%            | 20%           | 0%             | <2%          | Low
Voltage Islands              | 2X            | 40-50%        | 0%             | <10%         | Medium
Power shut-off (PSO)         | 10-50X        | 0%            | 4-8%           | 5-15%        | Medium-high
Dynamic and Adaptive Voltage Frequency Scaling (DVFS and AVS) | 2-3X | 40-70% | 0% | <10% | High
Substrate Biasing            | 10X           | -             | 10%            | <10%         | High

Source: customer interviews, conference papers (ISSCC), magazine articles

LP Techniques in Detail

Dynamic Power Optimization (No V)

Pin swapping: pair low capacitance (C) pins with high toggle frequency (F) nets

Gate sizing: CMOS power usage is related to gate size

Buffer removal: remove unnecessary buffers

MVT Optimization

[Figure: implementation using a mix of Low Vt, Normal Vt, and High Vt cells]

Clock Gating

• Relies on clock gate control signal in RTL or netlist

RTL:

    always @(posedge clk)
      if (en)
        out <= in;

[Figure: control signal gating clk for Block A and Block B]

Maps to either:
1. User defined gating module
2. Clock-gating-integrated cell from library
3. Gating function built from standard logic
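To make the mapping concrete, here is a small behavioral sketch of a latch-based clock gate, which is one common structure for a clock-gating-integrated cell. It is an illustration under stated assumptions rather than the cell the talk refers to: the function names and sample waveforms are hypothetical, and the model works on one sample per clock half-cycle.

```python
def latch_based_gate(clk, en):
    """Gated clock from a latch that is transparent while clk is low, then ANDed with clk."""
    latched = 0
    gated = []
    for c, e in zip(clk, en):
        if c == 0:
            latched = e          # the latch follows the enable only in the low phase
        gated.append(c & latched)
    return gated


def naive_and_gate(clk, en):
    """Plain AND of clock and enable, shown for comparison."""
    return [c & e for c, e in zip(clk, en)]


# One sample per half-cycle; the enable drops during the last high phase.
clk = [0, 1, 0, 1, 0, 1, 0, 1]
en = [0, 0, 1, 1, 0, 0, 1, 0]

print(latch_based_gate(clk, en))  # enable changes in the high phase are ignored
print(naive_and_gate(clk, en))    # the last pulse is cut off by the late enable change
```

In this sketch the latch holds the enable stable while the clock is high, so the gated clock only ever contains whole pulses. The plain AND version loses the final pulse because the enable changes mid-phase, which is the kind of hazard a library clock-gating cell is meant to avoid.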

Designing with Voltage Islands

[Figure: chip with a 1.2 V power domain, a 1.0 V power domain, and Power Domain 3 (0.8 V). Labels: memory; Low Vt (high speed), Normal Vt, and High Vt (low leakage, lower speed) cells; clamps; voltage level shifters]

Power Switch-Off (PSO) Methodologies

[Figure: two panels]

Fine Grain Power Switches
• Power switch built into each standard cell (pins A, Z, SLEEP; VDD, virtual VSS with no pin, real VSS)
• Logical representation: no change except for SLEEP

Coarse Grain Power Switches
• Standard cells share a virtual VSS; a switch module controlled by SLEEP connects it to real VSS
• Logical representation: logic needs to be power aware!

Coarse Grain PSO Methodologies

[Figure: two panels, Cluster Switches and Segmented Switches. Each shows an Always On (default) domain and an On/Off domain, with global Vdd, power switching cells, switched VDD, and a common GND; one panel is labeled "Separated Area VDD"]

Dynamic Voltage Frequency Scaling

• Hardware that scales supply voltage and clock frequency in response to software demands
– 16 levels of VDD (use 5 to 7 in practice) from 1.1V to 0.6V
– Clock frequency from 200MHz to 700MHz in increments of 33MHz
• Triggered when the load changes (detected by CPU software, or HW); load means the number of functions to be executed
– Heavier load → ramp up supply voltage, when stable, then scale up clock frequency
– Lighter load → scale down clock frequency, when PLL locks onto new rate, ramp down supply voltage
• Must keep clock frequency within limits required by supply voltage to avoid clock skew problems, timing violation.
– Worst-case scenario of a full swing from 0.6 V to 1.1V and from 200MHz to 700MHz could take about 280 microseconds.

[Figure: Energy Characteristics of a Processor. Power and energy plotted against operating points. Source: magazine article]

Operating points: 300 MHz, 0.80 V; 433 MHz, 0.875 V; 533 MHz, 0.95 V; 667 MHz, 1.05 V; 800 MHz, 1.15 V; 900 MHz, 1.25 V; 1000 MHz, 1.3 V
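The ordering rule in the bullets above (voltage first when speeding up, frequency first when slowing down) can be sketched as a small planner. This is only an illustration: the function name, the step names, and the tuple format are hypothetical, and the operating points are the ones listed for the figure.

```python
# (frequency in MHz, supply voltage in V), taken from the operating points above.
OPERATING_POINTS = [
    (300, 0.80), (433, 0.875), (533, 0.95), (667, 1.05),
    (800, 1.15), (900, 1.25), (1000, 1.30),
]


def dvfs_transition(current, target):
    """Return the ordered steps to move between two (MHz, V) operating points."""
    cur_f, cur_v = current
    tgt_f, tgt_v = target
    if (cur_f, cur_v) == (tgt_f, tgt_v):
        return []
    if tgt_f > cur_f or tgt_v > cur_v:
        # Heavier load: raise the supply first so the faster clock is always safe.
        return [
            ("ramp_voltage", tgt_v),
            ("wait_voltage_stable", None),
            ("set_frequency", tgt_f),
        ]
    # Lighter load: slow the clock first, then lower the supply once the PLL locks.
    return [
        ("set_frequency", tgt_f),
        ("wait_pll_lock", None),
        ("ramp_voltage", tgt_v),
    ]


speed_up = dvfs_transition(OPERATING_POINTS[0], OPERATING_POINTS[3])
slow_down = dvfs_transition(OPERATING_POINTS[3], OPERATING_POINTS[0])
print(speed_up)
print(slow_down)
```

Either ordering keeps the clock frequency within what the present supply voltage supports at every intermediate step, which is the constraint the slide calls out.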

Dynamic Voltage Frequency Scaling

[Figure: chip with CORE, SLEEP, and SLOW domains]

Mode     | Core           | Sleep          | Slow
Baseline | 1.08V, 125MHz  | 1.08V, 125MHz  | 1.08V, 125MHz
Slow     | 1.08V, 125MHz  | 1.08V, 125MHz  | 0.9V, 66MHz
Standby  | 0.0V           | 1.08V, 125MHz  | 0.0V

• Multiple modes need to be analyzed/optimized for multiple corners
– Setup analysis for (WC, 1, 125C) corner
• Multiple constraints (.sdc)
– Example: baseline.sdc, ios.sdc, slow.sdc, sleep.sdc
• Libraries
– stdcell_1.08sl.lib, stdcell_0.9sl.lib, stdcell_1.08fs.lib, stdcell_0.9fs.lib
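The bookkeeping behind multi-mode, multi-corner analysis can be sketched by enumerating which libraries each mode needs. This is a rough illustration, not a tool flow: the dictionary layout and function name are hypothetical, the reading of the table columns as per-domain voltages is an assumption, and the "sl" and "fs" suffixes are treated as opaque corner tags copied from the library file names.

```python
from itertools import product

# Per-domain supply voltage in each mode, read from the table above (assumed layout).
MODES = {
    "baseline": {"CORE": 1.08, "SLEEP": 1.08, "SLOW": 1.08},
    "slow": {"CORE": 1.08, "SLEEP": 1.08, "SLOW": 0.9},
    "standby": {"CORE": 0.0, "SLEEP": 1.08, "SLOW": 0.0},
}
CORNER_TAGS = ["sl", "fs"]  # suffixes as they appear in the library names


def libraries_for(mode, tag):
    """List the libraries needed for the powered domains of a mode at one corner tag."""
    voltages = sorted({v for v in MODES[mode].values() if v > 0.0}, reverse=True)
    return [f"stdcell_{v:g}{tag}.lib" for v in voltages]


# One analysis view per (mode, corner tag) pair.
views = [(mode, tag, libraries_for(mode, tag)) for mode, tag in product(MODES, CORNER_TAGS)]
for mode, tag, libs in views:
    print(mode, tag, libs)
```

Even this three-mode, two-tag example yields six views, which gives a sense of how quickly the analysis space grows as modes and corners are added.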

Adaptive Voltage Scaling

[Figure: closed loop control. PM blocks in the CPU/SOC report performance parameters to the Power Management Unit, which sets the operating voltage]

Substrate Bias Control

[Figure: transistor bias connections (Vdd, Vss, Vbp, Vbn) and a plot of Vth (V) versus Vsb (V)]

Plotted points (Vsb, Vth): (-2.5, 0.84), (-2, 0.775), (-1.5, 0.7), (-1, 0.625), (-0.5, 0.54), (0, 0.45)

• For an n-channel device, the substrate is normally tied to ground (Vsb = 0)
• A negative bias on Vsb causes Vth to increase
• Substrate biasing can be done during packaging (VTCMOS) or during operation (ABB)
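The plotted points can be used to estimate Vth at an intermediate bias. The sketch below uses simple linear interpolation between neighboring points, which is an assumption made for illustration; the actual body-effect curve is not linear, and the function name is hypothetical.

```python
# (Vsb in V, Vth in V), as read from the plot above.
POINTS = [(-2.5, 0.84), (-2.0, 0.775), (-1.5, 0.7), (-1.0, 0.625), (-0.5, 0.54), (0.0, 0.45)]


def vth_at(vsb):
    """Linearly interpolate Vth for a substrate bias inside the plotted range."""
    if not POINTS[0][0] <= vsb <= POINTS[-1][0]:
        raise ValueError("Vsb is outside the plotted range")
    for (x0, y0), (x1, y1) in zip(POINTS, POINTS[1:]):
        if x0 <= vsb <= x1:
            return y0 + (y1 - y0) * (vsb - x0) / (x1 - x0)


for bias in (0.0, -0.25, -1.0, -2.5):
    print(f"Vsb = {bias:+.2f} V -> Vth ~ {vth_at(bias):.3f} V")
```

Over the plotted range Vth rises from 0.45 V at zero bias to 0.84 V at -2.5 V, which matches the bullet stating that a negative Vsb increases Vth.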

Challenges to Implementing LP Techniques

Dynamic Power Optimization (No V)

• Toggle reduction
– Efficient synthesis
• Capacitance reduction
– Placement
– Physical synthesis
• Toggle based capacitance reduction
– Pin swapping
– Area compaction
– Wire length minimization (high-toggle, fanout)
• Useful skew

MVT Optimization

• Library characterization
– Identical footprint
– Footprint independent
• Implementation
– Efficiently replacing lower Vt cells with higher Vt cells
• Analysis
– How/When to measure leakage power?
– Signal Integrity Analysis
– Lowest leakage state

Clock Gating

• Identifying gating conditions
• Testability requirements
• Physical effects of clock gating
• Timing effects of clock gating

[Figure: clock gating cell styles latch_posedge, latch_posedge_precontrol, and latch_posedge_precontrol_obs, with pins ck_in, ck_out, enable, test, and obs]

[Figure: Observability Logic, with scan pins SI, SO, and SE]

Specify max #flops observable per observability flop (default=36)

Low Power Clock Tree Synthesis: De-Cloning

[Figure: clock tree before and after de-cloning. CLK drives clock gates (CG, with Enable) that feed flip flops. Annotations: Congestion!, Skew!, Dynamic power]

Voltage Islands

• Which logic modules are suitable for voltage scaling?
• What should be the scaled voltage value for these blocks?
• Library characterization
– Multiple voltages/multiple conditions
– Additional components: voltage level shifters
• Implementation
– Physical shape of the voltage islands
– Level shifter insertion in the netlist
– Placement of level shifters
– Routing to a level shifter
– Power connection of a level shifter
• Analysis
– Timing analysis of islands
– Optimization including level shifters
– Signal integrity analysis
– IR drop and how it affects timing

Power Switch-Off

• Library Characterization
– Additional parameters: leakage power, max. current through the cells (Id), max. voltage drop
– Additional cells: switches, isolation cells, state retention cells
• Implementation
– Logic level switch insertion/simulation/verification
– Switch placement schemes: Ring/Column/Distributed
– Switch enable distribution: high fan out net
– Power planning/routing: fine grain, coarse grain
– SRPG control signals
• Analysis
– Transient analysis
– On/Off analysis
– Functional verification
– Sneak path analysis

DVFS/AVS

• Library characterization
– Advanced modeling (ECSM, CCS)
• Implementation
– Clock synchronization
– Use of level shifters in the clock design
• Analysis
– Multi-mode/multi-corner analysis/optimization
– Functional verification (huge for AVS)

Substrate Bias

• Timing Analysis
– Characterization for VTCMOS
– Custom analysis for ABB
• Optimization
– Must be aware of body bias
• Well separation
– Between the regions that are subjected to control and that are not
• Planning/routing additional power signals
– Congestion
– EM
– Cell design
– Functional Verification/validation

Variability and Low Power

[Figure: Test Chip Timing Path Slack Distribution, -100 ps to +200 ps. X-axis: slack (ps), in 20 ps steps; y-axis: % of paths, 0% to 16%. Legend entries: notime, timed, MVT, MSV]

Functional Checks Need to be Done @ Transistor Level

[Figure: a power control FSM drives PwrEn1 and PwrEn2 for blocks A (V1) and B (V2), plus an ISO signal. Inset: level shifting isolation cell (pins A, Iso, Y; supplies Vs, Vc) placed in the source domain, which will be shut off]

State Retention Register Checks

[Figure: a power controller drives PwrEn1 (switch ASW), RET, and RTCLK for an SRPG register (pins D, Q; supplies VDD and VRET) in an on/off domain. Waveform: RTCLK, RET, D, and Q across sleep and wake, with D as don't care and Q unknown (X) while the domain is off]

• Structural Check
– Checks that RET signal comes from an Always ON power domain; VRET tied to continuous power
– Checks that VDD and D pins connect to the same power domain
• Functional Checks
– assert (RET || RET ) (RTCLK off)
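The structural checks listed above lend themselves to a simple rule check over connectivity data. The sketch below is hypothetical throughout: the `RetentionRegister` record, its field names, and the domain and supply names are invented for illustration and do not come from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class RetentionRegister:
    name: str
    ret_source_domain: str  # power domain that drives the RET pin
    vret_supply: str        # supply net tied to the VRET pin
    vdd_domain: str         # power domain of the switched VDD pin
    d_source_domain: str    # power domain that drives the D pin


def structural_violations(reg, always_on_domains, continuous_supplies):
    """Return a list of structural rule violations for one retention register."""
    problems = []
    if reg.ret_source_domain not in always_on_domains:
        problems.append("RET is not driven from an always-on domain")
    if reg.vret_supply not in continuous_supplies:
        problems.append("VRET is not tied to continuous power")
    if reg.vdd_domain != reg.d_source_domain:
        problems.append("VDD and D connect to different power domains")
    return problems


ALWAYS_ON = {"PD_AON"}        # hypothetical domain name
CONTINUOUS = {"VDD_AON"}      # hypothetical supply name

good = RetentionRegister("u_reg0", "PD_AON", "VDD_AON", "PD_CORE", "PD_CORE")
bad = RetentionRegister("u_reg1", "PD_CORE", "VDD_SW", "PD_CORE", "PD_IO")

print(structural_violations(good, ALWAYS_ON, CONTINUOUS))
print(structural_violations(bad, ALWAYS_ON, CONTINUOUS))
```

A check of this kind covers only connectivity. The functional check on RET and RTCLK ordering needs simulation or formal analysis and is not modeled here.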

Sneak Path Detection

[Figure: Block X with input A, output Y, enables EN and ENB, and supplies VDD and VSS]

• Floating node when X is switched-off can cause additional leakage
• Common in mux logic

Guidelines for LP Technique Selection

How To Choose Between Various LP Techniques

• Understand the application/technology need for power reduction

• Choose the techniques based on the power reduction requirement and not vice versa.

• Understand the trade-offs, especially the methodology implications
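One way to apply these guidelines is to start from the penalties a project can tolerate and filter the candidate techniques accordingly. The sketch below encodes the penalty columns of the Techniques and Trade-offs table; the function name is hypothetical, and using the upper bound of each quoted range (for example 2 for "<2%") is a simplifying assumption.

```python
IMPACT_ORDER = ["None", "Low", "Medium", "Medium-high", "High"]

# (technique, max timing penalty %, max area penalty %, methodology impact)
TECHNIQUES = [
    ("Dynamic power optimization", 0, -10, "None"),
    ("Multi-Vt optimization", 0, 0, "Low"),
    ("Clock gating", 0, 2, "Low"),
    ("Voltage islands", 0, 10, "Medium"),
    ("Power shut-off (PSO)", 8, 15, "Medium-high"),
    ("DVFS and AVS", 0, 10, "High"),
    ("Substrate biasing", 10, 10, "High"),
]


def candidates(max_timing_pct, max_area_pct, max_impact):
    """Return the techniques whose penalties all fit within the given limits."""
    limit = IMPACT_ORDER.index(max_impact)
    return [
        name
        for name, timing, area, impact in TECHNIQUES
        if timing <= max_timing_pct
        and area <= max_area_pct
        and IMPACT_ORDER.index(impact) <= limit
    ]


print(candidates(0, 5, "Low"))
print(candidates(0, 10, "Medium"))
```

The filter only narrows the list. Choosing among the survivors still depends on how much leakage or dynamic power reduction the application actually requires.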

High-level Guidelines for Power Reduction in Design

• Power is a performance parameter similar to area and timing

– Optimize and analyze timing, power and area concurrently

• Choose the LP techniques early in the implementation

– Helps to get max. power reduction

– Architecture/process selection must be driven by power need

• Use of voltage scaling techniques leads to quadratic reduction in power, e.g. MSV, DVFS

• When not in use, shut it off!

• Verify, verify, verify!

Steps for Successful LP Design Tapeout!

• LP implementation is complex and requires more time (2X) than normal. Plan ahead!

• Library characterization can be time consuming, as new cells need to be designed and the existing cells characterized under new conditions.

• Choose a comprehensive implementation tool to address not only a range of techniques, but also trade-offs between power, area and timing.

• LP techniques force you to change the existing methodology, adding new tools and steps. In order to be successful, consider partnering with an EDA vendor (Cadence!)

• Verification is key to successful implementation. Make sure the verification tool can understand low power techniques.

Backup