EE241 - Spring 2011 (bwrcs.eecs.berkeley.edu/Classes/icdesign/ee241_s11/...)
EE241 Spring 2011: Advanced Digital Integrated Circuits
Lecture 19: Managing Leakage
Announcements
Quiz #3 next Wednesday
This and next lecture until 4pm
Reading: Chapters 8 and 10 from Rabaey, LPDE
Plan until the end of semester:
One more homework and two quizzes
Final: April 28, in class
Project presentations: Wednesday, May 4, 2pm
Outline
Last lecture
DVS
Clock gating
This lecture
Leakage management
Power gating
Back bias
Managing Leakage
Power/Energy Optimization Space
Columns: Constant Throughput/Latency (Design Time, Sleep Mode); Variable Throughput/Latency (Run Time)
Energy | Design Time | Sleep Mode | Run Time
Active | Logic design, Scaled VDD, Trans. sizing, Multi-VDD | Clock gating | DFS, DVS
Leakage | Stack effects, Trans. sizing, Scaling VDD, + Multi-VTh | Sleep T's, Multi-VDD, Variable VTh, + Input control | + Variable VTh
Stack Effect
Leakage reduction (in 0.13 μm):
Narendra, ISLPED'01
Stack Forcing
Tradeoffs:
• Two stacked W/2 devices: ¼ of the drive current, same loading
• Two stacked 2W devices: 4x loading, same drive current
Narendra, ISLPED’01
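The stack-forcing tradeoff can be checked with a first-order model. The sketch below assumes that drive current scales linearly with width, that series devices of equal width behave like one device of that width divided by the stack height, and that input loading is the sum of the gate widths. The function name `forced_stack` and these simplifications are illustrative assumptions, not taken from Narendra's paper.

```python
def forced_stack(width_each, n_stack=2, ref_width=1.0):
    """Drive and loading of n_stack series devices, relative to one device of ref_width.

    Assumptions (not from the slides): current is proportional to W,
    a series stack divides the effective width by n_stack, and input
    capacitance is proportional to the total gate width.
    """
    drive = (width_each / n_stack) / ref_width
    loading = (width_each * n_stack) / ref_width
    return drive, loading


# Replace one device of width W by two stacked W/2 devices
print(forced_stack(0.5))  # (0.25, 1.0): quarter drive, same loading
# Replace it by two stacked 2W devices
print(forced_stack(2.0))  # (1.0, 4.0): same drive, 4x loading
```

Real devices deviate from this picture (velocity saturation, body effect on the upper device), so the numbers are only the first-order tradeoff quoted on the slide.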
Input Control
May take many cycles to force the desired state in a block
Dynamic Sleep Transistor
Active mode
[Figure: dual-VT core between virtual VCC and virtual VSS, with sleep transistors to VCC and VSS]
Sleep transistors ON: gate overdrive
PMOS forward body bias
Noise on virtual supply
Courtesy of J. Tschanz, Intel (ISSCC'03)
Dynamic Sleep Transistor
Idle mode
Sleep transistors OFF: gate underdrive
PMOS reverse body bias
Virtual supply collapse
Courtesy of J. Tschanz, Intel (ISSCC'03)
How to Size the Sleep Transistor?
Circuits in active mode see the sleep transistor as extra power line resistance
The wider the sleep transistor, the better
Wide sleep transistors cost area
Minimize the size of the sleep transistor for a given ripple (e.g. 5%)
Need to find the worst-case vector
The sleep transistor is not free: it degrades performance in active mode
Charging and discharging the virtual rails costs energy
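The sizing rule (smallest transistor that keeps the virtual-rail ripple within a budget) can be sketched numerically. The model below is an assumption: it treats the sleep transistor as a long-channel device deep in the linear region, with on-resistance length / (k_prime · width · (v_gs − v_th)). The function name and every numeric value are hypothetical and chosen only for illustration.

```python
def sleep_transistor_width(i_peak, vdd, ripple, k_prime, v_gs, v_th, length):
    """Minimum sleep-transistor width (same unit as length) for a ripple budget.

    Assumed model: R_on = length / (k_prime * width * (v_gs - v_th)),
    with k_prime = mu * Cox in A/V^2. i_peak is the worst-case block current.
    """
    r_max = ripple * vdd / i_peak  # largest tolerable on-resistance
    return length / (k_prime * r_max * (v_gs - v_th))


# Hypothetical block: 100 mA worst-case current, 1.2 V supply, 5% ripple
common = dict(i_peak=0.1, vdd=1.2, ripple=0.05,
              k_prime=300e-6, v_gs=1.2, length=0.1e-6)
w_high_vt = sleep_transistor_width(v_th=0.45, **common)
w_low_vt = sleep_transistor_width(v_th=0.25, **common)
print(f"high-VT width: {w_high_vt * 1e6:.0f} um")
print(f"low-VT width:  {w_low_vt * 1e6:.0f} um")
```

The worst-case current `i_peak` is the quantity that depends on the worst-case input vector; a lower threshold gives more gate overdrive and therefore a narrower device for the same resistance.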
Sleep Transistor
High-VTH transistor has to be very large for low resistance in linear region.
Low-VTH transistor needs much less area for the same resistance.
Courtesy: R. Krishnamurthy, Intel
Sleep Transistor Layout
[Figure: ALU layout with sleep transistor cells]
Area overhead: PMOS 6%, NMOS 3%
Tschanz, ISSCC'03
Sleep in Standard Cells
Uvieghara, ISSCC'04
Sleep Transistor Grid
[Figure: power grid with no sleep transistor (VCC and VSS on M4, M3 straps) vs. grid with PMOS & NMOS sleep transistors (virtual VCC and virtual VSS)]
Tschanz, ISSCC'03
Preserving State
Virtual supply collapse in sleep mode will cause the loss of state in registers
Putting the registers at nominal VDD would preserve the state
These registers leak
The second supply needs to be routed as well
Can lower VDD in sleep
Some impact on robustness, noise and SEU immunity
State preservation and recovery
Register Design
[Figure: register schematic; labels: SLEEP, CLK, high-VT devices]
[Mutoh95]
Shared-Well Dual Supply
In 180nm
[Figure: leakage power [mW] vs. VDDL [V] (1.0 to 2.0 V) at room temperature, for the VDDH domain and the VDDL domain; annotated -42%]
[Figure: energy [pJ] vs. TCYCLE [ns], single-supply (VDDH=1.8V) vs. dual supply; 1.16GHz]
VDDL=1.4V: Energy -25.3%, Delay +2.8%
VDDL=1.2V: Energy -33.3%, Delay +8.3%
[Figure: VDDH circuit and VDDL circuit with VDDH, VDDL and VSS rails]
Shimazaki, ISSCC'03
Dynamic Body Bias
Similar concept to dynamic voltage scaling
Control loop adjusts the substrate bias to meet the timing
Can be used just as runtime/sleep
Limited range of threshold adjustments (<100mV)
Limited leakage reduction (<10x)
No delay penalty
Can increase speed by forward bias
Energy cost of charging/discharging the substrate capacitance (but doesn't need a regulator)
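The two limits quoted above are linked by the subthreshold slope: leakage falls by one decade for every S millivolts of threshold increase. A minimal sketch, assuming a subthreshold swing of 100 mV/decade (an assumed round number, not a value from the slides):

```python
def leakage_reduction(delta_vth_mv, swing_mv_per_decade=100.0):
    """Factor by which subthreshold leakage drops for a threshold increase.

    Assumes pure subthreshold conduction, I ~ 10**(-VTh / S).
    """
    return 10 ** (delta_vth_mv / swing_mv_per_decade)


print(leakage_reduction(100))           # 10.0
print(round(leakage_reduction(50), 2))  # 3.16
```

With less than 100 mV of threshold adjustment available, the reduction stays below about 10x under this assumption.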
Dynamic Body Bias
[Figure: dual-VT core between VCC and VSS with PMOS bias and NMOS bias generators driving the PMOS body and NMOS body]
Active mode: forward body bias (FBB), 450mV FBB on PMOS and NMOS bodies
Local VCC tracking
Idle mode: reverse body bias (RBB), 500mV RBB, using VHIGH and VLOW
Triple well needed
Tschanz, ISSCC'03
Body Bias Layout
[Figure: ALU layout showing sleep transistor LBGs and ALU core LBGs]
Number of ALU core LBGs: 30
Number of sleep transistor LBGs: 10
PMOS device width: 13mm
Area overhead: 8%
Leakage Power Savings vs. Decap
[Figure: normalized leakage power in idle mode vs. idle time (10ns to 10ms) at 1.32V, 75°C, for a dual-VT core with a low-leakage 133nF decap on virtual VCC and with no decap on virtual VCC; annotated 40% and 90%]
Overhead: charging & discharging of virtual VCC capacitance
Minimize capacitance on virtual VCC
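The charging overhead sets a break-even idle time below which sleeping costs more than it saves. The sketch assumes the virtual rail collapses fully while idle and is recharged to VCC on wake-up, so each sleep episode draws roughly C·V² from the supply. The 133nF and 1.32V values come from the slide; the 50 mW of leakage power saved while idle is a hypothetical number.

```python
def break_even_idle_time(c_virtual, vcc, p_leak_saved):
    """Idle time (s) at which leakage savings equal the rail recharge energy.

    Assumes full collapse of the virtual rail, so the recharge
    draws about c_virtual * vcc**2 joules from the supply.
    """
    e_overhead = c_virtual * vcc ** 2
    return e_overhead / p_leak_saved


def net_saving_fraction(idle_time, c_virtual, vcc, p_leak_saved):
    """Fraction of the ideal leakage saving left after the recharge overhead."""
    return 1.0 - break_even_idle_time(c_virtual, vcc, p_leak_saved) / idle_time


t_be = break_even_idle_time(133e-9, 1.32, 0.05)
print(f"break-even idle time: {t_be * 1e6:.2f} us")
print(f"net saving at 100 us idle: {net_saving_fraction(100e-6, 133e-9, 1.32, 0.05):.0%}")
```

A smaller capacitance on the virtual rail shortens the break-even time in direct proportion under this model.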
Decoupling Capacitor Placement
[Figure: dual-VT core with decap on full supply vs. decap on virtual supply; labels: oxide leakage, reduced leakage, longer time constant]
Compared on: performance, convergence time, oxide leakage savings
Total Active Power Savings (Fixed activity: α = 0.05)
[Figure: total power savings (0% to 20%) vs. number of consecutive active cycles (TON) and number of consecutive idle cycles (TOFF); curves: body bias (1.28V; active: FBB, idle: ZBB) and PMOS sleep transistor (1.32V); annotated maxima of 18% and 8%]
Reference: 450mV FBB to core with clock gating, 1.28V, 4.05GHz, 75°C
Power savings for TOFF > ~100 idle cycles
Techniques Summary
[Figure: Ileak (normalized) vs. VDD [V], showing the off-transistor load line and reduced VDD]
Sleep transistor: up to ~25x leakage reduction
Standby supply reduction: ~3-4x leakage reduction
Reverse bias: ~3x leakage reduction
Standby supply + reverse bias: ~10x leakage reduction
Body Biasing and Variations
Body biasing with a local control loop can be used to lower the impact of process variations
Used to limit die-to-die and within-die variations
Normalized Delay vs VDD & VTH
[Figure: normalized delay vs. VTH (V) for VDD = 1.0 V, 1.5 V, 3.0 V and 5.0 V, with ΔVTH = ±0.15V and ±0.05V]
Sakurai, Kuroda
Self-Adjusting Threshold-Voltage Scheme (SATS)
Substrate Biasing
Tschanz, JSSC 11/02
Effectiveness of Substrate Bias
Die-to-die variations
Effectiveness of Substrate Bias
Within-die variations
Dynamic Voltage Scaled Microprocessor
External VDD: 3.3V ±10%
Internal VDDL: 0.8V~2.9V ±5%
[Figure: chip with TX3900 core, user logic and PLL; power dissipation (mW) vs. operating frequency (MHz), theory and measurement]
Courtesy: Prof. Kuroda
Adapting VDD and VTH
Miyazaki, ISSCC’02
Adapting VDD and VTH
Miyazaki, ISSCC’02
Optimal VDD, VTh
Adjusting VDD, VTh trades off energy and delay
We studied energy-limited design
There are alternate ways for optimizing energy and delay together
E.g. energy-delay product (EDP)
Or E^n D^m, with n, m > 1
Optimal EDP Contours
Gonzalez, JSSC 8/97
Sizing, Supply, Threshold Optimization
Reference design: Dref (Vdd^max, Vth^ref)
Topology | Inverter | Adder | Decoder
(ELk/ESw)ref | 0.1% | 1% | 10%
Large variation in optimal circuit parameters Vdd^opt, Vth^opt, w^opt
[Figure: optimal parameters within the ranges Vdd^min to Vdd^max and Vth^min to Vth^max]
Technology parameters (Vdd^max, Vth^ref) rarely optimal
Result: E-D Tradeoff in an Adder
[Figure: energy (normalized to Eref) vs. delay (normalized to Dref); energy-efficient curve f(W, Vdd, Vth) with the reference design (Dref, Eref), the point (Dref, Emin) at -80% energy and the point (Dmin, Eref) at -40% delay]
Sensitivity | W | Vdd | Vth
(Dref, Eref) | | 1.5 | 0.2
(Dref, Emin) | 1
(Dmin, Eref) | 22 | 16 | 22
80% of energy saved without delay penalty
40% delay improvement without energy penalty
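The three adder design points can be compared under the E^n D^m family of metrics mentioned earlier in the lecture. The sketch uses only the normalized values quoted on this slide (80% energy saved at equal delay, 40% delay improvement at equal energy); the dictionary layout and function name are illustrative.

```python
def metric(energy, delay, n=1, m=1):
    """Generalized energy-delay metric E**n * D**m (n = m = 1 gives EDP)."""
    return energy ** n * delay ** m


# (energy, delay), normalized to the reference design (Eref, Dref)
points = {
    "reference (Dref, Eref)": (1.0, 1.0),
    "(Dref, Emin)": (0.2, 1.0),  # 80% of energy saved, same delay
    "(Dmin, Eref)": (1.0, 0.6),  # 40% delay improvement, same energy
}

for name, (e, d) in points.items():
    print(f"{name}: EDP = {metric(e, d):.2f}, E*D^2 = {metric(e, d, m=2):.2f}")
```

Under plain EDP the minimum-energy point scores best of the three; raising the delay exponent m shifts the ranking toward the faster design, which is why the choice of metric matters.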
Energy-constrained delay
Active power: $P_{act} = f C V_{DD}^2$, with $f = 1/(L_D t_p)$
Leakage power: $P_{leak} = I_0 \, e^{-(V_{Th} - \lambda_d V_{DD})/S} \, V_{DD}$
Eliminate one variable (VTh) and find Pmin(VDD)
Nose, ASP-DAC'00
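The procedure on this slide (fix the frequency, eliminate VTh, then minimize power over VDD) can be sketched numerically. Everything in the model below is an assumption made for illustration: an alpha-power-law delay t_p = k_d · VDD / (VDD − VTh)^a, an exponential leakage current with no drain-induced barrier lowering, an effective switched capacitance that already includes the activity factor, and made-up parameter values.

```python
import math


def min_power_at_fixed_frequency(f, logic_depth, c_sw, i0, s, k_d, a=1.3):
    """Sweep VDD, solve VTh from the delay constraint, return (P_min, VDD, VTh).

    Assumed model (not from the slides):
      delay constraint  1/f = logic_depth * k_d * vdd / (vdd - vth)**a
      active power      f * c_sw * vdd**2
      leakage power     i0 * exp(-vth / s) * vdd
    """
    best = None
    for step in range(151):  # VDD from 0.30 V to 1.80 V in 10 mV steps
        vdd = 0.3 + 0.01 * step
        overdrive = (logic_depth * k_d * f * vdd) ** (1.0 / a)
        vth = vdd - overdrive  # threshold that just meets timing
        if vth <= 0:
            continue  # keep the sketch to positive thresholds
        power = f * c_sw * vdd ** 2 + i0 * math.exp(-vth / s) * vdd
        if best is None or power < best[0]:
            best = (power, vdd, vth)
    return best


# Hypothetical block: 1 GHz, logic depth 20, 1 nF effective switched capacitance
p_min, vdd_opt, vth_opt = min_power_at_fixed_frequency(
    f=1e9, logic_depth=20, c_sw=1e-9, i0=100.0, s=0.04, k_d=30e-12)
print(f"P_min = {p_min:.3f} W at VDD = {vdd_opt:.2f} V, VTh = {vth_opt:.3f} V")
```

Lowering VDD forces a lower threshold to hold the frequency, so active power falls while leakage rises; the minimum sits where the two trends balance.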
Minimum energy: ESw = 2ELk
$(E_{Lk}/E_{Sw})_{opt} = \dfrac{2}{\ln(L_d/\alpha_{avg}) - K}$
Large (ELk/ESw)opt
Flat EOp minimum
Topology dependent
[Figure: EOp / nominal EOp,ref (0 to 1) vs. ELeakage/ESwitching (10^-2 to 10^1) for nominal, parallel and pipeline designs; labeled operating points: Vthref-180mV, 0.81Vddmax, Vthref-95mV, 0.57Vddmax, Vthref-140mV, 0.52Vddmax]
Optimal designs have high leakage (ELk/ESw ≈ 0.5)
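The optimal leakage-to-switching ratio on this slide reads as (ELk/ESw)opt = 2 / (ln(Ld/αavg) − K). A small sketch evaluates it for two made-up design points; the value of K and the logic depth and activity numbers are hypothetical, chosen only to show the topology dependence.

```python
import math


def optimal_leakage_ratio(logic_depth, alpha_avg, k):
    """(E_Lk / E_Sw) at the energy minimum: 2 / (ln(L_d / alpha_avg) - K)."""
    return 2.0 / (math.log(logic_depth / alpha_avg) - k)


K = 1.3  # hypothetical constant, not a value from the slides
deep_quiet = optimal_leakage_ratio(logic_depth=20, alpha_avg=0.1, k=K)
shallow_busy = optimal_leakage_ratio(logic_depth=10, alpha_avg=0.2, k=K)
print(f"deep, low activity:     {deep_quiet:.2f}")
print(f"shallow, high activity: {shallow_busy:.2f}")
```

Because the ratio depends only logarithmically on logic depth and activity, it moves slowly across topologies, consistent with the flat energy minimum noted on the slide.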
Subthreshold Optimum
f = 30kHz
Minimum is independent of VT
Calhoun, JSSC 9/05
Next Lecture
Back to design for performance