EE241 - Spring 2013bwrcs.eecs.berkeley.edu/Classes/icdesign/ee241_s13/...12 23 Inverter RingOsc...
Transcript of EE241 - Spring 2013bwrcs.eecs.berkeley.edu/Classes/icdesign/ee241_s13/...12 23 Inverter RingOsc...
1
EE241 - Spring 2013Advanced Digital Integrated Circuits
Lecture 18: Dynamic Voltage Scaling
2
Announcements
Homework 3 due today
Quiz #3 today
2
3
Reading
Burd et al, A dynamic voltage scaled microprocessor system, IEEE Journal of Solid-State Circuits, vol. 35, no.11, pp. 1571-1580, Nov. 2000.
4
Outline
Last lecturePower-performance tradeoffs at circuit level
Tradeoffs through supply voltage
This lectureMultiple supplies
Dynamic voltage scaling
3
5. Low Power DesignF. Multiple Supplies
6
Power /Energy Optimization Space
Constant Throughput/Latency Variable Throughput/Latency
Energy Design Time Sleep Mode Run Time
Active
Logic design
Scaled VDD
Trans. sizing
Multi-VDD
Clock gatingDFS, DVS
Leakage
Stack effects
Trans sizing
Scaling VDD
+ Multi-VTh
Sleep T’s
Multi-VDD Variable VTh
+ Input control
+ Variable VTh
4
7
Reducing Active Power
Downsizing, lowering the supply on the critical path will lower the operating frequencyDownsize (lowering supply) non-critical paths
Narrows down the path delay distributionIncreases impact of variations
8
Multiple Supply Voltages
Block-level supply assignmentHigher throughput/lower latency functions are implemented in higher VDD
Slower functions are implemented with lower VDD
Often called “Voltage islands”Separate supply grids, level conversion performed at block boundaries
Multiple supplies inside a blockNon-critical paths moved to lower supply voltageLevel conversion within the blockPhysical design challenging
5
9
Leakage Issue
Driving from VDDL to VDDH Level converter
10
Multiple Supplies in a Block
Lower VDD portion is shaded
CVS StructureConventional Design
Critical Path
Level-Shifting F/F
Critical Path
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
M.Takahashi, ISSCC’98. “Clustered voltage scaling”
6
11
Multiple Supplies in a Block
CVS Layout:
Usami’98
12
Level-Converting Flip-Flop
CK
CK
CLK CK
CK
D
VL
CK
Q
CK
VH
M1 M2
CK
7
13V1 = 1.5V, VTH = 0.3V, p(t):lambda
Po
wer
Red
uct
ion
Rat
io
0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 1.3 1.4
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.2
1.3
1.4
V1 (V)
V2
(V
)
+
V2 (V)V
3(V
)
From Two to Three VDD’s
From Kuroda
14
1.0
0.5
Su
pp
ly V
olt
ag
e R
ati
o
1.0
0.4
0.5 1.0 1.5V1 (V)
Po
we
r D
iss
ipa
tio
n R
ati
o
V2/V1
P2/P1
{ V1, V2 }
V2/V1
V3/V1
{ V1, V2, V3 }
0.5 1.0 1.5V1 (V)
P3/P1
V2/V1
V3/V1
V4/V1
0.5 1.0 1.5V1 (V)
P4/P1
{ V1, V2, V3, V4 }
The more VDD’s, the less power, but the effect will be saturated. Power reduction effect will be decreased as VDD’s are scaled. Optimum V2/V1 is around 0.7.
Optimum Numbers of Supplies
Hamada, CICC’01
8
5. Low Power DesignG. Dynamic Voltage Scaling
16
Power /Energy Optimization Space
Constant Throughput/Latency Variable Throughput/Latency
Energy Design Time Sleep Mode Run Time
Active
Logic design
Scaled VDD
Trans. sizing
Multi-VDD
Clock gatingDFS, DVS
Leakage
Stack effects
Trans sizing
Scaling VDD
+ Multi-VTh
Sleep T’s
Multi-VDD Variable VTh
+ Input control
+ Variable VTh
9
17
Adaptive Supply Voltages
18
Processors for Portable Devices
1000
100
10
1
Perf
orm
ance
(MIP
S)
Processor Energy (Watt*sec)1 100.1
• Eliminate performance energy trade-off
PDAs
Pocket-PCs
NotebookComputers
DynamicVoltageScaling
BurdISSCC’00
10
19
Typical MPEG IDCT Histogram
20
timeSystem Idle
DesiredThroughput
Maximum Processor Speed
Background andhigh-latency processes
Compute-intensive andlow-latency processes
• Maximize Peak Throughput• Minimize Average Energy/operation
System Optimizations:BurdISSCC’00
Processor Usage Model
11
21
Compute ASAP:D
eliv
ered
Thr
ough
put
Clock Frequency Reduction:
Excessthroughput
Always high throughput
Energy/operation remains unchanged…while throughput scaled down with fCLK
fCLK
Reduced
time
time
Common Design Approaches (Fixed VDD)
22
0
0.5
1
0 0.5 1
Ener
gy/o
pera
tion
Throughput ( fCLK)
Constant supply voltage
Reduce VDD, slowcircuits down.
~10x EnergyReduction
3.3V
1.1V
BurdISSCC’00
Scale VDD with Clock Frequency
12
23
Inverter
RingOsc
RegFile
SRAM
1.0
0.5
0VT 2VT 3VT 4VT
Nor
mal
ized
max
. fC
LK
VDD
Delay tracks within +/- 10%BurdISSCC’00
CMOS Circuits Track Over VDD
24
time
• Dynamically scale energy/operation with throughput.• Always minimize speed minimize average energy/operation.• Extend battery life up to 10x with the exact same hardware!
Vary fCLK,VDD
Del
iver
edTh
roug
hput
1 2 Dynamically adapt
BurdISSCC’00
Dynamic Voltage Scaling (DVS)
13
25
• DVS requires a voltage scheduler (VS).• VS predicts workload to estimate CPU cycles.• Applications supply completion deadlines.
0
20
40
60
80
0 0.2 0.4 0.6 1.0 1.2 1.40.8
Processor Speed (MPEG)
F DES
IRED
(M
Hz)
Time (sec)
Operating System Sets Processor Speed
DESIREDCPU cycles
Ftime
26
RST Counter
Latch
Digital Loop Filter
LCDD
VDD
PENAB
NENABFERR
FMEAS
f1MHz
0110
100 FDES
+Register
fCLK
Ring Oscillator Processor
IDD
• Feedback loop sets VDD so that FERR 0.• Ring oscillator delay-matched to CPU critical paths.• Custom loop implementation Can optimize CDD.
7
Buck converter
Set byO.S.
BurdISSCC’00
Converter Loop Sets VDD, fCLK
14
27
Next Lecture
Continue DVS