Dynamic Voltage Scheduling Using Adaptive Filtering of Workload … · 2004. 1. 20. · Adaptive...
Dynamic Voltage Scheduling Using
Adaptive Filtering of Workload Traces
Amit Sinha and Anantha Chandrakasan
Massachusetts Institute of Technology
Sinha, VLSI ’01 2
Overview
• Introduction
• Typical Workload Profile
• DVS Basics
• Energy Workload Models
• Workload Prediction
• Markov Processes
• Various Algorithms
• Energy Performance Tradeoffs
• Results and Conclusions
Typical Processor Workload Profiles
[Figure: processor utilization (%) versus time (s) for three workload traces: Dialup Server, Workstation, Fileserver]
Dynamic Voltage Scaling
Fixed Power Supply (ACTIVE, IDLE): $E_{FIXED} = \frac{1}{2} C V_{DD}^2$
Variable Power Supply (ACTIVE): $E_{VARIABLE} = \frac{1}{2} C \left(\frac{V_{DD}}{2}\right)^2 = \frac{E_{FIXED}}{4}$
[Figure: normalized energy versus normalized workload, both from 0 to 1.0, with curves for Fixed Supply and Variable Supply]
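The factor of four follows from the quadratic dependence of switching energy on supply voltage. A minimal Python sketch of that arithmetic; the capacitance and voltage values are placeholders chosen only for illustration and do not come from the slides:

```python
def switching_energy(c_farads, vdd_volts):
    """Energy per switching event, E = 1/2 * C * VDD^2."""
    return 0.5 * c_farads * vdd_volts ** 2

# Hypothetical values (assumptions, not taken from the slides).
C = 1e-9      # switched capacitance in farads
VDD = 1.5     # full supply voltage in volts

e_fixed = switching_energy(C, VDD)         # fixed supply at VDD
e_variable = switching_energy(C, VDD / 2)  # variable supply at VDD / 2
ratio = e_variable / e_fixed               # close to 1/4, independent of C and VDD
print(ratio)
```

The ratio does not depend on the placeholder values, since both C and VDD cancel.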
Enabling Technology
• Variable frequency processors available
  • Transmeta’s Crusoe: LongRun Technology
  • AMD K6-2+: PowerNOW!
  • Mobile Pentium III: SpeedStep
• StrongARM SA-1100: 59 MHz to 206 MHz (0.8 V to 1.5 V)
[Figure: StrongARM DVS circuit]
Energy Workload Model
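Energy as a function of workload r:
$$E(r) = C V_0^2 T_s f_{ref}\, r \left[ \frac{V_t}{V_0} + \frac{r}{2} + \sqrt{ r \frac{V_t}{V_0} + \left( \frac{r}{2} \right)^2 } \right]^2$$
[Gutnik97]
Current as a function of workload r:
$$I(r) = I_{ref}\, r\, \frac{V_0}{V_{ref}} \left[ \frac{V_t}{V_0} + \frac{r}{2} + \sqrt{ r \frac{V_t}{V_0} + \left( \frac{r}{2} \right)^2 } \right]$$
[Figure, three panels. "Current vs. Workload": relative current (I/I_max) versus workload (r). "DC/DC Efficiency": relative efficiency (%) versus relative current load (I/I_max). "Energy vs. Workload": normalized energy versus workload (r), with curves for No Voltage Scaling, DVS with Converter Efficiency, and Ideal DVS]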
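To get a feel for the shape of these curves, the two expressions can be evaluated numerically. The sketch below is only illustrative. The slide does not define V_0, so the code assumes V_0 = (V_ref − V_t)² / V_ref, the value for which the bracketed term yields V_ref at r = 1. The supply and threshold voltages are hypothetical, and DC/DC converter efficiency is ignored.

```python
import math

def bracket(r, vt, v0):
    """Common factor: Vt/V0 + r/2 + sqrt(r*Vt/V0 + (r/2)^2)."""
    return vt / v0 + r / 2 + math.sqrt(r * vt / v0 + (r / 2) ** 2)

def normalized_energy(r, vref, vt):
    """E(r) / E(1); the constants C, Ts and f_ref cancel in the ratio."""
    v0 = (vref - vt) ** 2 / vref          # assumed definition of V0
    return r * bracket(r, vt, v0) ** 2 / bracket(1.0, vt, v0) ** 2

def normalized_current(r, vref, vt):
    """I(r) / I_ref."""
    v0 = (vref - vt) ** 2 / vref          # assumed definition of V0
    return r * (v0 / vref) * bracket(r, vt, v0)

VREF, VT = 1.5, 0.5   # volts; hypothetical values for illustration

for r in (0.25, 0.5, 0.75, 1.0):
    print(r, round(normalized_energy(r, VREF, VT), 3),
          round(normalized_current(r, VREF, VT), 3))
```

With a fixed supply the normalized energy would simply equal r, so any value below r in the second column is the saving attributable to voltage scaling under these assumptions.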
Workload Prediction
• How to predict the workload, w?
• How frequently should the processing rate, f(r), be updated?
[Figure: block diagram with a Task Queue (n input streams), a Workload Monitor, a DC/DC Converter and a Variable Voltage Processor; labelled signals are V_fixed, V(r), w, f(r) and r. Annotation: "Can be modelled as a Markov Process"]
Prediction Algorithms
The predicted workload is a weighted sum of the N previous workloads:
$$w_p[n+1] = \sum_{k=0}^{N-1} h_n[k]\, w[n-k]$$
Moving Average Workload (MAW): $h_n[k] = \frac{1}{N} \;\; \forall\, n, k$
• Simplest
• Performance degradation with fast loads
Exp. Weighted Average (EWA): $h_n[k] = a^{-k}$
• Lower significance of older data
• Event prediction context [Hwang97]
Least Mean Square (LMS): $h_{n+1}[k] = h_n[k] + \mu\, w_e[n]\, w[n-k]$
• Adaptive filter, self-adjusting
• Convergence issues
Expected Workload State (EWS): $w_p[n+1] = E\{w[n+1]\} = \sum_{j=0}^{L} w_j\, p_{ij}$
• Probabilistic formulation
• Transition matrix updated every slot
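The four predictors can be sketched in a few lines of plain Python. This is an illustrative reading of the formulas, not the authors' implementation. Several details are assumptions: the EWA taps are rescaled to sum to 1, the LMS taps start from the moving-average values, the LMS error is taken as the actual workload minus the prediction (the standard LMS pairing, which may differ from the slide's indexing by one sample), and the EWS transition probabilities are estimated from hypothetical transition counts.

```python
def predict(h, history):
    """w_p[n+1] = sum_k h[k] * w[n-k], where history[k] holds w[n-k]."""
    return sum(hk * wk for hk, wk in zip(h, history))

def maw_taps(N):
    """Moving average: h[k] = 1/N."""
    return [1.0 / N] * N

def ewa_taps(N, a=2.0):
    """Exponential weights h[k] = a**-k, rescaled to sum to 1 (assumption)."""
    raw = [a ** -k for k in range(N)]
    total = sum(raw)
    return [x / total for x in raw]

def lms_run(trace, N=3, mu=0.1):
    """Predict each sample from the N before it, adapting the taps with LMS."""
    h = maw_taps(N)                              # assumed starting taps
    predictions = []
    for n in range(N - 1, len(trace) - 1):
        history = trace[n - N + 1:n + 1][::-1]   # w[n], w[n-1], ..., w[n-N+1]
        w_p = predict(h, history)
        predictions.append(w_p)
        error = trace[n + 1] - w_p               # known once slot n+1 has passed
        h = [hk + mu * error * wk for hk, wk in zip(h, history)]
    return predictions, h

def ews_predict(counts, levels, i):
    """E{w[n+1]} = sum_j w_j * p_ij, with p_ij estimated from transition counts."""
    total = sum(counts[i])
    if total == 0:
        return levels[i]                         # no data yet: assume state persists
    return sum(w_j * c / total for w_j, c in zip(levels, counts[i]))

# Hypothetical constant trace: every predictor should return about 0.5.
trace = [0.5] * 10
preds, taps = lms_run(trace)
```

On a constant trace the prediction error is essentially zero, so the LMS taps stay at their initial values; the adaptation only matters once the workload changes.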
Prediction Performance
• Best prediction with LMS and about 3 taps
• Averaged over different processors and times
• 1 s update rate
• 1 hour processor utilization snapshots
[Figure: RMS error versus filter taps (N) for MAW, EWS, LMS and EWA. Fewer taps give a noisy prediction; more taps give excessive low-pass filtering (LPF)]
LMS Tracking of Workload
[Figure: workload versus time (s), with curves labelled Continuous, Perfect and Predicted. Parameters: N = 3, T = 10, Levels = 10, µ = 0.1]
Energy Performance Tradeoff
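• Averaging is energy efficient
$$\frac{r_1^2 + r_2^2}{2} \ge \left( \frac{r_1 + r_2}{2} \right)^2 \;\rightarrow\; \overline{E(r)} \ge E(\bar{r})$$
[Figure: workload versus time over two slots (T, 2T) for profiles W1 and W2, with axis marks 0.5 and 1.0 and a labelled value of 0.675; and the corresponding energy for W1 and W2, with axis marks 0.5 and 1.0 and a labelled value of 0.5625]
• Decreased averaging: higher energy, faster response
• Increased averaging: lower energy, sluggish performance
• Update time T depends on
  • Maximum allowed performance hit
  • DC/DC converter and frequency change overheads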
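The inequality can be checked numerically. A small sketch, assuming the quadratic energy model E(r) = r² that the inequality implies, and a two-slot workload of 1.0 followed by 0.5 (the choice of these two values is an assumption based on the axis marks in the figure):

```python
def energy(r):
    """Normalized energy under the quadratic model E(r) = r^2 (assumption)."""
    return r ** 2

r1, r2 = 1.0, 0.5   # workloads in the two slots (assumed values)

mean_of_energies = (energy(r1) + energy(r2)) / 2   # track each slot exactly
energy_of_mean = energy((r1 + r2) / 2)             # run both slots at the average
print(mean_of_energies, energy_of_mean)            # 0.625 0.5625
```

Running both slots at the averaged rate of 0.75 costs 0.5625, which agrees with the value labelled in the energy panel, against 0.625 when each slot is tracked exactly.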
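Performance Hit Metric
• Performance Hit Function
$$\phi(\Delta t) = \frac{w_{\Delta t} - r_{\Delta t}}{r_{\Delta t}}$$
• Maximum and Average
$$\phi^{T}_{\max}(\Delta t), \quad \phi^{T}_{avg}(\Delta t)$$
• The maximum can be used to set the update time
[Figure: performance hit versus update time (s), showing F_max and F_avg for N = 2, N = 6 and N = 10; a horizontal line marks the maximum allowed performance hit and the corresponding update time T_max]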
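As a worked example of the metric, the sketch below evaluates the performance hit slot by slot and then takes the maximum and the average. The workload and rate values are hypothetical. Negative values (slots where the rate exceeds the workload) are kept as the formula gives them; whether they should be clipped to zero is not stated on the slide.

```python
def performance_hit(w, r):
    """phi = (w - r) / r for one slot: actual workload w against processing rate r."""
    return (w - r) / r

# Hypothetical per-slot values (assumptions for illustration only).
workloads = [0.6, 0.5, 0.9, 0.3]   # actual workload in each slot
rates = [0.5, 0.5, 0.6, 0.6]       # processing rate chosen for each slot

hits = [performance_hit(w, r) for w, r in zip(workloads, rates)]
phi_max = max(hits)                # worst-case slowdown over the window
phi_avg = sum(hits) / len(hits)    # average slowdown over the window
print(hits, phi_max, phi_avg)
```

Here the third slot is under-provisioned by about 50%, which sets the maximum, while the over-provisioned last slot pulls the average down.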
Optimum Update Time and Taps
• N, T selections are not completely independent!
• Good choice: N = 3, T = 5 s
[Figure: normalized energy as a function of update time T (s) and filter taps (N)]
Discrete Processing Levels
• Discrete frequency levels are not too bad.
  • StrongARM has 11 levels [degradation < 5%]
[Figure: E_actual / E_perfect versus processing levels (L), for an LMS filter with N = 3, T = 5]
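One way to picture the effect of discrete levels is to snap the requested processing rate to the nearest available level at or above it. The sketch below assumes equally spaced levels over (0, 1] and rounding upward so that the processor is never slower than requested; both choices are assumptions, and a real processor's frequency steps need not be evenly spaced.

```python
import math

def quantize_rate(r, levels):
    """Round a requested rate r in (0, 1] up to the next of `levels` equal steps."""
    step = math.ceil(r * levels - 1e-9)   # small guard against float round-off
    step = max(1, min(levels, step))      # stay within the available levels
    return step / levels

print(quantize_rate(0.33, 10))   # 0.4
print(quantize_rate(0.7, 10))    # 0.7
```

The energy penalty of discrete levels comes from the gap between the requested and the quantized rate, which shrinks as the number of levels grows.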
Results
ESR = Energy Savings Ratio. Perfect, Max and Max / Perfect are given once per trace.

Trace             Filter  ESR      ESR    ESR     Max /    Perfect /  F_avg  F_max
                          Perfect  Max    Actual  Perfect  Actual     (%)    (%)
Dialup Server     MAW     2.4      2.9    2.2     1.2      1.10       10.6   34.8
                  EWS                     2.1              1.11       10.8   36.3
                  EWA                     2.2              1.09       10.6   35.4
                  LMS                     2.3              1.03       14.7   43.1
Fileserver        MAW     23.5     76.7   16.7    3.3      1.41       12.6   42.8
                  EWS                     15.7             1.50       7.4    33.8
                  EWA                     16.7             1.41       9.2    37.4
                  LMS                     19.6             1.20       14.1   47.7
User Workstation  MAW     275.2    445.9  52.7    1.6      5.22       3.6    35.3
                  EWS                     59.5             4.63       3.8    35.1
                  EWA                     52.1             5.28       3.7    35.6
                  LMS                     53.0             5.19       3.9    36.0
Conclusions
• DVS is very effective for energy reduction
  • Up to 2 orders of magnitude savings possible
  • About 30% ‘instantaneous’ performance loss
• Averaged workloads are best
  • Makes system sluggish to workload changes
  • Unknown a priori
• Energy Performance Tradeoff
  • Faster updates lower visible performance loss
  • Faster updates also mean increased energy
• Workload prediction is crucial
  • Adaptive LMS filtering is quite effective