An Adaptive Disk Spindown Algorithm

Corrie Scalisi

University of California - Santa Cruz

April 16, 2007

Outline

1 Introduction

2 Datasets

3 Experts

4 Loss Function

5 BestShift(K)

6 Sharing Algorithm


Introduction

Algorithm for deciding when to spin down a disk to save power

Based on Helmbold et al., MONET 2000

Uses a Mixture of Experts and Weighted Majority approach

See Littlestone and Warmuth 1994


On-line Data

Temporally correlated

Predictable given history

Randomly permute an on-line dataset to break these correlations

Significant visual difference between original data and permutation

Our algorithm must perform considerably better on the original data than it does on a random permutation.
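The permutation baseline described above can be sketched in a few lines. This is only an illustration: the function name `permuted_copy` and the fixed `seed` are assumptions made here so the baseline is reproducible, not part of the original presentation.

```python
import random


def permuted_copy(idle_times, seed=0):
    """Return a randomly permuted copy of the idle-time sequence.

    The multiset of idle times is unchanged; only their order, and
    therefore any temporal correlation, is destroyed.
    """
    shuffled = list(idle_times)           # leave the original trace intact
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return shuffled
```

An adaptive policy can then be run on both `idle_times` and `permuted_copy(idle_times)`; any gap between the two results is attributable to ordering, since the values themselves are identical.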


Cello-2 Dataset

On-line


Intel Dataset

Already looks like a randomized permutation


Experts

Our algorithms use a set of experts, which each specify a timeout value.

The optimal set of experts for a dataset always includes 0 and actual idle durations within the dataset:
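A rough way to see why these candidates suffice: with the loss from the Loss Function section, a timeout strictly between two observed idle durations only adds spinning time compared with the smaller duration, so the best fixed timeout can be searched for among 0 and the observed idle times. The sketch below does that search by brute force; `best_fixed_timeout` and its arguments are hypothetical names chosen for this example.

```python
def best_fixed_timeout(idle_times, spindown_cost):
    """Search 0 and the observed idle durations for the best fixed timeout."""

    def loss(idle_time, timeout):
        # Loss as defined in the Loss Function section of this presentation.
        if idle_time <= timeout:
            return idle_time
        return timeout + spindown_cost

    def total_loss(timeout):
        return sum(loss(idle, timeout) for idle in idle_times)

    candidates = sorted({0.0, *idle_times})  # 0 plus each distinct idle time
    return min(candidates, key=total_loss)
```

For the trace `[1, 1, 1, 30]` with a spindown cost of 5, the candidates are 0, 1 and 30, with total losses 20, 9 and 33, so the search returns 1.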


Experts


Loss Function

Energy used by time-out:

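$$
\mathrm{Loss}(\text{timeout}) =
\begin{cases}
\text{idle time} & \text{if idle time} \le \text{timeout} \\
\text{timeout} + \text{spindown cost} & \text{if idle time} > \text{timeout}
\end{cases}
$$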

Energy used by optimal:

$$
\mathrm{Loss}(\text{optimal}) =
\begin{cases}
\text{idle time} & \text{if idle time} \le \text{spindown cost} \\
\text{spindown cost} & \text{if idle time} > \text{spindown cost}
\end{cases}
$$
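The two loss definitions above translate directly into code. The following is a minimal sketch; the function names `timeout_loss` and `optimal_loss` are assumptions for this example, and all quantities are taken to be in the same units (for instance, seconds of spinning-equivalent energy).

```python
def timeout_loss(idle_time, timeout, spindown_cost):
    """Energy used by a fixed timeout over one idle period."""
    if idle_time <= timeout:
        return idle_time             # disk kept spinning for the whole period
    return timeout + spindown_cost   # spun for `timeout`, then spun down


def optimal_loss(idle_time, spindown_cost):
    """Energy used by the optimal policy, which knows the idle time in advance."""
    # Keep spinning if the idle period is short, otherwise spin down at once.
    return min(idle_time, spindown_cost)
```

For example, with a spindown cost of 10 and a timeout of 5, an idle period of 3 costs 3 under both policies, while an idle period of 8 costs 15 for the timeout policy and 8 for the optimal one.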


Energy Used By Timeout - Fixed Timeout


Two Timeouts

If t2 > t1, then t2 has the smaller loss for idle times in the interval (t1, t2), as long as the idle time is below t1 + spindown cost

If there is high certainty that the idle period is about to end, then the disk should be kept running


Energy Used By Timeout - Fixed Idle Time

If timeout < idletime, Loss = Spindown Cost + Timeout

If timeout ≥ idletime (don’t spin down), Loss = idletime


BestShift(K)

Dynamic programming

Illustrates how adaptively changing the timeout value lowers the amount of energy used.

Maintains a table B holding minimum Loss values.

Run on a dataset and a random permutation in order to capture whether the data is "on-line"


BestShift(K) - Simple Algorithm

t = trial number

k = max number of times the timeout value can change

Dynamic programming algorithm

Maintain loss for each (t, k) pair, B(t, k)

Maintain last expert used E(t, k)

$$
\mathrm{LossWithoutShift} = B(t-1, k) + \mathrm{Loss}(t, E(t, k))
$$

$$
\mathrm{LossWithShift} = \min_{m < t} \left\{ B(t-m, k-1) + \mathrm{Loss}(t, E(t-m, k)) \right\}
$$

$$
B(t, k) =
\begin{cases}
0 & t = 0,\ k \ge 1 \\
\min(\mathrm{LossWithoutShift}, \mathrm{LossWithShift}) & t > 0,\ k > 1
\end{cases}
$$

Running time: $O(T^2 K)$


BestShift(K) - Improved Algorithm

t = trial number

k = max number of times the timeout value can change

i = the index into a list of timeout values

$$
B(t, k, i) =
\begin{cases}
0 & \text{if } t = 0,\ k \ge 1 \\
B(t-1, 1, i) + \mathrm{Loss}(t, i) & \text{if } k = 1,\ t > 0 \\
\min\left\{ B(t-1, k, i) + \mathrm{Loss}(t, i),\ \min_{j \ne i} \left( B(t-1, k-1, j) + \mathrm{Loss}(t, i) \right) \right\} & \text{if } t > 0,\ k > 1
\end{cases}
$$

We need to consider TKN table entries

Create M_t, a KN sized table for each t from 0 to T


BestShift(K)

To calculate entries in M_t we will refer to entries in M_{t-1}

On each (k, n) combination we must consider

(a) The entry (k, n) of M_{t-1}, the minimum cost of using a partition that includes expert n on the t − 1 trial

(b) The minimal entry in the k − 1 row of M_{t-1}, the cost of using a partition that includes a different expert on the t − 1 trial

The (k, n) entry of M_t is the minimum of (a) and (b), plus Loss(t, n)


BestShift - Running Time

Note: Loss(t, i) does not depend on j, so it can be moved outside the minimum:

$$
\min_{j \ne i} \left( B(t-1, k-1, j) + \mathrm{Loss}(t, i) \right) = \min_{j \ne i} \left( B(t-1, k-1, j) \right) + \mathrm{Loss}(t, i)
$$

If we compute $\min_{j \ne i} B(t-1, k-1, j)$ separately for each (k, i), the running time is $O(TKN^2)$

However, the minimum holds for all j since it is the minimum over all j in the k − 1 row of M_{t-1}

Therefore if we maintain the minimum we can use the quantity for each entry in the k-th row of M_t

Then, the runtime is O(TKN)
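The improved recurrence with the row-minimum trick can be sketched as follows. This is an illustrative reading of the slides rather than the author's implementation: the name `best_shift`, the input layout `losses[t][i]` (loss of expert i on trial t), and the choice to return only the final minimum for each k are all assumptions made for this example. As on the slide, the minimum over j ≠ i is replaced by the minimum over the whole k − 1 row.

```python
def best_shift(losses, K):
    """Minimum total loss using at most k segments, for k = 1..K.

    losses[t][i] is the loss of expert i on trial t (hypothetical layout).
    Runs in O(T * K * N) time by reusing one row minimum per (t, k).
    """
    n = len(losses[0])
    prev = [[0.0] * n for _ in range(K)]      # B(0, k, i) = 0
    for trial in losses:
        cur = []
        for k in range(K):                    # list index k holds the slide's k + 1
            if k == 0:
                # Base case: a single expert for the whole sequence.
                row = [prev[0][i] + trial[i] for i in range(n)]
            else:
                row_min = min(prev[k - 1])    # computed once, shared by all i
                row = [min(prev[k][i], row_min) + trial[i] for i in range(n)]
            cur.append(row)
        prev = cur                            # M_t only needs M_{t-1}
    return [min(row) for row in prev]
```

Only the previous table is kept in memory, so the space used is O(KN) even though TKN entries are computed in total.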


BestShift for Cello-2

Takes advantage of temporal similarities in the original data.

BestShift curve for randomized permutation of data is very flat because of lower temporal correlation.


BestShift for Intel Dataset

Similar for original and permuted data because of low temporal correlation in both.

Only 1751 idle times greater than 10.

Approaches optimal quickly by shifting at the most costly idle times.


Sharing Algorithm

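Weight vector:

$$
w = (w_1, w_2, \ldots, w_n) = \left( \tfrac{1}{n}, \tfrac{1}{n}, \ldots, \tfrac{1}{n} \right)
$$

Experts t = (t_1, t_2, ..., t_n)

Make prediction based on weights of experts:

$$
\text{time-out} = \sum_{i=1}^{n} w_i t_i
$$

Calculate loss for each expert t_i:

$$
\mathrm{Loss}(i) = \frac{\text{energy used by } t_i}{\text{spindown cost}}
$$

Loss update - reduce weights and then normalize

$$
w_i = \frac{w_i e^{-\eta \mathrm{Loss}(i)}}{\sum_{j=1}^{n} w_j e^{-\eta \mathrm{Loss}(j)}}
$$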

Share update - redistribute weight
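The prediction and loss-update steps on this slide can be sketched as below. The function names `predict_timeout` and `loss_update` are hypothetical, and the learning rate `eta` is left as a parameter because the slides do not give a value for it.

```python
import math


def predict_timeout(weights, timeouts):
    """Weighted average of the experts' timeout values."""
    return sum(w * t for w, t in zip(weights, timeouts))


def loss_update(weights, losses, eta):
    """Scale each weight by exp(-eta * loss), then renormalize to sum to 1."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]
```

Experts with a larger loss on the current trial lose weight relative to the others, so the next prediction moves toward the timeouts that would have done well on the idle period just seen.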


Variable-Share Update

Calculate pool of weight to share:

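$$
\mathrm{weightToShare} = \sum_{i=1}^{n} \mathrm{oldWeight}(i) \cdot \left( 1 - (1 - \alpha)^{\mathrm{Loss}(i)} \right)
$$

Distribute weight amongst experts:

$$
\mathrm{weight}(i) = \mathrm{oldWeight}(i) \cdot (1 - \alpha)^{\mathrm{Loss}(i)} + \frac{1}{n} \mathrm{weightToShare}
$$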

Variable-Share update from Herbster and Warmuth 95

Other Share Updates:

Fixed Share to Start Vector Update, Fixed Share to Past Average, etc.
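The Variable-Share step as written on this slide can be sketched as follows. The function name `variable_share` is an assumption for this example, and the pool is split evenly over all n experts, following the formula above; other formulations of the update may distribute the pool differently.

```python
def variable_share(weights, losses, alpha):
    """Variable-Share update in the form given on the slide.

    Each expert keeps a fraction (1 - alpha) ** loss of its weight; the
    remainder goes into a pool that is split evenly among all experts.
    """
    n = len(weights)
    kept = [(1.0 - alpha) ** loss for loss in losses]
    pool = sum(w * (1.0 - k) for w, k in zip(weights, kept))
    return [w * k + pool / n for w, k in zip(weights, kept)]
```

An expert with zero loss keeps all of its weight and contributes nothing to the pool, and the total weight is unchanged by the update. For instance, `variable_share([0.9, 0.1], [1.0, 0.0], 0.5)` moves weight from the first expert toward the second, giving approximately `[0.675, 0.325]`.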


Performance of Adaptive Algorithm on Cello-2


Performance of Adaptive Algorithm on Intel Data


Performance of Adaptive Algorithm on Cello-2 - Compared to Loss Curve


Performance of Adaptive Algorithm on Intel Data - Compared to Loss Curve


Conclusion

Share algorithm very effective on on-line data.

May not give improvement over static policies on data without significant temporal correlation (not on-line)

