Revisiting Size-Based Scheduling with Estimated Job Sizes
Matteo Dell’Amico (EURECOM, France),
Damiano Carra (Univ. Verona, Italy)
Mario Pastorelli, Pietro Michiardi (EURECOM, France)
MASCOTS 2014
These slides at http://bit.ly/schedsim
On Size-Based Scheduling
Processor-Sharing vs. Size-Based
[Figure: cluster usage (%) against time (s) for job 1, job 2 and job 3, in two panels. One panel has time-axis marks at 10, 15, 37.5, 42.5 and 50 s; the other at 10, 20, 30 and 50 s.]
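The marks on the two time axes can be reproduced with a small simulation. This is a minimal sketch, not the authors' simulator: the job sizes and arrival times are assumptions chosen to be consistent with the axis marks (job 1: 30 s of work arriving at t = 0, job 2: 10 s at t = 10, job 3: 10 s at t = 15), and `simulate` and `mean_sojourn` are hypothetical helpers. SRPT stands in for the size-based policy.

```python
# Assumed workload: (name, arrival time, size), on a unit-speed server.
JOBS = [("job 1", 0, 30), ("job 2", 10, 10), ("job 3", 15, 10)]

def simulate(jobs, policy):
    """Return {name: completion time} under policy "PS" or "SRPT"."""
    arrivals = sorted(jobs, key=lambda j: j[1])
    remaining, done = {}, {}
    now, i = 0.0, 0
    while i < len(arrivals) or remaining:
        if not remaining:                      # idle server: jump to next arrival
            now = max(now, arrivals[i][1])
        while i < len(arrivals) and arrivals[i][1] <= now:
            name, _, size = arrivals[i]
            remaining[name] = float(size)
            i += 1
        if policy == "PS":                     # equal share for every pending job
            share = {n: 1.0 / len(remaining) for n in remaining}
        else:                                  # SRPT: all capacity to the shortest
            share = {min(remaining, key=remaining.get): 1.0}
        # Advance to the next completion or arrival, whichever comes first.
        dt = min(remaining[n] / s for n, s in share.items())
        if i < len(arrivals):
            dt = min(dt, arrivals[i][1] - now)
        for n, s in share.items():
            remaining[n] -= s * dt
        now += dt
        for n in [n for n in remaining if remaining[n] <= 1e-9]:
            done[n] = now
            del remaining[n]
    return done

def mean_sojourn(jobs, completions):
    """Mean of (completion time - arrival time) over all jobs."""
    return sum(completions[n] - arrival for n, arrival, _ in jobs) / len(jobs)

for policy in ("PS", "SRPT"):
    result = simulate(JOBS, policy)
    print(policy, result, "mean sojourn:", round(mean_sojourn(JOBS, result), 2))
```

Under these assumptions the mean sojourn time is 35 s with processor sharing and 25 s with SRPT.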
Size-Based Scheduling: Some Properties
Shortest Remaining Processing Time (SRPT)
Minimizes mean sojourn time (MST) [Schrage, OPER RES ’68]
Sojourn time: interval between job submission and completion
Fair Sojourn Protocol (FSP)
Jobs are scheduled in the order they would complete if doing Processor Sharing (PS)
Efficiency: very close to SRPT
Fairness: each job completes no later than under Processor Sharing [Friedman & Henderson, SIGMETRICS ’03]
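A minimal sketch of the FSP rule, assuming exact job sizes: a virtual processor-sharing system runs alongside the real one, and the server always works on the pending job with the least work left in the virtual system, which is the job that would complete first under PS. The function name `fsp` and the three jobs are illustrative assumptions; this is not the schedsim code.

```python
def fsp(jobs):
    """jobs: list of (name, arrival, size). Returns {name: completion time}."""
    arrivals = sorted(jobs, key=lambda j: j[1])
    real = {}      # work left in the real system
    virtual = {}   # work left in the simulated PS system
    done = {}
    now, i, eps = 0.0, 0, 1e-9
    while i < len(arrivals) or real:
        while i < len(arrivals) and arrivals[i][1] <= now:
            name, _, size = arrivals[i]
            real[name] = virtual[name] = float(size)
            i += 1
        # Serve the pending job that is closest to completion under virtual PS.
        served = min(real, key=lambda n: virtual.get(n, 0.0)) if real else None
        # Next event: arrival, real completion or virtual completion.
        steps = []
        if i < len(arrivals):
            steps.append(arrivals[i][1] - now)
        if served is not None:
            steps.append(real[served])
        if virtual:
            steps.append(min(virtual.values()) * len(virtual))
        dt = min(steps)
        now += dt
        if served is not None:
            real[served] -= dt
            if real[served] <= eps:
                done[served] = now
                del real[served]
        share = dt / len(virtual) if virtual else 0.0
        for n in list(virtual):                # virtual jobs share the server
            virtual[n] -= share
            if virtual[n] <= eps:
                del virtual[n]
    return done

# Hypothetical workload: (name, arrival time, size).
print(fsp([("job 1", 0, 30), ("job 2", 10, 10), ("job 3", 15, 10)]))
```

With these three hypothetical jobs, job 2 completes at t = 20, job 3 at t = 30 and job 1 at t = 50. Processor sharing on the same jobs would complete them at 37.5, 42.5 and 50, so no job finishes later under FSP.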
Where Are All the Size-Based Schedulers?
Job size is almost never known a priori
Related Work: Inexact Job Size Information
Simulation-based study: estimations need to be precise [Lu et al., MASCOTS 2004]
Analytic study: bounded errors, over-estimation only [Wierman & Nuyens, SIGMETRICS PER, 2008]
What Motivated Us
We developed HFSP, a size-based scheduler for Hadoop
We found it works very well even with very rough estimations [Pastorelli et al., BIGDATA 2013]
Understanding Size-Based Scheduling With Errors
Scheduling Simulation
Main Features
Simulates single-server, preemptive scheduling
Can create synthetic traces or replay real ones
Injects artificial size estimation errors
in this case, SRPT and FSP become SRPTE and FSPE
Efficient and easy to prototype new schedulers
10,000 jobs are simulated in half a second on an old laptop
FSP is ~50 lines of Python code
Free Software: Apache License 2.0
https://bitbucket.org/bigfootproject/schedsim
Log-Normal Error Distribution
[Figure: log-normal probability density for x from 0 to 3, with sigma = 0.125, 0.25, 1 and 4.]
Error: ratio between real size and estimated size
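A minimal sketch of how such an error could be injected, assuming the estimate is the real size multiplied by a log-normal factor with mu = 0 (with mu = 0 the factor and its inverse follow the same distribution, so the direction of the ratio does not change the statistics). `estimate_size` is a hypothetical helper, not the simulator's API.

```python
import random
import statistics

def estimate_size(real_size, sigma, rng):
    """Return real_size times a log-normal factor (mu = 0, assumed error model)."""
    return real_size * rng.lognormvariate(0.0, sigma)

rng = random.Random(42)  # fixed seed so that runs are repeatable
for sigma in (0.125, 0.25, 1, 4):
    factors = [estimate_size(1.0, sigma, rng) for _ in range(20000)]
    within_2x = sum(0.5 <= f <= 2.0 for f in factors) / len(factors)
    print(f"sigma={sigma}: median factor {statistics.median(factors):.2f}, "
          f"{within_2x:.0%} of estimates within a factor of 2")
```

Small values of sigma keep almost all estimates within a factor of 2 of the real size; large values let estimates be wrong by orders of magnitude in either direction.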
Weibull Job Size Distribution
[Figure: Weibull probability density for x from 0 to 3, with shape = 0.125, 1, 2 and 4.]
Interpolates between
heavy-tailed job size distributions (shape < 1)
exponential distributions (shape = 1)
bell-shaped distributions (shape > 1)
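A minimal sketch of what the shape parameter does to a workload, using the standard library's Weibull sampler. Rescaling every distribution to mean 1 is an assumption made here for comparability, and `weibull_sizes` and `top_share` are hypothetical helpers.

```python
import math
import random

def weibull_sizes(n, shape, rng, mean=1.0):
    """Draw n job sizes from a Weibull distribution rescaled to the given mean."""
    # The mean of a Weibull distribution is scale * Gamma(1 + 1/shape).
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return [rng.weibullvariate(scale, shape) for _ in range(n)]

def top_share(sizes, fraction=0.01):
    """Share of the total work held by the largest `fraction` of the jobs."""
    ordered = sorted(sizes, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)

rng = random.Random(1)
for shape in (0.125, 0.25, 1, 2, 4):
    sizes = weibull_sizes(20000, shape, rng)
    print(f"shape={shape}: largest 1% of jobs hold {top_share(sizes):.1%} of the work")
```

With shape < 1 a handful of very large jobs carries most of the work, which is the regime where a badly estimated large job matters most.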
Other Parameters
Number of jobs (default: 10,000 per workload)
at least 30 repetitions per data point
System load (default: 0.9)
Ratio between requested and available resources
Job arrival time (default: exponential)
Bursts vs. regular
These parameters are not fundamental
more details in the paper
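A minimal sketch of how a target load could be obtained, assuming a unit-speed server and exponentially distributed gaps between consecutive arrivals (one reading of the "exponential" default). On such a server the load equals the arrival rate times the mean job size, so the rate follows from the load. `poisson_arrivals` is a hypothetical helper.

```python
import random

def poisson_arrivals(sizes, load, rng):
    """Arrival times with exponential inter-arrival gaps giving the target load."""
    mean_size = sum(sizes) / len(sizes)
    rate = load / mean_size          # load = arrival rate * mean job size
    now, times = 0.0, []
    for _ in sizes:
        now += rng.expovariate(rate)
        times.append(now)
    return times

rng = random.Random(3)
sizes = [rng.expovariate(1.0) for _ in range(10000)]   # placeholder job sizes
times = poisson_arrivals(sizes, load=0.9, rng=rng)
print("observed load:", round(sum(sizes) / times[-1], 3))
```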
Size-Based Scheduling With Errors
[Figure: two panels of simulation results, one for SRPTE and one for FSPE.]
Problems for heavy-tailed job size distributions
Otherwise, size-based scheduling works very well
Over-Estimations and Under-Estimations
[Figure: four plots of remaining size against time t. Over-estimation: jobs J1, J2, J3. Under-estimation: jobs J4, J5, J6.]
Over-estimating hurts a single job: limited damage
Under-estimating very large jobs can wreak havoc
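The asymmetry can be seen on a toy workload. This is a minimal sketch of SRPTE, assuming the scheduler ranks jobs by estimated size minus the work already done; the workload (one 100-unit job followed by ten 1-unit jobs) and the names `srpte` and `workload` are illustrative assumptions.

```python
def srpte(jobs):
    """jobs: list of (name, arrival, real_size, estimated_size).

    Returns {name: sojourn time} on a unit-speed server.
    """
    arrivals = sorted(jobs, key=lambda j: j[1])
    real, est, arrived, sojourn = {}, {}, {}, {}
    now, i = 0.0, 0
    while i < len(arrivals) or real:
        if not real:                           # idle server: jump to next arrival
            now = max(now, arrivals[i][1])
        while i < len(arrivals) and arrivals[i][1] <= now:
            name, t, size, estimate = arrivals[i]
            real[name], est[name], arrived[name] = float(size), float(estimate), t
            i += 1
        served = min(real, key=est.get)        # smallest estimated remaining size
        dt = real[served]
        if i < len(arrivals):
            dt = min(dt, arrivals[i][1] - now)
        real[served] -= dt
        est[served] -= dt                      # goes negative if under-estimated
        now += dt
        if real[served] <= 1e-9:
            sojourn[served] = now - arrived[served]
            del real[served], est[served]
    return sojourn

def workload(big_estimate, first_small_estimate):
    """One 100-unit job at t = 0, then ten 1-unit jobs arriving at t = 1..10."""
    jobs = [("big", 0, 100, big_estimate)]
    for k in range(1, 11):
        jobs.append((f"small {k}", k, 1, first_small_estimate if k == 1 else 1))
    return jobs

def mean(values):
    values = list(values)
    return sum(values) / len(values)

exact = srpte(workload(100, 1))       # every estimate is correct
over = srpte(workload(100, 1000))     # one small job over-estimated
under = srpte(workload(1, 1))         # the large job under-estimated
for label, result in (("exact", exact), ("over", over), ("under", under)):
    print(label, round(mean(result.values()), 2))
```

In this toy workload the mean sojourn time is about 10.9 with exact sizes, about 20.6 when one small job is over-estimated (only that job is delayed), and 100 when the large job is under-estimated: its estimated remaining size drops to zero and below, so it keeps the server and every small job waits behind it.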
Size-Based Scheduling For Approximate Sizes
FSPE + PS
Idea
Without errors, real jobs always complete before virtual ones
When they don’t (they are late), there has been a mistake
The scheduler can realize this, and take corrective action
Realization
A scheduler such that late jobs don’t block the system
Just do processor sharing between them instead of scheduling the “most late” one
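A minimal sketch of this idea, under one reading of the slide: the virtual PS system runs on estimated sizes, a job is late once it has finished there but not in the real system, plain FSPE serves the job that became late first, and FSPE+PS shares the server among all late jobs. The function `fspe`, its `late_ps` flag and the workload are illustrative assumptions, not the schedsim code.

```python
def fspe(jobs, late_ps):
    """Simulate FSPE (late_ps=False) or FSPE+PS (late_ps=True), unit-speed server.

    jobs: list of (name, arrival, real_size, estimated_size).
    Returns {name: sojourn time}.
    """
    arrivals = sorted(jobs, key=lambda j: j[1])
    real, virtual, arrived, sojourn = {}, {}, {}, {}
    late = []   # finished in the virtual system, still pending in the real one
    now, i, eps = 0.0, 0, 1e-9
    while i < len(arrivals) or real:
        while i < len(arrivals) and arrivals[i][1] <= now:
            name, t, size, estimate = arrivals[i]
            real[name], virtual[name], arrived[name] = float(size), float(estimate), t
            i += 1
        if late:        # late jobs first: all of them (PS) or only the oldest
            served = list(late) if late_ps else late[:1]
        elif real:      # no late job: least virtual work left, as in FSP
            served = [min(real, key=virtual.get)]
        else:
            served = []
        # Next event: arrival, real completion or virtual completion.
        steps = []
        if i < len(arrivals):
            steps.append(arrivals[i][1] - now)
        if served:
            steps.append(min(real[n] for n in served) * len(served))
        if virtual:
            steps.append(min(virtual.values()) * len(virtual))
        dt = min(steps)
        now += dt
        for n in served:
            real[n] -= dt / len(served)
            if real[n] <= eps:
                sojourn[n] = now - arrived[n]
                del real[n]
                if n in late:
                    late.remove(n)
        share = dt / len(virtual) if virtual else 0.0
        for n in list(virtual):
            virtual[n] -= share
            if virtual[n] <= eps:
                del virtual[n]
                if n in real:
                    late.append(n)
    return sojourn

# Hypothetical workload: a 100-unit job estimated as 1 unit, then ten 1-unit jobs.
jobs = [("big", 0, 100, 1)] + [(f"small {k}", k, 1, 1) for k in range(1, 11)]
for label, flag in (("FSPE", False), ("FSPE+PS", True)):
    result = fspe(jobs, flag)
    print(label, "mean sojourn:", round(sum(result.values()) / len(result), 2))
```

With plain FSPE the under-estimated job holds the server until it completes and the mean sojourn time is 100. With FSPE+PS the small jobs share the server with it once they become late, so they complete within a few time units and the mean drops sharply.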
FSPE + PS: Results
[Figure: two panels of simulation results, one for FSPE and one for FSPE + PS.]
Performance becomes very close to optimal
Outperformed by PS only for extreme skew and errors
Replaying real-world traces gives analogous results
Schedulers vs. SRPT
[Figure: MST / MST(SRPT) against shape (0.125 to 4) for SRPTE, FSPE, FSPE+PS, PS, LAS and FIFO. Sigma: 0.5]
Hadoop @ Facebook
[Figure: MST / MST(SRPT) against sigma (0.125 to 4) for SRPTE, FSPE, FSPE+PS, PS and LAS, in two panels: synthetic workload (shape = 0.25) and Facebook Hadoop cluster.]
Web Cache
[Figure: MST / MST(SRPT), on a logarithmic axis, against sigma (0.125 to 4) for SRPTE, FSPE, FSPE+PS, PS and LAS, in two panels: synthetic workload (shape = 0.177) and IRCache web cache. One of the two panels also shows FIFO.]
Take-Home Messages
For System Designers
Do not be afraid of size-based scheduling
it can work great even with very rough estimations
Further Research
Schedulers like FSPE+PS, designed for estimated sizes, work very well
Can we design a scheduler that always outperforms PS?
Can we get a better analytical understanding of the phenomena we observed?
These slides (plus bonus content) are available at http://bit.ly/schedsim
Bonus Content
Fairness: Slowdown
[Figure: ECDF of slowdown (logarithmic axis, 10^0 to 10^2) for SRPTE, FSPE, FSPE+PS, PS, LAS and FIFO, with a second panel covering the ECDF range 0.90 to 1.00. Shape: 0.25, sigma: 0.5]
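A minimal sketch of the quantity being plotted, assuming the usual definition of slowdown as a job's sojourn time divided by its size (the slides do not spell it out). The helpers and the sample values are illustrative assumptions.

```python
def slowdowns(jobs):
    """jobs: list of (size, sojourn_time). Returns the sorted slowdowns."""
    return sorted(sojourn / size for size, sojourn in jobs)

def ecdf(values, x):
    """Empirical CDF: fraction of values that are <= x."""
    return sum(v <= x for v in values) / len(values)

# Hypothetical (size, sojourn time) pairs for the same three jobs under two policies.
sharing = slowdowns([(30, 50.0), (10, 27.5), (10, 27.5)])
size_based = slowdowns([(30, 50.0), (10, 10.0), (10, 15.0)])
print("processor sharing:", sharing, "ECDF at 1.5:", ecdf(sharing, 1.5))
print("size-based:", size_based, "ECDF at 1.5:", ecdf(size_based, 1.5))
```

A slowdown of 1 means the job never waited; a curve that rises earlier means more jobs finish close to their ideal time.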
Fairness: Conditional Slowdown
[Figure: slowdown (10^0 to 10^7) against job size (10^−4 to 10^2), both on logarithmic axes, for SRPTE, FSPE, FSPE+PS, PS, LAS and FIFO. Shape: 0.25, sigma: 0.5]