Revisiting Size-Based Scheduling with Estimated Job Sizes

Revisiting Size-Based Scheduling with Estimated Job Sizes. Matteo Dell’Amico (EURECOM, France), Damiano Carra (Univ. Verona, Italy), Mario Pastorelli, Pietro Michiardi (EURECOM, France). MASCOTS 2014. These slides at http://bit.ly/schedsim

description

We study size-based schedulers and focus on the impact of inaccurate job size information on response time and fairness. Our intent is to revisit previous results, which suggest that performance degrades even for small errors in job size estimates, thus limiting the applicability of size-based schedulers. We show that scheduling performance is tightly connected to workload characteristics: in the absence of large skew in the job size distribution, even extremely imprecise estimates suffice to outperform size-oblivious disciplines. When job sizes are heavily skewed, instead, known size-based disciplines suffer. In this context, we show, for the first time, the dichotomy of over-estimation versus under-estimation. The former is, in general, less problematic than the latter, as its effects are localized to individual jobs, whereas under-estimation leads to severe problems that may affect a large number of jobs. We present an approach to mitigate these problems: our technique requires no complex modifications to the original scheduling policies and performs very well. To support our claim, we proceed with a simulation-based evaluation that covers an unprecedentedly large parameter space and takes into account a variety of synthetic and real workloads. As a consequence, we show that size-based scheduling is practical and outperforms alternatives in a wide array of use cases, even in the presence of inaccurate size information.

Transcript of Revisiting Size-Based Scheduling with Estimated Job Sizes

Page 1: Revisiting Size-Based Scheduling with Estimated Job Sizes

Revisiting Size-Based Scheduling

with Estimated Job Sizes

Matteo Dell’Amico (EURECOM, France),

Damiano Carra (Univ. Verona, Italy)

Mario Pastorelli, Pietro Michiardi (EURECOM, France)

MASCOTS 2014

These slides at http://bit.ly/schedsim

Page 2: Revisiting Size-Based Scheduling with Estimated Job Sizes

On Size-Based Scheduling

Page 3: Revisiting Size-Based Scheduling with Estimated Job Sizes

On Size-Based Scheduling An Example

Processor-Sharing vs. Size-Based

[Figure: two panels of cluster usage (%) versus time (s) for job 1, job 2 and job 3. First panel: time ticks at 10, 15, 37.5, 42.5 and 50 s. Second panel: time ticks at 10, 20, 30 and 50 s.]
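The tick values in the figure are consistent with a simple three-job scenario, which the following sketch replays under processor sharing and under a size-based policy (SRPT). The job sizes and arrival times used here (30 s of work at t = 0, 10 s at t = 10, 10 s at t = 15, on a unit-speed cluster) are assumptions inferred from the ticks, not values stated on the slide, and `simulate`, `ps` and `srpt` are illustrative names.

```python
def simulate(jobs, shares):
    """jobs: list of (arrival, size); shares(remaining) maps the remaining
    work of active jobs to {job index: fraction of the server}.
    Returns {job index: completion time}."""
    pending = sorted(range(len(jobs)), key=lambda j: jobs[j][0])
    remaining, done, t = {}, {}, 0.0
    while pending or remaining:
        if not remaining:                       # idle: jump to the next arrival
            t = max(t, jobs[pending[0]][0])
        while pending and jobs[pending[0]][0] <= t:
            j = pending.pop(0)
            remaining[j] = jobs[j][1]
        rates = shares(remaining)
        # Advance to the next event: a completion or the next arrival.
        dt = min(remaining[j] / r for j, r in rates.items() if r > 0)
        if pending:
            dt = min(dt, jobs[pending[0]][0] - t)
        for j, r in rates.items():
            remaining[j] -= r * dt
        t += dt
        for j in [j for j, rem in remaining.items() if rem <= 1e-9]:
            done[j] = t
            del remaining[j]
    return done

def ps(remaining):
    """Processor sharing: every active job gets an equal share."""
    return {j: 1.0 / len(remaining) for j in remaining}

def srpt(remaining):
    """Size-based: the job with the least remaining work gets everything."""
    shortest = min(remaining, key=lambda j: (remaining[j], j))
    return {shortest: 1.0}

jobs = [(0, 30), (10, 10), (15, 10)]            # assumed (arrival, size) pairs
ps_done = simulate(jobs, ps)
srpt_done = simulate(jobs, srpt)

def mean_sojourn(done):
    return sum(done[j] - jobs[j][0] for j in done) / len(done)

print("PS completions:  ", ps_done, "mean sojourn", mean_sojourn(ps_done))
print("SRPT completions:", srpt_done, "mean sojourn", mean_sojourn(srpt_done))
```

Under these assumptions the last job finishes at t = 50 in both cases, but the two short jobs finish earlier with the size-based policy, which lowers the mean sojourn time.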

Page 5: Revisiting Size-Based Scheduling with Estimated Job Sizes

On Size-Based Scheduling Properties

Size-Based Scheduling: Some Properties

Shortest Remaining Processing Time (SRPT)

Minimizes mean sojourn time (MST) [Schrage, OPER RES ’68]

Sojourn time: interval between job submission and completion

Fair Sojourn Protocol (FSP)

Jobs are scheduled in the order in which they would complete under Processor Sharing (PS)

Efficiency: very close to SRPT

Fairness: each job completes no later than it would under Processor Sharing [Friedman & Henderson, SIGMETRICS ’03]

Page 7: Revisiting Size-Based Scheduling with Estimated Job Sizes

On Size-Based Scheduling Related Work

Where Are All the Size-Based Schedulers?

Job size is almost never known a priori

Related Work: Inexact Job Size Information

Simulation-based study: estimations need to be precise [Lu et al., MASCOTS 2004]

Analytic study: bounded errors, over-estimation only [Wierman & Nuyens, SIGMETRICS PER, 2008]

What Motivated Us

We developed HFSP, a size-based scheduler for Hadoop

We found it works very well even with very rough estimations [Pastorelli et al., BIGDATA 2013]

Page 10: Revisiting Size-Based Scheduling with Estimated Job Sizes

Understanding Size-Based Scheduling With Errors

Page 11: Revisiting Size-Based Scheduling with Estimated Job Sizes

Understanding Size-Based Scheduling With Errors Simulation

Scheduling Simulation

Main Features

Simulates single-server, preemptive scheduling

Can create synthetic traces or replay real ones

Injects artificial size estimation errors

in this case, SRPT and FSP become SRPTE and FSPE

Efficient and easy to prototype new schedulers

10,000 jobs are simulated in half a second on an old laptop

FSP is ~50 lines of Python code

Free Software: Apache License 2.0

https://bitbucket.org/bigfootproject/schedsim

Page 12: Revisiting Size-Based Scheduling with Estimated Job Sizes

Understanding Size-Based Scheduling With Errors Simulation

Log-Normal Error Distribution

[Figure: log-normal probability density functions (PDF) for x from 0.0 to 3.0, with sigma = 0.125, 0.25, 1 and 4]

Error: ratio between real size and estimated size
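A minimal sketch of how such errors could be injected, assuming that the estimate is the real size multiplied by a log-normal factor whose underlying normal distribution has mean 0 and standard deviation sigma. The mean of 0 and the function name `estimate_size` are assumptions made for illustration. With mean 0, the inverse factor follows the same distribution, so over- and under-estimation by a given factor are equally likely.

```python
import math
import random

def estimate_size(real_size, sigma, rng=random):
    """Return a noisy estimate of real_size.
    Assumption: multiplicative log-normal error with mu = 0."""
    return real_size * rng.lognormvariate(0.0, sigma)

rng = random.Random(42)
factors = [estimate_size(1.0, 0.5, rng) for _ in range(100_000)]

# Fraction of over-estimated jobs and spread of the error in log space.
over = sum(f > 1.0 for f in factors) / len(factors)
log_std = math.sqrt(sum(math.log(f) ** 2 for f in factors) / len(factors))
print(f"over-estimated: {over:.1%}, std of log(error): {log_std:.3f}")
```

Small values of sigma keep the estimates close to the real sizes, while sigma = 4 routinely produces estimates that are off by orders of magnitude.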

Page 13: Revisiting Size-Based Scheduling with Estimated Job Sizes

Understanding Size-Based Scheduling With Errors Simulation

Weibull Job Size Distribution

[Figure: Weibull probability density functions (PDF) for x from 0.0 to 3.0, with shape = 0.125, 1, 2 and 4]

Interpolates between

heavy-tailed job size distributions (shape < 1)

exponential distributions (shape = 1)

bell-shaped distributions (shape > 1)
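The effect of the shape parameter can be illustrated with a short sketch that draws Weibull job sizes and measures how much of the total work sits in the largest 1% of jobs. Normalizing the scale so that the expected size is 1 is an assumption made here for comparability, and `weibull_sizes` and `top_share` are illustrative names.

```python
import math
import random

def weibull_sizes(n, shape, mean=1.0, rng=random):
    """Draw n job sizes from a Weibull distribution with the given shape.
    Assumption: the scale is chosen so that the expected size equals `mean`."""
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return [rng.weibullvariate(scale, shape) for _ in range(n)]

def top_share(sizes, fraction=0.01):
    """Fraction of the total work contributed by the largest jobs."""
    k = max(1, int(len(sizes) * fraction))
    return sum(sorted(sizes)[-k:]) / sum(sizes)

rng = random.Random(1)
heavy = weibull_sizes(100_000, shape=0.25, rng=rng)   # heavy-tailed
light = weibull_sizes(100_000, shape=4.0, rng=rng)    # bell-shaped
print(f"shape=0.25: top 1% of jobs hold {top_share(heavy):.0%} of the work")
print(f"shape=4:    top 1% of jobs hold {top_share(light):.0%} of the work")
```

For small shape values a handful of very large jobs dominates the workload, which is the skewed regime where estimation errors matter most.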

Page 14: Revisiting Size-Based Scheduling with Estimated Job Sizes

Understanding Size-Based Scheduling With Errors Simulation

Other Parameters

Number of jobs (default: 10,000 per workload)

at least 30 repetitions per data point

System load (default: 0.9)

Ratio between requested and available resources

Job arrival time (default: exponential)

Bursts vs. regular

These parameters are not fundamental

more details in the paper
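As a sketch of how the load parameter can be turned into an arrival process: on a single unit-speed server, the load is the arrival rate multiplied by the mean job size, so the arrival rate follows from the target load. The unit-speed server and the name `poisson_arrivals` are assumptions made for illustration.

```python
import random

def poisson_arrivals(sizes, load=0.9, rng=random):
    """Arrival times with exponential inter-arrival gaps, with the rate set so
    that (arrival rate) * (mean size) equals the target load."""
    rate = load / (sum(sizes) / len(sizes))
    t, arrivals = 0.0, []
    for _ in sizes:
        t += rng.expovariate(rate)
        arrivals.append(t)
    return arrivals

rng = random.Random(0)
sizes = [rng.expovariate(1.0) for _ in range(10_000)]   # placeholder job sizes
arrivals = poisson_arrivals(sizes, load=0.9, rng=rng)

# Requested work divided by the time span over which it was requested.
offered_load = sum(sizes) / arrivals[-1]
print(f"offered load: {offered_load:.3f}")
```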

Page 15: Revisiting Size-Based Scheduling with Estimated Job Sizes

Understanding Size-Based Scheduling With Errors Simulation Results

Size-Based Scheduling With Errors

[Figure: two panels, SRPTE and FSPE]

Problems for heavy-tailed job size distributions

Otherwise, size-based scheduling works very well

Page 16: Revisiting Size-Based Scheduling with Estimated Job Sizes

Understanding Size-Based Scheduling With Errors Simulation Results

Over-Estimations and Under-Estimations

[Figure: four plots of remaining size versus time t. Over-estimation: jobs J1, J2 and J3. Under-estimation: jobs J4, J5 and J6.]

Over-estimating hurts a single job: limited damage

Under-estimating very large jobs can wreak havoc
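The two claims above can be reproduced with a small SRPTE sketch. It assumes that the scheduler always serves the job with the smallest estimated remaining size, computed as the estimate minus the service received so far, and that this value may become negative once a job runs past its estimate. The workloads are hypothetical: one job of size 100 at t = 0 and ten jobs of size 1 arriving one second apart.

```python
def srpte(jobs):
    """jobs: list of (arrival, real_size, estimated_size), unit-speed server.
    Serves the job with the smallest estimated remaining size.
    Returns {job index: sojourn time}."""
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][0])
    real, est, sojourn, t = {}, {}, {}, 0.0
    while order or real:
        if not real:                              # idle: jump to next arrival
            t = max(t, jobs[order[0]][0])
        while order and jobs[order[0]][0] <= t:
            j = order.pop(0)
            real[j], est[j] = jobs[j][1], jobs[j][2]
        j = min(real, key=lambda k: (est[k], k))  # highest-priority job
        dt = real[j]                              # run until it completes...
        if order:                                 # ...or the next arrival
            dt = min(dt, jobs[order[0]][0] - t)
        real[j] -= dt
        est[j] -= dt                              # may drop below zero
        t += dt
        if real[j] <= 1e-9:
            sojourn[j] = t - jobs[j][0]
            del real[j], est[j]
    return sojourn

small = [(i, 1, 1) for i in range(2, 11)]
exact = srpte([(0, 100, 100), (1, 1, 1)] + small)
over = srpte([(0, 100, 100), (1, 1, 1000)] + small)   # job 1 over-estimated
under = srpte([(0, 100, 1), (1, 1, 1)] + small)       # job 0 under-estimated

for name, s in [("exact", exact), ("over", over), ("under", under)]:
    print(name, "mean sojourn time:", round(sum(s.values()) / len(s), 1))
```

In the over-estimation run only the mis-estimated job waits behind the large one. In the under-estimation run the large job keeps the highest priority until it completes, so every small job waits for it.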

Page 17: Revisiting Size-Based Scheduling with Estimated Job Sizes

Size-Based Scheduling For Approximate Sizes

Page 18: Revisiting Size-Based Scheduling with Estimated Job Sizes

Size-Based Scheduling For Approximate Sizes FSPE+PS

FSPE + PS

Idea

Without errors, real jobs always complete before virtual ones

When they don’t (they are late), there has been a mistake

The scheduler can realize this, and take corrective action

Realization

A scheduler such that late jobs don’t block the system

Just do processor sharing among late jobs instead of scheduling only the “most late” one
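The idea can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: a virtual processor-sharing system runs on the estimated sizes, a job is "late" once it has finished there but not in the real system, late jobs share the server equally, and otherwise the job that is closest to finishing in the virtual system is served. The function name `fspe_ps` and the test workload are hypothetical.

```python
EPS = 1e-9

def fspe_ps(jobs):
    """jobs: list of (arrival, real_size, estimated_size), unit-speed server.
    Returns {job index: sojourn time}."""
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][0])
    real = {}      # real remaining work of unfinished jobs
    virt = {}      # remaining estimated work in the virtual PS system
    late = set()   # finished in the virtual system, unfinished in the real one
    sojourn, t = {}, 0.0
    while order or real:
        while order and jobs[order[0]][0] <= t:
            j = order.pop(0)
            real[j], virt[j] = jobs[j][1], jobs[j][2]
        if late:                                   # share among late jobs
            served = sorted(late)
        elif real:                                 # earliest virtual finisher
            served = [min(real, key=lambda j: (virt[j], j))]
        else:
            served = []
        # Next event: arrival, virtual completion, or real completion.
        steps = []
        if order:
            steps.append(jobs[order[0]][0] - t)
        if virt:
            steps.append(min(virt.values()) * len(virt))
        if served:
            steps.append(min(real[j] for j in served) * len(served))
        dt = min(steps)
        for j in virt:
            virt[j] -= dt / len(virt)
        for j in served:
            real[j] -= dt / len(served)
        t += dt
        for j in served:
            if real[j] <= EPS:                     # real completion
                sojourn[j] = t - jobs[j][0]
                del real[j]
                late.discard(j)
        for j in [j for j, v in virt.items() if v <= EPS]:
            del virt[j]                            # virtual completion
            if j in real:
                late.add(j)                        # the estimate was too small
    return sojourn

# Hypothetical workload: a size-100 job estimated as 1, then ten size-1 jobs.
under = [(0, 100, 1)] + [(i, 1, 1) for i in range(1, 11)]
result = fspe_ps(under)
print("worst small-job sojourn time:", round(max(result[j] for j in range(1, 11)), 2))
```

In this workload the under-estimated large job becomes late after one second, but it then has to share the server with every small job that becomes late in turn, so it cannot block the system until it completes.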

Page 19: Revisiting Size-Based Scheduling with Estimated Job Sizes

Size-Based Scheduling For Approximate Sizes FSPE+PS

FSPE + PS: Results

[Figure: two panels, FSPE and FSPE + PS]

Performance becomes very close to optimal

Outperformed by PS only for extreme skew and errors

Replaying real-world traces gives analogous results

Page 20: Revisiting Size-Based Scheduling with Estimated Job Sizes

Size-Based Scheduling For Approximate Sizes Comparison With SRPT

Schedulers vs. SRPT

[Figure: MST / MST(SRPT) versus shape (0.125, 0.25, 0.5, 1, 2, 4) for SRPTE, FSPE, FSPE+PS, PS, LAS and FIFO. Sigma: 0.5]

Page 21: Revisiting Size-Based Scheduling with Estimated Job Sizes

Size-Based Scheduling For Approximate Sizes Real Workloads

Hadoop @ Facebook

[Figure: MST / MST(SRPT) versus sigma (0.125, 0.25, 0.5, 1, 2, 4) for SRPTE, FSPE, FSPE+PS, PS and LAS, in two panels: synthetic workload (shape = 0.25) and Facebook Hadoop cluster]

Page 22: Revisiting Size-Based Scheduling with Estimated Job Sizes

Size-Based Scheduling For Approximate Sizes Real Workloads

Web Cache

[Figure: MST / MST(SRPT), on a logarithmic scale, versus sigma (0.125, 0.25, 0.5, 1, 2, 4) for SRPTE, FSPE, FSPE+PS, PS and LAS, in two panels: synthetic workload (shape = 0.177) and IRCache web cache; FIFO is also shown in one of the panels]

Page 23: Revisiting Size-Based Scheduling with Estimated Job Sizes

Take-Home Messages

Page 24: Revisiting Size-Based Scheduling with Estimated Job Sizes

Take-Home Messages

Take-Home Messages

For System Designers

Do not be afraid of size-based scheduling

it can work great even with very rough estimations

Further Research

Schedulers like FSPE+PS, designed for estimated sizes, work very well

Can we design a scheduler that always outperforms PS?

Can we get a better analytical understanding of the phenomena we observed?

These slides (plus bonus content) are available at http://bit.ly/schedsim

Page 25: Revisiting Size-Based Scheduling with Estimated Job Sizes

Bonus Content

Page 26: Revisiting Size-Based Scheduling with Estimated Job Sizes

Bonus Content Fairness

Fairness: Slowdown

[Figure: empirical CDF (ECDF) of slowdown, from 10^0 to 10^2, for SRPTE, FSPE, FSPE+PS, PS, LAS and FIFO; a second panel shows only the ECDF range from 0.90 to 1.00. Shape: 0.25, sigma: 0.5]
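A short sketch of how such a curve can be computed, assuming the usual definition of slowdown as sojourn time divided by job size (the slide does not spell out the definition). The sample values are hypothetical.

```python
def slowdowns(sojourn_times, sizes):
    """Slowdown of each job: sojourn time divided by job size (1 is ideal)."""
    return [s / x for s, x in zip(sojourn_times, sizes)]

def ecdf(values):
    """Empirical CDF as (value, fraction of samples <= value) pairs."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

# Hypothetical jobs: sizes and the sojourn times a scheduler gave them.
sizes = [30.0, 10.0, 10.0]
sojourn_times = [50.0, 10.0, 15.0]
for value, fraction in ecdf(slowdowns(sojourn_times, sizes)):
    print(f"slowdown <= {value:.2f} for {fraction:.0%} of jobs")
```

A curve that reaches 1.0 at a low slowdown value means that no job is delayed much relative to its own size.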

Page 27: Revisiting Size-Based Scheduling with Estimated Job Sizes

Bonus Content Fairness

Fairness: Conditional Slowdown

[Figure: slowdown (10^0 to 10^7) versus job size (10^-4 to 10^2) for SRPTE, FSPE, FSPE+PS, PS, LAS and FIFO. Shape: 0.25, sigma: 0.5]