RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor
description
Transcript of RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor
![Page 1: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/1.jpg)
RUN: Optimal Multiprocessor
Real-Time Scheduling via
Reduction to Uniprocessor
Paul Regnier† George Lima† Ernesto Massa†
Greg Levin‡ Scott Brandt‡
†Federal University of Bahia ‡ University of California Brazil Santa Cruz
![Page 2: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/2.jpg)
Multiprocessors
Most high-end computers today have multiple processors
In a busy computational environment, multiple processes compete for processor time More processors means more scheduling
complexity
2
![Page 3: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/3.jpg)
3
Real-Time Multiprocessor Scheduling Real-time tasks have workload deadlines
Hard real-time = “Meet all deadlines!”
Problem: Schedule a set of periodic, hard real-time tasks on a multiprocessor systems so that all deadlines are met.
![Page 4: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/4.jpg)
4
Example: EDF on One Processor On a single processor, Earliest Deadline First (EDF) is
optimal (it can schedule any feasible task set)
Task 1:
time = 0 10 20 30
CPU
2 units of work forevery 10 units of time
Task 3: 10 units of work forevery 25 units of time
Task 2: 6 units of work forevery 15 units of time
![Page 5: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/5.jpg)
5
Scheduling Three Tasks Example: 2 processors; 3 tasks, each with 2 units
of work required every 3 time unitsjob release
time = 0
deadline
Task 1
Task 2
Task 3
1 2 3
![Page 6: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/6.jpg)
6
Global Schedule Example: 2 processors; 3 tasks, each with 2 units
of work required every 3 time units
time = 0
CPU 1
CPU 2
1 2 3
Task 1 migrates between processors
![Page 7: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/7.jpg)
Taxonomy of Multiprocessor Scheduling Algorithms
UniprocessorAlgorithms
LLFEDF
PartitionedAlgorithms
GlobalizedUniprocessor
Algorithms
GlobalEDF
DedicatedGlobal
Algorithms
PartitionedEDF
OptimalAlgorithms
EKG
DP-Wrap
pfair
LLREF
7
Uniprocessor
Multiprocessor
![Page 8: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/8.jpg)
8
Problem Model
n tasks running on m processors
A periodic task T = (p, e) requires a workload e be completed within each period of length p
T's utilization u = e / p is the fraction of each period that the task must execute
job release
time = 0 p 2p 3p
job releasejob releasejob release
period p
workload e
![Page 9: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/9.jpg)
9
Assumptions
Processor Identity: All processors are equivalent
Task Independence: Tasks are independent
Task Unity: Tasks run on one processor at a time
Task Migration: Tasks may run on different processors at different times
No overhead: free context switches and migrations
In practice: built into WCET estimates
![Page 10: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/10.jpg)
10
The Big Goal (Version 1)
Design an optimal scheduling algorithm for periodic task sets on multiprocessors
A task set is feasible if there exists a schedule that meets all deadlines
A scheduling algorithm is optimal if it can
always schedule any feasible task set
![Page 11: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/11.jpg)
11
Necessary and Sufficient Conditions
Any set of tasks needing at most
1) 1 processor for each task ( for all i, ui ≤ 1 ) , and
2) m processors for all tasks ( ui ≤ m)
is feasible
1) Status: Solved
1) pfair (1996) was the first optimal algorithm
![Page 12: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/12.jpg)
12
The Big Goal (Version 2)
Design an optimal scheduling algorithm with fewer context switches and migrations
( Finding feasible schedule with fewest migrations is NP-Complete )
![Page 13: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/13.jpg)
13
The Big Goal (Version 2)
Design an optimal scheduling algorithm with fewer context switches and migrations
Status: Ongoing…
All existing improvements over pfair use some form of deadline partitioning…
![Page 14: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/14.jpg)
14
Deadline Partitioning
Task 1
Task 2
Task 4
Task 3
![Page 15: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/15.jpg)
15
Deadline Partitioning
CPU 2
CPU 1
![Page 16: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/16.jpg)
16
The Big Goal (Version 2)
Design an optimal scheduling algorithm with fewer context switches and migrations
Status: Ongoing…
All existing improvements over pfair use some form of deadline partitioning…
… but all the activity in each time window still leads to a large amount of preemptions and migrations
Our Contribution: The first optimal algorithm which does not depend on deadline partitioning, and which therefore has much lower overhead
![Page 17: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/17.jpg)
17
The RUN Scheduling Algorithm
RUN uses a sequence of reduction operations to transform a multiprocessor problem into a collection of uniprocessor problems Uniprocessor scheduling is solved optimally by
Earliest Deadline First (EDF) Uniprocessor schedules are transformed back into a
single multiprocessor schedule
A reduction operation is composed of two steps: packing and dual
![Page 18: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/18.jpg)
18
A collection of tasks with total utilization at most one can be packed into a single fixed-utilization task:
The Packing Operation
Task 1: u = 0.1
Task 2: u = 0.3
Task 3: u = 0.4
Packed Task:u = 0.8
![Page 19: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/19.jpg)
19
We divide time with the packed tasks’ deadlines, and schedule them with EDF. In each segment, consume according to total utilization.
Scheduling a Packed Task’s Clients
Task 1: p=5 , u=.1
Task 2: p=3 , u=.3
Task 3: p=2 , u=.4
EDF schedule of tasks
![Page 20: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/20.jpg)
20
The packed task temporarily replaces (acts as a proxy for) its clients. It may not be periodic, but has a fixed utilization in each segment.
Defining a Packed Task
Packed Task
EDF schedule of tasks
Task 1: p=5 , u=.1
Task 2: p=3 , u=.3
Task 3: p=2 , u=.4
![Page 21: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/21.jpg)
21
The dual of a task T is a task T* with the same deadlines, but complementary utilization / workloads:
The Dual of a Task
T: u = 0.4, p = 3
T*: u = 0.6, p = 3
![Page 22: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/22.jpg)
22
The Dual of a System
Given a system of n tasks and m processors, assume full utilization ( ui = m )
The dual system consists of n dual tasks running on n-m processors Note that the dual utilizations sum to ui = n - ui = n-m
If n < 2m, the dual system has fewer processors
The dual task T* represents the idle time of T. T* is executed in the dual system precisely when T is idle, and vice versa.
![Page 23: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/23.jpg)
23
The Dual of a SystemTask 1
Task 2
Task 3
![Page 24: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/24.jpg)
24
The Dual of a SystemTask 1
Task 2
Task 3
time = 0 1 2 3
Dual System
![Page 25: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/25.jpg)
25
The Dual of a System
Because each dual task represents the idle time of its primal task, scheduling the dual system is equivalent to scheduling the original system.
Original System
time = 0 1 2 3
Dual System
![Page 26: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/26.jpg)
26
The RUN Algorithm
Reduction = Packing + Dual Keep packing until remaining tasks satisfy:
ui + uj > 1 for all tasks Ti , Tj
Schedule of Dual System
Schedule for “Primal” System
Replace packed tasks in schedule with their clients, ordered by EDF
![Page 27: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/27.jpg)
27
RUN: A Simple Example
Task 1: u=.2
Task 2: u=.6
Task 3: u=.3
Step 1: Pack tasks until all pairs have utilization > 1
Task 5: u=.5
Task 4: u=.4
Task 12: u=.8
![Page 28: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/28.jpg)
28
RUN: A Simple Example
Task 3: u=.3
Step 1: Pack tasks until all pairs have utilization > 1
Task 5: u=.5
Task 4: u=.4
Task 12: u=.8
Task 34: u=.7
![Page 29: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/29.jpg)
29
RUN: A Simple Example Step 2: Find the dual system on n-m = 3-2 = 1 processor
Task 5: u=.5
Task 12: u=.8
Task 34: u=.7
![Page 30: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/30.jpg)
30
RUN: A Simple Example Step 2: Find the dual system on n-m = 3-2 = 1 processor
Task 5: u=.5
Task 12: u=.8
Task 34: u=.7
Task 5*: u=.5
Task 12*: u=.2
Task 34*: u=.3
![Page 31: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/31.jpg)
31
RUN: A Simple Example Step 3: Schedule uniprocessor dual with EDF
Task 5*: u=.5
Task 12*: u=.2
Task 34*: u=.3
time = 0 2 4 6
Dual CPU
![Page 32: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/32.jpg)
32
RUN: A Simple Example Step 4: Schedule packed tasks from dual schedule
time = 0 2 4 6
Dual CPU
![Page 33: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/33.jpg)
33
RUN: A Simple Example Step 4: Schedule packed tasks from dual schedule
time = 0 2 4 6
Dual CPU
CPU 1
CPU 2
![Page 34: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/34.jpg)
34
RUN: A Simple Example Step 5: Packed tasks schedule their clients with EDF
time = 0 2 4 6
Dual CPU
CPU 1
CPU 2
![Page 35: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/35.jpg)
35
RUN: A Simple Example The original task set has been successfully scheduled!
time = 0 2 4 6
CPU 1
CPU 2
Task 1: u=.2
Task 2: u=.6
Task 3: u=.3
Task 5: u=.5
Task 4: u=.4
![Page 36: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/36.jpg)
36
RUN: A Few Details
Several reductions may be necessary to produce a uniprocessor system Reduction reduces the number of processors by at
least half On random task sets with 100 processors and
hundreds of tasks, less than 1 in 600 task sets required more than 2 reductions.
RUN requires ui = m . When this is not the case, dummy tasks may be
added during the first packing step to create a partially or entirely partitioned system.
![Page 37: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/37.jpg)
37
Proven Theoretical Performance
RUN is optimal
Reduction is done off-line, prior to execution, and takes O(n log n)
Each scheduler invocation is O(n), with total scheduling overhead O(jn log m) when j jobs are scheduled
RUN suffers at most (3r+1)/2 preemptions per job on task sets requiring r reductions.
Since r ≤ 2 for most task sets, this gives a theoretical upper bound of 4 preemptions per job.
In practice, we never observed more than 3 preeptions per job on our randomly generated task sets.
![Page 38: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/38.jpg)
38
Simulation and Comparison
We evaluated RUN in side-by-side simulations with existing optimal algorithms LLREF, DP-Wrap, and EKG. Each data point is the average of 1000 task sets,
generated uniformly at random with utilizations in the range [0.1, 0.99] and integer periods in the range [5,100].
Simulations were run for 1000 time units each.
Values shown are average migrations and preemptions per job.
![Page 39: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/39.jpg)
39
Comparison Varying Processors Number of processors varies from m = 2 to 32,
with 2m tasks and 100% utilization
![Page 40: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/40.jpg)
40
Comparison Varying Utilization Total utilization varies from 55 to 100%, with 24
tasks running on 16 processors
![Page 41: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/41.jpg)
Ongoing Work
Heuristic improvements to RUN
Extend RUN to broader problems: sporadic arrivals, arbitrarily deadlines, non-identical multiprocessors, etc
Develop general theory and new examples of non-deadline-partitioning algorithms
41
![Page 42: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/42.jpg)
Summary- The RUN Algorithm:
is the first optimal algorithm without deadline partitioning
outperforms existing optimal algorithms by a factor of 5 on large systems
scales well to larger systems
reduces gracefully to the very efficient Partitioned EDF on any task set that Partitioned EDF can schedule
42
![Page 43: RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor](https://reader036.fdocuments.net/reader036/viewer/2022062315/56815a08550346895dc75684/html5/thumbnails/43.jpg)
43
Thanks for Listening
Questions?