The Multiprocessor Real-Time Scheduling of General Task Systems

by Nathan Wayne Fisher

A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science.

Chapel Hill 2007

Approved by:

Sanjoy K. Baruah

James H. Anderson

Kevin Jeffay

Giuseppe Lipari

Ketan Mayer-Patel

© 2007

Nathan Wayne Fisher

ALL RIGHTS RESERVED

Abstract

NATHAN WAYNE FISHER: The Multiprocessor Real-Time Scheduling of General Task Systems.
(Under the direction of Sanjoy K. Baruah.)

The recent emergence of multicore and related technologies in many commercial systems has increased the prevalence of multiprocessor architectures. Contemporaneously, real-time applications have become more complex and sophisticated in their behavior and interaction. Inevitably, these complex real-time applications will be deployed upon these multiprocessor platforms and require temporal analysis techniques to verify their correctness. However, most prior research in multiprocessor real-time scheduling has addressed the temporal analysis only of Liu and Layland task systems. The goal of this dissertation is to extend real-time scheduling theory for multiprocessor systems by developing temporal analysis techniques for more general task models such as the sporadic task model, the generalized multiframe task model, and the recurring real-time task model. The thesis of this dissertation is:

Optimal online multiprocessor real-time scheduling algorithms for sporadic and more general task systems are impossible; however, efficient, online scheduling algorithms and associated feasibility and schedulability tests, with provably bounded deviation from any optimal test, exist.

To support our thesis, this dissertation develops feasibility and schedulability tests for various multiprocessor scheduling paradigms. We consider three classes of multiprocessor scheduling based on whether a real-time job may migrate between processors: full-migration, restricted-migration, and partitioned. For all general task systems, we obtain feasibility tests for arbitrary real-time instances under the full- and restricted-migration paradigms. Despite the existence of tests for feasibility, we show that optimal online scheduling of sporadic and more general systems is impossible. Therefore, we focus on scheduling algorithms that have constant-factor approximation ratios in terms of an analysis technique known as resource augmentation. We develop schedulability tests for scheduling algorithms, earliest-deadline-first (edf) and deadline-monotonic (dm), under full-migration and partitioned scheduling paradigms. Feasibility and schedulability tests presented in this dissertation use the workload metrics of demand-based load and maximum job density and have provably bounded deviation from optimal in terms of resource augmentation. We show the demand-based load and maximum job density metrics may be exactly computed in pseudo-polynomial time for general task systems and approximated in polynomial time for sporadic task systems.

Acknowledgments

My daily survival in the Ph.D. program required countless selfless acts of support, generosity, and time by people in my personal and academic life. This section is my humble attempt at acknowledging and thanking the people that have given so freely throughout my Ph.D. career and made this dissertation possible.

I am deeply grateful to Sanjoy K. Baruah, my advisor, for the guidance, support, respect, and kindness that he has shown me over the last four years. Sanjoy’s mentoring, friendship, and collegiality enriched my academic life and have left a profound impression on how academic research and collaboration should ideally be conducted. While many students are fortunate to find a single mentor, I have been blessed with two; Jim Anderson has been a constant source of invaluable encouragement, aid, and expertise during my years at UNC. I am extremely thankful to the members of my dissertation committee: Kevin Jeffay has provided me with wise advice and support in both my research and career search; Giuseppe Lipari has graciously offered to serve on my committee from across the Atlantic and provided unique feedback, comments, and questions on multiprocessor scheduling; Ketan Mayer-Patel has offered insightful comments and ideas to my research and its effective presentation. Other faculty members to whom I owe gratitude for their support of my research or major Ph.D. milestones include: Prasun Dewan, Don Smith, Jack Snoeyink, and Mary Whitton. I am also grateful to the always helpful UNC Computer Science department staff; in particular, I owe special thanks to Sandra Neely, Janet Jones, and Tammy Pike.

I would like to thank all my research collaborators (in addition to Sanjoy and Jim) who have enhanced my enthusiasm and understanding of real-time systems: Ted Baker, Joel Goossens, Pascal Richard, and Thi Huyen Chau Nguyen. Special thanks go to Marko Bertogna, who visited UNC during the Fall semester of 2006 and with whom I spent many enjoyable lunches and dinners discussing multiprocessor scheduling. I, in part, owe my first academic position outside graduation to the letters of recommendation written by Jim, Joel, Kevin, Pascal, Sanjoy, and Ted, in addition to Gerhard Fohler and Giorgio Buttazzo. Since the path through the Ph.D. program would be much more difficult without examples of success, I am indebted to Shelby Funk and Uma Devi who have given friendship, guidance, and advice as recent real-time system Ph.D. graduates. In addition, Mithun Arora, Aaron Block, Bjorn Brandenburg, Vasile Bud, John Calandrino, Philip Holman, Hennadiy Leontyev, Abhishek Singh, and Mengsheng Zhang, past and present students of the real-time systems group, have all given me hours of helpful discussion and constructive criticism. Other UNC graduate students outside the real-time group who have been especially supportive include (but are not limited to): Nico Galoppo, Russ Gayle, Sasa Junuzovic, Stephen Olivier, and Jeff Terrell.

My family and friends have been an unending source of love and inspiration throughout my Ph.D. career. My parents, Wayne and Joy Fisher, have offered unconditional understanding and encouragement. My sisters, Krista, Jessica, and Natasha, have kept me sane with their humor and understanding. My close friends in Chapel Hill, Evan Davis, Andrea Ford, and James and Katharine McClure, have provided hours of enjoyable distraction from my work. Most importantly, my love, best friend, and daily companion, Marcy Renski, has given me constant compassion and inspiration to give my best to achieve my goals while selflessly asking for little in return for herself. Marcy has endured my long trips away from home, my untidy habits and occasional forgetfulness when busy on research, and many hours of practice talks only to motivate me to try even harder. Marcy’s family have also been caring and supportive.

Contents

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi

List of Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1 Real-Time Workload Models and Assumptions . . . . . . . . . . . . . 4

1.1.1 Completely-Specified Recurrent Task Systems . . . . . . . . . 6

1.1.1.1 Periodic Task Systems . . . . . . . . . . . . . . . . . 6

1.1.2 Partially-Specified Recurrent Task Systems . . . . . . . . . . . 7

1.1.2.1 Sporadic Task Systems with Implicit Deadlines (Liu and Layland (LL) Task Model) . . . . . . . . . . . . 8

1.1.2.2 Sporadic Task Systems with Explicit Deadlines . . . 10

1.1.2.3 Generalized Multiframe (GMF) Task Systems . . . . 11

1.1.2.4 Recurring Real-Time Task Systems . . . . . . . . . . 13

1.1.2.5 Relationship Between Task Models . . . . . . . . . . 17

1.2 Processing Platform . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

1.3 Real-Time Scheduling Algorithms . . . . . . . . . . . . . . . . . . . . 21

1.3.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

1.3.2 Priority-Driven Scheduling Algorithms . . . . . . . . . . . . . 24

1.3.2.1 Fixed Task-Priority (FTP) Scheduling Algorithms . . 25

1.3.2.2 Fixed Job-Priority (FJP) Scheduling Algorithms . . 26

1.3.2.3 Dynamic-Priority (DP) Scheduling Algorithms . . . . 27

1.3.3 Degree of Migration . . . . . . . . . . . . . . . . . . . . . . . 27

1.3.3.1 Partitioned Scheduling . . . . . . . . . . . . . . . . . 28

1.3.3.2 Full-Migration Scheduling . . . . . . . . . . . . . . . 28

1.3.3.3 Restricted-Migration Scheduling . . . . . . . . . . . 31

1.4 Formal Verification of Real-Time Systems . . . . . . . . . . . . . . . 32

1.4.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

1.4.2 Feasibility Analysis . . . . . . . . . . . . . . . . . . . . . . . . 34

1.4.3 Schedulability Analysis . . . . . . . . . . . . . . . . . . . . . . 36

1.4.4 Evaluating the Effectiveness of a Verification Technique: Resource Augmentation Analysis . . . . . . . . . . . . . . . . . . 37

1.5 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

1.6 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

2 Related Work: Multiprocessor Real-time Scheduling . . . . . . . 44

2.1 Arbitrary Real-Time Instances . . . . . . . . . . . . . . . . . . . . . . 45

2.1.1 Impossibility of Optimal Online Scheduling of Arbitrary Real-Time Instances . . . . . . . . . . . . . . . . . . . . . . . . . . 46

2.1.2 Resource Augmentation Results for Online Scheduling Algorithms 47

2.1.3 The Predictability of Multiprocessor Scheduling Algorithms . 49

2.1.4 Synthetic Utilization . . . . . . . . . . . . . . . . . . . . . . . 51

2.2 LL Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

2.2.1 Task Utilization . . . . . . . . . . . . . . . . . . . . . . . . . . 53

2.2.2 Dhall’s Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

2.2.3 Overview of Schedulability Tests . . . . . . . . . . . . . . . . 55

2.3 Sporadic Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

2.3.1 Limitation of Traditional Workload Metrics . . . . . . . . . . 57

2.3.2 Partitioned Scheduling . . . . . . . . . . . . . . . . . . . . . . 59

2.3.3 Full-Migration Schedulability Tests . . . . . . . . . . . . . . . 60

2.4 More General Task Models . . . . . . . . . . . . . . . . . . . . . . . . 64

2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

3 A Metric of Real-time Workload: Demand-Based Load and Maximum Job Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

3.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

3.2 Infeasibility Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

3.3 Demand-Based Load of Partially-Specified Recurrent Task Systems . 70

3.3.1 The dbf Abstraction . . . . . . . . . . . . . . . . . . . . . . . 71

3.4 Efficiently Calculating load for Sporadic Task Systems . . . . . . . . . 78

3.4.1 Properties of Demand-Based Load for a Sporadic Task System 78

3.4.2 An Exact Algorithm for Calculating Load . . . . . . . . . . . 82

3.4.3 Approximation Algorithms . . . . . . . . . . . . . . . . . . . . 86

3.4.3.1 Pseudo-polynomial-time Approximation Scheme . . . 86

3.4.3.2 Polynomial-time Approximation Scheme . . . . . . . 88

3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

4 The Restricted- and Full-Migration Feasibility Analysis of General Task Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

4.1 Restricted-Migration Feasibility . . . . . . . . . . . . . . . . . . . . . 95

4.1.1 Algorithm Analysis . . . . . . . . . . . . . . . . . . . . . . . . 96

4.1.2 Tightness of the Bound . . . . . . . . . . . . . . . . . . . . . . 102

4.2 Full-Migration Feasibility . . . . . . . . . . . . . . . . . . . . . . . . . 103

4.2.1 Proof of Theorem 4.3 . . . . . . . . . . . . . . . . . . . . . . . 106

4.2.1.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . 106

4.2.1.2 Outline . . . . . . . . . . . . . . . . . . . . . . . . . 108

4.2.1.3 Proof . . . . . . . . . . . . . . . . . . . . . . . . . . 110

4.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

5 The Impossibility of Optimal Online Multiprocessor Scheduling Algorithms for General Task Systems . . . . . . . . . . . . . . . . . . 122

5.1 Impossibility of Optimal Online Scheduling . . . . . . . . . . . . . . . 124

5.2 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

6 The Full-Migration Schedulability Analysis for General Task Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

6.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

6.2 dm-schedulability Conditions . . . . . . . . . . . . . . . . . . . . . . 132

6.3 edf-schedulability conditions . . . . . . . . . . . . . . . . . . . . . . 135

6.3.1 A Different edf-schedulability Test . . . . . . . . . . . . . . . 138

6.4 Full-Migration Schedulability of Sporadic Task Systems . . . . . . . . 139

6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

7 The Partitioned Scheduling and Schedulability Analysis of Sporadic Task Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

7.1 edf-based Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . 147

7.1.1 Approximation of dbf(τi, t) . . . . . . . . . . . . . . . . . . . 148

7.1.2 Density-Based Partitioning . . . . . . . . . . . . . . . . . . . . 150

7.1.2.1 Resource Augmentation Analysis . . . . . . . . . . . 156

7.1.3 Algorithm edf-partition . . . . . . . . . . . . . . . . . . . . 157

7.1.4 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

7.1.5 A Pragmatic Improvement . . . . . . . . . . . . . . . . . . . . 173

7.2 dm-based Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . 177

7.2.1 The Request-Bound Function . . . . . . . . . . . . . . . . . . 177

7.2.2 A Polynomial-Time Partitioning Algorithm . . . . . . . . . . . 179

7.2.2.1 Algorithm dm-partition . . . . . . . . . . . . . . . 179

7.2.2.2 Proof of Correctness . . . . . . . . . . . . . . . . . . 181

7.2.2.3 Computational Complexity . . . . . . . . . . . . . . 184

7.2.3 Theoretical Evaluation . . . . . . . . . . . . . . . . . . . . . . 184

7.2.3.1 Sufficient Schedulability Conditions . . . . . . . . . . 185

7.2.3.2 Resource Augmentation . . . . . . . . . . . . . . . . 190

7.2.3.3 Comparison with Prior Implicit-Deadline Partitioning Results . . . . . . . . . . . . . . . . . . . . . . . . . 191

7.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192

8 Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . 193

8.1 Summary of Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

8.1.1 Efficient Workload Characterization for General Task Systems 194

8.1.2 Multiprocessor Feasibility Tests . . . . . . . . . . . . . . . . . 194

8.1.3 Impossibility of Optimal Online Multiprocessor Scheduling . . 196

8.1.4 Multiprocessor Schedulability Tests . . . . . . . . . . . . . . . 196

8.2 Related Research Contributions . . . . . . . . . . . . . . . . . . . . . 197

8.3 Future Research Agenda . . . . . . . . . . . . . . . . . . . . . . . . . 199

8.3.1 Open Questions from Dissertation . . . . . . . . . . . . . . . . 199

8.3.2 Real-time Processor Virtualization with Resource Sharing . . 201

8.3.3 Approximate Response-time Analysis for Uniprocessors . . . . 201

8.4 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . 202

A Proof of Theorem 5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . 203

A.1 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

A.2 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

A.3 Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

A.3.1 Construction of Schedule S . . . . . . . . . . . . . . . . . . . 212

A.3.2 Construction of Schedule $S'_I$ . . . . . . . . . . . . . . . . . . . 213

A.3.3 Construction of Schedule $S''_I$ and Proof of Theorem 5.1 . . . . 224

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231

List of Figures

1.1 Periodic Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.2 LL Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.3 Generalized Multiframe (GMF) Task . . . . . . . . . . . . . . . . . . 13

1.4 Recurring Real-Time Task . . . . . . . . . . . . . . . . . . . . . . . . 15

1.5 Converting a GMF Task to a Recurring Task . . . . . . . . . . . . . . 17

1.6 Relationship Between Task Models . . . . . . . . . . . . . . . . . . . 19

1.7 Symmetric Shared-Memory Multiprocessor (SMP) . . . . . . . . . . . 20

1.8 Example Schedules for Priority-Driven Algorithms . . . . . . . . . . . 26

1.9 Examples of Multiprocessor Schedules . . . . . . . . . . . . . . . . . . 29

1.10 High-Level Overview of Partitioned Scheduling . . . . . . . . . . . . . 30

1.11 High-Level Overview of Full-Migration Scheduling . . . . . . . . . . . 30

1.12 High-Level Overview of Restricted-Migration Scheduling . . . . . . . 31

2.1 Dhall’s Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

2.2 Regions of Feasibility for Sporadic Task Systems . . . . . . . . . . . . 60

3.1 Behavior of f(τi, t) for Example Tasks . . . . . . . . . . . . . . . . . 79

3.2 Load-Based Tests Reduce the Region of Uncertainty . . . . . . . . . . 81

3.3 Pseudo-code for Approximating load(τ) . . . . . . . . . . . . . . . . . 85

3.4 Pseudo-code for PTAS calculating load(τ) . . . . . . . . . . . . . . . 92

4.1 Pseudo-code for Restricted-Migration Job-Assignment Algorithm . . . 96

4.2 Overlapping Jobs from Proof of Theorem 4.1 . . . . . . . . . . . . . . 98

4.3 Feasibility Region of Equation 4.7 . . . . . . . . . . . . . . . . . . . . 100

4.4 Feasibility Region of Equation 4.10 . . . . . . . . . . . . . . . . . . . 104

4.5 Visual Depiction of a Spanning Chain . . . . . . . . . . . . . . . . . . 108

4.6 Linear Program Representing Minimum Work over a Spanning Chain 115

4.7 Dual Linear Program of Figure 4.6 . . . . . . . . . . . . . . . . . . . 119

5.1 Task System τexample and Its Execution . . . . . . . . . . . . . . . . . 127

5.2 Two Execution Scenarios for τexample . . . . . . . . . . . . . . . . . . 128

6.1 Interval Containing Jobs that May Interfere with Ji . . . . . . . . . . 136

7.1 The dbf and Its Linear Approximation . . . . . . . . . . . . . . . . . 149

7.2 Pictorial Representation of τi’s Computational Reservation in Processor-Sharing Schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

7.3 Pseudo-code for Density-Based Partitioning . . . . . . . . . . . . . . 152

7.4 Pseudo-code for dbf∗-Based Partitioning Algorithm . . . . . . . . . . 158

7.5 Plot of rbf(τi, t) and Its Approximation . . . . . . . . . . . . . . . . 177

A.1 Example Illustrating the Maximum Execution of τi in an Interval . . 209

A.2 Construction of Schedule $S'_I$ . . . . . . . . . . . . . . . . . . . . . . . 215

List of Tables

1.1 Summary of Task Models for Partially-Specified Task Systems . . . . 18

1.2 Overview of Dissertation Contributions . . . . . . . . . . . . . . . . . 42

2.1 Overview of Multiprocessor Schedulability Tests Based on Synthetic Utilization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

2.2 Overview of Multiprocessor Schedulability Tests For LL Task Systems 56

8.1 Resource-Augmentation Guarantee for Feasibility Results of Dissertation 195

8.2 Resource-Augmentation Guarantee for Schedulability Results of Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197

List of Abbreviations

DAG Directed Acyclic Graph

DBF Demand-Bound Function

DGMF Distributed Generalized Multiframe

DM Deadline Monotonic

DP Dynamic Priority

EDF Earliest-Deadline First

FFD First-Fit Decreasing

FJP Fixed Job-Priority

FTP Fixed Task-Priority

GMF Generalized Multiframe

ICD Implantable Cardioverter Defibrillator

LCM Least Common Multiple

LL Liu and Layland

LLF Least-Laxity First

PTAS Polynomial-Time Approximation Scheme

PPTAS Pseudo-Polynomial-Time Approximation Scheme

RBF Request-Bound Function

RM Rate Monotonic

SMP Symmetric Shared-Memory Multiprocessor

UMA Uniform Memory Access

Chapter 1

Introduction

Real-time systems are designed to satisfy notions of temporal correctness and predictability. In a real-time system, computations must occur by specified times. In our daily lives, we rely on systems that have underlying temporal constraints including avionic control systems, medical devices, network processors, digital video recording devices, and many other systems and devices. In each of these systems there is a potential penalty or consequence associated with the violation of a temporal constraint. For example, in a safety-critical system, a temporal violation can be life-threatening: a patient wearing an Implantable Cardioverter Defibrillator (ICD) is at risk of cardiac arrest if the device does not administer shocks to the heart in a timely fashion. In other (less critical) applications, violations of temporal constraints may result in a degradation in the quality-of-service experienced by the application user: a user listening to an MP3 file may experience audio jitter if the frames of the file are not decoded at a consistent rate. Regardless of the application, a well-designed real-time system should eliminate or minimize temporal constraint violations.

In a hard real-time system, the penalty for even a single temporal constraint violation is unacceptable. Typically, a hard real-time system associates a hard deadline with each system computation. For a hard real-time system to be temporally correct, each computation must successfully complete prior to its deadline. The designer of a hard real-time system must verify that the system is correct prior to system run-time; that is, the designer must verify that every possible execution of the system results in all deadlines being met. For all but the simplest systems, the number of possible execution scenarios is either infinite or prohibitively large. Therefore, exhaustive simulation or testing cannot be used to verify the temporal correctness of a hard real-time system. Instead, formal analysis techniques are necessary to ensure that the designed real-time systems are, by construction, provably temporally correct and predictable.

For a system to be proven temporally correct, three aspects of a real-time system must be specified:

1. Real-Time Workload: the computation produced by the real-time system that must complete prior to its deadline. In many real-time systems, the workload is modeled using the concept of recurring tasks. A recurring task initiates, over time, the execution of sequential chunks of code called jobs. Once a job is initiated, it must successfully complete its execution by an associated deadline. For a hard real-time system to be temporally correct, each job must complete by its deadline.

2. Processing Platform: the set of hardware resources upon which the jobs of the real-time workload are executed. The set of hardware resources includes the processor(s), memory, cache, processor/memory interconnect, etc. A uniprocessor platform consists of a single processor; a multiprocessor platform comprises a set of two or more processors.

3. Scheduling Algorithm: the algorithm that determines, at any time, which set of jobs executes on the processing platform.

Over the past three decades, the majority of research on real-time formal verification techniques has focused predominantly on uniprocessor systems. Prior research that has addressed multiprocessor real-time systems has assumed a relatively simple task model for real-time workloads; specifically, most prior research has assumed that the set of jobs generated by any task is homogeneous (i.e., the execution characteristics and deadline constraints of each job are identical) and that the deadline of any job coincides with the arrival of the next job of the same task. Unfortunately, such simple task models preclude the consideration of real-time applications that exhibit more complex behavior (e.g., tasks that generate heterogeneous workloads) or dynamically change their computational requirements at run-time.

Furthermore, the need to support real-time systems on multiprocessor platforms has been brought to the forefront by the development of multicore architectures. With the current emergence of commercial systems such as Intel’s Core 2 Duo and Quad processors or IBM’s Cell multiprocessor and chip manufacturers’ forecast of over 32 cores on a chip in the near future (Calandrino et al., 2007), the next generation of embedded and real-time hardware platforms will undoubtedly have the capability for parallel execution, increasing the need for multiprocessor real-time analysis. Unfortunately, as the previous paragraph points out, most techniques for temporal analysis of uniprocessor systems cannot be trivially extended to multiprocessor systems.

The goal of this dissertation is to increase the number of types of real-time systems that can be proven temporally correct upon a multiprocessor platform. The achievement of this goal implies that more complex applications can now be proven temporally correct on multiprocessor systems; ultimately, the realization of this goal facilitates the leveraging of more powerful multiprocessor systems by complex real-time applications that previously could only be temporally verified on uniprocessor systems. We broaden the scope of analyzable real-time systems by considering very general task models that allow tasks to generate heterogeneous workloads; additionally, we remove some restrictive assumptions of the simpler model. For real-time systems that may be modeled by the more general models, we develop analytical techniques for formally verifying the temporal correctness of these systems upon multiprocessor platforms. Furthermore, we show that our proposed analytical techniques have bounded deviation from any “hypothetically” optimal verification technique.

The remainder of this chapter formally introduces the concepts and terms used throughout this dissertation. Section 1.1 formally describes models of real-time workload. Section 1.2 introduces the processing platforms considered. Section 1.3 formalizes, categorizes, and discusses various online multiprocessor scheduling algorithms. Section 1.4 more concretely introduces concepts used in formal verification of real-time systems. Section 1.5 explicitly details the contributions of this dissertation. Section 1.6 outlines the overall structure of this document.

1.1 Real-Time Workload Models and Assumptions

Throughout this dissertation, we will characterize a real-time job $J_i$ by a three-tuple $(A_i, E_i, D_i)$: an arrival time $A_i$, an execution requirement $E_i$, and a relative deadline $D_i$. The interpretation of these parameters is that $J_i$ arrives $A_i$ time units after system start-time (assumed to be zero) and must execute for $E_i$ time units over the time interval $[A_i, A_i + D_i)$. $A_i$ is assumed to be a non-negative real number while both $E_i$ and $D_i$ are positive real numbers. The interval $[A_i, A_i + D_i)$ is referred to as $J_i$’s scheduling window. A job $J_i$ is said to be active at time $t$ if $t \in [A_i, A_i + D_i)$ and $J_i$ has unfinished execution.
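The job parameters above can be sketched in code. This is only an illustrative model: the class name `Job`, its field names, and the `executed` argument (the amount of execution the job has received so far) are assumptions introduced here and do not appear in the dissertation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A real-time job (A, E, D); names are hypothetical, for illustration only."""
    arrival: float    # A: arrival time, non-negative
    execution: float  # E: execution requirement
    deadline: float   # D: relative deadline, measured from the arrival time

    def window(self):
        """Return the scheduling window [A, A + D) as a (start, end) pair."""
        return (self.arrival, self.arrival + self.deadline)

    def is_active(self, t, executed):
        """True if t lies in [A, A + D) and the job has unfinished execution.

        `executed` is an assumed bookkeeping value: execution received so far.
        """
        start, end = self.window()
        return start <= t < end and executed < self.execution


job = Job(arrival=5, execution=4, deadline=5)
print(job.window(), job.is_active(7, executed=1.0))  # prints "(5, 10) True"
```

The window is half-open, so a job is no longer active at its absolute deadline $A + D$, even if it still has unfinished execution.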

We denote a real-time instance $I$ as a finite or infinite collection of jobs $I = \{J_1, J_2, \ldots\}$. Unless otherwise specified, we will assume that jobs are indexed in order of their arrival-time (i.e., for $J_i, J_j \in I$: $i < j$ if $A_i < A_j$). $F(I)$ denotes a real-time instance family with representative real-time instance $I$. For each job $J'_i$ in real-time instance $I' \in F(I)$, there is a job $J_i$ in instance $I$ with the same release time and deadline; however, the execution of $J'_i$ cannot exceed the execution time of $J_i$. More formally, $I' \in F(I)$ if and only if

$$\forall J'_i \in I', \exists J_i \in I :: (A'_i = A_i) \wedge (D'_i = D_i) \wedge (E'_i \le E_i).$$

Informally, $F(I)$ represents a set of related real-time instances with $I$ being the most “temporally constrained” of the set.

Example 1.1 Consider a real-time instance $I = \{(0, 2, 3), (5, 4, 5), (6, 2, 4)\}$. $F(I)$ includes any instance $I' = \{(0, x, 3), (5, y, 5), (6, z, 4)\}$ such that $0 \le x \le 2$, $0 \le y \le 4$, and $0 \le z \le 2$.
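For finite instances, the membership condition for the instance family can be checked directly. The sketch below is a hedged illustration rather than anything taken from the dissertation: the function name `in_family` is hypothetical, jobs are plain `(A, E, D)` tuples, and the code transcribes the formal condition literally (every candidate job needs some representative job with equal arrival time and deadline and at least as much execution), using exact numeric comparison.

```python
def in_family(candidate, representative):
    """Check the formal condition for `candidate` belonging to the family of
    `representative`; both are finite collections of (A, E, D) tuples.

    Hypothetical helper: a literal transcription of the quantified condition,
    not an algorithm from the dissertation.
    """
    return all(
        any(a2 == a and d2 == d and e2 <= e for (a, e, d) in representative)
        for (a2, e2, d2) in candidate
    )


# Representative instance from Example 1.1.
I = [(0, 2, 3), (5, 4, 5), (6, 2, 4)]

print(in_family([(0, 1, 3), (5, 4, 5), (6, 0.5, 4)], I))  # prints "True"
print(in_family([(0, 3, 3), (5, 4, 5), (6, 2, 4)], I))    # prints "False"
```

The second call fails because the first job requests 3 units of execution, exceeding the 2 units of the matching job in the representative instance.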

In some simpler real-time systems, it may be possible to completely specify the real-time instance $I$ prior to system run-time (i.e., the system designer has complete knowledge of each $J_i \in I$). However, in systems with a large (or infinite) number of real-time jobs or systems that exhibit dynamic behavior, explicitly specifying each job, prior to system run-time, may be impossible or unreasonable. Fortunately, for systems where jobs may repeat there is a more succinct representation of the repeating jobs via specification in some recurrent task model. A task model is the format and rules for specifying a task system. We may represent a set of repeating or related jobs by a recurrent task $\tau_i$ specified according to the model $M$. For every execution of the system, $\tau_i$ will generate a (possibly infinite) collection of real-time jobs.

Several recurrent tasks can be composed together into a recurrent task system $\tau = \{\tau_1, \tau_2, \ldots, \tau_n\}$. The letter $n$ will denote the number of tasks in a task system throughout this dissertation. Every system execution of task system $\tau$ will result in the generation of a real-time instance $I$. We will denote the set of real-time instances that $\tau$ can legally generate as $I^{M}(\tau)$. Based on the real-time instances that $\tau$ generates, we can classify $\tau$ as either completely-specified or partially-specified. We now discuss the difference between these two types of systems.

1.1.1 Completely-Specified Recurrent Task Systems

If the arrival-time and deadline parameters of each job $J_i \in I$ can be determined prior to system run-time, $\tau$ is a completely-specified task system. Typically, completely-specified task systems are appropriate for applications that have completely predictable executions and do not exhibit dynamic behavior. For example, in an avionic control system, the control system will sample and process the pilot’s input command at strict periodic intervals (e.g., see (Kirsch et al., 2002)). A strict rate is required to ensure that flight control response does not degrade. A completely-specified system is sometimes called a concrete system (Jeffay et al., 1991).

1.1.1.1 Periodic Task Systems

The periodic task model (Liu and Layland, 1973) allows the specification of homogeneous sets of jobs that recur at strict periodic intervals. A periodic task $\tau_i$ is specified by a three-tuple $(o_i, e_i, p_i)$: $o_i$ is the offset of the first job generated by $\tau_i$ from system start time; $e_i$ is the worst-case execution time of any job generated by $\tau_i$; and $p_i$ is the period or inter-arrival time between successive jobs of $\tau_i$. The set of jobs generated by a periodic task $\tau_i$ with worst-case possible execution times is

$$J^{P}_{\mathrm{WCET}}(\tau_i) \stackrel{\mathrm{def}}{=} \{(o_i, e_i, p_i), (o_i + p_i, e_i, p_i), (o_i + 2p_i, e_i, p_i), \ldots\},$$

where, following the $(A_i, E_i, D_i)$ job convention, the third component of each job is its relative deadline $p_i$; the $k$'th job therefore has absolute deadline $o_i + k p_i$. Figure 1.1 illustrates the jobs generated by $\tau_i$. Let $I_{\mathrm{WCET}} = \bigcup_{\tau_i \in \tau} J^{P}_{\mathrm{WCET}}(\tau_i)$; then $I^{P}(\tau) \equiv F(I_{\mathrm{WCET}})$ is the set of real-time instances that can be generated by the periodic task system $\tau$.

Figure 1.1: Jobs generated by periodic task τi. The first job arrives at time oi. Thereafter, successive jobs arrive every pi time units. The activation period of the k’th job of τi is the interval [oi + (k − 1)pi, oi + kpi). The “Ji : ei” above each job indicates that job Ji must execute for ei time units during its scheduling window. [Diagram: timeline starting at time 0 with arrivals at oi, oi + pi, oi + 2pi, oi + 3pi, and oi + 4pi, and jobs J1 through J5, each labeled ei.]

Example 1.2 Consider a periodic task system τ = {τ1 = (0, 2, 4), τ2 = (5, 3, 10)}. The set of jobs generated by τ1 with worst-case execution times is $\mathcal{J}^{P}_{\mathrm{WCET}}(\tau_1)$ = {(0, 2, 4), (4, 2, 4), (8, 2, 4), . . .}; for τ2, $\mathcal{J}^{P}_{\mathrm{WCET}}(\tau_2)$ = {(5, 3, 10), (15, 3, 10), (25, 3, 10), . . .}.
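As a minimal sketch, the job sets of Example 1.2 can be enumerated programmatically. The names `Job` and `periodic_jobs` are hypothetical, and the sketch assumes the convention used in Example 1.2, where the third component of each job is its relative deadline (equal to the period).

```python
from itertools import islice
from typing import Iterator, NamedTuple


class Job(NamedTuple):
    arrival: int    # A_k
    execution: int  # E_k (worst-case execution time)
    deadline: int   # D_k, relative to the arrival time (assumed convention)


def periodic_jobs(offset: int, wcet: int, period: int) -> Iterator[Job]:
    """Yield the jobs of periodic task (o_i, e_i, p_i) in arrival order."""
    k = 0
    while True:
        # The k-th job (counting from zero) arrives exactly k periods after the offset.
        yield Job(offset + k * period, wcet, period)
        k += 1


# First three jobs of each task in Example 1.2.
tau_1_jobs = list(islice(periodic_jobs(0, 2, 4), 3))
tau_2_jobs = list(islice(periodic_jobs(5, 3, 10), 3))
```

Because the job sequence of a periodic task is infinite, the generator is lazy and the caller decides how long a prefix to inspect.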

1.1.2 Partially-Specified Recurrent Task Systems

For many real-time systems, it is not possible to know beforehand what real-time

instance will be generated by the system during run-time. Furthermore, completely-

specified systems such as periodic task systems are incapable of handling changes in

real-time workloads because of the restrictive constraint that jobs must arrive at strict

periodic intervals; for systems where the arrival times between jobs change dynam-

ically (e.g., packets in a network), the periodic task model may not be appropriate.

To overcome the fragile and inflexible nature of completely-specified task systems, a

designer may instead consider partially-specified task systems. A partially-specified

task system is sometimes referred to as non-concrete (Jeffay et al., 1991).

Partially-specified task systems permit that different executions of the same sys-

tem may result in different real-time instances (with different job arrival times) being

generated. The specification for a partially-specified task system includes a set of

constraints that any generated real-time instance must satisfy; in general, such a sys-


tem may legally generate infinitely many different real-time instances, each of which

satisfies the constraints placed on their generation. Each such real-time instance may

also have infinitely many jobs.

Let M and M′ be task models. We say that task model M′ generalizes task model M, if for every task system τ specified in model M there exists a task system τ′ specified in model M′ such that

$$I \in \mathcal{I}^{M}(\tau) \Leftrightarrow I \in \mathcal{I}^{M'}(\tau').$$

That is, for all task systems τ that can be specified in task model M , there is a task

system τ ′ specified in task model M ′ that can generate the same real-time instances

as τ . The concept of generalizing a model will be made clearer in the remainder of

this subsection.

In this subsection, we will introduce several increasingly general models for partially-specified task systems: the sporadic task model with implicit deadlines (Liu and Layland task model), the general sporadic task model with explicit deadlines, the generalized multiframe task model, and the recurring real-time task model. These increasingly general task models can be used to

represent more complex applications than the restrictive periodic task model. After

introducing the increasingly general models, we discuss the relationship between the

various task models.

1.1.2.1 Sporadic Task Systems with Implicit Deadlines (Liu and Layland

(LL) Task Model)

The sporadic task model with implicit deadlines (hereafter, referred to as the Liu and

Layland (LL) task model (Liu and Layland, 1973)) removes the restrictive assumption

of the periodic task model that jobs of a task are generated at strict periodic intervals

(using the generalization discussed in (Mok, 1983)). In addition, an offset parameter is


not specified for LL tasks. The behavior of a LL task τi can be characterized by a two-

tuple (ei, pi). As with the periodic task model, ei indicates the worst-case execution

time of any job generated by task τi. The pi parameter indicates the minimum inter-

arrival time between successive jobs of τi (note pi denoted the exact inter-arrival time

for periodic tasks). Let $\mathcal{J}^{LL}_{\mathrm{WCET}}(\tau_i)$ be a collection of real-time instances that are jobs generated by LL task τi satisfying the minimum inter-arrival constraint and requiring the worst-case possible execution time; i.e., $I_{\tau_i}$ is a member of $\mathcal{J}^{LL}_{\mathrm{WCET}}(\tau_i)$ if and only if for all $J_k \in I_{\tau_i}$ where k > 0 (recall that the jobs are indexed in order of non-decreasing arrival time) the following constraints are satisfied:

$$(E_k = e_i) \wedge (D_k = p_i) \wedge (A_{k+1} - A_k \geq p_i). \tag{1.1}$$

The set of real-time instances that a LL task system τ = {τ1, τ2, . . . , τn} can generate

(with worst-case possible execution time) is equal to

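$$\mathcal{I}^{LL}_{\mathrm{WCET}}(\tau) \stackrel{\mathrm{def}}{=} \left\{ \bigcup_{i=1}^{n} I_{\tau_i} \;\middle|\; (I_{\tau_1}, I_{\tau_2}, \ldots, I_{\tau_n}) \in \prod_{i=1}^{n} \mathcal{J}^{LL}_{\mathrm{WCET}}(\tau_i) \right\}. \tag{1.2}$$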

Thus, the set of real-time instances generated by LL task system τ is

$$\mathcal{I}^{LL}(\tau) = \bigcup_{I_j \in \mathcal{I}^{LL}_{\mathrm{WCET}}(\tau)} \mathcal{F}(I_j). \tag{1.3}$$

Figure 1.2 shows an example release for LL task τi. The following example illus-

trates the increase in flexibility in considering the LL task model over the periodic

task model.

Figure 1.2: Jobs generated by LL task τi. The first job of τi can arrive at any time; in this figure, the first job arrives at time A1. Thereafter, successive job arrivals must be separated by at least pi time units. The “Ji : ei” above each job indicates that job Ji must execute for ei time units during its scheduling window. [Diagram: timeline starting at time 0 with arrivals A1, A2, A3 = A2 + pi, A4, and A5, and jobs J1 through J5, each labeled ei.]

Example 1.3 Consider a LL task system with parameters similar to the task system of Example 1.2: τ = {τ1 = (2, 4), τ2 = (3, 10)}. Examples of sets of jobs in $\mathcal{J}^{LL}_{\mathrm{WCET}}(\tau_1)$ are {(0, 2, 4), (4, 2, 4), (8, 2, 4), . . .}, {(0, 2, 4), (5, 2, 4), (9, 2, 4)}, and {(0, 2, 4), (6, 2, 4), (10, 2, 4), . . .}; examples of sets of jobs in $\mathcal{J}^{LL}_{\mathrm{WCET}}(\tau_2)$ are {(0, 3, 10), (10, 3, 10), (20, 3, 10), . . .}, {(1, 3, 10), (15, 3, 10), (25, 3, 10), . . .}, and {(5, 3, 10), (15, 3, 10), (25, 3, 10), . . .}. Note that $\mathcal{J}^{P}_{\mathrm{WCET}}((o_i, e_i, p_i) = (0, 2, 4))$ is a member of $\mathcal{J}^{LL}_{\mathrm{WCET}}(\tau_1)$ and $\mathcal{J}^{P}_{\mathrm{WCET}}((o_j, e_j, p_j) = (5, 3, 10))$ is a member of $\mathcal{J}^{LL}_{\mathrm{WCET}}(\tau_2)$, where (0, 2, 4) and (5, 3, 10) are the two periodic tasks from Example 1.2.
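The membership condition of Equation 1.1 can be checked mechanically on a finite prefix of a job sequence. The following is a sketch only; the function name `satisfies_ll_constraints` is hypothetical, and jobs are assumed to be (arrival, execution, relative deadline) triples listed in non-decreasing arrival order.

```python
from typing import Sequence, Tuple

Job = Tuple[int, int, int]  # (A_k, E_k, D_k) with D_k a relative deadline


def satisfies_ll_constraints(jobs: Sequence[Job], wcet: int, period: int) -> bool:
    """Check Equation 1.1 on a finite prefix of jobs of LL task (e_i, p_i)."""
    for k, (arrival, execution, deadline) in enumerate(jobs):
        # E_k = e_i and D_k = p_i must hold for every job.
        if execution != wcet or deadline != period:
            return False
        # A_{k+1} - A_k >= p_i must hold for every pair of successive jobs.
        if k + 1 < len(jobs) and jobs[k + 1][0] - arrival < period:
            return False
    return True


# Sequences for tau_1 = (2, 4) taken from Example 1.3.
strictly_periodic = [(0, 2, 4), (4, 2, 4), (8, 2, 4)]
delayed_second_job = [(0, 2, 4), (5, 2, 4), (9, 2, 4)]
# A sequence that is NOT legal: the second job arrives only 3 time units after the first.
too_close = [(0, 2, 4), (3, 2, 4)]
```

Here `satisfies_ll_constraints(strictly_periodic, 2, 4)` and `satisfies_ll_constraints(delayed_second_job, 2, 4)` both hold, while `too_close` violates the minimum inter-arrival constraint.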

1.1.2.2 Sporadic Task Systems with Explicit Deadlines

The LL task model allows for flexibility in the job arrival times for a task τi; however,

the model is still somewhat restrictive in forcing the deadline of each job generated

by τi to be equal to the minimum inter-arrival parameter pi. It is easy to imagine

scenarios where the deadline of a job is not correlated with the minimum inter-arrival:

for example, in a car’s brake system the minimum time between braking events may be

considerably larger than the required braking-reaction time (i.e., deadline for halting

the car). The sporadic task model with explicit deadlines (Mok, 1983) (hereafter, simply referred to as the sporadic task model) generalizes the LL task model by adding a relative deadline parameter di to the specification for a task. The relative

deadline parameter di indicates the offset of a job’s deadline from the arrival time

for any job generated by task τi. A sporadic task τi is specified by the three-tuple

(ei, di, pi). Let $\mathcal{J}^{S}_{\mathrm{WCET}}(\tau_i)$ be a collection of real-time instances that are jobs generated by sporadic task τi satisfying the minimum inter-arrival constraint and requiring the worst-case possible execution time; i.e., $I_{\tau_i}$ is a member of $\mathcal{J}^{S}_{\mathrm{WCET}}(\tau_i)$ if and only if for all $J_k \in I_{\tau_i}$ where k > 0 (recall that the jobs are indexed in order of non-decreasing arrival time) the following constraints are satisfied:

$$(E_k = e_i) \wedge (D_k = d_i) \wedge (A_{k+1} - A_k \geq p_i). \tag{1.4}$$

(Note that the only difference from Equation 1.1 for LL jobs is that the Dk parameter

for each job Jk is set to di). The set of real-time instances that a sporadic task system

τ = {τ1, τ2, . . . , τn} can generate (with worst-case possible execution times) is

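$$\mathcal{I}^{S}_{\mathrm{WCET}}(\tau) \stackrel{\mathrm{def}}{=} \left\{ \bigcup_{i=1}^{n} I_{\tau_i} \;\middle|\; (I_{\tau_1}, I_{\tau_2}, \ldots, I_{\tau_n}) \in \prod_{i=1}^{n} \mathcal{J}^{S}_{\mathrm{WCET}}(\tau_i) \right\}. \tag{1.5}$$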

Thus, the set of real-time instances generated by sporadic task system τ is

$$\mathcal{I}^{S}(\tau) = \bigcup_{I_j \in \mathcal{I}^{S}_{\mathrm{WCET}}(\tau)} \mathcal{F}(I_j). \tag{1.6}$$

Observe that for any LL task system τ = {τ1 = (e1, p1), . . . , τn = (en, pn)} we can represent the same task system in the sporadic model by the sporadic task system τ′ = {τ′1 = (e1, p1, p1), . . . , τ′n = (en, pn, pn)}. It is easy to see that $\mathcal{I}^{LL}(\tau) = \mathcal{I}^{S}(\tau')$; therefore, the sporadic task model generalizes the LL task model.
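The embedding of LL tasks into the sporadic model is simple enough to write down directly. In the sketch below, `ll_to_sporadic` and `satisfies_sporadic_constraints` are hypothetical names; the checker applies Equation 1.4 to a finite prefix of jobs given as (arrival, execution, relative deadline) triples.

```python
from typing import List, Sequence, Tuple

Job = Tuple[int, int, int]           # (A_k, E_k, D_k)
LLTask = Tuple[int, int]             # (e_i, p_i)
SporadicTask = Tuple[int, int, int]  # (e_i, d_i, p_i)


def ll_to_sporadic(tasks: Sequence[LLTask]) -> List[SporadicTask]:
    """Map each LL task (e_i, p_i) to the sporadic task (e_i, p_i, p_i)."""
    return [(e, p, p) for e, p in tasks]


def satisfies_sporadic_constraints(jobs: Sequence[Job], task: SporadicTask) -> bool:
    """Check Equation 1.4 on a finite prefix of jobs of a sporadic task."""
    wcet, rel_deadline, period = task
    for k, (arrival, execution, deadline) in enumerate(jobs):
        if execution != wcet or deadline != rel_deadline:
            return False
        if k + 1 < len(jobs) and jobs[k + 1][0] - arrival < period:
            return False
    return True


# The LL task system of Example 1.3, embedded in the sporadic model.
sporadic_system = ll_to_sporadic([(2, 4), (3, 10)])

# An illustrative (made-up) task whose deadline is much shorter than its
# minimum inter-arrival time, in the spirit of the braking scenario.
brake_task = (1, 2, 10)
```

A job sequence that is legal for an LL task remains legal for its sporadic image, which is the easy direction of the equality stated above.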

1.1.2.3 Generalized Multiframe (GMF) Task Systems

Both the LL and sporadic task models are useful when the worst-case execution time, relative deadline, and minimum inter-arrival time of each job generated by a task are identical. However, for some real-time applications, the sequence of jobs produced may not be homogeneous. The generalized multiframe (GMF) task model¹ (Baruah et al., 1999) permits a task to be characterized as a repeating sequence of heterogeneous real-time jobs.

A GMF task τi is comprised of a finite sequence of jobs (originally referred to

as frames) that can be repeated (possibly infinitely). Let Ni be the number of

jobs that comprise a sequence for τi. Task τi can be characterized by a three-tuple $\tau_i = (\vec{e}_i, \vec{d}_i, \vec{p}_i)$ where $\vec{e}_i$, $\vec{d}_i$, and $\vec{p}_i$ are $N_i$-ary vectors. Vectors $\vec{e}_i = [e_0, e_1, \ldots, e_{N_i-1}]$, $\vec{d}_i = [d_0, d_1, \ldots, d_{N_i-1}]$, and $\vec{p}_i = [p_0, p_1, \ldots, p_{N_i-1}]$ represent (respectively) the worst-case execution requirement, relative deadline, and minimum separation parameter of

each job in the sequence.

Let $\mathcal{J}^{GMF}_{\mathrm{WCET}}(\tau_i)$ be a collection of real-time instances that are jobs generated by GMF task τi satisfying the minimum inter-arrival constraint and requiring the worst-case possible execution time; i.e., $I_{\tau_i}$ is a member of $\mathcal{J}^{GMF}_{\mathrm{WCET}}(\tau_i)$ if and only if for all $J_k \in I_{\tau_i}$ where k > 0, the following constraints are satisfied:

$$(E_k = e_{(k-1) \bmod N_i}) \wedge (D_k = d_{(k-1) \bmod N_i}) \wedge (A_{k+1} - A_k \geq p_{(k-1) \bmod N_i}). \tag{1.7}$$

The set of real-time instances that a GMF task system τ = {τ1, τ2, . . . , τn} can gen-

erate (with worst-case possible execution time) is

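$$\mathcal{I}^{GMF}_{\mathrm{WCET}}(\tau) \stackrel{\mathrm{def}}{=} \left\{ \bigcup_{i=1}^{n} I_{\tau_i} \;\middle|\; (I_{\tau_1}, I_{\tau_2}, \ldots, I_{\tau_n}) \in \prod_{i=1}^{n} \mathcal{J}^{GMF}_{\mathrm{WCET}}(\tau_i) \right\}. \tag{1.8}$$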

Thus, the set of real-time instances generated by GMF task system τ is

$$\mathcal{I}^{GMF}(\tau) = \bigcup_{I_j \in \mathcal{I}^{GMF}_{\mathrm{WCET}}(\tau)} \mathcal{F}(I_j). \tag{1.9}$$

Figure 1.3: Jobs generated by GMF task τi from Example 1.4. [Diagram: timeline from 0 to 20 showing jobs J1:1, J2:2, J3:1, J4:1, J5:2, and J6:1.]

¹ The generalized multiframe task model is a generalization of both the sporadic task model and another model known as the multiframe task model (Mok and Chen, 1996).

Example 1.4 Consider the following GMF task $\tau_i \stackrel{\mathrm{def}}{=} ([1, 2, 1], [5, 4, 3], [4, 5, 4])$. A possible real-time instance $I_{\tau_i} \in \mathcal{J}^{GMF}_{\mathrm{WCET}}(\tau_i)$ is {(0, 1, 5), (4, 2, 4), (9, 1, 3), (13, 1, 5), (17, 2, 4), . . .}. This sequence corresponds to τi generating its first job at time zero, and successive jobs are generated as soon as legally allowable. Figure 1.3 illustrates this arrival sequence. Note that it is permissible for a job to arrive prior to the preceding job’s deadline (i.e., two or more jobs may be in their scheduling window at a given time).
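The "as soon as legally allowable" sequence of Example 1.4 can be reproduced with a short sketch. The function name `gmf_earliest_jobs` is hypothetical; note that the code counts jobs from zero, so frame `k % n` here corresponds to the index (k − 1) mod Ni of Equation 1.7, where jobs are counted from one.

```python
from typing import List, Sequence, Tuple

Job = Tuple[int, int, int]  # (A_k, E_k, D_k) with D_k a relative deadline


def gmf_earliest_jobs(e: Sequence[int], d: Sequence[int], p: Sequence[int],
                      count: int, first_arrival: int = 0) -> List[Job]:
    """Return the first `count` jobs of a GMF task when every job arrives
    exactly at its minimum separation from the preceding job."""
    n = len(e)
    jobs: List[Job] = []
    arrival = first_arrival
    for k in range(count):
        frame = k % n  # cycle through the N_i frames of the task
        jobs.append((arrival, e[frame], d[frame]))
        arrival += p[frame]  # earliest legal arrival time of the next job
    return jobs


# The GMF task of Example 1.4.
example_1_4 = gmf_earliest_jobs([1, 2, 1], [5, 4, 3], [4, 5, 4], count=5)
```

Any sequence in which some arrival is delayed beyond these times is also legal; this sketch only produces the densest one.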

For any sporadic task system τ = {τ1 = (e1, d1, p1), . . . , τn = (en, dn, pn)} we can represent the same task system in the GMF task model by the GMF task system τ′ = {τ′1 = ([e1], [d1], [p1]), . . . , τ′n = ([en], [dn], [pn])} (i.e., each vector is one-dimensional). It is easy to see that $\mathcal{I}^{S}(\tau) = \mathcal{I}^{GMF}(\tau')$; therefore, the GMF task model generalizes the sporadic task model.

1.1.2.4 Recurring Real-Time Task Systems

Each of the previous task models allows for the generation of sequences of jobs by a task: the LL and sporadic task models allow for a sequence of homogeneous jobs from a task, and the GMF task model allows repeating sequences of heterogeneous jobs. In a sense, these models “fix” the relative sequential order of jobs during task specification.

However, a real-time application may need to generate different sequences of jobs

contingent upon the state of the system at run-time. Consider the following simple

temperature control system for maintaining a system temperature between a low-

threshold and high-threshold:

repeat
    ▷ Sample current temperature.
    Generate sampling job JS with execution requirement ES and relative deadline DS.
    if (temperature < low-threshold) then
        ▷ Initiate heating mechanism.
        Generate heating system control job JH with execution requirement EH and relative deadline DH.
    else if (temperature > high-threshold) then
        ▷ Initiate cooling mechanism.
        Generate cooling system control job JC with execution requirement EC and relative deadline DC.
    end if
end repeat

The above temperature-control system generates a sequence of sample and heat-

ing/cooling jobs depending on the state of the system at each sample point. Obviously,

the previously discussed recurrent task models cannot easily model the sequences pro-

duced by such a system.

(Baruah, 2003) introduced the recurring real-time task model to address such

conditional behavior by real-time tasks. In the recurring real-time task model, a

task τi is represented via a directed acyclic graph (DAG) with a unique source and

unique sink vertex. A source vertex is a vertex with no incoming directed edges.

A sink vertex has no outgoing edges. Let the DAG associated with τi be denoted

by G(τi) = (Vertices(τi), Edges(τi)) where Vertices(τi) is a set of labels for the ver-

tices of G(τi) and Edges(τi) ⊆ Vertices(τi)× Vertices(τi). Associated with each vertex

v ∈ Vertices(τi) is an execution requirement e(v) and a relative deadline d(v); the

interpretation is that when τi generates a job associated with vertex v, it will have to

complete at most e(v) units of execution within d(v) time units. Associated with each edge (u, v) ∈ Edges(τi) is a minimum separation p(u, v), which represents the minimum time between the successive generation of jobs associated with vertices u and v. Finally, associated with the entire graph is a parameter P(G(τi)), which represents the minimum time between generation of jobs corresponding to the source vertex (i.e., the jobs corresponding to the source vertex may not have their arrival times less than P(G(τi)) time units apart). For any job J generated by τi, let vertex(J) be the label of the corresponding vertex in G(τi). Figure 1.4 illustrates an example specification of a recurring real-time task.

Figure 1.4: A recurring real-time task τi with four vertices. [Diagram: vertices 0 (e = 3, d = 10), 1 (e = 1, d = 4), 2 (e = 2, d = 5), and 3 (e = 2, d = 7); edges with p(0, 1) = 4, p(0, 2) = 6, p(1, 3) = 7, and p(2, 3) = 4; P(G(τi)) = 15.]

Let $\mathcal{J}^{R}_{\mathrm{WCET}}(\tau_i)$ be a collection of real-time instances that are jobs generated by recurring real-time task τi satisfying the minimum inter-arrival constraints and requiring the worst-case possible execution time. $I_{\tau_i}$ is a member of $\mathcal{J}^{R}_{\mathrm{WCET}}(\tau_i)$ if and only if for all $J_k \in I_{\tau_i}$ where k > 0, the following constraints are satisfied:

1. Ek = e(vertex(Jk)).

2. Dk = d(vertex(Jk)).

3. If vertex(Jk) is the sink, then Jk+1 must also satisfy the following three constraints:

(a) vertex(Jk+1) is the source vertex.

(b) Ak+1 ≥ Ak (i.e., a source vertex job cannot arrive before the previous sink vertex job).

(c) For all source vertex jobs Jℓ (ℓ < k + 1), Ak+1 − Aℓ must be at least P(G(τi)).

4. If vertex(Jk) is not the sink, then the following two constraints must be satisfied:

(a) (vertex(Jk), vertex(Jk+1)) ∈ Edges(τi) (i.e., successive jobs must correspond to an edge in G(τi)).

(b) Ak+1 − Ak must be at least p(vertex(Jk), vertex(Jk+1)).

The set of real-time instances that a recurring real-time task system τ = {τ1, τ2, . . . , τn}

can generate (with worst-case possible execution times) is

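$$\mathcal{I}^{R}_{\mathrm{WCET}}(\tau) \stackrel{\mathrm{def}}{=} \left\{ \bigcup_{i=1}^{n} I_{\tau_i} \;\middle|\; (I_{\tau_1}, I_{\tau_2}, \ldots, I_{\tau_n}) \in \prod_{i=1}^{n} \mathcal{J}^{R}_{\mathrm{WCET}}(\tau_i) \right\}. \tag{1.10}$$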

Thus, the set of real-time instances generated by recurring real-time task system τ is

$$\mathcal{I}^{R}(\tau) = \bigcup_{I_j \in \mathcal{I}^{R}_{\mathrm{WCET}}(\tau)} \mathcal{F}(I_j). \tag{1.11}$$

Example 1.5 Using the specification of the task from Figure 1.4, the following is a possible real-time instance $I_{\tau_i} \in \mathcal{J}^{R}_{\mathrm{WCET}}(\tau_i)$: {(0, 3, 10), (4, 1, 4), (11, 2, 7), (15, 3, 10), (21, 2, 5), (25, 2, 7), (30, 3, 10), . . .}. The example sequence corresponds to the jobs of the “top” path (vertices 0, 1, and 3) being generated first, followed by jobs of the “bottom” path (vertices 0, 2, and 3). The jobs arrive as quickly as legally permitted.
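The arrival times in Example 1.5 follow from the constraints listed above, and the following sketch recomputes them. The function name `earliest_arrivals` and the dictionary encoding of the graph are assumptions made for illustration; each path is assumed to run from the source vertex to the sink vertex.

```python
from typing import Dict, List, Sequence, Tuple

Job = Tuple[int, int, int]  # (A_k, E_k, D_k) with D_k a relative deadline


def earliest_arrivals(e: Dict[int, int], d: Dict[int, int],
                      p: Dict[Tuple[int, int], int], graph_period: int,
                      paths: Sequence[Sequence[int]]) -> List[Job]:
    """Generate jobs along successive source-to-sink paths, releasing each
    job as early as the separation constraints permit."""
    jobs: List[Job] = []
    prev_arrival = 0
    last_source_arrival = None
    for path in paths:
        for pos, v in enumerate(path):
            if pos == 0 and last_source_arrival is None:
                arrival = 0  # very first job of the task
            elif pos == 0:
                # Constraints 3(b) and 3(c): not before the preceding sink job,
                # and at least P(G) after the preceding source job. Source
                # arrivals increase, so the latest one is the binding one.
                arrival = max(prev_arrival, last_source_arrival + graph_period)
            else:
                # Constraint 4: follow an edge and respect its separation.
                # A KeyError here means (u, v) is not an edge of the graph.
                arrival = prev_arrival + p[(path[pos - 1], v)]
            if pos == 0:
                last_source_arrival = arrival
            jobs.append((arrival, e[v], d[v]))
            prev_arrival = arrival
    return jobs


# Parameters read from Figure 1.4.
e = {0: 3, 1: 1, 2: 2, 3: 2}
d = {0: 10, 1: 4, 2: 5, 3: 7}
p = {(0, 1): 4, (0, 2): 6, (1, 3): 7, (2, 3): 4}
example_1_5 = earliest_arrivals(e, d, p, 15, [[0, 1, 3], [0, 2, 3]])
```

The second source job arrives at time 15 rather than 11 because the graph period P(G(τi)) = 15 dominates the sink arrival time; the same reasoning places the third source job at time 30.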

For any GMF task $\tau_i = ([e_0, e_1, \ldots, e_{N_i-1}], [d_0, d_1, \ldots, d_{N_i-1}], [p_0, p_1, \ldots, p_{N_i-1}])$, we can represent the same task in the recurring real-time task model by the task τ′i where Vertices(τ′i) = {0, 1, 2, . . . , Ni} and Edges(τ′i) = {(ℓ, ℓ + 1) | 0 ≤ ℓ < Ni} (i.e., G(τ′i) is a unary tree with Ni + 1 vertices). Each vertex v ∈ {0, 1, . . . , Ni − 1} has the parameters e(v) = ev and d(v) = dv. For vertex Ni, e(Ni) = 0 and d(Ni) = 0 (this is equivalent to a job with no work being produced and could be removed from the real-time instance with no side-effect). Each edge (ℓ, ℓ + 1) ∈ Edges(τ′i) has the parameter p(ℓ, ℓ + 1) = pℓ. The sequence period P(G(τ′i)) is set to zero; i.e., the jobs corresponding to the source vertex can be generated immediately after the generation of the preceding sink vertex job. It is straightforward to see that if jobs with zero execution are removed from $\mathcal{I}^{R}(\tau')$ then $\mathcal{I}^{GMF}(\tau) = \mathcal{I}^{R}(\tau')$. Therefore, the recurring real-time task model generalizes the GMF task model. Figure 1.5 shows how the GMF task of Example 1.4 can be represented as a recurring real-time task.
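The construction just described is mechanical, and a sketch of it is given here. The function name `gmf_to_recurring` and the returned dictionary layout are hypothetical choices for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def gmf_to_recurring(e: Sequence[int], d: Sequence[int], p: Sequence[int]) -> Dict[str, object]:
    """Build the chain-shaped recurring real-time task for a GMF task."""
    n = len(e)
    vertices: List[int] = list(range(n + 1))  # 0, 1, ..., N_i
    edges: List[Tuple[int, int]] = [(l, l + 1) for l in range(n)]
    execution = {v: e[v] for v in range(n)}
    deadline = {v: d[v] for v in range(n)}
    # The extra sink vertex N_i represents a job with no work.
    execution[n] = 0
    deadline[n] = 0
    separation = {(l, l + 1): p[l] for l in range(n)}
    return {
        "vertices": vertices,
        "edges": edges,
        "e": execution,
        "d": deadline,
        "p": separation,
        "P": 0,  # a source job may follow the preceding sink job immediately
    }


# The GMF task of Example 1.4.
recurring = gmf_to_recurring([1, 2, 1], [5, 4, 3], [4, 5, 4])
```

For the task of Example 1.4 this yields four vertices and the separations p(0, 1) = 4, p(1, 2) = 5, and p(2, 3) = 4.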

Figure 1.5: A recurring real-time task τ′i that is equivalent to the GMF task $\tau_i \stackrel{\mathrm{def}}{=} ([1, 2, 1], [5, 4, 3], [4, 5, 4])$ of Example 1.4. [Diagram: a chain of vertices 0 (e = 1, d = 5), 1 (e = 2, d = 4), 2 (e = 1, d = 3), and 3 (e = 0, d = 0); edges with p(0, 1) = 4, p(1, 2) = 5, and p(2, 3) = 4; P(G(τ′i)) = 0.]

1.1.2.5 Relationship Between Task Models

In this section, we have introduced increasingly general task models. Every gener-

alization provides descriptive power to model increasingly complex behavior by real-

time applications. Figure 1.6 illustrates the space of real-time instances that may be

generated by partially-specified task systems in the various models described in this

section. Table 1.1 summarizes the task models introduced in this section and briefly

states the contribution of each model.

Task Model   Task Specification              Contribution
LL           Two-tuple                       Minimum separation parameter
Sporadic     Three-tuple                     Non-implicit relative deadline parameter
GMF          Three-tuple of Ni-ary vectors   Heterogeneous sequences of jobs
Recurring    DAG                             Conditional job generation

Table 1.1: The above table summarizes the task models for partially-specified task systems introduced in this section. For each model, the task specification is informally described and a brief summary of the contribution of the task model is given.

We will see in Chapter 2 that most prior work in multiprocessor real-time systems

has focused upon the simplest model discussed in this section: the LL task model.

This dissertation instead focuses on expanding the types of systems that may be

formally verified by considering the sporadic and more general task models. The most

general multiprocessor scheduling results in this dissertation are valid for all real-time

models described above. Section 1.5 more explicitly describes the contribution that

developing formal temporal analysis for general task systems makes to multiprocessor

real-time systems research.

1.2 Processing Platform

This dissertation focuses on real-time scheduling upon multiprocessor platforms.

More specifically, we will be concentrating on scheduling upon a class of multipro-

cessor platforms known as the identical multiprocessors. The identical multiprocessor

model assumes that each processor in the platform has identical processing capabilities and speed; more specifically, each processor is identical in terms of architecture, cache size and speed, I/O and resource access, and access time to shared memory (called Uniform Memory Access (UMA)). Figure 1.7 gives a high-level illustration of a possible layout of an identical multiprocessor platform. This type of multiprocessor layout is sometimes also called a symmetric shared-memory multiprocessor (SMP). We denote the multiprocessor platform by Π and assume Π is comprised of m identical processors π1, π2, . . ., πm ∈ Π.

Figure 1.6: The space of real-time instances that can be generated by the models discussed in Section 1.1.2. $\mathcal{I}$ is the set of all real-time instances. $\mathcal{I}^{M}$ is the set of all real-time instances that can be generated by a task system (with a finite number of tasks) specified in task model M. Note that $\mathcal{I} \supset \mathcal{I}^{R} \supset \mathcal{I}^{GMF} \supset \mathcal{I}^{S} \supset \mathcal{I}^{LL}$. [Diagram: nested regions, from outermost to innermost, labeled I, I R, I GMF, I S, and I LL.]

Recall from the beginning of this chapter that each job corresponds to the exe-

cution of a sequential segment of code by the processing platform. For each model

introduced in the previous section (Section 1.1), a real-time task has associated worst-

case execution requirement parameter(s). These execution requirements represent the

worst-case cumulative amount of execution time that a job generated by the task

requires to execute to completion on the processing platform. The process of determining the worst-case execution parameters is called timing analysis. Timing analysis must account for worst-case cache behavior, pipeline stalls, memory contention, memory access time, program structure, and worst-case execution paths. The analysis for determining the contribution to the worst-case execution time of each of these factors is dependent on the specific system and the program. Another factor that may increase the worst-case execution time is job preemption (i.e., a job suspends while a different job executes and resumes execution at a later time). The context switch, state saving, and scheduling-decision processing time of the platform’s operating system add time to the job’s execution requirement. Furthermore, if a job is allowed to migrate between processors during its scheduling window, there may be an added penalty of refreshing the cache of the processor to which the job is migrating. The preemption and migration execution costs are typically dependent on the processor architecture and the scheduling algorithm used. Calandrino et al. (2006) determine the cost of preemption and migration for various multiprocessor scheduling algorithms on a Linux-based testbed. There are known techniques for accounting for these factors in the worst-case execution time parameter (see techniques described in (Devi, 2006; Baker and Baruah, 2007) for multiprocessor systems). In this dissertation, we will assume that the worst-case execution time of each task has already been determined.

Figure 1.7: The layout of a symmetric shared-memory multiprocessor (SMP) platform. [Diagram: processors π1, π2, . . ., πm, each with its own cache, connected by a bus to shared memory.]

We will assume that each processor has unit-speed. We will assume that jobs are

preemptable. The next section will discuss under what scenarios job migration be-

tween processors is allowed. Though a job may execute on different processors over its

scheduling window, job-level parallelism is not permitted (i.e., a job may not execute

concurrently with itself on two or more processors simultaneously); this assumption

is not limiting, since we have defined a job to correspond to a sequential segment of

code. Throughout this dissertation we will also assume that tasks are independent of

each other; that is, the execution of a job of one task is not contingent upon the status

of a job of another task (e.g., blocking on shared resources is not permitted). Devel-

oping formal analysis techniques for general task systems that are not independent is

the subject of current research and beyond the scope of this dissertation.

1.3 Real-Time Scheduling Algorithms

When executing a real-time application, the real-time scheduling algorithm must de-

termine which active jobs are executing on the processing platform at every time

instant. At an abstract level, the real-time scheduling algorithm determines the in-

terleaving of execution for jobs of any real-time instance I on the processing platform

Π. The interleaving of execution of I on Π is known as a schedule. The goal of a

real-time scheduling algorithm is to produce a schedule that ensures that every job

of I is allocated the processor (i.e., executes) for its execution requirement during its

scheduling window.

In this section, we discuss the classification of real-time multiprocessor scheduling

algorithms. Section 1.3.1 gives some formal definitions for real-time scheduling algo-

rithm concepts. Section 1.3.2 introduces a family of scheduling algorithms known as


priority-driven scheduling algorithms. Section 1.3.3 classifies multiprocessor schedul-

ing algorithms based on the degree of migration permitted.

1.3.1 Notation

In this section and Section 1.4.1, we take a very formal approach in defining the

concepts of real-time scheduling algorithms and formal verification techniques. In

particular, we give mathematical notation and definitions for concepts for which a verbal

definition has sufficed in most of the real-time literature. Our reasons for using a much

more formal approach are twofold:

1. The more formal definitions allow us to reason about scheduling algorithms in

very abstract terms. These abstractions will be used heavily in Chapter 4 and

Appendix A.

2. The formal definitions make the connection between the concepts of schedul-

ing algorithms, formal verification techniques, and real-time instance models

explicit and unambiguous.

However, to reduce the burden on the reader in remembering notation, we will also

provide a verbal description of each of the concepts introduced in these sections. We

will use the verbal definitions when the more formal definitions are not required and

the meaning is clear; the reader should refer back to the formal definitions of this

chapter, if any confusion or ambiguity arises. The formal definitions will be used

primarily in Chapter 4 and Appendix A.

As mentioned earlier, a schedule specifies the interleaving of execution of jobs of

a real-time instance. That is, a schedule will indicate at any given time which job is

executing on which processor. We can formally define the schedule S for real-time

instance I as a function of the processor and time.


Definition 1.1 (Schedule Function) Let $S_I(\pi_k, t)$ be the job of I scheduled at time t on processor πk ∈ Π; $S_I(\pi_k, t)$ is ⊥ if there is no job scheduled on πk at time t (i.e., $S_I : \Pi \times \mathbb{R}^{+} \mapsto I \cup \{\perp\}$). Let $\mathcal{S}_{I,\Pi}$ be the set of all possible schedule functions over real-time instance I and platform Π.

It is sometimes useful to view the behavior of a single job of a real-time instance

I in schedule SI . The following definition allows us to characterize the schedule SI

with respect to job Ji.

Definition 1.2 (Job-Schedule Function) $S_I(\pi_k, t, J_i)$ is an indicator function denoting whether Ji is scheduled at time t on processor πk for schedule $S_I$. In other words,

$$S_I(\pi_k, t, J_i) \stackrel{\mathrm{def}}{=} \begin{cases} 1, & \text{if } S_I(\pi_k, t) = J_i \\ 0, & \text{otherwise.} \end{cases} \tag{1.12}$$
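Definitions 1.1 and 1.2 can be made concrete with a small sketch. It assumes, purely for illustration, that time is divided into unit-length slots (the definitions themselves use real-valued time) and that a schedule is stored as a table; the names `scheduled_job`, `job_indicator`, and `execution_received` are hypothetical, and the last helper is an added illustration of how the indicator can be summed rather than something defined in the text.

```python
from typing import Dict, Iterable, Optional, Tuple

# (processor index, time slot) -> job label; absent entries mean "idle".
Schedule = Dict[Tuple[int, int], str]


def scheduled_job(schedule: Schedule, processor: int, t: int) -> Optional[str]:
    """Schedule function S_I(pi_k, t); None plays the role of the symbol ⊥."""
    return schedule.get((processor, t))


def job_indicator(schedule: Schedule, processor: int, t: int, job: str) -> int:
    """Job-schedule function S_I(pi_k, t, J_i) of Equation 1.12."""
    return 1 if scheduled_job(schedule, processor, t) == job else 0


def execution_received(schedule: Schedule, processors: Iterable[int],
                       job: str, start: int, end: int) -> int:
    """Total slots in [start, end) during which `job` runs on any processor."""
    procs = list(processors)
    return sum(job_indicator(schedule, pi, t, job)
               for pi in procs for t in range(start, end))


# A made-up schedule on two processors: J1 runs on processor 1 in slots 0 and 1,
# and J2 runs on processor 2 in slot 0.
schedule = {(1, 0): "J1", (2, 0): "J2", (1, 1): "J1"}
```

Summing the indicator over a job's scheduling window, as `execution_received` does, gives the amount of execution the job receives in that window under this discrete-time assumption.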

A scheduling algorithm makes decisions about the order in which jobs of a real-

time instance should execute. If the real-time instance is specified prior to run-time or

generated by a completely-specified task system, a scheduling algorithm can generate

and store the schedule prior to run-time. This approach is called static scheduling or

table-driven scheduling (see (Baker and Shaw, 1989) for an example static scheduler).

For systems that are partially-specified or have schedules too large to store in a

system’s memory, an online algorithm is more appropriate. For any time t, an online

real-time scheduling algorithm decides the set of jobs that will be executed on Π at

time t based on prior decisions and the status of jobs released at or prior to t. An

online scheduling algorithm does not have specific information on the release of jobs

after time t (i.e., future job arrival times are unknown). This dissertation focuses on

deterministic online real-time scheduling algorithms.


At an abstract level, a real-time scheduling algorithm² A (either static or online) on platform Π is a higher-order function³ from real-time instances to schedules over Π; i.e., $A : \mathcal{I} \to \bigcup_{I \in \mathcal{I}} \mathcal{S}_{I,\Pi}$. Let $I_{\leq t} \stackrel{\mathrm{def}}{=} \{J_i \in I \mid A_i \leq t\}$; that is, $I_{\leq t}$ is the set

of jobs of I that arrive prior to or at time t. For an online scheduling algorithm A,

I≤t represents the set of jobs that A has knowledge of at time t (i.e., A knows the

arrival time, execution requirement, and deadline parameters of the jobs of I≤t, but

not other jobs of I). Up until time t, algorithm A has made scheduling decisions

without specific knowledge of jobs arriving after time t; furthermore, jobs arriving

after t cannot have an effect on the schedule generated by A from time zero to t.

In other words, for an online scheduling algorithm future jobs cannot change past

scheduling decisions.

Definition 1.3 (Deterministic Online Scheduling Algorithm) For any $I \in \mathcal{I}$, let $S^{A}_{I}$ be the schedule produced by algorithm A for real-time instance I and platform Π. An online real-time scheduling algorithm must satisfy the following constraint: for all $I, I' \in \mathcal{I}$ and for all t > 0,

$$(I_{\leq t} = I'_{\leq t}) \Rightarrow \left( \forall t' (0 \leq t' \leq t), \forall \pi_k \in \Pi :: S^{A}_{I}(\pi_k, t') = S^{A}_{I'}(\pi_k, t') \right). \tag{1.13}$$
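The constraint of Equation 1.13 can be tested on a pair of tabulated schedules. The sketch below assumes unit-length time slots and a finite horizon, neither of which is part of the definition; the names `known_jobs` and `consistent_with_online` are hypothetical. Jobs are (arrival, execution, relative deadline) triples, and a schedule maps (processor, slot) to a job.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Job = Tuple[int, int, int]               # (A_i, E_i, D_i)
Schedule = Dict[Tuple[int, int], Job]    # (processor, slot) -> job; absent = idle


def known_jobs(instance: Set[Job], t: int) -> FrozenSet[Job]:
    """I_{<=t}: the jobs of the instance that arrive at or before time t."""
    return frozenset(job for job in instance if job[0] <= t)


def consistent_with_online(instance_a: Set[Job], schedule_a: Schedule,
                           instance_b: Set[Job], schedule_b: Schedule,
                           processors: Iterable[int], horizon: int) -> bool:
    """Check Equation 1.13 for every slot t in 1..horizon."""
    procs = list(processors)
    for t in range(1, horizon + 1):
        if known_jobs(instance_a, t) != known_jobs(instance_b, t):
            continue  # the premise fails, so nothing is required at this t
        for t_prime in range(0, t + 1):
            for pi in procs:
                if schedule_a.get((pi, t_prime)) != schedule_b.get((pi, t_prime)):
                    return False
    return True


# Two made-up instances that are identical until a second job arrives at time 3.
j1, j2 = (0, 2, 4), (3, 1, 2)
instance_a, instance_b = {j1}, {j1, j2}
schedule_a = {(1, 0): j1, (1, 1): j1}
schedule_b = {(1, 0): j1, (1, 1): j1, (1, 3): j2}
# A schedule that idles at slot 0 in anticipation of j2 uses future knowledge.
clairvoyant_b = {(1, 1): j1, (1, 2): j1, (1, 3): j2}
```

The pair (`schedule_a`, `schedule_b`) agrees wherever the two instances are indistinguishable, whereas `clairvoyant_b` differs from `schedule_a` at slot 0 even though both instances look the same at that time.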

1.3.2 Priority-Driven Scheduling Algorithms

A possible approach to the online scheduling of a real-time instance on a processing platform is to assign, at any given time t, each job Ji a priority ρ(Ji, t) (which is assumed to be a real number). A priority-driven scheduling algorithm at each time t sorts the active jobs according to ρ(Ji, t) (in non-decreasing order) and schedules the highest-priority job(s) on the processing platform. In this section, we will describe priority-driven scheduling algorithms assuming a uniprocessor system; the next section (Section 1.3.3) will explain how these priority-driven algorithms will be used on multiprocessor platforms under different degrees of migration. Priority-driven scheduling algorithms differ in the manner that they assign priority to each job. In the following, we give three classifications of priority-driven scheduling algorithms. We follow the classification names used in (Baker and Baruah, 2007) ((Carpenter et al., 2003) provides a thorough overview of the types of priority-driven algorithms under slightly different terminology). The three major classes of priority-driven scheduling algorithms are fixed task-priority (FTP), fixed job-priority (FJP), and dynamic priority (DP).

² We will slightly abuse notation and use A to refer to both the scheduling algorithm and the function.

³ A higher-order function has a function space as either the domain or range.

1.3.2.1 Fixed Task-Priority (FTP) Scheduling Algorithms

In FTP scheduling, each task is assigned a fixed priority c, and each job generated

by that task is assigned the same priority value. Thus, for all $J_k \in \mathcal{I}^{M}(\tau)$ for any recurrent task model M, ρ(Jk) = c for all t ≥ 0.⁴ For a recurrent task system with n tasks, there are n distinct priorities (one for each task). For FTP-scheduled systems, we will assume that tasks are indexed in decreasing order of priority; for $\tau \stackrel{\mathrm{def}}{=} \{\tau_1, \ldots, \tau_n\}$, τ1 has the highest priority and τn has the lowest priority. In general,

the task-priority assignment can be determined by the system designer. However,

there are two well-studied FTP-assignment algorithms for sporadic task systems: rate

monotonic (rm) and deadline monotonic (dm).

§ Rate Monotonic (rm). For rm scheduling (Liu and Layland, 1973), each sporadic task τi is assigned priority equal to the inverse of its period: for all $J_k \in I_{\tau_i}$ (where $I_{\tau_i} \in \mathcal{J}^{S}_{\mathrm{WCET}}(\tau_i)$), the priority ρ(Jk) is equal to 1/pi.

4 We omit the argument t from ρ(·) for FTP and FJP scheduling, since priority will not change over time for any job.


[Figure 1.8 graphic: three timelines labeled RM, DM, and EDF, each showing the jobs of τA and τB over the time interval 0 to 10.]

Figure 1.8: Possible schedules of rm, dm, and edf for the task system of Example 1.6.

Example 1.6 Consider the sporadic task system τ ≝ {τA, τB} where τA ≝ (1, 4, 7) and τB ≝ (3, 5, 5). The top schedule in Figure 1.8 gives a possible schedule for rm with respect to a legal job arrival sequence for τ.

§ Deadline Monotonic (dm). The dm scheduling algorithm (Leung and Whitehead, 1982) assigns to each sporadic task τi a priority equal to the inverse of its relative deadline parameter: for all Jk ∈ Iτi (∈ J^S_WCET(τi)), the priority ρ(Jk) is equal to 1/di. The middle schedule in Figure 1.8 gives a possible schedule for dm with respect to a legal job arrival sequence for τ from Example 1.6.
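The two FTP assignments can be illustrated with a minimal Python sketch applied to the task system of Example 1.6. The `SporadicTask` type and the function names are hypothetical, and the sketch assumes that each task triple is read as (execution requirement, relative deadline, period) and that a larger priority value means a higher priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SporadicTask:
    """Hypothetical container; fields assumed to be (e, d, p)."""
    name: str
    e: float  # worst-case execution requirement
    d: float  # relative deadline
    p: float  # period (minimum inter-arrival separation)


def rm_priority(task):
    """Rate monotonic: priority is the inverse of the period."""
    return 1.0 / task.p


def dm_priority(task):
    """Deadline monotonic: priority is the inverse of the relative deadline."""
    return 1.0 / task.d


def order_by_priority(tasks, priority):
    """Return tasks from highest to lowest priority (larger value is higher)."""
    return sorted(tasks, key=priority, reverse=True)


tau_a = SporadicTask("tau_A", 1, 4, 7)
tau_b = SporadicTask("tau_B", 3, 5, 5)

rm_order = [t.name for t in order_by_priority([tau_a, tau_b], rm_priority)]
dm_order = [t.name for t in order_by_priority([tau_a, tau_b], dm_priority)]
print(rm_order)  # ['tau_B', 'tau_A']
print(dm_order)  # ['tau_A', 'tau_B']
```

Under these assumptions the two assignments order the tasks differently: rm favors τB (shorter period), while dm favors τA (shorter relative deadline).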

1.3.2.2 Fixed Job-Priority (FJP) Scheduling Algorithms

For FJP scheduling, the restriction that a task’s jobs have identical priority is re-

moved. Instead, each job Jk is assigned a single priority ρ(Jk) that does not change.


The specific FJP scheduling algorithm determines the priority assignment for jobs.

Earliest-deadline first (edf) is a well known FJP scheduling algorithm.

§ Earliest-Deadline First (edf). The edf scheduling algorithm (Liu and Layland, 1973) assigns a priority to each job equal to the inverse of its absolute deadline. In other words, among the jobs with remaining execution, edf schedules the m jobs with the nearest deadlines (with m = 1 on a uniprocessor). For a recurrent task τi, the priority ρ(Jk) of any job Jk ∈ Iτi (∈ J_WCET(τi)) equals 1/(Ak + Dk). The bottom schedule in Figure 1.8 gives a possible schedule for edf with respect to a legal job arrival sequence for τ from Example 1.6.
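The edf rule can be sketched in a few lines of Python. The `Job` type and `edf_select` are hypothetical names; the sketch assumes that the `execution` field holds the remaining execution of the job and that absolute deadlines are strictly positive, so that the inverse is well defined.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record (A_k, E_k, D_k)."""
    name: str
    arrival: float       # A_k
    execution: float     # remaining execution (E_k when the job arrives)
    rel_deadline: float  # D_k


def edf_priority(job):
    """Inverse of the absolute deadline A_k + D_k (assumed positive)."""
    return 1.0 / (job.arrival + job.rel_deadline)


def edf_select(jobs, t, m=1):
    """Return up to m arrived jobs with remaining execution, nearest deadline first."""
    active = [j for j in jobs if j.arrival <= t and j.execution > 0]
    return sorted(active, key=edf_priority, reverse=True)[:m]


# First jobs of tau_A and tau_B from Example 1.6, plus a later job of tau_B.
jobs = [
    Job("J_A1", 0, 1, 4),
    Job("J_B1", 0, 3, 5),
    Job("J_B2", 5, 3, 5),
]
print([j.name for j in edf_select(jobs, t=0)])       # ['J_A1']
print([j.name for j in edf_select(jobs, t=0, m=2)])  # ['J_A1', 'J_B1']
```

Passing m greater than one gives the selection that would be made on a platform with m processors; the uniprocessor case of this section corresponds to m = 1.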

1.3.2.3 Dynamic-Priority (DP) Scheduling Algorithms

The DP scheduling-algorithm classification is the most general. DP scheduling removes the restriction that a job’s priority does not change: the priority ρ(Jk, t) of a job Jk can now vary over time. Examples of well-known DP scheduling algorithms include least-laxity first (LLF) and Pfair-based algorithms (Baruah et al., 1996; Baruah et al., 1995; Anderson and Srinivasan, 2004).

In this dissertation, we focus on FTP and FJP scheduling on multiprocessor plat-

forms, primarily on the dm and edf scheduling algorithms.

1.3.3 Degree of Migration

The allocation of real-time jobs to processors is another dimension of scheduling

that may be used to classify real-time scheduling algorithms. Using the classification

of (Carpenter et al., 2003), we consider three classes of multiprocessor scheduling

algorithms: partitioned scheduling, restricted-migration scheduling, and full-migration

scheduling.


1.3.3.1 Partitioned Scheduling

In partitioned scheduling, each recurrent task τi ∈ τ is assigned a single processor

πℓ ∈ Π. The assignment of tasks to processors is typically done at system-design

time. There are several algorithms and heuristics for assigning tasks to processors;

Chapter 7 considers various partitioning algorithms for sporadic tasks. Once a task-

processor assignment for τi to πℓ is determined, every job Jk generated by τi executes

solely on processor πℓ. Let τ(πk) denote the set of tasks assigned to processor πk. The

tasks of τ(πk) are scheduled on processor πk according to some uniprocessor scheduling

algorithm. Figure 1.10 shows a high-level view of the partitioning approach.

Example 1.7 Consider the following three-task, sporadic task system τ ≝ {τ1 =

(2, 3, 5), τ2 = (4, 8, 8), τ3 = (2, 4, 4)}. Let τ be scheduled upon two processors; the

partition is τ(π1) = {τ1, τ2} and τ(π2) = {τ3}. Figure 1.9(a) gives the partitioned

schedule (with edf used to schedule each individual processor) of a possible real-time

instance generated by τ .
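As a small illustration of the partition in Example 1.7, the following Python sketch checks that an assignment places every task on exactly one processor and looks up the processor on which all jobs of a task execute. The dictionary representation and the function names are assumptions made for this sketch only.

```python
def is_partition(tasks, assignment):
    """True if every task appears on exactly one processor (tasks are distinct)."""
    assigned = [t for group in assignment.values() for t in group]
    return sorted(assigned) == sorted(tasks)


def processor_of(task, assignment):
    """Processor on which every job generated by the task must execute."""
    for processor, group in assignment.items():
        if task in group:
            return processor
    raise KeyError(task)


tasks = ["tau_1", "tau_2", "tau_3"]
assignment = {"pi_1": ["tau_1", "tau_2"], "pi_2": ["tau_3"]}

print(is_partition(tasks, assignment))      # True
print(processor_of("tau_3", assignment))    # pi_2
```

The check says nothing about whether each processor can meet its deadlines; that question is left to a uniprocessor test applied to each τ(πk).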

1.3.3.2 Full-Migration Scheduling

The least restrictive of the migration-based scheduling classifications is full-migration

scheduling. In this classification, a job can halt its execution on one processor and re-

sume execution on a different processor. The only major restriction for full-migration

scheduling algorithms is that job-level parallelism is forbidden (i.e., a job may not

execute concurrently with itself on two or more different processors). It is dependent

on the scheduling algorithm whether task-level parallelism is permitted. Figure 1.11

gives a high-level overview of full-migration scheduling. Figure 1.9(b) illustrates the

full-migration schedule for the task system of Example 1.7.


[Figure 1.9 graphic: three panels (a), (b), and (c), each showing timelines for τ1 = (2, 3, 5), τ2 = (4, 8, 8), and τ3 = (2, 4, 4) on processors π1 and π2.]

Figure 1.9: Example multiprocessor schedules for task system τ of Example 1.7 under edf-scheduling and the different paradigms considered in this paper. (a) shows a partitioned schedule; (b) shows a full-migration schedule; and (c) shows the restricted-migration schedule.


Figure 1.10: A high-level perspective of partitioned scheduling. Tasks are statically assigned to processors by a partitioning algorithm. Each task places generated jobs on the processor’s local priority queue and a uniprocessor scheduling algorithm is used to schedule each processor.

Figure 1.11: A high-level perspective of full-migration scheduling. Each job generated by a task is placed upon a global priority queue. A global scheduler decides what jobs execute on each processor at any given time.


Figure 1.12: A high-level perspective of restricted-migration scheduling. There are essentially two levels of schedulers. At run-time, jobs generated by a task are placed on a global priority queue. A global scheduler assigns each job to a processor by placing it on the processor’s priority queue. A uniprocessor scheduling algorithm is used to schedule each processor’s assigned jobs.

1.3.3.3 Restricted-Migration Scheduling

Tasks are allowed to migrate between processors in a restricted-migration scheduling

algorithm. Each job, however, must execute on only one processor. For partially-

specified systems, the assignment of jobs to processors in restricted-migration schedul-

ing is typically done online, since the specific job arrivals are not known a priori. For

any real-time instance I, let I(πk) ⊆ I be the set of jobs assigned to processor πk ∈ Π

by the restricted-migration scheduling algorithm. Like partitioned scheduling, the

jobs of I(πk) (once assigned upon arrival) are scheduled using a uniprocessor schedul-

ing algorithm. Like full-migration scheduling, it is dependent upon the algorithm

whether task-level parallelism is allowed (i.e., two jobs of the same task executing on

different processors concurrently). Figure 1.12 gives a high-level overview of restricted-migration scheduling. Figure 1.9(c) illustrates the restricted-migration schedule for the task

system of Example 1.7.


1.4 Formal Verification of Real-Time Systems

As mentioned in the introduction, to ensure a real-time system is temporally correct

it must be validated prior to system run-time using formal verification techniques.

These formal verification techniques must ensure that for all legal executions of the

system every real-time job generated by the system will meet its deadline on the

processing platform. We consider two fundamental problems in formal verification for

real-time systems: feasibility analysis and schedulability analysis. Feasibility analysis

determines whether there always exists a “way” to schedule a system (irrespective of

scheduling algorithm) meeting all deadlines. Schedulability analysis determines (with

respect to a given scheduling algorithm) whether the scheduling algorithm will always

meet all the system’s deadlines.

The remainder of this section further introduces real-time formal verification tech-

niques. Section 1.4.1 gives some notation that will be used to formally define feasibil-

ity and schedulability analysis. Section 1.4.2 discusses feasibility analysis for general

task systems. Section 1.4.3 discusses schedulability analysis and related concepts for

various real-time scheduling algorithms. Section 1.4.4 describes an approach, called

resource-augmentation analysis, which is used to theoretically evaluate the effective-

ness of real-time verification techniques.

1.4.1 Notation

This section gives formalism that will be used throughout this dissertation. When

evaluating a real-time system, it is sometimes useful to describe the amount of “work”

(execution) that a job does over a specified interval in a given schedule. The next

definition defines the amount of “processor time” that a job receives over a given

interval.


Definition 1.4 (Work Function) W(SI, Ji, t1, t2) denotes the amount of processor time (over all processors of Π) that Ji receives from schedule SI over the interval [t1, t2). In other words,⁵

$$W(S_I, J_i, t_1, t_2) \stackrel{\mathrm{def}}{=} \sum_{\pi_k \in \Pi} \left[ \int_{t_1}^{t_2} S_I(\pi_k, t, J_i)\, dt \right]. \tag{1.14}$$

We can use a system-work function to describe the cumulative work done by all

jobs of a real-time instance over a specified time interval in a given schedule.

Definition 1.5 (System-Work Function) WI(SI, t1, t2) denotes the amount of processor time received by all jobs of I in schedule SI over the interval [t1, t2).

$$W_I(S_I, t_1, t_2) \stackrel{\mathrm{def}}{=} \sum_{J_i \in I} W(S_I, J_i, t_1, t_2). \tag{1.15}$$
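For a schedule that is piecewise constant, the integral in the work function reduces to a sum of interval overlaps. The following Python sketch computes both work functions under that assumption. The representation of a schedule as a list of (processor, start, end, job) segments, and the reading of SI(πk, t, Ji) as an indicator that equals 1 exactly when Ji executes on πk at time t, are assumptions of this sketch.

```python
def work(schedule, job, t1, t2):
    """Processor time that `job` receives over [t1, t2), summed over all processors."""
    total = 0.0
    for _processor, start, end, scheduled_job in schedule:
        if scheduled_job == job:
            # Length of the overlap between [start, end) and [t1, t2).
            total += max(0.0, min(end, t2) - max(start, t1))
    return total


def system_work(schedule, jobs, t1, t2):
    """Processor time received by all jobs in `jobs` over [t1, t2)."""
    return sum(work(schedule, job, t1, t2) for job in jobs)


# Hypothetical two-processor schedule given as (processor, start, end, job).
schedule = [
    ("pi_1", 0, 2, "J1"),
    ("pi_2", 1, 3, "J2"),
    ("pi_1", 2, 3, "J1"),
]
print(work(schedule, "J1", 0, 3))                  # 3.0
print(work(schedule, "J2", 0, 2))                  # 1.0
print(system_work(schedule, ["J1", "J2"], 0, 2))   # 3.0
```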

Not all functions from Π×R+ to I for a given real-time instance I represent valid

executions of a real-time system that could generate the instance I. In particular,

we must ensure the following: a job can only execute during its scheduling window,

a job cannot execute concurrently with itself on two or more processors, and a job

must execute for Ei time units in its scheduling window to meet its deadline. Using

Definitions 1.1 through 1.5, we can now formally define a valid schedule SI with

respect to a real-time instance I:

Definition 1.6 (Valid Schedule) SI ∈ SI,Π is valid (with respect to jobs of some real-time instance I and platform Π) if and only if the following three conditions are satisfied:

1. For any Ji ∈ I, if t < Ai or t > Ai + Di then SI(πk, t) ≠ Ji for all πk ∈ Π (i.e., a job cannot execute while it is outside its scheduling window). For this dissertation, we will assume that two different jobs of the same task may execute concurrently on different processors (i.e., intra-task parallelism is allowed, but intra-job parallelism is forbidden). This assumption excludes certain task systems such as sporadic tasks with relative deadlines greater than the task’s period. We are currently working on extending our results to such systems that forbid intra-task parallelism and require that jobs of a task execute in-order.

2. If SI(πi, t) ≠ ⊥ and SI(πj, t) ≠ ⊥ then SI(πi, t) ≠ SI(πj, t) for all t ∈ R+ and πi ≠ πj ∈ Π (i.e., a job may not execute concurrently with itself).

3. For all Ji ∈ I, W(SI, Ji, Ai, Ai + Di) = Ei (i.e., each job receives processing time on Π equal to its execution requirement between its release time and deadline).

5 Since SI(πk, t, Ji) is potentially discontinuous at an infinite number of points, ∫_{t1}^{t2} SI(πk, t, Ji) dt denotes a Lebesgue integral (Kolmogorov and Fomin, 1970) and not a Riemann integral.
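The three conditions of Definition 1.6 can be checked mechanically for a finite, piecewise-constant schedule. The Python sketch below is one possible checker under stated assumptions: a schedule is a list of (processor, start, end, job) segments, jobs are given as a dictionary from job name to (Ai, Ei, Di), and segments of the same job on the same processor do not overlap one another. All names are hypothetical.

```python
def is_valid(schedule, jobs, tol=1e-9):
    """Check the three validity conditions for a segment-based schedule."""
    for name, (arrival, execution, rel_deadline) in jobs.items():
        segments = [(s, f, p) for p, s, f, j in schedule if j == name]

        # Condition 1: the job runs only inside [A_i, A_i + D_i].
        for start, end, _ in segments:
            if start < arrival - tol or end > arrival + rel_deadline + tol:
                return False

        # Condition 2: the job never runs on two processors at the same time.
        for i in range(len(segments)):
            for k in range(i + 1, len(segments)):
                s1, f1, p1 = segments[i]
                s2, f2, p2 = segments[k]
                if p1 != p2 and min(f1, f2) - max(s1, s2) > tol:
                    return False

        # Condition 3: the job receives exactly E_i units inside its window
        # (all segments lie inside the window once condition 1 holds).
        received = sum(end - start for start, end, _ in segments)
        if abs(received - execution) > tol:
            return False
    return True


jobs = {"J1": (0, 2, 3), "J2": (0, 2, 4)}
good = [("pi_1", 0, 2, "J1"), ("pi_2", 0, 2, "J2")]
parallel = [("pi_1", 0, 1, "J1"), ("pi_2", 0, 1, "J1"), ("pi_2", 1, 3, "J2")]
short = [("pi_1", 0, 1, "J1"), ("pi_2", 0, 2, "J2")]
print(is_valid(good, jobs), is_valid(parallel, jobs), is_valid(short, jobs))
# True False False
```

The second schedule fails because J1 executes concurrently with itself, and the third because J1 receives only one of its two required units.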

1.4.2 Feasibility Analysis

Recall that a recurrent task system can potentially generate infinitely many distinct

real-time instances over different executions of the system. Informally, a recurrent task

system τ is feasible on processing platform Π if and only if for every possible real-time

instance there exists a way to meet all deadlines. If there is a way for a real-time

instance I to meet all deadlines, we say that I is a feasible instance on processing

platform Π.

Definition 1.7 (Feasible Instance) A real-time instance I is feasible on platform

Π if and only if there exists SI ∈ SI,Π such that SI is valid.

We may extend this definition to recurrent task systems that allow for full migra-

tion between processors.

Definition 1.8 (Full-Migration Feasible Task System) Recurrent task system τ

in task model M is full-migration feasible on platform Π if and only if for all I ∈

I M(τ), I is a feasible instance on Π.


For restricted-migration systems (where a job can only execute upon a single

processor), we must restrict the definition of feasibility, slightly.

Definition 1.9 (Restricted-Migration Feasible Task System) Recurrent task

system τ in task model M is restricted-migration feasible on platform Π if and only

if for all I ∈ I M(τ), there exists a partition of I into m sets, I(π1), I(π2), . . . , I(πm),

such that for all πk ∈ Π, I(πk) is feasible on πk.

Finally, for partitioned systems comprised of recurrent tasks, we have the follow-

ing.

Definition 1.10 (Partition Feasible Task System) Recurrent task system τ in

task model M is partition feasible on platform Π if and only if there exists a partition

of τ into m sets, τ(π1), τ(π2), . . . , τ(πm), such that for all πk ∈ Π, τ(πk) is feasible

on πk.

A feasibility test labels a task system τ as either “feasible” or “infeasible” on platform Π (infeasible is the negation of feasible; i.e., there exists a real-time instance

I generated by τ such that there does not exist a schedule for I that is valid for

platform Π). An exact feasibility analysis test will classify a task system as “feasible”

if and only if τ is feasible on Π. A sufficient feasibility test may incorrectly classify a

feasible task system as “infeasible”, but has the property that if τ is classified by the

test as “feasible”, then τ is in fact feasible on Π.

In a sense, a feasibility test checks whether there exists some algorithm (either online or hypothetical clairvoyant⁶) that will meet all the deadlines of the task system on the processing platform. We will see in Chapter 2 that for completely-specified instances and LL task systems, there are simple, exact feasibility tests for multiprocessor systems. However, for sporadic and more general partially-specified task systems, exact feasibility tests are currently unknown for full- and restricted-migration systems; developing “good” sufficient feasibility tests for these general task systems is the focus of Chapter 4.

6 A clairvoyant scheduling algorithm has knowledge of future job arrivals and therefore does not need to satisfy the restriction of online scheduling algorithms (Definition 1.3).

1.4.3 Schedulability Analysis

For a given scheduling algorithm, task system, and processing platform, schedulability

analysis determines whether the given algorithm will always meet all deadlines for

the task system upon the processing platform.

Definition 1.11 (A-Schedulable) Recurrent task system τ in task model M is A-schedulable on platform Π if and only if for all I ∈ I^M(τ), S^A_I is a valid schedule on Π.

Scheduling algorithm A is an optimal scheduling algorithm if, for all τ and Π, τ

being feasible on Π implies that τ is also A-schedulable on platform Π. We will see in

Chapter 2 that optimal scheduling algorithms exist for full-migration and partitioned

scheduling of LL task systems. However, we will show in Chapter 5 that optimal

scheduling algorithms cannot exist for sporadic or more general task systems.

Similar to feasibility tests, a schedulability test for algorithm A labels a task

system τ as either “A-schedulable” or “not A-schedulable” on platform Π (a task

system τ not A-schedulable implies the existence of a real-time instance generated by

τ that will miss a deadline when using algorithm A). An exact schedulability analysis

test will classify a task system as “A-schedulable” if and only if τ is A-schedulable

on Π. A sufficient schedulability test may incorrectly classify an A-schedulable task

system as “not A-schedulable”, but has the property that if τ is classified by the test

as “A-schedulable”, then τ is in fact A-schedulable on Π. Chapters 6 and 7 develop

“good” schedulability tests for full-migration and partitioned scheduling algorithms

for sporadic and more general task systems.


1.4.4 Evaluating the Effectiveness of a Verification Technique:

Resource Augmentation Analysis

In the preceding two subsections (Sections 1.4.2 and 1.4.3), we alluded to “good” ver-

ification techniques. For uniprocessor systems, there are both exact feasibility and

schedulability tests for all the recurrent task models discussed in Section 1.1 (e.g.,

see (Baruah, 2003) for discussion of feasibility and schedulability tests for general

task systems). Therefore, for uniprocessor systems, it is very clear what constitutes a

“good” verification technique; furthermore, there are known optimal scheduling algo-

rithms (such as edf (Liu and Layland, 1973)). Chapter 5 proves that optimal algo-

rithms for multiprocessor scheduling sporadic and more general task systems cannot

exist; Chapter 2 shows that many traditional tests for feasibility and schedulability

are not exact. Therefore, it is not entirely clear what constitutes a good verifica-

tion technique for the multiprocessor scheduling of sporadic and more general task

systems. This subsection discusses a possible approach for evaluating real-time verification techniques, and introduces resource augmentation analysis, a technique for quantifying “goodness” based on worst-case behavior.

One approach for estimating the effectiveness of a verification technique is via

empirical analysis. A typical approach in the real-time systems literature is to ran-

domly generate synthetic task systems (i.e., randomly generate the parameters for

tasks) and use the proposed verification technique on the generated task systems.

The ratio of randomly generated task systems that are validated by the proposed

verification technique (i.e., labeled as either “feasible” or “A-schedulable”) to the total

number of generated task systems is called the acceptance ratio. Such an approach

is used by (Park et al., 1995) to determine the effectiveness of a rm schedulability

test on LL task systems upon a uniprocessor platform. The acceptance ratio is useful

for determining the average-case performance of a scheduling algorithm or verifica-


tion technique over a random distribution of task systems. However, the empirical

approach may be susceptible to unintended bias (Bini and Buttazzo, 2005). Furthermore, empirical analysis does not give any insight into the worst-case behavior of the

verification technique or scheduling algorithm.
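The acceptance-ratio experiment described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the sufficient test used here is the classical Liu and Layland utilization bound for rm, chosen as a familiar stand-in rather than the test studied by Park et al., and the task generator `random_ll_system` with its parameter ranges is hypothetical. A naive generator of this kind is precisely where unintended bias can enter.

```python
import random


def ll_rm_bound_test(task_system):
    """Sufficient rm test for LL tasks: accept if U <= n(2^(1/n) - 1)."""
    n = len(task_system)
    utilization = sum(e / p for e, p in task_system)
    return utilization <= n * (2 ** (1.0 / n) - 1)


def acceptance_ratio(test, task_systems):
    """Fraction of the given task systems that the test validates."""
    systems = list(task_systems)
    if not systems:
        raise ValueError("need at least one task system")
    return sum(1 for ts in systems if test(ts)) / len(systems)


def random_ll_system(rng, n_tasks):
    """Hypothetical generator: each task is (execution, period)."""
    system = []
    for _ in range(n_tasks):
        period = rng.uniform(10.0, 100.0)
        system.append((rng.uniform(0.05, 0.5) * period, period))
    return system


rng = random.Random(0)  # fixed seed so the experiment is repeatable
sample = [random_ll_system(rng, 3) for _ in range(1000)]
ratio = acceptance_ratio(ll_rm_bound_test, sample)
print(f"acceptance ratio: {ratio:.3f}")
```

The reported number depends entirely on the distribution from which the task systems are drawn, which is why it characterizes average-case rather than worst-case behavior.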

A standard way of theoretically evaluating many online algorithms is via a technique known as competitive analysis. Let c(A, I) be the “cost” of the solution produced by algorithm A over the input I. Let cOPT(I) be the cost of the solution produced by an optimal algorithm. An algorithm A has a competitive ratio of at most α ≥ 1 if:

$$\max_{I} \frac{c(A, I)}{c_{\mathrm{OPT}}(I)} \le \alpha. \tag{1.16}$$

We say that algorithm A is α-competitive. If the goal is to minimize the worst-case

cost of an algorithm, then a “good” algorithm will have a small competitive ratio

α, while “poor” algorithms will have a large (or unbounded) ratio. The competitive analysis approach is very effective at identifying good online algorithms for many optimization problems.

Unfortunately, the standard competitive analysis approach has serious drawbacks

for real-time systems. For a hard real-time system even a single deadline miss may be

unacceptable. Therefore, it is difficult to interpret for hard real-time algorithms what

the cost of a scheduling algorithm is and what the competitive ratio implies about the

algorithm. Even in systems that can tolerate some deadline misses and have defined

the “cost” of a deadline miss, (Kalyanasundaram and Pruhs, 2000) show that many

algorithms that perform well in practice have a large competitive ratio.

To address the shortcomings of standard competitive analysis in real-time schedul-

ing, resource augmentation (Phillips et al., 2002) was proposed as a measure of the

relative effectiveness and tightness of a given feasibility or schedulability analysis

technique. In addition, resource augmentation may be used to indirectly gauge the


effectiveness of a non-optimal scheduling algorithm. Resource augmentation works by

comparing a given verification technique against the performance of a hypothetically

optimal analysis technique. The resource-augmentation metric of effectiveness used

in this dissertation for a given analysis technique is as follows: given any real-time

instance that is formally verifiable according to the hypothetical optimal algorithm

on the original platform (i.e., feasible on the original platform), our goal is to ob-

tain a constant multiplicative factor by which we must increase the speed of each

processor in order for our given analysis test to label the same instance as “feasi-

ble” (or “A-schedulable”, if using schedulability analysis). That is, we are interested

in the minimum speed-up factor necessary to guarantee that our analysis technique

verifies that a real-time instance that is optimally schedulable on the original plat-

form is also formally verifiable (according to our sufficient analysis test) on the more

powerful, modified platform. We believe that resource-augmentation analysis is par-

ticularly useful, as it quantifies the minimum amount of resources that would have

to be added to the original platform for a given verification test to validate the same

task systems that can be validated by an optimal algorithm; in other words, resource-

augmentation quantifies the processing capacity that might be “wasted” in using a

non-optimal scheduling algorithm.

Let V be a verification test (either feasibility or schedulability test). The func-

tion V maps real-time instances and processing platforms to labels (i.e., “feasible” or

“infeasible”, or “A-schedulable” or “not A-schedulable”). We will now more formally

define resource augmentation for V . We will abuse notation slightly and let s · Π

indicate a platform with m processors where each processor is s times faster (s ≥ 1)

than the processors of Π.

Definition 1.12 (Resource-Augmentation Approximation Ratio for V) The resource-augmentation approximation ratio for verification test V is the minimum s satisfying

$$s \ge \max_{I \in \mathcal{I}:\ I \text{ is feasible on } \Pi} \left\{ \min_{s' \ge 1} \left\{ s' \mid V(I, s' \cdot \Pi) \text{ labels } I \text{ ``feasible'' (``A-schedulable'')} \right\} \right\}. \tag{1.17}$$

Note that if V is a schedulability test for A, and if V has a resource-augmentation

approximation ratio of s, we may indirectly say that scheduling algorithm A has a

resource-augmentation approximation ratio of at most s.
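The inner minimization of Equation 1.17 can be approximated numerically when the test is monotone in the processor speed, i.e., when acceptance at speed s′ implies acceptance at every faster speed. The Python sketch below finds that minimum by bisection. Both the monotonicity assumption and the stand-in test are assumptions of this sketch: `density_test` is the well-known sufficient uniprocessor density condition (the sum of ei / min(di, pi) is at most the processor speed) and is not one of the tests developed in this dissertation.

```python
def density_test(tasks, speed=1.0):
    """Stand-in sufficient test: total density must not exceed the speed."""
    return sum(e / min(d, p) for e, d, p in tasks) <= speed


def min_speedup(test, tasks, hi=64.0, tol=1e-6):
    """Smallest s' >= 1 (within tol) at which a monotone test accepts `tasks`."""
    if test(tasks, 1.0):
        return 1.0
    if not test(tasks, hi):
        raise ValueError("test rejects the task system even at speed hi")
    lo = 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if test(tasks, mid):
            hi = mid  # accepted: the minimum is at or below mid
        else:
            lo = mid  # rejected: the minimum is above mid
    return hi


# Tasks tau_1 and tau_2 of Example 1.7, read as (e, d, p).
tasks = [(2, 3, 5), (4, 8, 8)]
print(round(min_speedup(density_test, tasks), 4))  # 1.1667
```

Taking the maximum of `min_speedup` over a sample of task systems known to be feasible on Π gives only a lower bound on the ratio of Definition 1.12, since the definition ranges over all feasible instances.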

1.5 Contributions

The thesis of this dissertation is:

Optimal online multiprocessor real-time scheduling algorithms for sporadic

and more general task systems are impossible; however, efficient, online

scheduling algorithms and associated feasibility and schedulability tests,

with provably bounded deviation from any optimal test, exist.

The above thesis is supported by the following contributions made in this disser-

tation:

• We describe the relationship between general task models and the well-known

workload metrics of demand-based load and maximum job density. We show

that tight upper bounds on demand-based load and maximum job density may

be obtained for task systems in each of the models discussed in Section 1.1.

• The best known algorithms for computing demand-based load for sporadic and

more general task systems require exponential time in the worst case. We show

that for sporadic task systems the demand-based load can be approximated

efficiently via a polynomial-time approximation scheme (PTAS).


• We demonstrate the difficulty of the online scheduling of sporadic and more

general task systems by proving that optimal online scheduling of these task

systems upon multiprocessor platforms is impossible. This result implies that
non-optimal online scheduling algorithms for multiprocessor systems are required.

• We derive a restricted-migration feasibility test using demand-based load and

maximum job density for real-time instances that has a resource-augmentation

approximation ratio of at most 4 − 1/m, where m is the number of processors in

platform Π. This feasibility test may be applied to all the partially-specified

recurrent task models discussed in Section 1.1.

• We derive a full-migration feasibility test using demand-based load and maximum
job density for real-time instances that has a resource-augmentation
approximation ratio of at most √2 + 1. Again, this feasibility test may be applied

to all the partially-specified recurrent task models discussed in Section 1.1.

• We derive edf and dm (full-migration) schedulability tests using demand-based

load and maximum job density for partially-specified recurrent task systems.

The application of these tests to dm-scheduled sporadic task systems is discussed,
and it is shown that the dm-schedulability test has a resource-augmentation
approximation ratio of at most 4 − 1/m; the dm resource-augmentation result is
compared with previously known dm-scheduling tests.

• We develop a polynomial-time partitioning algorithm for sporadic task systems
when either edf or dm is used to schedule each individual processor. We derive
schedulability tests for this algorithm using demand-based load and maximum
job density, and we show that these tests have a resource-augmentation
approximation ratio of at most 4 − 2/m. For a special subset of sporadic task
systems, known as constrained-deadline sporadic task systems, we show that the
resource-augmentation approximation ratio is tighter, with a ratio of 3 − 1/m.

| Scheduling Paradigm | Liu & Layland | Sporadic | GMF/Recurring/etc. |
|---|---|---|---|
| Uniprocessor Scheduling | Previous Research (Well Understood) | Previous Research (Well Understood) | Previous Research (Well Understood) |
| Full-Migration Scheduling | Previous Research (Well Understood) | This Research | This Research |
| Restricted-Migration Scheduling | Previous Research (Well Understood) | This Research | This Research |
| Partitioned Scheduling | Previous Research (Well Understood) | This Research | Future Work |

Table 1.2: The above table shows the contribution of this dissertation where the
research space has been categorized by scheduling paradigm versus partially-specified
recurrent task model. Observe that most prior work has assumed either uniprocessor
scheduling, or multiprocessor scheduling of LL task systems. This dissertation
describes contributions to multiprocessor scheduling of sporadic and more general
task systems. Further research is required for the partitioned scheduling of task
systems in models more general than the sporadic task model.

Table 1.2 places the contributions of this dissertation in the context of previous

work. As mentioned in the beginning of the introduction, most prior work on real-

time scheduling of recurrent tasks has focused on either uniprocessor scheduling or

multiprocessor scheduling of LL task systems. The work contained in this disserta-

tion addresses the multiprocessor feasibility and schedulability of sporadic and more

general task systems.


1.6 Organization

This dissertation is organized as follows. In Chapter 2, we survey previous results

in multiprocessor real-time scheduling, including the scheduling of arbitrary real-

time instances, LL task systems, sporadic task systems, and more general systems.

For each of these different models, we present prior results concerning feasibility,

schedulability, and optimality. In Chapter 3, we formally introduce the demand-based

load and maximum job density characterization of real-time work used throughout

this dissertation. We show that both demand-based load and maximum job density

metrics may be efficiently computed for any task model discussed in this dissertation.

Furthermore, we show that demand-based load may be approximated in polynomial

time for sporadic task systems. In Chapter 4, we present our feasibility tests for the

full- and restricted-migration scheduling paradigms. In Chapter 5 and Appendix A,

we prove that optimal online scheduling of sporadic and more general task systems is

impossible. In Chapter 6, we derive schedulability tests for full-migration scheduling

of general task systems. In Chapter 7, we develop polynomial-time algorithms for

partitioning sporadic task systems and derive schedulability tests for these algorithms.

We finally summarize and conclude this dissertation in Chapter 8.


Chapter 2

Related Work: Multiprocessor

Real-time Scheduling

As mentioned in the introduction, prior research in real-time scheduling theory
has predominantly focused on uniprocessor systems or the multiprocessor scheduling

of the simpler recurrent task systems (namely periodic and LL tasks). In this chapter,

we will review some of the prior fundamental results in multiprocessor scheduling of

real-time systems. We will begin with the most general characterization of real-time

workloads, arbitrary real-time instances. For arbitrary real-time instances (sometimes

referred to as independent jobs in the real-time literature) there have been some fun-

damental results concerning multiprocessor scheduling. In Section 2.1, we will briefly

review these results and discuss how they pertain to the results of this dissertation.

We will also identify some shortcomings of these multiprocessor results for arbitrary

real-time instances that this dissertation has attempted to address.

After discussing the most general characterization of real-time work, we focus on

the strictest partially-specified real-time task model discussed in the introduction:

the LL task model. In Section 2.2, we briefly summarize the results that have been

obtained for the feasibility and schedulability of task systems in the LL model under

the three paradigms of multiprocessor scheduling (partitioned, full-migration, and

restricted-migration) and three paradigms of priority-driven scheduling (fixed-task-

priority, fixed-job-priority, and dynamic-priority). We also discuss some fundamental

challenges that are present in the multiprocessor scheduling of LL task systems that

cause scheduling algorithms that are optimal for uniprocessor scheduling (e.g., edf)

to perform arbitrarily poorly in the multiprocessor setting.

In Section 2.3, we review known results concerning multiprocessor real-time schedul-

ing of sporadic task systems. We first discuss the ineffectiveness of the metrics used

in schedulability and feasibility tests for LL task systems on multiprocessors when

applied to sporadic task systems. We then give a brief overview of the current state-

of-the-art schedulability tests for edf and dm in the three multiprocessor scheduling

paradigms. In Section 2.4, we discuss the small amount of prior work concerning

real-time multiprocessor scheduling of task models that generalize the sporadic task

model.

2.1 Arbitrary Real-Time Instances

In Section 2.1.1, we begin our overview of related work with a negative result
showing that optimal online scheduling of arbitrary real-time instances is
impossible. In Section 2.1.2, we review results showing that, despite the
impossibility of optimal online scheduling, there are real-time multiprocessor
scheduling algorithms with constant-factor resource-augmentation approximation
ratios. Section 2.1.3 describes an essential property for real-time scheduling
algorithms known as predictability; we briefly discuss its importance.
Section 2.1.4 presents a proposed online metric for multiprocessor schedulability of
arbitrary real-time instances known as synthetic utilization; even though this metric
gives a sufficient schedulability test for arbitrary instances, we will show that
synthetic utilization performs arbitrarily poorly in terms of resource augmentation.

2.1.1 Impossibility of Optimal Online Scheduling of Arbi-

trary Real-Time Instances

In the context of arbitrary real-time instances, a scheduling algorithm A is optimal
if for every I ∈ I that is feasible on platform Π, the schedule $S_I^A$ produced by A on
platform Π is valid. edf is known to be an optimal uniprocessor scheduling algorithm
(Liu and Layland, 1973), even for arbitrary real-time instances. Unfortunately,
as soon as an additional processor is added to the platform, edf is no longer optimal;
in fact, for arbitrary real-time instances an even stronger negative statement is true:

Theorem 2.1 For arbitrary real-time instances, no online multiprocessor scheduling

algorithm is optimal.

Variants on this theorem were independently stated and proven by both (Hong

and Leung, 1988) and (Dertouzos and Mok, 1989). (Hong and Leung, 1988) gave the
slightly stronger result below, which immediately implies Theorem 2.1.

Theorem 2.2 (from (Hong and Leung, 1988)) No optimal online multiprocessor
scheduling algorithm exists for real-time instances with two or more distinct
deadlines (i.e., there exist J_k, J_ℓ ∈ I such that A_k ≠ A_ℓ).

(Dertouzos and Mok, 1989) show a similar theorem; in addition, their work
explored what “knowledge” an optimal online multiprocessor scheduling algorithm
requires about the real-time instance being scheduled. They proved that even partial
information about all the jobs of I (e.g., knowledge of the execution requirements
of all jobs of I but not their arrival times) is not sufficient for
the existence of optimal scheduling algorithms for arbitrary real-time instances.

§ Implications to Online Scheduling of Recurrent Task Systems. Though

the above results may appear to have negative implications on the existence of optimal

scheduling algorithms for recurrent task systems, they, in fact, do not apply. We will

see in the next section (Section 2.2) that optimal online multiprocessor scheduling

algorithms do exist for LL task systems. The impossibility results for optimal online

scheduling of arbitrary real-time instances do not imply the non-existence of optimal

online scheduling algorithms for recurrent task systems due to the difference in the

set of feasible instances that must be optimally scheduled in the different contexts.

The definition for optimal algorithm A in the context of arbitrary real-time systems
requires that every I ∈ I that is feasible on Π be correctly scheduled by A; however,
the definition of an optimal algorithm A′ for a recurrent task system in model M only
requires that if task system τ is feasible on Π, then A′ must correctly schedule every
I′ ∈ I^M(τ). We have thus restricted the number of feasible instances an optimal
algorithm must correctly schedule because I^M(τ) ⊂ I (see Figure 1.6). For LL task

systems, this restriction is sufficient to allow for optimal online scheduling algorithms.

However, we will prove in Chapter 5 and Appendix A that the restriction is not

sufficient in sporadic and more general task systems, and optimal online scheduling

for systems in these models is impossible.

2.1.2 Resource Augmentation Results for Online Scheduling

Algorithms

Despite the non-existence of optimal online multiprocessor scheduling algorithms for
arbitrary real-time instances, there do exist online scheduling algorithms with
constant-factor resource-augmentation approximation ratios (discussed in
Section 1.4.4). (Phillips et al., 2002) studied the behavior of
many online multiprocessor scheduling algorithms (both real-time and non-real-time)
when given faster processors than the optimal scheduling algorithm (not necessarily

online). In particular, they obtained resource augmentation guarantees for edf when

scheduling arbitrary real-time instances on a multiprocessor platform. The following

theorem states the guarantee obtained for edf:

Theorem 2.3 (from (Phillips et al., 2002)) If real-time instance I ∈ I is feasi-

ble on a processing platform comprised of m unit-speed processors, then edf (under

the full-migration paradigm) will always meet all deadlines when scheduling I on an

m-processor platform where each processor is of speed 2 − 1/m.

The above result provides a rather strong theoretical guarantee and justification

for considering edf for multiprocessor scheduling. The justification is even more

compelling when considering the fact that no online algorithm can have a
resource-augmentation approximation ratio better than 1 + 1/5 (also shown in
(Phillips et al., 2002)).

§ Implications to Online Scheduling of Recurrent Task Systems. The

results of Theorem 2.3 have a straightforward implication to the online scheduling of

recurrent task systems; specifically, any recurrent task system τ that is feasible on

m unit-speed processors is schedulable according to edf on m processors each of speed
2 − 1/m. Of course, since there exist optimal online multiprocessor scheduling algorithms

for LL tasks, the lower bound on the resource-augmentation approximation ratio does

not directly apply to recurrent tasks.
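As a small worked illustration of the speed required by Theorem 2.3, the sketch below tabulates 2 − 1/m for a few platform sizes and the total capacity m(2 − 1/m) = 2m − 1 of the faster platform. The function names are ours and purely illustrative; exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

def edf_speed(m):
    """Per-processor speed 2 - 1/m from Theorem 2.3."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 2 - Fraction(1, m)

def total_capacity(m):
    """Total capacity of m processors of speed 2 - 1/m, which equals 2m - 1."""
    return m * edf_speed(m)

# Lower bound 1 + 1/5 on the ratio of any online algorithm, as quoted above.
ONLINE_LOWER_BOUND = 1 + Fraction(1, 5)

for m in (1, 2, 4, 8):
    print(m, edf_speed(m), total_capacity(m))
```

The required speed starts at 1 for a uniprocessor (where edf is optimal) and approaches, but never reaches, 2 as m grows.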

While Theorem 2.3 is a powerful guarantee on the effectiveness of edf for schedul-

ing real-time jobs, the theorem (prior to this dissertation) had limited practical appli-

cations for the schedulability of sporadic or more general task models. The reason is

that prior to the work contained in this dissertation, there did not exist effective non-

trivial feasibility tests for sporadic and more general task systems upon multiprocessor

platforms; that is, there was no effective method to test the antecedent of Theorem 2.3


for sporadic or more general task systems. The feasibility results contained in this

dissertation allow for the real-time system designer to test the antecedent of
Theorem 2.3 and thereby directly use Theorem 2.3 to determine the edf-schedulability of

a recurrent task system (Corollary 6.4 in Chapter 6 makes use of this approach).

2.1.3 The Predictability of Multiprocessor Scheduling Algo-

rithms

From a real-time system designer’s perspective, the predictability of the system is

immensely important. That is, the system should be tolerant of variations in execution

provided these variations are within the system’s specified parameters. For example,

the designer might specify the worst-case execution requirement of each job of a real-

time instance I. If the designer verifies that the system is temporally correct when

executing instance I under the worst-case executions, a predictable system should

continue to be temporally correct under variations in execution where a job executes

for less than its worst-case requirement. If the system became unschedulable due to

some jobs executing for less than their worst-case execution time, this would represent

an anomaly in the scheduling algorithm used; for multiprocessor systems, it is not

immediately evident that scheduling algorithms are anomaly-free.

(Ha and Liu, 1994) addressed the multiprocessor scheduling of real-time jobs under

such variations in execution. In the context and terminology of this dissertation, a

system (using scheduling algorithm A) is predictable if and only if every I ∈ I that is

A-schedulable on Π implies that every I ′ ∈ F(I) is also A-schedulable on Π. (Recall

that F(I) is the set of all real-time instances that have the same jobs as I, but smaller

or equal execution requirements). (Ha and Liu, 1994) examined the predictability of

systems scheduled by work-conserving FJP scheduling algorithms. A work-conserving

scheduling algorithm always executes a job if it is active and a processor is available


(i.e., processors do not idle while jobs are awaiting execution). Algorithms such as

(full-migration) edf and dm are known to be work-conserving.

Theorem 2.4 (from (Ha and Liu, 1994)) A multiprocessor system scheduled by

a work-conserving FJP scheduling algorithm is predictable under preemptions, full-

migration, and fixed-job-priority.

The above result implies that a preemptable, full-migration system using a
work-conserving FJP algorithm can be validated under the worst-case execution
requirements and is immediately guaranteed to be correct under variations in
execution that require less than a

worst-case amount of time; this observation removes the burden on the designer of

validating the system under a potentially large (or infinite) number of executions that

may not require the worst-case execution time. Unfortunately, Ha and Liu observed

that not all restricted-migration systems are predictable; therefore, this property must

be verified for each individual restricted-migration scheduling algorithm.

§ Implications to Online Scheduling of Recurrent Task Systems. Theo-

rem 2.4 immediately implies the predictability of preemptable, full-migration systems

using work-conserving FJP scheduling algorithms that schedule recurrent task sys-

tems. Note that Theorem 2.4 does not represent a necessary condition for a system

being predictable. For example, Pfair-based scheduling algorithms (Baruah et al.,

1996) are not FJP algorithms or necessarily work-conserving; however, Pfair-based

scheduling algorithms are predictable scheduling algorithms for LL task systems.

A concept that is closely related to predictability is sustainability (Baruah and

Burns, 2006). Informally, a verification test for scheduling algorithm A (denoted
V_A) is sustainable if, whenever V_A determines that recurrent task system τ is
A-schedulable on platform Π, a task system τ′ with “reduced” temporal constraints
(e.g., each task of τ′ is the same except that it has a larger period; see (Baruah
and Burns, 2006) for a more detailed discussion) will also be determined to be
A-schedulable by V_A. The advantage of a sustainable verification test is that the
designer may validate the system under the “worst-case” temporal constraints and
also be assured that a system that has reduced temporal constraints will remain
verifiable by the same validation technique. It turns out that the schedulability
tests presented in Chapters 6 and 7 are sustainable. (We refer the interested
reader to (Baruah and Burns, 2006).)

2.1.4 Synthetic Utilization

Thus far, we have only discussed qualitative properties (optimality, predictability,

and effectiveness) of multiprocessor scheduling of arbitrary real-time instances. We

have not addressed online or a priori verification techniques for such instances. An

online schedulability test for arbitrary real-time instances has been proposed based on

a metric of real-time workload called synthetic utilization (Abdelzaher et al., 2004),

defined below.

Definition 2.1 (Synthetic Utilization) The synthetic utilization, synth-util(I, t),
of real-time instance I at time t is:

$$
\text{synth-util}(I, t) \;\stackrel{\text{def}}{=}\; \sum_{J_i \in I \,:\, A_i \le t < A_i + D_i} \frac{E_i}{D_i}.
\tag{2.1}
$$
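Definition 2.1 translates directly into a few lines of code. The following is a minimal sketch in which a job is represented by a hypothetical `Job` tuple holding its arrival time A_i, execution requirement E_i, and relative deadline D_i; the representation is our assumption, not notation from the dissertation.

```python
from fractions import Fraction
from typing import NamedTuple

class Job(NamedTuple):
    arrival: int    # A_i
    execution: int  # E_i
    deadline: int   # D_i (relative to the arrival time)

def synth_util(instance, t):
    """Sum of E_i / D_i over jobs whose window [A_i, A_i + D_i) contains t."""
    return sum(
        (Fraction(job.execution, job.deadline)
         for job in instance
         if job.arrival <= t < job.arrival + job.deadline),
        Fraction(0),
    )

instance = [Job(arrival=0, execution=1, deadline=2),
            Job(arrival=1, execution=1, deadline=4)]
for t in (0, 1, 2, 5):
    print(t, synth_util(instance, t))
```

Only jobs that have arrived and whose deadlines have not yet passed contribute at time t, so the value can rise and fall as jobs enter and leave their scheduling windows.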

Table 2.1 presents a brief overview of schedulability tests obtained using synthetic

utilization.

| Scheduling Paradigm | edf-based | dm-based |
|---|---|---|
| Full-Migration | synth-util(I, t) ≤ m²/(2m − 1) (Andersson et al., 2003a) | synth-util(I, t) ≤ (3 − √7)m (Lundberg and Lennerstad, 2003) |
| Restricted-Migration | synth-util(I, t) ≤ 0.31m (Andersson et al., 2003b) | ? |

Table 2.1: A brief overview of the multiprocessor schedulability tests based on
synthetic utilization. If a real-time instance I satisfies the test in the above
table for all t ≥ 0, then I is schedulable on m processors according to the algorithm
in the header. The reader is referred to the individual papers for greater details on
the algorithms considered (essentially edf or dm with minor modifications).

While synthetic utilization provides a simple test for the schedulability of a
real-time instance, it suffers from a serious drawback: any verification test using
synthetic utilization has an unbounded resource-augmentation approximation ratio.
To see this, consider an instance I comprised of n jobs, all arriving at time-instant 0
and having an execution requirement of 1, with the i’th job’s deadline at
time-instant i. This instance is feasible upon a single unit-capacity processor, yet
at time 0 it has synthetic utilization equal to $\sum_{i=1}^{n} (1/i)$, which grows
without bound as n increases. Thus, there exist real-time instances that are feasible
on a multiprocessor platform Π (the above instance is feasible even on a
uniprocessor), but that would require Π to be sped up by an arbitrarily large factor
to verify the schedulability of I. Since the goal of this dissertation is to derive
verification tests with constant-factor approximation ratios, we will not consider
synthetic utilization further due to its unbounded approximation ratio in the worst
case.
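The counterexample can be checked numerically. The sketch below is illustrative only: the helper names are ours, and the bound m²/(2m − 1) is taken from the full-migration, edf-based entry of Table 2.1. It computes the synthetic utilization of the n-job instance at time 0, confirms that running the jobs in deadline order on one unit-speed processor meets every deadline, and compares the utilization with the bound for m = 1.

```python
from fractions import Fraction

def harmonic_synth_util(n):
    """Synthetic utilization at t = 0 of n jobs with A_i = 0, E_i = 1, D_i = i."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

def edf_uniprocessor_meets_deadlines(n):
    """Run the n jobs in deadline order on a single unit-speed processor."""
    finish = 0
    for deadline in range(1, n + 1):
        finish += 1  # each job needs exactly one unit of execution
        if finish > deadline:
            return False
    return True

def edf_synth_util_bound(m):
    """Bound m^2 / (2m - 1) from the edf-based full-migration entry of Table 2.1."""
    return Fraction(m * m, 2 * m - 1)

for n in (1, 4, 16, 64):
    print(n, float(harmonic_synth_util(n)), edf_uniprocessor_meets_deadlines(n))
```

For m = 1 the bound equals 1, which the instance already exceeds when n = 2, although every such instance is schedulable on a single processor.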

2.2 LL Tasks

While the previous section shows that there are scheduling algorithms for arbitrary

real-time instances with constant-factor approximation ratios, previous work has not

developed a priori verification tests with such approximation ratios. A fundamental
reason for the lack of good verification tests is that the set of different real-time
instances that a verification test must implicitly consider consists of the entire
space of real-time instances I. In simple terms, the verification technique must
handle an enormous set of possibilities with very little information on what
real-time instance will actually be generated at run-time. On the other hand, a
partially-specified real-time task system τ

(such as a task system in the LL model) generates a significantly smaller set of possible

real-time instances; furthermore, the types of jobs that a task system generates are

finite. For a partially-specified task system, there is also partial knowledge about the

characteristic of the jobs generated. Therefore, it is not surprising that there exist

exact feasibility tests, and either near-optimal or optimal schedulability tests for LL

task systems.

In this section, we describe prior work on the multiprocessor scheduling of LL task

systems. Section 2.2.1 introduces and discusses system utilization, a metric used in
feasibility and schedulability tests. Section 2.2.2 discusses a challenge in the online
scheduling of LL task systems. Section 2.2.3 gives an overview of the schedulability
tests obtained for various scheduling algorithms for LL task systems.

2.2.1 Task Utilization

An effective characterization of the real-time workload produced by an LL task
system τ is called system utilization:

$$
\text{system-util}(\tau) \;\stackrel{\text{def}}{=}\; \sum_{\tau_i \in \tau} \frac{e_i}{p_i}.
\tag{2.2}
$$

The task utilization of task τ_i is denoted by $u_i \stackrel{\text{def}}{=} e_i / p_i$.
Informally, task utilization represents the fraction of computational capacity that a
task requires on a single processor. The amount of execution over any interval of
length t on processing platform Π that a task τ_i requires is upper bounded by
u_i × t. (Horn, 1974) pointed out that system utilization may be used as an exact
test for the feasibility of an LL task system.

Theorem 2.5 An LL task system τ is feasible on an m-processor platform Π if and
only if

$$
\text{system-util}(\tau) \le m.
\tag{2.3}
$$

Throughout this dissertation, the following task parameter representing the maximum
utilization of any task in τ will also be useful.

$$
\text{max-util}(\tau) \;\stackrel{\text{def}}{=}\; \max_{\tau_i \in \tau} \{u_i\}.
\tag{2.4}
$$
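Equations 2.2 through 2.4 can be sketched as follows, with an LL task written as a pair (e_i, p_i). The function names are ours. The check that no single task has utilization above one is an added assumption (a job cannot execute on more than one processor at a time); Theorem 2.5 itself is stated only in terms of system utilization.

```python
from fractions import Fraction

def task_util(task):
    """u_i = e_i / p_i for an LL task given as the pair (e_i, p_i)."""
    e, p = task
    return Fraction(e) / Fraction(p)

def system_util(tasks):
    """Equation 2.2: the sum of all task utilizations."""
    return sum((task_util(t) for t in tasks), Fraction(0))

def max_util(tasks):
    """Equation 2.4: the largest task utilization (tasks must be non-empty)."""
    return max(task_util(t) for t in tasks)

def ll_feasible(tasks, m):
    """Theorem 2.5, plus the added assumption that every u_i is at most 1."""
    return max_util(tasks) <= 1 and system_util(tasks) <= m

tasks = [(1, 2), (1, 3), (1, 6)]
print(system_util(tasks), max_util(tasks), ll_feasible(tasks, 1))
```

Decimal parameters such as 1.1 are best passed as `Fraction(11, 10)` or `Fraction("1.1")`, since constructing a `Fraction` from a binary float would not give exactly 11/10.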

2.2.2 Dhall’s Effect

The test of Theorem 2.5 is a simple, exact feasibility test for LL task systems on

multiprocessor platforms. For LL task systems scheduled on a uniprocessor platform,

the test system-util(τ) ≤ 1 is an exact schedulability test for edf. The next example

shows that, unfortunately, a simple, exact utilization-based test is not possible for

edf-schedulability on multiprocessor platforms.

Example 2.1 Consider platform Π with m processors, and LL task system τ comprised
of m + 1 tasks. Let $\tau_1 \stackrel{\text{def}}{=} (\epsilon, 1)$,
$\tau_2 \stackrel{\text{def}}{=} (\epsilon, 1)$, . . . ,
$\tau_m \stackrel{\text{def}}{=} (\epsilon, 1)$ and
$\tau_{m+1} \stackrel{\text{def}}{=} (1.1, 1.1)$.
(Recall that an LL task τ_i is specified by the pair (e_i, p_i).) By Theorem 2.5, τ is
feasible on Π (for m ≥ 2 and sufficiently small ε). It is easy to see that if the
system is scheduled by edf on Π and all tasks generate a job simultaneously at time
t = 0, a job of task τ_{m+1} will miss its deadline at time t = 1.1 for all ε > 0 (see
Figure 2.1 for a visual depiction). Notice that system-util(τ) = 1 + mε, which implies
$\lim_{\epsilon \to 0} \text{system-util}(\tau) = 1$.

Therefore, there exist LL task systems with utilization approaching one that are

not edf-schedulable on a multiprocessor platform comprised of m processors. This

effect of task systems having low utilization but being unschedulable on a multiproces-

sor system is known as Dhall’s effect (Dhall and Liu, 1978) ((Andersson and Jonsson,


2000) first coined the term). The effect may be observed for other online algorithms

such as rm.
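Dhall's effect in Example 2.1 can be reproduced with a small event-driven simulation of full-migration edf. The sketch below is ours and makes simplifying assumptions: it handles a finite list of jobs given as (arrival, execution requirement, absolute deadline), uses unit-speed processors, and breaks deadline ties by list order. Only the first job of each task is simulated, which is enough to exhibit the deadline miss.

```python
from fractions import Fraction

def global_edf_finish_times(jobs, m):
    """Finish time of each job under preemptive global edf on m unit-speed processors."""
    jobs = [(Fraction(a), Fraction(e), Fraction(d)) for a, e, d in jobs]
    remaining = [e for _, e, _ in jobs]
    finish = [None] * len(jobs)
    now = Fraction(0)
    while any(f is None for f in finish):
        pending = [i for i in range(len(jobs)) if finish[i] is None]
        active = [i for i in pending if jobs[i][0] <= now]
        future = [jobs[i][0] for i in pending if jobs[i][0] > now]
        if not active:
            now = min(future)  # idle until the next arrival
            continue
        # The m active jobs with the earliest absolute deadlines execute.
        running = sorted(active, key=lambda i: jobs[i][2])[:m]
        # Advance to the next completion or arrival, whichever comes first.
        step = min(remaining[i] for i in running)
        if future:
            step = min(step, min(future) - now)
        now += step
        for i in running:
            remaining[i] -= step
            if remaining[i] == 0:
                finish[i] = now
    return finish

def dhall_jobs(m, eps):
    """First jobs of the m + 1 tasks of Example 2.1, all released at time 0."""
    light = [(0, eps, 1)] * m
    heavy = [(0, Fraction(11, 10), Fraction(11, 10))]
    return light + heavy

m, eps = 2, Fraction(1, 10)
finish = global_edf_finish_times(dhall_jobs(m, eps), m)
print(finish[-1], finish[-1] > Fraction(11, 10))
```

With m = 2 and ε = 1/10, the job of τ_{m+1} waits until time ε for a processor and completes at 1.1 + ε, after its deadline at 1.1, even though the system utilization is only 1 + mε.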

2.2.3 Overview of Schedulability Tests

Over the past twenty-five years, researchers have made significant progress in address-

ing the challenges of multiprocessor scheduling of LL task systems and have developed

techniques to overcome Dhall’s effect. In Table 2.2, we list the schedulability tests

for each multiprocessor scheduling paradigm and priority-driven algorithm class. We

refer the reader to the references listed for each result for further details. As evident

[Figure 2.1: schedule on processors π_1, π_2, . . . , π_m over a time axis marked 0, ε, 1, and 1.1, showing the deadline for jobs of tasks τ_1, τ_2, . . . , τ_m and the deadline miss for the job of task τ_{m+1}.]

Figure 2.1: Illustration of Dhall’s effect for the task system in Example 2.1. The
lightly shaded jobs correspond to jobs of τ_1, . . . , τ_m arriving at time zero with
(e_i, p_i) = (ε, 1). The black job corresponds to the job of τ_{m+1} = (1.1, 1.1) also
arriving at time zero. Since the jobs of τ_1, . . . , τ_m have deadline at time t = 1,
they will execute (according to edf) on all m processors in the interval [0, ε). Thus,
the job of τ_{m+1} cannot execute until t = ε and completes at t = 1.1 + ε, missing
its deadline at time t = 1.1. However, (Devi, 2006) shows that the maximum amount of
time by which any such job misses its deadline (called the tardiness of the job) is
bounded.

| Scheduling Paradigm | FTP | FJP | DP |
|---|---|---|---|
| Full-Migration | U ≤ (m/2)(1 − α) + α, if α < 1/3; U ≤ (m + 1)/3, otherwise (Bertogna et al., 2005b) | U ≤ m(1 − α) + α, if α < 1/2; U ≤ (m + 1)/2, otherwise (Srinivasan and Baruah, 2002) | U ≤ m (Horn, 1974) |
| Restricted-Migration | U ≤ (m + 1)/2 | U ≤ m(1 − α) + α, if α < 1/2; U ≤ (m + 1)/2, otherwise (Baruah and Carpenter, 2003) | |
| Partitioned | (Andersson and Jonsson, 2003) | U ≤ (βm + 1)/(β + 1) (Lopez et al., 2004) | |

Table 2.2: The known multiprocessor schedulability tests for LL task systems.
Columns represent the priority-driven class of the scheduling algorithm; the rows
represent the multiprocessor scheduling paradigm. To save space in the table, we
let U denote system-util(τ), α denote max-util(τ), and β denote ⌊1/max-util(τ)⌋. If a
task system τ satisfies the test in an entry, then τ is schedulable on m processors
according to some algorithm in the entry’s associated priority class and
multiprocessor paradigm. This table is similar to the ones presented in (Carpenter
et al., 2003; Devi, 2006); however, the entries have been updated to reflect newer
results.

by the existence of optimal scheduling algorithms, exact feasibility tests, and effec-

tive schedulability tests for each different class of algorithm, we may observe that the

multiprocessor scheduling of LL task systems is fairly well understood.
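The full-migration tests of Table 2.2 are simple enough to evaluate mechanically. The sketch below assumes the Full-Migration row reads as follows: U ≤ (m/2)(1 − α) + α if α < 1/3 and U ≤ (m + 1)/3 otherwise for FTP; U ≤ m(1 − α) + α if α < 1/2 and U ≤ (m + 1)/2 otherwise for FJP; and U ≤ m for DP. The function names are hypothetical.

```python
from fractions import Fraction

def full_migration_ftp(U, alpha, m):
    """FTP entry of the Full-Migration row, as read above."""
    if alpha < Fraction(1, 3):
        return U <= Fraction(m, 2) * (1 - alpha) + alpha
    return U <= Fraction(m + 1, 3)

def full_migration_fjp(U, alpha, m):
    """FJP entry of the Full-Migration row, as read above."""
    if alpha < Fraction(1, 2):
        return U <= m * (1 - alpha) + alpha
    return U <= Fraction(m + 1, 2)

def full_migration_dp(U, m):
    """DP entry of the Full-Migration row: the exact test of Theorem 2.5."""
    return U <= m

# A system with U = 2 and max-util 1/4 on m = 4 processors.
U, alpha, m = Fraction(2), Fraction(1, 4), 4
print(full_migration_ftp(U, alpha, m),
      full_migration_fjp(U, alpha, m),
      full_migration_dp(U, m))
```

Applied to the system of Example 2.1 with m = 2 and ε = 1/10 (U = 6/5, α = 1), the FJP test is satisfied. This does not contradict Dhall's effect: an entry guarantees schedulability by some algorithm in the class, not by plain edf.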

2.3 Sporadic Tasks

Similar to LL task systems, there are known exact uniprocessor feasibility tests for

sporadic task systems (Baruah et al., 1990a). Unfortunately, for the multiprocessor

scheduling of sporadic task systems there are currently no known exact feasibility

tests. In fact, we will see that prior to the work in this dissertation, the only non-

trivial work in the multiprocessor scheduling of sporadic task systems considered

full-migration FJP and FTP scheduling algorithms.


In Section 2.3.1, we discuss the shortcomings of traditional workload metrics for

the feasibility of sporadic task systems. In Section 2.3.2, we discuss related work

on the partitioned scheduling of sporadic task systems. In Section 2.3.3, we review

some recent results in the full-migration scheduling of sporadic task systems. To the

best of our knowledge, there does not exist any significant specific work on restricted-

migration scheduling of sporadic task systems.

2.3.1 Limitation of Traditional Workload Metrics

For sporadic task τi (where it is possible that di is not equal to pi), ui × t is no longer

an upper bound on the execution demand of τi over any interval of length t. It is

easy to see that system-util(τ) ≤ m is a necessary condition for a sporadic task system

being feasible upon an m-processor platform. However, this condition is not sufficient

for sporadic task system feasibility. In fact, it can be shown that there exist infeasible

task systems with arbitrarily small utilization. This is illustrated in the following

example:

Example 2.2 Consider the following sporadic task system consisting of three tasks
to be scheduled on a multiprocessor system comprised of two unit-capacity processors:

τ = {τ_1 = (1, 1, r), τ_2 = (1, 1, r), τ_3 = (1, 1, r)},

where r ≥ 2. Observe that system-util(τ) = 3/r ≤ 3/2 ≤ 2, and
$\lim_{r \to \infty} \text{system-util}(\tau) = 0$; however, if each task of τ releases
a job at time 0, each job must complete one unit of execution by time 1. Two
processors can supply only two units of execution over the interval [0, 1), while
three are required; therefore, τ is infeasible on two processors.

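A metric that yields an upper bound on execution demand over any interval, for
sporadic task systems with arbitrary relative deadlines, is:

$$
\text{system-density}(\tau) \;\stackrel{\text{def}}{=}\; \sum_{\tau_i \in \tau} \text{task-density}(\tau_i),
\tag{2.5}
$$

where

$$
\text{task-density}(\tau_i) = \frac{e_i}{\min(d_i, p_i)}.
\tag{2.6}
$$

Another useful workload metric for sporadic task systems is

$$
\text{max-task-density}(\tau) \;\stackrel{\text{def}}{=}\; \max_{\tau_i \in \tau} \text{task-density}(\tau_i).
\tag{2.7}
$$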

(The quantity task-density(τi) is referred to as the density of τi.) It was shown in

(Ghazalie and Baker, 1995) that system-density(τ) ≤ 1 is a sufficient condition for the

preemptive uniprocessor feasibility of sporadic task systems — this result is easily

extended to show that system-density(τ) ≤ m is a sufficient condition for feasibility

upon a multiprocessor platform comprised of m unit-capacity processors. However,

this condition is not necessary for feasibility: consider the following example.

Example 2.3 Consider a sporadic task system τ consisting of n tasks to be scheduled
on a single preemptive processor. The i’th task has execution requirement 1, relative
deadline i, and inter-arrival separation n. It may be verified (see, e.g., (Baruah
et al., 1990b)) that this system is feasible. Its density is
system-density(τ) = $\sum_{i=1}^{n} (1/i)$, which grows without bound with increasing
n. This example illustrates that there are feasible sporadic task systems τ of
arbitrarily high density, system-density(τ).
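Example 2.3 can be checked numerically for small n. The sketch below computes the density of Equation 2.5 and compares it with a demand-based check on one processor. The demand bound function used here is the standard one for sporadic tasks (e, d, p) and is our assumption, since it is not defined in this section. The check covers only integer interval lengths up to a chosen horizon, so a passing result is supporting evidence rather than a proof of feasibility.

```python
from fractions import Fraction

def system_density(tasks):
    """Equation 2.5 for sporadic tasks given as (e, d, p) with integer parameters."""
    return sum(Fraction(e, min(d, p)) for e, d, p in tasks)

def dbf(task, t):
    """Standard demand bound function: execution that must finish within any interval of length t."""
    e, d, p = task
    if t < d:
        return 0
    return ((t - d) // p + 1) * e

def uniprocessor_demand_ok(tasks, horizon):
    """True if total demand never exceeds t for every integer t in 1..horizon."""
    return all(sum(dbf(task, t) for task in tasks) <= t
               for t in range(1, horizon + 1))

n = 5
example_2_3 = [(1, i, n) for i in range(1, n + 1)]
print(system_density(example_2_3), uniprocessor_demand_ok(example_2_3, 10 * n))
```

For n = 5 the density is 137/60, well above 1, yet the total demand never exceeds the interval length, consistent with the feasibility claimed in the example.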

In summary, with respect to the preemptive scheduling of a sporadic task system

τ on a multiprocessor platform comprised of m unit-capacity processors:

1. system-util(τ) × t is a lower bound on execution demand over any interval of

length t in any real-time instance where task generate jobs as soon as legally

58

possible: system-util(τ) ≤ m is a necessary condition for feasibility;

2. system-density(τ)× t is an upper bound on execution demand over any interval

of length t in any real-time instance where task generate jobs as soon as legally

possible: system-density(τ) ≤ m is a sufficient condition for feasibility;

3. when di = pi for all τi ∈ τ (i.e., an LL task system), the two coincide, giving us a

necessary and sufficient condition;

4. when di ≠ pi the two bounds may leave a gap, where there is uncertainty

whether a task set is feasible; Examples 2.2 and 2.3 show the gap may be large.

The conceptual relationship of system-util(τ) and system-density(τ) is illustrated

in Figure 2.2. It can be seen that there is a region of uncertainty between the lower

and upper bounds. In the next chapter (Chapter 3) we seek to increase the precision

of determining the feasibility (or infeasibility) of sporadic task systems by considering

a more accurate workload metric.

2.3.2 Partitioned Scheduling

For LL systems, partitioned feasibility-analysis can be transformed to a bin-packing

problem (Johnson, 1973) and shown to be NP-hard in the strong sense; sufficient

feasibility tests for various bin-packing heuristics have recently been obtained (Lopez

et al., 2000; Lopez et al., 2004) (as shown in Table 2.2). For sporadic task systems, the

intractability result continues to hold. However, the bin-packing heuristics and related

analysis of (Lopez et al., 2000; Lopez et al., 2004) do not trivially extend. To our

knowledge, there have been no prior non-trivial positive theoretical results concerning

partitioned feasibility analysis of constrained and arbitrary sporadic task systems —

“trivial” results include the obvious ones that τ is feasible on m processors if (i) it is

feasible on a single processor; or (ii) the system obtained by replacing each task τi by a


[Figure 2.2 (diagram): a rectangle divided into three regions, from top to bottom: "feasible" (m ≥ system-density(τ)), "?", and "infeasible" (m < system-util(τ)).]

Figure 2.2: The entire rectangle represents the space of all possible sporadic task systems (i.e., $\mathcal{I}^{S}$). The region shaded by the slanted lines on top represents the space of task systems feasible on m processors according to the system-density(τ)-based test. The crosshatched region at the bottom represents the space of task systems deemed infeasible according to the system-util(τ)-based test. The unshaded region with the question mark is referred to as the region of uncertainty because neither of the tests can determine the feasibility of the task systems. This dissertation seeks to narrow the region of uncertainty for multiprocessor systems.

task τ′i = (ei, min(di, pi), min(di, pi)) is deemed feasible using the heuristics presented

in (Lopez et al., 2000; Lopez et al., 2004). There has been some empirical work

on partitioning sporadic task systems; Saez, et al. (Saez et al., 1998) describe and

experimentally evaluate partitioning heuristics when the constraint that deadlines

equal periods is removed. In Chapter 7, we address the partitioned scheduling of

sporadic task systems by developing partitioning algorithms with a constant-factor

approximation ratio.

2.3.3 Full-Migration Schedulability Tests

As mentioned at the beginning of this section, there are no known non-trivial feasi-

bility tests for multiprocessor sporadic systems prior to the results of this dissertation

that are not associated with a schedulability test. Fortunately, there does exist some


work on schedulability tests. In this subsection, we briefly present some of the known

full-migration schedulability tests for the FTP and FJP scheduling of sporadic task

systems. A much more thorough overview can be found in (Baker and Baruah, 2007).

Though the tests presented have been shown to be reasonably effective empirically,

there are no known resource-augmentation approximation ratios associated with any

of the tests presented in this section. Chapter 6 will derive schedulability tests for

both FTP and FJP scheduling with constant-factor resource-augmentation approxi-

mation ratios; in addition, a brief theoretical comparison of FTP schedulability tests

for sporadic task systems will be made. We now present the prior schedulability tests.

§ FTP Scheduling. The first test presented in this section was developed in (Baker

and Cirinei, 2006) and is valid for sporadic task systems where tasks are indexed by

decreasing priority:

Theorem 2.6 (from (Baker and Cirinei, 2006)) Let τ be a sporadic task system

{τ1, . . . , τn} ordered by decreasing priority. A task τk ∈ τ is schedulable on m unit-

capacity processors according to static-priority scheduling if there exists

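λ ∈ {task-density(τk)} ∪ {ui | ui ≥ task-density(τk) ∧ τi ∈ τ} such that either

$$\sum_{i<k} \min\left(\beta_k(i, \lambda),\, 1 - \lambda\right) < m(1 - \lambda) \qquad (2.8)$$

or,

$$\left[\sum_{i<k} \min\left(\beta_k(i, \lambda),\, 1 - \lambda\right) = m(1 - \lambda)\right] \wedge \left[\exists i \neq k : 0 < \beta_k(i, \lambda) < 1 - \text{task-density}(\tau_k)\right] \qquad (2.9)$$

where

$$\beta_k(i, \lambda) \stackrel{\text{def}}{=} \begin{cases} u_i\left(1 + \max\left(0,\, \dfrac{p_i - e_i}{d_k}\right)\right), & \text{if } u_i \le \lambda \\[2ex] u_i\left(1 + \max\left(0,\, \dfrac{d_i + p_i - e_i - \lambda d_i / u_i}{d_k}\right)\right), & \text{otherwise.} \end{cases} \qquad (2.10)$$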

If we restrict our attention to sporadic task systems where each task τi ∈ τ has

di ≤ pi, (Bertogna et al., 2005a) derived the following test:

Theorem 2.7 (from (Bertogna et al., 2005a)) Let τ be a sporadic task system

{τ1, . . . , τn} ordered by decreasing priority with di ≤ pi for each task τi. A task τk ∈ τ

is schedulable on m unit-capacity processors according to static-priority scheduling if

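$$\sum_{i=1}^{k-1} \min\left(\beta_k(i),\, 1 - \frac{e_k}{d_k}\right) < m\left(1 - \frac{e_k}{d_k}\right) \qquad (2.11)$$

or,

$$\sum_{i=1}^{k-1} \min\left(\beta_k(i),\, 1 - \frac{e_k}{d_k}\right) = m\left(1 - \frac{e_k}{d_k}\right), \quad \left(\text{if } \exists i \neq k : \beta_k(i) \le 1 - \frac{e_k}{d_k}\right) \qquad (2.12)$$

where

$$\beta_k(i) \stackrel{\text{def}}{=} \frac{e_i N_k(i) + \min\left(e_i,\, \max\left(0,\, d_k - p_i N_k(i) + d_i - e_i\right)\right)}{d_k} \qquad (2.13)$$

and

$$N_k(i) \stackrel{\text{def}}{=} \left(\left\lfloor \frac{d_k - e_i}{p_i} \right\rfloor + 1\right). \qquad (2.14)$$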

§ FJP Scheduling. Research has been done on the full-migration edf-scheduling

of sporadic task systems.

Theorem 2.8 (from (Baker, 2005a)) Let τ be a sporadic task system {τ1, . . . , τn}.

A task τk ∈ τ is schedulable on m unit-capacity processors according to edf if there

exists λ ∈ {task-density(τk)} ∪ {ui|ui ≥ task-density(τk) ∧ τi ∈ τ} such that

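$$\sum_{\tau_i \in \tau} \min\left(\beta_k(i, \lambda),\, 1\right) \le m(1 - \lambda) + \lambda \qquad (2.15)$$

where

$$\beta_k(i, \lambda) \stackrel{\text{def}}{=} \begin{cases} u_i\left(1 + \max\left(0,\, \dfrac{p_i - e_i}{d_k}\right)\right), & \text{if } u_i \le \lambda \\[2ex] u_i\left(1 + \dfrac{p_i}{d_k}\right) - \lambda \dfrac{d_i}{d_k}, & \text{if } u_i > \lambda \wedge d_i \le p_i \\[2ex] u_i\left(1 + \dfrac{p_i}{d_k}\right), & \text{otherwise} \end{cases} \qquad (2.16)$$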
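The test of Theorem 2.8 is mechanical to evaluate once the candidate values of λ are enumerated. The sketch below is an illustration only: it transcribes Inequality 2.15 and Equation 2.16 as stated in this section, assumes tasks are tuples (e, d, p) of positive integers with utilization e/p, and uses a 0-based task index k; the function names are hypothetical.

```python
from fractions import Fraction


def beta(task_i, d_k, lam):
    """beta_k(i, lambda) following the three cases of Equation 2.16."""
    e, d, p = task_i
    u = Fraction(e, p)
    if u <= lam:
        return u * (1 + max(Fraction(0), Fraction(p - e, d_k)))
    if d <= p:
        return u * (1 + Fraction(p, d_k)) - lam * Fraction(d, d_k)
    return u * (1 + Fraction(p, d_k))


def edf_task_passes(tasks, k, m):
    """True if some candidate lambda satisfies Inequality 2.15 for task k."""
    e_k, d_k, p_k = tasks[k]
    density_k = Fraction(e_k, min(d_k, p_k))
    # Candidate set: the density of task k plus every utilization at least as large.
    candidates = {density_k}
    candidates |= {Fraction(e, p) for e, _d, p in tasks
                   if Fraction(e, p) >= density_k}
    for lam in candidates:
        total = sum(min(beta(t, d_k, lam), Fraction(1)) for t in tasks)
        if total <= m * (1 - lam) + lam:
            return True
    return False


print(edf_task_passes([(1, 2, 2)], 0, 1))
```

Because the theorem gives only a sufficient condition, a False result from this sketch means the test is inconclusive for that task, not that the task misses deadlines.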

Again, if we restrict our attention to sporadic task systems where each task τi ∈ τ

has di ≤ pi, (Bertogna et al., 2005b) derived the following test:

Theorem 2.9 (from (Bertogna et al., 2005b)) Let τ be a sporadic task system

{τ1, . . . , τn} ordered by decreasing priority with di ≤ pi for each task τi. A task τk ∈ τ

is schedulable on m unit-capacity processors according to static-priority scheduling if

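$$\sum_{i=1}^{k-1} \min\left(\beta_k(i),\, 1 - \frac{e_k}{d_k}\right) < m\left(1 - \frac{e_k}{d_k}\right) \qquad (2.17)$$

or,

$$\sum_{i=1}^{k-1} \min\left(\beta_k(i),\, 1 - \frac{e_k}{d_k}\right) = m\left(1 - \frac{e_k}{d_k}\right), \quad \left(\text{if } \exists i \neq k : \beta_k(i) \le 1 - \frac{e_k}{d_k}\right) \qquad (2.18)$$

where

$$\beta_k(i) \stackrel{\text{def}}{=} \frac{e_i N_k(i) + \min\left(e_i,\, \max\left(0,\, d_k - p_i N_k(i) / e_i\right)\right)}{d_k} \qquad (2.19)$$

and

$$N_k(i) \stackrel{\text{def}}{=} \max\left(0,\, \left(\left\lfloor \frac{e_k - e_i}{p_i} \right\rfloor + 1\right)\right). \qquad (2.20)$$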


In this subsection, we have presented two tests each for both FJP and FTP

scheduling. Unfortunately, it turns out that these tests are incomparable, in the

sense that there exist sporadic task systems that are schedulable according to one

test, but not the other. Therefore, determining the efficacy of each test is a non-

trivial challenge. We refer the interested reader to (Baker, 2006b) for an empirical

comparison of these various tests.

2.4 More General Task Models

Research on the multiprocessor scheduling of task systems that are more general

than the sporadic task model is virtually nonexistent. The only reference we are

aware of is on the distributed scheduling of distributed generalized multiframe (DGMF)

tasks (Chen et al., 2000). This model is identical to the GMF model presented in

Section 1.1.2.3, except a vector $\vec{h}_i \stackrel{\text{def}}{=} \{h_1, h_2, \ldots, h_{N_i}\}$ is added to the task specification.
The value $h_k \in \{h_1, h_2, \ldots, h_{N_i}\}$ indicates which host (i.e., processor) the k’th

job in the GMF sequence will execute upon; that is, the task specification indicates

the job assignment to the processor. Tests are presented in (Chen et al., 2000) for

analyzing the FJP schedulability of any DGMF task using uniprocessor schedulability

analysis techniques. However, we consider this task model to lie beyond the scope

of this dissertation, due to the fact that after task specification the system does not

require any decisions about job-processor assignment, essentially sidestepping many

of the multiprocessor challenges by statically assigning jobs to processors. Therefore,

we will not consider this model further, but focus on other general models where the

processor assignment is not included in the task specification. Chapters 4 and 6 will

present feasibility and full-migration schedulability tests for general recurrent task

systems.


2.5 Summary

In this chapter, we reviewed some of the prior fundamental research in real-time mul-

tiprocessor scheduling. For arbitrary real-time instances (Section 2.1), we saw that

while optimal online scheduling is impossible, scheduling algorithms with constant-

factor approximation ratios exist; however, there are currently no known verification

techniques with such approximation ratios for arbitrary real-time instances. The

LL task model (Section 2.2) allows for optimal scheduling algorithms, despite many

online algorithms suffering from Dhall’s effect. Scheduling algorithms have been pro-

posed with effective schedulability tests (see Table 2.2). For the slightly more general

sporadic task model (Section 2.3), only recently have researchers focused on full-

migration schedulability tests; unfortunately, these tests currently have no known

resource-augmentation guarantee. Feasibility and schedulability testing for more general
task systems (Section 2.4) on multiprocessor platforms has been a completely
open question.

The remaining chapters of this dissertation will address some of the limitations

mentioned in this section. Primarily, we will develop feasibility and schedulability

tests with constant-factor resource-augmentation approximation ratios for sporadic

and more general task systems.


Chapter 3

A Metric of Real-time Workload:

Demand-Based Load and

Maximum Job Density

The previous chapter highlighted the shortcomings of the standard real-time workload

metrics used in verification tests for multiprocessor feasibility and schedulability of

sporadic and more general task systems. In this chapter, we suggest using a different

well-known characterization of real-time workload: the combination of demand-based

load (referred to as just load, interchangeably) and maximum job density. In this

chapter we show that these two workload characterizations are closely related to

feasibility on multiprocessor platforms. Additionally, we can derive values for both

demand-based load and maximum job density from large classes of partially-specified

recurrent task systems.

The remainder of this chapter is organized as follows. In Section 3.1, we formally

define the concepts of demand, maximum job density, and demand-based load. In

Section 3.2, we show that maximum job density and demand-based load can be used in
a necessary condition for the feasibility of a real-time instance upon a multiprocessor
platform. In Section 3.3, we describe the relationship between the load and maximum
job density metrics for real-time instances and the real-time instances produced by

recurrent task systems; it is shown that for many recurrent task models, tight upper

bounds on maximum job density and demand-based load may be obtained on the real-

time instances that are generated by task systems in these models. In Section 3.4, we

describe both exact algorithms and a polynomial-time approximation scheme (PTAS)

for calculating demand-based load for a sporadic task system.

3.1 Definitions

It is useful, for the purpose of formal analysis, to quantify the amount of computation

required over an interval by a real-time instance. We call this quantity the demand

over the interval. Informally, demand is an indication of how “temporally constrained”

the system is over that interval. Below is a more formal definition of demand.

Definition 3.1 (Demand of a Real-Time Instance I) The demand of a real-time

instance over a time interval [t1, t2] is the sum of the execution requirements of all

jobs in the instance that have both their arrival times and their deadlines within the

interval:

$$\text{demand}(I, t_1, t_2) \stackrel{\text{def}}{=} \sum_{(J_i \in I) \wedge (t_1 \le A_i) \wedge (A_i + D_i \le t_2)} E_i. \qquad (3.1)$$

Another useful indicator of how temporally constrained the system might be is

the maximum ratio of a job’s execution time to its scheduling window. If this ratio,

called maximum job density, is high then jobs of a real-time instance may require a

large fraction of processing time on the platform. Below is the formal definition of

maximum job density.


Definition 3.2 (Maximum Job Density of Real-Time Instance I) The

maximum job density of real-time instance I is the maximum ratio of any job’s exe-

cution requirement to its relative deadline:

$$\text{max-job-density}(I) \stackrel{\text{def}}{=} \max_{J_i \in I} \left(E_i / D_i\right). \qquad (3.2)$$

Intuitively, max-job-density(I) represents the maximum computational demand of any

individual job.

The demand-based load of a real-time instance I represents the maximum cumulative
computational demand of any subset of the set of jobs in I. Informally, the

load may be interpreted as a lower bound on the minimum number of processors that

real-time instance I would require to meet the deadlines of jobs over all intervals.

More formally,

Definition 3.3 (Demand-Based Load of Real-Time Instance I) The demand-

based load of real-time instance I, load(I), is the maximum ratio, over all positive

intervals, of the demand of I over the interval to the interval length:

$$\text{load}(I) \stackrel{\text{def}}{=} \max_{t_1 < t_2} \frac{\text{demand}(I, t_1, t_2)}{t_2 - t_1}. \qquad (3.3)$$

3.2 Infeasibility Test

The parameters load and max-job-density turn out to be very closely related to the

feasibility of a real-time instance on a multiprocessor platform; in fact, these two

parameters may be used in a necessary condition for feasibility, as the lemma below

shows:


Lemma 3.1 If real-time instance I is feasible on an identical multiprocessor platform

comprised of m unit-capacity processors, then

max-job-density(I) ≤ 1 and load(I) ≤ m.

Proof: The first condition follows from the observation that on a unit-capacity

processor, a job that meets its deadline by executing continually between its arrival

time and its deadline has a maximum job density of one; hence, one is an upper bound

on the density of any job. Taken over all jobs in I, this observation yields the first

condition.

For the second condition, the requirement that load(I) ≤ m is obtained by consid-

ering a set of jobs of I that defines load(I); i.e., the jobs over an interval [t1, t2) such

that all jobs arriving in, and having deadlines within, this interval have a cumulative

execution requirement equal to load(I)×(t2−t1). The total amount of execution that

all these jobs may receive over [t1, t2) is equal to m× (t2 − t1); hence, load(I) ≤ m.

If the converse of Lemma 3.1 were to hold, we would have an exact necessary and

sufficient condition for migratory feasibility analysis. Unfortunately, the converse of

Lemma 3.1 does not hold, as is illustrated by the following example.

Example 3.1 Consider the real-time instance I consisting of the three jobs J1, J2,

and J3. All three jobs arrive at time 0; jobs J1 and J2 have execution requirement

of one and a deadline at time 1; and J3 has an execution requirement of two and a

deadline at time 2.

max-job-density(I) equals max{1/1, 1/1, 2/2}, which is equal to 1.

Since all arrival times and deadlines are at time-instants 0, 1, and 2, load(I) can


be computed by considering the intervals [0, 1), [0, 2), and [1, 2):

$$\text{load}(I) = \max\left(\frac{1 + 1}{1},\, \frac{1 + 1 + 2}{2},\, \frac{0}{1}\right) = 2.$$

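Thus, instance I satisfies the conditions of Lemma 3.1 for m = 2; however, all three
jobs cannot be scheduled to meet their deadlines on two unit-capacity processors:
J1 and J2 must occupy both processors throughout [0, 1), which leaves J3 (a job
cannot execute on two processors at once) at most one unit of execution before its
deadline at time 2.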
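Definitions 3.1 to 3.3 can be checked on the finite instance of Example 3.1 with a few lines of code. This is a minimal sketch, assuming each job is a tuple (A, E, D) of integer arrival time, execution requirement, and relative deadline; the function names are illustrative. It relies on the observation that the ratio in Equation 3.3 can only be maximized by an interval that starts at some arrival time and ends at some absolute deadline, so only those endpoints are tried.

```python
from fractions import Fraction


def demand(jobs, t1, t2):
    """Equation 3.1: total execution of jobs arriving and due within [t1, t2]."""
    return sum(e for a, e, d in jobs if t1 <= a and a + d <= t2)


def max_job_density(jobs):
    """Equation 3.2: largest ratio of execution requirement to relative deadline."""
    return max(Fraction(e, d) for _a, e, d in jobs)


def load(jobs):
    """Equation 3.3, restricted to intervals from an arrival to a deadline."""
    starts = {a for a, _e, _d in jobs}
    ends = {a + d for a, _e, d in jobs}
    return max(Fraction(demand(jobs, t1, t2), t2 - t1)
               for t1 in starts for t2 in ends if t1 < t2)


# Example 3.1: J1 and J2 need 1 unit by time 1; J3 needs 2 units by time 2.
example_3_1 = [(0, 1, 1), (0, 1, 1), (0, 2, 2)]
print(max_job_density(example_3_1), load(example_3_1))
```

The instance has max-job-density 1 and load 2, so it passes the necessary condition of Lemma 3.1 for m = 2 even though it is infeasible on two processors.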

Lemma 3.1 and Example 3.1 tell us that, while every instance I that is feasi-

ble upon a platform comprised of m unit-capacity processors has load(I) ≤ m and

max-job-density(I) ≤ 1, not every instance I ′ with load(I ′) ≤ m and

max-job-density(I ′) ≤ 1 is feasible on such a platform. Chapters 4, 6, and 7 will

further explore multiprocessor feasibility and schedulability tests based on load and

max-job-density.

3.3 Demand-Based Load of Partially-Specified Re-

current Task Systems

The infeasibility test of the previous section and the feasibility and schedulability tests

described in the remainder of this dissertation require that the load parameter load(I) and

the job density parameter max-job-density(I) of the real-time instance I being ana-

lyzed be known. Thus, it may seem to the reader that the workload characterizations

of load(I) and max-job-density(I) are of limited interest for infinite real-time instances,
especially since, as we argued in the introduction, many real-time systems are

comprised of collections of independent recurrent real-time tasks, each of which gen-

erates a potentially infinite sequence of jobs. However, in this section we will show


how, given the specifications of such a real-time system, we may determine bounds

on the load and max-job-density parameters of any real-time instance that could be

generated by the system during run-time. Our approach is based upon the concept

of the demand-bound function (dbf) of recurrent real-time tasks; in Section 3.3.1

below, we describe this concept and explain how it relates to determining load and

max-job-density. At the end of that subsection, we briefly discuss the computational complexity

of the dbf approach towards multiprocessor feasibility and schedulability of general

task models.

3.3.1 The dbf Abstraction

We start out with a definition of the demand-bound function.

Definition 3.4 (Demand-Bound Function) Let τi denote a recurrent real-time

task, and t a non-negative real number. The demand-bound function dbf(τi, t) denotes

the maximum cumulative execution requirement that could be generated by jobs of τi

that have both ready times and deadlines within any time interval of duration t.

The demand-bound function is efficiently determined for all the recurring real-time

task models mentioned in this dissertation; algorithms for doing so for the LL and

sporadic task models are to be found in (Baruah et al., 1990b), for the multiframe and

generalized multiframe models in (Baruah et al., 1999), and for the DAG-based model

in (Baruah, 2003). As an illustrative example, we show the formula for computing

dbf for a task specified according to the sporadic task model — a formal proof of the

formula may be found in (Baruah et al., 1990b). Recall from the introduction that a

sporadic task τi may be represented by the three-tuple (ei, di, pi), with the interpretation

that this task generates an infinite sequence of jobs each with execution requirement

ei and relative deadline di, and with the arrival times of successive jobs separated

by at least pi time units. For such a task, it has been shown (Baruah et al., 1990b)


that the cumulative execution requirement of jobs of τi over an interval [t_o, t_o + t)
is maximized if one job arrives at the start of the interval (i.e., at time-instant
t_o) and subsequent jobs arrive as rapidly as permitted, i.e., at instants t_o + pi,
t_o + 2pi, t_o + 3pi, . . . (such a sequence where the jobs of a task arrive as soon as legally

allowable is known as the synchronous arrival sequence); Equation (3.4) below follows

directly (Baruah et al., 1990b):

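$$\text{dbf}(\tau_i, t) \stackrel{\text{def}}{=} \max\left(0,\, \left(\left\lfloor \frac{t - d_i}{p_i} \right\rfloor + 1\right) \times e_i\right). \qquad (3.4)$$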
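Equation 3.4 translates almost literally into code. The following minimal sketch assumes integer task parameters (e, d, p) and an integer interval length t; the function name is illustrative. Python's `//` operator rounds toward negative infinity, which matches the floor in the formula even when t is smaller than the deadline.

```python
def dbf(task, t):
    """Demand-bound function of a sporadic task (e, d, p), per Equation 3.4."""
    e, d, p = task
    return max(0, ((t - d) // p + 1) * e)


# The task (e, d, p) = (2, 7, 3) contributes nothing before t = 7,
# then 2 more units every 3 time units.
print([dbf((2, 7, 3), t) for t in (6, 7, 9, 10, 13)])
```

The function is a non-decreasing step function of t that changes value only at t = d + j·p for non-negative integers j.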

Given that we know how to determine the dbf for many of the important recurrent

real-time task models, we now discuss how to compute the value of load(I) for any

real-time instance I that is generated by a collection of recurrent real-time tasks

τ = {τ1, τ2, . . .}. The maximum cumulative execution requirement by jobs in I over

any time interval [t1, t2) is bounded from above by the sum of the maximum execution

requirements of the individual tasks in τ :

$$\text{demand}(I, t_1, t_2) \le \sum_{\tau_i \in \tau} \text{dbf}(\tau_i, t_2 - t_1).$$

From the definition of load (Equation 3.3), it follows that

$$\text{load}(I) \le \max_{t \ge 0} \frac{\sum_{\tau_i \in \tau} \text{dbf}(\tau_i, t)}{t}. \qquad (3.5)$$

How tight is the bound of Inequality 3.5? Clearly, it cannot in general be tight for

all instances I generated by a recurrent task system τ , since τ may generate different

instances that have different loads, while the bound of Inequality 3.5 is unable to

distinguish between such different instances. However, we do not in general know


beforehand which specific instance τ may generate during run-time; therefore, the

bound of Inequality 3.5 must hold for all instances that τ might legally generate. So

a more reasonable formulation of the “tightness” question for Inequality 3.5 would be:

Is there some instance I that could be generated by τ , for which the load bound of

Inequality 3.5 is tight?

The answer to this question depends upon the characteristics of the collection of

recurrent tasks comprising τ . Informally, the requirement for Inequality 3.5 to repre-

sent a tight bound (in the sense discussed above) is that the different tasks comprising

τ be completely independent of one another. This requirement is formalized in the

task independence assumptions (Baruah et al., 1999). We briefly review these inde-

pendence assumptions below; a more complete discussion may be found in (Baruah

et al., 1999).

There are two requirements that are satisfied by systems adhering to the task

independence assumptions.

1. The run-time behavior of a task does not depend upon the behavior of other

tasks in the system. That is, each task is an independent entity, perhaps driven

by separate external events. It is not permissible for one task to generate a job

directly in response to another task generating a job. Instances of task systems

not satisfying this assumption include systems where, for example, all tasks are

required to generate jobs at the same time instant, or where it is guaranteed

that certain tasks will generate jobs before certain other tasks. (However, such

systems can sometimes nevertheless be represented in such a manner as to satisfy

this assumption, by modelling the interacting tasks as a single task which is

assumed to generate the jobs actually generated by the interacting tasks.)

2. The workload constraints can be specified without making any references to “ab-

solute” time. That is, specifications such as “Task τi generates a job at time 3”


are forbidden. There are several scenarios within which this assumption holds.

Consider first a distributed system in which each task executes on a separate

node (jobs correspond to requests for time on a shared resource) and which

begins execution in response to an external event. All temporal specifications

are made relative to the time at which the task begins execution, which is not

a priori known. As another example, consider a distributed system in which

each task maintains its own (very accurate) clock, and in which the clocks of

different tasks are not synchronized with each other. The accuracy of the clocks

permits us to assume that there is no clock drift, and that all tasks use exactly

the same units for measuring time. However, the fact that these clocks are not

synchronized rules out the use of a concept of an absolute time scale.

These task independence assumptions are extremely general and are satisfied by

a wide variety of the kinds of task systems one may encounter in practice. Most

common task models, including the LL (Liu and Layland, 1973), multiframe (Mok

and Chen, 1996; Mok and Chen, 1997), generalized multiframe (Baruah et al., 1999),

and the DAG-based model (Baruah, 2003), satisfy these assumptions.

We now return to the issue of the tightness of the bound of Inequality 3.5. Let τ

denote a real-time system. If τ satisfies the task independence assumptions, then the

bound of Inequality 3.5 is tight in the sense that there is some real-time instance I

that could legally be generated by τ , for which

$$\text{load}(I) = \max_{t \ge 0} \frac{\sum_{\tau_i \in \tau} \text{dbf}(\tau_i, t)}{t}.$$

Using this notion of a maximum-load real-time instance generated by τ, we can in

fact define load in terms of the partially specified task system:


Definition 3.5 (load for Task System τ) The load of a recurrent partially-specified

task system τ in task model M is equal to the maximum load of any real-time instance

generated by the task system τ :

$$\text{load}(\tau) \stackrel{\text{def}}{=} \max_{I \in \mathcal{I}^{M}(\tau)} \{\text{load}(I)\}. \qquad (3.6)$$

§ Time Complexity. We now discuss the computational complexity of determining

bounds on

max-job-density(I) and load(I), when we are given that I is generated by a system τ

of recurrent real-time tasks. From the definition of max-job-density (Equation 3.2), it
follows that max-job-density(I) is easily bounded for any system in which all possible
jobs that could be generated by each task can be enumerated; the computational
complexity of doing so is directly proportional to the computational complexity of
the enumeration¹. For example, in the sporadic task model, each job Ji generated by
sporadic task τk has Ei/Di = ek/dk; therefore, to determine max-job-density(I), we find the
value of $\max_{\tau_k \in \tau} \{e_k / d_k\}$, which clearly has complexity O(n).

The determination of load(I) is somewhat more complex. From Inequality 3.5,

it can be seen that load(I) is defined in terms of the dbf functions of all the tasks

comprising τ . It has been shown (Chakraborty et al., 2001; Chakraborty, 2003) that

dbfs can be efficiently computed for all the formal models of recurrent tasks (the

LL (Liu and Layland, 1973), the sporadic (Mok, 1983), the multiframe (Mok and

Chen, 1996; Mok and Chen, 1997), the generalized multiframe (Baruah et al., 1999),

and the recurring real-time task model (Baruah, 2003)) discussed in this dissertation.

This is achieved by doing a pseudo-polynomial amount of pre-processing per task,

after which dbf(τi, t) for any t can be computed in polynomial time.

¹ When the jobs are characterized by upper bounds on their execution requirements, the value of max-job-density(I) so computed also becomes an upper bound.


To implement any of the (sufficient) multiprocessor feasibility tests that are based
on demand-based load and maximum job density (presented in later chapters of
this dissertation) upon a task system τ, we would therefore do the following:

1. Perform the dbf preprocessing on each task in the task system τ.

2. Compute a bound on max-job-density(I).

3. Based upon the computed bound on max-job-density(I) and the available number

of processors m, determine a bound B on load(I) that is implied by a feasibility

or schedulability test (for example, Equation 4.1 of Theorem 4.1, or any other

test presented in this dissertation).

4. The question now becomes: Is there a t ≥ 0 such that

$$\sum_{\tau_i \in \tau} \text{dbf}(\tau_i, t) > (B \times t)\ ? \qquad (3.7)$$

If the answer is “no,” then τ is guaranteed to be feasible on the m unit-capacity

processors. On the other hand, if the answer is “yes” then our test is not able

to conclude the feasibility or otherwise of τ on the m processors.

For all the formal models of recurrent tasks considered in this dissertation, it can

be shown that if Inequality 3.7 is to be satisfied at all, it will be satisfied for some

“reasonably small” value of t — specifically, for some t with value that is no more

than pseudo-polynomial in the parameters of the task system (see (Baruah, 2003)).

Consequently, we can simply check Inequality 3.7 for all values of t, up to this pseudo-

polynomial bound, at which dbf(τi, t) changes value for some τi ∈ τ ; if Inequality 3.7

is not satisfied for all of these values of t, we can conclude that it will not be satisfied

for any value of t, and that τ is consequently feasible.
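For sporadic tasks, the check described above can be sketched as follows. The sketch assumes integer task tuples (e, d, p) and uses the sporadic dbf of Equation 3.4; the `horizon` argument is a hypothetical stand-in for the pseudo-polynomial bound on t, which is not derived here and must be supplied by the caller. Only the points where some dbf changes value are examined, because between two such points the left-hand side of Inequality 3.7 is constant while B × t does not decrease (for B ≥ 0).

```python
from fractions import Fraction


def dbf(task, t):
    """Sporadic demand-bound function, per Equation 3.4."""
    e, d, p = task
    return max(0, ((t - d) // p + 1) * e)


def step_points(tasks, horizon):
    """All t <= horizon of the form d + j*p at which some dbf changes value."""
    points = set()
    for _e, d, p in tasks:
        t = d
        while t <= horizon:
            points.add(t)
            t += p
    return sorted(points)


def first_violation(tasks, bound, horizon):
    """Smallest step point t with sum of dbf > bound * t, or None if there is none."""
    for t in step_points(tasks, horizon):
        if sum(dbf(task, t) for task in tasks) > bound * t:
            return t
    return None


tasks = [(2, 7, 3), (2, 5, 6)]
print(first_violation(tasks, 1, 50), first_violation(tasks, Fraction(1, 2), 50))
```

A result of None corresponds to the answer "no" in step 4, provided the supplied horizon is at least the bound beyond which Inequality 3.7 cannot first become true.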


Recent work (Albers and Slomka, 2004; Baruah and Fisher, 2005b; Fisher and

Baruah, 2005a; Fisher and Baruah, 2005b) on approximation algorithms for unipro-

cessor scheduling has introduced many techniques for obtaining polynomial-time fea-

sibility tests for systems of recurrent real-time tasks by “sacrificing” a (quantifiable)

fraction of the available computing capacity. In essence,

these techniques approximate the values of dbf(τi, t) beyond a certain (small) value

of t. We observe that these techniques all apply to our multiprocessor feasibility test

as well; hence, a less accurate variant of our test can be devised, with a run-time

complexity that is polynomial in the representation of the task system.

An alternative approach to evaluating Equation 3.7 for determining the feasibility

or schedulability of a task system is to pre-compute load(τ) for task system τ ; then,

the validation test is equivalent to checking if load(τ) > B. In some cases, this may

be more computationally expensive, since it may require dbf to be evaluated at a

larger number of values than the approach of Equation 3.7. However, for some mod-

els such as the sporadic task model, it has been shown that load may be efficiently

determined. (Baruah et al., 1990a; Ripoll et al., 1996) present algorithms that have

pseudo-polynomial time complexity for task systems that have a system utilization

strictly less than m. The next section presents additional algorithms for exactly computing
load for sporadic task systems that have pseudo-polynomial time complexity; a
PTAS is also presented for approximating load to within an arbitrary additive error
term ε.


3.4 Efficiently Calculating load for Sporadic Task

Systems

In this section, we address the issue of effectively calculating load(τ) if τ is a spo-

radic task system. We present an exact algorithm for computing load(τ) that involves

simulating the scheduling of τ to its hyperperiod (the least common multiple of the

task system’s periods, $\mathrm{LCM}_{i=1}^{n}\, p_i$). We also present two other algorithms which approximate
load(τ) within an arbitrary threshold ε > 0 of its exact value. The first

approximate algorithm runs in time pseudo-polynomial in the representation of the

task system; the second algorithm, in time polynomial in the representation of the

task system.

The remainder of this section is organized as follows. We define some additional

notation used throughout this section and prove useful properties of load(τ) in Sec-

tion 3.4.1. We then derive a simple method for exactly determining load(τ) in Sec-

tion 3.4.2. We describe the more practical approximations of load in Section 3.4.3.

3.4.1 Properties of Demand-Based Load for a Sporadic Task

System

Throughout this section, let f(τi, t) be defined to be dbf(τi, t) normalized by the
interval length: $f(\tau_i, t) \stackrel{\text{def}}{=} \text{dbf}(\tau_i, t) / t$.

Given a sporadic task system τ = {τ1, . . . , τn}, the demand-based load load(τ) can
be computed by determining $\max_{t > 0} f(\tau, t)$, where $f(\tau, t) \stackrel{\text{def}}{=} \sum_{i=1}^{n} f(\tau_i, t)$.

The parameter load(τ) may be calculated using the above function because dbf(τi, t)

is a “tight” characterization (per the discussion of the previous subsection and (Baruah

et al., 1990a)) for sporadic task systems of the maximum demand of any real-time

instance over any interval of length t.


[Figure 3.1 (plots): two panels, (a) and (b), each plotting demand/time (vertical axis, 0 to 1) against time (horizontal axis, 0 to 25), with curves labeled "f" and "upper".]

Figure 3.1: The zigzag solid lines in both graphs represent values f(τi, t) with respect to t where in (a) $\tau_i \stackrel{\text{def}}{=} (e_i, d_i, p_i) = (2, 7, 3)$, and in (b) $\tau_i \stackrel{\text{def}}{=} (2, 5, 6)$. The dotted lines represent approximations to f(τi, t) with ki = 0. (Approximations are described in Section 3.4.3.2.)


Figure 3.1 illustrates an example of f(τi, t) for two different tasks. The demand-

based load, load(τ), describes the maximum value of f(τ, t) over all positive values of

t. Since the values of t are real and unbounded, the notation maxt>0 here denotes the

least upper bound of f(τ, t).

We will now show that load(τ) is potentially superior to system-util(τ) as a lower

bound on computational load, as it falls between system-util(τ) and system-density(τ).

The next lemma proves that system-util(τ) is a lower bound on the demand-based

load.

Lemma 3.2 load(τ) ≥ system-util(τ)

Proof: Observe that for each task τi ∈ τ, $\lim_{t\to\infty} f(\tau_i, t) = u_i$.

More precisely, for a given i and t, let 0 ≤ r < pi be the value such that $\left\lfloor \frac{t-d_i}{p_i} \right\rfloor = \frac{t-d_i-r}{p_i}$. It follows that

$$\frac{\left(\left\lfloor \frac{t-d_i}{p_i} \right\rfloor + 1\right) e_i}{t} = \frac{\left(\frac{t-d_i-r}{p_i} + 1\right) e_i}{t} = \frac{(t + p_i - d_i - r)\, e_i}{t\, p_i} = u_i + u_i \frac{p_i - d_i - r}{t}.$$

Since −di < pi − di − r ≤ pi − di, the absolute value of the fraction on the right above is decreasing with respect to t, and so the limit of the entire expression is ui.

It follows that $\lim_{t\to\infty} f(\tau, t) = \text{system-util}(\tau)$. The lemma immediately follows from this limit.

The following lemma shows that system-density(τ) is an upper bound on the

demand-based load.

Lemma 3.3 load(τ) ≤ system-density(τ)

[Figure 3.2: diagram of the load-based feasibility test. Labels: "m ≥ system-density(τ)" (feasible), "?" (undetermined), "m < load" (infeasible), "m < system-util(τ)".]

Figure 3.2: load-based tests further reduce the region of uncertainty from Figure 2.2 for the feasibility of sporadic task systems on multiprocessor platforms.

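Proof: Note that, by definition of f(τ, t),

$$f(\tau, t) = \sum_{i:\, d_i < t} \frac{\left\lfloor \frac{t + p_i - d_i}{p_i} \right\rfloor e_i}{t} \le \sum_{i:\, d_i < t} u_i \left(1 + \frac{p_i - d_i}{t}\right).$$

If pi ≥ di, the term $\frac{p_i - d_i}{t}$ is non-increasing with respect to t, and since di < t (by the summation terms), $\frac{p_i - d_i}{t} \le \frac{p_i - d_i}{d_i} = \frac{p_i}{d_i} - 1$.

Otherwise, the term $\frac{p_i - d_i}{t}$ is increasing with respect to t, and in the limit $\frac{p_i - d_i}{t} = 0$.

Therefore,

$$f(\tau, t) \le \sum_{i:\, d_i < t} u_i \left(1 + \max\left(0, \frac{p_i}{d_i} - 1\right)\right) = \sum_{i:\, d_i < t} \text{task-density}(\tau_i) \le \sum_{i=1}^{n} \text{task-density}(\tau_i) = \text{system-density}(\tau).$$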

The benefit that load provides for infeasibility and feasibility tests due to Lemmas 3.2

and 3.3 is conceptually represented in Figure 3.2.
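As a numerical illustration of Lemmas 3.2 and 3.3, the sketch below computes system-util, system-density, and the largest value of f(τ, t) over the points t = j·pi + di up to the LCM of the periods for a small, made-up task system. The task set, the `(e, d, p)` tuple layout, and all function names are assumptions of this example; restricting attention to those points is justified by the lemmas of Section 3.4.2. It requires Python 3.9 or later for `math.lcm`.

```python
from fractions import Fraction
from math import lcm


def dbf(task, t):
    """Demand bound function for a hypothetical (e, d, p) integer task tuple."""
    e, d, p = task
    return 0 if t < d else ((t - d) // p + 1) * e


def f(tasks, t):
    """Total demand normalized by the interval length."""
    return Fraction(sum(dbf(task, t) for task in tasks)) / Fraction(t)


def system_util(tasks):
    return sum(Fraction(e, p) for e, d, p in tasks)


def system_density(tasks):
    # task-density = u_i * max(1, p_i / d_i) = e_i / min(d_i, p_i)
    return sum(Fraction(e, min(d, p)) for e, d, p in tasks)


def load_by_enumeration(tasks):
    """Evaluate f at every step point up to the hyperperiod; never below system-util."""
    hyper = lcm(*(p for _, _, p in tasks))
    points = sorted({d + j * p
                     for _, d, p in tasks
                     for j in range(hyper // p + 1)
                     if d + j * p <= hyper})
    return max([system_util(tasks)] + [f(tasks, t) for t in points])


tasks = [(1, 2, 4), (1, 3, 6)]           # illustrative task system
u, l, dens = system_util(tasks), load_by_enumeration(tasks), system_density(tasks)
print(u, l, dens)                        # 5/12 2/3 5/6
assert u <= l <= dens
```

For this example the load (2/3) lies strictly between the utilization (5/12) and the density (5/6), which is the situation Figure 3.2 depicts.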


3.4.2 An Exact Algorithm for Calculating Load

To calculate load(τ), we must limit the number of values of t for which we evaluate

f(τ, t) to a finite number. It may seem that f(τ, t) needs to be checked at an infinite

number of t values. However, the following two observations are useful in showing

that only a finite number of values need to be checked.

1. The maximum value of f(τ, t) only occurs at “step” points (Lemma 3.4).

Therefore, the set of potential test points is countable.

2. f(τ, t) is maximized prior to τ ’s hyperperiod (Lemma 3.5). Therefore, the

maximum test point has a bounded value.

The following lemma formally restates and proves the first observation.

Lemma 3.4

$$\max_{t>0} f(\tau, t) = \max\{ f(\tau, j p_i + d_i) \mid i = 1, \ldots, n;\ j = 0, 1, \ldots \}$$

Proof: Between consecutive step points each dbf(τi, t) is constant, so f(τ, t) is locally non-increasing with respect to t there. Attention can therefore be limited to the values of t for which the derivative is discontinuous, i.e., t = jpi + di for non-negative integer values j.

We may now show that the hyperperiod provides an upper bound on the maximum

possible t that we need to evaluate f(τ, t).

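Lemma 3.5 Let $L = \mathrm{LCM}_{i=1}^{n} p_i$. If load(τ) > system-util(τ) then load(τ) = f(τ, t) for some t ≤ L.

Proof: The proof is by contradiction. Let cL + x be the least value (where c ≥ 1 and 0 < x < L) for which f(τ, cL + x) = load(τ) > system-util(τ) and f(τ, cL + x) > f(τ, t) for every t < cL + x. If the lemma is false, there must be such a value.

By definition of function f,

$$\begin{aligned}
f(\tau, cL + x) &= \frac{\sum_{i=1}^{n} \mathrm{dbf}(\tau_i, cL + x)}{cL + x} \\
&= \frac{\sum_{i=1}^{n} \max\left(0, \left(\left\lfloor \frac{cL + x - d_i}{p_i} \right\rfloor + 1\right) e_i\right)}{cL + x} \\
&= \frac{\sum_{i=1}^{n} \max\left(0, \left(\frac{cL}{p_i} + \left\lfloor \frac{x - d_i}{p_i} \right\rfloor + 1\right) e_i\right)}{cL + x} \\
&\le \frac{cL \cdot \text{system-util}(\tau) + \sum_{i=1}^{n} \mathrm{dbf}(\tau_i, x)}{cL + x}.
\end{aligned}$$

Since f(τ, cL + x) equals load(τ), the above equations imply that $cL \cdot \text{system-util}(\tau) + \sum_{i=1}^{n} \mathrm{dbf}(\tau_i, x)$ is at least (load(τ) × cL) + (load(τ) × x). Due to the assumption that load(τ) exceeds system-util(τ),

$$\begin{aligned}
& cL \cdot \text{system-util}(\tau) + \sum_{i=1}^{n} \mathrm{dbf}(\tau_i, x) > cL \cdot \text{system-util}(\tau) + \mathrm{load}(\tau) \cdot x \\
\Rightarrow\ & \sum_{i=1}^{n} \mathrm{dbf}(\tau_i, x) > \mathrm{load}(\tau) \cdot x \\
\Rightarrow\ & f(\tau, x) > \mathrm{load}(\tau).
\end{aligned}$$

The last inequality contradicts our supposition that for all t < cL + x, f(τ, t) < f(τ, cL + x). Thus, the lemma holds.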

The following corollary to Lemma 3.5 and Lemma 3.2 shows that if f(τ, t) does

not exceed system-util(τ) prior to the hyperperiod of τ , we may infer that load(τ) =

system-util(τ).

Corollary 3.1 If f(τ, t) ≤ system-util(τ) for all t ≤ L, then load(τ) = system-util(τ).


Lemmas 3.4 and 3.5 and Corollary 3.1 imply that we need to only check f(τ, t)

up to the task system’s hyperperiod. We can further limit the number of values t

that need to be considered in the exact computation of load if we iteratively make use

of the information we have gained by looking at the value of f(τ, t) up to any given

point in the computation. The next lemma formalizes this concept.

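Lemma 3.6 If f(τ, t) ≥ system-util(τ) + γ for some γ > 0 then

$$t \le \frac{\text{system-util}(\tau) \cdot \max_{\tau_i \in \tau}(p_i - d_i)}{\gamma}.$$

Proof: Observe that $\max_{\tau_i \in \tau}(p_i - d_i) > 0$; otherwise, for all τi ∈ τ,

$$\begin{aligned}
& \left(\left\lfloor \frac{t - d_i}{p_i} \right\rfloor + 1\right) e_i \le \left(\frac{t - d_i}{p_i} + 1\right) e_i \le u_i t - u_i d_i + u_i p_i \le u_i t \\
\Rightarrow\ & \mathrm{dbf}(\tau_i, t) \le u_i t \\
\Rightarrow\ & f(\tau, t) \le \text{system-util}(\tau).
\end{aligned}$$

The last statement contradicts the supposition of the lemma. Thus, there must exist a τi ∈ τ such that pi − di > 0. Therefore,

$$\begin{aligned}
\text{system-util}(\tau) + \gamma \le f(\tau, t) &= \frac{\sum_{i=1}^{n} \max\left(0, \left(\left\lfloor \frac{t - d_i}{p_i} \right\rfloor + 1\right) e_i\right)}{t} \\
&\le \sum_{i:\, d_i < t} u_i + \sum_{i:\, d_i < t} u_i \frac{\max_{\tau_j \in \tau}(p_j - d_j)}{t} \\
&\le \text{system-util}(\tau) \left(1 + \frac{\max_{\tau_i \in \tau}(p_i - d_i)}{t}\right) \\
\Rightarrow\ \gamma &\le \frac{\text{system-util}(\tau) \cdot \max_{\tau_i \in \tau}(p_i - d_i)}{t} \\
\Rightarrow\ t &\le \frac{\text{system-util}(\tau) \cdot \max_{\tau_i \in \tau}(p_i - d_i)}{\gamma}.
\end{aligned}$$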

Calculate–load(τ, ε)
     ▷ Interpret a divide by zero as infinity; ε can be zero for the exact case.
 1   limit ← min{ LCM_{i=1..n} pi , (Σ_{i=1..n} ei) / ε , (system-util(τ) · max_{τi∈τ}(pi − di)) / ε }
 2   fmax ← system-util(τ);
 3   for each t = jpi + di, in increasing order, loop
         exit when t ≥ limit;
 4       if f(τ, t) > fmax then
 5           fmax ← f(τ, t);
 6           limit ← min(limit, (system-util(τ) · max_{τi∈τ}(pi − di)) / (fmax − system-util(τ) + ε))
 7           if fmax > system-density(τ) − ε then return system-density(τ);
 8       end if;
 9   end loop;
10   return fmax;

Figure 3.3: Pseudo-code for determining demand-based load within a value of ε. When ε equals zero, the algorithm calculates the exact value of the demand-based load; otherwise, it is an approximation.

Our algorithm for calculating load(τ) is represented in Figure 3.3 by

calculate–load. The subroutine exactly calculates load(τ) when passed an ε parameter equal to zero; note that when ε = 0, Line 1 always sets limit to the hyperperiod, and Line 6 never updates limit (non-zero ε values will be discussed in Section 3.4.3.1). Lemmas 3.4, 3.5, and 3.6, and Corollary 3.1 show that calculate–load is correct when ε = 0.
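The pseudo-code of Figure 3.3 can be transcribed into runnable Python roughly as follows. This is a sketch rather than a reference implementation: it assumes tasks are `(e, d, p)` tuples of positive integers, the names `calculate_load`, `step_points`, and `divide` are invented for this example, and a zero denominator is mapped to infinity as the comment in the figure prescribes. It needs Python 3.9 or later for `math.lcm`.

```python
from fractions import Fraction
from heapq import merge
from math import inf, lcm


def dbf(task, t):
    e, d, p = task
    return 0 if t < d else ((t - d) // p + 1) * e


def f(tasks, t):
    return Fraction(sum(dbf(task, t) for task in tasks)) / Fraction(t)


def step_points(d, p):
    """Yield d, d + p, d + 2p, ...: the points where one task's dbf steps up."""
    t = d
    while True:
        yield t
        t += p


def divide(num, den):
    """Division that treats a zero denominator as infinity (see Figure 3.3)."""
    return inf if den == 0 else Fraction(num) / Fraction(den)


def calculate_load(tasks, eps=0):
    util = sum(Fraction(e, p) for e, d, p in tasks)
    density = sum(Fraction(e, min(d, p)) for e, d, p in tasks)
    slack = max(p - d for e, d, p in tasks)
    hyper = lcm(*(p for e, d, p in tasks))
    # Line 1: hyperperiod, Lemma 3.7 bound, and Lemma 3.6 bound with gamma = eps.
    limit = min(hyper,
                divide(sum(e for e, d, p in tasks), eps),
                divide(util * slack, eps))
    fmax = util                                            # Line 2
    # Line 3: lazily merge the per-task step points in increasing order.
    for t in merge(*(step_points(d, p) for e, d, p in tasks)):
        if t >= limit:
            break
        value = f(tasks, t)
        if value > fmax:                                   # Line 4
            fmax = value                                   # Line 5
            limit = min(limit, divide(util * slack, fmax - util + eps))  # Line 6
            if fmax > density - eps:                       # Line 7
                return density
    return fmax                                            # Line 10


print(calculate_load([(1, 2, 4), (1, 3, 6)]))              # 2/3
```

Because the hyperperiod is always one of the candidates for `limit`, the loop over the (infinite) merged stream of step points terminates for any ε ≥ 0.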

§ Time Complexity. The “worst-case scenario” with respect to

calculate–load(τ, 0)’s execution time occurs when load(τ) = system-util(τ); in this

case, f(τ, t) ≤ system-util(τ) for all t > 0. Therefore, fmax ≤ system-util(τ) for all

iterations of calculate–load(τ, 0). Observe that Line 6 of calculate–load updates

limit only if fmax − system-util(τ) > 0 (assuming ε = 0). Consequently, limit is never updated after Line 1 sets it to τ’s hyperperiod, and the algorithm calculates f(τ, t) at every step point t = jpi + di up to the task system’s hyperperiod. If p1, p2, . . . , pn are pairwise relatively prime integers greater than 1, then $\mathrm{LCM}_{i=1}^{n} p_i = \prod_{i=1}^{n} p_i \ge 2^n$. Therefore, the number of times f(τ, t) is evaluated in calculate–load(τ, 0) is potentially exponential in the number of tasks in the task system.

3.4.3 Approximation Algorithms

We can accelerate the convergence of our demand-based load calculation if we permit

a bounded level of inaccuracy in our calculation. That is, we can significantly reduce

the number of values of t we must consider if we allow our calculated value of load(τ)

to lie within a specified range (or “tolerance”) of the actual value.

Let ε denote a tolerance within which load(τ) is to be approximated, for arbitrary ε > 0. In this section, we propose two algorithms that calculate load(τ) to within an additive error ε of its actual value. The first algorithm, discussed in Section 3.4.3.1, is

a pseudo-polynomial-time algorithm based on the idea of iterative convergence. The

second algorithm, presented in Section 3.4.3.2, is a polynomial-time approximation

scheme for determining load.

3.4.3.1 Pseudo-polynomial-time Approximation Scheme

The effect of introducing a bounded amount of inaccuracy allows us to limit the

number of values of t at which we evaluate f(τ, t). A useful observation is that after

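a sufficiently large value of t, f(τ, t) does not exceed system-util(τ) by more than ε. The next lemma quantifies the value of t for which f(τ, t) is within ε of system-util(τ).

Lemma 3.7 If $t \ge \frac{\sum_{i=1}^{n} e_i}{\epsilon}$, then f(τ, t) ≤ system-util(τ) + ε.

Proof: For any τi ∈ τ with t ≥ di, $\mathrm{dbf}(\tau_i, t) = \left(\left\lfloor \frac{t - d_i}{p_i} \right\rfloor + 1\right) e_i$; otherwise, if t < di, dbf(τi, t) = 0. Since ⌊x⌋ ≤ x, the function f(τ, t) can be bounded above as follows:

$$\begin{aligned}
f(\tau, t) &\le \sum_{i:\, d_i < t} \frac{\frac{t - d_i}{p_i} e_i + e_i}{t} \le \sum_{i:\, d_i < t} u_i \left(1 - \frac{d_i}{t}\right) + \frac{\sum_{i=1}^{n} e_i}{t} \\
\Rightarrow\ f(\tau, t) &\le \text{system-util}(\tau) - \frac{\sum_{i:\, d_i < t} u_i d_i}{t} + \frac{\sum_{i=1}^{n} e_i}{t}.
\end{aligned}$$

It follows that for $t \ge \frac{\sum_{i=1}^{n} e_i}{\epsilon}$,

$$f(\tau, t) \le \text{system-util}(\tau) - \frac{\sum_{i:\, d_i < t} u_i d_i}{t} + \epsilon.$$

Since $-\frac{\sum_{i:\, d_i < t} u_i d_i}{t} \le 0$, for all $t \ge \frac{\sum_{i=1}^{n} e_i}{\epsilon}$ the following condition is true: f(τ, t) ≤ system-util(τ) + ε.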

Since fmax never decreases and is initially system-util(τ) in calculate–load(τ, ε), Lemma 3.7 implies that we need only evaluate values of $t \le \frac{\sum_{i=1}^{n} e_i}{\epsilon}$. This optimization is reflected in Line 1 of calculate–load(τ, ε).

In addition, we can use Lemma 3.6 to further reduce the number of steps taken by calculate–load(τ, ε). Suppose at step i in an iterative approximate computation of load(τ) the maximum value of f(τ, t) has been computed over all values t ≤ ti, and that value is fmax. The computation can terminate unless there is a value t > ti such that f(τ, t) > fmax + ε. Letting γ = fmax − system-util(τ) + ε in Lemma 3.6 above we have

$$t \le \frac{\text{system-util}(\tau) \cdot \max_{\tau_i \in \tau}(p_i - d_i)}{f_{\max} - \text{system-util}(\tau) + \epsilon}.$$

The above observations are applied in Lines 1 and 6 of the algorithm calculate–load(τ, ε) for approximate computation of load(τ) (shown in Figure 3.3). Line 7 allows the algorithm to terminate early if fmax is within ε of our upper bound of system-density(τ).

§ Time Complexity. The number of values of t for which we must evaluate f(τ, t) in calculate–load(τ, ε) is at most

$$\min\left\{ \frac{\sum_{i=1}^{n} e_i}{\epsilon},\ \frac{\text{system-util}(\tau) \cdot \max_{\tau_i \in \tau}(p_i - d_i)}{\epsilon} \right\}.$$

Both of these terms are polynomial in the task system’s parameters and 1/ε, and each evaluation of f(τ, t) requires O(n) time. Therefore, calculate–load(τ, ε) represents a pseudo-polynomial-time approximation scheme (PPTAS).

3.4.3.2 Polynomial-time Approximation Scheme

Further theoretical reduction in the number of potential values for which to evaluate

f(τ, t) can be achieved by an approximation to dbf(τi, t). Our approximation will

allow us to “skip” intermediate test points in the calculation of load(τ). We may

approximate dbf(τi, t) for each task τi by “tracking” it exactly for ki + 1 steps (how

to pick ki will be discussed shortly), and then using the tightest linear upper bound

of dbf(τi, t) with slope ui after ki + 1 steps. A similar approximation was defined by

Albers and Slomka (Albers and Slomka, 2004) (based on an approximation introduced

by (Devi, 2003)) for uniprocessor feasibility analysis. Formally, the approximation can

be expressed by:

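$$\mathrm{dbf}^*(\tau_i, t, k_i) \stackrel{\mathrm{def}}{=} \begin{cases} \mathrm{dbf}(\tau_i, t), & \text{if } t < k_i p_i + d_i, \\ e_i + (t - d_i)\, u_i, & \text{otherwise.} \end{cases} \tag{3.8}$$

We can now describe approximations to load(τ) using dbf∗(τi, t, ki):

$$f^*(\tau_i, t) \stackrel{\mathrm{def}}{=} \frac{\mathrm{dbf}^*(\tau_i, t, k_i)}{t}, \qquad f^*(\tau, t) \stackrel{\mathrm{def}}{=} \sum_{i=1}^{n} f^*(\tau_i, t),$$

and

$$\mathrm{load}^*(\tau) \stackrel{\mathrm{def}}{=} \max_{t>0} f^*(\tau, t).$$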

Visual examples of f ∗(τi, t) are shown in Figure 3.1 with ki = 0.
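Equation 3.8 translates almost directly into code. The following sketch assumes tasks are `(e, d, p)` tuples of positive integers; the names `dbf` and `dbf_star` are chosen for this example only.

```python
from fractions import Fraction


def dbf(task, t):
    """Exact demand bound function for a hypothetical (e, d, p) task tuple."""
    e, d, p = task
    return 0 if t < d else ((t - d) // p + 1) * e


def dbf_star(task, t, k):
    """Approximate demand bound function of Equation 3.8.

    Tracks dbf exactly while t < k*p + d, then follows the line of slope
    u = e/p that passes through (d, e).
    """
    e, d, p = task
    if t < k * p + d:
        return dbf(task, t)
    return e + (t - d) * Fraction(e, p)


task = (2, 5, 6)                      # task (b) of Figure 3.1
print(dbf_star(task, 8, 0))           # 3 (linear part; the exact dbf is 2)
print(dbf_star(task, 8, 1))           # 2 (still tracking dbf exactly)
```

With k = 0 the approximation switches to the linear bound at the first deadline, which corresponds to the dotted lines described in the caption of Figure 3.1; larger k postpones the switch and tightens the approximation.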

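It can be shown that if we pick $k_i \stackrel{\mathrm{def}}{=} \max\left(\left\lceil \frac{n u_i}{\epsilon} - \frac{d_i}{p_i} \right\rceil, 0\right)$, then load∗(τ) is within ε of load(τ). The following lemma proves this assertion.

Lemma 3.8 For sporadic task system τ, if for all τi ∈ τ, $k_i = \max\left(\left\lceil \frac{n u_i}{\epsilon} - \frac{d_i}{p_i} \right\rceil, 0\right)$, then load(τ) ≤ load∗(τ) ≤ load(τ) + ε.

Proof: To prove the lemma it suffices to show for all t > 0 that f(τ, t) ≤ f∗(τ, t) ≤ f(τ, t) + ε. Obviously, f(τ, t) ≤ f∗(τ, t); so, we will focus on showing that f∗(τ, t) ≤ f(τ, t) + ε for the ki specified in the lemma.

Consider the following partition of $\tau = \tau_{\mathrm{dbf\text{-}exact}}(t) \cup \tau_{\mathrm{dbf\text{-}approx}}(t)$ where $\tau_{\mathrm{dbf\text{-}approx}}(t) \stackrel{\mathrm{def}}{=} \{\tau_i \mid k_i p_i + d_i \le t\}$ and $\tau_{\mathrm{dbf\text{-}exact}}(t) \stackrel{\mathrm{def}}{=} \tau - \tau_{\mathrm{dbf\text{-}approx}}(t)$. Informally, $\tau_{\mathrm{dbf\text{-}exact}}(t)$ is the set of tasks that have not taken ki + 1 exact steps of dbf∗ at time t; and, $\tau_{\mathrm{dbf\text{-}approx}}(t)$ is the set of tasks where dbf∗(τi, t, ki) uses the linear approximation to dbf(τi, t) at time t.

Observe that dbf∗(τi, t, ki) ≤ dbf(τi, t) + ei for all t > 0 and τi ∈ τ. Therefore,

$$\begin{aligned}
f^*(\tau, t) &= \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}exact}}} \frac{\mathrm{dbf}(\tau_i, t)}{t} + \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} \frac{\mathrm{dbf}^*(\tau_i, t, k_i)}{t} \\
&\le \sum_{i=1}^{n} \frac{\mathrm{dbf}(\tau_i, t)}{t} + \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} \frac{e_i}{t} \\
&= f(\tau, t) + \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} \frac{e_i}{t}.
\end{aligned}$$

Note that for all $\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}$, the value of t is lower bounded by kipi + di. This implies $t \ge \left(\frac{n u_i}{\epsilon} - \frac{d_i}{p_i}\right) p_i + d_i = \frac{n e_i}{\epsilon}$. Thus,

$$f^*(\tau, t) \le f(\tau, t) + \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} \frac{\epsilon}{n} \le f(\tau, t) + \epsilon.$$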

The immediate implication of the previous lemma is that we may approximate

load(τ) by effectively computing load∗(τ).

Informally, to determine load∗(τ) we need to only evaluate f ∗(τ, t) at times t where

the derivative of f ∗ is discontinuous. The points at which such discontinuities occur

for a given sporadic task system τ are:

$$S(\tau, \epsilon) \stackrel{\mathrm{def}}{=} \bigcup_{i=1}^{n} \{ j p_i + d_i \mid j = 0, \ldots, k_i \}.$$

The next lemma formalizes the assertion that to correctly calculate load∗(τ) it is sufficient to only evaluate f∗(τ, t) for values of t ∈ S(τ, ε). Let t1, t2 ∈ S(τ, ε) (t1 < t2) be adjacent if there does not exist a t′ ∈ S(τ, ε) such that t1 < t′ < t2.

Lemma 3.9 Consider any two adjacent elements t1, t2 ∈ S(τ, ε) ∪ {0} where t1 < t2; for all t such that t1 < t < t2, the following condition holds,

$$f^*(\tau, t) \le \max\left(f^*(\tau, t_1),\ f^*(\tau, t_2),\ \text{system-util}(\tau)\right).$$

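Proof: Let $\tau_{\mathrm{dbf\text{-}approx}}(t)$ and $\tau_{\mathrm{dbf\text{-}exact}}(t)$ be the partition of τ defined in Lemma 3.8. Consider any t in the interval (t1, t2). Clearly, $\tau_{\mathrm{dbf\text{-}exact}}(t_1) = \tau_{\mathrm{dbf\text{-}exact}}(t)$. Moreover, there does not exist t′ in (t1, t2), $\tau_i \in \tau_{\mathrm{dbf\text{-}exact}}(t_1)$, and ℓ ∈ N+ such that t′ = ℓpi + di. This implies for all $\tau_i \in \tau_{\mathrm{dbf\text{-}exact}}(t)$,

$$f^*(\tau_i, t) = \frac{\mathrm{dbf}(\tau_i, t_1)}{t} = \frac{\mathrm{dbf}(\tau_i, t_1)/t_1}{t/t_1} = \frac{t_1 f^*(\tau_i, t_1)}{t}.$$

Also, $\tau_{\mathrm{dbf\text{-}approx}}(t_1) = \tau_{\mathrm{dbf\text{-}approx}}(t)$ which implies for all $\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}(t)$ that $f^*(\tau_i, t) = \frac{t_1 f^*(\tau_i, t_1)}{t} + \frac{u_i (t - t_1)}{t}$. So, for t in the interval (t1, t2), we may express f∗(τ, t) in terms of this partition:

$$\begin{aligned}
f^*(\tau, t) &= \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}exact}}} f^*(\tau_i, t) + \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} f^*(\tau_i, t) \\
&= \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}exact}}} \frac{t_1 f^*(\tau_i, t_1)}{t} + \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} \left[ \frac{t_1 f^*(\tau_i, t_1)}{t} + \frac{u_i (t - t_1)}{t} \right] \\
&= \frac{t_1 f^*(\tau, t_1) + (t - t_1) \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} u_i}{t}.
\end{aligned}$$

Let us now look at the partial derivative of f∗(τ, t) with respect to t:

$$\frac{\partial f^*(\tau, t)}{\partial t} = \frac{t \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} u_i}{t^2} - \frac{t_1 f^*(\tau, t_1) + (t - t_1) \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} u_i}{t^2} = \frac{t_1 \left( \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} u_i - f^*(\tau, t_1) \right)}{t^2}.$$

Therefore, if $f^*(\tau, t_1) \ge \sum_{\tau_i \in \tau_{\mathrm{dbf\text{-}approx}}} u_i$, then f∗(τ, t) is non-increasing and f∗(τ, t) ≤ f∗(τ, t1). Otherwise, f∗(τ, t) is bounded from above by system-util(τ). To complete the lemma, consider t in the interval (0, min{S}) (i.e., 0 and min{S} are adjacent). In this case, f∗(τ, t) = 0 ≤ f∗(τ, min{S}).

Let $t_{\max} \stackrel{\mathrm{def}}{=} \max\{t \in S(\tau, \epsilon)\}$. By the previous lemma, values of t in the interval (0, tmax) that are not in S(τ, ε) do not contribute to the calculation of load∗(τ) (i.e., f∗(τ, t) ≠ load∗(τ)). The next lemma shows that values of t in the interval (tmax, ∞) also do not contribute to load∗(τ):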

Lemma 3.10 For all t > tmax, the following inequality holds

$$\max\left(f^*(\tau, t_{\max}),\ \text{system-util}(\tau)\right) \ge f^*(\tau, t).$$

Proof: Define $\tau_{\mathrm{dbf\text{-}approx}}(t_{\max})$ as in Lemma 3.8. For all t > tmax and τi ∈ τ, t > kipi + di. This implies that $\tau_{\mathrm{dbf\text{-}approx}}(t_{\max})$ equals τ. Therefore,

$$f^*(\tau, t) = \frac{t_{\max} f^*(\tau, t_{\max}) + (t - t_{\max}) \cdot \text{system-util}(\tau)}{t}.$$

If f∗(τ, tmax) > system-util(τ), then f∗(τ, t) is decreasing in the interval (tmax, ∞), implying f∗(τ, tmax) ≥ f∗(τ, t); otherwise, f∗(τ, t) ≤ system-util(τ).

PTAS–load(τ, ε)
 1   fmax ← system-util(τ) + ε;
     ▷ ki def= max(⌈n·ui/ε − di/pi⌉, 0) for all τi ∈ τ.
 2   for each t ∈ S(τ, ε), in increasing order loop
 3       if f∗(τ, t) > fmax then
 4           fmax ← f∗(τ, t);
 5           if fmax ≥ system-density(τ) then return system-density(τ);
 6       end if;
 7   end loop;
 8   return fmax;

Figure 3.4: Pseudo-code for determining demand-based load within a value of ε in polynomial-time.

The algorithm ptas–load(τ, ε) is presented in Figure 3.4. Lemma 3.8 showed that to approximate load(τ) we could calculate load∗(τ) instead (via evaluating f∗(τ, t)). Lemmas 3.9 and 3.10 showed that we only need to consider the values of t in the set S(τ, ε). By these lemmas, ptas–load(τ, ε) correctly approximates load(τ) to within an additive value of ε. It should be noted that the heuristics of Lemmas 3.5, 3.6, and 3.7 can also be applied to ptas–load(τ, ε) to further speed up computation.

The number of iterations of ptas–load(τ, ε) is entirely based on the size of the set S(τ, ε). The size of the set is

$$\sum_{i=1}^{n} (k_i + 1) \in O\left(\frac{n^2}{\epsilon}\right),$$

because $k_i \le \lceil n/\epsilon \rceil$. A straightforward implementation of f∗(τ, t) has O(n) complexity. Therefore, the total time complexity of ptas–load(τ, ε) is $O(n^3/\epsilon)$. The runtime of the algorithm is polynomial in the number of tasks and 1/ε, and independent of the task parameters. Therefore, ptas–load(τ, ε) represents a polynomial-time approximation scheme.
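A direct transcription of Figure 3.4 into Python might look like the sketch below. It assumes tasks are `(e, d, p)` tuples of positive integers and that ε > 0 is given as an exact rational; the name `ptas_load` and the helper names are invented for this example, and none of the optional speed-ups from Lemmas 3.5 to 3.7 are applied.

```python
from fractions import Fraction
from math import ceil


def dbf(task, t):
    e, d, p = task
    return 0 if t < d else ((t - d) // p + 1) * e


def dbf_star(task, t, k):
    """Equation 3.8: exact for t < k*p + d, linear with slope e/p afterwards."""
    e, d, p = task
    if t < k * p + d:
        return dbf(task, t)
    return e + (t - d) * Fraction(e, p)


def ptas_load(tasks, eps):
    eps = Fraction(eps)
    n = len(tasks)
    util = sum(Fraction(e, p) for e, d, p in tasks)
    density = sum(Fraction(e, min(d, p)) for e, d, p in tasks)
    # k_i = max(ceil(n*u_i/eps - d_i/p_i), 0), as in Lemma 3.8.
    ks = [max(ceil(n * Fraction(e, p) / eps - Fraction(d, p)), 0)
          for e, d, p in tasks]
    # S(tau, eps): the first k_i + 1 step points of every task.
    points = sorted({j * p + d
                     for (e, d, p), k in zip(tasks, ks)
                     for j in range(k + 1)})

    def f_star(t):
        total = sum(dbf_star(task, t, k) for task, k in zip(tasks, ks))
        return Fraction(total) / Fraction(t)

    fmax = util + eps                       # Line 1
    for t in points:                        # Line 2
        value = f_star(t)
        if value > fmax:                    # Line 3
            fmax = value                    # Line 4
            if fmax >= density:             # Line 5
                return density
    return fmax                             # Line 8


approx = ptas_load([(1, 2, 4), (1, 3, 6)], Fraction(1, 10))
print(approx)
```

For the illustrative two-task system used here the exact load is 2/3, so by Lemma 3.8 the returned value should lie between 2/3 and 2/3 + ε.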

3.5 Summary

As we observed in Chapter 2, the system utilization and system density parameters of a task system do not effectively characterize the computational demand of a sporadic task system and more general task systems. For the purposes of developing a better characterization, we propose in this chapter using the well-known parameters load and max-job-density. In this chapter, we showed that these two metrics are closely related to the feasibility of a recurrent task system on a multiprocessor platform; later chapters will provide further support for this claim. Furthermore, we described how both load and max-job-density may be efficiently computed for recurrent task systems in general. We described, in the last section of this chapter, how to compute load(τ) of a sporadic task system by providing exact and approximate algorithms. We demonstrated that it is possible to approximate load(τ) to within an additive constant ε > 0 in polynomial time.

Chapter 4

The Restricted- and Full-Migration Feasibility Analysis of General Task Systems

Given a real-time instance I, determining whether I is restricted-migration feasi-

ble upon a given number of processors is at least as hard as bin-packing (Johnson,

1973), and so is NP-hard in the strong sense. Exponential-time algorithms are known

for solving this problem. By applying network flow techniques, migratory feasibility

analysis can be performed in time polynomial in the number of jobs in instance I.

A migratory, static schedule for instance I may also be obtained from these tech-

niques. However, both the exponential-time restricted-migration feasibility analysis

and the polynomial-time full-migration feasibility analysis algorithms require that all

parameters of all the jobs in instance I be known beforehand. As discussed in the

introduction, this may not always be possible since many real-time applications are

comprised of partially-specified recurrent real-time tasks.

In the last chapter, we observed (Lemma 3.1) that load and max-job-density may be used as necessary conditions for the feasibility of real-time instances on multiprocessor platforms. In this chapter, we derive conditions based on load and max-job-density

that are sufficient for a real-time instance to be feasible upon an m-processor plat-

form. Both restricted-migration and full-migration systems are considered. Further-

more, the feasibility results contained in this chapter have a constant factor resource-

augmentation approximation ratio (discussed in the introduction of this dissertation).

The feasibility results of this chapter are obtained for real-time instances; by way of

Sections 3.3 and 3.4 of the previous chapter, these results are also directly relevant to

determining the feasibility of partially-specified recurrent task systems because load

and max-job-density may be obtained for these systems. Section 4.1 will derive suf-

ficient feasibility conditions for restricted-migration systems. Section 4.2 will obtain

conditions for full-migration systems.

4.1 Restricted-Migration Feasibility

We now derive sufficient conditions for determining whether a given real-time instance

I is restricted-migration feasible upon a specified number of processors based on the

parameters max-job-density(I) and load(I). Since migratory scheduling is more gen-

eral than restricted-migration scheduling (in the sense that every restricted-migration

schedule is also a migratory schedule in which no migrations happened to occur), these

sufficient conditions are sufficient conditions for migratory feasibility analysis, too.

Suppose that a given real-time instance I = J1, J2, . . . is infeasible upon m unit-

capacity processors. Without loss of generality, let us assume that the jobs are indexed

by non-decreasing order of relative deadline — i.e., Di ≤ Di+1 for all i ≥ 1 (note,

that this is a different order than assumed in the introduction). We now describe

an (off-line) multiprocessor algorithm for scheduling this collection of jobs; since the

collection of jobs has no schedule (by virtue of being infeasible), it is necessary that

at some point during its execution this algorithm will report failure.


Our algorithm considers jobs in the order J1, J2, J3, . . ., i.e., in non-decreasing order

of relative deadline. In considering a job Ji, the goal is to assign it to a processor in

such a manner that all the jobs assigned to each processor are preemptive uniprocessor

feasible.

We now describe in detail the algorithm for assigning job Ji:

• For each processor πk, 1 ≤ k ≤ m, let I(πk) denote the jobs that have already

been assigned to this processor; our assignment algorithm ensures that I(πk) is

preemptive uniprocessor feasible.

• Assign Ji to any processor πk such that doing so retains feasibility on that

processor; if no such πk exists, declare failure and exit.

A pseudo-code representation of this job-assignment algorithm is presented in

Figure 4.1.

jobAssign
     ▷ There are m unit-capacity processors, denoted π1, π2, . . . , πm
     ▷ I(πk) denotes the jobs already assigned to processor πk
 1   for i ← 1, 2, . . .
 2       if there is a processor πk such that (I(πk) ∪ {Ji}) is preemptive uniprocessor feasible
 3       then ▷ assign Ji to πk
 4           I(πk) ← I(πk) ∪ {Ji}
 5       else return assignment failed

Figure 4.1: Pseudo-code for job-assignment algorithm.
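For a finite collection of jobs, the job-assignment procedure can be sketched in Python as follows. The text does not say how preemptive uniprocessor feasibility is checked, so this sketch assumes the standard processor-demand condition for independent jobs with arrival times and deadlines (the jobs that must run entirely inside any interval may not need more execution than the interval is long). The `(A, E, D)` tuple layout and the names `uniprocessor_feasible` and `job_assign` are likewise assumptions of this example.

```python
from fractions import Fraction


def uniprocessor_feasible(jobs):
    """Processor-demand check for a finite set of (A, E, D) jobs.

    It is enough to test intervals that start at an arrival time and end
    at an absolute deadline.
    """
    starts = {a for a, e, d in jobs}
    ends = {a + d for a, e, d in jobs}
    for t1 in starts:
        for t2 in ends:
            if t2 > t1:
                demand = sum(e for a, e, d in jobs if a >= t1 and a + d <= t2)
                if demand > t2 - t1:
                    return False
    return True


def job_assign(jobs, m):
    """Assign jobs to m processors in non-decreasing order of relative deadline.

    Returns a list of per-processor job lists, or None on assignment failure.
    """
    processors = [[] for _ in range(m)]
    for job in sorted(jobs, key=lambda j: j[2]):
        for assigned in processors:
            if uniprocessor_feasible(assigned + [job]):
                assigned.append(job)      # assign the job to this processor
                break
        else:
            return None                   # no processor can accommodate the job
    return processors


jobs = [(0, 2, 2), (0, 2, 2), (0, 1, 4)]  # illustrative (A, E, D) triples
print(job_assign(jobs, 2))  # [[(0, 2, 2), (0, 1, 4)], [(0, 2, 2)]]
print(job_assign(jobs, 1))  # None
```

The sketch picks the first processor that can accommodate the job; the algorithm as described permits any such processor.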

4.1.1 Algorithm Analysis

Now, we will examine the conditions under which Algorithm JobAssign may fail.

By ensuring that these conditions are never satisfied, we can then obtain a sufficient


schedulability test for feasibility analysis of real-time instances.

Theorem 4.1 Let I denote a real-time instance satisfying

$$\mathrm{load}(I) \le \frac{1}{3}\left(m - (m - 1) \cdot \text{max-job-density}(I)\right). \tag{4.1}$$

Algorithm JobAssign successfully assigns every job to some processor, on a platform

comprised of m unit-capacity processors.

Proof:

Let us assume that Algorithm JobAssign does not successfully assign all jobs in

I, and let Ji denote the first job for which Algorithm JobAssign returns assignment

failed.

For each processor πk, I(πk) is feasible by construction. Let Wk denote the min-

imum amount of execution that is performed over the interval [Ai, Ai + Di) in any

preemptive uniprocessor schedule for I(πk) that meets all deadlines. Since Ji cannot

be accommodated on any processor (i.e., I(πk)∪ {Ji} is infeasible for all processors),

it must be the case that Wk > (Di − Ei) on each processor (otherwise, there would

be enough “idle time” with respect to I(πk) to complete Ji’s execution by its deadline, and I(πk) ∪ {Ji} would be feasible on πk). Summing over all m processors, we conclude that

$$\sum_{k=1}^{m} W_k > m \cdot (D_i - E_i). \tag{4.2}$$

Up until Algorithm JobAssign fails to assign job Ji to a processor, only jobs with

relative deadline at most Di have been assigned to processors. Therefore, no job in

I(πk) for any processor πk can have an arrival time of at most Ai and absolute deadline

exceeding Ai +Di; otherwise, such a job would have relative deadline greater than Di

and would contradict the assignment ordering of Algorithm JobAssign. Thus, all of the execution of jobs of I(πk) (for any processor πk) in the interval [Ai, Ai + Di) belongs to jobs that arrive or have a deadline in the interval [Ai, Ai + Di).

[Figure 4.2: timeline showing the intervals [As, As + Ds), [Ai, Ai + Di), and [Af, Af + Df) along the time axis.]

Figure 4.2: (Proof of Theorem 4.1.) Job Ji is being considered. Jobs Js and Jf are the previously-assigned jobs with earliest arrival time and latest absolute deadline respectively, whose “intervals” overlap with that of Ji.

Let Js denote the job with earliest arrival time that has already been assigned

to some processor, such that As + Ds > Ai; and let Jf denote the job with latest

(absolute) deadline that has already been assigned to some processor, such that Af <

Ai + Di (see Figure 4.2).

Since jobs are considered in order of their relative deadline and Js and Jf were

both assigned prior to job Ji being considered, it must be the case that Ds and Df

are both no larger than Di:

Ds ≤ Di and Df ≤ Di . (4.3)

From Figure 4.2, it is immediately evident that the interval [As, Af + Df ) is of size

no more than 3 × Di:

(Af + Df ) − As ≤ 3 · Di . (4.4)

Hence, all the work contributing to the expression on the left-hand side of Inequality 4.2 was generated by jobs that both arrive in, and have their deadline within, the interval [As, Af + Df), which is of length at most 3 × Di. By the definition of load(I), the maximum amount of work arriving in, and having deadlines within, an interval of this length is at most (Af + Df − As) × load(I), of which an amount Ei (corresponding to the execution requirement of Ji) does not contribute to the left-hand side of Inequality 4.2. Hence,

$$\begin{aligned}
& (A_f + D_f - A_s) \cdot \mathrm{load}(I) - E_i > m (D_i - E_i) \\
\Rightarrow\ & \text{(By Inequality 4.4 above)} \\
& 3 \cdot D_i \cdot \mathrm{load}(I) - E_i > m (D_i - E_i) \qquad (4.5) \\
\equiv\ & \mathrm{load}(I) > \frac{m - (m - 1) \frac{E_i}{D_i}}{3}. \qquad (4.6)
\end{aligned}$$

Note that the right-hand side decreases as $\frac{E_i}{D_i}$ increases; i.e., this condition is more likely to be satisfied for larger values of $\frac{E_i}{D_i}$. Since a sufficient condition for feasibility is that the negation of Condition 4.6 always hold, this is ensured by requiring that the negation of Condition 4.6 hold for the largest possible value of $\frac{E_i}{D_i}$, i.e., for $\frac{E_i}{D_i}$ = max-job-density(I):

$$\mathrm{load}(I) \le \frac{1}{3}\left(m - (m - 1) \cdot \text{max-job-density}(I)\right),$$

which is exactly Inequality 4.1 of the statement of the theorem.

Recall that our goal has been to obtain sufficient conditions for preemptive mul-

tiprocessor feasibility. Specifically, we had set out to obtain a feasibility region in the

two-dimensional space [0, 1] × [0, m], such that any instance I with max-job-density(I)

and load(I) lying in this region is guaranteed to be feasible. Theorem 4.1 yields

such a feasibility region: for given m, any instance I satisfying Equation 4.1 is guar-

anteed to be feasible upon m unit-capacity processors. In fact, it is also known

from uniprocessor scheduling theory that any instance I satisfying (load(I) ≤ 1 and

max-job-density(I) ≤ 1) is feasible upon a single unit-capacity processor; hence, such

an instance I is also feasible upon m unit-capacity processors for all m > 1. Thus,

we can modify Inequality 4.1 to come up with the following sufficient condition for feasibility:

$$\mathrm{load}(I) \le \max\left(1,\ \frac{1}{3}\left(m - (m - 1) \cdot \text{max-job-density}(I)\right)\right). \tag{4.7}$$

This feasibility region is depicted visually in Figure 4.3.

[Figure 4.3: plot of load(I) (vertical axis) against max-job-density(I) (horizontal axis), with marked points (0, 0) and (1, 1), vertical-axis marks at m/3, m²/(4m − 1), and 1, and a horizontal-axis mark at m/(4m − 1).]

Figure 4.3: The feasibility region, as defined by Equation 4.7. All real-time instances I for which (max-job-density(I), load(I)) lies beneath the solid line are guaranteed feasible on m unit-capacity processors.

By Lemma 3.1, recall that a necessary condition for instance I to be feasible

on m unit-capacity processors is that load(I) ≤ m and max-job-density(I) ≤ 1; In-

equality 4.7 provides sufficient conditions. The following corollary to Theorem 4.1

formalizes this fact:

Corollary 4.1 Any real-time instance I satisfying the following two conditions

$$\mathrm{load}(I) \le \frac{m^2}{4m - 1} \tag{4.8}$$

$$\text{max-job-density}(I) \le \frac{m}{4m - 1} \tag{4.9}$$

is feasible upon an m-processor unit-capacity multiprocessor platform under restricted-migration multiprocessor scheduling¹.

¹ An alternative statement of this corollary could be: the point $\left(\frac{m}{4m-1}, \frac{m^2}{4m-1}\right)$ lies in the feasibility region of Figure 4.3; this point is depicted by the dotted lines in Figure 4.3.

Proof: In order that instance I be feasible on m unit-capacity processors it is, by Equation 4.1, sufficient that

$$\begin{aligned}
& \mathrm{load}(I) \le \frac{1}{3} \times \left(m - (m - 1) \cdot \text{max-job-density}(I)\right) \\
\Leftarrow\ & \text{(From Condition 4.9)} \\
& \mathrm{load}(I) \le \frac{1}{3} \times \left(m - (m - 1) \frac{m}{4m - 1}\right) \\
\equiv\ & \mathrm{load}(I) \le \frac{m^2}{4m - 1},
\end{aligned}$$

which is true (by Condition 4.8 above).
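The two sufficient tests are simple enough to state as predicates. The sketch below is illustrative only: the function names and the use of exact rationals are choices made for this example, with `load` and `mjd` standing for precomputed values of load(I) and max-job-density(I).

```python
from fractions import Fraction


def theorem_4_1_holds(load, mjd, m):
    """Inequality 4.1: load(I) <= (m - (m - 1) * max-job-density(I)) / 3."""
    return load <= Fraction(1, 3) * (m - (m - 1) * mjd)


def corollary_4_1_holds(load, mjd, m):
    """Conditions 4.8 and 4.9: load <= m^2/(4m - 1) and mjd <= m/(4m - 1)."""
    bound = Fraction(m, 4 * m - 1)
    return load <= m * bound and mjd <= bound


m = 2
# For m = 2 the corner point of Corollary 4.1 is (2/7, 4/7), and it lies
# exactly on the boundary of Inequality 4.1.
print(corollary_4_1_holds(Fraction(4, 7), Fraction(2, 7), m))  # True
print(theorem_4_1_holds(Fraction(4, 7), Fraction(2, 7), m))    # True
print(theorem_4_1_holds(Fraction(5, 7), Fraction(2, 7), m))    # False
```

Any pair of values accepted by `corollary_4_1_holds` is also accepted by `theorem_4_1_holds`, which is what the proof above establishes; the converse does not hold, since Inequality 4.1 admits larger loads when max-job-density is small.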

The result in Corollary 4.1 above can be considered to be an analog of a “utiliza-

tion” test (Liu and Layland, 1973), or a “processor demand criterion” test (Baruah

et al., 1990b), for uniprocessor systems. According to Lemma 3.1, a necessary condition for a real-time instance I to be feasible on m unit-capacity processors is that

load(I) ≤ m and max-job-density(I) ≤ 1; by Corollary 4.1 above, it is sufficient

that load(I) ≤ m2/(4m − 1) and max-job-density(I) ≤ m/(4m − 1). Hence to within

a constant factor of less than four, we have obtained bounds on the values of load(I)

and max-job-density(I) that are needed (and suffice) for feasibility. The resource-

augmentation approximation ratio for the test of Corollary 4.1 is described in the

Theorem below.

Theorem 4.2 Any real-time instance I that is feasible (according to some hypothetically optimal feasibility test) upon m processors of unit capacity is guaranteed to satisfy the conditions of Equations 4.8 and 4.9 of Corollary 4.1 on an m-processor platform where each processor has speed $4 - \frac{1}{m}$.

Proof: Let I be a real-time instance that is feasible on platform Π with m unit-capacity processors. Consider I now scheduled upon platform $(4 - \frac{1}{m}) \cdot \Pi$ (i.e., platform Π where each processor has been sped up by a factor of $4 - \frac{1}{m}$). If we normalize the execution time of I to the speed of the processors of $(4 - \frac{1}{m}) \cdot \Pi$, we obtain a new representation of I (denoted by I′) on the faster processing platform; for each job Ji ∈ I, there is a job J′i ∈ I′ with identical arrival time and relative deadline, but with E′i equal to $\frac{m E_i}{4m - 1}$. By Equation 3.1, for all 0 ≤ t1 < t2, demand(I′, t1, t2) is equal to $\frac{m}{4m - 1} \cdot \mathrm{demand}(I, t_1, t_2)$; by Equations 3.3 and 3.2,

$$\mathrm{load}(I') = \frac{m}{4m - 1} \cdot \mathrm{load}(I), \qquad \text{max-job-density}(I') = \frac{m}{4m - 1} \cdot \text{max-job-density}(I).$$

Since I is feasible on Π then by Lemma 3.1, it must be that load(I) ≤ m and max-job-density(I) ≤ 1. Substituting into the equations above, we obtain

$$\mathrm{load}(I') \le \frac{m^2}{4m - 1}, \qquad \text{max-job-density}(I') \le \frac{m}{4m - 1}.$$

Thus, I′ (equivalently I) is feasible by Corollary 4.1 on $(4 - \frac{1}{m}) \cdot \Pi$.

4.1.2 Tightness of the Bound

As stated, Corollary 4.1 asserts that our algorithm exhibits behavior that is, in a certain sense, no more than a factor of approximately four off optimal behavior. In fact, if we restrict our attention only to real-time systems that are characterized exclusively by their load and max-job-density parameters, we can show that our algorithm is actually within a factor of $2\frac{2}{3}$ of optimal behavior. We do this by proving that a necessary condition for instance I to be feasible is in fact tighter than assumed above: there are instances I with $\text{max-job-density}(I) = \frac{2}{3} + \epsilon$ and $\mathrm{load}(I) = \frac{2m}{3} + \epsilon$, for any positive ε, that are not feasible on m unit-capacity processors. Consider the following real-time instance I consisting of four jobs (each job is represented by the three-tuple $(A_i, E_i, D_i)$):

$$I = \left\{ J_1 = \left(0, \tfrac{4}{3}, 2\right);\; J_2 = J_3 = \left(1, \tfrac{2}{3}, 1\right);\; J_4 = \left(1, \tfrac{4}{3}, 2\right) \right\}.$$

It may be verified that $\text{max-job-density}(I) = \frac{2}{3}$ and $\mathrm{load}(I) = \frac{4}{3}$. This instance is feasible upon two processors; however, increasing the execution requirement of any of the four jobs by any amount at all would render it infeasible. For any (even) number of processors m, similar instances can be constructed with max-job-density two-thirds and load equal to $\frac{2m}{3}$. Since no algorithm, not even an optimal one, is able to schedule such an instance once its execution requirements are increased, it follows that our algorithm is worse than optimal by a factor of no more than $\frac{2/3}{1/4} = \frac{8}{3}$.
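The two parameter values quoted for the four-job instance can be checked mechanically. The sketch below is illustrative only: it assumes the usual definitions behind Equations 3.1 to 3.3 (demand over an interval counts jobs whose scheduling window lies inside the interval, load is the maximum of demand divided by interval length, and max-job-density is the largest ratio of execution requirement to relative deadline), and the function names are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import product

# Jobs are (A, E, D) triples: arrival, execution requirement, relative deadline.

def demand(jobs, t1, t2):
    """Total execution of jobs whose window [A, A + D] lies inside [t1, t2]."""
    return sum((e for a, e, d in jobs if a >= t1 and a + d <= t2), Fr(0))

def load(jobs):
    """Maximum demand per unit time; for a finite instance it suffices to try
    intervals that start at an arrival and end at an absolute deadline."""
    starts = {a for a, _, _ in jobs}
    ends = {a + d for a, _, d in jobs}
    return max(demand(jobs, t1, t2) / (t2 - t1)
               for t1, t2 in product(starts, ends) if t1 < t2)

def max_job_density(jobs):
    return max(e / d for _, e, d in jobs)

# The instance from the text: J1, J2 = J3, J4.
INSTANCE = [
    (Fr(0), Fr(4, 3), Fr(2)),
    (Fr(1), Fr(2, 3), Fr(1)),
    (Fr(1), Fr(2, 3), Fr(1)),
    (Fr(1), Fr(4, 3), Fr(2)),
]

print(load(INSTANCE), max_job_density(INSTANCE))  # 4/3 2/3
```

Exact rational arithmetic is used so that the comparison with 4/3 and 2/3 is not affected by floating-point rounding.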

4.2 Full-Migration Feasibility

The results of the previous section certainly apply to the full-migration feasibility

of a real-time instance, in addition to restricted-migration feasibility (observe that

restricted-migration schedules are also full-migration schedules in which jobs do not

migrate). The results contained in this section improve upon these sufficient feasi-

bility analysis conditions for full-migration scheduling, and introduce schedulability

conditions based on load and max-job-density for edf scheduling.

Let us begin by stating the main full-migration feasibility result that we have

obtained. After some discussion of the result, we will formally prove it in Section 4.2.1.

Theorem 4.3 For real-time instance I and identical multiprocessor platform Π comprised of m unit-speed processors, if

$$\mathrm{load}(I) < \frac{m - (m-2)\,\text{max-job-density}(I)}{1 + \text{max-job-density}(I)} \tag{4.10}$$

then I is globally feasible upon platform Π.

[Figure 4.4 (plot not reproduced; horizontal axis: max-job-density(I), vertical axis: load(I)).]

Figure 4.4: The feasibility region, as defined by Equation 4.10. All real-time instances I for which (max-job-density(I), load(I)) lies beneath the curved solid line are guaranteed feasible on m unit-capacity processors. The dashed line illustrates the restricted-migration feasibility bound of Theorem 4.1 (from previous section).

As in the previous section, our goal is to obtain sufficient conditions for preemptive

multiprocessor feasibility (in this case under full-migration scheduling). Looking again

at the feasibility region in the two-dimensional space [0, 1]× [0,m], Theorem 4.3 yields

the following feasibility region: for given m, any instance I satisfying Equation 4.10

is guaranteed to be feasible upon m unit-speed processors. The feasibility region for

unit-capacity processors is depicted visually in Figure 4.4.
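As a minimal sketch, the sufficient condition of Equation 4.10 can be written as a two-line test. The function names are hypothetical, and the values of load(I) and max-job-density(I) are assumed to have been computed elsewhere.

```python
def full_migration_bound(density, m):
    """Right-hand side of Equation 4.10 for a given max-job-density and m."""
    return (m - (m - 2) * density) / (1 + density)

def satisfies_eq_4_10(load, density, m):
    """True when (density, load) lies strictly beneath the curve of Figure 4.4.

    This is only a sufficient test: a False result does not prove infeasibility.
    """
    if not 0 <= density <= 1:
        raise ValueError("max-job-density must lie in [0, 1]")
    return load < full_migration_bound(density, m)

# End points of the curve: (0, m) and (1, 1).
print(full_migration_bound(0, 4), full_migration_bound(1, 4))  # 4.0 1.0
print(satisfies_eq_4_10(1.9, 0.5, 4), satisfies_eq_4_10(2.0, 0.5, 4))  # True False
```

The strict inequality mirrors the statement of Theorem 4.3, so a point lying exactly on the curve is not accepted.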

We may show that for any real-time instance I our feasibility conditions are within

a constant factor of the optimal value of load(I) and max-job-density(I). Consider the

following corollary to Theorem 4.3.

Corollary 4.2 Any real-time instance I satisfying the following two conditions

$$\mathrm{load}(I) \le (\sqrt{2} - 1)\,m \tag{4.11}$$

$$\text{max-job-density}(I) \le \sqrt{2} - 1 \tag{4.12}$$

is feasible upon an m-processor unit-capacity multiprocessor platform under full-migration multiprocessor scheduling.²

Proof: Assume that Conditions 4.11 and 4.12 hold for real-time instance I. Notice that the right-hand side of Equation 4.10 of Theorem 4.3 decreases as max-job-density(I) increases. So, all instances I with $\text{max-job-density}(I) \le \sqrt{2} - 1$ and

$$\mathrm{load}(I) < \frac{m - (m-2)(\sqrt{2}-1)}{1 + (\sqrt{2}-1)} = (\sqrt{2}-1)\,m + 2 - \sqrt{2} \;\;\Leftarrow\;\; \mathrm{load}(I) \le (\sqrt{2}-1)\,m \tag{4.13}$$

are feasible according to Theorem 4.3 on m unit-capacity processors. Hence, Equation 4.13 implies that all instances I that satisfy Conditions 4.11 and 4.12 are feasible on m unit-capacity processors.
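The algebra behind Equation 4.13 can be spot-checked numerically. The following is a small sketch (hypothetical helper name, floating-point arithmetic, a handful of arbitrarily chosen values of m) rather than a proof.

```python
import math

SQRT2 = math.sqrt(2)

def rhs_eq_4_10(density, m):
    """Right-hand side of Equation 4.10."""
    return (m - (m - 2) * density) / (1 + density)

for m in (2, 4, 8, 64):
    at_corner = rhs_eq_4_10(SQRT2 - 1, m)
    closed_form = (SQRT2 - 1) * m + 2 - SQRT2
    # The two sides of the equality in Equation 4.13 agree.
    assert math.isclose(at_corner, closed_form)
    # Condition 4.11 is strictly inside the region of Theorem 4.3.
    assert (SQRT2 - 1) * m < at_corner
    # The bound shrinks as max-job-density grows, as used in the proof.
    assert rhs_eq_4_10(0.2, m) > rhs_eq_4_10(0.4, m)
```

The gap between the two thresholds is the constant 2 − √2, which is why the non-strict Condition 4.11 implies the strict inequality of Theorem 4.3.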

As with Corollary 4.1, the result in Corollary 4.2 above can be considered to be an analog of a “utilization” test, or a “processor demand criteria” test, for uniprocessor systems. According to Lemma 3.2, a necessary condition for a real-time instance I to be feasible on m unit-capacity processors is that load(I) ≤ m and max-job-density(I) ≤ 1; by Corollary 4.2 above, it is sufficient that $\mathrm{load}(I) \le (\sqrt{2}-1)\,m$ and $\text{max-job-density}(I) \le \sqrt{2}-1$. Hence, to within a constant factor of less than 2.5, we have obtained bounds on the values of load(I) and max-job-density(I) that are needed (and suffice) for feasibility. The following theorem states the resource augmentation guarantee of Corollary 4.2. The proof of the theorem is nearly identical to that of Theorem 4.2 and will be omitted.



Theorem 4.4 Any real-time instance I that is feasible (according to some hypothetically optimal feasibility test) upon m processors of unit capacity is guaranteed to satisfy the conditions of Equations 4.11 and 4.12 of Corollary 4.2 on an m-processor platform where each processor has speed $\sqrt{2} + 1$.

² An alternative statement of this corollary could be: the point $(\sqrt{2} - 1, (\sqrt{2} - 1)\,m)$ lies in the feasibility region of Figure 4.4; this point is depicted by the dotted lines in the figure.

§ How tight is this bound? Theorem 4.4 asserts that our algorithm exhibits behavior that is, in a certain sense, no more than a factor of approximately 2.5 off optimal behavior. In fact, if we restrict our attention only to real-time systems that are characterized exclusively by their load and max-job-density parameters, we can show that our algorithm is actually within a factor of approximately 1.61 of optimal behavior in this same sense. Consider the same real-time instance I with $\mathrm{load}(I) = \frac{2m}{3} + \epsilon$ and $\text{max-job-density}(I) = \frac{2}{3} + \epsilon$ from Section 4.1.2; recall that I is infeasible on two processors of unit capacity for all ε > 0; therefore, no algorithm (not even an optimal algorithm) would be able to declare this task system feasible. Comparing the load of this infeasible task system with the bounds of Corollary 4.2, it follows that our algorithm is worse than optimal by a factor of no more than $\frac{2/3}{\sqrt{2}-1} \approx 1.61$.

4.2.1 Proof of Theorem 4.3

In this section, we give the detailed proof of the feasibility conditions of Theorem 4.3.

The proof will rely heavily on the definitions and notation of Sections 1.3.1 and 1.4.1.

Section 4.2.1.1 gives some additional preliminary notation needed for the proof. Sec-

tion 4.2.1.2 gives a detailed outline of the main steps of the proof. Section 4.2.1.3

contains the entire proof.

4.2.1.1 Notation

Throughout this section, we redefine the order and indexing of jobs to be in non-

decreasing order of absolute deadlines (i.e., for all Ji, Jj ∈ I, i < j if and only if

Ai + Di ≤ Aj + Dj). Inevitably, in any schedule on a processing platform a job

may be prevented from executing (even though it has remaining execution) because


the platform is busy executing other jobs. If job Jk is prevented from executing

because job Ji was being executed on the platform, we say that Ji interfered with

the execution of Jk. Below, we define one form of interference that will be used

throughout the proof of Theorem 4.3: arrival-time interference. The next definition

formally defines a predicate for arrival-time interference for any two jobs in schedule

SI for instance I.

Definition 4.1 (Arrival-Time Interference Predicate) A job Ji arrival-time in-

terferes with job Jk in a given schedule SI if Ji arrives earlier than Jk (or at exactly

the same time, but has lower index), and there exists a non-empty interval (t1, t2) in

the intersection of [Ai, Ai + Di) and [Ak, Ak + Dk) where for all t ∈ (t1, t2):

1. Ji is executing.

2. No processor is idle.

3. No processor is executing Jk.

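Let $\phi : S_I \times I \times I \to \{\text{true}, \text{false}\}$ be the predicate which denotes that $J_i$ arrival-time interferes with $J_k$. Formally,

$$
\begin{aligned}
\phi(S_I, J_i, J_k) \stackrel{\text{def}}{=}\; & ((A_i < A_k) \lor ((A_i = A_k) \land (i < k))) \\
& \land\; [\exists (t_1, t_2) \subseteq [A_i, A_i + D_i) \cap [A_k, A_k + D_k) \text{ such that} \\
& \qquad (t_2 > t_1) \\
& \qquad \land\; (\forall t \in (t_1, t_2) : \exists \pi_\ell \in \Pi :: S_I(\pi_\ell, t, J_i) = 1) \\
& \qquad \land\; (\forall t \in (t_1, t_2), \pi_\ell \in \Pi : S_I(\pi_\ell, t) \ne \bot, J_k)].
\end{aligned}
\tag{4.14}
$$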

[Figure 4.5 (diagram not reproduced; horizontal axis: time, marking the arrival times and absolute deadlines of consecutive jobs of the chain).]

Figure 4.5: Visual depiction of a spanning chain.

The final definition in this notation section describes a subset of jobs whose arrival

sequence forms a chain. The proof of Theorem 4.3 will reason about the amount of

execution that must occur on the processing platform over a chain.

Definition 4.2 (Spanning Chain for finite instance $I_{\mathrm{finite}}$) Let $I_{\mathrm{finite}} \stackrel{\text{def}}{=} \{J_1, \ldots, J_n\}$, ordered according to absolute deadline. Let $J_{\ell_1}$ be the job of $I_{\mathrm{finite}}$ with the earliest arrival time. A spanning chain for $I_{\mathrm{finite}}$ is a sequence of jobs $J_{\ell_1}, J_{\ell_2}, \ldots, J_{\ell_r}$ with the following three properties:

1. $J_{\ell_r} = J_n$,

2. for all $1 < i \le r$, $A_{\ell_{i-1}} < A_{\ell_i} \le A_{\ell_{i-1}} + D_{\ell_{i-1}}$,

3. for all $1 < i < r$, $A_{\ell_{i-1}} + D_{\ell_{i-1}} < A_{\ell_{i+1}}$.

Figure 4.5 gives a visual illustration of a segment of a spanning chain. Notice,

the spanning chain is equivalent to ensuring that every time point in the interval

[A1, An+Dn) is contained in the scheduling window of at least one job of the spanning

chain, but not more than two jobs.
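The three properties of Definition 4.2 translate directly into a checker. This is a sketch under stated assumptions: jobs are (A, E, D) triples, the instance list is already ordered by non-decreasing absolute deadline (so its last element plays the role of J_n), and the function name is hypothetical.

```python
def is_spanning_chain(chain, instance):
    """Check Definition 4.2 for a candidate chain of (A, E, D) jobs."""
    if not chain:
        return False
    # The chain must start at the earliest-arriving job of the instance.
    if chain[0][0] != min(a for a, _, _ in instance):
        return False
    # Property 1: the chain ends with J_n.
    if chain[-1] != instance[-1]:
        return False
    # Property 2: each job arrives after its predecessor arrives,
    # but no later than the predecessor's absolute deadline.
    for prev, cur in zip(chain, chain[1:]):
        if not (prev[0] < cur[0] <= prev[0] + prev[2]):
            return False
    # Property 3: a job's window ends before the job two positions later arrives.
    for prev, nxt in zip(chain, chain[2:]):
        if not (prev[0] + prev[2] < nxt[0]):
            return False
    return True

staggered = [(0, 1, 4), (3, 1, 4), (6, 1, 4)]
crowded = [(0, 1, 4), (1, 1, 4), (2, 1, 4)]
print(is_spanning_chain(staggered, staggered))  # True
print(is_spanning_chain(crowded, crowded))      # False
```

In the second instance all three scheduling windows contain the time point 2, which violates Property 3 and matches the remark that no time point lies in more than two windows of a spanning chain.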

4.2.1.2 Outline

The following is an informal outline of the steps taken in the proof of Theorem 4.3

(in the next subsection).


1. We will prove the contrapositive of Theorem 4.3. That is, if real-time instance I is infeasible, then $\mathrm{load}(I) \ge \frac{m - (m-2)\,\text{max-job-density}(I)}{1 + \text{max-job-density}(I)}$. The first step of the proof is to assume that I is infeasible on m-processor platform Π.

2. We consider $I_{\mathrm{infeas}}$, which is a subset of real-time instance I. $I_{\mathrm{infeas}}$ is defined to be the first n jobs of I (in order of index) where $\{J_1, J_2, \ldots, J_{n-1}\}$ is feasible on Π, but $I_{\mathrm{infeas}} \stackrel{\text{def}}{=} \{J_1, \ldots, J_{n-1}, J_n\}$ is infeasible on Π. (Recall that, in this section, jobs are indexed by their absolute deadlines.)

3. We define a schedule $S'_{I_{\mathrm{infeas}}}$ (Equation 4.15 below) on Π in which jobs $J_1, \ldots, J_{n-1}$ meet their deadlines, but $J_n$ misses its deadline. We then consider the set of all possible sequences of jobs $J_{\ell_1}, J_{\ell_2}, \ldots, J_{\ell_s} \in I_{\mathrm{infeas}}$ in which each job arrival-time interferes with its successor, and $J_{\ell_s}$ arrival-time interferes with $J_n$ (i.e., $\phi(S'_{I_{\mathrm{infeas}}}, J_{\ell_i}, J_{\ell_{i+1}})$ for $1 \le i < s$, and $\phi(S'_{I_{\mathrm{infeas}}}, J_{\ell_s}, J_n)$).

4. Over all possible sequences from the previous step, we consider the sequence $\gamma = \{J_{\gamma_1}, J_{\gamma_2}, \ldots, J_{\gamma_s}\}$ with the earliest arriving job. Let $A_{\gamma_1}$ be the arrival time of the earliest job in γ. Let I′ be the subset of jobs of $I_{\mathrm{infeas}}$ with arrival times no earlier than $A_{\gamma_1}$. We will show that the work (with respect to I′ and the system work function) done over the interval of each job $J_i$ in $\gamma \cup \{J_n\}$ must be at least $\left[m - (m-1)\frac{E_i}{D_i}\right] D_i$. Lemma 4.1 proves this statement.

5. From the sequence $\gamma \cup \{J_n\}$, we argue that there must exist a spanning chain χ (according to Definition 4.2). Each job $J_{\chi_i}$ of the spanning chain also has the property that the amount of work done over its scheduling window must be at least $[m - (m-1)\,\text{max-job-density}(I)]\, D_{\chi_i}$ (Corollary 4.3).

6. We use the lower bound on the amount of work from the previous step and properties of the spanning chain χ to show that load(I′) must be at least $\frac{m - (m-2)\,\text{max-job-density}(I)}{1 + \text{max-job-density}(I)}$.

7. Since I′ ⊆ I, load(I′) ≤ load(I). Thus, if I is infeasible, the load must equal or exceed the bound from the previous step. Theorem 4.3 follows from the contrapositive of this statement.

4.2.1.3 Proof

§ Construction of $S'_{I_{\mathrm{infeas}}}$ and I′. Assume the notation defined in Step 2 of the proof outline. That is, let Π be a unit-speed identical m-processor platform; let $I_{\mathrm{infeas}}$ be the first n jobs of I such that $\{J_1, J_2, \ldots, J_{n-1}\}$ is feasible on Π, but $I_{\mathrm{infeas}} \stackrel{\text{def}}{=} \{J_1, \ldots, J_{n-1}, J_n\}$ is infeasible on Π.

We now describe how to construct the schedule $S'_{I_{\mathrm{infeas}}}$ over the jobs of $I_{\mathrm{infeas}}$. Informally, $S'_{I_{\mathrm{infeas}}}$ is the schedule which maximizes the amount of time that $J_n$ executes while the jobs of $I_{\mathrm{infeas}} - \{J_n\}$ complete by their deadlines. Let σ be the set of all schedules for $I_{\mathrm{infeas}}$ on Π that are valid for $I_{\mathrm{infeas}} - \{J_n\}$, and satisfy only Conditions 1 and 2 of validity (Definition 1.6), but not Condition 3 for $J_n$ (i.e., job $J_n$ does not complete execution by its deadline). Informally, σ is the set of schedules of $I_{\mathrm{infeas}}$ on Π such that $J_n$ misses its deadline, but all other jobs meet their deadlines. Note that σ is non-empty due to the fact that $I_{\mathrm{infeas}} - \{J_n\}$ is feasible. We may now define schedule $S'_{I_{\mathrm{infeas}}}$:

$$S'_{I_{\mathrm{infeas}}} \stackrel{\text{def}}{=} \arg\max_{S \in \sigma} \{W(S, J_n, A_n, A_n + D_n)\}. \tag{4.15}$$

Let Γ be the set of all possible sequences of jobs of $I_{\mathrm{infeas}}$, ending with $J_n$, such that each job in the sequence arrival-time interferes with the subsequent job (Γ may be empty). Formally,

$$\Gamma \stackrel{\text{def}}{=} \{\{J_{a_1}, J_{a_2}, \ldots, J_{a_s}\} \subseteq I_{\mathrm{infeas}} \mid \phi(S'_{I_{\mathrm{infeas}}}, J_{a_1}, J_{a_2}) \land \ldots \land \phi(S'_{I_{\mathrm{infeas}}}, J_{a_s}, J_n)\}. \tag{4.16}$$

Define γ ∈ Γ to be the maximum length interference sequence with the earliest arriving job (i.e., for all δ ∈ Γ, $A_{\gamma_1} < A_{\delta_1}$ or $((A_{\gamma_1} = A_{\delta_1}) \land (\gamma_1 < \delta_1))$). We will now construct another real-time instance I′ that consists of all jobs of $I_{\mathrm{infeas}}$ that arrive at or after $A_{\gamma_1}$; i.e., $I' \stackrel{\text{def}}{=} \{J_i \in I_{\mathrm{infeas}} \mid A_i \ge A_{\gamma_1}\}$.

The next lemma gives a lower bound on the amount of work that must be done on behalf of jobs in I′ over each job in the maximum interference sequence for schedule $S'_{I_{\mathrm{infeas}}}$.

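Lemma 4.1 For all jobs $J_{\gamma_i}$ in the maximum length interference sequence of schedule $S'_{I_{\mathrm{infeas}}}$ plus job $J_n$ (i.e., $\gamma \cup \{J_n\}$),

$$W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\gamma_i}, A_{\gamma_i} + D_{\gamma_i}) \ge D_{\gamma_i} \left[ m - (m-1)\frac{E_{\gamma_i}}{D_{\gamma_i}} \right].$$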

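Proof: The proof is by contradiction; assume that there exists a $J_{\gamma_i} \in \gamma \cup \{J_n\}$ where

$$W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\gamma_i}, A_{\gamma_i} + D_{\gamma_i}) < D_{\gamma_i} \left[ m - (m-1)\frac{E_{\gamma_i}}{D_{\gamma_i}} \right] = m(D_{\gamma_i} - E_{\gamma_i}) + E_{\gamma_i}. \tag{4.17}$$

Assume that $E_{\gamma_i} < D_{\gamma_i}$; otherwise, Equation 4.17 is vacuously false. Let $E'_{\gamma_i} \le E_{\gamma_i}$ be the amount of time during which $J_{\gamma_i}$ executes in $S'_{I_{\mathrm{infeas}}}$ over the interval $[A_{\gamma_i}, A_{\gamma_i} + D_{\gamma_i})$. Let $Y_{\gamma_i}$ be the set of non-overlapping contiguous intervals during $J_{\gamma_i}$’s scheduling window in which no processor is executing $J_{\gamma_i}$, i.e.,

$$Y_{\gamma_i} \stackrel{\text{def}}{=} \{(t_1, t_2) \subseteq [A_{\gamma_i}, A_{\gamma_i} + D_{\gamma_i}) \mid (\forall \pi_k \in \Pi, t \in (t_1, t_2) : S'_{I_{\mathrm{infeas}}}(\pi_k, t) \ne J_{\gamma_i}) \land (t_2 > t_1)\}.$$

Since $J_{\gamma_i}$ can only execute for $E'_{\gamma_i}$ amount of time, the sum of the interval lengths of $Y_{\gamma_i}$ is $\sum_{(t_1, t_2) \in Y_{\gamma_i}} (t_2 - t_1) = D_{\gamma_i} - E'_{\gamma_i} \ge D_{\gamma_i} - E_{\gamma_i}$. Note that, by Equation 4.17, there exists a time interval $(t'_1, t'_2)$ that is a subset of an interval of $Y_{\gamma_i}$ such that at least one processor is idle for all $t \in (t'_1, t'_2)$; this fact follows by observing that if such a $(t'_1, t'_2)$ did not exist, the amount of work would be $W_{I' - \{J_{\gamma_i}\}}(S'_{I_{\mathrm{infeas}}}, A_{\gamma_i}, A_{\gamma_i} + D_{\gamma_i}) = m(D_{\gamma_i} - E'_{\gamma_i}) + E'_{\gamma_i} \ge m(D_{\gamma_i} - E_{\gamma_i}) + E_{\gamma_i}$, contradicting Inequality 4.17. Let $Z_{\gamma_i}$ be the set of all such idle sub-intervals of $Y_{\gamma_i}$; that is,

$$Z_{\gamma_i} \stackrel{\text{def}}{=} \{(t'_1, t'_2) \subseteq (t_1, t_2)\ (\in Y_{\gamma_i}) \mid (\forall t \in (t'_1, t'_2) : \exists \pi_k \in \Pi :: S'_{I_{\mathrm{infeas}}}(\pi_k, t) = \bot) \land (t'_2 > t'_1)\}.$$

Note that $Z_{\gamma_i}$ is non-empty.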

If $J_{\gamma_i} = J_n$, the fact that there exists an idle point in an interval in $Y_n$ contradicts the definition of $S'_{I_{\mathrm{infeas}}}$; so, the lemma holds for $J_n$. We now consider $J_{\gamma_i} \ne J_n$ (note, if $J_{\gamma_i} = J_{\gamma_s}$ then $J_{\gamma_{i+1}}$ is $J_n$). From the definition of $\phi(S'_{I_{\mathrm{infeas}}}, J_{\gamma_i}, J_{\gamma_{i+1}})$, there exists an $X_{\gamma_i} \stackrel{\text{def}}{=} (t_1, t_2) \subseteq [A_{\gamma_i}, A_{\gamma_i} + D_{\gamma_i}) \cap [A_{\gamma_{i+1}}, A_{\gamma_{i+1}} + D_{\gamma_{i+1}})$ where $|X_{\gamma_i}| > 0$ and for all $t \in X_{\gamma_i}$,

$$\left( \exists \pi_k \in \Pi, S'_{I_{\mathrm{infeas}}}(\pi_k, t) = J_{\gamma_i} \right) \land \left( \forall \pi_k \in \Pi, S'_{I_{\mathrm{infeas}}}(\pi_k, t) \ne J_{\gamma_{i+1}} \right). \tag{4.18}$$

Similarly, there exist $X_{\gamma_{i+1}} \subseteq \left( [A_{\gamma_{i+1}}, A_{\gamma_{i+1}} + D_{\gamma_{i+1}}) \cap [A_{\gamma_{i+2}}, A_{\gamma_{i+2}} + D_{\gamma_{i+2}}) \right), \ldots, X_{\gamma_s} \subseteq \left( [A_{\gamma_s}, A_{\gamma_s} + D_{\gamma_s}) \cap [A_n, A_n + D_n) \right)$. That is, for each job $J_{\gamma_k}$ ($\gamma_k \ge \gamma_i$) in the maximum length interference sequence there exists a well-defined interval $X_{\gamma_k}$ in the intersection of $J_{\gamma_k}$’s scheduling window and its successor’s ($J_{\gamma_{k+1}}$) scheduling window such that $J_{\gamma_k}$ is executing while all other processors are busy, and $J_{\gamma_{k+1}}$ is not executing. $Z_{\gamma_i}$ contains intervals during $J_{\gamma_i}$’s activation in which $J_{\gamma_i}$ is not executing and there is an idle processor; let $(t_a, t_b)$ be any interval of set $Z_{\gamma_i}$. We may define a new schedule $S^{(0)}$ (based on $S'_{I_{\mathrm{infeas}}}$) where we move $\min(|t_b - t_a|, |X_{\gamma_i}|)$ units of $J_{\gamma_i}$’s execution from times in $X_{\gamma_i}$ to $Z_{\gamma_i}$. In doing this “swap,” we have not violated any of the conditions of a valid schedule (Definition 1.6). Now in $S^{(0)}$ there exist $t \in X_{\gamma_i}$ and $\pi_k \in \Pi$ such that $S^{(0)}(\pi_k, t) = \bot$. Thus, we can move $\min(|t_b - t_a|, |X_{\gamma_i}|, |X_{\gamma_{i+1}}|)$ units of $J_{\gamma_{i+1}}$’s execution from $X_{\gamma_{i+1}}$ to $X_{\gamma_i}$; define such a schedule to be $S^{(1)}$ (based on schedule $S^{(0)}$). We can repeat this “swapping” procedure until we define $S^{(s-i+1)}$ where we allow $\min(|t_b - t_a|, |X_{\gamma_i}|, \ldots, |X_{\gamma_s}|, x_{\mathrm{remaining}})$ units of additional execution for $J_n$ to occur in $X_{\gamma_s}$, where $x_{\mathrm{remaining}}$ is $\left( E_n - W(S'_{I_{\mathrm{infeas}}}, J_n, A_n, A_n + D_n) \right)$. Thus, we have defined a schedule which is valid for $I_{\mathrm{infeas}} - \{J_n\}$ (in any of our swappings we have not violated Properties 1 through 3), and has

$$W(S^{(s-i+1)}, J_n, A_n, A_n + D_n) > W(S'_{I_{\mathrm{infeas}}}, J_n, A_n, A_n + D_n).$$

The above inequality directly contradicts the definition of $S'_{I_{\mathrm{infeas}}}$ in Equation 4.15; therefore, our supposition is false, and the lemma is true.

§ Existence of a spanning chain for I′. The maximum-length interference sequence is not necessarily guaranteed to be a spanning chain according to Definition 4.2. However, it may be shown that there must exist $\chi \subseteq \gamma \cup \{J_n\}$ that is a spanning chain. Let C be the set of all subsets of $\gamma \cup \{J_n\}$ that “cover” the interval $[A_{\gamma_1}, A_n + D_n)$. (A set X of sub-intervals covers the interval $[A_{\gamma_1}, A_n + D_n)$ if for each $t \in [A_{\gamma_1}, A_n + D_n)$ there is a sub-interval in X that contains the point t.) Define the minimum cover of interval $[A_{\gamma_1}, A_n + D_n)$ to be

$$\chi \stackrel{\text{def}}{=} \arg\min_{X \in C} \{|X|\}. \tag{4.19}$$

It is relatively easy to see that for all times t in $[A_{\gamma_1}, A_n + D_n)$, at most two jobs in χ contain t in their activation; if not, then consider the job with the earliest arrival, $J_{\mathrm{first}}$, and the job with the latest deadline, $J_{\mathrm{last}}$, that cover t. By definition, all other jobs that contain t are contained in the union of the scheduling windows of $J_{\mathrm{first}}$ and $J_{\mathrm{last}}$, and thus are not contained in χ. The fact that t is contained in the scheduling window of at most two jobs and at least one job of χ implies that χ is a spanning chain.

The existence of such a spanning chain is significant because we may use Lemma 4.1 to infer the amount of work that must be done over each job of the spanning chain. The next corollary describes the amount of work that must be done over each job of the spanning chain χ.

Corollary 4.3 For each job $J_{\chi_i} \in \chi$,

$$W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_i}, A_{\chi_i} + D_{\chi_i}) \ge D_{\chi_i} \left[ m - (m-1)\,\text{max-job-density}(I) \right]. \tag{4.20}$$

§ Lower bound on the load of I′. In the remainder of this section, we complete our proof of Theorem 4.3 by deriving a lower bound on the load of I′. We obtain such a lower bound by deriving a lower bound on the minimum amount of work (with respect to jobs of I′) that can occur over the spanning chain χ (given by the set $\{J_{\chi_1}, \ldots, J_{\chi_r}\}$ ordered according to arrival time). The derivation of the lower bound (Lemma 4.3) for χ is obtained by reasoning about a linear program L (Figure 4.6) that is indirectly related to I′ and χ. L is significant because we may show (Lemma 4.2) that there exists a feasible³ solution to L which corresponds to I′ and $S'_{I_{\mathrm{infeas}}}$. This solution has an objective value equal to $W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_1}, A_{\chi_r} + D_{\chi_r})$ (or, equivalently, $W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\gamma_1}, A_n + D_n)$). Therefore, we may conclude that

$$W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_1}, A_{\chi_r} + D_{\chi_r}) \ge \text{Optimal value of } L. \tag{4.21}$$

We first note that throughout this section we will assume that r > 1. If r = 1, then $\chi = \{J_n\}$. However, we know from Lemma 4.1 that the work done over $[A_n, A_n + D_n)$ exceeds or equals $D_n [m - (m-1)\,\text{max-job-density}(I)]$. This trivially satisfies the lower bound of

$$D_n \left[ \frac{m - (m-2)\,\text{max-job-density}(I)}{1 + \text{max-job-density}(I)} \right],$$

which will be provided by Lemma 4.3. Therefore, we consider only the non-trivial case where r > 1 in the remainder of this section.

³ Feasible here refers to a non-negative assignment to the variables of L that satisfies each of the constraints in the system.

Linear Program L.

Let $\vec{a}, \vec{b} \in \mathbb{R}^{r}_{(\ge 0)}$ and $\vec{c}, \vec{d} \in \mathbb{R}^{r-1}_{(\ge 0)}$.

Minimize

$$F(\vec{a}, \vec{b}, \vec{c}, \vec{d}) \stackrel{\text{def}}{=} m \left[ \sum_{i=1}^{r} b_{\chi_i} + \sum_{i=1}^{r-1} d_{\chi_i} \right] + \sum_{i=1}^{r} a_{\chi_i} + \sum_{i=1}^{r-1} c_{\chi_i} \tag{4.22}$$

subject to the following constraints:

$$\sum_{i=1}^{r} [a_{\chi_i} + b_{\chi_i}] + \sum_{i=1}^{r-1} [c_{\chi_i} + d_{\chi_i}] = A_{\chi_r} + D_{\chi_r} - A_{\chi_1} \tag{4.23a}$$

$$\text{max-job-density}(I)\,(b_{\chi_1} + d_{\chi_1}) - (1 - \text{max-job-density}(I))(a_{\chi_1} + c_{\chi_1}) \ge 0 \tag{4.23b}$$

$$\text{max-job-density}(I)\,(d_{\chi_{i-1}} + b_{\chi_i} + d_{\chi_i}) - (1 - \text{max-job-density}(I))(c_{\chi_{i-1}} + a_{\chi_i} + c_{\chi_i}) \ge 0 \quad (\forall i : 1 < i < r) \tag{4.23c}$$

$$\text{max-job-density}(I)\,(b_{\chi_r} + d_{\chi_{r-1}}) - (1 - \text{max-job-density}(I))(a_{\chi_r} + c_{\chi_{r-1}}) \ge 0. \tag{4.23d}$$

Figure 4.6: Linear system representing the minimum amount of work done over the spanning chain χ (with respect to jobs of I′).

We will now informally describe L. Over the interval $[A_{\chi_1}, A_{\chi_r} + D_{\chi_r})$, at least one processor must be busy executing a job of I′; otherwise, using similar reasoning to Lemma 4.1, we could obtain a contradiction to Equation 4.15. L formally describes an abstract system containing jobs of χ where the processors of platform Π over the interval $[A_{\chi_1}, A_{\chi_r} + D_{\chi_r})$ are either all busy or at least one processor is busy. Note that L does not necessarily correspond to a system physically obtainable from $I_{\mathrm{infeas}}$; however, Equation 4.21 shows that an optimal solution to abstract system L can provide a lower bound on the work done in system $I_{\mathrm{infeas}}$. Informally, each variable of the linear system L has the following interpretation.

• $a_{\chi_i}$ represents the amount of time during the interval in which $J_{\chi_i}$ is the only active job of χ and at least one processor of Π is busy. $\vec{a} \stackrel{\text{def}}{=} (a_{\chi_1}, \ldots, a_{\chi_r})$.

• $b_{\chi_i}$ represents the amount of time during the interval in which $J_{\chi_i}$ is the only active job of χ, and all processors of Π are busy executing. $\vec{b} \stackrel{\text{def}}{=} (b_{\chi_1}, \ldots, b_{\chi_r})$.

• $c_{\chi_i}$ represents the amount of time during the interval in which $J_{\chi_i}$ and $J_{\chi_{i+1}}$ overlap (where i < r), and at least one processor of Π is busy. $\vec{c} \stackrel{\text{def}}{=} (c_{\chi_1}, \ldots, c_{\chi_{r-1}})$.

• $d_{\chi_i}$ represents the amount of time during the interval in which $J_{\chi_i}$ and $J_{\chi_{i+1}}$ overlap (where i < r), and all processors of Π are busy executing. $\vec{d} \stackrel{\text{def}}{=} (d_{\chi_1}, \ldots, d_{\chi_{r-1}})$.

The objective function, $F(\vec{a}, \vec{b}, \vec{c}, \vec{d})$, of system L (Equation 4.22) represents the minimum amount of work done by platform Π over the interval $[A_{\chi_1}, A_{\chi_r} + D_{\chi_r})$. The equality constraint (Equation 4.23a) specifies that the total of all the interval lengths represented by vectors $\vec{a}$, $\vec{b}$, $\vec{c}$, and $\vec{d}$ must sum to the length of interval $[A_{\chi_1}, A_{\chi_r} + D_{\chi_r})$. Inequality constraints (Equations 4.23b-d) enforce the lower-bound work requirements described by Corollary 4.3. To see that each constraint corresponds to the lower bound of Corollary 4.3, consider the constraint of Equation 4.23b:

$$
\begin{aligned}
& \text{max-job-density}(I)\,(b_{\chi_1} + d_{\chi_1}) - (1 - \text{max-job-density}(I))(a_{\chi_1} + c_{\chi_1}) \ge 0 \\
\equiv\; & \text{max-job-density}(I)\,(a_{\chi_1} + b_{\chi_1} + c_{\chi_1} + d_{\chi_1}) - (a_{\chi_1} + c_{\chi_1}) \ge 0 \\
\equiv\; & \text{max-job-density}(I)\, D_{\chi_1} \ge a_{\chi_1} + c_{\chi_1} \\
\Leftarrow\; & \frac{E_{\chi_1}}{D_{\chi_1}} \cdot D_{\chi_1} \ge a_{\chi_1} + c_{\chi_1} \;\equiv\; E_{\chi_1} \ge a_{\chi_1} + c_{\chi_1}.
\end{aligned}
$$

The last statement is implied by Lemma 4.1 since if $E_{\chi_1} < a_{\chi_1} + c_{\chi_1}$, then the total amount of time over $[A_{\chi_1}, A_{\chi_1} + D_{\chi_1})$ during which all processors are busy does not exceed $D_{\chi_1} - E_{\chi_1}$. By the arguments of Lemma 4.1 this would imply that we could execute $J_n$ for more time than in schedule $S'_{I_{\mathrm{infeas}}}$, which is a contradiction. We may justify the constraints of Equations 4.23(c-d) by similar arguments.

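The next lemma shows the existence of a solution that corresponds to I′, χ, and $S'_{I_{\mathrm{infeas}}}$. Thus, the optimal value of L provides a lower bound on $W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_1}, A_{\chi_r} + D_{\chi_r})$.

Lemma 4.2 There exists a feasible assignment to $\vec{a}$, $\vec{b}$, $\vec{c}$, and $\vec{d}$ in L such that the objective function value, $F(\vec{a}, \vec{b}, \vec{c}, \vec{d})$, of this solution is equal to $W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_1}, A_{\chi_r} + D_{\chi_r})$.

Proof: Consider the following assignment to variables of L:

$$a_{\chi_1} \stackrel{\text{def}}{=} [A_{\chi_2} - A_{\chi_1}] - \frac{W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_1}, A_{\chi_2}) - [A_{\chi_2} - A_{\chi_1}]}{m-1}$$

$$b_{\chi_1} \stackrel{\text{def}}{=} \frac{W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_1}, A_{\chi_2}) - [A_{\chi_2} - A_{\chi_1}]}{m-1}$$

$(\forall i : 1 \le i < r)$:

$$c_{\chi_i} \stackrel{\text{def}}{=} [A_{\chi_i} + D_{\chi_i} - A_{\chi_{i+1}}] - \frac{W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_{i+1}}, A_{\chi_i} + D_{\chi_i}) - [A_{\chi_i} + D_{\chi_i} - A_{\chi_{i+1}}]}{m-1}$$

$$d_{\chi_i} \stackrel{\text{def}}{=} \frac{W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_{i+1}}, A_{\chi_i} + D_{\chi_i}) - [A_{\chi_i} + D_{\chi_i} - A_{\chi_{i+1}}]}{m-1}$$

$(\forall i : 1 < i < r)$:

$$a_{\chi_i} \stackrel{\text{def}}{=} [A_{\chi_{i+1}} - A_{\chi_{i-1}} - D_{\chi_{i-1}}] - \frac{W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_{i-1}} + D_{\chi_{i-1}}, A_{\chi_{i+1}}) - [A_{\chi_{i+1}} - A_{\chi_{i-1}} - D_{\chi_{i-1}}]}{m-1}$$

$$b_{\chi_i} \stackrel{\text{def}}{=} \frac{W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_{i-1}} + D_{\chi_{i-1}}, A_{\chi_{i+1}}) - [A_{\chi_{i+1}} - A_{\chi_{i-1}} - D_{\chi_{i-1}}]}{m-1}.$$

$$a_{\chi_r} \stackrel{\text{def}}{=} [A_{\chi_r} + D_{\chi_r} - A_{\chi_{r-1}} - D_{\chi_{r-1}}] - \frac{W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_{r-1}} + D_{\chi_{r-1}}, A_{\chi_r} + D_{\chi_r}) - [A_{\chi_r} + D_{\chi_r} - A_{\chi_{r-1}} - D_{\chi_{r-1}}]}{m-1}$$

$$b_{\chi_r} \stackrel{\text{def}}{=} \frac{W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_{r-1}} + D_{\chi_{r-1}}, A_{\chi_r} + D_{\chi_r}) - [A_{\chi_r} + D_{\chi_r} - A_{\chi_{r-1}} - D_{\chi_{r-1}}]}{m-1}$$

It is a relatively straightforward exercise (through algebra and an application of Corollary 4.3) to show that the above assignment satisfies the constraints of Equations 4.23(a-d), and that $F(\vec{a}, \vec{b}, \vec{c}, \vec{d}) = W_{I'}(S'_{I_{\mathrm{infeas}}}, A_{\chi_1}, A_{\chi_r} + D_{\chi_r})$.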
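Every pair in the assignment of Lemma 4.2 follows the same pattern: a sub-interval of length ℓ on which W units of work are done is split into an "all processors busy" part of length (W − ℓ)/(m − 1) and a remainder charged at one processor. The sketch below illustrates that pattern with a hypothetical helper and made-up numbers; it assumes m > 1 and ℓ ≤ W ≤ mℓ so that both parts are non-negative.

```python
from fractions import Fraction as Fr

def split_interval(work, length, m):
    """Return (partly_busy, all_busy) such that
    partly_busy + all_busy == length and partly_busy + m * all_busy == work."""
    if m <= 1 or not (length <= work <= m * length):
        raise ValueError("requires m > 1 and length <= work <= m * length")
    all_busy = (work - length) / (m - 1)   # plays the role of b (or d)
    partly_busy = length - all_busy        # plays the role of a (or c)
    return partly_busy, all_busy

# Illustrative numbers only: 7 units of work in a window of length 3 on m = 3.
a, b = split_interval(Fr(7), Fr(3), 3)
print(a, b)  # 1 2
```

Summing `partly_busy + m * all_busy` over all sub-intervals reproduces the total work, which is how the objective value of the assignment equals the work done over the whole chain.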

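The final lemma of this section gives the optimal value of L, which will ultimately be used in the lower bound on load(I).

Lemma 4.3 The optimal objective function value of L is

$$[A_{\chi_r} + D_{\chi_r} - A_{\chi_1}] \left( \frac{m - (m-2)\,\text{max-job-density}(I)}{1 + \text{max-job-density}(I)} \right) \tag{4.23}$$

Proof: Consider the following solution to L:

$$a_{\chi_1} = \frac{\text{max-job-density}(I)}{1 + \text{max-job-density}(I)} \left[ \frac{A_{\chi_r} + D_{\chi_r} - A_{\chi_1}}{r-1} \right]$$

$$a_{\chi_i} = \frac{2\,\text{max-job-density}(I)}{1 + \text{max-job-density}(I)} \left[ \frac{A_{\chi_r} + D_{\chi_r} - A_{\chi_1}}{r-1} \right], \quad \forall i = 2, \ldots, r-1$$

$$a_{\chi_r} = \frac{\text{max-job-density}(I)}{1 + \text{max-job-density}(I)} \left[ \frac{A_{\chi_r} + D_{\chi_r} - A_{\chi_1}}{r-1} \right]$$

$$b_{\chi_i} = 0, \quad \forall i = 1, \ldots, r$$

$$c_{\chi_i} = 0, \quad \forall i = 1, \ldots, r-1$$

$$d_{\chi_i} = \frac{1 - \text{max-job-density}(I)}{1 + \text{max-job-density}(I)} \left[ \frac{A_{\chi_r} + D_{\chi_r} - A_{\chi_1}}{r-1} \right], \quad \forall i = 1, \ldots, r-1$$

It can easily be verified that this is a feasible solution to L with objective function value $F(\vec{a}, \vec{b}, \vec{c}, \vec{d})$ equal to the value in Equation 4.23.

We may obtain the dual linear program of L by mechanical transformation from L. The dual program is shown in Figure 4.7.

For the dual program, it may also be verified that

$$\omega_0 = \frac{m(1 - \text{max-job-density}(I)) + 2\,\text{max-job-density}(I)}{1 + \text{max-job-density}(I)}$$

$$\omega_i = \frac{m-1}{1 + \text{max-job-density}(I)}, \quad \forall i = 1, \ldots, r$$

is a feasible solution to the dual program with objective function value $G(\omega_0, \omega_1, \ldots, \omega_r)$ equal to the value in Equation 4.23. It is known (e.g., see (Murty, 1983), Theorem 4.4) that if there exist solutions to both the primal and dual linear program with the same objective value z, then z is the optimal value of both. Therefore, the value in Equation 4.23 is the optimal objective value of L.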

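Dual Linear Program of L.

Let $\omega_0 \in \mathbb{R}^{+}$ and $\omega_1, \ldots, \omega_r \in \mathbb{R}$.

Maximize

$$G(\omega_0, \omega_1, \omega_2, \ldots, \omega_r) \stackrel{\text{def}}{=} [A_{\chi_r} + D_{\chi_r} - A_{\chi_1}] \times \omega_0 \tag{4.24}$$

subject to the following constraints:

$$\omega_0 - (1 - \text{max-job-density}(I))\,\omega_i \le 1 \quad \text{for all } i = 1, \ldots, r \tag{4.25a}$$

$$\omega_0 + \text{max-job-density}(I)\,\omega_i \le m \quad \text{for all } i = 1, \ldots, r \tag{4.25b}$$

$$\omega_0 - (1 - \text{max-job-density}(I))(\omega_i + \omega_{i+1}) \le 1 \quad \text{for all } i = 1, \ldots, r-1 \tag{4.25c}$$

$$\omega_0 + \text{max-job-density}(I)(\omega_i + \omega_{i+1}) \le m \quad \text{for all } i = 1, \ldots, r-1 \tag{4.25d}$$

Figure 4.7: The dual linear program to L.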
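The "it may be verified" steps in the proof of Lemma 4.3 can be replayed with exact arithmetic. The sketch below is only a spot check for one arbitrarily chosen parameter setting (m = 4, max-job-density 1/3, r = 5, interval length 12); the helper names are hypothetical, and `delta` stands for max-job-density(I) while `span` stands for the length of the interval covered by the chain.

```python
from fractions import Fraction as Fr

def primal_point(delta, r, span):
    """Primal assignment from the proof of Lemma 4.3 (requires r >= 2)."""
    unit = span / (r - 1)
    end = delta / (1 + delta) * unit
    a = [end] + [2 * end] * (r - 2) + [end]
    b = [Fr(0)] * r
    c = [Fr(0)] * (r - 1)
    d = [(1 - delta) / (1 + delta) * unit] * (r - 1)
    return a, b, c, d

def primal_feasible(delta, span, a, b, c, d):
    r = len(a)
    if sum(a) + sum(b) + sum(c) + sum(d) != span:        # (4.23a)
        return False
    if min(a + b + c + d) < 0:                            # non-negativity
        return False
    for i in range(r):                                    # (4.23b)-(4.23d)
        busy = b[i] + (d[i - 1] if i > 0 else 0) + (d[i] if i < r - 1 else 0)
        partial = a[i] + (c[i - 1] if i > 0 else 0) + (c[i] if i < r - 1 else 0)
        if delta * busy - (1 - delta) * partial < 0:
            return False
    return True

def primal_value(m, a, b, c, d):                          # (4.22)
    return m * (sum(b) + sum(d)) + sum(a) + sum(c)

def dual_point(m, delta, r):
    w0 = (m * (1 - delta) + 2 * delta) / (1 + delta)
    return w0, [(m - 1) / (1 + delta)] * r

def dual_feasible(m, delta, w0, w):                       # (4.25a)-(4.25d)
    singles = all(w0 - (1 - delta) * x <= 1 and w0 + delta * x <= m for x in w)
    pairs = all(w0 - (1 - delta) * (x + y) <= 1 and w0 + delta * (x + y) <= m
                for x, y in zip(w, w[1:]))
    return singles and pairs

m, delta, r, span = 4, Fr(1, 3), 5, Fr(12)
a, b, c, d = primal_point(delta, r, span)
w0, w = dual_point(m, delta, r)
closed_form = span * (m - (m - 2) * delta) / (1 + delta)

assert primal_feasible(delta, span, a, b, c, d)
assert dual_feasible(m, delta, w0, w)
assert primal_value(m, a, b, c, d) == span * w0 == closed_form
print(closed_form)  # 30
```

Matching primal and dual objective values is exactly the duality argument invoked in the proof; the check covers only the chosen parameters and does not replace the general algebra.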

We may now finally complete the proof of Theorem 4.3:

Proof of Theorem 4.3

By Equation 4.21 and Lemma 4.3, the minimum amount of work done over the interval $[A_{\chi_1}, A_{\chi_r} + D_{\chi_r})$ is

$$[A_{\chi_r} + D_{\chi_r} - A_{\chi_1}] \left( \frac{m - (m-2)\,\text{max-job-density}(I)}{1 + \text{max-job-density}(I)} \right).$$

A lower bound on load(I′) is obtained by dividing the above expression by $[A_{\chi_r} + D_{\chi_r} - A_{\chi_1}]$. This immediately implies

$$\frac{m - (m-2)\,\text{max-job-density}(I)}{1 + \text{max-job-density}(I)} \le \mathrm{load}(I') \le \mathrm{load}(I). \tag{4.26}$$

The last inequality follows because $I' \subseteq I_{\mathrm{infeas}} \subseteq I$.

By Step 7 of the proof outline, we take the contrapositive of this statement to obtain the theorem.

4.3 Summary

Feasibility on preemptive uniprocessors is well understood; in the notation of this

dissertation, a necessary and sufficient condition for any real-time instance I to be

feasible (and edf schedulable) upon a unit-capacity uniprocessor is that

load(I) ≤ 1 and max-job-density(I) ≤ 1 .

In Chapter 3 of this dissertation, we have already shown necessary conditions for

feasibility on preemptive multiprocessors (Lemma 3.2). In this chapter, we obtained

(Theorems 4.1 and 4.3) sufficient conditions for a real-time instance I, characterized

only by its load and max-job-density parameters, to be feasible on a multiprocessor


platform (in both the restricted- and full-migration setting). Since we have proven fea-

sibility for the general real-time workload model of real-time instances, the results ob-

tained in this chapter are applicable to any task model where load and max-job-density

may be obtained (see Chapter 3). Furthermore, the feasibility tests obtained in this

chapter are at most a small constant factor from the optimal feasibility tests (Corol-

laries 4.1 and 4.2), in terms of resource augmentation.


Chapter 5

The Impossibility of Optimal Online Multiprocessor Scheduling Algorithms for General Task Systems

After reading the previous chapter on feasibility, it is natural to wonder: does there exist an algorithm which is guaranteed to successfully schedule any feasible general task system on a multiprocessor platform? In other words, do there exist optimal scheduling algorithms for general task models? For LL task systems, the answer to that question is “yes.” Srinivasan and Anderson (2002) describe a Pfair-based algorithm that is optimal for LL and periodic task systems. For the most general real-time workload abstraction, arbitrary real-time instances, Dertouzos and Mok (1989) show that optimal online scheduling of instances that are not known a priori is impossible. Thus, optimality exists for some of the stricter real-time task models and does not exist for the most general model of real-time work. This immediately prompts another question: for which general real-time recurrent task models do optimal scheduling algorithms exist?

Unfortunately, for all the general task models discussed in this dissertation, opti-

mal online scheduling is, in fact, impossible. In this chapter, we show that optimal

online scheduling of sporadic task systems is impossible. This immediately implies

that optimal online scheduling of any task model that generalizes the sporadic task

system is impossible, as well. Therefore, even a slight amount of generalization from

the LL task model (the sporadic task model simply adds a relative deadline param-

eter to the task specification) causes the existence of optimal scheduling algorithms

to disappear.

Our method of proving that optimal online algorithms do not exist for sporadic

task systems is as follows.

1. Find a potentially feasible sporadic task system τ on some processing platform

Π.

2. Prove that the task system is feasible on a multiprocessor platform Π. This means that for any real-time instance generated by τ on Π there exists a schedule on Π that will meet all deadlines.

3. For the feasible task system τ , show there exists a set of real-time instances

generated by τ that are identical up to a time t (denoted by I ′(τ)); however,

at time t they require any online scheduling algorithm A to make a decision

regarding which active jobs to schedule (i.e., there are more active jobs than

processors at time t). Show that regardless of the choice made by A at time t,

there exists a real-time instance in I ′(τ) that causes the choice made by A at

time t to result in a deadline miss.

In this brief chapter, we give the details of Steps 1 and 3 which are contained in the

next section. Step 3 especially gives insight into why optimal online scheduling of


sporadic task systems is impossible. The proof of feasibility (Step 2), though very

important to showing the nonexistence of optimal scheduling algorithms, is extremely

complex and not necessary to understanding the main result of this chapter; therefore,

we have decided to defer the details of Step 2 until Appendix A.

5.1 Impossibility of Optimal Online Scheduling

In accordance with Step 1 of the above approach, consider the following task system, τexample, comprised of six tasks (recall that a sporadic task is specified by a three-tuple (ei, di, pi)).

• τ1 = (2, 2, 5)

• τ2 = (1, 1, 5)

• τ3 = (1, 2, 6)

• τ4 = (2, 4, 100)

• τ5 = (2, 6, 100)

• τ6 = (4, 8, 100)

(5.1)

Theorem 5.1 τexample is feasible on two processors.

Proof: Proved in Appendix A.3.

Lemma 5.1 No non-clairvoyant, optimal online algorithm exists for the multiprocessor scheduling of real-time sporadic task systems on two processors.

Proof: The proof is by contradiction. Assume there exists an optimal online

algorithm, A, for scheduling sporadic real-time tasks on two processors. Then, by

Theorem 5.1, A must find a valid schedule for τexample where no deadline is missed;

more formally, for all I ∈ I S(τexample), the schedule A(I) is valid (Definition 1.6).

Figure 5.1a shows task system τexample.


Let each task of τexample release a job at time zero. Figure 5.1b shows the slots

at which A must execute τ1, τ2, τ3, and τ4 (i.e., any other order would result in a

deadline miss). Let Izero(τexample) be the set of all real-time instances generated by

τexample where each task generates a job at time instant zero and all jobs execute

for their respective task’s worst-case execution requirement; all real-time instances in

Izero(τexample) must include the following six jobs (recall a real-time job is specified by

(Ai, Ei, Di)): (0, 2, 2), (0, 1, 1), (0, 1, 2), (0, 2, 4), (0, 2, 6), and (0, 4, 8). Note, that by

the minimum separation parameter (period) of each task, the earliest the second job

of any task may be generated is at time five. So, for all I and I ′ in Izero(τexample), I≤5

and I ′≤5 are identical.

For any I ∈ Izero(τexample), there exist two possible choices that A must make

regarding the execution of τ5.

1. A schedules τ5 for x (0 < x ≤ 2) units of time in the interval (2, 4].

2. A does not schedule τ5 in the interval (2, 4].

Since A is an online scheduling algorithm, by Definition 1.3, any I, I ′

∈ Izero(τexample) where I≤5 = I ′≤5 implies that the schedule generated by A for both

I and I ′ is identical up to t = 5. Thus, algorithm A will make the same choice

(either choice 1 or 2, above) for all instances in Izero(τexample). We will show that for

either choice made by algorithm A there exists an Imiss ∈ Izero(τexample) that forces a

deadline miss. Let us consider both cases.

1. A schedules τ5 for x (0 < x ≤ 2) units of time in the interval (2, 4]: Consider

any real-time instance I in Izero(τexample) where, in addition to the six jobs that

all real-time instances in Izero(τexample) must contain, I includes a job generated

by τ1, τ2, and τ3 at t = 6; that is, I must include the jobs: (6, 2, 2), (6, 1, 1),

and (6, 1, 2). The two processors are fully utilized by τ1, τ2, and τ3 over the interval (6, 8]; therefore, τ6 may not execute over the interval (6, 8] (otherwise, τ1, τ2, or τ3 will miss a deadline). This implies that τ6 must complete its four units of execution in the interval (2, 6] given real-time instance I. However, A chose to execute τ5 in (2, 4] for x time units, and τ4 requires a processor to execute job (0, 2, 4) continuously over (2, 4]. Thus, given the choice by A and real-time instance I, there exist only 4 − x units of time in which τ6 may execute in the interval (2, 6]; τ6 will miss its deadline at t = 8. Figure 5.2a shows this scenario.

2. A does not schedule τ5 in the interval (2, 4]: Consider any real-time instance I ′

in Izero(τexample) where, in addition to the six jobs that all real-time instances

in Izero(τexample) must contain, I ′ includes a job generated by τ1 and τ2 at t = 5;

that is, I ′ must include the jobs (5, 2, 2) and (5, 1, 1). It is clear that the two

processors are fully utilized by τ1 and τ2 over interval (5, 6]. However, since A

chose not to execute τ5 in the interval (2, 4], τ5 must continuously execute in the

interval (4, 6] to meet its deadline. In this scenario, three jobs must continuously

execute in the interval (5, 6]. Therefore, either τ1, τ2, or τ5 will miss a deadline

in the interval (5, 6]. Figure 5.2b illustrates this scenario.

Since for any of the choices made by A over the interval (2, 4], there exists a real-

time instance I ∈ Izero(τexample) that causes A to miss a deadline, this contradicts

our assumption that there exists an optimal algorithm A. Therefore, no optimal

algorithm for scheduling sporadic real-time tasks upon a two-processor platform can

exist.
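The counting behind the two cases can be replayed mechanically. The following is a minimal sketch, not part of the original proof: the function names are hypothetical, and the per-interval capacities are simply the ones argued in the text (a job cannot run on two processors at once, and the jobs with zero slack occupy the slots stated in the proof).

```python
from fractions import Fraction

def tau6_service_case1(x):
    """Upper bound on the execution tau_6 can receive before t = 8 in case 1.

    Assumptions taken from the proof: A runs tau_5 for x units in (2, 4],
    and tau_1, tau_2, tau_3 each release a second job at t = 6.
    """
    x = Fraction(x)
    if not (0 < x <= 2):
        raise ValueError("case 1 requires 0 < x <= 2")
    in_0_2 = Fraction(0)      # both processors are busy with tau_1, tau_2, tau_3
    in_2_4 = Fraction(2) - x  # tau_4 holds one processor; tau_5 takes x of the other
    in_4_6 = Fraction(2)      # a single job uses at most one processor at a time
    in_6_8 = Fraction(0)      # both processors are busy with the jobs released at t = 6
    return in_0_2 + in_2_4 + in_4_6 + in_6_8

def jobs_needing_all_of_5_6_case2():
    """Jobs that must run continuously over (5, 6] in case 2.

    Assumptions taken from the proof: A did not run tau_5 in (2, 4],
    and tau_1, tau_2 each release a second job at t = 5.
    """
    return ["tau_1 job (5, 2, 2)", "tau_2 job (5, 1, 1)", "tau_5 job (0, 2, 6)"]

# tau_6 needs 4 units by t = 8 but can receive at most 4 - x < 4.
print(tau6_service_case1(Fraction(1, 2)))    # 7/2
# Three jobs need every instant of (5, 6], yet only two processors exist.
print(len(jobs_needing_all_of_5_6_case2()))  # 3
```

Either way the required work exceeds what two processors can supply, which is the contradiction the proof relies on.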

We may easily generalize the above lemma to an arbitrary number of processors

(m > 1).

Figure 5.1: (a) Task system τexample. (b) The times at which tasks τ1, τ2, τ3, and τ4 must execute.

Theorem 5.2 No non-clairvoyant, optimal online algorithm exists for the multiprocessor scheduling of real-time sporadic task systems on two or more processors.

Proof: For any Π comprised of m > 1 identical unit-speed processors, consider the task system τ′example def= τexample ∪ {τ′1, τ′2, . . . , τ′m−2} where τ′i = (1, 1, 1) for all 0 < i ≤ m − 2. It is easy to see that τ′example is feasible on Π, as we can dedicate a processor to each of the tasks in {τ′1, τ′2, . . . , τ′m−2} and by Theorem 5.1 τexample is feasible on the remaining two processors. The argument of Lemma 5.1 holds in the case where each of {τ′1, τ′2, . . . , τ′m−2} generates a job at time zero and successive jobs as soon as legally allowable. Therefore, the jobs generated by τexample cannot use the additional processors, and the argument of the lemma is identical.

The above negative result immediately extends to any task model that generalizes

the sporadic task model. The reason is that for any model M that generalizes the

sporadic model, there exists a $\tau'^{M}_{\text{example}}$ specified in model M such that $I \in \mathcal{I}^{M}(\tau'^{M}_{\text{example}})$ if and only if $I \in \mathcal{I}^{S}(\tau'_{\text{example}})$. Therefore, the argument of Lemma 5.1 is unchanged

for this more general task system.

Figure 5.2: Scenario (a): A schedules τ5 for x (0 < x ≤ 2) units of time in the interval (2, 4]. Scenario (b): A does not schedule τ5 in the interval (2, 4]. (Panel annotations: “Task τ6 misses a deadline” and “Task τ5 misses a deadline”.)

Corollary 5.1 No non-clairvoyant, optimal online algorithm exists on two or more processors for the multiprocessor scheduling of real-time task systems in models that generalize the sporadic task model.

5.2 Summary

In this chapter, we have seen that there exists a sporadic task system that is feasible

upon a multiprocessor platform for which there does not exist an online algorithm

that can successfully schedule every real-time instance generated by this task system.

The existence of such a feasible task system implies that optimal online scheduling

of sporadic and more general task systems is impossible. This chapter identified the


feasible task system and proved that no online scheduling algorithm can successfully

schedule all feasible instances. Appendix A includes a detailed proof that this task

system is feasible.

The consequences of this negative result are far-reaching in that algorithms that

are optimal for LL and periodic task systems no longer retain their optimality for

small generalizations of the model. Without optimality, it is not immediately clear

what should be the theoretical basis for evaluating the effectiveness of a real-time

multiprocessor scheduling algorithm for sporadic and more general task systems. The

following chapters will explore using resource-augmentation as an analysis technique

for identifying online scheduling algorithms with constant-factor approximation ra-

tios.


Chapter 6

The Full-Migration Schedulability

Analysis for General Task Systems

The impossibility result of the previous chapter implies that scheduling algorithms

such as Pfair are no longer optimal for sporadic or more general task models. From

Chapter 2, we saw that many other traditional online full- and restricted-migration

algorithms suffer from Dhall’s effect, implying these algorithms are provably non-

optimal even for the simple LL task model. Taken together, these negative results

may impart a rather pessimistic impression on the reader, concerning the efficient

multiprocessor online scheduling of task systems in the sporadic and more general task

models. However, we will see in this chapter that, even though optimality is impossible, online scheduling and schedulability analysis of general task systems with constant-factor approximation ratios (in terms of resource augmentation) is achievable. In fact,

many traditional online scheduling algorithms such as edf or dm have schedulability

tests with constant-factor resource-augmentation approximation ratios.

The remainder of this chapter is organized as follows. In Section 6.1, we intro-

duce some additional notation useful in reasoning about the fixed job-priority (and

fixed task-priority) scheduling of real-time instances. In Section 6.2, we obtain suf-

ficient conditions for the multiprocessor dm-schedulability of real-time instances. In

Section 6.3, we derive sufficient conditions for edf-schedulability. Both Sections 6.2

and 6.3 present resource augmentation bounds for their respective schedulability tests.

6.1 Notation

For each job Ji in real-time instance I, recall from Section 1.3 of Chapter 1 that we

use the parameter ρ(Ji) to denote the priority assigned to Ji by a fixed-job-priority

scheduling algorithm. Given any priority-level p, we will now overload the definition

of demand to include the execution requirements of jobs in the instance with priority

≥ p, that have both their arrival times and their deadlines within the interval:

$$\mathrm{demand}(p, I, t_1, t_2) \stackrel{\text{def}}{=} \sum_{(J_i \in I) \wedge (\rho(J_i) \ge p) \wedge (t_1 \le A_i) \wedge (A_i + D_i \le t_2)} E_i.$$

Similarly, load may be overloaded with respect to priority-level p; load(p, I) is defined as follows:

$$\mathrm{load}(p, I) \stackrel{\text{def}}{=} \max_{t_1 < t_2} \frac{\mathrm{demand}(p, I, t_1, t_2)}{t_2 - t_1}. \tag{6.1}$$

Intuitively, load(p, I) denotes the maximum possible cumulative computational de-

mand, normalized by interval length, of priority p or greater that is generated by

real-time instance I. Clearly, load(p, I) ≤ load(I).
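For a finite instance, these two definitions can be evaluated directly. The sketch below is illustrative only: the `Job` type and the convention that a larger integer means a higher priority are assumptions, and the search is restricted to intervals that start at an arrival time and end at an absolute deadline, since shrinking any interval to such endpoints never lowers the ratio.

```python
from fractions import Fraction
from typing import NamedTuple

class Job(NamedTuple):
    arrival: int    # A_i
    execution: int  # E_i
    deadline: int   # D_i, relative to the arrival
    priority: int   # rho(J_i); assumed convention: larger value = higher priority

def demand(p, jobs, t1, t2):
    """Execution of jobs with priority >= p that arrive and have deadlines in [t1, t2]."""
    return sum(
        (Fraction(j.execution) for j in jobs
         if j.priority >= p and t1 <= j.arrival and j.arrival + j.deadline <= t2),
        Fraction(0),
    )

def load(p, jobs):
    """Maximum of demand(p, I, t1, t2) / (t2 - t1) over candidate intervals."""
    eligible = [j for j in jobs if j.priority >= p]
    starts = {j.arrival for j in eligible}
    ends = {j.arrival + j.deadline for j in eligible}
    best = Fraction(0)
    for t1 in starts:
        for t2 in ends:
            if t1 < t2:
                best = max(best, demand(p, jobs, t1, t2) / (t2 - t1))
    return best

# The six jobs released at time zero in Section 5.1, with deadline-monotonic
# priorities encoded as the negated relative deadline.
zero_jobs = [Job(0, e, d, -d) for e, d in [(2, 2), (1, 1), (1, 2), (2, 4), (2, 6), (4, 8)]]
print(load(-8, zero_jobs))  # every job counts
print(load(-1, zero_jobs))  # only the job with relative deadline 1 counts
```

Lowering the priority threshold can only add jobs to the sum, which is why load(p, I) never exceeds load(I).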

Finally, for each i, let ∆i be defined as follows:

$$\Delta_i \stackrel{\text{def}}{=} \left( \max_{\rho(J_k) \ge \rho(J_i)} \{D_k\} \right) / D_i, \tag{6.2}$$

i.e., ∆i denotes the ratio of the largest deadline parameter of a job with priority


greater than or equal to Ji’s priority, to Ji’s deadline parameter. This parameter will

prove useful in determining schedulability conditions for real-time instances.
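As a small illustration of Equation 6.2, the sketch below computes ∆i for a list of jobs. The representation of a job as a pair (relative deadline, priority) and the convention that larger values mean higher priority are assumptions made only for this example.

```python
from fractions import Fraction

def delta(i, jobs):
    """Delta_i for jobs given as (relative_deadline, priority) pairs.

    Assumed convention: a larger priority value means a higher priority.
    """
    d_i, rho_i = jobs[i]
    largest = max(d_k for d_k, rho_k in jobs if rho_k >= rho_i)
    return Fraction(largest, d_i)

# Deadline-monotonic priorities (negated relative deadline): no job of equal or
# higher priority has a larger relative deadline, so every Delta_i equals 1.
dm_jobs = [(d, -d) for d in (2, 1, 2, 4, 6, 8)]
print([delta(i, dm_jobs) for i in range(len(dm_jobs))])

# Two jobs of equal priority with relative deadlines 2 and 8.
tied_jobs = [(2, 0), (8, 0)]
print(delta(0, tied_jobs), delta(1, tied_jobs))  # 4 1
```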

6.2 dm-schedulability Conditions

We now derive sufficient conditions for determining whether a given real-time instance

I is dm schedulable upon a specified number of processors. In addition, we will

quantify the “goodness” of these schedulability conditions via a resource-augmentation metric. The first result that we obtain is valid for all fixed-job-priority scheduling algorithms. (Note that fixed-task-priority scheduling algorithms are a special subset of fixed-job-priority scheduling algorithms; therefore, even though we have categorized dm as fixed-task-priority, it is a fixed-job-priority algorithm as well.)

Theorem 6.1 Let I denote a real-time instance such that for each Ji ∈ I, the following condition is satisfied:

$$\mathrm{load}(\rho(J_i), I) \le \frac{1}{2\Delta_i + 1} \left( m - (m - 1)\frac{E_i}{D_i} \right). \tag{6.3}$$

Instance I is fixed job-priority schedulable (under the full-migration scheduling paradigm) upon a platform comprised of m unit-capacity processors.

Proof: We will prove the contrapositive. Suppose that a given real-time instance I =

{J1, J2, . . .} is unschedulable (according to an arbitrary fixed-job-priority scheduling

algorithm) upon m unit-capacity processors, and let Ji denote the first job which

misses its deadline. Since Ji does not receive Ei units of execution over the interval

[Ai, Ai + Di), it must be the case that jobs of equal or greater priority execute on all

m processors for strictly more than (Di −Ei) time units during this interval. That is,

such jobs of equal or greater priority execute within this interval for > m× (Di −Ei)


time units.

Let Js denote the job with earliest arrival time that has priority ≥ ρ(Ji), such

that Ai < As + Ds and Js is executed by the fixed-job-priority algorithm during the

interval [Ai, Ai + Di); and let Jf denote the job with latest (absolute) deadline that

has priority ≥ ρ(Ji), such that Af < Ai + Di and Jf is executed by the fixed-job-

priority algorithm during the interval [Ai, Ai +Di). At least m× (Di −Ei) time units

of execution occurring over [Ai, Ai +Di) are generated by jobs of priority ≥ ρ(Ji) that

arrive in, and have their deadlines within [As, Af + Df ).

Recall, from Equation 6.2, that ∆i denotes the ratio of the largest deadline pa-

rameter of a job with priority greater than or equal to Ji’s priority, to Ji’s deadline

parameter. Hence, Ds and Df are both ≤ Di × ∆i. Therefore, the intervals [As, Ai)

and [Ai +Di, Af +Df ) both have a respective length of at most Di×∆i. It is evident

that the interval [As, Af + Df ) is of length at most (2∆i + 1) × Di. Hence, for Ji

to miss its deadline, it is necessary that the demand of jobs over [As, Af + Df ) (not

including Ji’s contribution) satisfy the following inequality:

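$$\begin{aligned}
& (A_f + D_f - A_s)\,\mathrm{load}(\rho(J_i), I) - E_i > m(D_i - E_i) \\
\Rightarrow\ & (2\Delta_i + 1) \cdot D_i \cdot \mathrm{load}(\rho(J_i), I) - E_i > m(D_i - E_i) \\
\equiv\ & \mathrm{load}(\rho(J_i), I) > \frac{m - (m - 1)\frac{E_i}{D_i}}{2\Delta_i + 1}.
\end{aligned} \tag{6.4}$$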

We have seen above that, in order for Ji to miss its deadline, it is necessary that

Condition 6.4 be satisfied; Theorem 6.1 follows by noting that Condition 6.4 is

the negation of Equation 6.3.
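Condition 6.3 is a per-job check, so it is straightforward to evaluate once load(ρ(Ji), I) and ∆i are known. The following sketch uses hypothetical function names and assumes those two quantities have already been computed by other means.

```python
from fractions import Fraction

def fjp_load_bound(m, e, d, delta):
    """Right-hand side of Condition 6.3 for a job with execution e and relative deadline d."""
    return (m - (m - 1) * Fraction(e) / Fraction(d)) / (2 * Fraction(delta) + 1)

def satisfies_condition_6_3(load_p, m, e, d, delta):
    """True if a job passes Condition 6.3, given its precomputed load(rho(J_i), I)."""
    return Fraction(load_p) <= fjp_load_bound(m, e, d, delta)

# A job with E = 1, D = 4 on m = 2 processors.
print(fjp_load_bound(2, 1, 4, 1))                            # 7/12
print(satisfies_condition_6_3(Fraction(1, 2), 2, 1, 4, 1))   # True
print(satisfies_condition_6_3(Fraction(1, 2), 2, 1, 4, 2))   # False: larger Delta_i tightens the bound
```

The bound shrinks as ∆i grows, which is why the deadline-monotonic case (∆i ≤ 1) and the edf case are treated separately in the text.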

Observe that in dm-scheduling, jobs are assigned priorities in inverse proportion

to their relative deadline parameter; therefore, for a job Ji, it must be the case that

all higher-priority jobs Jk have Dk ≤ Di. Thus, ∆i ≤ 1. The following corollary

immediately follows from combining this observation and Theorem 6.1.


Corollary 6.1 Let I be a real-time instance such that for all Ji ∈ I the following condition is satisfied:

$$\mathrm{load}(\rho(J_i), I) \le \frac{1}{3} \left( m - (m - 1)\frac{E_i}{D_i} \right). \tag{6.5}$$

Instance I is dm schedulable upon a platform comprised of m unit-capacity processors.

§ Resource Augmentation. The following corollary gives a simple test for the

dm-schedulability of a real-time instance I.

Corollary 6.2 Any real-time instance I satisfying the following conditions

$$\forall J_k \in I : \mathrm{load}(\rho(J_k), I) \le \frac{m^2}{4m - 1} \tag{6.6}$$

$$\text{max-job-density}(I) \le \frac{m}{4m - 1} \tag{6.7}$$

is successfully scheduled by the deadline-monotonic scheduling algorithm upon an m-

processor unit-capacity multiprocessor platform.

Proof: In order that instance I be successfully scheduled by the deadline-monotonic

scheduling algorithm on m unit-capacity processors it is, by Equation 6.5, sufficient

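that for all jobs Jk ∈ I,

$$\begin{aligned}
& \mathrm{load}(I) \le \frac{1}{3} \times \left( m - (m - 1)\,\text{max-job-density}(I) \right) \\
\Leftarrow\ & \text{(From Condition 6.7)} \quad \mathrm{load}(I) \le \frac{1}{3} \times \left( m - (m - 1)\frac{m}{4m - 1} \right) \\
\equiv\ & \mathrm{load}(I) \le \frac{m^2}{4m - 1},
\end{aligned}$$

which is true (by Condition 6.6 above).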
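The algebraic step in this proof, and the resulting test, can be checked with exact arithmetic. The sketch below is illustrative: the function names are hypothetical, and the per-priority loads and the max-job-density of the instance are assumed to be supplied by the caller.

```python
from fractions import Fraction

def dm_thresholds(m):
    """Thresholds of Conditions 6.6 and 6.7: (load cap, max-job-density cap)."""
    return Fraction(m * m, 4 * m - 1), Fraction(m, 4 * m - 1)

def substituted_bound(m):
    """(1/3) * (m - (m - 1) * m / (4m - 1)), the middle line of the derivation."""
    return Fraction(1, 3) * (m - (m - 1) * Fraction(m, 4 * m - 1))

def dm_schedulable_by_cor_6_2(per_priority_loads, max_job_density, m):
    """Sufficient test: True means dm-schedulable; False is inconclusive."""
    load_cap, density_cap = dm_thresholds(m)
    return (max_job_density <= density_cap
            and all(value <= load_cap for value in per_priority_loads))

# The substituted bound simplifies to m^2 / (4m - 1) for every m checked here.
print(all(substituted_bound(m) == dm_thresholds(m)[0] for m in range(1, 65)))

# Hypothetical instance summary on m = 4 processors.
print(dm_schedulable_by_cor_6_2([Fraction(1, 2), Fraction(1)], Fraction(1, 4), 4))
```

Because the test is only sufficient, a False result does not show that the instance is unschedulable.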


Clearly, Conditions 6.6 and 6.7 are necessary for any I to be schedulable upon

a platform comprised of m processors each of computing capacity m/(4m − 1); by

Corollary 6.2 above, these conditions are also sufficient for I to be schedulable upon m

unit-capacity processors. Hence to within a constant factor of less than four, we have

obtained bounds on the multiplicative speed-up needed for the conditions based on

load and max-job-density to suffice for schedulability. The following theorem formally

states this; the proof of the theorem is identical to Theorem 4.2.




Theorem 6.2 [...] platform where each processor has speed 4 − 1/m.

6.3 edf-schedulability conditions

In the previous section, we were able to derive conditions for fixed-job-priority schedul-

ing algorithms, in general (Theorem 6.1). To obtain these conditions, we needed to

consider, for any job Ji, all the jobs with priority > ρ(Ji), even with deadlines after

Ji. For the edf-scheduling algorithm, we only need to consider jobs with deadlines

prior to Ji’s. This allows us to derive the following sufficient conditions for edf

schedulability.

Figure 6.1: Job Ji is being considered. Job Js is a job with priority ≥ ρ(Ji) with the earliest arrival time whose “intervals” overlap with that of Ji. (Timeline marking As, As + Ds, Ai, and Ai + Di.)

Theorem 6.3 Let I denote a real-time instance such that for each Ji ∈ I, the following condition is satisfied:

$$\mathrm{load}(\rho(J_i), I) \le \frac{1}{\Delta_i + 1} \left( m - (m - 1)\frac{E_i}{D_i} \right). \tag{6.8}$$

Instance I is edf schedulable upon a platform comprised of m unit-capacity processors.

Proof:

We will prove the contrapositive of the theorem. Suppose that a given real-time

instance I = {J1, J2, . . .} is unschedulable (according to the edf scheduling algo-

rithm) upon m unit-capacity processors, and let Ji denote the first job which misses

its deadline. As in Theorem 6.1, since Ji does not receive Ei units of execution over

the interval [Ai, Ai + Di), it must be the case that jobs of equal or greater priority

execute on all m processors for strictly more than (Di − Ei) time units during this

interval. Since in edf-scheduling only jobs with earlier absolute deadlines have higher

priority, at least m(Di − Ei) units of execution belong to jobs that have their deadline

in the interval [Ai, Ai + Di).

Let Js denote the job with earliest arrival time that has priority ≥ ρ(Ji), such that

Ai < As + Ds ≤ Ai + Di (see Figure 6.1). It is evident from the figure that all the

execution by jobs of priority ≥ ρ(Ji) occurring over [Ai, Ai + Di) must be generated

by jobs that arrive in, and have their deadlines within, [As, Ai + Di).

Ds is ≤ Di × ∆i. From Figure 6.1, it is clear that the interval [As, Ai + Di) is

of length at most (∆i + 1) × Di. Hence, for Ji to miss its deadline, it is necessary

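that the demand of jobs over [As, Ai + Di) (not including Ji’s contribution) satisfy the following inequality:

$$\begin{aligned}
& (A_i + D_i - A_s)\,\mathrm{load}(\rho(J_i), I) - E_i > m(D_i - E_i) \\
\Rightarrow\ & (\Delta_i + 1) \cdot D_i \cdot \mathrm{load}(\rho(J_i), I) - E_i > m(D_i - E_i) \\
\equiv\ & \mathrm{load}(\rho(J_i), I) > \frac{m - (m - 1)\frac{E_i}{D_i}}{\Delta_i + 1}.
\end{aligned} \tag{6.9}$$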

We have seen above that, in order for Ji to miss its deadline, it is necessary that

Condition 6.9 be satisfied; Theorem 6.3 follows by noting that Condition 6.9 is

the negation of Equation 6.8.

If the ratio between the largest relative deadline and smallest relative deadline of

any jobs of real-time instance I is bounded from above by constant K (i.e., for all

Ji ∈ I, ∆i ≤ K), then we may obtain a schedulability condition for edf similar to

Corollary 6.1:

Corollary 6.3 Let I be a real-time instance such that for each Ji ∈ I the following condition is satisfied:

$$\mathrm{load}(\rho(J_i), I) \le \frac{1}{K + 1} \left( m - (m - 1)\frac{E_i}{D_i} \right) \tag{6.10}$$

where $K \stackrel{\text{def}}{=} \max_{J_i \in I} \{\Delta_i\}$. Then, instance I is edf schedulable upon a platform comprised of m unit-capacity processors.

The above test works very well for small values of K. However, since the test of Corollary 6.3 depends on the ratio between the largest and smallest relative deadlines of I, and this ratio may not be bounded a priori, a constant-factor resource-augmentation bound cannot be derived from this test. Instead, in the next subsection we discuss a different edf-schedulability test, based on Corollary 4.2, that has a constant-factor resource-augmentation approximation ratio.


6.3.1 A Different edf-schedulability Test

As discussed in Chapter 2, Phillips et al. (Phillips et al., 1997) related the concept of

multiprocessor feasibility with edf schedulability using a technique known as resource

augmentation. Specifically, they proved that any real-time instance I feasible on a platform with m unit-capacity processors is schedulable according to edf on a processing platform with m processors each of speed (2 − 1/m). The following schedulability condition is immediately obtained from the feasibility test of Corollary 4.2 and the resource-augmentation speed-up result of (Phillips et al., 1997).

Corollary 6.4 For real-time instance I and identical multiprocessor platform Π com-

prised of m unit-speed processors, if

$$\mathrm{load}(I) \le \frac{(\sqrt{2} - 1)\,m^2}{2m - 1} \quad \text{and} \tag{6.11}$$

$$\text{max-job-density}(I) \le \frac{(\sqrt{2} - 1)\,m}{2m - 1}, \tag{6.12}$$

then the earliest-deadline-first scheduling algorithm (edf) can successfully schedule I

upon Π.

Proof: Consider a platform where each processor is of speed m/(2m − 1) times the speed of platform Π (denote this platform as m/(2m − 1) · Π). Let I ′ denote real-time instance I normalized to the speed of m/(2m − 1) · Π; that is, for each job Ji ∈ I, there exists a job J ′i with the same arrival time and deadline and with execution requirement ((2m − 1)/m) · Ei. Thus,

$$\mathrm{load}(I') = \frac{2m - 1}{m} \cdot \mathrm{load}(I) \quad \text{and} \tag{6.13}$$

$$\text{max-job-density}(I') = \frac{2m - 1}{m} \cdot \text{max-job-density}(I). \tag{6.14}$$

By Corollary 4.2, if load(I ′) ≤ (√2 − 1)m and max-job-density(I ′) ≤ (√2 − 1), then I ′ is feasible on m/(2m − 1) · Π. By Theorem 2.3, I ′ is also edf-schedulable on Π. Substituting Equations 6.13 and 6.14 into these conditions, we obtain Conditions 6.11 and 6.12 of the corollary.
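The two thresholds of Corollary 6.4 are easy to evaluate numerically. The sketch below uses hypothetical function names and floating-point arithmetic (because of the square root), and it also checks one piece of arithmetic: the reciprocal of the per-processor density threshold equals 2 + 2√2 − (√2 + 1)/m.

```python
import math

SQRT2_MINUS_1 = math.sqrt(2) - 1

def edf_thresholds(m):
    """Thresholds of Conditions 6.11 and 6.12: (load cap, max-job-density cap)."""
    return SQRT2_MINUS_1 * m * m / (2 * m - 1), SQRT2_MINUS_1 * m / (2 * m - 1)

def edf_schedulable_by_cor_6_4(load, max_job_density, m):
    """Sufficient test: True means edf-schedulable; False is inconclusive."""
    load_cap, density_cap = edf_thresholds(m)
    return load <= load_cap and max_job_density <= density_cap

def reciprocal_density_threshold(m):
    """(2m - 1) / ((sqrt(2) - 1) * m), the inverse of the cap in Condition 6.12."""
    return (2 * m - 1) / (SQRT2_MINUS_1 * m)

def closed_form(m):
    """2 + 2*sqrt(2) - (sqrt(2) + 1) / m."""
    return 2 + 2 * math.sqrt(2) - (math.sqrt(2) + 1) / m

print(edf_schedulable_by_cor_6_4(0.5, 0.2, 2))  # True
print(edf_schedulable_by_cor_6_4(0.6, 0.2, 2))  # False
print(all(math.isclose(reciprocal_density_threshold(m), closed_form(m))
          for m in range(1, 65)))
```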

Using the same techniques of Theorem 4.2, the following resource augmentation

result immediately follows.




Theorem 6.4 [...] platform where each processor has speed 2 + 2√2 − (√2 + 1)/m.

6.4 Full-Migration Schedulability of Sporadic Task Systems

As shown in Section 3.4, the load and max-job-density of a sporadic task system can

be efficiently determined. In this section, we give an application of the conditions for

dm scheduling derived in Section 6.2 to the scheduling of sporadic task systems. We

compare the schedulability tests of this dissertation to previously-known tests for the

dm scheduling of sporadic task systems.

Consider a sporadic task system τ = {τ1, τ2, . . . , τn} comprised of n tasks. Without

loss of generality, assume that the tasks are indexed according to decreasing priorities:

task τ1 is assigned the greatest priority, τn the least priority, and τk’s priority is higher

than τk+1’s for all k. We make no assumption about the relationship between a task’s

relative deadline and period parameter; however, if dk > pk, our current analysis

assumes that it is possible for two jobs of τk to be active at the same time (i.e., for a


given time t, two or more jobs of τk may execute concurrently).

Let I denote any real-time instance that is generated during run-time by sporadic

task system τ . Since the job Ji under consideration is assumed to have been generated

by task τk, only tasks τ1, τ2, . . . , τk will contribute to the load(ρ(Ji), I) term. The

maximum cumulative execution requirement by jobs of these tasks over any time

interval [t1, t2) is at most the sum of the maximum execution requirements of the

individual tasks:

$$\mathrm{demand}(\rho(J_i), I, t_1, t_2) \le \sum_{j=1}^{k} \mathrm{dbf}(\tau_j, t_2 - t_1).$$

From the definition of load with respect to a given priority (Equation 6.1), it follows

that

$$\mathrm{load}(\rho(J_i), I) \le \max_{t \ge 0} \left\{ \left( \sum_{j=1}^{k} \mathrm{dbf}(\tau_j, t) \right) / t \right\}. \tag{6.15}$$

Let us now overload the definition of load to apply to sporadic tasks of priority greater

or equal to task τk as follows:

$$\mathrm{load}(k, \tau) \stackrel{\text{def}}{=} \max_{t \ge 0} \left\{ \left( \sum_{j=1}^{k} \mathrm{dbf}(\tau_j, t) \right) / t \right\}.$$

That is, load(k, τ) denotes the maximum cumulative computational demand, normal-

ized by interval length, of priority k or greater that can be generated by sporadic task

system τ .

Recall that only the jobs generated by tasks {τ1, τ2, . . . , τk−1} have greater priority

than a job of τk. Therefore, using the definition of load from Equation 6.15, we may

derive the following additional corollary from Corollary 6.1.


Corollary 6.5 Any sporadic task system τ = {τ1, τ2, . . . , τn} satisfying

$$\forall k : 1 \le k \le n : \mathrm{load}(k, \tau) \le \frac{1}{3} \left( m - (m - 1)\frac{e_k}{d_k} \right) \tag{6.16}$$

is dm schedulable upon a platform comprised of m unit-capacity processors.
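A rough sketch of how this test might be evaluated is given below. Several points are assumptions rather than the dissertation's method: the demand-bound function is taken to be the standard one for sporadic tasks, dbf(τ, t) = max(0, ⌊(t − d)/p⌋ + 1) · e; the maximization is truncated at a caller-supplied `horizon` instead of using the exact procedure of Section 3.4; and all function names are hypothetical.

```python
from fractions import Fraction

def dbf(task, t):
    """Standard sporadic demand-bound function for task = (e, d, p)."""
    e, d, p = task
    if t < d:
        return 0
    return ((t - d) // p + 1) * e

def load_k(ordered_tasks, k, horizon):
    """Approximate load(k, tau): max of sum(dbf)/t over step points t <= horizon.

    ordered_tasks must be sorted by decreasing priority; the horizon cut-off
    is an assumption of this sketch.
    """
    prefix = ordered_tasks[:k]
    points = set()
    for e, d, p in prefix:
        t = d
        while t <= horizon:
            points.add(t)
            t += p
    return max((Fraction(sum(dbf(task, t) for task in prefix), t) for t in points),
               default=Fraction(0))

def dm_test_cor_6_5(tasks, m, horizon):
    """Sufficient dm test of Condition 6.16; False is inconclusive."""
    ordered = sorted(tasks, key=lambda task: task[1])  # dm: shorter relative deadline first
    for k, (e, d, p) in enumerate(ordered, start=1):
        bound = Fraction(1, 3) * (m - (m - 1) * Fraction(e, d))
        if load_k(ordered, k, horizon) > bound:
            return False
    return True

tau_example = [(2, 2, 5), (1, 1, 5), (1, 2, 6), (2, 4, 100), (2, 6, 100), (4, 8, 100)]
print(load_k(sorted(tau_example, key=lambda task: task[1]), 6, 200))  # 2
print(dm_test_cor_6_5(tau_example, 2, 200))                # False: the test is inconclusive here
print(dm_test_cor_6_5([(1, 10, 10), (1, 10, 10)], 2, 200))  # True
```

The τexample system of Chapter 5 fails the test on two processors even though it is feasible there, which is expected for a sufficient condition that comes with a speed-up bound.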

The same resource-augmentation technique used in Section 6.2 can be applied to sporadic task systems to obtain a resource-augmentation bound of 4 − 1/m for dm scheduling of sporadic tasks.

Recently, researchers have focussed on the multiprocessor static-priority schedul-

ing of sporadic task systems. Baker (Baker, 2003; Baker, 2006a) and Bertogna et

al. (Bertogna et al., 2005b) have derived sufficient conditions for the multiprocessor,

static-priority schedulability of sporadic task systems.

For the sake of comparison, recall Theorem 2.7 from Chapter 2. Previous work

has only empirically evaluated the effectiveness of the approach of Theorem 2.7, and,

to the best of our knowledge, no resource-augmentation bounds have been obtained.

In the next theorem, we give a lower bound on the resource-augmentation bound

associated with Theorem 2.7:

Theorem 6.5 The schedulability tests of Theorem 2.7 (Conditions 2.11 and 2.12) cannot have a resource-augmentation bound smaller than 3 − 2/m for a platform with m processors (where m > 1).

Proof: Consider the following task system τ comprised of (m − 1)ℓ small tasks and a single large task: the small tasks $\tau_i \in \{\tau_1, \ldots, \tau_{(m-1)\ell}\}$ have specification $\tau_i \stackrel{\text{def}}{=} (e_i, d_i, p_i) = (1, \ell, \ell)$; the large task is $\tau_{(m-1)\ell+1} \stackrel{\text{def}}{=} (\ell, \ell, \ell)$. This task system may be scheduled on a platform with m unit-capacity processors. It is easy to see that the (m − 1)ℓ small tasks may be scheduled upon the first (m − 1) processors (ℓ fit on each processor), and the large task is schedulable on the last processor. However, task $\tau_{(m-1)\ell+1}$ cannot be verified to be schedulable according to Theorem 2.7 for ℓ ≥ 2: observe that for all i ∈ {1, . . . , (m − 1)ℓ}, $\beta_{(m-1)\ell+1}(i)$ equals 2/ℓ (with respect to $\tau_{(m-1)\ell+1}$). Since $\beta_{(m-1)\ell+1}(i) = 2/\ell > 1 - 1 = \left(1 - \frac{e_{(m-1)\ell+1}}{d_{(m-1)\ell+1}}\right)$ for all ℓ ≥ 2, we must check Condition 2.11 of Theorem 2.7 to verify if $\tau_{(m-1)\ell+1}$ is schedulable. This is equivalent to checking $\sum_{k=1}^{(m-1)\ell} (1 - 1) < m(1 - 1)$, which is obviously false.

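We now determine the smallest constant α > 1 such that $\tau_{(m-1)\ell+1}$ is schedulable on m α-speed processors according to Conditions 2.11 and 2.12 of Theorem 2.7. For $\tau_i \in \{\tau_1, \ldots, \tau_{(m-1)\ell}\}$ and ℓ ≥ 2,

$$\beta_{(m-1)\ell+1}(i) = \frac{1 \cdot \frac{1}{\alpha} + \min\left( \frac{1}{\alpha}, \max\left( 0, \ell - (1 \cdot \ell) + \ell - \frac{1}{\alpha} \right) \right)}{\ell} = \frac{2}{\alpha \cdot \ell}.$$

If $\beta_{(m-1)\ell+1}(i) > 1 - \frac{e_{(m-1)\ell+1}}{\alpha \cdot d_{(m-1)\ell+1}}$, then to determine if $\tau_{(m-1)\ell+1}$ is schedulable according to Theorem 2.7, we must check Condition 2.11:

$$\sum_{i=1}^{(m-1)\ell} \left( 1 - \frac{1}{\alpha} \right) < m \left( 1 - \frac{1}{\alpha} \right) \;\Rightarrow\; (m - 1)\ell < m.$$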

The last inequality is false for all ℓ ≥ 2 and m > 1; thus, if $\beta_{(m-1)\ell+1}(i) > 1 - \frac{e_{(m-1)\ell+1}}{\alpha \cdot d_{(m-1)\ell+1}}$, the schedulability of $\tau_{(m-1)\ell+1}$ cannot be verified by Theorem 2.7 for any speedup α > 1. So, we will assume that $\beta_{(m-1)\ell+1}(i) \le 1 - \frac{e_{(m-1)\ell+1}}{\alpha \cdot d_{(m-1)\ell+1}}$ for all $\tau_i \in \{\tau_1, \ldots, \tau_{(m-1)\ell}\}$ and ℓ ≥ 2. Thus, to find the smallest α that satisfies Conditions 2.11 or 2.12 (with respect to the schedulability of $\tau_{(m-1)\ell+1}$), we must solve the following inequality obtained from the conditions of Theorem 2.7:

$$m \left( 1 - \frac{1}{\alpha} \right) \ge (m - 1) \cdot \ell \cdot \frac{2}{\alpha \cdot \ell}. \tag{6.17}$$

Solving for α > 1, we find that α ≥ 3 − 2/m, which is a lower bound on the resource-augmentation factor needed by Conditions 2.11 and 2.12.
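The last step can be confirmed with exact arithmetic. Multiplying Inequality 6.17 by α > 0 gives mα − m ≥ 2(m − 1), that is, α ≥ 3 − 2/m. The sketch below (hypothetical function names) checks that the inequality holds at this value and fails just below it.

```python
from fractions import Fraction

def min_speedup(m):
    """Smallest alpha satisfying Inequality 6.17, namely 3 - 2/m."""
    return 3 - Fraction(2, m)

def inequality_6_17_holds(m, ell, alpha):
    """m * (1 - 1/alpha) >= (m - 1) * ell * 2 / (alpha * ell), evaluated exactly."""
    alpha = Fraction(alpha)
    return m * (1 - 1 / alpha) >= (m - 1) * ell * Fraction(2) / (alpha * ell)

print(min_speedup(2), min_speedup(4))  # 2 5/2
print(inequality_6_17_holds(4, 3, min_speedup(4)))                      # True
print(inequality_6_17_holds(4, 3, min_speedup(4) - Fraction(1, 1000)))  # False
```

Note that ℓ cancels out of the inequality, so the lower bound depends only on the number of processors.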

The previous theorem only provides a lower bound on the resource-augmentation

factor, while Corollary 6.2 gives an upper bound. It is interesting to note that the

test of Corollary 6.5 applied to the example task system used in the previous proof requires a speed-up of 4 − 1/m. Further work is needed to obtain an upper bound on

the resource-augmentation factor for the test of Theorem 2.7.

6.5 Summary

The results of this chapter have shown that despite the non-existence of optimal mul-

tiprocessor scheduling algorithms, online algorithms with constant-factor resource-

augmentation approximation ratios exist. We have derived sufficient conditions for the

well-known edf and dm scheduling algorithms in terms of load and max-job-density.

For multiprocessor scheduling, these results are the first known schedulability conditions with resource-augmentation approximation ratios. Future work will explore whether these conditions may be tightened and whether there exist scheduling algorithms with better resource-augmentation bounds.


Chapter 7

The Partitioned Scheduling and

Schedulability Analysis of Sporadic

Task Systems

In this chapter, we report our findings concerning the preemptive multiprocessor

scheduling of sporadic real-time systems under the partitioned paradigm. Observe

that the process of partitioning tasks among processors reduces a multiprocessor

scheduling problem to a series of uniprocessor problems (one to each processor).

In this chapter, we consider both fixed-task-priority and fixed-job-priority schedul-

ing algorithms at the uniprocessor scheduling level. For fixed-job-priority algorithms,

the optimality of edf for preemptive uniprocessor scheduling (Liu and Layland, 1973;

Dertouzos, 1974) makes edf a reasonable algorithm to use as the run-time scheduling

algorithm on each processor. For fixed-task-priority algorithms, dm is known to be

optimal for special subclasses of sporadic task systems on uniprocessors (Leung and

Whitehead, 1982). Therefore, we consider both of these algorithms in this chapter.

Throughout this chapter, we will refer to special subclasses of sporadic task sys-

tems that are classified by the relationship between the values of pi and di for each

τi ∈ τ . For the purposes of this chapter, we consider three subclasses based on this

relationship.

• Implicit-deadline: Each sporadic task τi ∈ τ satisfies the constraint that

di = pi.

• Constrained: Each sporadic task τi ∈ τ satisfies the constraint that di ≤ pi.

• Arbitrary: There is no restriction placed on the relationship between di and

pi.
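The three subclasses are nested, so a task system can be labeled with the most restrictive class it satisfies. A minimal sketch, assuming tasks are given as (ei, di, pi) tuples and using a hypothetical function name:

```python
def classify(tasks):
    """Return the most restrictive subclass that a list of (e, d, p) tasks satisfies."""
    if all(d == p for _, d, p in tasks):
        return "implicit-deadline"
    if all(d <= p for _, d, p in tasks):
        return "constrained"
    return "arbitrary"

tau_example = [(2, 2, 5), (1, 1, 5), (1, 2, 6), (2, 4, 100), (2, 6, 100), (4, 8, 100)]
print(classify(tau_example))            # constrained
print(classify([(1, 4, 4), (2, 6, 6)]))  # implicit-deadline
print(classify([(1, 5, 4)]))             # arbitrary
```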

§ Summary of Contributions. For edf-scheduled processors, we propose two

polynomial-time partitioning algorithms. The first algorithm we propose assigns spo-

radic tasks to processors based on their task density parameter (density is defined

as the execution requirement of a task divided by the minimum of either the period

or relative deadline parameter). The second partitioning algorithm we propose uses

an approximation to the demand-bound function (similar to the one defined in Sec-

tion 3.4) for a sporadic task and task utilization (utilization is defined as execution

requirement divided by the period parameter) as dual criteria for assigning a task to

a processor. For dm-scheduled processors, we propose a polynomial-time partitioning

algorithm which uses an approximation based on a different real-time workload char-

acterization (not yet mentioned in this dissertation) called the request-bound function.

All of the proposed partitioning algorithms are variants of the First-Fit bin-packing

heuristic (Johnson, 1973).
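To make the terminology concrete, the sketch below computes density and utilization as defined above and runs a plain First-Fit pass that accepts a task on a processor when the total density there stays at most one. This acceptance rule is a well-known sufficient condition for uniprocessor edf, but it is only an assumption here: the task ordering and exact fit test of the algorithms proposed in this chapter are not reproduced, and the function names are hypothetical.

```python
from fractions import Fraction

def density(task):
    """Execution requirement divided by min(relative deadline, period)."""
    e, d, p = task
    return Fraction(e, min(d, p))

def utilization(task):
    """Execution requirement divided by the period."""
    e, _, p = task
    return Fraction(e, p)

def first_fit_by_density(tasks, m):
    """Assign each task to the first processor whose total density stays <= 1.

    Returns one task list per processor, or None if some task fits nowhere.
    """
    totals = [Fraction(0)] * m
    assignment = [[] for _ in range(m)]
    for task in tasks:
        for proc in range(m):
            if totals[proc] + density(task) <= 1:
                totals[proc] += density(task)
                assignment[proc].append(task)
                break
        else:
            return None
    return assignment

tau_example = [(2, 2, 5), (1, 1, 5), (1, 2, 6), (2, 4, 100), (2, 6, 100), (4, 8, 100)]
print(sum(utilization(t) for t in tau_example))  # total utilization, below 2
print(first_fit_by_density(tau_example, 2))      # None: two tasks already have density 1
print(first_fit_by_density(tau_example, 4))
```

On the τexample system of Chapter 5, the pass fails on two processors because τ1 and τ2 each have density one, even though the total utilization is well below two.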

For the partitioning algorithms using the demand-bound function and request-

bound function approximation, we derive sufficient conditions for success of our al-

gorithm (Theorems 7.4, 7.5, 7.8, and 7.9). We also show (Corollary 7.2 and Theo-

rem 7.10) that our demand-based and request-bound function partitioning algorithms

both have the following performance guarantee for arbitrary sporadic task systems.


If a sporadic task system is feasible on m identical processors, then the

same task system can be partitioned by our algorithm among m identical

processors in which the individual processors are (4 − 2/m) times as fast as in

the original system, such that all jobs of all tasks assigned to each processor

will always meet their deadlines if scheduled using the preemptive edf

scheduling algorithm.

A slightly tightened guarantee may also be defined for the performance of the algo-

rithms over constrained sporadic task systems.

For the density-based partitioning algorithm, we also derive sufficient conditions

for success (Theorem 7.1). However, we show (Theorem 7.2) that density-based par-

titioning does not have a performance guarantee on the necessary speed of each pro-

cessor for the partitioning algorithm to succeed (i.e., there cannot exist a guarantee

for density-based partitioning similar to the preceding guarantee for demand-based

partitioning).

§ Organization. The remainder of this chapter is organized as follows. In Sec-

tion 7.1, we design partitioning algorithms for when edf is used to schedule each

individual processor. In Section 7.1.2 we present, prove the correctness of, and eval-

uate the performance of a very simple density-based partitioning algorithm. In Sec-

tion 7.1.3, we present, and prove correct, a somewhat more sophisticated demand-

based partitioning algorithm. In Section 7.1.4, we prove that this algorithm satisfies

a property that the simpler algorithm does not possess: if given sufficiently faster

processors, it is able to guarantee to meet all deadlines for all feasible systems. We

also list some other results that we have obtained, and include a brief discussion of

the significance of our results. In Section 7.1.5 we propose an improvement to the

algorithm: although this improvement does not seem to affect the worst-case behavior

of the algorithm, it is shown to be of use in successfully partitioning some sporadic


task systems that our basic algorithm — the one presented in Section 7.1.3 — fails

to handle.

In Section 7.2, we design a partitioning algorithm for when dm is used to schedule

each individual processor. In Section 7.2.2, we present our polynomial-time partitioning

algorithm and prove its correctness. Section 7.2.3 evaluates the efficacy of

the partitioning algorithm in terms of sufficient conditions for success and resource

augmentation approximation bounds.

7.1 edf-based Partitioning

Most partitioning schemes use the following steps at a high-level:

1. Sort tasks in order of some criteria.

2. In the sorted order of Step 1, assign each task to a processor upon which it “fits.”

A task “fits” on a processor if it will always meet all deadlines when assigned to

the processor, and it does not cause another previously-assigned task to miss a

deadline.

3. After each task has been assigned to a processor, use a uniprocessor scheduling

algorithm on each processor to schedule the processor’s respective tasks.

Observe that Steps 1 and 2 occur prior to system runtime. Step 3 occurs during

runtime after task assignment.

A further remark about Step 2: a uniprocessor schedulability test is typically the

process by which it is determined whether a task fits on a processor. In this section,

we consider edf scheduling. Two possible edf-schedulability tests are based on the

demand-bound function and on task density. In the next subsection (Section 7.1.1), we

define an approximation to the demand-bound function and compare it with task


density. In Section 7.1.2, we define a polynomial-time partitioning algorithm using

the density-based schedulability test. In Section 7.1.3, we define a polynomial-time

partitioning algorithm using the dbf-approximation.

7.1.1 Approximation of dbf(τi, t)

The function dbf(τi, t), if plotted as a function of t for a given task τi, is represented

by a series of “steps,” each of height ei, at time-instants di, di + pi, di + 2pi, · · · ,

di + kpi, · · · . As reviewed in Section 3.4, Albers and Slomka (Albers and Slomka,

2004) have proposed a technique for approximating

the dbf, which tracks the dbf exactly through the first several steps and then

approximates it by a line of slope ei/pi (see Figure 7.1). In the following, we are

applying this technique in essentially tracking dbf exactly for a single step of height

ei at time-instant di, followed by a line of slope ei/pi; this is essentially dbf(τi, t, ki)

defined in Equation 3.8 with ki = 1.

\[
\mathrm{dbf}^{*}(\tau_i, t) =
\begin{cases}
0, & \text{if } t < d_i \\
e_i + u_i \times (t - d_i), & \text{otherwise}
\end{cases}
\tag{7.1}
\]

As stated earlier, it has been shown that the cumulative execution requirement

of jobs of τi over an interval is maximized if one job arrives at the start of the inter-

val, and subsequent jobs arrive as rapidly as permitted. Intuitively, approximation

dbf∗ (Equation 7.1) models this job-arrival sequence by requiring that the first job’s

deadline be met explicitly by being assigned ei units of execution between its arrival-

time and its deadline, and that τi be assigned ui × δ t of execution over time-interval

[t, t+ δ t), for all instants t after the deadline of the first job, and for arbitrarily small

positive δ t (see Figure 7.2 for a pictorial depiction).
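The two workload functions can be sketched in a few lines of Python. This is only an illustration: the `Task` class and its field names are hypothetical, and the exact `dbf` below assumes the standard sporadic-task form (a step of height e at d, d + p, d + 2p, and so on) described in the text, since the formal definition lives in Chapter 3.

```python
from dataclasses import dataclass
from math import floor


@dataclass(frozen=True)
class Task:
    e: float  # execution requirement e_i
    d: float  # relative deadline d_i
    p: float  # period p_i

    @property
    def u(self) -> float:
        """Utilization u_i = e_i / p_i."""
        return self.e / self.p


def dbf(task: Task, t: float) -> float:
    """Exact demand bound: one step of height e at d, d + p, d + 2p, ..."""
    if t < task.d:
        return 0.0
    return (floor((t - task.d) / task.p) + 1) * task.e


def dbf_star(task: Task, t: float) -> float:
    """Equation 7.1: exact up to the first step, then a line of slope u."""
    if t < task.d:
        return 0.0
    return task.e + task.u * (t - task.d)


if __name__ == "__main__":
    tau = Task(e=2.0, d=4.0, p=10.0)  # illustrative values only
    for t in (3.0, 4.0, 9.0, 14.0):
        print(t, dbf(tau, t), dbf_star(tau, t))
```

Both functions are evaluated in constant time, which is what the partitioning algorithms below rely on.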


[Figure 7.1: plot of dbf(τi, t) against t, with steps reaching e, 2e, 3e, 4e, 5e at t = di, di + pi, di + 2pi, di + 3pi, di + 4pi.]

Figure 7.1: The step function denotes a plot of dbf(τi, t) as a function of t. The dashed line represents the function dbf∗(τi, t), approximating dbf(τi, t); for t < di, dbf∗(τi, t) ≡ dbf(τi, t) = 0.

Observe that the following inequalities hold for all τi and for all t ≥ 0:

dbf(τi, t) ≤ dbf∗(τi, t) < 2 · dbf(τi, t) (7.2)

with the ratio dbf∗(τi, t)/dbf(τi, t) being maximized just prior to the deadline of the

second job of τi in the synchronous arrival sequence (i.e., at t = di + pi − ε for ε an

arbitrarily small positive number); at this time-instant, dbf∗(τi, t) → 2ei while

dbf(τi, t) = ei.
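A quick numeric check of Inequality 7.2 is sketched below. The helper names and the sample task parameters are assumptions made for illustration, and the ratio is only evaluated for t ≥ di, where dbf is positive.

```python
from math import floor


def dbf(e, d, p, t):
    """Exact demand bound of a sporadic task (e, d, p) over an interval of length t."""
    return 0.0 if t < d else (floor((t - d) / p) + 1) * e


def dbf_star(e, d, p, t):
    """Approximation of Equation 7.1."""
    return 0.0 if t < d else e + (e / p) * (t - d)


def ratio(e, d, p, t):
    """dbf*/dbf at an instant t >= d, where dbf is positive."""
    return dbf_star(e, d, p, t) / dbf(e, d, p, t)


if __name__ == "__main__":
    e, d, p = 2.0, 4.0, 10.0  # illustrative values only
    # The ratio approaches (but never reaches) 2 just before d + p.
    for eps in (1.0, 0.1, 0.001):
        print(eps, ratio(e, d, p, d + p - eps))
```

For these sample values the printed ratios climb toward 2 as eps shrinks, matching the observation that the worst case occurs just before the second deadline.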

§ Comparison of dbf∗ and the density approximation. It is known that a

sufficient condition for a sporadic task system to be feasible upon a unit-capacity

uniprocessor is ∑_{τi∈τ} task-density(τi) ≤ 1; i.e., the sum of the densities of all tasks

in the system not exceed one (see, e.g., (Liu, 2000, Theorem 6.2)). This condition

is obtained by essentially approximating the demand bound function dbf(τi, t) of τi

by the quantity (t · task-density(τi)). This approximation is never superior to our


[Figure 7.2: timeline with marks at di and pi + di; ei units of execution are reserved before di, and capacity ui is reserved thereafter.]

Figure 7.2: Pictorial representation of task τi’s reservation of computing capacity in a processor-sharing schedule. ei units of execution are reserved over the interval [0, di). The hatched region depicts τi’s reservation of computing capacity over the interval [di, ∞) – over any time interval [t, t + δt), an amount δt × ui of computing capacity is reserved.

approximation dbf∗, and is inferior if di ≠ pi: for our example in Figure 7.1, this

approximation would be represented by a straight line with slope ei/di passing through

the origin.
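The comparison can be checked directly from the definitions. The short derivation below is a sketch for instants t ≥ di (for t < di, dbf∗ is zero while the density line is non-negative); it uses only the definitions of task-density and Equation 7.1.

```latex
% Case d_i <= p_i: task-density(\tau_i) = e_i / d_i, and for t >= d_i
\[
t \cdot \frac{e_i}{d_i} - \mathrm{dbf}^{*}(\tau_i, t)
  = e_i \,(t - d_i)\left(\frac{1}{d_i} - \frac{1}{p_i}\right) \;\ge\; 0 .
\]
% Case d_i > p_i: task-density(\tau_i) = e_i / p_i = u_i, and for t >= d_i
\[
t \cdot u_i - \mathrm{dbf}^{*}(\tau_i, t)
  = u_i \,(d_i - p_i) \;>\; 0 .
\]
```

In both cases the density line lies on or above dbf∗, so the density-based test can never accept a task that the dbf∗-based test would reject.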

7.1.2 Density-Based Partitioning

In this section, we present a simple, efficient, algorithm for partitioning a sporadic task

system among the processors of a multiprocessor platform. We also show that this

partitioning algorithm does not offer a resource-augmentation performance guarantee.

Let us suppose that we are given sporadic task system τ comprised of n tasks

τ1, τ2, . . . τn, and a platform comprised of m unit-capacity processors π1, π2, . . . , πm.

For each task τi, recall that its density task-density(τi) is defined as follows:

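\[
\text{task-density}(\tau_i) \stackrel{\text{def}}{=} \frac{e_i}{\min(d_i, p_i)},
\]

and

\[
\text{max-task-density}(\tau) \stackrel{\text{def}}{=} \max_{\tau_i \in \tau} \bigl( \text{task-density}(\tau_i) \bigr).
\]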


Without loss of generality, let us assume that the tasks in τ are indexed according

to non-increasing density: task-density(τi) ≥ task-density(τi+1) for all i, 1 ≤ i < n.

Our partitioning algorithm considers the tasks in the order τ1, τ2, . . . . Suppose that

tasks τ1, τ2, . . ., τi−1 have all been successfully allocated among the m processors, and

we are now attempting to allocate task τi to a processor. Our algorithm for doing

this is a variant of the First Fit (Johnson, 1974) algorithm for bin-packing, and is as

follows (see Figure 7.3 for a pseudo-code representation). For any processor πℓ, let

τ(πℓ) denote the tasks from among τ1, . . . , τi−1 that have already been allocated to

processor πℓ. Considering the processors π1, π2, . . . , πm, in any order, we will assign

task τi to any processor πk, 1 ≤ k ≤ m, that satisfies the following condition:

\[
\text{task-density}(\tau_i) + \sum_{\tau_j \in \tau(\pi_k)} \text{task-density}(\tau_j) \le 1. \tag{7.3}
\]
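The allocation rule just described can be sketched in Python as follows. This is an illustrative rendering rather than the dissertation's own code: the `Task` class, the function names, and the choice to return `None` on failure are assumptions, and processors are scanned in index order (one of the orders the text allows).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    e: float  # execution requirement
    d: float  # relative deadline
    p: float  # period


def density(task: Task) -> float:
    """task-density = e / min(d, p)."""
    return task.e / min(task.d, task.p)


def density_partition(tasks, m):
    """First-fit by non-increasing density; returns m task lists or None on failure."""
    ordered = sorted(tasks, key=density, reverse=True)
    processors = [[] for _ in range(m)]
    totals = [0.0] * m  # cumulative density already placed on each processor
    for task in ordered:
        for k in range(m):
            if totals[k] + density(task) <= 1.0:  # Condition 7.3
                processors[k].append(task)
                totals[k] += density(task)
                break
        else:
            return None  # no processor satisfied Condition 7.3
    return processors


if __name__ == "__main__":
    tau = [Task(1, 2, 4), Task(1, 2, 2), Task(3, 4, 4)]  # illustrative values only
    print(density_partition(tau, 2))
    print(density_partition(tau, 1))
```

With floating-point parameters the comparison against 1.0 may need a small tolerance; exact rational arithmetic avoids that concern.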

Lemma 7.1 Algorithm densityPartition successfully partitions any sporadic

task system τ on m ≥ 1 processors, that satisfies the following condition:

\[
\sum_{\tau_i \in \tau} \text{task-density}(\tau_i) \le m - (m - 1) \times \text{max-task-density}(\tau). \tag{7.4}
\]

Proof: Let us suppose that Algorithm densityPartition fails to partition task

system τ ; in particular, let us assume that it fails to assign task τi to any processor,

for some i ≤ n. It must be the case that each of the m processors fails the test of

Equation 7.3; i.e., tasks previously assigned to each processor have their densities

summing to more than (1 − task-density(τi)). Summing over all m processors, this


densityPartition(τ, m)

▷ The collection of sporadic tasks τ = {τ1, . . . , τn} is to be partitioned on m identical, unit-capacity processors denoted π1, π2, . . . , πm. (Tasks are indexed according to non-increasing density: task-density(τi) ≥ task-density(τi+1) for all i.) τ(πk) denotes the tasks assigned to processor πk; initially, τ(πk) ← ∅ for all k.

1 for i ← 1 to n   ▷ i ranges over the tasks
2     for k ← 1 to m   ▷ k ranges over the processors, considered in any order
3         if τi satisfies Condition 7.3 on processor πk then   ▷ assign τi to πk; proceed to next task
4             τ(πk) ← τ(πk) ∪ {τi}
5             break;
6     end (of inner for loop);
7     if (k > m) return partitioning failed
8 end (of outer for loop);
9 return partitioning succeeded

Figure 7.3: Pseudo-code for density-based partitioning algorithm.

implies that

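\[
\begin{aligned}
& \sum_{j=1}^{i-1} \text{task-density}(\tau_j) > m \times (1 - \text{task-density}(\tau_i)) \\
\equiv\ & \sum_{j=1}^{i} \text{task-density}(\tau_j) > m - (m - 1)\,\text{task-density}(\tau_i) \\
\Rightarrow\ & \sum_{\tau_i \in \tau} \text{task-density}(\tau_i) > m - (m - 1) \times \text{max-task-density}(\tau).
\end{aligned}
\]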

Hence, any system which Algorithm densityPartition fails to partition must have

∑_{τi∈τ} task-density(τi) > m − (m − 1) × max_{τi∈τ}(task-density(τi)). The lemma follows.

Theorem 7.1 Algorithm densityPartition successfully partitions any sporadic task


system τ on m ≥ 2 processors, that satisfies the following condition:

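\[
\sum_{i=1}^{n} \text{task-density}(\tau_i) \le
\begin{cases}
m - (m - 1)\,\text{max-task-density}(\tau) & \text{if } \text{max-task-density}(\tau) \le \frac{1}{2} \\[4pt]
\frac{m}{2} + \text{max-task-density}(\tau) & \text{if } \text{max-task-density}(\tau) \ge \frac{1}{2}
\end{cases}
\tag{7.5}
\]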

Proof: Let us define a task τi to be dense if task-density(τi) > 1/2. We consider

two cases separately.

§ Case 1: There are no dense tasks. In this case max-task-density(τ) ≤ 1/2, and

Equation 7.5 reduces to

\[
\sum_{i=1}^{n} \text{task-density}(\tau_i) \le m - (m - 1) \times \text{max-task-density}(\tau);
\]

by Lemma 7.1, this condition is sufficient to guarantee that Algorithm densityPar-

tition successfully partitions the tasks in τ .

§ Case 2: There are dense tasks. In this case max-task-density(τ) > 1/2, and

Equation 7.5 reduces to

\[
\sum_{i=1}^{n} \text{task-density}(\tau_i) \le \frac{m}{2} + \text{max-task-density}(\tau). \tag{7.6}
\]

Observe first that any task system satisfying Condition 7.6 has at most m dense

tasks – this follows from the observation that one task will have density equal to

max-task-density(τ), and at most (m − 1) other tasks may each have density greater than

1/2 while respecting the bound on ∑_{i=1}^{n} task-density(τi) of Condition 7.6. We consider

separately the two cases when there are strictly less than m dense tasks, and when

there are exactly m dense tasks.

§ Case 2.1: Fewer than m dense tasks. Let nh denote the number of dense tasks.

Algorithm densityPartition assigns each of these nh tasks to a different processor.


We will apply Lemma 7.1 to prove that Algorithm densityPartition successfully

assigns the tasks {τnh+1, τnh+2, . . . , τn} on (m− nh) processors; the correctness of the

theorem would then follow from the observation that Algorithm densityPartition

does in fact have (m − nh) processors left over after tasks τ1 through τnh have been

assigned.

Let us now compute upper bounds on the cumulative and maximum densities of

the task system {τnh+1, τnh+2, . . . , τn}:

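\[
\sum_{j=n_h+1}^{n} \text{task-density}(\tau_j) = \sum_{j=1}^{n} \text{task-density}(\tau_j) - \sum_{j=1}^{n_h} \text{task-density}(\tau_j)
\]

\[
\Rightarrow\ \sum_{j=n_h+1}^{n} \text{task-density}(\tau_j) \le \frac{m}{2} + \text{max-task-density}(\tau) - \sum_{j=1}^{n_h} \text{task-density}(\tau_j)
\]

⇒ (from the fact that task-density(τ1) = max-task-density(τ) and task-density(τi) ≥ 1/2 for all i = 2, . . . , nh)

\[
\sum_{j=n_h+1}^{n} \text{task-density}(\tau_j) \le \frac{m}{2} + \text{max-task-density}(\tau) - \Bigl( \text{max-task-density}(\tau) + \frac{1}{2} \times (n_h - 1) \Bigr)
\]

\[
\equiv\ \sum_{j=n_h+1}^{n} \text{task-density}(\tau_j) \le \frac{m - n_h + 1}{2}. \tag{7.7}
\]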

Furthermore, since each task in {τnh+1, τnh+2, . . . , τn} is, by definition of nh, not dense,

it is the case that

\[
\max_{j=n_h+1}^{n} \bigl( \text{task-density}(\tau_j) \bigr) \le \frac{1}{2}. \tag{7.8}
\]

By applying the bounds derived in Equation 7.7 and Equation 7.8 above to Lemma 7.1,

we may conclude that Algorithm densityPartition successfully assigns the tasks


{τnh+1, τnh+2, . . . , τn} on (m − nh) processors:

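\[
\begin{aligned}
& \sum_{j=n_h+1}^{n} \text{task-density}(\tau_j) \le (m - n_h) - (m - n_h - 1) \max_{j=n_h+1}^{n} \bigl( \text{task-density}(\tau_j) \bigr) \\
\Leftarrow\ & \frac{m - n_h + 1}{2} \le (m - n_h) - \frac{m - n_h - 1}{2} \\
\equiv\ & \frac{m - n_h + 1}{2} \le \frac{m - n_h + 1}{2}
\end{aligned}
\]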

which is, of course, always true.

§ Case 2.2: Exactly m dense tasks. Let task-density(τR) denote the cumulative density of all the non-dense tasks:

\[
\text{task-density}(\tau_R) \stackrel{\text{def}}{=} \sum_{j=m+1}^{n} \text{task-density}(\tau_j).
\]

Observe that task-density(τR) < m/2 − (m − 1) × 1/2 = 1/2. By Equation 7.5, since max-task-density(τ) > 1/2, the total system density does not exceed m/2 + max-task-density(τ). Because task-density(τ1) = max-task-density(τ), the (m − 1) tasks τ2, τ3, . . . , τm have total density ≤ m/2 − task-density(τR). Since τm has the smallest density among these tasks, it must be the case that task-density(τm) is at most (m/2 − task-density(τR))/(m − 1). We will prove that task-density(τm) + task-density(τR) ≤ 1, from which it will follow that all the tasks τm, . . . , τn fit on a single processor. Indeed,

\[
\begin{aligned}
& \text{task-density}(\tau_m) + \text{task-density}(\tau_R) \le 1 \\
\Leftarrow\ & \frac{\frac{m}{2} - \text{task-density}(\tau_R)}{m - 1} + \text{task-density}(\tau_R) \le 1 \\
\equiv\ & m - 2\,\text{task-density}(\tau_R) + 2m\,\text{task-density}(\tau_R) - 2\,\text{task-density}(\tau_R) \le 2m - 2 \\
\equiv\ & \text{task-density}(\tau_R)\,(2m - 4) \le m - 2 \\
\equiv\ & \text{task-density}(\tau_R) \le \frac{1}{2},
\end{aligned}
\]

which is true.
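The sufficient condition of Theorem 7.1 (Equation 7.5) is easy to evaluate once the task densities are known. The sketch below is illustrative only: the function name is hypothetical, and densities are assumed to be supplied as exact `Fraction` values so that the boundary case (equality) is not disturbed by rounding.

```python
from fractions import Fraction


def satisfies_condition_7_5(densities, m):
    """Return True if the density values satisfy Equation 7.5 for m processors."""
    if m < 2:
        raise ValueError("Theorem 7.1 is stated for m >= 2")
    total = sum(densities, Fraction(0))
    d_max = max(densities)
    if d_max <= Fraction(1, 2):
        bound = m - (m - 1) * d_max      # first branch of Equation 7.5
    else:
        bound = Fraction(m, 2) + d_max   # second branch of Equation 7.5
    return total <= bound


if __name__ == "__main__":
    # Illustrative densities only.
    print(satisfies_condition_7_5([Fraction(3, 4), Fraction(1, 2), Fraction(1, 2)], 2))
    print(satisfies_condition_7_5([Fraction(9, 10)] * 3, 2))
```

A `True` result guarantees that densityPartition succeeds; a `False` result is inconclusive, since the condition is sufficient but not necessary.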


7.1.2.1 Resource Augmentation Analysis

Consider the task system τ comprised of the n tasks τ1, τ2, . . . , τn, with τi having the

following parameters:

\[
e_i = 2^{i-1}, \qquad d_i = 2^{i} - 1, \qquad p_i = \infty.
\]

Observe that τ is feasible on a single unit-capacity processor, since

\[
\sum_{i=1}^{n} \mathrm{dbf}(\tau_i, t) = \sum_{i=1}^{\lfloor \log_2 (t+1) \rfloor} 2^{i-1},
\]

which is ≤ t for all t ≥ 0, with equality at t = 2 − 1, 2^2 − 1, . . . , 2^n − 1 and strict

inequality for all other t.

Now, task-density(τi) > 1/2

for each i; i.e., each task is dense. Hence, Algo-

rithm densityPartition will assign each task to a distinct processor, and hence

is only able to schedule τ upon n unit-capacity processors. Alternatively, Algorithm

densityPartition would need a processor of computing capacity > n/2 (the total density of the n dense tasks) in

order to have Condition 7.3 be satisfied for all tasks on a single processor. By making

n arbitrarily large, we have the following result:

Theorem 7.2 For any constant ξ ≥ 1, there is a sporadic task system τ and some

positive integer m such that τ is (global or partitioned) feasible on m unit-capacity

processors, but Algorithm densityPartition fails to successfully partition the tasks

in τ among (i) ξ×m unit-capacity processors, or (ii) m processors each of computing

capacity ξ.
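The task system used in this argument can be generated and checked mechanically. The following sketch is illustrative (function names are hypothetical); it relies on the fact that, with an infinite period, each task releases at most one job, so its dbf is ei once t ≥ di and zero before.

```python
from fractions import Fraction


def counterexample(n):
    """(e_i, d_i) pairs with e_i = 2**(i-1) and d_i = 2**i - 1; p_i is infinite."""
    return [(2 ** (i - 1), 2 ** i - 1) for i in range(1, n + 1)]


def total_demand(tasks, t):
    """Sum of dbf(tau_i, t); each task contributes a single job of size e_i."""
    return sum(e for e, d in tasks if t >= d)


def densities(tasks):
    """task-density = e / min(d, p) = e / d, because p is infinite."""
    return [Fraction(e, d) for e, d in tasks]


if __name__ == "__main__":
    tasks = counterexample(6)
    # Uniprocessor feasibility: demand never exceeds the interval length.
    print(all(total_demand(tasks, t) <= t for t in range(2 ** 6 + 1)))
    # Every task is dense, so first-fit on density places one task per processor.
    print([float(x) for x in densities(tasks)])
```

Since every density exceeds 1/2, no two tasks can share a unit-capacity processor under Condition 7.3, even though the whole system fits on one processor under edf.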


7.1.3 Algorithm edf-partition

Given sporadic task system τ comprised of n tasks τ1, τ2, . . . τn, and a platform com-

prised of m unit-capacity processors π1, π2, . . . , πm, we now describe another algorithm

for partitioning the tasks in τ among the m processors. In Section 7.1.4 we will prove

that this algorithm, unlike the one described above in Section 7.1.2, does make non-

trivial resource-augmentation performance guarantees. With no loss of generality, let

us assume that the tasks in τ are indexed according to non-decreasing order of their

relative deadline parameter (i.e., di ≤ di+1 for all i, 1 ≤ i < n). Our partitioning

algorithm (see Figure 7.4 for a pseudo-code representation) considers the tasks in the

order τ1, τ2, . . . . Suppose that tasks τ1, τ2, . . ., τi−1 have all been successfully allocated

among the m processors, and we are now attempting to allocate task τi to a processor.

For any processor πℓ, let τ(πℓ) denote the tasks from among τ1, . . . , τi−1 that have

already been allocated to processor πℓ. Considering the processors π1, π2, . . . , πm, in

any order, we will assign task τi to the first processor πk, 1 ≤ k ≤ m, that satisfies

the following two conditions:

\[
d_i - \sum_{\tau_j \in \tau(\pi_k)} \mathrm{dbf}^{*}(\tau_j, d_i) \ge e_i \tag{7.9}
\]

and

\[
1 - \sum_{\tau_j \in \tau(\pi_k)} u_j \ge u_i. \tag{7.10}
\]

If no such πk exists, then we declare failure: we are unable to conclude that sporadic

task system τ is feasible upon the m-processor platform.
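A compact Python sketch of this procedure is given below. It is an illustration under stated assumptions rather than the author's implementation: the `Task` class, the function names, and the `None` return on failure are hypothetical, and processors are scanned in index order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    e: float  # execution requirement
    d: float  # relative deadline
    p: float  # period

    @property
    def u(self) -> float:
        """Utilization e / p."""
        return self.e / self.p


def dbf_star(task: Task, t: float) -> float:
    """Approximate demand bound of Equation 7.1."""
    return 0.0 if t < task.d else task.e + task.u * (t - task.d)


def edf_partition(tasks, m):
    """First-fit by non-decreasing deadline; returns m task lists or None on failure."""
    ordered = sorted(tasks, key=lambda task: task.d)
    processors = [[] for _ in range(m)]
    for task in ordered:
        for assigned in processors:
            demand = sum(dbf_star(tj, task.d) for tj in assigned)
            util = sum(tj.u for tj in assigned)
            fits_demand = task.d - demand >= task.e  # Condition 7.9
            fits_util = 1.0 - util >= task.u         # Condition 7.10
            if fits_demand and fits_util:
                assigned.append(task)
                break
        else:
            return None  # unable to conclude feasibility on m processors
    return processors


if __name__ == "__main__":
    tau = [Task(1, 2, 4), Task(2, 4, 8), Task(4, 8, 8)]  # illustrative values only
    print(edf_partition(tau, 1))
    print(edf_partition(tau, 2))
```

Recomputing the sums for every candidate processor keeps the sketch short; caching per-processor utilization would avoid some repeated work without changing the result.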

The following lemma asserts that, in assigning a task τi to a processor πk, our

partitioning algorithm does not adversely affect the feasibility of the tasks assigned

earlier to each processor.


edf-partition(τ, m)

▷ The collection of sporadic tasks τ = {τ1, . . . , τn} is to be partitioned on m identical, unit-capacity processors denoted π1, π2, . . . , πm. (Tasks are indexed according to non-decreasing value of relative deadline parameter: di ≤ di+1 for all i.) τ(πk) denotes the tasks assigned to processor πk; initially, τ(πk) ← ∅ for all k.

1 for i ← 1 to n   ▷ i ranges over the tasks, which are indexed by non-decreasing value of the deadline parameter
2     for k ← 1 to m   ▷ k ranges over the processors, considered in any order
3         if τi satisfies Conditions 7.9 and 7.10 on processor πk then   ▷ assign τi to πk; proceed to next task
4             τ(πk) ← τ(πk) ∪ {τi}
5             break;
6     end (of inner for loop)
7     if (k > m) return partitioning failed
8 end (of outer for loop)
9 return partitioning succeeded

Figure 7.4: Pseudo-code for partitioning algorithm.

Lemma 7.2 If the tasks previously assigned to each processor were edf-feasible on

that processor and our algorithm assigns task τi to processor πk, then the tasks assigned

to each processor (including processor πk) remain edf-feasible on that processor.

Proof: Observe that the edf-feasibility of the processors other than processor πk is

not affected by the assignment of task τi to processor πk. It remains to demonstrate

that, if the tasks assigned to πk were edf-feasible on πk prior to the assignment of τi

and Conditions 7.9 and 7.10 are satisfied, then the tasks on πk remain edf-feasible

after adding τi.

The scheduling of processor πk after the assignment of task τi to it is a uniprocessor

scheduling problem. It is known (see, e.g. (Baruah et al., 1990b)) that a uniprocessor

system of preemptive sporadic tasks is feasible if and only if all deadlines can be met

for the synchronous arrival sequence (i.e., when each task has a job arrive at the same


time-instant, and subsequent jobs arrive as rapidly as legal). Also, recall that edf

is an optimal preemptive uniprocessor scheduling algorithm. Hence to demonstrate

that πk remains edf-feasible after adding task τi to it, it suffices to demonstrate that

all deadlines can be met for the synchronous arrival sequence. Our proof of this fact

is by contradiction. That is, we suppose that a deadline is missed at some time-

instant tf , when the synchronous arrival sequence is scheduled by edf, and derive

a contradiction which leads us to conclude that this supposition is incorrect, i.e., no

deadline is missed.

Observe that tf must be ≥ di, since it is assumed that the tasks assigned to πk are

edf-feasible prior to the addition of τi, and τi’s first deadline in the critical arrival

sequence is at time-instant di.

By the processor demand criterion for preemptive uniprocessor feasibility (see,

e.g., (Baruah et al., 1990b)), it must be the case that

\[
\mathrm{dbf}(\tau_i, t_f) + \sum_{\tau_j \in \tau(\pi_k)} \mathrm{dbf}(\tau_j, t_f) > t_f,
\]

from which it follows, since dbf∗ is always an upper bound on dbf, that

\[
\mathrm{dbf}^{*}(\tau_i, t_f) + \sum_{\tau_j \in \tau(\pi_k)} \mathrm{dbf}^{*}(\tau_j, t_f) > t_f. \tag{7.11}
\]

Since tasks are considered in order of non-decreasing relative deadline, it must be the

case that all tasks τj ∈ τ(πk) have dj ≤ di. We therefore have, for each τj ∈ τ(πk),

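\[
\begin{aligned}
\mathrm{dbf}^{*}(\tau_j, t_f) &= e_j + u_j (t_f - d_j) \qquad \text{(By definition)} \\
&= e_j + u_j (d_i - d_j) + u_j (t_f - d_i) \\
&= \mathrm{dbf}^{*}(\tau_j, d_i) + u_j (t_f - d_i).
\end{aligned}
\tag{7.12}
\]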


Furthermore,

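\[
\begin{aligned}
& \mathrm{dbf}^{*}(\tau_i, t_f) + \sum_{\tau_j \in \tau(\pi_k)} \mathrm{dbf}^{*}(\tau_j, t_f) \\
\equiv\ & \bigl( e_i + u_i (t_f - d_i) \bigr) + \sum_{\tau_j \in \tau(\pi_k)} \bigl( \mathrm{dbf}^{*}(\tau_j, d_i) + u_j (t_f - d_i) \bigr) \qquad \text{(By Equation 7.12 above)} \\
\equiv\ & \Bigl( e_i + \sum_{\tau_j \in \tau(\pi_k)} \mathrm{dbf}^{*}(\tau_j, d_i) \Bigr) + (t_f - d_i) \Bigl( u_i + \sum_{\tau_j \in \tau(\pi_k)} u_j \Bigr).
\end{aligned}
\]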

Consequently, Inequality 7.11 above can be rewritten as follows:

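\[
\Bigl( e_i + \sum_{\tau_j \in \tau(\pi_k)} \mathrm{dbf}^{*}(\tau_j, d_i) \Bigr) + (t_f - d_i) \Bigl( u_i + \sum_{\tau_j \in \tau(\pi_k)} u_j \Bigr) > (t_f - d_i) + d_i. \tag{7.13}
\]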

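However, by Condition 7.9, (ei + ∑_{τj∈τ(πk)} dbf∗(τj, di)) ≤ di; Inequality 7.13 therefore

implies

\[
(t_f - d_i) \Bigl( u_i + \sum_{\tau_j \in \tau(\pi_k)} u_j \Bigr) > (t_f - d_i),
\]

which cannot hold for tf = di and, for tf > di, in turn implies that

\[
\Bigl( u_i + \sum_{\tau_j \in \tau(\pi_k)} u_j \Bigr) > 1,
\]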

which contradicts Condition 7.10.

In the special case where the given sporadic task system is known to be constrained

— i.e., each task’s relative deadline parameter is no larger than its period parameter

— Lemma 7.3 below asserts that it actually suffices to test only Condition 7.9, rather


than having to test both Condition 7.9 and Condition 7.10.

Lemma 7.3 For constrained sporadic task systems, any τi satisfying Condition 7.9

during the execution of Line 3 in the algorithm of Figure 7.4 satisfies Condition 7.10

as well.

Proof: To see this, observe (from Equation 7.1) that for any constrained task τk,

for all t ≥ dk it is the case that

dbf∗(τk, t) = uk × (t + pk − dk) ≥ uk × t.

Hence,

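\[
\begin{aligned}
& \text{Condition 7.9} \\
\equiv\ & d_i - \sum_{\tau_j \in \tau(\pi_k)} \mathrm{dbf}^{*}(\tau_j, d_i) \ge e_i \\
\Rightarrow\ & d_i - \sum_{\tau_j \in \tau(\pi_k)} (u_j \times d_i) \ge e_i \\
\Rightarrow\ & 1 - \sum_{\tau_j \in \tau(\pi_k)} u_j \ge \frac{e_i}{d_i} \\
\Rightarrow\ & 1 - \sum_{\tau_j \in \tau(\pi_k)} u_j \ge u_i \\
\equiv\ & \text{Condition 7.10.}
\end{aligned}
\]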

Hence, if the task system τ being partitioned is known to be a constrained task

system, we need only check Condition 7.9 (rather than both Condition 7.9 and Con-

dition 7.10) on line 3 in Figure 7.4. The correctness of the partitioning algorithm

follows, by repeated applications of Lemma 7.2.

Theorem 7.3 If our partitioning algorithm returns partitioning succeeded on

task system τ , then the resulting partitioning is edf-feasible.


Proof Sketch: Observe that the algorithm returns partitioning succeeded if

and only if it has successfully assigned each task in τ to some processor.

Prior to the assignment of task τ1, each processor is trivially edf-feasible. It

follows from Lemma 7.2 that all processors remain edf-feasible after each task

assignment as well. Hence, all processors are edf-feasible once all tasks in

τ have been assigned.

Time Complexity

In attempting to map task τi, observe that our partitioning algorithm essentially eval-

uates, in Equations 7.9 and 7.10, the workload generated by the previously-mapped

(i − 1) tasks on each of the m processors. Since dbf∗(τj, t) can be evaluated in con-

stant time (see Equation 7.1), a straightforward computation of this workload would

require O(i + m) time. Hence, the run-time of the algorithm in mapping all n tasks is

no more than ∑_{i=1}^{n} O(i + m), which is O(n^2) under the reasonable assumption that

m ≤ n.
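For completeness, the summation behind this bound can be written out explicitly; the only fact used is the closed form of the arithmetic series.

```latex
\[
\sum_{i=1}^{n} O(i + m)
  = O\!\left( \frac{n(n+1)}{2} + nm \right)
  = O(n^2 + nm)
  = O(n^2) \quad \text{when } m \le n .
\]
```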

7.1.4 Evaluation

Our dbf∗-based partitioning algorithm represents a sufficient, rather than exact, test

for feasibility — it is possible that there are systems that are feasible under the parti-

tioned paradigm but which will be incorrectly flagged as “infeasible” by our partition-

ing algorithm. Indeed, this is to be expected since a simpler problem – partitioning

collections of sporadic tasks that all have their deadline parameters equal to their

period parameters – is known to be NP-hard in the strong sense while our algorithm

runs in O(n^2) time. In this section, we offer a quantitative evaluation of the efficacy

of our algorithm. Specifically, we derive some properties (Theorem 7.5 and Corol-

lary 7.2) of our partitioning algorithm, which characterize its performance. We would


like to stress that these properties are not intended to be used as feasibility tests to

determine whether our algorithm would successfully schedule a given sporadic task

system – since our algorithm itself runs efficiently in polynomial time, the “best” (i.e.,

most accurate) polynomial-time test for determining whether a particular system is

successfully scheduled by our algorithm is to actually run the algorithm and check

whether it performs a successful partition or not. Rather, these properties are in-

tended to provide a quantitative measure of how effective our partitioning algorithm

is vis-à-vis the performance of an optimal scheduler.

The parameters that are useful to characterize the behavior of our algorithm are:

load(τ), max-job-density(τ), system-util(τ), and max-util(τ) (these are defined in Chap-

ters 2 and 3). Intuitively, the larger of max-job-density(τ) and max-util(τ) represents

the maximum computational demand of any individual task, and the larger of load(τ)

and system-util(τ) represents the maximum cumulative computational demand of all

the tasks in the system.

Lemma 7.4 follows immediately.

Lemma 7.4 If task system τ is feasible (under either the partitioned or the global

paradigm) on an identical multiprocessor platform comprised of mo processors of com-

puting capacity ξ each, it must be the case that

ξ ≥ max(max-job-density(τ), max-util(τ)),

and

mo · ξ ≥ max(load(τ), system-util(τ)).

Proof: Observe that

1. Each job of each task of τ can receive at most ξ · di units of execution by its

deadline; hence, we must have ei ≤ ξ · di.


2. No individual task’s utilization may exceed the computing capacity of a proces-

sor; i.e., it must be the case that ui ≤ ξ.

Taken over all tasks in τ , these observations together yield the first condition.

In the second condition, the requirement that moξ ≥ system-util(τ) simply reflects

the requirement that the cumulative utilization of all the tasks in τ not exceed the

computing capacity of the platform. The requirement that moξ ≥ load(τ) is obtained

by considering a sequence of job arrivals for τ that defines load(τ); i.e., a sequence

of job arrivals over an interval [0, to) such that (∑_{j=1}^{n} dbf(τj, to))/to = load(τ). The total

amount of execution that all these jobs may receive over [0, to) is at most mo · ξ · to;

hence, load(τ) ≤ mo · ξ.
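The necessary conditions of Lemma 7.4 can be turned into a quick infeasibility filter. The sketch below makes several assumptions that are not taken from the text: tasks are plain `(e, d, p)` tuples, max-job-density is taken to be the largest ei/di (as the proof above suggests), and load(τ) is only estimated from below by scanning job deadlines up to a caller-chosen `horizon`. Because the estimate is a lower bound on load(τ), a `False` result still soundly signals infeasibility, while `True` is inconclusive.

```python
from math import floor


def dbf(e, d, p, t):
    """Exact demand bound of a sporadic task (e, d, p) over an interval of length t."""
    return 0.0 if t < d else (floor((t - d) / p) + 1) * e


def load_lower_bound(tasks, horizon):
    """Largest sum(dbf)/t over the deadlines d + k*p that do not exceed horizon."""
    points = set()
    for e, d, p in tasks:
        t = d
        while t <= horizon:
            points.add(t)
            t += p
    return max(
        (sum(dbf(e, d, p, t) for e, d, p in tasks) / t for t in points),
        default=0.0,
    )


def passes_necessary_conditions(tasks, m, xi, horizon):
    """Check both inequalities of Lemma 7.4 for m processors of capacity xi."""
    max_job_density = max(e / d for e, d, p in tasks)
    max_util = max(e / p for e, d, p in tasks)
    system_util = sum(e / p for e, d, p in tasks)
    load = load_lower_bound(tasks, horizon)
    return (xi >= max(max_job_density, max_util)
            and m * xi >= max(load, system_util))


if __name__ == "__main__":
    tau = [(1, 2, 4), (2, 4, 8), (4, 8, 8)]  # illustrative values only
    print(load_lower_bound(tau, 16))
    print(passes_necessary_conditions(tau, m=1, xi=1.0, horizon=16))
```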

Lemma 7.4 above specifies necessary conditions for our partitioning algorithm to

successfully partition a sporadic task system; Theorem 7.5 below specifies a sufficient

condition. But first, a technical lemma that will be used in the proof of Theorem 7.5.

Lemma 7.5 Suppose that our partitioning algorithm is attempting to schedule task

system τ on a platform comprised of unit-capacity processors.

C1: If system-util(τ) ≤ 1, then Condition 7.10 is always satisfied.

C2: If load(τ) ≤ 1/2, then Condition 7.9 is always satisfied.

Proof: The proof of C1 is straightforward, since violating Condition 7.10 requires

that (ui + ∑_{τj∈τ(πk)} uj) exceed 1.

To see why C2 holds as well, observe that load(τ) ≤ 1/2 implies that ∑_{τj∈τ} dbf(τj, to) ≤ to/2

for all to ≥ 0. By Inequality 7.2, this in turn implies that ∑_{τj∈τ} dbf∗(τj, to) ≤ to

for all to ≥ 0; specifically, at to = di when evaluating Condition 7.9. But, violating

Condition 7.9 requires that (dbf∗(τi, di) + ∑_{τj∈τ(πk)} dbf∗(τj, di)) exceed di.

Corollary 7.1


1. Any sporadic task system τ satisfying (system-util(τ) ≤ 1 ∧ load(τ) ≤ 1/2) is

successfully partitioned on any number of processors ≥ 1.

2. Any constrained sporadic task system τ satisfying (load(τ) ≤ 1/2) is successfully

partitioned on any number of processors ≥ 1.

Proof Sketch: Part 1 follows directly from Lemma 7.5, since Line 3 of the parti-

tioning algorithm in Figure 7.4 will always evaluate to “true.”

Part 2 follows from Lemmas 7.5 and 7.3. By Lemma 7.3, we need only determine

that Condition 7.9 is satisfied, in order to ensure that Line 3 of the partitioning

algorithm in Figure 7.4 evaluate to “true.” By Condition C2 of Lemma 7.5, this is

ensured by having load(τ) ≤ 1/2.

Thus, any sporadic task system satisfying both system-util(τ) ≤ 1 and load(τ) ≤ 1/2

is successfully scheduled by our algorithm. We now describe, in Theorem 7.5, what

happens when one or both these conditions are not satisfied.

Lemma 7.6 Let m1 denote the number of processors, 0 ≤ m1 ≤ m, on which Con-

dition 7.9 fails when the partitioning algorithm is attempting to map task τi. It must

be the case that

\[
m_1 < \frac{2\,\text{load}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}}. \tag{7.14}
\]

Proof: Since τi fails the test of Condition 7.9 on each of the m1 processors, it must

be the case that each such processor πℓ satisfies

\[
\sum_{\tau_j \in \tau(\pi_\ell)} \mathrm{dbf}^{*}(\tau_j, d_i) > (d_i - e_i).
\]

Summing over all m1 such processors and noting that the tasks on these processors


is a subset of the tasks in τ , we obtain

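\[
\begin{aligned}
& \sum_{j=1}^{n} \mathrm{dbf}^{*}(\tau_j, d_i) > m_1 (d_i - e_i) + e_i \\
\Rightarrow\ & \text{(By Inequality 7.2)} \quad 2 \sum_{j=1}^{n} \mathrm{dbf}(\tau_j, d_i) > m_1 (d_i - e_i) + e_i \\
\Rightarrow\ & \frac{\sum_{j=1}^{n} \mathrm{dbf}(\tau_j, d_i)}{d_i} > \frac{m_1}{2} \Bigl( 1 - \frac{e_i}{d_i} \Bigr) + \frac{e_i}{2 d_i}.
\end{aligned}
\tag{7.15}
\]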

By definition of load(τ) for sporadic task systems,

\[
\frac{\sum_{j=1}^{n} \mathrm{dbf}(\tau_j, d_i)}{d_i} \le \text{load}(\tau). \tag{7.16}
\]

Chaining Inequalities 7.15 and 7.16 above, we obtain

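\[
\begin{aligned}
& \frac{m_1}{2} \Bigl( 1 - \frac{e_i}{d_i} \Bigr) + \frac{e_i}{2 d_i} < \text{load}(\tau) \\
\Rightarrow\ & m_1 < \frac{2\,\text{load}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}},
\end{aligned}
\]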

which is as claimed by the lemma.

Lemma 7.7 Let m2 denote the number of processors, 0 ≤ m2 ≤ m − m1, on which

Condition 7.10 fails (but Condition 7.9 is satisfied) when the partitioning algorithm

is attempting to map task τi. It must be the case that

\[
m_2 < \frac{\text{system-util}(\tau) - u_i}{1 - u_i}. \tag{7.17}
\]

Proof: Since none of the m2 processors satisfies Condition 7.10 for task τi, it must

be the case that there is not enough remaining utilization on each such processor to

accommodate the utilization of task τi. Therefore, strictly more than (1 − ui) of the


capacity of each such processor has already been consumed; summing over all m2

processors and noting that the tasks already assigned to these processors is a subset

of the tasks in τ , we obtain the following upper bound on the value of m2:

\[
\begin{aligned}
& (1 - u_i)\, m_2 + u_i < \sum_{j=1}^{n} u_j \\
\Rightarrow\ & m_2 < \frac{\text{system-util}(\tau) - u_i}{1 - u_i},
\end{aligned}
\]

which is as asserted by the lemma.

Theorem 7.4 Any constrained sporadic task system τ is successfully scheduled by

our algorithm on m unit-capacity processors, for any m satisfying

\[
m \ge \frac{2\,\text{load}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)}. \tag{7.18}
\]

Proof: Our proof is by contradiction – we will assume that our algorithm fails

to partition task system τ on m processors, and prove that, in order for this to be

possible, m must violate Inequality 7.18 above. Accordingly, let us suppose that our

partitioning algorithm fails to obtain a partition for τ on m unit-capacity processors.

In particular, let us suppose that task τi cannot be mapped on to any processor. By

Lemma 7.3, it must be the case that Condition 7.9 fails for task τi on each of the m

processors; i.e., m1 in the statement of Lemma 7.6 is equal to the total number of

processors m. Consequently, Inequality 7.14 of Lemma 7.6 yields

\[
m < \frac{2\,\text{load}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}}.
\]

By Corollary 7.1, it is necessary that load(τ) > 1/2 hold. Since max-job-density(τ) ≤ 1

(if not, the system is trivially non-feasible), the right-hand side of the above inequality

is maximized when ei/di is as large as possible; this implies that

\[
m < \frac{2\,\text{load}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)},
\]

which contradicts Inequality 7.18 above.
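Inequality 7.18 translates directly into a processor count. The helper below is a hypothetical illustration: it assumes load(τ) and max-job-density(τ) have already been computed, and it requires max-job-density(τ) < 1 so that the denominator is positive.

```python
from math import ceil


def processors_sufficient_constrained(load, max_job_density):
    """Smallest integer m >= 1 that satisfies Inequality 7.18."""
    if not 0 <= max_job_density < 1:
        raise ValueError("Inequality 7.18 needs 0 <= max-job-density < 1")
    bound = (2 * load - max_job_density) / (1 - max_job_density)
    return max(1, ceil(bound))


if __name__ == "__main__":
    # Illustrative values only.
    print(processors_sufficient_constrained(load=1.0, max_job_density=0.5))
    print(processors_sufficient_constrained(load=0.5, max_job_density=0.25))
```

The returned count is sufficient for constrained systems by Theorem 7.4; the algorithm may well succeed on fewer processors, so running the partitioning itself remains the more accurate test.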

Theorem 7.5 Any sporadic task system τ is successfully scheduled by our algorithm

on m unit-capacity processors, for any m satisfying

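\[
m \ge \left( \frac{2\,\text{load}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)} + \frac{\text{system-util}(\tau) - \text{max-util}(\tau)}{1 - \text{max-util}(\tau)} \right). \tag{7.19}
\]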

Proof: Once again, our proof is by contradiction – we will assume that our algorithm

fails to partition task system τ on m processors, and prove that, in order for this to

be possible, m must violate Inequality 7.19 above.

Let us suppose that our partitioning algorithm fails to obtain a partition for τ on m

unit-capacity processors. In particular, let us suppose that task τi cannot be mapped

on to any processor. There are two cases we consider: the case where Condition 7.9

fails and the case where Condition 7.9 is satisfied (implying that Condition 7.10 must

have failed). Let Π1 denote the m1 processors upon which this mapping fails because Condition 7.9 is not satisfied (hence for the remaining $m_2 \stackrel{\text{def}}{=} (m - m_1)$ processors, denoted Π2, Condition 7.9 is satisfied but Condition 7.10 is not).

By Lemma 7.5 above, m1 will equal 0 if load(τ) ≤ 1/2, while m2 will equal 0 if system-util(τ) ≤ 1. Since we are assuming that the partitioning fails, it is not possible that both load(τ) ≤ 1/2 and system-util(τ) ≤ 1 hold.

Let us extend previous notation as follows: for any collection of processors Πx, let

τ(Πx) denote the tasks from among τ1, . . . , τi−1 that have already been allocated to

some processor in the collection Πx. Lemmas 7.6 and 7.7 provide, for task systems

on which our partitioning algorithm fails, upper bounds on the values of m1 (i.e., the

number of processors in Π1) and m2 (the number of processors in Π2) in terms of the

parameters of the task system τ .

We now consider three separate cases.

§ Case (i): (load(τ) > 1/2 and system-util(τ) ≤ 1). As stated in Lemma 7.5 (C1), Condition 7.10 is never violated in this case, and m2 is consequently equal to zero. From this and Lemma 7.6 (Inequality 7.14), we therefore have

\[
m < \frac{2\,\mathrm{load}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}}.
\]

In order for our algorithm to successfully schedule τ on m processors, it is sufficient that the negation of the above hold:

\[
m \ge \frac{2\,\mathrm{load}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}}.
\]

Since the right-hand side of the above inequality is maximized when $e_i/d_i$ is as large as possible, it is sufficient that

\[
m \ge \frac{2\,\mathrm{load}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)},
\]

which certainly holds for any m satisfying the statement of the theorem (Inequality 7.19).

§ Case (ii): (load(τ) ≤ 1/2 and system-util(τ) > 1). As stated in Lemma 7.5 (C2), Condition 7.9 is never violated in this case, and m1 is consequently equal to zero. From this and Lemma 7.7 (Inequality 7.17), we therefore have

\[
m < \frac{\text{system-util}(\tau) - u_i}{1 - u_i}.
\]

We once again observe that it is sufficient that the negation of the above hold in order for our algorithm to successfully schedule τ on m processors:

\[
m \ge \frac{\text{system-util}(\tau) - u_i}{1 - u_i}.
\]

Since the right-hand side of the above inequality is maximized when ui is as large as possible, it is sufficient that

\[
m \ge \frac{\text{system-util}(\tau) - \text{max-util}(\tau)}{1 - \text{max-util}(\tau)},
\]

which once again holds for any m satisfying the statement of the theorem (Inequality 7.19).

§ Case (iii): (system-util(τ) > 1 and load(τ) > 1/2). In this case, both m1 and m2 may be non-zero. From m1 + m2 = m and Inequality 7.17, we may conclude that

\[
m_1 > m - \frac{\text{system-util}(\tau) - u_i}{1 - u_i}. \tag{7.20}
\]

For Inequalities 7.20 and 7.14 to both be satisfied, we must have

\[
\frac{2\,\mathrm{load}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}} > m - \frac{\text{system-util}(\tau) - u_i}{1 - u_i}
\;\Rightarrow\;
m < \frac{2\,\mathrm{load}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}} + \frac{\text{system-util}(\tau) - u_i}{1 - u_i}. \tag{7.21}
\]

Hence for our algorithm to successfully schedule τ, it is sufficient that the negation of Inequality 7.21 hold:

\[
m \ge \left( \frac{2\,\mathrm{load}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}} + \frac{\text{system-util}(\tau) - u_i}{1 - u_i} \right). \tag{7.22}
\]

Observe that the right-hand side of Inequality 7.22 is maximized when ui and $e_i/d_i$ are both as large as possible; by Equations 2.4 and 3.2, these are defined to be max-util(τ) and max-job-density(τ) respectively. We hence get Inequality 7.19:

\[
m \ge \left( \frac{2\,\mathrm{load}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)} + \frac{\text{system-util}(\tau) - \text{max-util}(\tau)}{1 - \text{max-util}(\tau)} \right)
\]

as a sufficient condition for τ to be successfully scheduled by our algorithm.
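The sufficient condition of Inequality 7.19 can be evaluated mechanically once the four system parameters are known. The following is a minimal sketch, not part of the original analysis: the function name `edf_partition_processor_bound` is hypothetical, and the parameters load(τ), system-util(τ), max-job-density(τ), and max-util(τ) are assumed to have been computed beforehand. Exact rational arithmetic is used to avoid rounding at the boundary.

```python
from fractions import Fraction
from math import ceil


def edf_partition_processor_bound(load, system_util, max_job_density, max_util):
    """Smallest integer m satisfying Inequality 7.19 (hypothetical helper).

    All four arguments are assumed to be precomputed task-system parameters.
    """
    load = Fraction(load)
    system_util = Fraction(system_util)
    delta = Fraction(max_job_density)
    u_max = Fraction(max_util)
    if delta >= 1 or u_max >= 1:
        # The denominators of Inequality 7.19 must be positive.
        raise ValueError("requires max-job-density < 1 and max-util < 1")
    first = (2 * load - delta) / (1 - delta)      # term from Condition 7.9
    second = (system_util - u_max) / (1 - u_max)  # term from Condition 7.10
    return max(1, ceil(first + second))
```

For instance, with load(τ) = 1, system-util(τ) = 3/2, and max-job-density(τ) = max-util(τ) = 1/2, the two terms evaluate to 3 and 2, so five processors suffice under this bound.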

Recall that in the technique of resource augmentation, the performance of the

algorithm being discussed is compared with that of a hypothetical optimal one, under

the assumption that the algorithm under discussion has access to more resources

than the optimal algorithm. Using Theorem 7.5 above, we now present such a result

concerning our partitioning algorithm.

Theorem 7.6 Our algorithm makes the following performance guarantees:

1. if a constrained sporadic task system is feasible on $m_o$ identical processors each of a particular computing capacity, then our algorithm will successfully partition this system upon a platform comprised of m processors that are each $\left(\frac{2m_o}{m} + 1 - \frac{1}{m}\right)$ times as fast as the original.

2. if an arbitrary sporadic task system is feasible on $m_o$ identical processors each of a particular computing capacity, then our algorithm will successfully partition this system upon a platform comprised of m processors that are each $\left(\frac{3m_o}{m} + 1 - \frac{2}{m}\right)$ times as fast as the original.

Proof: Let us assume that τ = {τ1, τ2, . . . , τn} is feasible on mo processors each of

computing capacity equal to ξ. Below, we consider separately the cases when τ is a

constrained sporadic task system and an arbitrary sporadic task system:

§ 1. τ is a constrained sporadic task system. Since τ is feasible on mo ξ-

speed processors, it follows from Lemma 7.4 that the tasks in τ satisfy the following

properties:

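\[
\text{max-job-density}(\tau) \le \xi, \quad \text{and} \quad \mathrm{load}(\tau) \le m_o \cdot \xi.
\]

Suppose that τ is successfully scheduled by our algorithm on m unit-capacity processors. By substituting the inequalities above in Inequality 7.18 of Theorem 7.4, we get

\begin{align*}
& m \ge \frac{2\,\mathrm{load}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)} \\
\Leftarrow\; & m \ge \frac{2 m_o \xi - \xi}{1 - \xi} \\
\equiv\; & \xi \le \frac{m}{2 m_o + m - 1} \\
\equiv\; & \frac{1}{\xi} \ge \frac{2 m_o}{m} + 1 - \frac{1}{m},
\end{align*}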

which is as claimed in the statement of the theorem.

§ 2. τ is an arbitrary sporadic task system. Since τ is feasible on mo ξ-

speed processors, it follows from Lemma 7.4 that the tasks in τ satisfy the following

properties:

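\[
\text{max-job-density}(\tau) \le \xi, \quad \text{max-util}(\tau) \le \xi, \quad \mathrm{load}(\tau) \le m_o \cdot \xi, \quad \text{and} \quad \text{system-util}(\tau) \le m_o \cdot \xi.
\]

Suppose once again that τ is successfully scheduled by our algorithm on m unit-capacity processors. By substituting the inequalities above in Inequality 7.19 of Theorem 7.5, we get

\begin{align*}
& m \ge \frac{2\,\mathrm{load}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)} + \frac{\text{system-util}(\tau) - \text{max-util}(\tau)}{1 - \text{max-util}(\tau)} \\
\Leftarrow\; & m \ge \frac{2 m_o \xi - \xi}{1 - \xi} + \frac{m_o \xi - \xi}{1 - \xi} \\
\equiv\; & \xi \le \frac{m}{3 m_o + m - 2} \\
\equiv\; & \frac{1}{\xi} \ge \frac{3 m_o}{m} + 1 - \frac{2}{m},
\end{align*}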

which is as claimed in the statement of the theorem.

By setting mo ← m in the statement of Theorem 7.6 above, we immediately have

the following corollary.

Corollary 7.2 Our algorithm makes the following performance guarantees:

1. if a constrained sporadic task system is feasible on m identical processors each of a particular computing capacity, then our algorithm will successfully partition this system upon a platform comprised of m processors that are each $\left(3 - \frac{1}{m}\right)$ times as fast as the original.

2. if an arbitrary sporadic task system is feasible on m identical processors each of a particular computing capacity, then our algorithm will successfully partition this system upon a platform comprised of m processors that are each $\left(4 - \frac{2}{m}\right)$ times as fast as the original.
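As a small illustration of how the speed-up factors of Theorem 7.6 specialize to those of Corollary 7.2, the sketch below evaluates both expressions with exact fractions. The function names are hypothetical and are introduced only for this example.

```python
from fractions import Fraction


def speedup_constrained(m_o, m):
    """Speed-up factor 2*m_o/m + 1 - 1/m for constrained systems (Theorem 7.6, item 1)."""
    return Fraction(2 * m_o, m) + 1 - Fraction(1, m)


def speedup_arbitrary(m_o, m):
    """Speed-up factor 3*m_o/m + 1 - 2/m for arbitrary systems (Theorem 7.6, item 2)."""
    return Fraction(3 * m_o, m) + 1 - Fraction(2, m)
```

Setting `m_o == m` reproduces the corollary: for m = 4 the factors are 11/4 = 3 − 1/4 and 7/2 = 4 − 2/4. Choosing m larger than m_o trades additional processors for slower ones; for example, m_o = 2 and m = 4 gives a constrained-deadline factor of 7/4.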

7.1.5 A Pragmatic Improvement

We have made several approximations in deriving the results above. One of these

has been the use of the approximation dbf∗(τi, t) of Equation 7.1 in Condition 7.9,

to determine whether (the first job of) task τi can be accommodated on a processor

πk. We could reduce the amount of inaccuracy introduced here, by refining the

approximation: rather than approximating dbf(τi, t) by a single step followed by a

line of slope ui, we could explicitly have included the first ki steps, followed by a line

of slope ui (as proposed by Albers and Slomka (Albers and Slomka, 2004)). For the case ki = 2, such an approximation, denoted dbf′(τi, t) here, is as follows:

\[
\mathrm{dbf}'(\tau_i, t) =
\begin{cases}
0, & \text{if } t < d_i \\
e_i, & \text{if } d_i \le t < d_i + p_i \\
2e_i + u_i \times (t - (d_i + p_i)), & \text{otherwise.}
\end{cases}
\tag{7.23}
\]

If we were to indeed use an approximation comprised of ki steps, instead of the

single-step approximation dbf∗, in Condition 7.9 in determining whether a processor

can accommodate an additional task, we would need to explicitly re-check that the

first ki deadlines of all tasks previously assigned to the processor continue to be met.

This is because it is no longer guaranteed that the new deadlines (those of τi) will

occur after the deadlines of previously-assigned tasks, and hence it is possible that

adding τi to the processor will result in some previously-added task missing one of its

deadlines. However, the benefit of using better approximations is a greater likelihood

of determining a system feasible; we illustrate by an example.

Example 7.1 Suppose that task τj = (1, 1, 10) has already been assigned to proces-

sor πk when task τi = (1, 2, 20) is being considered. Evaluating Condition 7.9, we

have

\begin{align*}
& d_i - \mathrm{dbf}^*(\tau_j, 2) \ge e_i \\
\equiv\; & 2 - 0.1 \times (2 + 10 - 1) \ge 1 \\
\equiv\; & 2 - 1.1 \ge 1,
\end{align*}

which is false; hence, we determine that τi fails the test of Condition 7.9 and cannot

be assigned to processor πk.

However, suppose that we were to instead approximate the demand bound function

to two steps rather than one, by using the function dbf′ (Equation 7.23 above). We would need to consider two deadlines for both the new task τi as well as the previously-assigned task τj. The deadlines for τi are at time-instants 2 and 22, and for τj at time-instants 1 and 11. The demand-bound computations at all four deadlines are shown below.

• At t = 1: dbf′(τj, 1) + dbf′(τi, 1) = 1 + 0 = 1, which is ≤ 1.

• At t = 2: dbf′(τj, 2) + dbf′(τi, 2) = 1 + 1 = 2, which is ≤ 2.

• At t = 11: dbf′(τj, 11) + dbf′(τi, 11) = 2 + 1 = 3, which is ≤ 11.

• At t = 22: dbf′(τj, 22) + dbf′(τi, 22) = 3.1 + 2 = 5.1, which is ≤ 22.

Furthermore, τi also passes the test of Condition 7.10, since uj +ui = 0.1+0.05 which

is ≤ 1.
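The arithmetic of Example 7.1 can be reproduced with a short script. This is an illustrative sketch only: tasks are represented as (e, d, p) tuples, the helper names `dbf_star` and `dbf_two_step` are hypothetical, and the one-step approximation is assumed to have the form described in the text (a single step of height ei at di followed by a line of slope ui), since Equation 7.1 itself is not restated in this section.

```python
from fractions import Fraction


def dbf_star(task, t):
    """One-step approximation: 0 before d, then e + u * (t - d) (assumed form of Eq. 7.1)."""
    e, d, p = task
    if t < d:
        return Fraction(0)
    return e + Fraction(e, p) * (t - d)


def dbf_two_step(task, t):
    """Two-step approximation dbf' of Equation 7.23."""
    e, d, p = task
    if t < d:
        return Fraction(0)
    if t < d + p:
        return Fraction(e)
    return 2 * e + Fraction(e, p) * (t - (d + p))


tau_j = (1, 1, 10)  # previously assigned task (e, d, p)
tau_i = (1, 2, 20)  # task being considered

# Condition 7.9 with the one-step approximation: d_i - dbf*(tau_j, d_i) >= e_i
one_step_ok = tau_i[1] - dbf_star(tau_j, tau_i[1]) >= tau_i[0]

# Two-step check at the first two deadlines of each task.
deadlines = [1, 2, 11, 22]
demands = {t: dbf_two_step(tau_j, t) + dbf_two_step(tau_i, t) for t in deadlines}
two_step_ok = all(demands[t] <= t for t in deadlines)
```

Here `one_step_ok` is false (2 − 11/10 is less than 1) while `two_step_ok` is true, matching the conclusion of the example.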

As this example illustrates, the benefit of using a finer approximation is enhanced

feasibility: the feasibility test is less likely to incorrectly declare a feasible system to

be infeasible. To analyze the run-time cost of this enhancement, assume that for all

τi, τj ∈ τ , ki = kj. The cost of this improved performance is run-time complexity:

rather than just check Condition 7.9 at di for each processor during the assignment

of task τi, we must check a similar condition on a total of (i × ki) deadlines over

all m processors (observe that this remains polynomial-time, for constant ki). Hence

in practice, we recommend that the largest value of ki that results in an acceptable

run-time complexity for the algorithm be used.

From a theoretical perspective, we were unable to obtain a significantly better

bound than the ones in Theorem 7.5 and Corollary 7.2 by using a finer approximation

in this manner.

§ Discussion. We reiterate that the results in Corollary 7.2 are not intended to

be used as feasibility tests to determine whether our algorithm would successfully

schedule a given sporadic task system; rather, these properties provide a quantitative

measure of how effective our partitioning algorithm is.

Observe that there are two points in our partitioning algorithm during which errors

may be introduced. First, we are approximating a solution to a generalization of the

bin-packing problem. Second, we are approximating the demand bound function

dbf by the function dbf∗, thereby introducing an additional approximation factor

of two (Inequality 7.2). While the first of these sources of errors arises even in the

consideration of implicit-deadline systems, the second is unique to the generalization

in the task model. Indeed, it can be shown that

any implicit-deadline sporadic task system τ that is (global or partitioned) feasible on m identical processors can be partitioned in polynomial time, using our partitioning algorithm, upon m processors that are $\left(2 - \frac{1}{m}\right)$ times as fast as the original system, when edf is used to schedule each processor during run-time.

Thus, the generalization of the task model costs us a factor of 2 in terms of resource

augmentation for arbitrary deadlines, and a factor of less than 2, asymptotically

approaching 1.5 as m → ∞, for constrained deadlines.
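The two factors quoted above can be checked directly by dividing the speed-up bounds of Corollary 7.2 by the implicit-deadline bound of $2 - \frac{1}{m}$. The following short computation is a sketch of that comparison; note that the constrained-deadline ratio is strictly below 2 only for $m \ge 2$ (it equals 2 when $m = 1$).

```latex
\[
\frac{4 - \frac{2}{m}}{2 - \frac{1}{m}} = \frac{2(2m - 1)}{2m - 1} = 2,
\qquad
\frac{3 - \frac{1}{m}}{2 - \frac{1}{m}} = \frac{3m - 1}{2m - 1}
\;\xrightarrow{\; m \to \infty \;}\; \frac{3}{2}.
\]
```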

The above discussion on the partitioning of implicit-deadline systems is intended only to identify the sources of the error in the approximation factors for constrained-deadline and arbitrary systems (Corollary 7.2). In practice, a system designer would use the tighter analysis of (Lopez et al., 2000; Lopez et al., 2004) for the partitioning of implicit-deadline sporadic task systems.

[Figure 7.5 plot omitted: horizontal axis t with ticks at pi, 2pi, 3pi, 4pi, 5pi; vertical axis rbf(τi, t) with levels ei, 2ei, 3ei, 4ei, 5ei.]

Figure 7.5: A plot of rbf(τi, t) as a function of t. The double lines indicate the approximation rbf∗(τi, t).

7.2 dm-based Partitioning

This section focuses on partitioning sporadic task systems when dm is used to schedule each individual processor. In Section 7.2.1, we introduce a different workload characterization, called the request-bound function, that is useful in fixed-task-priority systems; in the same section, we define an approximation to the request-bound function and compare it to the demand-bound function. In Section 7.2.2, we present our polynomial-time partitioning algorithm for sporadic task systems and prove its correctness. Section 7.2.3 evaluates the efficacy of the partitioning algorithm in terms of sufficient conditions for success and resource-augmentation approximation bounds.

7.2.1 The Request-Bound Function

For any sporadic task τi and any real number t ≥ 0, the request-bound function

rbf(τi, t) is the largest cumulative execution requirement of all jobs that can be

generated by τi to have their arrival times within a contiguous interval of length t.

Every time a task τi releases a job, ei additional units of processor time are requested.

The following function provides an upper bound on the total execution time requested

by task τi at time t (i.e., the scenario where a task releases jobs as soon as legally

possible):

\[
\mathrm{rbf}(\tau_i, t) \stackrel{\text{def}}{=} \left\lceil \frac{t}{p_i} \right\rceil e_i. \tag{7.24}
\]

Fisher and Baruah (Fisher and Baruah, 2005b) proposed a method for approxi-

mating rbf; the following function can be obtained by applying the approximation

technique:

\[
\mathrm{rbf}^*(\tau_i, t) \stackrel{\text{def}}{=} e_i + u_i \times t. \tag{7.25}
\]

As stated earlier, it has been shown that the cumulative execution requirement of

jobs of τi over an interval is maximized if one job arrives at the start of the interval,

and subsequent jobs arrive as rapidly as permitted. Intuitively, approximation rbf∗

(Equation 7.25 above) models this job-arrival sequence by requiring that the first job’s

deadline be met explicitly by being assigned ei units of execution upon its arrival, and

that τi be assigned an additional ui × ∆ t of execution over time-interval [t, t + ∆ t),

for all instants t after the arrival of the first job, and for arbitrarily small positive ∆ t.

Figure 7.5 illustrates both rbf and rbf∗.
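The two functions of Equations 7.24 and 7.25 are simple enough to state directly in code. The sketch below is illustrative only; tasks are represented as (e, d, p) tuples with integer parameters, and the names `rbf` and `rbf_star` are chosen here to mirror the notation rather than taken from any implementation.

```python
from fractions import Fraction
from math import ceil


def rbf(task, t):
    """Request-bound function of Equation 7.24: ceil(t / p) * e."""
    e, d, p = task
    return ceil(Fraction(t) / p) * e


def rbf_star(task, t):
    """Linear approximation of Equation 7.25: e + u * t, with u = e / p."""
    e, d, p = task
    return e + Fraction(e, p) * t
```

For a task with e = 1 and p = 10, `rbf` steps from 1 to 2 just after t = 10, while `rbf_star` has already reached 2 at t = 10. Because ⌈x⌉ < x + 1 for every x, the approximation never falls below the exact function.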

§ Relationship Between rbf∗ and dbf. The following observation will be im-

portant in quantitatively evaluating the partitioning algorithm presented in the next

section. The next lemma essentially provides an upper bound on rbf∗(τi, t) in terms

of τi’s utilization and demand-bound function.

Lemma 7.8 Given a sporadic task τi, the following inequality holds for all t ≥ di:

\[
\mathrm{rbf}^*(\tau_i, t) \le \mathrm{dbf}(\tau_i, t) + (u_i \times t). \tag{7.26}
\]

Proof: Observe from the definition of dbf that for t ≥ di, dbf(τi, t) ≥ ei. Substituting this inequality into Equation 7.25, we obtain Inequality 7.26.

7.2.2 A Polynomial-Time Partitioning Algorithm

Bin-packing heuristics for fixed-task-priority scheduling have been extensively stud-

ied (Andersson and Jonsson, 2003; Oh and Baker, 1998). In this section, we are only

considering fixed-task-priority scheduling algorithms. Deadline-monotonic scheduling

(dm) is known to be optimal for the fixed-task-priority scheduling of constrained sporadic task systems on uniprocessors (Leung and Whitehead, 1982). dm assigns to each task τi a priority equivalent to $1/d_i$ and schedules, at any time, the active task

with the highest priority. In general, dm performs relatively well in simulations for

arbitrary task systems (Baker, 2005b). Therefore, dm is an appropriate algorithm to

use to schedule the tasks on each uniprocessor. For the remainder of this section, we

will consider a task to fit on a processor if it can be scheduled according to dm with

respect to all tasks previously assigned to the processor.

Section 7.2.2.1 presents a partitioning algorithm for sporadic task systems where dm is used on each processor. Section 7.2.2.2 shows that this algorithm is correct. Section 7.2.2.3 shows that the partitioning algorithm runs in time polynomial in the number of tasks in the task system.

7.2.2.1 Algorithm dm-partition

We now describe a simple partitioning algorithm called dm-partition. Given a

sporadic task system τ comprised of n sporadic tasks τ1, τ2, . . . , τn, and a processing

platform Π comprised of m unit-capacity processors π1, π2, . . . , πm, dm-partition will attempt to partition τ among the processors of Π. The dm-partition algorithm is a variant of a bin-packing heuristic known as first-fit-decreasing. For this section, we will assume the tasks of τ are indexed in non-decreasing order of their relative deadline (i.e., di ≤ di+1, for 1 ≤ i < n).

The dm-partition algorithm considers the tasks in decreasing dm-priority order

(i.e., τ1, τ2, . . .). We will now describe how to assign task τi assuming that tasks

τ1, τ2, . . . , τi−1 have already successfully been allocated among the m processors. Let

τ(πℓ) denote the set of tasks already assigned to processor πℓ where 1 ≤ ℓ ≤ m.

Considering the processors π1, π2, . . . , πm in any order, we will assign task τi to the

first processor πk, 1 ≤ k ≤ m, that satisfies the following two conditions:

\[
d_i - \sum_{\tau_j \in \tau(\pi_k)} \mathrm{rbf}^*(\tau_j, d_i) \ge e_i \tag{7.27}
\]

and

\[
1 - \sum_{\tau_j \in \tau(\pi_k)} u_j \ge u_i. \tag{7.28}
\]

If no such πk exists, then Algorithm dm-partition returns partitioning failed:

it is unable to conclude that sporadic task system τ is feasible upon the m-processor

platform. Otherwise, dm-partition returns partitioning succeeded.
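The description above translates into a short first-fit loop. The following is a minimal sketch under stated assumptions rather than the author's implementation: tasks are (e, d, p) tuples with integer parameters, processors are visited in index order, and the function returns the per-processor assignment (or `None` on failure) instead of the strings partitioning succeeded / partitioning failed. The helper `rbf_star` implements the approximation ei + ui × t.

```python
from fractions import Fraction


def rbf_star(task, t):
    """Approximate request-bound function: e + (e / p) * t."""
    e, d, p = task
    return e + Fraction(e, p) * t


def dm_partition(tasks, m):
    """First-fit partitioning in non-decreasing deadline order (illustrative sketch).

    Returns a list of m task lists, or None if some task fits on no processor.
    """
    processors = [[] for _ in range(m)]
    # Consider tasks in decreasing dm-priority order (smallest deadline first).
    for task in sorted(tasks, key=lambda task: task[1]):
        e_i, d_i, p_i = task
        u_i = Fraction(e_i, p_i)
        for assigned in processors:
            demand = sum(rbf_star(other, d_i) for other in assigned)
            util = sum(Fraction(e, p) for (e, d, p) in assigned)
            # Inequality 7.27 and Inequality 7.28, respectively.
            if d_i - demand >= e_i and 1 - util >= u_i:
                assigned.append(task)
                break
        else:
            return None  # no processor satisfied both conditions
    return processors
```

With the two tasks (1, 1, 10) and (1, 2, 20), the second task fails Inequality 7.27 on the processor holding the first (2 − 6/5 is less than 1), so the sketch returns `None` for m = 1 and places the tasks on separate processors for m = 2. Recomputing the sums for every candidate processor keeps the code short; an implementation concerned with running time would maintain them incrementally.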

§ Constrained Task Systems. We may eliminate the need for checking Inequal-

ity 7.28 by considering constrained task systems. For these systems, it is sufficient to

check only Inequality 7.27:

Lemma 7.9 For a constrained sporadic task system τ , any τi ∈ τ and πk ∈ Π satis-

fying Inequality 7.27, while attempting to assign τi to πk in dm-partition, will also

satisfy Inequality 7.28.

Proof: Observe that Inequality 7.27 implies

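\begin{align*}
& d_i - \sum_{\tau_j \in \tau(\pi_k)} (e_j + d_i \times u_j) \ge e_i \\
\Rightarrow\; & 1 - \left( \sum_{\tau_j \in \tau(\pi_k)} \frac{e_j}{d_i} \right) - \left( \sum_{\tau_j \in \tau(\pi_k)} u_j \right) \ge \frac{e_i}{d_i}.
\end{align*}

Since the first sum is non-negative and di ≤ pi,

\[
1 - \sum_{\tau_j \in \tau(\pi_k)} u_j \ge \frac{e_i}{d_i} \ge \frac{e_i}{p_i} = u_i.
\]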

Thus, Inequality 7.28 will evaluate to “true.”

7.2.2.2 Proof of Correctness

In order to prove that dm-partition is correct, we are obligated to show that after

each assignment of a task to a processor, the system is dm-feasible. In particular, we

must prove that if dm-partition returned partitioning succeeded then the set

of tasks assigned to each processor is dm-feasible on that processor (Theorem 7.7).

However, before we can prove the correctness of dm-partition, we need a feasibility

test for uniprocessors that uses rbf∗. The following lemma provides such a test.

Lemma 7.10 Tasks τ(πk) are dm-feasible on processor πk if for each τi ∈ τ(πk) and a ∈ N+ the following condition is satisfied:

\[
\exists t : (a - 1)p_i < t \le (a - 1)p_i + d_i ::
\left( a e_i + \sum_{\substack{\tau_j \in \tau(\pi_k) \\ j < i}} \mathrm{rbf}^*(\tau_j, t) \le t \right). \tag{7.29}
\]

Proof: By Lemma 9 of (Fisher and Baruah, 2005a), if Inequality 7.29 is satisfied,

then algorithm Approx(τ(πk), 0.5) (described in (Fisher and Baruah, 2005a)) will

return “τ(πk) is dm-feasible on πk.” By Theorem 5 of (Fisher and Baruah, 2005a),

Approx is correct, and the lemma follows.

The next lemma shows that algorithm dm-partition maintains the invariant

that Inequalities 7.27 and 7.28 remain true for all assigned tasks during every step

of the algorithm. The invariant is useful to show that the assignment of a task to a

processor does not affect the feasibility of the previously-assigned tasks.

Lemma 7.11 For each πk ∈ Π, the following conditions always hold for algorithm

dm-partition: for every τj ∈ τ(πk),

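\[
d_j - \sum_{\substack{\tau_\ell \in \tau(\pi_k) \\ \ell < j}} \mathrm{rbf}^*(\tau_\ell, d_j) \ge e_j \tag{7.30}
\]

and

\[
1 - \sum_{\substack{\tau_\ell \in \tau(\pi_k) \\ \ell < j}} u_\ell \ge u_j. \tag{7.31}
\]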

Proof: Observe that Inequality 7.30 or 7.31 for some πk ∈ Π can only be falsified

by the assignment of a task τi by algorithm dm-partition. Thus, we only need to

show that Lemma 7.11 is maintained after every task assignment. We will prove this

by induction on the assignment of tasks:

Base Case: Prior to any assignment of a task to a processor by dm-partition,

each processor is empty. Therefore, Inequalities 7.30 and 7.31 are vacuously

true.

Inductive Step: Assume Inequalities 7.30 and 7.31 remain true for the assign-

ments of τ1, τ2, . . . , τi−1 (by the inductive hypothesis). We must show that

Inequalities 7.30 and 7.31 continue to hold after the assignment of τi. Let τi

be assigned to processor πk by dm-partition. Observe that Inequalities 7.30 and 7.31 for πs ≠ πk are unaffected and remain true. Additionally, for τj ∈ τ(πk) − {τi}, it must be that j < i, since tasks are assigned in order by dm-partition. Thus, for all τj ∈ τ(πk) − {τi}, Inequalities 7.30 and 7.31 are unaffected as well. The lemma follows as dm-partition ensures that Inequalities 7.30 and 7.31 are true (via Inequalities 7.27 and 7.28) for τi and πk upon assigning τi.

Finally, we are prepared to prove the correctness of algorithm dm-partition.

Theorem 7.7 If dm-partition returns partitioning succeeded, then the tasks

of τ assigned to processors of Π are dm-feasible on their respective processors.

Proof: The proof is by contradiction. Since dm-partition returned partitioning succeeded, each task of τ is assigned to a processor of Π. For the sake of contradiction, assume there exists a processor πk ∈ Π such that the tasks of τ(πk) will not always meet all deadlines when scheduled on πk. By the contrapositive of Lemma 7.10, this implies that there exists a task τi ∈ τ(πk) and a ∈ N+ such that

\[
\forall t : (a - 1)p_i < t \le (a - 1)p_i + d_i ::
\left( a e_i + \sum_{\substack{\tau_j \in \tau(\pi_k) \\ j < i}} \mathrm{rbf}^*(\tau_j, t) > t \right). \tag{7.32}
\]

By Lemma 7.11, each task-processor assignment satisfies Inequalities 7.30 and 7.31.

Inequality 7.30 implies

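\[
e_i + \sum_{\substack{\tau_j \in \tau(\pi_k) \\ j < i}} (e_j + d_i u_j) \le d_i \quad \text{(by definition of } \mathrm{rbf}^*\text{)}. \tag{7.33}
\]

The next inequality follows from multiplying both sides of Inequality 7.31 by $(a - 1)p_i$:

\[
[(a - 1)p_i] \left( u_i + \sum_{\substack{\tau_j \in \tau(\pi_k) \\ j < i}} u_j \right) \le (a - 1)p_i. \tag{7.34}
\]

By summing Inequalities 7.33 and 7.34, we obtain

\begin{align*}
& a e_i + \sum_{\substack{\tau_j \in \tau(\pi_k) \\ j < i}} \left( e_j + u_j [(a - 1)p_i + d_i] \right) \le (a - 1)p_i + d_i \\
\Rightarrow\; & a e_i + \sum_{\substack{\tau_j \in \tau(\pi_k) \\ j < i}} \mathrm{rbf}^*(\tau_j, (a - 1)p_i + d_i) \le (a - 1)p_i + d_i.
\end{align*}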

The last inequality contradicts Inequality 7.32. Therefore, our supposition that there

exists a πk where τ(πk) does not always meet all deadlines is incorrect. It follows that

for each πk ∈ Π, τ(πk) is dm-feasible on πk.
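The summation step in the proof above relies on the identity $u_i p_i = e_i$. Spelled out as a sketch (the summation index ranges over the higher-priority tasks $\tau_j \in \tau(\pi_k)$ with $j < i$), the left-hand sides of the two summed inequalities combine as follows:

```latex
\begin{align*}
e_i + (a - 1) p_i u_i &= e_i + (a - 1) e_i = a e_i, \\
\sum_{j < i} (e_j + d_i u_j) + (a - 1) p_i \sum_{j < i} u_j
  &= \sum_{j < i} \bigl( e_j + u_j [(a - 1) p_i + d_i] \bigr)
   = \sum_{j < i} \mathrm{rbf}^*\bigl(\tau_j, (a - 1) p_i + d_i\bigr).
\end{align*}
```

The resulting inequality therefore exhibits a witness $t = (a - 1)p_i + d_i$ inside the interval over which the contradiction hypothesis quantifies.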

7.2.2.3 Computational Complexity

Sorting the tasks in non-decreasing relative-deadline order requires Θ(n lg n) time. In attempting to map task τi, observe that Algorithm dm-partition essentially evaluates, in Inequalities 7.27 and 7.28, the workload generated by the previously-mapped (i − 1) tasks on each of the m processors. Since rbf∗(τj, t) can be evaluated in constant time (see Equation 7.25), a straightforward computation of this workload would require O(i + m) time. Hence the runtime of the algorithm in mapping all n tasks is no more than $\sum_{i=1}^{n} O(i + m)$, which is $O(n^2)$ under the reasonable assumption that m ≤ n.

7.2.3 Theoretical Evaluation

In this section, we quantitatively evaluate the effectiveness of dm-partition by providing sufficient conditions for success (Section 7.2.3.1) and resource-augmentation approximation bounds (Section 7.2.3.2).

7.2.3.1 Sufficient Schedulability Conditions

The main results of this section (Theorems 7.8 and 7.9) provide an upper-bound

on the minimum number of processors necessary for dm-partition to successfully

partition a sporadic task system. The bound is dependent on the utilization and load

parameter of the task system.

Before presenting the main theorems of this section, we require several technical

lemmas. The next lemma characterizes the conditions under which Inequalities 7.27

or 7.28 are trivially satisfied.

Lemma 7.12 Given a sporadic task system τ and an m unit-capacity processor sys-

tem Π, dm-partition has the following properties.

P1: If system-util(τ) ≤ 1, Inequality 7.28 is always satisfied.

P2: If system-util(τ) ≤ 1 and load(τ) ≤ 1 − system-util(τ), then Inequality 7.27 is

always satisfied.

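Proof: P1 is trivially true, since violating Inequality 7.28 requires that $\left(u_i + \sum_{\tau_j \in \tau(\pi_k)} u_j\right)$ exceed 1.

To see P2, observe that system-util(τ) ≤ 1 and load(τ) ≤ 1 − system-util(τ) implies, for all t > 0 (and in particular for t = di),

\begin{align*}
& \frac{\sum_{j=1}^{n} \mathrm{dbf}(\tau_j, t)}{t} \le 1 - \text{system-util}(\tau) \\
\Rightarrow\; & \sum_{j=1}^{n} \mathrm{dbf}(\tau_j, d_i) \le d_i - d_i\, \text{system-util}(\tau) \\
\Rightarrow\; & \sum_{j=1}^{n} \mathrm{dbf}(\tau_j, d_i) + d_i\, \text{system-util}(\tau) \le d_i. \tag{7.35}
\end{align*}

Consider any τi ∈ τ. Observe that for all j < i, dj ≤ di. Thus, for all j ≤ i, dbf(τj, di) ≥ ej. Observing that {τ1, τ2, . . . , τi−1} ⊂ τ and combining this lower bound on dbf with Inequality 7.35 implies

\begin{align*}
& e_i + \sum_{j=1}^{i-1} e_j + d_i \left( \sum_{j=1}^{i-1} u_j \right) \le d_i \\
\Rightarrow\; & e_i + \sum_{j=1}^{i-1} (e_j + d_i u_j) \le d_i \\
\Rightarrow\; & d_i - \sum_{j=1}^{i-1} \mathrm{rbf}^*(\tau_j, d_i) \ge e_i. \tag{7.36}
\end{align*}

P2 follows from Inequality 7.36 and noting that for all πk ∈ Π, τ(πk) ⊆ {τ1, τ2, . . . , τi−1}.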

Corollary 7.3 Any sporadic task system τ with (system-util(τ) ≤ 1) ∧

(load(τ) ≤ 1 − system-util(τ)) can be successfully partitioned using dm-partition on

m ≥ 1 processors.

The next two lemmas provide an upper bound on the number of processors on which either Inequality 7.27 or 7.28 of dm-partition will evaluate to false.

Lemma 7.13 Let m1 denote the number of processors on which Inequality 7.27 fails

while dm-partition attempts to assign τi to a processor. It must be the case that

\[
m_1 < \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}}. \tag{7.37}
\]

Proof: Let Π1 ⊆ Π be the set of all processors on which Inequality 7.27 fails. Then,

for all πk ∈ Π1,

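\begin{align*}
& d_i - \sum_{\tau_j \in \tau(\pi_k)} \mathrm{rbf}^*(\tau_j, d_i) < e_i \\
\Rightarrow\; & d_i - \sum_{\tau_j \in \tau(\pi_k)} \left( \mathrm{dbf}(\tau_j, d_i) + d_i u_j \right) < e_i. \tag{7.38}
\end{align*}

Inequality 7.38 follows from Inequality 7.26 of Lemma 7.8. By noting that for each πk, τ(πk) is a subset of τ − {τi}, and summing Inequality 7.38 over all πk ∈ Π1, we obtain

\begin{align*}
& m_1 d_i - \sum_{j=1}^{n} \mathrm{dbf}(\tau_j, d_i) - d_i\, \text{system-util}(\tau) < m_1 e_i - e_i \\
\Rightarrow\; & m_1 (d_i - e_i) + e_i < \sum_{j=1}^{n} \mathrm{dbf}(\tau_j, d_i) + d_i\, \text{system-util}(\tau) \\
\Rightarrow\; & m_1 \left( 1 - \frac{e_i}{d_i} \right) + \frac{e_i}{d_i} < \mathrm{load}(\tau) + \text{system-util}(\tau).
\end{align*}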

The last inequality implies the lemma.

Lemma 7.14 Let m2 denote the number of processors on which Inequality 7.28 fails

and Inequality 7.27 is satisfied while dm-partition attempts to assign τi to a pro-

cessor. It must be the case that


\[
m_2 < \frac{\text{system-util}(\tau) - u_i}{1 - u_i}. \tag{7.39}
\]

Proof: Let Π2 ⊆ Π − Π1 be the set of all processors on which Inequality 7.28 fails

(while Inequality 7.27 is satisfied). Then, for all πk ∈ Π2,

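\[
1 - \sum_{\tau_j \in \tau(\pi_k)} u_j < u_i.
\]

Noting that for each πk, τ(πk) is a subset of τ − {τi}, and summing the above inequality over all πk ∈ Π2, we obtain

\begin{align*}
& m_2 - \text{system-util}(\tau) < m_2 u_i - u_i \\
\Rightarrow\; & m_2 (1 - u_i) < \text{system-util}(\tau) - u_i.
\end{align*}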

The last inequality implies the lemma.

We are now prepared to prove the sufficient conditions for the success of

dm-partition over a set of constrained task systems (Theorem 7.8) and arbitrary

task systems (Theorem 7.9).

Theorem 7.8 Any constrained sporadic task system τ is successfully scheduled by

dm-partition on m unit-capacity processors for any m satisfying

\[
m \ge \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)}. \tag{7.40}
\]

Proof: We will prove the contrapositive of the theorem. Assume that dm-partition

fails to assign task τi to any processor of Π. By Lemma 7.9, Inequality 7.27 is false

for every πk ∈ Π (i.e., m = m1). Thus, by Lemma 7.13,

\[
m < \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}}. \tag{7.41}
\]

Corollary 7.3 implies that system-util(τ) > 1 or load(τ) > 1 − system-util(τ). Either inequality implies that load(τ) + system-util(τ) > 1. Therefore, the right-hand side of Inequality 7.41 is maximized when $e_i/d_i$ is as large as possible. It follows that

\[
m < \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)},
\]


which proves the contrapositive of the theorem.

Theorem 7.9 Any sporadic task system τ is successfully scheduled by dm-partition

on m unit-capacity processors for any m satisfying

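\[
m \ge \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)} + \frac{\text{system-util}(\tau) - \text{max-util}(\tau)}{1 - \text{max-util}(\tau)}. \tag{7.42}
\]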

Proof: We will prove the contrapositive of the theorem. Assume that dm-partition fails to assign task τi to any processor of Π. We now consider four subcases based on the values of system-util(τ) and load(τ). Each of the subcases is either not possible or will imply the contrapositive of the theorem.

1. system-util(τ) ≤ 1 ∧ load(τ) ≤ 1 − system-util(τ): By Corollary 7.3, τ would be trivially partitionable on a single processor by dm-partition. Therefore, this subcase is impossible as it contradicts our supposition that dm-partition fails to assign some task to a processor.

2. system-util(τ) ≤ 1 ∧ load(τ) > 1 − system-util(τ): By Lemma 7.12, Inequal-

ity 7.28 is always satisfied. Therefore, Inequality 7.27 must be violated for all

πk when attempting to assign τi (i.e., m = m1). Using reasoning identical to

Theorem 7.8, we will obtain

\[
m < \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)},
\]

from which the contrapositive of the theorem follows.

3. system-util(τ) > 1 ∧ load(τ) ≤ 1 − system-util(τ): Notice that system-util(τ) > 1

implies that load(τ) < 0. This subcase is trivially impossible.

4. system-util(τ) > 1 ∧ load(τ) > 1 − system-util(τ): Recall that m1 denotes the number of processors on which Inequality 7.27 of dm-partition fails while attempting to assign τi, and m2 denotes the number of remaining processors, on which Inequality 7.28 fails. Therefore, m = m1 + m2. From Lemma 7.14,

\[
m_2 < \frac{\text{system-util}(\tau) - u_i}{1 - u_i}. \tag{7.43}
\]

Lemma 7.13 implies

\[
m_1 < \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}}. \tag{7.44}
\]

Summing Inequalities 7.44 and 7.43, we obtain

\[
m = m_1 + m_2 < \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \frac{e_i}{d_i}}{1 - \frac{e_i}{d_i}} + \frac{\text{system-util}(\tau) - u_i}{1 - u_i}. \tag{7.45}
\]

Since system-util(τ) > 1 and load(τ) + system-util(τ) > 1, the right-hand-side of

Inequality 7.45 is maximized when both ei

diand ui are as large as possible. This

implies the contrapositive of the theorem.
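The processor counts guaranteed by Theorems 7.8 and 7.9 can be computed in the same way as any closed-form bound. The sketch below is illustrative: the function name and the `constrained` flag are hypothetical, and the four task-system parameters are assumed to have been computed separately.

```python
from fractions import Fraction
from math import ceil


def dm_partition_processor_bound(load, system_util, max_job_density, max_util,
                                 constrained=False):
    """Smallest integer m satisfying Inequality 7.40 (constrained) or 7.42 (arbitrary)."""
    load = Fraction(load)
    system_util = Fraction(system_util)
    delta = Fraction(max_job_density)
    u_max = Fraction(max_util)
    if delta >= 1 or u_max >= 1:
        # Both denominators must be positive for the bounds to be meaningful.
        raise ValueError("requires max-job-density < 1 and max-util < 1")
    first = (load + system_util - delta) / (1 - delta)
    if constrained:
        return max(1, ceil(first))
    second = (system_util - u_max) / (1 - u_max)
    return max(1, ceil(first + second))
```

With load(τ) = 1, system-util(τ) = 3/2, and max-job-density(τ) = max-util(τ) = 1/2, the first term is 4 and the second is 2, so the sketch reports 4 processors for a constrained system and 6 for an arbitrary one.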

7.2.3.2 Resource Augmentation

Theorem 7.10 Given an identical multiprocessor platform Π with m processors and a sporadic task system τ (global or partitioned) feasible on Π, the dm-partition algorithm has the following performance guarantees:

1. if τ is a constrained system, then dm-partition will successfully partition τ upon a platform comprised of m processors that are each $\left(3 - \frac{1}{m}\right)$ times as fast as the processors of Π.

2. if τ is an arbitrary system, then dm-partition will successfully partition τ upon a platform comprised of m processors that are each $\left(4 - \frac{2}{m}\right)$ times as fast as the processors of Π.

Proof: Assume that we are given a task system τ feasible on m processors each of speed ξ. It follows from Lemma 7.4 that τ must satisfy the following properties:

\[
\begin{array}{ll}
\mathrm{load}(\tau) \le m \cdot \xi, & \text{system-util}(\tau) \le m \cdot \xi, \\
\text{max-job-density}(\tau) \le \xi, & \text{max-util}(\tau) \le \xi.
\end{array}
\tag{7.46}
\]

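Suppose that τ is successfully scheduled on m unit-capacity processors by dm-partition. We first show (1) by substituting the Inequalities of 7.46 above into Inequality 7.40 of Theorem 7.8:

\begin{align*}
& m \ge \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)} \\
\Leftarrow\; & m \ge \frac{m\xi + m\xi - \xi}{1 - \xi} \\
\Leftarrow\; & m \ge \frac{2m\xi - \xi}{1 - \xi} \\
\equiv\; & \xi \le \frac{m}{3m - 1} \\
\equiv\; & \frac{1}{\xi} \ge 3 - \frac{1}{m}.
\end{align*}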

The last implication is claimed by (1) of the theorem.

To show (2), we similarly substitute the Inequalities of 7.46 into Inequality 7.42

of Theorem 7.9:

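\begin{align*}
& m \ge \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)} + \frac{\text{system-util}(\tau) - \text{max-util}(\tau)}{1 - \text{max-util}(\tau)} \\
\Leftarrow\; & m \ge \frac{2m\xi - \xi}{1 - \xi} + \frac{m\xi - \xi}{1 - \xi} \\
\equiv\; & \xi \le \frac{m}{4m - 2} \\
\equiv\; & \frac{1}{\xi} \ge 4 - \frac{2}{m},
\end{align*}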

which is claimed by (2) of the theorem.
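The equivalences used in both parts of the proof compress a few lines of algebra. As a sketch, assuming $0 < \xi < 1$ so that multiplying through by $(1 - \xi)$ preserves the direction of the inequality:

```latex
\begin{align*}
m \ge \frac{2m\xi - \xi}{1 - \xi}
  &\iff m(1 - \xi) \ge (2m - 1)\xi
   \iff m \ge (3m - 1)\xi
   \iff \frac{1}{\xi} \ge 3 - \frac{1}{m}, \\
m \ge \frac{3m\xi - 2\xi}{1 - \xi}
  &\iff m(1 - \xi) \ge (3m - 2)\xi
   \iff m \ge (4m - 2)\xi
   \iff \frac{1}{\xi} \ge 4 - \frac{2}{m}.
\end{align*}
```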

7.2.3.3 Comparison with Prior Implicit-Deadline Partitioning Results

Polynomial-time partitioning algorithms for implicit-deadline systems are known (Oh

and Baker, 1998; Andersson and Jonsson, 2003). To evaluate the theoretical loss of

schedulability resulting from the move to a more general task model, let us consider

the resource augmentation bound of the partitioning algorithm analyzed by Andersson

and Jonsson (Andersson and Jonsson, 2003). By using the logic of Theorem 7.10, the

following theorem can be obtained from their 0.5m utilization bound:


Theorem 7.11 Given an identical multiprocessor platform Π with m processors and an implicit-deadline sporadic task system τ (global or partition) feasible on Π, τ can be partitioned in polynomial time upon a platform comprised of m processors that are each approximately 2 times as fast as the processors of Π.

7.3 Summary

Most prior theoretical research concerning the multiprocessor partitioning of sporadic

task systems has imposed the additional constraint that all tasks have their deadline

parameter equal to their period parameter. In this chapter, we have removed this con-

straint, and have considered the partitioning of arbitrary sporadic task systems upon

preemptive multiprocessor platforms. We have designed an algorithm for performing

the partitioning of a given collection of sporadic tasks upon a specified number of

processors, and have proved the correctness of, and evaluated the effectiveness of,

this partitioning algorithm. The techniques developed in this chapter allow for partitioning to be achieved in polynomial time while still ensuring a low constant-factor

resource-augmentation approximation ratio.


Chapter 8

Conclusions and Future Work

The increasing ubiquity of real-time systems has led to a diverse range in the behavior

and complexity of real-time applications. Researchers have addressed the increased

diversity for uniprocessor systems by developing general real-time task models that

successfully characterize complex interactions by real-time applications. With the re-

cent emergence and commercial acceptance of multicore platforms, many future real-

time systems will undoubtedly be multiprocessor systems. However, until recently,

most multiprocessor real-time research has focused primarily upon simple task models

such as the periodic or LL task model; the techniques available to a real-time system designer to temporally verify the correctness of an application exhibiting more complex behavior on a multiprocessor platform have been limited. The results in this dissertation increase the types of real-time applications and behavior that can be temporally verified upon a multiprocessor platform.

Towards the goal of verifying increasingly general task systems upon multiprocessor platforms, we have proposed the following thesis: Optimal online multiprocessor real-time scheduling algorithms for sporadic and more general task systems are

impossible; however, efficient, online scheduling algorithms and associated feasibility

and schedulability tests, with provably bounded deviation from any optimal test, exist.

In Section 8.1, we summarize the contributions of this dissertation that support this

thesis. In Section 8.2, we describe related work by us not included in this dissertation.

Section 8.3 describes a future research agenda on problems arising from this disserta-

tion and other related problems. We conclude with some remarks in Section 8.4.

8.1 Summary of Results

We will now summarize the contributions of this dissertation.

8.1.1 Efficient Workload Characterization for General Task

Systems

In Chapter 2, we observed that many traditional real-time workload characterizations

that were effective in validation techniques for LL multiprocessor systems perform

arbitrarily poorly for analyzing sporadic or more general task systems. Chapter 3

presents a characterization of real-time workload using a combination of demand-based

load and maximum job density (represented by load and max-job-density, respectively).

We describe how these two characterizations can be effectively determined for all

partially-specified recurrent task systems that generalize the sporadic task model and

those satisfying the task independence assumptions. To support this claim, we develop

several algorithms to efficiently compute load for sporadic task systems. We develop

two algorithms that approximate load to within an additive $\epsilon$ of its actual value in pseudo-polynomial time and a PTAS that requires $O(n^3/\epsilon)$ time.

8.1.2 Multiprocessor Feasibility Tests

In Chapter 4, we develop feasibility tests for real-time instances based on load and

max-job-density. We develop a test for the full-migration feasibility of a real-time instance; the test obtained is

$$\mathrm{load}(I) \le \frac{m - (m-2) \cdot \text{max-job-density}(I)}{1 + \text{max-job-density}(I)}.$$

We show that this test is a factor of at most ≈ 1.61 away from the optimal test that could be obtained using load and max-job-density. We also develop a test for restricted-migration feasibility:

$$\mathrm{load}(I) \le \frac{m - (m-1) \cdot \text{max-job-density}(I)}{3}.$$

We show that this test is a factor of at most 8/3 from the optimal attainable test using load and max-job-density. Per the discussion of Chapter 3 in Section 3.3, these tests may be immediately used as feasibility tests for partially-specified recurrent task systems for which the load and max-job-density parameters may be determined.

For partitioned systems, the schedulability tests of Chapter 7 give feasibility tests for partitioned sporadic task systems. The resource augmentation guarantees for the feasibility tests obtained in this dissertation are summarized in Table 8.1.

Scheduling Paradigm   | Task Model: Sporadic | Task Model: GMF/Recurring/etc.
Full-Migration        | $\sqrt{2} + 1$ (Theorem 4.4)
Restricted-Migration  | Constrained: $3 - \frac{1}{m}$; Arbitrary: $4 - \frac{2}{m}$ (Corollary 7.2) | $4 - \frac{1}{m}$ (Theorem 4.2)
Partitioned           | Constrained: $3 - \frac{1}{m}$; Arbitrary: $4 - \frac{2}{m}$ (Corollary 7.2) | Future Work

Table 8.1: A summary of the resource-augmentation approximation ratios guaranteed by the feasibility tests presented in this dissertation. In each entry, we give the approximation ratio and the corresponding theorem number. The results for the partitioned feasibility correspond to the schedulability tests of Chapter 7 (i.e., a schedulability test is also by definition a feasibility test). Also, the results for partitioned feasibility of sporadic task systems immediately imply resource-augmentation results for restricted-migration feasibility of sporadic task systems since a partitioned schedule is also by definition a restricted-migration schedule.
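The two load-based feasibility conditions summarized in this subsection are simple closed-form checks. The following is a minimal sketch of them, assuming that load and max-job-density have already been computed for the instance; the function names are hypothetical.

```python
from fractions import Fraction


def full_migration_feasible(load, max_job_density, m):
    """Sufficient full-migration test: load <= (m - (m-2)*delta) / (1 + delta)."""
    return load <= (m - (m - 2) * max_job_density) / (1 + max_job_density)


def restricted_migration_feasible(load, max_job_density, m):
    """Sufficient restricted-migration test: load <= (m - (m-1)*delta) / 3."""
    return load <= (m - (m - 1) * max_job_density) / 3


if __name__ == "__main__":
    m = 4
    delta = Fraction(1, 2)  # illustrative max-job-density value
    # With m = 4 and delta = 1/2 the two thresholds are 2 and 5/6.
    print(full_migration_feasible(Fraction(2), delta, m))
    print(restricted_migration_feasible(Fraction(5, 6), delta, m))
```

Both tests are sufficient only: an instance that fails a check is not thereby shown to be infeasible.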


8.1.3 Impossibility of Optimal Online Multiprocessor Schedul-

ing

The feasibility tests of Chapter 4 imply that, for a task system satisfying such a test, there is a

valid interleaving of jobs’ execution upon an m-processor platform that would meet

all deadlines. However, Chapter 5 shows that even though there may exist a valid

schedule there may not be an online algorithm that always generates a schedule

meeting all the task system’s deadlines. To show this, we give an example sporadic

task system that is feasible upon two processors (the proof of feasibility is contained

in Appendix A). We show for this task system, that there exists a sequence of job

releases that causes any choice made by an online scheduling algorithm to result in

a deadline miss. The existence of such a feasible task system implies that optimal

online multiprocessor scheduling algorithms do not exist for sporadic or more general

task systems. Thus, the search for scheduling algorithms and validation techniques

with constant-factor approximation ratios is justified.

8.1.4 Multiprocessor Schedulability Tests

We obtained schedulability tests for scheduling algorithms in Chapters 6 and 7. In

Chapter 6, we obtained schedulability tests for dm and edf under the full-migration

scheduling paradigm. For dm, if $\mathrm{load}(I) \le \frac{m - (m-1) \cdot \text{max-job-density}(I)}{3}$ is satisfied, then dm will meet all deadlines when scheduling I upon an m-processor platform. For edf, if both $\mathrm{load}(I) \le \frac{(\sqrt{2}-1)m^2}{2m-1}$ and $\text{max-job-density}(I) \le \frac{(\sqrt{2}-1)m}{2m-1}$ are satisfied, then I may be scheduled by edf to meet all deadlines upon an m-processor platform.
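The full-migration schedulability conditions for dm and edf stated above can be evaluated directly. The sketch below is illustrative only; the function names are hypothetical, and floating-point arithmetic is used because the edf condition involves √2.

```python
import math


def dm_global_schedulable(load: float, max_job_density: float, m: int) -> bool:
    """Sufficient dm test: load <= (m - (m-1)*delta) / 3."""
    return load <= (m - (m - 1) * max_job_density) / 3


def edf_global_schedulable(load: float, max_job_density: float, m: int) -> bool:
    """Sufficient edf test: load <= c*m and delta <= c, with c = (sqrt(2)-1)*m/(2m-1)."""
    c = (math.sqrt(2) - 1) * m / (2 * m - 1)
    return load <= c * m and max_job_density <= c


if __name__ == "__main__":
    # For m = 2, c is roughly 0.276, so the edf load threshold is roughly 0.552.
    print(edf_global_schedulable(0.5, 0.25, 2))
    print(dm_global_schedulable(0.5, 0.5, 2))
```

Inputs that sit exactly on the edf threshold may be affected by rounding, so a caller that needs a safe answer could subtract a small tolerance from the thresholds.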

In Chapter 7, we obtained schedulability tests for sporadic task systems assuming either edf or dm is used to schedule each individual processor. Furthermore, we obtained conditions for arbitrary sporadic task systems (i.e., no restriction is placed on the relationship between $d_i$ and $p_i$ for any task $\tau_i$) and tighter conditions for constrained-deadline sporadic task systems (i.e., task systems where each task $\tau_i$ has $d_i \le p_i$). For partitioned platforms scheduled by edf on each uniprocessor, an arbitrary sporadic task system τ is partitionable according to Algorithm partition if

$$m \ge \frac{2 \cdot \mathrm{load}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)} + \frac{\text{system-util}(\tau) - \text{max-util}(\tau)}{1 - \text{max-util}(\tau)},$$

and a constrained-deadline system τ is partitionable if

$$m \ge \frac{2 \cdot \mathrm{load}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)}.$$

For partitioned platforms scheduled by dm on each processor, an arbitrary sporadic task system τ is partitionable according to dm-partition if

$$m \ge \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)} + \frac{\text{system-util}(\tau) - \text{max-util}(\tau)}{1 - \text{max-util}(\tau)},$$

and a constrained-deadline system τ is partitionable if

$$m \ge \frac{\mathrm{load}(\tau) + \text{system-util}(\tau) - \text{max-job-density}(\tau)}{1 - \text{max-job-density}(\tau)}.$$

The resource augmentation guarantees for the schedulability tests obtained in this dissertation are summarized in Table 8.2.

Scheduling Paradigm                 | Task Model: Sporadic | Task Model: GMF/Recurring/etc.
Full-Migration                      | dm: $4 - \frac{1}{m}$ (Theorem 6.2); edf: $2 + 2\sqrt{2} - \frac{\sqrt{2}+1}{m}$ (Theorem 6.4)
Restricted-Migration / Partitioned  | For both edf and dm: Constrained: $3 - \frac{1}{m}$; Arbitrary: $4 - \frac{2}{m}$ (Corollary 7.2) | Future Work

Table 8.2: A summary of the resource-augmentation approximation ratios guaranteed by the schedulability tests presented in this dissertation. In each entry, we give the approximation ratio and the corresponding theorem number.
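The partitioning conditions above each give a lower bound on m that is sufficient for the corresponding algorithm to succeed. The following sketch evaluates those bounds. It is a hedged illustration: the function names are hypothetical, and the second term of the arbitrary-deadline conditions is taken to be (system-util − max-util)/(1 − max-util), as in the substitution carried out in the proof of Theorem 7.10, which should be treated as an assumption of this sketch. Both max-job-density and max-util are assumed to be strictly less than 1.

```python
import math
from fractions import Fraction


def edf_partition_bound(load, system_util, max_job_density, max_util, constrained):
    """Bound on m for partitioning with edf on each processor."""
    bound = (2 * load - max_job_density) / (1 - max_job_density)
    if not constrained:
        # Extra term for arbitrary deadlines (assumed form, see lead-in).
        bound += (system_util - max_util) / (1 - max_util)
    return bound


def dm_partition_bound(load, system_util, max_job_density, max_util, constrained):
    """Bound on m for partitioning with dm on each processor."""
    bound = (load + system_util - max_job_density) / (1 - max_job_density)
    if not constrained:
        bound += (system_util - max_util) / (1 - max_util)
    return bound


def sufficient_processors(bound) -> int:
    """Smallest positive integer m with m >= bound."""
    return max(1, math.ceil(bound))


if __name__ == "__main__":
    # Illustrative parameter values, not taken from the dissertation.
    params = (Fraction(1), Fraction(4, 5), Fraction(1, 2), Fraction(2, 5))
    for constrained in (True, False):
        print(constrained,
              sufficient_processors(edf_partition_bound(*params, constrained)),
              sufficient_processors(dm_partition_bound(*params, constrained)))
```

The bounds are sufficient conditions; a system may still be partitionable on fewer processors than `sufficient_processors` reports.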

8.2 Related Research Contributions

We now outline our other research contributions not previously mentioned in this

dissertation.


§ Partitioning of real-time tasks with memory constraints. Most prior theoretical research on partitioning algorithms for real-time multiprocessor platforms

has focused on ensuring that the cumulative computing requirements of the tasks

assigned to each processor do not exceed the processor’s processing power. However,

many multiprocessor platforms have only limited amounts of local per-processor mem-

ory (e.g., Intel’s IXP2XXX series of network processors); if the memory limitation of

a processor is not respected, thrashing between “main” memory and the processor’s

local memory may occur during run-time and may result in performance degradation.

Our research has developed an approximation algorithm for task partitioning (Fisher

et al., 2005). In addition, we have considered an approximation algorithm (Baruah

and Fisher, 2005a) for architectures that allow instruction-memory to be compressed

at the expense of additional instruction decoding time (examples include the ARM

Thumb and MIPS16); the goal in partitioning such an architecture is to find a com-

pression that minimizes the cumulative code-size of each task while simultaneously

ensuring temporal constraints.

§ Resource-locking durations in uniprocessor systems. There has recently

been much interest in the design and implementation of open environments (Deng

and Liu, 1997) (also called hierarchically-scheduled systems or real-time virtualiza-

tion) for real-time applications. Such open environments allow for multiple indepen-

dently developed and validated real-time applications to co-execute upon a single

shared platform. Given the specifications of such a system, an important objective is

to determine, for each non-preemptable serially reusable resource, the length of the

longest interval of time for which the resource may be locked. In (Bertogna et al.,

2006; Fisher et al., 2007a), we have extended current scheduling-theoretic analysis

techniques to obtain resource-locking durations. A high-level scheduler that arbi-

trates access to such non-preemptable shared resources among different applications


will use this knowledge to determine the schedulability of the entire system.

§ Fully polynomial-time approximation algorithms for uniprocessor schedu-

lability analysis. In static-priority scheduling algorithms for sporadic task systems,

each task is assigned a distinct priority, and all jobs of a task execute at the task’s

priority. The computational complexity of analyzing whether a given static-priority

assignment is feasible on a single processor is currently unknown; the best known tests

are pseudo-polynomial. In (Fisher and Baruah, 2005b; Fisher and Baruah, 2005a),

we propose a fully polynomial-time approximation scheme (FPTAS) with the following guarantee: for any specified value of $\epsilon$, where $0 < \epsilon < 1$, the FPTAS correctly identifies, in time polynomial in the number of tasks in the task system and $1/\epsilon$, all task systems that are static-priority feasible (with respect to a given priority assignment) on a processor that has $(1 - \epsilon)$ times the computing capacity of the processor for which the task system is specified.

8.3 Future Research Agenda

In this section, we will briefly outline some ideas for future research.

8.3.1 Open Questions from Dissertation

Multiprocessor real-time scheduling theory is still a nascent area of research; therefore,

there are many open areas of research not addressed in this dissertation. In this

subsection, we briefly list some potential avenues for future research extending the

results of this dissertation.

§ Tightened conditions. In Chapter 4, we show that our multiprocessor feasibility

condition using demand-based load is at most a factor of approximately 1.61 from the

optimal load condition. An interesting challenge is to decrease this gap in feasibility


analysis. Similarly, whether the corresponding gaps for schedulability analysis of various multiprocessor scheduling algorithms may be decreased is an open question.

§ Partitioned scheduling of general task systems. Despite our success at an-

alyzing partitioned scheduling for the sporadic task model, we are currently unaware

of non-trivial feasibility and schedulability algorithms for the partitioned scheduling

of more general task models (such as GMF or DAG-based tasks). An open avenue of

research is determining whether the techniques for partitioning sporadic task systems apply to more general task models.

§ Resource sharing in multiprocessor platforms. A large body of research

exists for resource sharing and synchronization in uniprocessor systems. Sophisti-

cated protocols have been developed to ensure that the number of priority inversions

is minimized (a priority inversion occurs when a lower-priority task “blocks” the

execution of a higher-priority task due to synchronization). However, the issue of re-

source sharing in multiprocessor systems has not been addressed for the general task

models discussed in this dissertation. Development of resource-sharing and synchronization protocols and analysis techniques is a prerequisite for the design of actual

multiprocessor real-time systems.

§ Multiprocessor scheduling under precedence constraints. For many actual

real-time systems, data must be communicated to a task τj before it may begin

execution. The producer of this data is frequently another task τi. In

such a scenario, there is a precedence constraint between τi and τj. The existence of

precedence constraints complicates schedulability analysis for a system. We would like

to explore whether schedulability-analysis techniques developed in this dissertation

could be meaningfully extended to account for precedence constraints between tasks.


8.3.2 Real-time Processor Virtualization with Resource Shar-

ing

As mentioned in Section 8.2, virtualized real-time systems (i.e., open environments)

have recently received much attention. There exists research that addresses the shar-

ing of additional shared non-preemptable resources; however, most of the previously-

proposed techniques place severe restrictions on the individual applications — in

effect requiring that the resource-requesting jobs comprising these applications be

made available in a first-come first-serve manner to the higher-level scheduler (as

would happen, e.g., if the applications were scheduled using non-flexible, table-driven

scheduling). We are currently developing an approach using sophisticated resource-

sharing protocols to increase the schedulability of the entire open environment and

decrease the complexity of system schedulability analysis. An interesting and chal-

lenging problem is to extend resource-sharing techniques for uniprocessor open envi-

ronments to multiprocessor open environments. Also, the development of operating

system support for developing applications in an open environment would pose many

practical research problems.

8.3.3 Approximate Response-time Analysis for Uniproces-

sors

As mentioned in Section 8.2, some of our research has been on developing an FPTAS

for schedulability analysis on uniprocessor systems. An interesting problem, related to

schedulability analysis, is response-time analysis. Response-time analysis determines

the maximum elapsed time from release of a task to its completion with respect to a

given scheduling algorithm. The response-time analysis is useful in modeling the com-

munication between tasks in a distributed real-time system. Typically, response-time


analysis must be performed multiple times to successfully obtain a useful communi-

cation model. Thus, a fast, tunable approximation of response-time analysis would

increase the efficiency of such techniques. We have recently been exploring (Richard

et al., 2007; Fisher et al., 2007b) whether an FPTAS for response-time analysis can

be obtained for static-priority, uniprocessor systems, and there remain opportunities

for further research in this topic.

8.4 Concluding Remarks

With the emergence of multicore and related technologies, the standard for future

embedded real-time systems will be a multiprocessor platform. Due to the hetero-

geneity of system consumers, applications run upon these platforms are likely to be

extremely diverse and characterized by complex software interactions (e.g., commu-

nication and resource-sharing). Current temporal analysis techniques cannot address

many of these complex interactions on a multiprocessor system; this dissertation ad-

dresses some fundamental problems of analyzing complex multiprocessor real-time

systems that were unanalyzable with previously known methods. Future research

will continue to remove some of the simplifying assumptions from the real-time task

models, thereby increasing the number of real-time systems that may utilize multi-

processor platforms.


Appendix A


Section 5.1 introduced task system τexample (given by Equation 5.1) used to prove that

online multiprocessor scheduling of arbitrary and constrained task systems requires

clairvoyance. In this appendix, we give a formal proof of Theorem 5.1. In other

words, we prove the task system τexample is feasible on two processors. In Section A.1,

we informally outline our proof. In Section A.2, we introduce the formal notation

involved in τexample’s feasibility proof. In Section A.3, we give the entire feasibility

proof.

A.1 Outline

The goal of Theorem 5.1 is to show that task system τexample is feasible on two

processors. However, we are unaware of any existing, non-trivial feasibility test for

constrained-deadline task systems on a multiprocessor platform. Thus, we must tailor

an argument specially for task system τexample, and show that for every legal arrival

sequence of jobs of τexample there exists a schedule where no deadlines are missed. The

following steps informally explain our proof of feasibility.

1. Show that τexample−{τ6} is feasible on two processors: This can be shown

by giving a partition of τexample−{τ6} on two processors. The tasks individually

assigned to a processor will be shown to be feasible on that processor with

respect to the Deadline-Monotonic (dm) scheduling algorithm. Let the schedule


constructed by this approach be called SI . Lemma A.2 corresponds to this step.

2. Construct a modified schedule $S'_I$: For any real-time instance $I \in \mathcal{I}^S_{WCET}(\tau_{example})$, we construct a global schedule (i.e., non-partitioned) by moving as much work as possible to the first processor π1. The construction for $S'_I$ is given by Equation A.18. Lemma A.3 proves the validity of $S'_I$. In Lemmas A.4, A.5, A.6, A.7, and A.8, we prove several desirable properties that $S'_I$ exhibits.

3. Construct a schedule $S''_I$ that leaves sufficient room for τ6 to be completely assigned to the second processor: We use the properties of $S'_I$ (Lemmas A.4, A.5, A.6, A.7, and A.8) to show that a schedule $S''_I$ can always be constructed that leaves the second processor idle for four units between the release and deadline of a job of τ6. Obviously, τ6 can be completely assigned to these idle times. Therefore, τexample is feasible on two unit-capacity processors (Theorem 5.1).

Please note that we only consider real-time instances in $\mathcal{I}^S_{WCET}(\tau_{example})$; the feasibility of any instance $I' \in \mathcal{I}^S(\tau_{example})$ follows from the fact that there exists an $I \in \mathcal{I}^S_{WCET}(\tau_{example})$ such that $I' \in \mathcal{F}(I)$. So, we only need to consider a valid schedule $S''_I$, and it suffices to use the same schedule for $I'$ (except that the jobs of $I'$ will potentially execute for less time than the jobs of I).

In the next section, we discuss some additional notation used in this appendix. In

Section A.3, we formally carry out the steps outlined in this section.

A.2 Notation

In this section, we present notation for describing the behavior of a sporadic task

system τ . The remainder of this appendix heavily relies on the notation presented in


Sections 1.3 and 1.4. In the remainder of this appendix, we will assume that τ is a

constrained-deadline system (i.e., for all τi ∈ τ , di ≤ pi). For each I ∈ I SWCET(τ), let

I(τi) ⊆ I denote the jobs generated by τi in instance I.

The next two functions give the “nearest” job release-time and deadline with re-

spect to a given time t and real-time instance I(τi).

Definition A.1 (Job-Release Function) If τi is in a scheduling window at time t

in real-time instance I then ri(I, t) is the release time of the most recently released

job of τi (with respect to time t). Otherwise, ri(I, t) = ∞ if τi is not in a scheduling

window at time t. More formally,

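$$r_i(I, t) \overset{\mathrm{def}}{=} \begin{cases} A_k, & \text{if } \exists J_k \in I(\tau_i) \text{ such that } A_k \le t \le A_k + D_k \\ \infty, & \text{otherwise.} \end{cases} \tag{A.1}$$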

Definition A.2 (Job-Deadline Function) If τi is in a scheduling window at time

t for real-time instance I then di(I, t) is the absolute deadline of the most recently

released job of τi (with respect to time t). Otherwise, di(I, t) = −∞ if τi is not in a

scheduling window at time t.

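$$d_i(I, t) \overset{\mathrm{def}}{=} \begin{cases} A_k + D_k, & \text{if } \exists J_k \in I(\tau_i) \text{ such that } A_k \le t \le A_k + D_k \\ -\infty, & \text{otherwise.} \end{cases} \tag{A.2}$$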

The following function is useful for identifying the current active job (if any) of

task τi at time t.

Definition A.3 (Active-Job Function) If τi is in a scheduling window at time t for real-time instance I, then ϕi(I, t) is the currently active job at time t. Otherwise, ϕi(I, t) = ⊥ if τi is not in a scheduling window at time t.

$$\varphi_i(I, t) \overset{\mathrm{def}}{=} \begin{cases} J_k, & \text{if } \exists J_k \in I(\tau_i) \text{ such that } A_k \le t \le A_k + D_k \\ \bot, & \text{otherwise.} \end{cases} \tag{A.3}$$
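Definitions A.1 through A.3 can be read as three lookups over the jobs that a single task generates. The following is a minimal sketch under stated assumptions: the `Job` record and the function names are hypothetical, `None` stands in for ⊥, and when two scheduling windows touch at time t (possible when a relative deadline equals the period) the most recently released job is chosen, following the wording of the definitions.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Job:
    arrival: float   # A_k
    deadline: float  # relative deadline D_k


def active_job(jobs: Sequence[Job], t: float) -> Optional[Job]:
    """phi_i(I, t): the job whose scheduling window contains t, or None."""
    candidates = [j for j in jobs if j.arrival <= t <= j.arrival + j.deadline]
    if not candidates:
        return None
    # Prefer the most recently released job if windows touch at t.
    return max(candidates, key=lambda j: j.arrival)


def job_release(jobs: Sequence[Job], t: float) -> float:
    """r_i(I, t): release time of the active job, or +infinity."""
    job = active_job(jobs, t)
    return job.arrival if job is not None else math.inf


def job_deadline(jobs: Sequence[Job], t: float) -> float:
    """d_i(I, t): absolute deadline of the active job, or -infinity."""
    job = active_job(jobs, t)
    return job.arrival + job.deadline if job is not None else -math.inf


if __name__ == "__main__":
    jobs = [Job(0, 4), Job(5, 4)]  # illustrative jobs of one task
    print(job_release(jobs, 2), job_deadline(jobs, 6), active_job(jobs, 4.5))
```

The choice of +∞ for the release function and −∞ for the deadline function mirrors Equations A.1 and A.2.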

Similar to Definition 1.2 which defined a schedule function with respect to jobs of

a real-time instance, we can define the schedule S as a function with respect to task

τi.

Definition A.4 (Task-Schedule Function) SI(πℓ, t, τi) is an indicator function de-

noting whether task τi is scheduled at time t for schedule SI . In other words,

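$$S_I(\pi_\ell, t, \tau_i) \overset{\mathrm{def}}{=} \begin{cases} 1, & \text{if } \exists J_k \in I(\tau_i) :: S_I(\pi_\ell, t, J_k) = 1 \\ 0, & \text{otherwise.} \end{cases} \tag{A.4}$$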

The next definition defines the work that task τi receives on πℓ over a given interval.

The job work function (Definition 1.4) is used.

Definition A.5 (Task Work Function) Wi(SI , πℓ, t1, t2) denotes the amount of pro-

cessor time that τi receives from schedule SI on processor πℓ over the interval [t1, t2)

for real-time instance I. In other words,

$$W_i(S_I, \pi_\ell, t_1, t_2) \overset{\mathrm{def}}{=} \sum_{J_k \in I(\tau_i)} W(S_I, J_k, t_1, t_2). \tag{A.5}$$

The next function (Definition A.6) is useful for characterizing the maximum

amount of processor time a task may receive over a given interval of time.

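Definition A.6 (Execution-Bound Function) ebf(τi, t) bounds the maximum amount of time that a constrained task τi (i.e., di ≤ pi) can execute over an interval of length t. Specifically, ebf(τi, t) is the maximum execution time of jobs of τi over the interval [0, t] over all real-time instances I and valid schedules. Formally,

$$\mathrm{ebf}(\tau_i, t) \overset{\mathrm{def}}{=} \max \left\{ \begin{array}{l} \left\lfloor \frac{t}{p_i} \right\rfloor e_i + g_1\left(\left\lfloor \frac{t}{p_i} \right\rfloor, t\right) + g_2\left(\left\lfloor \frac{t}{p_i} \right\rfloor, t\right), \\[6pt] \left(\left\lfloor \frac{t}{p_i} \right\rfloor - 1\right) e_i + g_1\left(\max\left(\left\lfloor \frac{t}{p_i} \right\rfloor - 1, 0\right), t\right) + g_2\left(\max\left(\left\lfloor \frac{t}{p_i} \right\rfloor - 1, 0\right), t\right) \end{array} \right\}, \tag{A.6}$$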

where

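$$g_1(c, t) \overset{\mathrm{def}}{=} \min\left(e_i, \max\left(0, t - c \times p_i - (p_i - d_i) - g_2(c, t)\right)\right) \tag{A.7}$$

and

$$g_2(c, t) \overset{\mathrm{def}}{=} \min\left(e_i, t - c \times p_i\right). \tag{A.8}$$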
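Equations A.6 through A.8 translate directly into a few lines of code. The following is a sketch for a single task with execution requirement e, period p, and relative deadline d ≤ p; the function and parameter names are hypothetical.

```python
def g2(e, p, c, t):
    """Equation A.8: min(e, t - c*p)."""
    return min(e, t - c * p)


def g1(e, p, d, c, t):
    """Equation A.7: min(e, max(0, t - c*p - (p - d) - g2(c, t)))."""
    return min(e, max(0, t - c * p - (p - d) - g2(e, p, c, t)))


def ebf(e, p, d, t):
    """Equation A.6: the larger of the two candidate bounds."""
    k = t // p  # floor(t / p)
    first = k * e + g1(e, p, d, k, t) + g2(e, p, k, t)
    c = max(k - 1, 0)
    second = (k - 1) * e + g1(e, p, d, c, t) + g2(e, p, c, t)
    return max(first, second)


if __name__ == "__main__":
    # Illustrative task: e = 2, p = 5, d = 4.
    for t in (3, 7, 10):
        print(t, ebf(2, 5, 4, t))
```

For the illustrative task, an interval of length 10 admits up to 6 units of execution (one fully contained job plus two overhanging jobs that each contribute their full e = 2), which the second branch of the maximum captures.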

The following claim provides an upper bound on the amount of time that τi may

execute over any interval of length t, if ci jobs are completely “contained” within the

interval:

Claim A.1 Let $I \in \mathcal{I}^S_{WCET}(\tau)$ be a real-time instance for constrained-deadline sporadic task system τ, and let $t_1, t_2 \in \mathbb{R}^+$ be such that $0 \le t_1 \le t_2$. Let $c_i$ be the number of jobs of τi with arrival time in $[t_1, t_2]$ that arrive at least $p_i$ time units before $t_2$ (i.e., $c_i \overset{\mathrm{def}}{=} |\{J_k \in I(\tau_i) : t_1 \le A_k < t_2 - p_i\}|$). Then, the following inequality bounds the maximum amount of work that can occur over the interval $[t_1, t_2)$ on task τi ∈ τ in any valid schedule $S_I$ on platform Π:

$$\sum_{\pi_\ell \in \Pi} W_i(S_I, \pi_\ell, t_1, t_2) \le c_i \times e_i + g_1(c_i, t_2 - t_1) + g_2(c_i, t_2 - t_1). \tag{A.9}$$

Proof: Notice that for I(τi) there is at most one job that arrives within $p_i$ prior to $t_1$ (denote this job by $J_{prior}$), and at most one job that arrives within $p_i$ prior to $t_2$ (denote this job by $J_{after}$). Let $C_i$ denote the set $\{J_k \in I(\tau_i) : t_1 \le A_k < t_2 - p_i\}$; hence, $c_i = |C_i|$. The amount of execution by jobs of τi that can occur over the interval $[t_1, t_2)$ is:

Work done by job $J_{prior}$ in the interval $[t_1, t_2)$ +

Execution requirement of jobs corresponding to the arrivals in $C_i$ +

Work done by job $J_{after}$ in the interval $[t_1, t_2)$

Obviously, the execution requirement of jobs of Ci is equal to ci × ei; so, we

will focus on the maximum work contributed by the remaining two jobs. Assume that

job Jprior is released at time t1 − (pi − xprior) and job Jafter is released at time t2 −

xafter where 0 ≤ xprior, xafter < pi. If Jafter (or Jprior) does not exist, we will use

the convention that xafter (or xprior, respectively) will be equal to zero which implies

that the overhanging job does not execute during the interval [t1, t2). Since there

are ci jobs arriving in-between jobs Jprior and Jafter, the following inequality must

hold by minimum inter-arrival separation parameter pi (see Figure A.1 for a visual

justification of this inequality):

$$\begin{aligned} & (t_2 - x_{after}) - (t_1 - (p_i - x_{prior})) \ge (c_i + 1) p_i \\ \Rightarrow\ & x_{prior} + x_{after} \le (t_2 - t_1) - c_i p_i. \end{aligned} \tag{A.10}$$

Since Jprior is released at time t1 − (pi − xprior), its deadline occurs at t1 − (pi −

xprior) + di. Therefore, the most Jprior may execute during the interval [t1, t2) is

min(ei, max(0, xprior − (pi − di))). Since Jafter is released at time t2 − xafter, its dead-

line occurs at t2 − xafter + di. The most Jafter may execute during interval [t1, t2) is

min(ei, xafter). The maximum amount of time that jobs Jprior and Jafter may execute

in [t1, t2) is

$$F(x_{prior}, x_{after}) \overset{\mathrm{def}}{=} \min(e_i, \max(0, x_{prior} - (p_i - d_i))) + \min(e_i, x_{after}). \tag{A.11}$$


[Figure A.1: timeline marking $t_1 - (p_i - x_{prior})$, $t_1$, $t_2 - x_{after}$, and $t_2$, with $c_i$ job releases in between and successive releases separated by at least $p_i$.]

Figure A.1: The example release sequence shows that times $t_1 - (p_i - x_{prior})$ and $t_2 - x_{after}$ are separated by at least $(c_i + 1)p_i$ time. The “up-arrows” indicate a job release; the corresponding “down-arrows” indicate the job’s absolute deadline. Inequality A.10 follows from the following facts: (i) $c_i$ jobs of τi arrive between $J_{prior}$ and $J_{after}$ and each job arrival is separated by $p_i$ time units; (ii) the successive job of τi after $J_{prior}$ arrives at least $p_i$ time units after $J_{prior}$ (note that if $J_{prior}$ does not exist then by assumption $x_{prior}$ is zero); and (iii) by assumption of the claim, the last job in the sequence of $c_i$ jobs must arrive at least $p_i$ time units before $t_2$ (this last fact is important if $J_{after}$ does not exist and $x_{after}$ is equal to zero).

The following values of xprior and xafter maximize F (xprior, xafter) (Equation A.11):

$$x^*_{after} = \min\left(e_i, (t_2 - t_1) - c_i p_i\right) \tag{A.12}$$

and

$$x^*_{prior} = (t_2 - t_1) - c_i p_i - x^*_{after}. \tag{A.13}$$

To see why the above values of $x^*_{prior}$ and $x^*_{after}$ maximize Equation A.11, let us consider a fixed $x'_{after}$ set to a value different than $x^*_{after}$. We will show that for each fixed value of $x'_{after}$ (and all possible values of $x'_{prior}$ subject to Inequality A.10), $F(x^*_{prior}, x^*_{after}) \ge F(x'_{prior}, x'_{after})$. First, note that $\min(e_i, \max(0, x_{prior} - (p_i - d_i)))$ and $\min(e_i, x_{after})$ are both non-decreasing, and increase at most linearly (with respect to $x_{prior}$ and $x_{after}$, respectively).

In the first case, we consider values of $x'_{after}$ exceeding $x^*_{after}$. Fix $x'_{after} = x^*_{after} + \gamma$ where $0 \le \gamma \le (t_2 - t_1) - c_i p_i - x^*_{after}$. The upper bound on γ follows from the fact that the $c_i$ jobs completely contained in $[t_1, t_2]$ require $c_i p_i$ time; the most we could increase $x'_{after}$ to is bounded by the interval length $(t_2 - t_1)$ minus the execution of completely contained jobs ($c_i p_i$) minus the value of $x^*_{after}$. If $(t_2 - t_1) - c_i p_i - x^*_{after} > 0$, then by the property of the min function in Equation A.12, $x^*_{after} = e_i$; however, in this case, $e_i = \min(e_i, x^*_{after}) = \min(e_i, x^*_{after} + \gamma) = \min(e_i, x'_{after})$. So, $\min(e_i, x'_{after})$ does not exceed $\min(e_i, x^*_{after})$. Furthermore, since $\min(e_i, \max(0, x_{prior} - (p_i - d_i)))$ is non-decreasing, its maximum value is achieved when $x'_{prior}$ is as large as possible, which is $x'_{prior} = (t_2 - t_1) - c_i p_i - x'_{after} = (t_2 - t_1) - c_i p_i - x^*_{after} - \gamma$. Since this value equals $x^*_{prior} - \gamma \le x^*_{prior}$, it is easy to see that $\min(e_i, \max(0, x^*_{prior} - (p_i - d_i))) \ge \min(e_i, \max(0, x'_{prior} - (p_i - d_i)))$; thus, $F(x^*_{prior}, x^*_{after}) \ge F(x'_{prior}, x'_{after})$ when $x'_{after}$ exceeds $x^*_{after}$.

In the second case, consider values of $x'_{after}$ at most $x^*_{after}$. Fix $x'_{after} = x^*_{after} - \gamma$ where $0 \le \gamma \le x^*_{after}$. Since $x^*_{after} \le e_i$, then $\min(e_i, x'_{after}) = \min(e_i, x^*_{after} - \gamma) = \min(e_i, x^*_{after}) - \gamma$. However, $\min(e_i, \max(0, x_{prior} - (p_i - d_i)))$ only grows linearly with $x_{prior}$; so, $\min(e_i, \max(0, x'_{prior} - (p_i - d_i))) \le \min(e_i, \max(0, x^*_{prior} - (p_i - d_i))) + \gamma$. Thus, $F(x'_{prior}, x'_{after}) \le \min(e_i, x^*_{after}) + \min(e_i, \max(0, x^*_{prior} - (p_i - d_i))) = F(x^*_{prior}, x^*_{after})$. Therefore, $x^*_{prior}$ and $x^*_{after}$ maximize Equation A.11.

Observe that when xprior and xafter are set according to Equations A.13 and A.12,

Equation A.11 is equal to g1(ci, t2 − t1)+ g2(ci, t2 − t1) which immediately implies the

claim.
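The maximization argument in the proof can be sanity-checked numerically on integer grids. The sketch below is illustrative and does not replace the proof; the function names are hypothetical, and `slack` stands for (t2 − t1) − ci·pi. It compares the value of F at the point given by Equations A.12 and A.13 against a brute-force search over all integer pairs with 0 ≤ x_prior, x_after < p that satisfy Inequality A.10.

```python
def F(e, p, d, x_prior, x_after):
    """Equation A.11."""
    return min(e, max(0, x_prior - (p - d))) + min(e, x_after)


def closed_form_point(e, slack):
    """Equations A.12 and A.13, with slack = (t2 - t1) - c_i * p_i."""
    x_after = min(e, slack)
    x_prior = slack - x_after
    return x_prior, x_after


def brute_force_max(e, p, d, slack):
    """Largest F over integer pairs with 0 <= x < p and x_prior + x_after <= slack."""
    best = 0
    for x_prior in range(p):
        for x_after in range(p):
            if x_prior + x_after <= slack:
                best = max(best, F(e, p, d, x_prior, x_after))
    return best


if __name__ == "__main__":
    e, p, d = 2, 5, 4  # illustrative task parameters
    for slack in range(0, 2 * p):
        x_prior, x_after = closed_form_point(e, slack)
        star = F(e, p, d, x_prior, x_after)
        # No feasible grid point exceeds the value at the closed-form point.
        assert brute_force_max(e, p, d, slack) <= star
        print(slack, (x_prior, x_after), star)
```

When the closed-form point itself lies inside the grid (both coordinates below p), the brute-force maximum matches its value exactly.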

Claim A.1 provides an upper bound on the amount of work with respect to ci, the

number of jobs completely contained in an interval [t1, t2). We will now show, in the

following lemma, that ebf is an upper bound on the amount of work over all possible

values of ci.

Lemma A.1 Let $S_I \in \mathcal{S}_{I,\Pi}$ be a schedule for constrained-deadline task system τ on platform Π (with respect to release sequence $I \in \mathcal{I}^S_{WCET}(\tau)$) that satisfies Conditions 1 and 2 of validity (Definition 1.6). Then, for all $t_1, t_2 \in \mathbb{R}^+$ such that $0 \le t_1 \le t_2$,

$$\mathrm{ebf}(\tau_i, t_2 - t_1) \ge \sum_{\pi_\ell \in \Pi} W_i(S_I, \pi_\ell, t_1, t_2). \tag{A.14}$$

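Proof: Observe that $c_i \le \left\lfloor \frac{t_2 - t_1}{p_i} \right\rfloor$. We will now provide a case analysis based on $c_i$ of the maximum work over interval $[t_1, t_2)$. There are three cases we will consider:

1. $c_i \le \left\lfloor \frac{t_2 - t_1}{p_i} \right\rfloor - 2$,

2. $c_i = \left\lfloor \frac{t_2 - t_1}{p_i} \right\rfloor - 1$, and

3. $c_i = \left\lfloor \frac{t_2 - t_1}{p_i} \right\rfloor$.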

We prove in each of the above cases that Equation A.14 holds. To see that Equation A.14 holds for Case 1, observe that $W_i(S_I, \pi_\ell, t_1, t_2) \le \left\lfloor \frac{t_2 - t_1}{p_i} \right\rfloor \times e_i$; this fact follows because the “overhanging” jobs of interval $[t_1, t_2)$ contribute at most $2e_i$ units of work while the “completely contained” jobs of $[t_1, t_2)$ contribute at most $\left(\left\lfloor \frac{t_2 - t_1}{p_i} \right\rfloor - 2\right) \times e_i$. From Equation A.6 (Definition A.6), it follows that $\mathrm{ebf}(\tau_i, t_2 - t_1) \ge \left\lfloor \frac{t_2 - t_1}{p_i} \right\rfloor \times e_i$ because functions $g_1$ and $g_2$ will both be non-negative; so, Equation A.14 holds for Case 1.

By Claim A.1, ebf(τi, t2−t1) is an upper bound on Wi(SI , πℓ, t1, t2) in both Cases 2

and 3; therefore, Equation A.14 holds for all possible cases and the lemma follows.

A.3 Proof

In this section, we prove Theorem 5.1 by following the steps outlined in Section A.1.

Section A.3.1 gives the construction for schedule SI . Section A.3.2 describes the

construction of schedule S ′I and gives several lemmas describing important properties

of S ′I . Section A.3.3 contains the proof of Theorem 5.1 by showing that a schedule

S ′′I can be constructed to accommodate τ6 on processor π2.

A.3.1 Construction of Schedule SI

The first step of the outline (Section A.1) of the proof requires us to show that

τexample − {τ6} is feasible on two processors. We can easily obtain feasibility of this

task system by partitioning τexample − {τ6} into two sets and scheduling each subset

on its own processor using a uniprocessor scheduling algorithm. For simplicity of analysis, we will use deadline-monotonic scheduling (dm) on each processor.

Audsley et al. (Audsley et al., 1991) developed a test to determine whether each task in a constrained-deadline task system can always meet all deadlines. Let $T_{H_i}$ be the set of tasks with priority greater than or equal to task τi under the dm priority assignment. The following theorem restates their result.

Theorem A.1 (from (Audsley et al., 1991)) In a constrained-deadline, sporadic task system, task τi always meets all deadlines using dm on a preemptive uniprocessor if and only if ∃t ∈ (0, di] such that

$$\mathrm{cbf}(\tau_i, t) \overset{def}{=} \sum_{\tau_j \in T_{H_i}} \mathrm{rbf}(\tau_j, t) + e_i \le t. \qquad (A.15)$$

Using this result, we obtain the following lemma which states that τexample − {τ6}

is feasible on the given two-processor platform:

Lemma A.2 τexample −{τ6} is feasible on a multiprocessor platform composed of two

unit-capacity processors.

Proof: Define the following partition of τexample − {τ6}:

$$\tau_A \overset{def}{=} \{\tau_1, \tau_4\}, \qquad (A.16)$$

and

$$\tau_B \overset{def}{=} \{\tau_2, \tau_3, \tau_5\}. \qquad (A.17)$$

Assign τA to π1 and τB to π2. It is easy to verify (by Theorem A.1) that τA and

τB are feasible with respect to their assigned processors.
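The verification via Theorem A.1 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: it assumes the request-bound function has the common form rbf(τj, t) = ⌈t/pj⌉ · ej, it sums over the tasks with strictly higher dm priority than τi, and it assumes integer parameters so that only multiples of higher-priority periods and di itself need to be tested. The parameters for τ1 and τ4 are the values used elsewhere in this appendix (e1 = 2, d1 = 2, p1 = 5; e4 = 2, d4 = 4, p4 = 100); the names `rbf` and `dm_schedulable` are hypothetical helpers, not part of the dissertation.

```python
from typing import List, Tuple

Task = Tuple[int, int, int]  # (e, d, p): WCET, relative deadline, period


def rbf(task: Task, t: int) -> int:
    """Assumed request-bound function: ceil(t / p) * e."""
    e, _, p = task
    return -(-t // p) * e  # integer ceiling division


def dm_schedulable(tasks: List[Task]) -> bool:
    """Check Equation A.15 for every task under deadline-monotonic priorities."""
    ordered = sorted(tasks, key=lambda task: task[1])  # shorter deadline first
    for i, (e_i, d_i, _) in enumerate(ordered):
        higher = ordered[:i]  # strictly higher-priority tasks (assumption)
        # cbf is a step function, so it suffices to test the right end of each
        # step: multiples of higher-priority periods up to d_i, and d_i itself.
        points = {d_i}
        for (_, _, p_j) in higher:
            points.update(range(p_j, d_i + 1, p_j))
        if not any(sum(rbf(tj, t) for tj in higher) + e_i <= t for t in points):
            return False
    return True


tau_A = [(2, 2, 5), (2, 4, 100)]  # tau_1 and tau_4
print(dm_schedulable(tau_A))  # True
```

The same check applies to τB once the parameters of τ2, τ3, and τ5 are substituted.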

Let $S_I$ be the schedule constructed by partitioned dm for task system τexample − {τ6} with partitions τA and τB. From the previous argument, $S_I$ is valid for any $I \in \mathcal{I}^S_{WCET}(\tau_{example} - \{\tau_6\})$.

A.3.2 Construction of Schedule S ′I

We now proceed to Step 2 of our proof outline: construct a schedule S ′I that is globally

(non-partitioned) feasible. The goal of this step is to move as much computation off

processor π2 as possible. To accomplish this goal, for every idle instant on processor

π1 in schedule SI , we move a task in its scheduling window on π2 to π1 (if such a task

exists). The construction “builds” schedule $S'_I$ for processor π1 first. After $S'_I(\pi_1, t)$ is constructed, $S'_I$ is constructed for π2. Note that such a schedule could not be constructed online (i.e., it is an off-line constructed schedule), since $S'_I(\pi_2, t)$ may require that $S'_I(\pi_1, t')$ be known for some t′ > t (i.e., $S'_I(\pi_2, t)$ requires knowledge of future events).

In schedule S ′I(π1, t), tasks of set τA (tasks τ1 and τ4) execute at exactly the same

times as they did in schedule SI(π1, t). However, the tasks of set τB move as much

execution as possible (without disturbing tasks of τA) from processor π2 to processor

π1. Consider an arbitrary time t; S ′I(π1, t) is constructed using the following rules:

1. If at time t processor π1 is busy executing a job from tasks of τA in schedule SI ,

S ′I(π1, t) equals SI(π1, t).

2. If processor π1 is idle at time t in schedule SI , then:

(a) If task τ5 is in its scheduling window (i.e., r5(I, t) < ∞) and it has executed for less than e5 time units in $S'_I$ on processor π1, then $S'_I$ at time t is set to the current active job of τ5, i.e., $S'_I(\pi_1, t) = \varphi_5(I, t)$;

(b) else, if task τ3 is in its scheduling window (i.e., r3(I, t) < ∞) and it has executed for less than e3 time units in $S'_I$ on processor π1, then $S'_I(\pi_1, t) = \varphi_3(I, t)$;

(c) else, if task τ2 is in its scheduling window (i.e., r2(I, t) < ∞) and it has executed for less than e2 time units in $S'_I$ on processor π1, then $S'_I(\pi_1, t) = \varphi_2(I, t)$;

(d) else, leave processor π1 idle.

The execution of jobs of tasks in τB that could not be moved to processor π1 is executed on processor π2 (with the added constraint that a task does not execute in parallel with itself). Figure A.2 presents a visual example comparing schedules $S_I$ and $S'_I$ for a possible release sequence. The following construction is the inductive formal definition of the modified schedule for all $I \in \mathcal{I}^S_{WCET}(\tau_{example} - \{\tau_6\})$ and t ≥ 0. Please note that $S'_I(\pi_1, t)$ is inductively constructed first for all t ≥ 0. $S'_I(\pi_2, t)$ is constructed after $S'_I$ for processor π1.

[Figure A.2 appears here. Panel (a): legend giving each task's parameter triple: τ1 (2, 2, 5), τ3 (1, 2, 6), τ4 (2, 4, ∞), τ5 (2, 6, ∞). Panels (b) and (c): two-processor schedules over the time axis 0 to 6.]

Figure A.2: Consider a release sequence containing tasks τ1, τ3, τ4, and τ5. (a) replicates the legend for these tasks. Let τ1, τ4, and τ5 release jobs at time t = 0; τ3 releases a job at t = 4; and, τ1 releases a second job at t = 5. (b) presents schedule $S_I$. (c) presents schedule $S'_I$. Note that the execution of τ5 in the interval [1, 2) is moved from the second processor to [4, 5) on the first processor.

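$$S'_I(\pi_1, t) \overset{def}{=} \begin{cases}
S_I(\pi_1, t), & \text{if } S_I(\pi_1, t) \ne \bot, \\
\varphi_5(I, t), & \text{if } r_5(I, t) < \infty \text{ and } W_5(S'_I, \pi_1, r_5(I, t), t) < e_5, \\
\varphi_3(I, t), & \text{if } r_3(I, t) < \infty \text{ and } W_3(S'_I, \pi_1, r_3(I, t), t) < e_3, \\
\varphi_2(I, t), & \text{if } r_2(I, t) < \infty \text{ and } W_2(S'_I, \pi_1, r_2(I, t), t) < e_2, \\
\bot, & \text{otherwise}
\end{cases}$$

$$S'_I(\pi_2, t) \overset{def}{=} \begin{cases}
\varphi_2(I, t), & \text{if } (S_I(\pi_2, t, \tau_2) = 1) \text{ and } (S'_I(\pi_1, t, \tau_2) = 0) \text{ and } \\
& (W_2(S'_I, \pi_1, r_2(I, t), d_2(I, t)) + W_2(S'_I, \pi_2, r_2(I, t), t) < e_2), \\
\varphi_3(I, t), & \text{if } (S_I(\pi_2, t, \tau_3) = 1) \text{ and } (S'_I(\pi_1, t, \tau_3) = 0) \text{ and } \\
& (W_3(S'_I, \pi_1, r_3(I, t), d_3(I, t)) + W_3(S'_I, \pi_2, r_3(I, t), t) < e_3), \\
\varphi_5(I, t), & \text{if } (S_I(\pi_2, t, \tau_5) = 1) \text{ and } (S'_I(\pi_1, t, \tau_5) = 0) \text{ and } \\
& (W_5(S'_I, \pi_1, r_5(I, t), d_5(I, t)) + W_5(S'_I, \pi_2, r_5(I, t), t) < e_5), \\
\bot, & \text{otherwise}
\end{cases} \qquad (A.18)$$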
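The two-pass construction of Equation A.18 can be illustrated with a small Python sketch. It makes simplifying assumptions that are not part of the formal definition: time is discretized into unit slots, each processor's schedule is a list holding a task name or `None` (idle) per slot, and jobs are given as hypothetical tuples (task, release, absolute deadline, execution requirement). The sample data follows the release sequence described in the caption of Figure A.2; the slot lists for $S_I$ are an assumed dm schedule for that sequence.

```python
from typing import List, Optional, Sequence, Tuple

Job = Tuple[str, int, int, int]  # (task, release, absolute deadline, WCET)
Slots = List[Optional[str]]      # task name per unit slot, None = idle


def active_job(jobs: Sequence[Job], task: str, t: int) -> Optional[Job]:
    """Job of `task` whose scheduling window [release, deadline) contains t."""
    for job in jobs:
        if job[0] == task and job[1] <= t < job[2]:
            return job
    return None


def work(slots: Slots, task: str, start: int, end: int) -> int:
    """Units executed by `task` in slots [start, end)."""
    return sum(1 for t in range(start, min(end, len(slots))) if slots[t] == task)


def build_s_prime(s_pi1: Slots, s_pi2: Slots, jobs: Sequence[Job],
                  order=("tau5", "tau3", "tau2")) -> Tuple[Slots, Slots]:
    horizon = len(s_pi1)
    # Pass 1: processor pi_1. Keep tau_A's slots; fill idle slots from tau_B.
    p1: Slots = [None] * horizon
    for t in range(horizon):
        if s_pi1[t] is not None:
            p1[t] = s_pi1[t]
            continue
        for task in order:
            job = active_job(jobs, task, t)
            if job is not None and work(p1, task, job[1], t) < job[3]:
                p1[t] = task
                break
    # Pass 2: processor pi_2. Keep only execution not already moved to pi_1.
    p2: Slots = [None] * horizon
    for t in range(horizon):
        task = s_pi2[t]
        if task is None or p1[t] == task:
            continue
        job = active_job(jobs, task, t)
        if job is None:
            continue
        moved = work(p1, task, job[1], job[2])  # uses "future" knowledge of pi_1
        if moved + work(p2, task, job[1], t) < job[3]:
            p2[t] = task
    return p1, p2


jobs = [("tau1", 0, 2, 2), ("tau4", 0, 4, 2), ("tau5", 0, 6, 2),
        ("tau3", 4, 6, 1), ("tau1", 5, 7, 2)]
s_pi1 = ["tau1", "tau1", "tau4", "tau4", None, "tau1", "tau1"]
s_pi2 = ["tau5", "tau5", None, None, "tau3", None, None]
p1, p2 = build_s_prime(s_pi1, s_pi2, jobs)
print(p1)  # tau5 now occupies slot 4 on pi_1
print(p2)  # tau5 keeps only slot 0 on pi_2
```

The second pass reads `p1` over the whole scheduling window of a job, which mirrors why the construction cannot be carried out online.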

Lemma A.3 $S'_I$ is valid for any $I \in \mathcal{I}^S_{WCET}(\tau_{example} - \{\tau_6\})$.

Proof: The first condition of validity (Definition 1.6) for schedule $S'_I$ is easily seen to be satisfied by noting that a task from τA = {τ1, τ4} is scheduled according to $S_I$, which is valid. Validity Condition 1 is satisfied for tasks in τB = {τ2, τ3, τ5} since $S'_I$ only schedules a task τi ∈ τB on π1 if it is in a scheduling window ($r_i(I, t) < d_i(I, t)$). A task τi ∈ τB is only scheduled on π2 in $S'_I$ if it was already scheduled on π2 in $S_I$ (which is valid).

The second condition of validity follows vacuously from the definition of $S'_I$. To show the third condition, we must show that any $J_k \in I(\tau_A) \cup I(\tau_B)$ that was generated by $\tau_i \in \tau_A \cup \tau_B$ satisfies $e_i \le \sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, A_k + D_k) \le e_i$. First, note that for any $J_k \in I(\tau_A)$, the third condition of validity follows immediately from the validity of $S_I$ because all jobs in $I(\tau_A)$ are scheduled exactly the same in $S'_I$ as in $S_I$. So, assume that $J_k \in I(\tau_B)$. Also, for $J_k$ it is the case that $E_k$ equals $e_i$, since $I \in \mathcal{I}^S_{WCET}(\tau)$.

Let us first show that $e_i \le \sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, A_k + D_k)$ for each $J_k \in I(\tau_B)$. Assume that there exists $J_k \in I(\tau_B)$ such that $e_i > \sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, A_k + D_k)$. Assume that $J_k$ was generated by τi ∈ τB. Observe there must be a minimum time instant t′ in the interval $[A_k, A_k + D_k]$ such that $e_i > \sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, t')$ and either $S'_I(\pi_1, t', \tau_i) = 1$ or $S'_I(\pi_2, t', \tau_i) = 1$. Regardless of which processor τi executes on at time t′, there are two cases with respect to the amount of work that has been done on processor π2 up until time t′. Either τi has executed on processor π2 (Case 1), or it has not (Case 2). We will show in both cases a contradiction arises.

1. $W_i(S'_I, \pi_2, A_k, t') > 0$: In this case, let $t'' \overset{def}{=} \max\{t \in [A_k, t'] : S'_I(\pi_2, t, \tau_i) = 1\}$. Note that t′′ must exist and $S'_I(\pi_2, t'', \tau_i) = 1$ as $W_i(S'_I, \pi_2, A_k, t') > 0$. However, $e_i > \sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, t')$ and $A_k = r_i(I, t)$ implies that $W_i(S'_I, \pi_1, r_i(I, t'), d_i(I, t')) + W_i(S'_I, \pi_2, r_i(I, t'), t'') > e_i$. This contradicts Equation A.18 and $S'_I(\pi_2, t'', \tau_i) = 1$.

2. $W_i(S'_I, \pi_2, A_k, t') = 0$: This implies that $W_i(S'_I, \pi_1, A_k, t') > e_i$. If $S'_I(\pi_1, t', \tau_i) = 1$, this contradicts Equation A.18. If $S'_I(\pi_2, t', \tau_i) = 1$, $W_i(S'_I, \pi_1, A_k, t') > e_i$ implies $W_i(S'_I, \pi_1, r_i(I, t'), d_i(I, t')) + W_i(S'_I, \pi_2, r_i(I, t'), t') > e_i$, which again contradicts Equation A.18.

Since in both cases a contradiction arises, such a t′ cannot exist and $e_i \le \sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, A_k + D_k)$.

Let us show that $e_i \ge \sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, A_k + D_k)$. Assume that there exists $J_k \in I(\tau_B)$ such that $e_i < \sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, A_k + D_k)$. So,

$$\forall t \in [A_k, A_k + D_k] : W_i(S'_I, \pi_1, r_i(I, t), d_i(I, t)) + W_i(S'_I, \pi_2, r_i(I, t), t) < e_i. \qquad (A.19)$$

(Again, note that $r_i(I, t)$ equals $A_k$ and $d_i(I, t)$ equals $A_k + D_k$.) Let $P_i \overset{def}{=} \{t \in [A_k, A_k + D_k] : S_I(\pi_2, t, \tau_i) = 1\}$. For all $t \in P_i$, $S'_I(\pi_2, t, \tau_i) = 0$ if and only if $S'_I(\pi_1, t, \tau_i) = 1$ (by Equations A.18 and A.19). This implies that $\sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, A_k + D_k) = \sum_{\pi_\ell \in \Pi} W_i(S_I, \pi_\ell, A_k, A_k + D_k)$. By the validity of $S_I$, $\sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, A_k + D_k) = e_i$. This contradicts our assumption; therefore, $e_i \ge \sum_{\pi_\ell \in \Pi} W_i(S'_I, \pi_\ell, A_k, A_k + D_k)$. Condition 3 of validity and the lemma follow.

We now wish to prove several lemmas which characterize the properties of schedule

S ′I with respect to the interval in which τ6 is in a scheduling window. The main

observation from these properties is that S ′I can be modified to provide enough idle

instants on processor π2 to successfully schedule τ6.

We will first quantify the maximum amount of τ5’s computation that has been

moved to processor π1 in the schedule S ′I . Let t5 be the arrival time Ak of some job

Jk ∈ I(τ5). The following lemma bounds the amount of processor π1’s computation

that the set of tasks τA can consume over the interval [t5, t5 +6] (i.e., the time interval

during which τ5 is in a scheduling window).

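Lemma A.4 For all $I \in \mathcal{I}^S_{WCET}(\tau_{example})$,

$$\sum_{\tau_i \in \tau_A} W_i(S'_I, \pi_1, t_5, t_5 + 6) \le 5.$$

Proof: By the synchronous arrival sequence:

$$W_1(S'_I, \pi_1, t_5, t_5 + 6) \le \mathrm{ebf}(\tau_1, 6) = 3$$

and,

$$W_4(S'_I, \pi_1, t_5, t_5 + 6) \le 2 \quad (\text{since } p_4 = 100).$$

Thus, $\sum_{\tau_i \in \tau_A} W_i(S'_I, \pi_1, t_5, t_5 + 6) \le 5$.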

The next lemma shows that the work done by task τ5 on processor π2 over the interval [t5, t5 + 6] in schedule $S'_I$ does not exceed one.

Lemma A.5 For all $I \in \mathcal{I}^S_{WCET}(\tau_{example})$, $W_5(S'_I, \pi_2, t_5, t_5 + 6) \le 1$.

Proof: By Lemma A.4, $\sum_{\tau_i \in \tau_A} W_i(S'_I, \pi_1, t_5, t_5 + 6) \le 5$. This implies that the set of idle points t ∈ [t5, t5 + 6] where $S_I(\pi_1, t) = \bot$ has cardinality at least one. By Equation A.18, τ5 will execute whenever τ1 and τ4 are not executing on π1; thus, $W_5(S'_I, \pi_1, t_5, t_5 + 6) \ge 1$. By validity of $S'_I$, $W_5(S'_I, \pi_2, t_5, t_5 + 6) \le 1$.

The goal of our argument is to show that there exists a multiprocessor schedule

for τexample − {τ6} which leaves processor π2 idle for at least four time units over the

interval during which τ6 is in a scheduling window. Our argument relies on either

having sufficient idle time in schedule S ′I to execute τ6 completely on π2, or being

able to move work of tasks τ2, τ3, and τ5 in schedule S ′I to accommodate task τ6 (by

creating a new schedule S ′′I ). Therefore, it would be useful to reason about intervals

during which π1 is continuously busy executing only tasks of τA ∪ {τ5}.

Observe that ebf(τ1, 8) + ebf(τ4, 8) + ebf(τ5, 8) = 4 + 2 + 2 = 8. However, task

τ5 may not always be able to entirely execute on processor π1 in schedule $S'_I$. For any t5 equal to the arrival time of some job in I(τ5), let $\alpha(t_5, S'_I)$ be the amount of time that τ5 executes on processor π2 in schedule $S'_I$ for real-time instance I. That is,

$$\alpha(t_5, S'_I) \overset{def}{=} W_5(S'_I, \pi_2, t_5, t_5 + 6). \qquad (A.20)$$

Therefore, the amount of time that τ5 can execute on processor π1 over any interval for instance I is at most $2 - \alpha(t_5, S'_I)$.
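As a small illustration of Equation A.20, the sketch below computes α for a discretized schedule. It assumes unit time slots and a list representation of processor π2 (task name or `None` per slot); the sample slot list is a hypothetical one chosen to be consistent with the caption of Figure A.2, where one unit of τ5 remains on π2. The function name `alpha` and its parameters are illustrative only.

```python
from typing import List, Optional


def alpha(pi2_slots: List[Optional[str]], t5: int,
          window: int = 6, task: str = "tau5") -> int:
    """Units of `task` executed on pi_2 during its scheduling window."""
    end = min(t5 + window, len(pi2_slots))
    return sum(1 for t in range(t5, end) if pi2_slots[t] == task)


# Hypothetical S'_I on pi_2: tau5 keeps one unit there, tau3 runs in slot 4.
s_prime_pi2 = ["tau5", None, None, None, "tau3", None, None]
a = alpha(s_prime_pi2, t5=0)
budget_on_pi1 = 2 - a  # e5 = 2, so at most 2 - alpha units can run on pi_1
print(a, budget_on_pi1)  # 1 1
```

In this example α = 1, which matches the bound of Lemma A.5.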

The next lemma will describe the implications of processor π1 being continuously busy during the interval [t5, t5 + 6] where τ5 must partially execute on processor π2. The implication of a continuously busy processor π1 for [t5, t5 + 6] is that τ4 must execute during this interval. Let t4 be an arrival time of some job of τ4 in I where [t5, t5 + 6] ∩ [t4, t4 + 4] ≠ ∅. The next lemma will show that π1 is continuously busy in the interval [t4, t4 + 4] as well.

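Lemma A.6 For all $I \in \mathcal{I}^S_{WCET}(\tau_{example} - \{\tau_6\})$, if $\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_5, t_5 + 6) = 6$ and $\alpha(t_5, S'_I) > 0$, then

$$\exists t_4 :: ([t_4, t_4 + 4] \cap [t_5, t_5 + 6] \ne \emptyset) \wedge \left( \sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_4, t_4 + 4) = 4 \right). \qquad (A.21)$$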

Proof: Observe that $W_4(S'_I, \pi_1, t_5, t_5 + 6) \le 2$ and $W_5(S'_I, \pi_1, t_5, t_5 + 6) \le 2 - \alpha(t_5, S'_I)$, by e4 = 2, e5 = 2, and the definition of α. This implies

$$W_1(S'_I, \pi_1, t_5, t_5 + 6) \ge 6 - 2 - (2 - \alpha(t_5, S'_I)) = 2 + \alpha(t_5, S'_I) > 2. \qquad (A.22)$$

Because e1 = 2, Equation A.22 means that there exist two jobs $J_1$ and $J_2$ of τ1 that have scheduling windows over the interval [t5, t5 + 6]. Let $r^1_1$ and $r^1_2$ be the release times of $J_1$ and $J_2$ in instance I(τ1), respectively. Observe that since $e_1 / d_1 = 1$, $J_1$ continuously executes on processor π1 over the interval $[r^1_1, r^1_1 + 2]$, and $J_2$ executes continuously over the interval $[r^1_2, r^1_2 + 2]$. Also, $t_5 \le r^1_1 + 2 \le r^1_2 \le t_5 + 6$. Because p1 = 5, $|r^1_2 - (r^1_1 + 2)| \ge 3$ and $W_1(S'_I, \pi_1, r^1_1 + 2, r^1_2) = 0$. Since $[r^1_1 + 2, r^1_2] \subset [t_5, t_5 + 6]$ and $\alpha(t_5, S'_I) > 0$, the most τ5 can execute on processor π1 in the interval $[r^1_1 + 2, r^1_2]$ is $2 - \alpha(t_5, S'_I)$. Therefore, $W_4(S'_I, \pi_1, r^1_1 + 2, r^1_2) > 1$, which implies there exists an arrival time t4 of some job of I(τ4) such that [t4, t4 + 4] ∩ [t5, t5 + 6] ≠ ∅.

From this it follows that there are two cases. We will show that the cases imply

Equation A.21. The cases are:

1. [t4, t4 + 4] ⊂ [t5, t5 + 6]: By the antecedent of the lemma, π1 is continuously busy executing jobs of I(τA ∪ {τ5}). Thus, Equation A.21 follows trivially.

2. |[t4, t4 + 4] ∩ [t5, t5 + 6]| < 4: Given this case, there are two possibilities. Either the job of τ4 is released before t5 or it is released after t5 (otherwise, [t4, t4 + 4] ⊂ [t5, t5 + 6]). More formally, the subcases are:

a) t4 < t5 < t4 + 4: In this case, $[t_4, t_4 + 4] \cap [r^1_1, r^1_1 + 2] \ne \emptyset$, because $W_4(S'_I, \pi_1, r^1_1 + 2, r^1_2) > 1$ and $[r^1_1, r^1_1 + 2] \cap [t_5, t_5 + 6] \ne \emptyset$. This implies three additional subcases:

   i) τ4 executes entirely after $r^1_1 + 2$: This implies that $t_4 \in [r^1_1, r^1_1 + 2]$; otherwise, there would not be enough execution left in $[r^1_1, t_4 + 4]$ for τ4 to execute. Since t4 < t5 in this case, $r^1_1 < t_4 < t_5$. Thus, the interval [t4, t4 + 4] is a subset of $[r^1_1, t_5 + 6]$. Since π1 is continuously busy executing $J_1$ during $[r^1_1, r^1_1 + 2]$ and by the antecedent of the lemma π1 is continuously busy executing jobs of τA ∪ {τ5} during [t5, t5 + 6], it must be that π1 is also continuously busy executing jobs of τA ∪ {τ5} in the interval [t4, t4 + 4]. This implies Equation A.21.

   ii) τ4 executes both before $r^1_1$ and after $r^1_1 + 2$: Observe that job $J_1$ executes continuously over $[r^1_1, r^1_1 + 2]$. Since τ4 executes both before and after $r^1_1$ and $S'_I$ is valid, it follows that $[r^1_1, r^1_1 + 2] \subset [t_4, t_4 + 4]$. Thus, τ4 is continuously executed on processor π1 over the intervals $[t_4, r^1_1]$ and $[r^1_1 + 2, t_4 + 4]$. Thus,

$$\begin{aligned}
& W_4(S'_I, \pi_1, t_4, r^1_1) + W_1(S'_I, \pi_1, r^1_1, r^1_1 + 2) + W_4(S'_I, \pi_1, r^1_1 + 2, t_4 + 4) \\
&\quad = (r^1_1 - t_4) + e_1 + ((t_4 + 4) - (r^1_1 + 2)) \\
&\quad = e_1 + e_4 \\
&\quad = 2 + 2 = 4.
\end{aligned}$$

   This implies Equation A.21.

   iii) τ4 executes entirely before $r^1_1$: This case is impossible because $W_4(S'_I, \pi_1, r^1_1 + 2, r^1_2) > 1$.

b) t5 + 2 < t4 < t5 + 6: Symmetric to Case a.

Observe that the longest possible interval in which processor π1 is continuously

busy executing the jobs of τ1, τ4, and τ5 is of length 8−α(t5, S′I) (due to the fact that

both τ4 and τ5 can execute at most one job in any 100 time unit interval). The longest

possible continuously busy interval of jobs of these tasks on processor π1 contains two

jobs of τ1, one job of τ4, and 2 − α(t5, S′I) units of a job of τ5. We will now use

Lemma A.6 to show that if all intervals of length 8−α(t5, S′I) that contain [t5, t5 + 6]

are not continuously busy on processor π1, then we can fit τ5 entirely on π1 (i.e.,

α(t5, S′I) = 0). The next lemma formally states this observation.

Lemma A.7 If for all $t' \in [t_5 - 2 + \alpha(t_5, S'_I), t_5]$

$$\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t', t' + 8 - \alpha(t_5, S'_I)) < 8 - \alpha(t_5, S'_I),$$

then $\alpha(t_5, S'_I) = 0$.

Proof: The proof is by contradiction. Assume that the antecedent is true for a given I, but $\alpha(t_5, S'_I) > 0$. Since we cannot move all of τ5’s execution in the interval [t5, t5 + 6] to processor π1, this implies that

$$\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_5, t_5 + 6) = 6. \qquad (A.23)$$

Observe that the antecedent of Lemma A.6 is thus satisfied. Therefore, by similar reasoning, two different jobs of τ1 must have scheduling windows in the interval [t5, t5 + 6]. $J_1$ and $J_2$ execute continuously in intervals $[r^1_1, r^1_1 + 2]$ and $[r^1_2, r^1_2 + 2]$ (borrowing notation from Lemma A.6). From Equation A.23, τ1, τ4, and τ5 execute continuously in the interval $[\min(r^1_1, t_5), \max(r^1_2 + 2, t_5 + 6)]$. By Lemma A.6, there exists t4 that corresponds to the arrival of a job of τ4 with [t4, t4 + 4] ∩ [t5, t5 + 6] ≠ ∅. Furthermore, π1 is continuously busy executing jobs of τA ∪ {τ5} in the interval [t4, t4 + 4]. Since these intervals overlap and each is continuously busy, the following interval is continuously busy:

$$[\min(r^1_1, t_5, t_4), \max(r^1_2 + 2, t_5 + 6, t_4 + 4)].$$

This interval obviously includes two jobs of τ1, one job of τ4, and $2 - \alpha(t_5, S'_I)$ units of a job of τ5. Define,

$$t_{start} \overset{def}{=} \min(r^1_1, t_5, t_4), \qquad (A.24)$$

and

$$t_{end} \overset{def}{=} \max(r^1_2 + 2, t_5 + 6, t_4 + 4). \qquad (A.25)$$

Then,

$$\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_{start}, t_{end}) = 8 - \alpha(t_5, S'_I).$$

However, observe that $t_{start} \in [t_5 - 2 + \alpha(t_5, S'_I), t_5]$; otherwise, processor π1 would not be continuously busy over the interval [t5, t5 + 6], contradicting Equation A.23. We have thus shown that a continuously busy interval on processor π1 of size $8 - \alpha(t_5, S'_I)$ exists if $\alpha(t_5, S'_I) > 0$. This contradicts our assumption; therefore, $\alpha(t_5, S'_I) = 0$.

In order to achieve our stated goal (i.e., processor π2 has enough idle instants to successfully schedule τ6), we must show that there are still enough idle instants on processor π1 to move the execution of τB. In the next lemma, we show that any time interval in which π1 is continuously busy executing jobs of τA ∪ {τ5} must be surrounded by continuously idle intervals (with respect to jobs of τA ∪ {τ5}). These idle intervals must be of length at least two. The next lemma formally states this observation.

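Lemma A.8 If $\alpha(t_5, S'_I) > 0$ then

$$\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_{start} - 2, t_{start}) = 0, \qquad (A.26)$$

and

$$\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_{start} + (8 - \alpha(t_5, S'_I)), t_{start} + (8 - \alpha(t_5, S'_I)) + 2) = 0. \qquad (A.27)$$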

Proof: We will prove that $\alpha(t_5, S'_I) > 0$ implies Equation A.26. Equation A.27 can be proven symmetrically. Observe that

$$W_1(S'_I, \pi_1, r^1_1 - 3, r^1_1) = 0 \qquad (A.28)$$

because p1 = 5 and d1 = 2. From Lemma A.7, $\alpha(t_5, S'_I) > 0$ implies that

$$\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_{start}, t_{end}) = 8 - \alpha(t_5, S'_I) \;\Rightarrow\; \sum_{\tau_j \in \{\tau_4, \tau_5\}} W_j(S'_I, \pi_1, r^1_1 + 2, r^1_2) \ge 3.$$

The last implication follows from reasoning in Lemma A.6. Since at least three units of τ4 and τ5 must execute in the interval $[r^1_1 + 2, r^1_2]$, this leaves at most $1 - \alpha(t_5, S'_I)$ units left to execute before $r^1_1$ and/or after $r^1_2 + 2$. This implies

$$t_{start} \ge r^1_1 - 1 + \alpha(t_5, S'_I). \qquad (A.29)$$

Equations A.28 and A.29 imply that the latest another job of τ1 could execute prior to $t_{start}$ is $t_{start} - 2$. Since τ5 and τ4 have periods equal to 100, and they release jobs contained within $[t_{start}, t_{end}]$, they are not active in the interval $[t_{start} - 2, t_{start}]$. Therefore,

$$\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_{start} - 2, t_{start}) = 0.$$

A.3.3 Construction of Schedule S ′′I and Proof of Theorem 5.1

We now have sufficient tools to successfully prove Theorem 5.1. We will show that

if S ′I does not have enough idle instants to schedule τ6 entirely on processor π2, then

we can create a schedule S ′′I .

Let t6 be the arrival of any job of task τ6 in real-time instance $I \in \mathcal{I}^S_{WCET}$. Consider $S'_I$ constructed in Section A.3.2 (Equation A.18). If $\sum_{\tau_j \in \tau_{example} - \{\tau_6\}} W_j(S'_I, \pi_2, t_6, t_6 + 8) \le 4$, then we are done (i.e., there is sufficient idle time on π2 in $S'_I$ to execute the job of τ6 for 4 time units in [t6, t6 + 8]). So assume that

$$\sum_{\tau_j \in \tau_{example} - \{\tau_6\}} W_j(S'_I, \pi_2, t_6, t_6 + 8) > 4. \qquad (A.30)$$

Observe that

$$W_2(S'_I, \pi_2, t_6, t_6 + 8) \le \mathrm{ebf}(\tau_2, 8) = 2, \qquad (A.31)$$

and

$$W_3(S'_I, \pi_2, t_6, t_6 + 8) \le \mathrm{ebf}(\tau_3, 8) = 2. \qquad (A.32)$$

Equations A.30, A.31, and A.32 together imply that [t6, t6 + 8] must overlap with the scheduling window of a job $J_k$ of τ5. Let t5 be the arrival time of $J_k$. Furthermore, $J_k$ must execute for some time on processor π2, so $\alpha(t_5, S'_I) > 0$. From Equation A.30, [t5, t5 + 6] ∩ [t6, t6 + 8] ≠ ∅. Since $\alpha(t_5, S'_I) > 0$, Lemma A.7 implies that [t5, t5 + 6] overlaps with two jobs of τ1 and one job of τ4. Using the definition of $t_{start}$ and $t_{end}$ from Equations A.24 and A.25, $\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_{start}, t_{end}) = 8 - \alpha(t_5, S'_I)$.

Considering the intervals $X_{start} \overset{def}{=} [t_{start}, t_{start} + 8 - \alpha(t_5, S'_I)]$, $X_5 \overset{def}{=} [t_5, t_5 + 6]$, and $X_6 \overset{def}{=} [t_6, t_6 + 8]$ (recall that $X_5 \subset X_{start}$ and $X_6 \cap X_5 \ne \emptyset$), we provide three comprehensive cases for the overlap of these intervals. In each of these cases (and their subcases), we show that the subcase is impossible given that τ6 does not fit (Equation A.30), or it is possible to construct a schedule $S''_I$ such that $\sum_{\tau_j \in \tau_{example} - \{\tau_6\}} W_j(S''_I, \pi_2, t_6, t_6 + 8) \le 4$. The cases for overlap are as follows.

1. $t_6 \le t_{start} \le t_6 + \alpha(t_5, S'_I)$.

2. $t_{start} < t_6$:

   a) $t_5 > t_6 - 2$.

   b) $t_6 - 6 < t_5 \le t_6 - 2$.

3. $t_{start} > t_6 + \alpha(t_5, S'_I)$:

   a) $t_5 < t_6 + 4$.

   b) $t_6 + 8 > t_5 \ge t_6 + 4$.

We invite the reader to verify that this list of cases is comprehensive. Proof that each of these cases is either impossible or can be modified to create a valid schedule $S''_I$ follows.

1. $t_6 \le t_{start} \le t_6 + \alpha(t_5, S'_I)$: This implies that $X_{start} \subset X_6$. By Lemma A.8,

$$\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_6, t_{start}) = 0, \qquad (A.33)$$

and

$$\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_{start} + (8 - \alpha(t_5, S'_I)), t_6 + 8) = 0. \qquad (A.34)$$

Also, $|t_{start} - t_6| + |t_{start} + (8 - \alpha(t_5, S'_I)) - (t_6 + 8)| = \alpha(t_5, S'_I)$. Lemma A.5 implies that $\alpha(t_5, S'_I) \le 1$; therefore, $X_6$ must overlap with two jobs of τ3 (otherwise, $\sum_{\tau_j \in \tau_{example} - \{\tau_6\}} W_j(S'_I, \pi_2, t_6, t_6 + 8) \le 4$). Let $r^3_1$ and $r^3_2$ be the release times of the first and second jobs of τ3 that overlap with $X_6$. For $\sum_{\tau_j \in \tau_{example} - \{\tau_6\}} W_j(S'_I, \pi_2, t_6, t_6 + 8) > 4$ to be true, we must now show that τ3 does not execute on processor π2 in the intervals $[t_6, t_{start}]$ and $[t_{start} + (8 - \alpha(t_5, S'_I)), t_6 + 8]$ (i.e., there is not enough idle time on π1 to accommodate the two jobs of τ3). It must be that $|([r^3_1, r^3_1 + 2] \cup [r^3_2, r^3_2 + 2]) \cap X_{start}| \le 4 - \alpha(t_5, S'_I)$ because p3 = 6 and d3 = 2. This means that together the two jobs of τ3 overlap with at least $\alpha(t_5, S'_I)$ units of idle time on π1. By the definition of $S'_I$ and Equations A.33 and A.34, $W_3(S'_I, \pi_2, t_6, t_6 + 8) \le 2 - \alpha(t_5, S'_I)$ because we may move at least $\alpha(t_5, S'_I)$ units of τ3’s execution to π1. This implies $\sum_{\tau_j \in \tau_{example} - \{\tau_6\}} W_j(S'_I, \pi_2, t_6, t_6 + 8) \le 4$, which contradicts the assumption. Thus, this case is impossible.

2. $t_{start} < t_6$:

a) $t_5 > t_6 - 2$: Let $y \overset{def}{=} \max(t_6 - t_5, 0)$. Thus $t_5 + 6$ equals $t_6 - y + 6$. Since $[t_5, t_5 + 6] \subset [t_{start}, t_{start} + (8 - \alpha(t_5, S'_I))]$, it must be that $t_{start} + (8 - \alpha(t_5, S'_I)) \ge t_6 - y + 6$. By Lemma A.8, the amount of execution of τA ∪ {τ5} in the interval $[t_{start}, t_{start} + (8 - \alpha(t_5, S'_I))]$ is maximized when $t_{start} + (8 - \alpha(t_5, S'_I))$ equals $t_6 - y + 6$. In this case, by Lemma A.8, the earliest a job of τA ∪ {τ5} can execute after $t_{start} + (8 - \alpha(t_5, S'_I))$ is at time $t_6 + 8 - y$. Therefore,

$$\sum_{\tau_j \in \tau_A \cup \{\tau_5\}} W_j(S'_I, \pi_1, t_{start} + (8 - \alpha(t_5, S'_I)), t_6 + 8) \le y.$$

We will now show that τ3 can execute on processor π2 for at most $2 - \alpha(t_5, S'_I)$ time units over the interval [t6, t6 + 8]. Observe that

$$|t_{start} + (8 - \alpha(t_5, S'_I)) - t_6| < 8 - \alpha(t_5, S'_I) - y \qquad (A.35)$$

and

$$\left| \left( [r^3_1, r^3_1 + 2] \cup [r^3_2, r^3_2 + 2] \right) \cap [t_6, t_{start} + (8 - \alpha(t_5, S'_I))] \right| \le 4 - \alpha(t_5, S'_I) - y. \qquad (A.36)$$

By definition of $S'_I$ and Equation A.36,

$$W_3(S'_I, \pi_2, t_6, t_{start} + (8 - \alpha(t_5, S'_I))) \le 2 - \alpha(t_5, S'_I) - y. \qquad (A.37)$$

τ5 does not have a scheduling window over the interval $[t_{start} + (8 - \alpha(t_5, S'_I)), t_6 + 8]$. So, by definition of $S'_I$ and Equation A.35, τ3 will execute for at most y on processor π2 over the interval $[t_{start} + (8 - \alpha(t_5, S'_I)), t_6 + 8]$. Therefore,

$$W_3(S'_I, \pi_2, t_{start} + (8 - \alpha(t_5, S'_I)), t_6 + 8) \le y. \qquad (A.38)$$

Equations A.31, A.37, and A.38, together with the fact that τ5 executes on processor π2 for at most $\alpha(t_5, S'_I)$ time units, imply

$$\sum_{\tau_j \in \tau_{example} - \{\tau_6\}} W_j(S'_I, \pi_2, t_6, t_6 + 8) = 2 + (2 - \alpha(t_5, S'_I)) + \alpha(t_5, S'_I) \le 4.$$

This contradicts our assumption; therefore, Case 2a is impossible.

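b) $t_6 - 6 < t_5 \le t_6 - 2$: Consider a modified schedule $S''_I$ in which more of τ5’s execution on processor π2 is moved to the interval [t5, t6).

$$S''_I(\pi_1, t) \overset{def}{=} S'_I(\pi_1, t)$$

$$S''_I(\pi_2, t) \overset{def}{=} \begin{cases}
\varphi_5(I, t), & \text{if } (t_5 \le t < t_6) \text{ and } (S'_I(\pi_2, t) = \bot) \text{ and } (S'_I(\pi_1, t, \tau_5) = 0) \text{ and } \\
& (W_5(S''_I, \pi_2, r_5(I, t), t) < \alpha(t_5, S'_I)), \\
\bot, & \text{if } (S'_I(\pi_2, t, \tau_5) = 1) \text{ and } (W_5(S''_I, \pi_2, r_5(I, t), t) \ge \alpha(t_5, S'_I)), \\
S'_I(\pi_2, t), & \text{otherwise}
\end{cases} \qquad (A.39)$$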
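The effect of Equation A.39 on processor π2 can be illustrated with a short Python sketch. As before, it assumes unit time slots and list-based schedules, which is a simplification of the continuous-time definition. The function name, its parameters, and the sample schedule are hypothetical; the sample is chosen only to satisfy the Case 2b condition (here t5 = 0 and t6 = 2) with α = 1.

```python
from typing import List, Optional

Slots = List[Optional[str]]  # task name per unit slot, None = idle


def build_s_double_prime_pi2(p1: Slots, p2: Slots, t5: int, t6: int,
                             alpha: int, d5: int = 6,
                             task: str = "tau5") -> Slots:
    """Shift tau5's pi_2 execution into idle pi_2 slots of [t5, t6)."""
    new_p2 = list(p2)
    done = 0  # W5(S'', pi_2, t5, t) accumulated so far
    for t in range(t5, min(t5 + d5, len(p2))):
        if t < t6 and p2[t] is None and p1[t] != task and done < alpha:
            new_p2[t] = task       # first branch: run tau5 early on pi_2
            done += 1
        elif p2[t] == task:
            if done >= alpha:
                new_p2[t] = None   # second branch: tau5 already finished
            else:
                done += 1          # "otherwise" branch: keep S'_I as is
    return new_p2


# Hypothetical S'_I with alpha = 1: tau5 runs one unit on pi_2 in slot 3.
s1 = ["tau1", "tau1", "tau4", "tau4", "tau5", "tau1"]
s2 = [None, "tau3", "tau2", "tau5", None, None]
new_s2 = build_s_double_prime_pi2(s1, s2, t5=0, t6=2, alpha=1)
print(new_s2)  # tau5's unit on pi_2 moves from slot 3 to slot 0
```

In this example the work on π2 from t6 onward drops by one unit, which is the slack the argument needs for τ6.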

It is easy to see that $S''_I$ is valid, as we are only moving execution of τ5 during τ5’s scheduling window. From the definition of $S'_I$ and $S''_I$, the only times τ5 does not execute in [t5, t6) on processor π2 are when:

i. π1 is busy executing τ5,

ii. π2 is busy executing τ2 or τ3, or

iii. τ5 has completed execution (i.e., τ5 has executed for exactly $\alpha(t_5, S'_I)$ time units on π2).

These cases taken together imply that in the interval [t5, t6) for schedule $S''_I$ processor π2 must execute jobs of tasks in τB for at least $\alpha(t_5, S'_I)$ time units. More formally,

$$\sum_{\tau_j \in \tau_B} W_j(S''_I, \pi_2, t_5, t_6) \ge \alpha(t_5, S'_I).$$

Due to the fact that $t_6 - t_5 \ge 2$ and that ebf(τ2, 10) = 2 and ebf(τ3, 10) = 2,

$$\sum_{\tau_j \in \tau_{example} - \{\tau_6\}} W_j(S''_I, \pi_2, t_6, t_6 + 8) \le 4.$$

We have defined a valid schedule $S''_I$ that leaves enough idle instants for τ6 to execute entirely on processor π2.

3. tstart > t6 + α(t5, S′I):

a) t5 < t6 + 4: Symmetric to Case 2a.

b) t6 + 8 > t5 ≥ t6 + 4: Symmetric to Case 2b.

Bibliography

Abdelzaher, T., Sharma, V., and Lu, C. (2004). A utilization bound for aperiodic tasks and priority driven scheduling. IEEE Transactions on Computers, 53(3).

Albers, K. and Slomka, F. (2004). An event stream driven approximation for the analysis of real-time systems. In Proceedings of the EuroMicro Conference on Real-Time Systems, pages 187–195, Catania, Sicily. IEEE Computer Society Press.

Anderson, J. and Srinivasan, A. (2004). Mixed Pfair/ERfair scheduling of asynchronous periodic tasks. Journal of Computer and System Sciences, 68(1):157–204.

Andersson, B., Abdelzaher, T., and Jonsson, J. (2003a). Global priority-driven aperiodic scheduling on multiprocessors. In Proceedings of the Parallel and Distributed Processing Symposium, Nice, France.

Andersson, B., Abdelzaher, T., and Jonsson, J. (2003b). Partitioned aperiodic scheduling on multiprocessors. In Proceedings of the Parallel and Distributed Processing Symposium, Nice, France.

Andersson, B. and Jonsson, J. (2000). Fixed-priority preemptive multiprocessor scheduling: To partition or not to partition. In Proceedings of the International Conference on Real-Time Computing Systems and Applications, pages 337–346, Cheju Island, South Korea. IEEE Computer Society Press.

Andersson, B. and Jonsson, J. (2003). The utilization bounds of partitioned and pfair static-priority scheduling on multiprocessors are 50%. In Proceedings of the EuroMicro Conference on Real-Time Systems, pages 33–40, Porto, Portugal. IEEE Computer Society Press.

Audsley, N. C., Burns, A., Richardson, M. F., and Wellings, A. J. (1991). Hard Real-Time Scheduling: The Deadline Monotonic Approach. In Proceedings 8th IEEE Workshop on Real-Time Operating Systems and Software, pages 127–132, Atlanta.

Baker, T. and Baruah, S. (2007). Schedulability analysis of multiprocessor sporadic task systems. In Son, S. H., Lee, I., and Leung, J. Y.-T., editors, Handbook of Real-Time and Embedded Systems. Chapman Hall/CRC Press.

Baker, T. and Cirinei, M. (2006). A necessary and sometimes sufficient condition for the feasibility of sets of sporadic hard-deadline tasks. In Proceedings of the IEEE Real-time Systems Symposium, pages 178–190, Rio de Janeiro. IEEE Computer Society Press.

Baker, T. and Shaw, A. (1989). The cyclic executive model and Ada. Real-Time Systems, 1:7–26.

Baker, T. P. (2003). An analysis of deadline-monotonic schedulability on a multiprocessor. Technical Report TR-030201, Department of Computer Science, Florida State University.

Baker, T. P. (2005a). An analysis of EDF schedulability on a multiprocessor. IEEE Transactions on Parallel and Distributed Systems, 16(8):760–768.

Baker, T. P. (2005b). Comparison of empirical success rates of global vs. partitioned fixed-priority and EDF scheduling for hard real time. Technical Report TR-050601, Department of Computer Science, Florida State University.

Baker, T. P. (2006a). An analysis of fixed-priority schedulability on a multiprocessor. Real-Time Systems: The International Journal of Time-Critical Computing, 32(1–2):49–71.

Baker, T. P. (2006b). A comparison of global and partitioned EDF schedulability tests for multiprocessors. In Proceeding of the International Conference on Real-Time and Network Systems, pages 119–127, Poitiers, France.

Baruah, S. (2003). Dynamic- and static-priority scheduling of recurring real-time tasks. Real-Time Systems: The International Journal of Time-Critical Computing, 24(1):99–128.

Baruah, S. and Burns, A. (2006). Sustainable scheduling analysis. In Proceedings of the IEEE Real-time Systems Symposium, pages 159–168, Rio de Janeiro. IEEE Computer Society Press.

Baruah, S. and Carpenter, J. (2003). Multiprocessor fixed-priority scheduling with restricted interprocessor migrations. In Proceedings of the EuroMicro Conference on Real-time Systems, pages 195–202, Porto, Portugal. IEEE Computer Society Press.

Baruah, S., Chen, D., Gorinsky, S., and Mok, A. (1999). Generalized multiframe tasks. Real-Time Systems: The International Journal of Time-Critical Computing, 17(1):5–22.

Baruah, S., Cohen, N., Plaxton, G., and Varvel, D. (1996). Proportionate progress: A notion of fairness in resource allocation. Algorithmica, 15(6):600–625.

Baruah, S. and Fisher, N. (2005a). Code-size minimization in multiprocessor real-time systems. In Proceedings of the International Workshop on Parallel and Distributed Real-Time Systems, pages 137a–137a, Denver, Colorado.

232

Baruah, S. and Fisher, N. (2005b). The partitioned scheduling of sporadic real-time tasks on multiprocessor platforms. In Proceedings of the Workshop onCompile/Runtime Techniques for Parallel Computing, Oslo, Norway.

Baruah, S., Gehrke, J., and Plaxton, G. (1995). Fast scheduling of periodic tasks onmultiple resources. In Proceedings of the Ninth International Parallel ProcessingSymposium, pages 280–288. IEEE Computer Society Press. Extended versionavailable via anonymous ftp from ftp.cs.utexas.edu, as Tech Report TR–95–02.

Baruah, S., Howell, R., and Rosier, L. (1990a). Algorithms and complexity concerningthe preemptive scheduling of periodic, real-time tasks on one processor. Real-Time Systems: The International Journal of Time-Critical Computing, 2:301–324.

Baruah, S., Mok, A., and Rosier, L. (1990b). Preemptively scheduling hard-real-time sporadic tasks on one processor. In Proceedings of the 11th Real-Time Systems Symposium, pages 182–190, Orlando, Florida. IEEE Computer Society Press.

Bertogna, M., Cirinei, M., and Lipari, G. (2005a). Improved schedulability analysis of EDF on multiprocessor platforms. In Proceedings of the EuroMicro Conference on Real-Time Systems, pages 209–218, Palma de Mallorca, Balearic Islands, Spain. IEEE Computer Society Press.

Bertogna, M., Cirinei, M., and Lipari, G. (2005b). New schedulability tests for real-time task sets scheduled by deadline monotonic on multiprocessors. In Proceedings of the 9th International Conference on Principles of Distributed Systems, pages 306–321, Pisa, Italy. Springer.

Bertogna, M., Fisher, N., and Baruah, S. (2006). Resource-locking durations in static-priority systems. Manuscript under review.

Bini, E. and Buttazzo, G. C. (2005). Measuring the performance of schedulability tests. Real-Time Systems, 30(1):129–154.

Calandrino, J., Anderson, J., and Baumberger, D. (2007). A hybrid real-time scheduling approach for large-scale multicore platforms. In Proceedings of the EuroMicro Conference on Real-Time Systems, Pisa, Italy. IEEE Computer Society Press. To appear.

Calandrino, J., Leontyev, H., Block, A., Devi, U., and Anderson, J. (2006). LitmusRT: A testbed for empirically comparing real-time multiprocessor schedulers. In Proceedings of the 27th IEEE Real-Time Systems Symposium, pages 111–123, Rio de Janeiro. IEEE Computer Society Press.

Carpenter, J., Funk, S., Holman, P., Srinivasan, A., Anderson, J., and Baruah, S. (2003). A categorization of real-time multiprocessor scheduling problems and algorithms. In Leung, J. Y.-T., editor, Handbook of Scheduling: Algorithms, Models, and Performance Analysis. CRC Press LLC.

Chakraborty, S. (2003). System-Level Timing Analysis and Scheduling for Embedded Packet Processors. PhD thesis, Swiss Federal Institute of Technology (ETH), Zurich. Available as Diss. ETH No. 15093.

Chakraborty, S., Erlebach, T., and Thiele, L. (2001). On the complexity of scheduling conditional real-time code. In Proceedings of the 7th Workshop on Algorithms and Data Structures, pages 38–49, Providence, RI. Springer Verlag.

Chen, D., Mok, A., and Baruah, S. (2000). Scheduling distributed real-time tasks in the DGMF model. In IEEE Real-Time Technology and Applications Symposium, pages 14–22. IEEE.

Deng, Z. and Liu, J. (1997). Scheduling real-time applications in an Open environment. In Proceedings of the Eighteenth Real-Time Systems Symposium, pages 308–319, San Francisco, CA. IEEE Computer Society Press.

Dertouzos, M. (1974). Control robotics: the procedural control of physical processes. In Proceedings of the IFIP Congress, pages 807–813.

Dertouzos, M. and Mok, A. K. (1989). Multiprocessor scheduling in a hard real-time environment. IEEE Transactions on Software Engineering, 15(12):1497–1506.

Devi, U. (2003). An improved schedulability test for uniprocessor periodic task systems. In Proceedings of the EuroMicro Conference on Real-time Systems, Porto, Portugal. IEEE Computer Society Press.

Devi, U. (2006). Soft Real-Time Scheduling on Multiprocessors. PhD thesis, Department of Computer Science, The University of North Carolina at Chapel Hill.

Dhall, S. K. and Liu, C. L. (1978). On a real-time scheduling problem. Operations Research, 26:127–140.

Fisher, N., Anderson, J., and Baruah, S. (2005). Task partitioning upon memory-constrained multiprocessors. In Proceedings of the IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, pages 416–421, Hong Kong. IEEE Computer Society Press.

Fisher, N. and Baruah, S. (2005a). A fully polynomial-time approximation scheme for feasibility analysis in static-priority systems. In Proceedings of the EuroMicro Conference on Real-Time Systems, pages 117–126, Palma de Mallorca, Balearic Islands, Spain. IEEE Computer Society Press.

Fisher, N. and Baruah, S. (2005b). A polynomial-time approximation scheme for feasibility analysis in static-priority systems with bounded relative deadlines. In Proceedings of the 13th International Conference on Real-Time Systems, pages 233–249, Paris, France.

Fisher, N., Bertogna, M., and Baruah, S. (2007a). Resource-locking durations in EDF-scheduled systems. In 13th IEEE Real-Time and Embedded Technology and Applications Symposium, pages 91–100. IEEE.

Fisher, N., Nguyen, T. H. C., Goossens, J., and Richard, P. (2007b). Parametric polynomial-time algorithms for computing response-time bounds for static-priority tasks with release jitters. In Proceedings of the International Conference on Real-Time Computing Systems and Applications, Daegu, Korea. IEEE Computer Society Press. To appear.

Ghazalie, T. M. and Baker, T. P. (1995). Aperiodic servers in a deadline scheduling environment. Real-Time Systems, 9.

Ha, R. and Liu, J. W. S. (1994). Validating timing constraints in multiprocessor and distributed real-time systems. In Proceedings of the 14th IEEE International Conference on Distributed Computing Systems, pages 162–171, Los Alamitos. IEEE Computer Society Press.

Hong, K. and Leung, J. (1988). On-line scheduling of real-time tasks. In Proceedings of the Real-Time Systems Symposium, pages 244–250, Huntsville, Alabama. IEEE.

Horn, W. (1974). Some simple scheduling algorithms. Naval Research Logistics Quarterly, 21:177–185.

Jeffay, K., Stanat, D., and Martel, C. (1991). On non-preemptive scheduling of periodic and sporadic tasks. In Proceedings of the 12th Real-Time Systems Symposium, pages 129–139, San Antonio, Texas. IEEE Computer Society Press.

Johnson, D. (1974). Fast algorithms for bin packing. Journal of Computer and System Sciences, 8(3):272–314.

Johnson, D. S. (1973). Near-optimal Bin Packing Algorithms. PhD thesis, Department of Mathematics, Massachusetts Institute of Technology.

Kalyanasundaram, B. and Pruhs, K. (2000). Speed is as powerful as clairvoyance. Journal of the ACM, 47(4):617–643.

Kirsch, C., Sanvido, M., Henzinger, T., and Pree, W. (2002). A giotto-based helicopter control system. In Proceedings of the Second International Workshop on Embedded Software (EMSOFT), volume 2491 of Lecture Notes in Computer Science, pages 46–60. Springer-Verlag.

Kolmogorov, A. N. and Fomin, S. V. (1970). Introductory Real Analysis. Dover Publications, Inc., New York.

Leung, J. and Whitehead, J. (1982). On the complexity of fixed-priority scheduling of periodic, real-time tasks. Performance Evaluation, 2:237–250.

Liu, C. and Layland, J. (1973). Scheduling algorithms for multiprogramming in a hard real-time environment. Journal of the ACM, 20(1):46–61.

Liu, J. W. S. (2000). Real-Time Systems. Prentice-Hall, Inc., Upper Saddle River, New Jersey 07458.

Lopez, J. M., Diaz, J. L., and Garcia, D. F. (2004). Utilization bounds for EDF scheduling on real-time multiprocessor systems. Real-Time Systems: The International Journal of Time-Critical Computing, 28(1):39–68.

Lopez, J. M., Garcia, M., Diaz, J. L., and Garcia, D. F. (2000). Worst-case utilization bound for EDF scheduling in real-time multiprocessor systems. In Proceedings of the EuroMicro Conference on Real-Time Systems, pages 25–34, Stockholm, Sweden. IEEE Computer Society Press.

Lundberg, L. and Lennerstad, H. (2003). Global multiprocessor scheduling of aperiodic tasks using time-independent priorities. In Proceedings of the IEEE Real-Time Technology and Applications Symposium, pages 170–180. IEEE Computer Society Press.

Mok, A. K. (1983). Fundamental Design Problems of Distributed Systems for The Hard-Real-Time Environment. PhD thesis, Laboratory for Computer Science, Massachusetts Institute of Technology. Available as Technical Report No. MIT/LCS/TR-297.

Mok, A. K. and Chen, D. (1996). A multiframe model for real-time tasks. In Proceedings of the 17th Real-Time Systems Symposium, Washington, DC. IEEE Computer Society Press.

Mok, A. K. and Chen, D. (1997). A multiframe model for real-time tasks. IEEE Transactions on Software Engineering, 23(10):635–645.

Murty, K. G. (1983). Linear Programming. John Wiley & Sons, Inc., New York.

Oh, D.-I. and Baker, T. P. (1998). Utilization bounds for N-processor rate monotone scheduling with static processor assignment. Real-Time Systems: The International Journal of Time-Critical Computing, 15:183–192.

Park, D.-W., Natarajan, S., Kanevsky, A., and Kim, M. J. (1995). A generalized utilization bound test for fixed-priority real-time scheduling. In 2nd International Workshop on Real-Time Computing Systems and Applications, pages 68–72, Washington, DC, USA. IEEE Computer Society.

Phillips, C. A., Stein, C., Torng, E., and Wein, J. (1997). Optimal time-critical scheduling via resource augmentation. In Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, pages 140–149, El Paso, Texas.

Phillips, C. A., Stein, C., Torng, E., and Wein, J. (2002). Optimal time-critical scheduling via resource augmentation. Algorithmica, 32(2):163–200.

Richard, P., Goossens, J., and Fisher, N. (2007). Approximate feasibility analysis and response-time bounds of static-priority tasks with release jitters. In Proceedings of the 15th International Conference on Real-Time and Network Systems, pages 105–112, Nancy, France.

Ripoll, I., Crespo, A., and Mok, A. K. (1996). Improvement in feasibility testing for real-time tasks. Real-Time Systems: The International Journal of Time-Critical Computing, 11:19–39.

Saez, S., Vila, J., and Crespo, A. (1998). Using exact feasibility tests for allocating real-time tasks in multiprocessor systems. In Proceedings of the Tenth EuroMicro Workshop on Real-time Systems, pages 53–60, Berlin, Germany.

Srinivasan, A. and Anderson, J. (2002). Optimal rate-based scheduling on multiprocessors. In Proceedings of the 34th ACM Symposium on the Theory of Computing, pages 189–198.

Srinivasan, A. and Baruah, S. (2002). Deadline-based scheduling of periodic task systems on multiprocessors. Information Processing Letters, 84(2):93–98.
