Recent developments in RTOS research

Lennart Downar

Supervisor: M.Sc. Ulrich Gabor

Abstract. Multi/many-core platforms offer the potential to greatly enhance capabilities of embedded systems. In the case of real-time safety-critical environments like smart robotics systems (e.g. the Mars rover Curiosity), real-time monitoring of nuclear power plants or medical systems, those platforms are still rarely used. A reason for this is the diminished possibility of predicting such a system’s behavior. In this report we summarize the work of Paolillo et al. [13] on HIPPEROS and Kluge et al.’s work on MOSSCA [10]. HIPPEROS is an asymmetric distributed real-time multi-core operating system for safety-critical applications. MOSSCA, also targeting safety-critical use cases, is a system specifically designed for future NoC many-core architectures. We describe both systems’ architectures, compare them where possible and give a critical evaluation of their concepts.

Table of Contents

1 Introduction
  1.1 Structure of the paper
2 Preliminaries
  2.1 Safety-Critical Systems
3 Existing Multi/Many-Core Real-Time Operating Systems
  3.1 L4 Microkernel
  3.2 HeliOS
  3.3 Multikernel/Barrelfish
  3.4 PikeOS
4 HIPPEROS
  4.1 Architecture
    Task Model
    Kernel
    Communication
    Reconfigurability
5 MOSSCA
  5.1 Architecture
    Nodes
    Communication Channels
    Servers
    Kernel
    Inter-partition Communication
6 Conclusion
  6.1 Evaluation
    HIPPEROS
    MOSSCA
  6.2 Future Work

1 Introduction

Nowadays, embedded systems are everywhere (ranging from household devices such as washing machines to nuclear power plants). A large fraction of these require the assurance of meeting deadlines (hard and soft). Since traditional operating systems are not necessarily designed to meet this goal, so-called real-time operating systems (RTOS) have been developed. While non-real-time systems have to guarantee correctness of the results and processing, real-time systems also have to give the same priority to meeting timing conditions [17].

A subclass of real-time systems are safety-critical real-time systems. They guarantee that specific algorithms get executed on time (meeting hard deadlines), thus ensuring reliability at a processing level. A good example can be found in the domain of air traffic control: a collision-avoidance system is highly dependent on the reliability of the underlying RTOS.

Traditionally, advances in general-purpose computer architecture subsequently became available for embedded platforms. Over the past decade, multi-core systems have established themselves for general use. However, while real-time theory has explored ways to leverage the benefits of multi-core systems, operating system implementations in the domain of safety-critical software still rely on uni-core techniques and architectures [10]. Industrial solutions are available (e.g. PikeOS); however, these cannot yet be considered mature and widely applicable, as will be seen in Section 3.

This report discusses current research on real-time operating systems. To this end two recent publications are discussed in detail, namely A New Configurable and Parallel Embedded Real-time Micro-Kernel for Multi-core platforms by Paolillo et al. [13] and An operating system for safety-critical applications on manycore processors by Kluge et al. [10]. Both attempt to tackle the problem of designing a safety-critical real-time operating system for multiple cores.

1.1 Structure of the paper

In Section 2 some preliminaries are introduced, such as the terms multi- and many-core, as well as current industry standards when dealing with multiple cores. We will also discuss the requirements that safety-critical systems have to comply with and approaches to fulfill these requirements. Section 3 will introduce some existing RTOS from the industry and research communities. In Section 4 we take an in-depth look at the HIPPEROS (HIgh Performance Parallel Embedded Real-time Operating Systems) architecture as described by Paolillo et al. [13]. Section 5 will introduce MOSSCA (Manycore Operating System for Safety Critical Applications). Section 6 concludes this report with a comparison of HIPPEROS and MOSSCA, a critical evaluation of both systems, and a short outlook on future developments.

2 Preliminaries

A subtle but important difference can be seen in the titles of the two aforementioned publications. While Paolillo et al. tackle “Multi-core” platforms, Kluge et al. speak of “manycore” systems. As of now, there is no precise definition for these terms. Some define multi-core platforms as those that follow the traditional uni-bus design found in most modern everyday CPUs, while many-core platforms implement different designs. Others simply define multi-core systems as those with up to 64 cores, while many-core systems go massively beyond this limit.

Another way to distinguish those terms has been given by James Reinders from Intel: for him the main difference is that a multi-core platform (usually ranging from 2 to 16 cores) is more of an incremental approach to existing system design. Many-core systems (with potentially hundreds or thousands of cores), on the other hand, are designed to optimize the hardware assuming that only parallel programs will run on it [1]. In Figure 1 this difference is visualized by depicting Intel’s desktop processors as “Multicore” and Xeon processors as “Many-core”.

Fig. 1. Intel’s view on multi- and many-core architecture[2]

What unites most definitions is that the terms are defined by the architecture of the system rather than by the number of cores it has. Indirectly, however, the number of cores comes into play, as the architecture imposes feasibility limits on the number of cores.

For the following discussion we will define the terms as follows: like Kluge et al., we will use the terms to differentiate between traditional and non-traditional architectures. Systems with multiple cores, a shared cache and communication over one bus are termed multi-core systems. Architectures diverging from this approach, like GPUs, NoCs (Network on a Chip) or a crossbar connection of the cores, will be called many-core systems.

2.1 Safety-Critical Systems

A system is defined as safety-critical if its failure can result in the loss of life (e.g. the crash of an airplane), severe damage to property, or the destruction or contamination of the environment [10]. Typical examples are automobiles, avionics and medical systems. For example, in the year 2000 a program responsible for calculating the right amount of radiation for a medical treatment gave incorrect values, resulting in the deaths of 8 patients and approximately 20 severely injured people [7].

In order to ensure the reliability of such a system in real-world applications, it is common to certify such systems. In avionics, the US-American FAA (Federal Aviation Administration) and the EASA (European Aviation Safety Agency) demand a certification under DO-178C or ED-12C. Both standards categorize failures with respect to 5 types of consequences:

– Catastrophic: conditions that would result in multiple fatalities, usually with the loss of the airplane.

– Hazardous: reducing the capability of the airplane or crew to cope with adverse conditions such that a large reduction in safety or fatal injuries could occur.

– Major: reducing the capability of the airplane or crew to cope with adverse conditions.

– Minor: airplane safety is not significantly reduced.

– No Effect: no effect on safety.

Other popular norms are DIN EN 61508 and EN 50128, which also classify failures in different levels of safety demands, the Safety Integrity Levels (SIL). For each level the accepted probability of failure on demand decreases [7].

Different procedures to ensure requirements are defined in such standards. However, in order to achieve those requirements, some fundamental properties are necessary for all safety-critical systems.

1. The predictability of a system is of major concern, since predictable behavior brings the ability to analyze the system (e.g. its worst-case execution time, WCET). A system should therefore have a predictable timing behavior to ensure that deadlines are kept.

2. Partitioning in space and time is necessary to make the access to shared resources predictable. If no shared resources are accessed, it even guarantees freedom from interference.

3. Fine-grained communication should only be possible between threads of one partition/application. Communication across partition borders should be provided through specialized OS mechanisms.

While these three requirements are standard in the literature, a fourth point was added by Kluge et al., as ongoing research in the area of dynamic reconfiguration of software hints that this might be an important requirement for safety-critical systems in the future:

– Reconfigurability: A system should be reconfigurable during runtime and the reconfiguration should happen in a predictable manner.

Especially for highly safety-critical systems (like avionics) a certification process has been the preferable way of guaranteeing system safety. At the same time “certification can be expensive and time-consuming. Thus, it is imperative that safety-critical systems be built using certification-friendly hardware platforms and design processes” [16]. However, due to a lack of predictability concerning shared hardware resources (e.g. caches, main memory etc.), guaranteeing a system’s safety on multi-core platforms did not seem possible anymore [9].

Utilizing multi-core hardware has therefore been approached through complete isolation of cores (fulfilling the requirement of partitioning in space and time). However, this completely removes the advantage of potentially running parallel programs. Another approach is to provide a pessimistic upper bound for the worst-case execution time [12]. While this has been proven to be possible, it does not give very satisfying results, as these upper bounds might be very high. Hence, provisioning of adequate resources for the worst case might outweigh the benefits gained by additional processing power.

3 Existing Multi/Many-Core Real-Time Operating Systems

In the area of multi/many-core RTOS, several solutions exist which can be seen as inspiration (or competitors) for HIPPEROS and MOSSCA. In this section we briefly present some of the prominent research systems, as well as a system currently used in the industry.

3.1 L4 Microkernel

The L4 microkernel is a mature kernel in the area of (safety-critical) real-time operating systems. It has been developed for over 20 years now, and its latest generation has brought forth seL4, the first OS kernel to undergo a complete formal verification as well as an in-depth analysis of its worst-case execution times. The main principles of L4 are minimality and high IPC performance [6].

3.2 HeliOS

HeliOS was designed with heterogeneous platforms in mind. The main goal is to provide OS abstraction across different Instruction Set Architectures (ISA). This is realized using so-called satellite kernels. Satellite kernels execute mostly autonomously and implement only a few functions like memory, process and thread management. The key advantage is the unified API for the programmer. While the speed of cores may differ, heterogeneous cores look similar to an application. Another interesting aspect with regard to system safety is the ability of applications to define positive or negative affinity towards other processes. A positive affinity indicates that two applications could benefit from running on the same core (e.g. they will exchange information very often); a negative affinity indicates that two applications might interfere with each other if run on the same core [15].

HeliOS was not targeted at embedded platforms, but the idea of a heterogeneous kernel has served as inspiration for HIPPEROS.

3.3 Multikernel/Barrelfish

To fully utilize their potential, current and future many-core designs call for new architectures developed from scratch without bearing the traditional system design in mind [13]. Amongst the first to advocate this were the developers of the Multikernel architecture (with its implementation Barrelfish [3]).

The multikernel architecture, where traditional OS functions are implemented through explicit message passing, can be seen in Figure 2. The system state is viewed as replicated over all cores (instead of a shared system state) and communication is implemented using asynchronous message passing. Another feature, similar to HeliOS, is that a multikernel can be run on heterogeneous architectures.

Fig. 2. Overview of the Multikernel architecture[3]

3.4 PikeOS

PikeOS is a proprietary real-time operating system developed by SYSGO. Based on an implementation of SSV (Safe and Secure Virtualization) [5], PikeOS allows multiple operating systems to run on top of it by acting as a hypervisor and thus achieving separation of resources and communication. In particular, SYSGO advertises PikeOS as the first safety-critical RTOS to be certified to the highest level of EN 50128 (SIL 4) on a multi-core platform. At this point SYSGO has not disclosed its techniques; however, SYSGO mentions that the system has been certified on a “dual-core Intel Core i7 processor”. So even though this might be a big step forward, a proven and certified safety-critical real-time operating system that runs on a wide range of different multi-/many-core platforms is still a subject of active research. At this point, it is questionable whether the approach taken by PikeOS would scale to more cores or whether it is “just” a very specialized solution for one specific architecture and processor.

4 HIPPEROS

The current industry solution is to isolate individual cores in time and space by assigning each core to a partition with limited communication channels [13]. Each partition is then considered to be independent, making the cores basically uni-core platforms. A well-known industry standard exhibiting this is AUTOSAR.

The objectives for the development of HIPPEROS were therefore to develop a new configurable kernel which runs on different architectures and exhibits scalability for an arbitrary number of cores [13]. This way an RTOS solution should be developed that allows the evaluation and exploration of real-time theory solutions in a multi-core setting.

4.1 Architecture

Task Model. The well-known sporadic task model, which consists of periodic and sporadic tasks, was chosen for HIPPEROS. Any computation or transmission of data that is executed repeatedly in a regular or semi-regular manner is termed a periodic task. Usually these are tasks that provide continuously needed functions of the system [11]. A periodic task T_i consists of several jobs. The period p_i of T_i is the minimum length of all time intervals between the release times of consecutive jobs in T_i. The execution time of a task is simply the cumulative execution time of all jobs in the task.

Jobs or tasks arising from the need to respond to external events, like a pilot switching off the auto-pilot, are termed sporadic tasks. It is characteristic of their inter-release times (the time passing between a task’s desired starting times) that they can be arbitrarily small. Additionally, they have hard deadlines, and therefore one concern is to always ensure that they meet their deadlines.

In HIPPEROS, a task is defined by the program to be executed and by its timing information (sporadic/aperiodic, offset, deadline, period and worst-case execution time) [13]. The advantage of this model is that a lot of scheduling and validation algorithms for real-time systems are based on the periodic task model, thus making it easy to implement existing solutions (e.g. scheduling algorithms and associated resource allocation protocols) for HIPPEROS. The general scheduler API is preemptive and priority-based.
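As a rough illustration of the task description above, the following Python sketch models a task by its timing information and computes the utilization of a small task set. The class name, the field names, the example values and the utilization check are assumptions made for this example; they are not taken from the HIPPEROS sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task descriptor; field names are illustrative only."""
    name: str
    sporadic: bool   # True: sporadic task, False: periodic task
    offset: int      # release offset of the first job (time units)
    deadline: int    # relative deadline of each job
    period: int      # period or minimum inter-release time
    wcet: int        # worst-case execution time of one job

    def utilization(self) -> float:
        # Fraction of one core that the task needs in the worst case.
        return self.wcet / self.period


def total_utilization(tasks) -> float:
    return sum(t.utilization() for t in tasks)


def passes_necessary_test(tasks, cores: int) -> bool:
    # Necessary (not sufficient) condition on `cores` identical cores:
    # no single task needs more than one core and the sum fits on all cores.
    return (all(t.utilization() <= 1.0 for t in tasks)
            and total_utilization(tasks) <= cores)


tasks = [
    Task("sensor", sporadic=False, offset=0, deadline=10, period=10, wcet=2),
    Task("control", sporadic=False, offset=0, deadline=20, period=20, wcet=5),
    Task("alarm", sporadic=True, offset=0, deadline=5, period=50, wcet=1),
]
```

A task set that fails this check cannot be scheduled by any algorithm; a task set that passes it still needs a policy-specific schedulability test.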

Kernel. Established kernels like the Linux kernel use a symmetric kernel design; that is, each core runs the same kernel code while data structures are protected by locking mechanisms. A potential problem with such a design in a multi/many-core setting is so-called kernel serialisation, which happens when, instead of executing in parallel, kernel threads are actually executed sequentially. This problem becomes more likely with many-cores, as symmetric kernels have been shown not to scale with an increasing number of cores [13].

To alleviate this problem Cerqueira et al. [4] proposed an asymmetric approach, with one core running the scheduler and dispatching processes to the other cores. HIPPEROS follows this asymmetric design by implementing a master core which is responsible for managing global resources, keeping the system state coherent and managing the scheduling of tasks. Processes are run on slave cores, which handle their local resources. This way, kernel serialisation is avoided and the kernel is executed in parallel. Choosing this design also almost completely removes the need for locking mechanisms to synchronize the system state, as it is only visible to the master core. To allow the kernel to be executed in parallel, the approach suggested by Cerqueira et al. was extended: in HIPPEROS the master core is also responsible for system calls and process message passing mechanisms [13].

Communication. The general structure of communication in HIPPEROS is illustrated in Figure 3. Whenever a scheduling decision has to be made, the master core is woken up. The master core then uses an inter-processor interrupt (IPI) to send the decision to the slave core executing the job concerned. System calls occurring on a slave core that might impact scheduling are sent via IPI to the master core. The slave core then waits in a busy loop until the master core answers or sends a context switch request. To guarantee the correct execution of system calls and context switches, small data structures protected by mutexes are used.
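The interaction between master and slave cores can be sketched as a small sequential simulation. This is only a toy model under several assumptions: the class and method names are hypothetical, IPIs are represented by entries in a log, and preemption as well as the busy-waiting of slave cores are left out for brevity.

```python
from collections import deque
import heapq


class Master:
    """Hypothetical master core: the only holder of the global scheduling state."""

    def __init__(self, n_slaves):
        self.ready = []                    # min-heap of (priority, job name)
        self.running = [None] * n_slaves   # job currently assigned to each slave
        self.ipi_log = deque()             # IPIs sent to slaves: (slave id, job)

    def release(self, priority, job):
        # A job becomes ready; a lower number means a higher priority.
        heapq.heappush(self.ready, (priority, job))
        self.schedule()

    def syscall_exit(self, slave_id):
        # A slave reports (via IPI) that its job finished; the master reschedules.
        self.running[slave_id] = None
        self.schedule()

    def schedule(self):
        # Assign the highest-priority ready jobs to idle slaves and notify them.
        for slave_id, job in enumerate(self.running):
            if job is None and self.ready:
                _, nxt = heapq.heappop(self.ready)
                self.running[slave_id] = nxt
                self.ipi_log.append((slave_id, nxt))


master = Master(n_slaves=2)
master.release(2, "logging")
master.release(1, "control")
master.release(3, "telemetry")   # both slaves are busy, so the job stays queued
master.syscall_exit(0)           # slave 0 becomes idle and receives "telemetry"
```

Because only the `Master` object touches the ready queue and the assignment table, the sketch needs no locks around the scheduling state, which mirrors the argument made for the asymmetric design.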

For inter-process communication (IPC), HIPPEROS provides two different APIs:

– Copy buffer IPC: the message is copied from the sender buffer to the receiver buffer (the copy operation is done locally by the slave core).

– Zero copy IPC: a page is shared between sender and receiver.

In both cases the master core is notified by an IPI about the impending communication process.
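The difference between the two IPC variants can be illustrated with plain Python buffers. The function names are hypothetical, and a `bytearray` stands in for a memory page; the notification of the master core is not modelled.

```python
def copy_buffer_send(sender_buf: bytearray, receiver_buf: bytearray, length: int) -> None:
    # Copy-buffer IPC: the payload is duplicated into the receiver's own buffer.
    receiver_buf[:length] = sender_buf[:length]


def zero_copy_share(page: bytearray) -> memoryview:
    # Zero-copy IPC: the receiver gets a view onto the same underlying page.
    return memoryview(page)


sender = bytearray(b"ping")
receiver = bytearray(4)
copy_buffer_send(sender, receiver, 4)
sender[1] = ord("o")   # a later write by the sender is not seen by the receiver

page = bytearray(b"ping")
view = zero_copy_share(page)
page[1] = ord("o")     # the receiver's view reflects the write immediately
```

The copy variant costs one memory copy per message but isolates the two buffers; the shared page avoids the copy at the price of both sides seeing each other's writes.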

Fig. 3. HIPPEROS architecture

Reconfigurability. Another core part of HIPPEROS’ design, which currently supports x86 and ARM architectures, is the possibility of configuration at build time (e.g. of the scheduling policy). To reduce code size, only the chosen policies are included in the binary image of the kernel. Configuration decisions have to be made depending, for example, on the following questions:

– Does the system have an MMU or not? As many critical embedded platforms are still MMU-free, it is important to also provide MMU-free memory models.

– Which scheduling policy should be chosen? This is mostly a decision of preference by the system architect.

– Should all cores be utilized or only a limited number of cores? Limiting the number can serve either to save power or to make the remaining cores available to other operating systems in a mixed-criticality environment.
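The three questions above can be pictured as fields of a build-time configuration from which the set of kernel modules is derived. The following sketch is purely illustrative: the option names, the policy names and the module names are assumptions and do not reflect the actual HIPPEROS build system.

```python
from dataclasses import dataclass

# Hypothetical policy names; the real set of supported policies may differ.
AVAILABLE_POLICIES = ("fixed_priority", "edf")


@dataclass(frozen=True)
class BuildConfig:
    has_mmu: bool            # does the target have an MMU?
    scheduling_policy: str   # policy chosen by the system architect
    cores_used: int          # cores handed to this kernel
    cores_total: int         # cores physically present

    def validate(self) -> None:
        if self.scheduling_policy not in AVAILABLE_POLICIES:
            raise ValueError(f"unknown scheduling policy: {self.scheduling_policy}")
        if not 1 <= self.cores_used <= self.cores_total:
            raise ValueError("cores_used must be between 1 and cores_total")

    def selected_modules(self) -> list:
        # Only the chosen memory model and policy end up in the kernel image.
        self.validate()
        memory = "memory_mmu" if self.has_mmu else "memory_flat"
        return ["core", memory, "sched_" + self.scheduling_policy]


config = BuildConfig(has_mmu=False, scheduling_policy="edf", cores_used=2, cores_total=4)
modules = config.selected_modules()
```

In this example two of the four cores remain unused by the kernel and could be left to another operating system or powered down.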

5 MOSSCA

MOSSCA specifically targets many-core systems, as the authors believe that with future advances in processor technology we will see a shift from system on chip (SoC) architectures to network on chip (NoC) architectures.

The main idea of MOSSCA is to extend the partitioning approach such that several cores can make up a partition instead of just one core. Threads belonging to the same application can then run in the same “neighborhood”, i.e. on adjacent cores. The overall goal is to achieve a separation in time and space similar to, e.g., gang scheduling [14]. As mentioned before, the difficulty in designing a safety-critical multi/many-core real-time operating system stems from the inherent resource sharing in modern computer architectures. By focusing on NoC systems, the authors see the chance to confine those ambiguities in resource sharing to the NoC interconnect and off-chip I/O.

Fig. 4. Mapping of threads to partitions on a many-core processor [10]

A schematic depiction of this partitioning of application threads to adjacent cores can be seen in Figure 4. In this example we have three different applications, each with its individual number of threads. These threads are then grouped into partitions of adjacent cores, with each partition belonging to one application. A fundamental assumption for MOSSCA is that enough cores are available to execute each thread on a separate core. This assumption already shows that MOSSCA is geared towards specific systems in which applications are tuned and fitted for the architecture. A dynamic changing of programs is not envisioned and thus the number of threads should be known (or realistically estimated) before deployment. The other core part of MOSSCA’s design is the introduction of usage policies for the NoC communication and ensuring the compliance of applications with those policies. As interferences “cannot be excluded due to the shared nature of [...] resources” [10], these concepts are supposed to at least keep them at bay.
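To make the idea of partitions of adjacent cores more concrete, the following sketch assigns one core per thread on a rectangular mesh. The placement strategy (walking the mesh in a serpentine order so that consecutive cores are neighbors) and all names are assumptions chosen for this example; Kluge et al. do not prescribe a particular mapping algorithm.

```python
def snake_order(width, height):
    """Enumerate mesh coordinates so that consecutive entries are neighbors."""
    order = []
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        order.extend((x, y) for x in xs)
    return order


def map_partitions(apps, width, height):
    """Give each application one core per thread, as a contiguous run of cores.

    `apps` maps an application name to its thread count, which is assumed
    to be known before deployment.
    """
    cores = snake_order(width, height)
    if sum(apps.values()) > len(cores):
        raise ValueError("not enough cores: one core per thread is assumed")
    mapping, start = {}, 0
    for name, n_threads in apps.items():
        mapping[name] = cores[start:start + n_threads]
        start += n_threads
    return mapping


layout = map_partitions({"A": 3, "B": 4, "C": 2}, width=3, height=3)
```

Since every core in a run is a mesh neighbor of the next one, each partition forms a connected region, although not necessarily a compact one.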

5.1 Architecture

Nodes. Each node maps directly to one of the physical cores residing on the chip; nodes are the major execution resource for applications. The different functional units of a core are represented by node capabilities such as timer units, caches or I/O pins.

Nodes are designed as containers for application threads, i.e. each core can execute one thread. If a thread uses only node-local resources, it is executed in full isolation from the rest of the processor and therefore no interferences from other threads can occur.

Communication Channels. Application threads can communicate via communication channels, which are unidirectional between a sending and a receiving node in MOSSCA. A channel policy then defines properties and constraints for a specific channel and ensures, e.g., that an application cannot exceed certain limits. This way an overload of the NoC and the receiving node is prevented. Applications exceeding their limits are blocked by the kernel, thus ensuring that “faults in the application do not propagate through the system” [10].
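One conceivable channel policy is a limit on the number of messages per time window. The sketch below is an assumption made for illustration (class name, parameters and the sliding-window rule are hypothetical); the concrete policies of MOSSCA are not specified in this report.

```python
class ChannelPolicy:
    """Hypothetical policy: at most `max_msgs` messages per window of `window` ticks."""

    def __init__(self, max_msgs, window):
        self.max_msgs = max_msgs
        self.window = window
        self.sent = []   # timestamps of accepted messages

    def try_send(self, now):
        # Forget messages that left the sliding window, then check the budget.
        self.sent = [t for t in self.sent if now - t < self.window]
        if len(self.sent) >= self.max_msgs:
            return False   # the kernel would block the sender at this point
        self.sent.append(now)
        return True


policy = ChannelPolicy(max_msgs=2, window=10)
results = [policy.try_send(t) for t in (0, 1, 2, 10, 11, 12)]
```

With such a bound, the receiving node can size its message buffer for the worst case, since no sender can exceed its budget within a window.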

Servers. If a service is used by several applications or threads, servers can be used to provide the service in a centralized manner. The idea is that a task which cannot be executed on each node would be concerned with some management function (e.g. of shared resources). Several servers are envisioned for the MOSSCA environment, namely I/O servers, OS servers, application library servers and IPC (inter-partition communication) servers (IPCS). OS servers play an important role in keeping interferences between partitions low. An OS server manages the on-chip resources and the mapping and scheduling of threads to nodes. Interferences are reduced by providing each partition with its own OS server.

In its current implementation all servers process requests in first-in first-out order, which makes it possible to calculate an upper bound for the response time of a request. However, the bound is still highly pessimistic, as in the worst case a request would first have to wait for the requests of all other clients to be finished [10].
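The worst case described above leads to a simple bound, sketched here under two assumptions that are not stated in the source: every client has at most one outstanding request at a time, and the worst-case service time of each client's request is known. Transmission delays on the NoC are ignored.

```python
def fifo_response_bound(service_times, client):
    """Upper bound on the response time of one request at a FIFO server.

    `service_times` maps each client to the worst-case service time of its
    requests. Assumes at most one outstanding request per client.
    """
    # Worst case: the requests of all other clients are queued ahead of ours.
    waiting = sum(c for name, c in service_times.items() if name != client)
    return waiting + service_times[client]


times = {"node0": 4, "node1": 7, "node2": 3}
bound = fifo_response_bound(times, "node2")
```

The bound grows with the number of clients sharing a server, which illustrates why providing each partition with its own OS server keeps it manageable.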

Kernel. The kernel runs on each node and provides the MOSSCA API through a system call interface to the applications. It is divided between services that interact with an OS server and services that can be performed locally or on non-OS servers. The kernel also ensures that policies regarding the usage of communication channels are not violated. This way the kernel of a sending node ensures that the receiving node will not experience a message buffer overflow [10].

Inter-partition Communication. Threads belonging to one task can exchange information using communication channels. For inter-partition communication MOSSCA provides inter-partition communication servers (IPCS). A node in the NoC can buffer received messages until they are processed by the job running on it. Of course this buffer is limited, potentially leading to an overflow on the receiver side and in turn to a performance impact on the application thread running on that node. This should not be an issue within a partition, since application developers are able to implement policies for the communication and synchronization of their own application threads. However, they might have little knowledge concerning the interaction with other applications. Inter-partition messages should therefore be handled by a mediating server, the IPCS. The IPCS stores inter-partition messages until the receiver is ready to receive them, preventing them from interfering with intra-partition messages.
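The store-and-forward role of the IPCS can be sketched as follows. The classes, the method names and in particular the rule of reserving buffer slots for intra-partition traffic are assumptions introduced for this example; the report does not describe how the IPCS decides that a receiver is ready.

```python
from collections import deque


class Node:
    """Receiving node with a small, bounded message buffer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()

    def free_slots(self):
        return self.capacity - len(self.buffer)


class IPCS:
    """Hypothetical mediating server holding inter-partition messages per receiver."""

    def __init__(self):
        self.pending = {}   # receiver node -> FIFO queue of stored messages

    def submit(self, receiver, message):
        self.pending.setdefault(receiver, deque()).append(message)

    def deliver_when_ready(self, receiver, reserved_for_local=1):
        # Forward stored messages, keeping some slots free for intra-partition traffic.
        queue = self.pending.get(receiver, deque())
        delivered = 0
        while queue and receiver.free_slots() > reserved_for_local:
            receiver.buffer.append(queue.popleft())
            delivered += 1
        return delivered


node = Node(capacity=3)
ipcs = IPCS()
for msg in ("a", "b", "c"):
    ipcs.submit(node, msg)
first = ipcs.deliver_when_ready(node)    # two messages fit, one slot stays reserved
node.buffer.popleft()                    # the job on the node consumes a message
second = ipcs.deliver_when_ready(node)   # the stored message can now be forwarded
```

Messages from other partitions thus never occupy the last buffer slot, so intra-partition messages are not pushed out by foreign traffic.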

Figure 5 shows the general architecture of MOSSCA. Every core/node on the NoC has a kernel running on it, providing its local resource sharing capabilities. On top of this one can run an application (with an optional stub interface to ease the implementation of customized servers). A node running an OS server and an IPCS for inter-partition communication are mandatory. If any I/O devices are present, there is usually one node with a direct connection to the I/O interfaces. On top of that node runs a dedicated I/O server. Should some threads inside a partition use a library routine, it can also be beneficial to make the library available through an application library server.

Fig. 5. Overview of the MOSSCA system architecture [10]

6 Conclusion

While MOSSCA has a distributed symmetric kernel, with each core running the same kernel code, HIPPEROS implements a distributed asymmetric kernel (although MOSSCA’s kernels are mostly responsible for core-local tasks). MOSSCA avoids kernel serialisation by giving the kernels only the ability to manage local core requests. The higher OS functionality is “outsourced” into dedicated OS servers. An OS server can be compared to HIPPEROS’ master core in that the higher-level functions are not present on every node, but on a specialized entity (an OS server). While HIPPEROS only has one such special entity (the master core), MOSSCA can potentially provide arbitrarily many entities and in turn achieves better scalability.

HIPPEROS and MOSSCA both go beyond existing solutions in industry as well as research. While borrowing ideas from other systems, they try to come up with systems that improve predictability by removing as much uncertainty from inter-process communication as possible.

However, MOSSCA remains a proof of concept at this point, as it has not actually been implemented for any architecture. HIPPEROS is available for x86 and ARM architectures. Still, MOSSCA seems far more fleshed out and, because of its focus on NoC many-core systems, might also be more suitable to tackle future system development.

6.1 Evaluation

HIPPEROS. One of the goals of HIPPEROS is to achieve scalability on an arbitrary number of cores. However, in the end the authors suggest that their approach scales only up to 8 cores. The goal of large scalability has therefore not been achieved. The main reason is that more cores would most likely overload the master core, as processes would have to wait for the completion of system calls from other processes (recall that certain system calls are delegated to the master core). The master core would then become a bottleneck for system calls [13]. Interestingly enough, ideas from MOSSCA could provide a solution to this problem. From an abstract point of view MOSSCA is similar to HIPPEROS in that it distributes its kernel. However, the functions delegated via communication are handled by multiple OS servers in MOSSCA; one OS server per partition is even suggested. This achieves the goal of scalability and could serve as a solution for HIPPEROS, i.e. multiple master cores. Paolillo et al. themselves suggest a similar strategy to enhance scalability: applying a clustering approach with several independent micro-kernels executed in parallel [13].
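The bottleneck argument can be made concrete with a simple utilization estimate. The following sketch is only a back-of-the-envelope model, not a measurement from Paolillo et al. [13]: the call rate and service time are hypothetical numbers, and the load is assumed to be split evenly when several servers (or master cores) are used.

```python
def server_utilization(worker_cores, calls_per_second, service_time_us, servers=1):
    """Fraction of time each server core is busy with delegated system calls.

    All parameters are illustrative assumptions; the load is assumed to be
    distributed evenly over the servers.
    """
    busy_us_per_second = worker_cores * calls_per_second * service_time_us
    return busy_us_per_second / (servers * 1_000_000)


def max_worker_cores(calls_per_second, service_time_us, servers=1):
    """Largest number of worker cores that keeps utilization at or below 1."""
    return (servers * 1_000_000) // (calls_per_second * service_time_us)


# Hypothetical workload: 2000 delegated calls per second per worker core,
# each occupying the master core for 50 microseconds.
print(server_utilization(7, 2000, 50))              # single master, 7 workers
print(server_utilization(15, 2000, 50))             # single master, 15 workers
print(server_utilization(15, 2000, 50, servers=2))  # two servers, 15 workers
print(max_worker_cores(2000, 50))
```

A utilization above 1 means that requests arrive faster than a single master core can serve them. Adding servers, as in MOSSCA's one-OS-server-per-partition suggestion or the clustering approach, lowers the load per server in this model. The estimate ignores queueing delays, which grow well before full utilization is reached.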


MOSSCA. One of MOSSCA's core ideas is to ensure, through strict compliance with policies, that no undesired behavior such as congestion occurs. How exactly those policies have to be designed, and whether it is indeed feasible to design such policies with the desired guarantees, remains to be proven. Without further information, it seems that much of the heavy work in designing a safety-critical system is delegated to these policies. However, under the assumption of well-designed policies, the architecture of MOSSCA does provide a certain level of predictability.

Kluge et al. conclude that MOSSCA achieves predictability and analysability by handling as much work as possible on the local node where a service request originates. Inter-partition communication is delegated to servers. Explicit details on how the communication with servers is prevented from overloading a server are not given. Therefore, it seems that critical aspects are simply delegated away: "MOSSCA services used by critical applications are implemented such that they can be performed in bounded time." [10]. How exactly such an implementation has to be designed is not investigated and is left to the developer. Furthermore, dynamic resource allocation during runtime will have to be estimated offline beforehand. This way the worst-case execution time (WCET) of any OS service in MOSSCA can be bounded. While this approach might help in providing a more realistic WCET (as the WCET is mostly affected by shared buses and shared resources [8]), it means that more computation has to be performed offline beforehand, making the execution behavior of the system rather static. Additionally, as has been shown by Nowotsch and Paulitsch [12], providing a bounded worst-case response is not impossible for a multi-core RTOS, but the bound is usually quite pessimistic and much worse than for a single-core solution. Kluge et al. do not offer an answer to the question of whether they expect this to improve with MOSSCA.

Nevertheless, convincing arguments for using NoCs in future multi/many-core systems are given: while the possibility of interference is not entirely removed, it seems easier to handle when every core has its own resources.
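To illustrate why a bounded server response time depends on information that must be fixed offline, consider the following sketch. It does not come from Kluge et al. [10]; it assumes a server that processes requests in FIFO order without preemption, a number of clients known offline, at most one outstanding request per client, and known upper bounds for the service time and the message transfer time. All names and numbers are hypothetical.

```python
def worst_case_response_us(clients, service_wcet_us, transfer_wcet_us):
    """Upper bound on the response time of one request to a shared server.

    Assumptions (all illustrative): FIFO, non-preemptive server; at most one
    outstanding request per client; bounded service and transfer times.
    """
    if clients < 1:
        raise ValueError("at least one client is required")
    # In the worst case the requests of all other clients are served first.
    queueing_us = (clients - 1) * service_wcet_us
    # Request transfer + queueing + own service + reply transfer.
    return transfer_wcet_us + queueing_us + service_wcet_us + transfer_wcet_us


# Hypothetical values: 20 us service bound, 5 us transfer bound.
print(worst_case_response_us(1, 20, 5))  # a single client
print(worst_case_response_us(4, 20, 5))  # four clients sharing the server
```

Under these assumptions the bound grows linearly with the number of clients, which shows both why the set of clients has to be known beforehand and why such bounds tend to be pessimistic compared to the uncontended case.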

6.2 Future Work

A major improvement for HIPPEROS would be to increase its scalability. Other suggested improvements are the integration of wait-free data structures for inter-processor interrupts, as well as an integration into mixed-criticality systems.

For a better evaluation basis, MOSSCA should be implemented on a real system and evaluated with a WCET analysis, to see if the theoretical improvements translate into practice. Dynamic reconfiguration is also envisioned for the future.

As the interconnect will eventually become the bottleneck in future many-core platforms, new ways to improve interconnect propagation times are being investigated. Some of these include optical interconnects (e.g. several NoCs are connected via optical communication channels to reduce propagation time) or 3D NoCs (physically stacking several NoCs with connections from one to the other) [15].


References

1. Ask James Reinders: Multicore vs. Manycore. http://goparallel.sourceforge.net/ask-james-reinders-multicore-vs-manycore/ (2016), [Online; accessed 20-January-2016]

2. Examples of different CPU architectures for Intel solution. http://goparallel.sourceforge.net/wp-content/uploads/2012/09/Untitled1.png (2016), [Online; accessed 20-January-2016]

3. Baumann, A., Barham, P., Dagand, P.E., Harris, T., Isaacs, R., Peter, S., Roscoe, T., Schüpbach, A., Singhania, A.: The multikernel: a new OS architecture for scalable multicore systems. In: Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles. pp. 29–44. ACM (2009)

4. Cerqueira, F., Vanga, M., Brandenburg, B.B.: Scaling global scheduling with message passing. In: Real-Time and Embedded Technology and Applications Symposium (RTAS), 2014 IEEE 20th. pp. 263–274. IEEE (2014)

5. Douglas, H., Gehrmann, C.: Secure virtualization and multicore platforms state-of-the-art report. Swedish Institute of Computer Science (2009)

6. Elphinstone, K., Heiser, G.: From l3 to sel4 what have we learnt in 20 years of l4 microkernels? In: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. pp. 133–150. ACM (2013)

7. Fischer, D., Lambertz, K.: Kriterien für das Testen sicherheitskritischer Systeme. http://www.heise.de/developer/artikel/Kriterien-fuer-das-Testen-sicherheitskritischer-Systeme-1983147.html (2013), [Online; accessed 28-January-2016]

8. Kelter, T.: WCET analysis and optimization for multi-core real-time systems. Ph.D. thesis (2015)

9. Kinnan, L.M.: Use of multicore processors in avionics systems and its potential impact on implementation and certification. In: Digital Avionics Systems Conference, 2009. DASC'09. IEEE/AIAA 28th. IEEE (2009)

10. Kluge, F., Gerdes, M., Ungerer, T.: An operating system for safety-critical applications on manycore processors. In: IEEE 17th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing (ISORC). pp. 238–245. IEEE press (2014)

11. Liu, J.W.S.W.: Real-Time Systems. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1st edn. (2000)

12. Nowotsch, J., Paulitsch, M.: Leveraging multi-core computing architectures in avionics. In: Dependable Computing Conference (EDCC), 2012 Ninth European. pp. 132–143. IEEE (2012)

13. Paolillo, A., Desenfans, O., Svoboda, V., Goossens, J., Rodriguez Lobera, B.: A new configurable and parallel embedded real-time micro-kernel for multi-core platforms (2015)

14. Tanenbaum, A.S., Bos, H.: Modern operating systems. Prentice Hall Press (2014)

15. Vajda, A.: Programming Many-Core Chips. Springer Science & Business Media (2011)

16. Ward, B.C., Herman, J.L., Kenna, C.J., Anderson, J.H.: Making shared caches more predictable on multicore platforms. In: Real-Time Systems (ECRTS), 2013 25th Euromicro Conference on. pp. 157–167. IEEE (2013)

17. Wörn, H.: Echtzeitsysteme: Grundlagen, Funktionsweisen, Anwendungen. Springer-Verlag (2006)